How to run Nutch in Hadoop installed in pseudo-distributed mode

0 votes

I have Nutch 1.13 installed on Ubuntu and can run a crawl in standalone (local) mode; it completes and produces the expected results. What I don't know is how to run it on Hadoop. I have Hadoop installed in pseudo-distributed mode, and I want to run a Nutch crawl on it and monitor the job. How can I do that? There are plenty of tutorials for standalone mode, but I couldn't find clear instructions for running on Hadoop, except that I have to use the "Nutch Job" after building it with ant.

Jan 24 in Big Data Hadoop by Neha
• 6,280 points
48 views

1 answer to this question.

0 votes

Make sure you have built Nutch from source, i.e. don't use the binary release, which works only in local mode. Once you've compiled it with

ant clean runtime

go to runtime/deploy/bin and run the scripts as usual.

NB: you need to modify the conf files before recompiling, because the configuration is packaged into the job file at build time.
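As a rough sketch of the steps above, the script below rebuilds Nutch, copies a seed list to HDFS, and starts a crawl from runtime/deploy/bin. Several things here are assumptions rather than part of the answer: the NUTCH_HOME location, the HDFS directory names urls and crawl, the number of rounds, and the positional argument order of bin/crawl (run bin/crawl without arguments to see the usage for your version). By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch only. Assumed (not from the answer): NUTCH_HOME, the HDFS
# directories "urls" and "crawl", and the bin/crawl argument order.
NUTCH_HOME="${NUTCH_HOME:-$HOME/apache-nutch-1.13}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

nutch_on_hadoop() {
  local seed_file="$1" rounds="$2"

  # 1. Rebuild after editing conf/ so the changes end up in the job file.
  run ant -f "$NUTCH_HOME/build.xml" clean runtime

  # 2. In deploy mode the seed list is read from HDFS, not the local disk.
  run hdfs dfs -mkdir -p urls
  run hdfs dfs -put -f "$seed_file" urls/

  # 3. Start the crawl with the deploy-mode script (argument order assumed).
  run "$NUTCH_HOME/runtime/deploy/bin/crawl" urls crawl "$rounds"
}

# Example call (hypothetical seed file and round count):
# nutch_on_hadoop seed.txt 2
```

If Hadoop runs on YARN with default ports, the submitted jobs should show up in the ResourceManager web UI at http://localhost:8088, which is one way to monitor the crawl.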

answered Jan 24 by Frankie
• 9,810 points
