How to run Nutch in Hadoop installed in pseudo-distributed mode


I have Nutch 1.13 installed on Ubuntu, and I can run a crawl in standalone mode: it completes successfully and produces the expected results. What I can't work out is how to run it on Hadoop. I have Hadoop installed in pseudo-distributed mode, and I want to run a Nutch crawl on it and monitor the job. How can I do this? There are plenty of tutorials for standalone mode, but I couldn't find any clear instructions for running it on Hadoop, except that I have to use the "Nutch Job" after building it with ant.

Jan 24 in Big Data Hadoop by Neha

1 answer to this question.


Make sure you have built Nutch from source, i.e. don't use the binary release, which works only in local mode. Once you've compiled it with

ant clean runtime

go to runtime/deploy/bin and run the scripts as usual.

NB: any changes to the conf files need to be made before you recompile.
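
To make the order of the steps above concrete, here is a rough sketch of the whole sequence as a shell script. It is not taken from the Nutch documentation. NUTCH_HOME, the seed directory name (urls) and the crawl/crawldb path are placeholders I made up, and it assumes the Hadoop daemons are already running and that ant, hdfs and hadoop are on your PATH. It only prints the commands by default; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch: build Nutch 1.x from source, then run the first crawl step on Hadoop.
# NUTCH_HOME, SEED_DIR and CRAWL_DB are placeholder assumptions; adjust them.
NUTCH_HOME="${NUTCH_HOME:-$HOME/apache-nutch-1.13}"
SEED_DIR="${SEED_DIR:-urls}"            # local directory holding the seed list
CRAWL_DB="${CRAWL_DB:-crawl/crawldb}"   # crawldb path, resolved on HDFS
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only, 0 = run them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

nutch_on_hadoop() {
    # Edit the files under conf/ BEFORE this point (see the NB above).
    run cd "$NUTCH_HOME" || return 1
    run ant clean runtime || return 1

    # Copy the seed list to HDFS so the job can read it.
    run hdfs dfs -put "$SEED_DIR" "$SEED_DIR" || return 1

    # Use the scripts under runtime/deploy, not runtime/local.
    run cd "$NUTCH_HOME/runtime/deploy" || return 1
    run bin/nutch inject "$CRAWL_DB" "$SEED_DIR"
}

nutch_on_hadoop
```

The remaining crawl steps (generate, fetch, parse, updatedb) are run the same way from runtime/deploy. In this mode the paths you pass refer to HDFS rather than the local disk.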

answered Jan 24 by Frankie
