Executing mapper and reducer individually

0 votes

How can I execute the mapper and the reducer individually and have their output written to separate files?

Dec 18, 2018 in Big Data Hadoop by slayer
• 29,170 points
38 views

1 answer to this question.

0 votes

This is what happens:

The MapReduce framework stores intermediate output on local disk rather than in HDFS, because writing it to HDFS would cause unnecessary replication of files.

After the whole map computation finishes, everything is eventually merged and written to disk, where it becomes the input for the shuffle and sort stages that precede the reducer.

Mapper output (intermediate data) is written to the local file system (NOT HDFS) of each mapper slave node. Once the data has been transferred to the reducer, we won’t be able to access these temporary files.
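To make the stages concrete, here is a minimal local sketch in plain Python. It is not Hadoop code: it only imitates the map, shuffle/sort, and reduce steps in a single process, and it keeps the mapper output in its own file so that it stays available after the reduce step. The word-count logic and the names `run_stages`, `mapper_output.txt`, and `reducer_output.txt` are assumptions made for illustration.

```python
import itertools
import os
import tempfile


def mapper(lines):
    """Emit (word, 1) for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def shuffle_and_sort(pairs):
    """Sort pairs by key, imitating what happens before the reduce phase."""
    return sorted(pairs, key=lambda kv: kv[0])


def reducer(sorted_pairs):
    """Sum the counts per key; the input must already be sorted by key."""
    for key, group in itertools.groupby(sorted_pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


def write_pairs(path, pairs):
    """Write key/value pairs as tab-separated lines."""
    with open(path, "w", encoding="utf-8") as f:
        for key, value in pairs:
            f.write(f"{key}\t{value}\n")


def read_pairs(path):
    """Read tab-separated key/value pairs back from a file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t")
            yield key, int(value)


def run_stages(lines, out_dir):
    """Run the mapper and the reducer as separate steps with separate files."""
    map_path = os.path.join(out_dir, "mapper_output.txt")      # hypothetical name
    reduce_path = os.path.join(out_dir, "reducer_output.txt")  # hypothetical name
    # Step 1: run only the mapper and persist its (intermediate) output.
    write_pairs(map_path, mapper(lines))
    # Step 2: run only the reducer on the sorted mapper output.
    write_pairs(reduce_path, reducer(shuffle_and_sort(read_pairs(map_path))))
    return map_path, reduce_path


if __name__ == "__main__":
    print(run_stages(["Hello world", "hello Hadoop"], tempfile.mkdtemp()))
```

Because the mapper output is written to its own file before the reducer runs, both results can be inspected independently, which is the behaviour the question asks for.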

However, we have MultipleOutputFormat, which allows you to define multiple file names for the output of the Mapper or Reducer.
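The idea of routing records to several named outputs can be sketched in plain Python. This is only an analogy, not the Hadoop API: the function `write_named_outputs` and the output names used with it are hypothetical, and a real job would use the Hadoop classes mentioned in this answer instead.

```python
import os
import tempfile
from collections import defaultdict


def write_named_outputs(records, out_dir):
    """Route each (name, key, value) record to a file called <name>.txt."""
    by_name = defaultdict(list)
    for name, key, value in records:
        by_name[name].append((key, value))

    paths = {}
    for name, pairs in by_name.items():
        path = os.path.join(out_dir, f"{name}.txt")
        with open(path, "w", encoding="utf-8") as f:
            for key, value in pairs:
                f.write(f"{key}\t{value}\n")
        paths[name] = path
    return paths


if __name__ == "__main__":
    # Hypothetical records tagged with the stage that produced them.
    records = [("map", "a", 1), ("reduce", "a", 2), ("map", "b", 1)]
    print(write_named_outputs(records, tempfile.mkdtemp()))
```

Each distinct name ends up in its own file, so records tagged by the mapper and records tagged by the reducer never share an output file.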

For further insight, refer to the link below, which documents the related MultipleOutputs class:

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.htm

answered Dec 18, 2018 by Omkar
• 67,660 points
