Executing mapper and reducer individually

I want to know how to execute the mapper and the reducer individually and have their output written to separate files.

Dec 18, 2018 in Big Data Hadoop by slayer

1 answer to this question.

This is what happens:

The MapReduce framework stores intermediate output on local disk rather than in HDFS, because writing it to HDFS would replicate the files unnecessarily.

After the whole map computation finishes, the output is merged and written to disk, where it becomes the input for the shuffle and sort stages that precede the reducer.

Mapper output (intermediate data) is written to the local file system (not HDFS) of each mapper node. Once the data has been transferred to the reducer, we can no longer access these temporary files.

However, there is MultipleOutputFormat, which lets you define multiple file names for the output of the Mapper or Reducer.

For further insight, refer to the link below, which documents the related MultipleOutputs class:

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.htm
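
To make the flow described above concrete, here is a minimal local sketch in plain Java. It does not use Hadoop at all: it only imitates a word-count map stage and reduce stage so that each can be run on its own and each writes its result to its own file. The class name `LocalStages`, the method names, and the file names are all hypothetical, and the `TreeMap` merely stands in for the shuffle and sort step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for a MapReduce job; not part of the Hadoop API.
class LocalStages {

    // Map stage: emit one "word<TAB>1" record per token and save them to mapOut.
    static void runMapStage(Path input, Path mapOut) throws IOException {
        List<String> records = new ArrayList<>();
        for (String line : Files.readAllLines(input)) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    records.add(word + "\t1");
                }
            }
        }
        Files.write(mapOut, records);
    }

    // Reduce stage: read the saved map output, group by key in sorted order
    // (the TreeMap imitates shuffle and sort), sum the counts, write to reduceOut.
    static void runReduceStage(Path mapOut, Path reduceOut) throws IOException {
        Map<String, Integer> totals = new TreeMap<>();
        for (String record : Files.readAllLines(mapOut)) {
            String[] kv = record.split("\t");
            totals.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : totals.entrySet()) {
            results.add(entry.getKey() + "\t" + entry.getValue());
        }
        Files.write(reduceOut, results);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: LocalStages <input> <map-output> <reduce-output>");
            return;
        }
        Path input = Paths.get(args[0]);
        Path mapOut = Paths.get(args[1]);
        Path reduceOut = Paths.get(args[2]);
        runMapStage(input, mapOut);        // the map output stays on disk for inspection
        runReduceStage(mapOut, reduceOut); // the reduce output goes to a second file
    }
}
```

Because each stage only reads and writes files, either method can be called alone, which is handy for debugging mapper or reducer logic before submitting a real job. On a cluster, the separate files would instead come from the multiple-output classes mentioned above.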

answered Dec 18, 2018 by Omkar
