Executing mapper and reducer individually

0 votes

How can I execute the mapper and the reducer individually and have the output of each written to a separate file?

Dec 18, 2018 in Big Data Hadoop by slayer
• 29,310 points
138 views

1 answer to this question.

0 votes

This is what happens:

The MapReduce framework stores intermediate output on the local disk rather than in HDFS, because writing it to HDFS would replicate the files unnecessarily.

After the whole map computation finishes, everything is eventually merged and written to disk, and it becomes the input for the shuffle and sort stages that precede the reducer.

Mapper output (intermediate data) is written to the local file system (not HDFS) of each node that runs a map task. Once the data has been transferred to the reducer, we can no longer access these temporary files.
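To make the data flow above concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class name LocalMapReduceSketch, the word-count logic, and the file names are all hypothetical, chosen only for illustration. It runs a map step and a reduce step one after the other and writes each step's output to its own file, mimicking how the intermediate map output lands in a local file before the sort and reduce stages read it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the map -> local file -> sort -> reduce flow.
// This is NOT Hadoop code; every name here is an illustrative assumption.
public class LocalMapReduceSketch {

    // Map step: emit one "word<TAB>1" record per word in the input.
    public static List<String> map(List<String> inputLines) {
        List<String> records = new ArrayList<>();
        for (String line : inputLines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    records.add(word + "\t1");
                }
            }
        }
        return records;
    }

    // Sort + reduce step: group records by key (TreeMap keeps keys sorted)
    // and sum the counts for each key.
    public static List<String> reduce(List<String> mapRecords) {
        Map<String, Integer> grouped = new TreeMap<>();
        for (String record : mapRecords) {
            String[] keyValue = record.split("\t");
            grouped.merge(keyValue[0], Integer.parseInt(keyValue[1]), Integer::sum);
        }
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : grouped.entrySet()) {
            results.add(entry.getKey() + "\t" + entry.getValue());
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mr-sketch");
        Path mapFile = dir.resolve("map-output.txt");       // stands in for the local intermediate file
        Path reduceFile = dir.resolve("reduce-output.txt"); // stands in for the final job output

        List<String> input = Arrays.asList("a b a", "b c");

        // Run the map step alone and persist its output.
        Files.write(mapFile, map(input));

        // Run the reduce step alone, reading only what the map step wrote.
        Files.write(reduceFile, reduce(Files.readAllLines(mapFile)));

        System.out.println("map output:    " + mapFile);
        System.out.println("reduce output: " + reduceFile);
    }
}
```

Because each step only reads and writes files, either one can be rerun and inspected on its own. In a real Hadoop job the framework manages the intermediate file itself, which is why it is not available after the job has moved on.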

However, Hadoop provides MultipleOutputFormat, which lets you define multiple file names for the output of the Mapper or Reducer.

For more detail, refer to the link below, which documents the related MultipleOutputs class:

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.htm

answered Dec 18, 2018 by Omkar
• 69,170 points
