How is a Hadoop MapReduce job submitted to worker nodes?

0 votes

I started learning Hadoop two weeks ago, but I am still struggling with many concepts in Hadoop MapReduce. The one I find most difficult is how a job gets distributed from the master node to the worker nodes. Suppose I have a Hadoop cluster with 1 master and 3 slave nodes. When a client submits a job, how does it get passed to the master and then on to the slave nodes?

Mar 29, 2018 in Big Data Hadoop by Damon Salvatore
• 5,510 points
1,823 views

1 answer to this question.

0 votes

Alright, I think you are basically looking for the entire workflow of a MapReduce job. Here is how things happen, step by step:


  • First, you have your MapReduce code, i.e. the jar of the job, which you submit from the client node using the hadoop jar command. Here you pass all the details such as the driver class name, the input path and the output path (see the driver sketch after this list).

  • Once the job is submitted, the ResourceManager assigns a new application ID to it, which is passed back to the client.

  • The client then copies the jar file and the other job resources to HDFS and submits the job to the ResourceManager.

  • The ResourceManager, running on the master node, allocates the resources the job needs and keeps track of cluster utilization. It also launches an ApplicationMaster for each job, which is responsible for coordinating the job's execution.

  • The ApplicationMaster gets the block metadata from the NameNode to determine where the input splits are located, and then asks the respective NodeManagers to launch the tasks on those nodes.

  • Finally, the ApplicationMaster creates one map task per input split, plus the configured number of reduce tasks.

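To make the first steps concrete, here is a minimal driver sketch in Java of the kind of code you would package into the job jar. The class names WordCountDriver, WordCountMapper and WordCountReducer are hypothetical placeholders; the Job API calls are the standard org.apache.hadoop.mapreduce ones.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");

        job.setJarByClass(WordCountDriver.class);     // tells the framework which jar to ship to the worker nodes
        job.setMapperClass(WordCountMapper.class);    // hypothetical mapper class in the same jar
        job.setReducerClass(WordCountReducer.class);  // hypothetical reducer class in the same jar
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output path

        // waitForCompletion() submits the job to the ResourceManager and blocks
        // until it finishes; behind the scenes the jar and job resources are
        // copied to HDFS and an application ID is obtained, as described above.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

You would launch it with something like hadoop jar wordcount.jar WordCountDriver /input /output (jar name and paths here are only placeholders); from that point on the ResourceManager, ApplicationMaster and NodeManagers take over exactly as outlined in the steps above.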
answered Mar 29, 2018 by Ashish
• 2,630 points
