
Introduction to Hadoop Job Tracker

Last updated on May 22, 2019

Hadoop Job Tracker

The Job Tracker is the master daemon responsible for both job resource management and the scheduling and monitoring of jobs. It acts as a liaison between Hadoop and your application.

The Process

The user first copies the input files into the Distributed File System (DFS) and then submits a job through the client. The client takes these input files and divides them into splits or blocks. How the client creates the splits can vary, as there are certain considerations behind it; for example, if an analysis has to run on the complete data set, the data is divided into splits so that each split can be processed separately. Note that the files are not copied into the DFS through the job client itself; they are loaded using Flume, Sqoop, or any other external client.
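To make the idea of dividing input into splits concrete, here is a minimal sketch in Python. It is not Hadoop's actual split logic, which has more rules; the function name `compute_splits` and the 64 MB split size are assumptions chosen only for illustration.

```python
def compute_splits(file_size, split_size):
    """Return (offset, length) pairs that together cover file_size bytes.

    Simplified illustration: cut the file into fixed-size pieces,
    with a shorter final piece if the size does not divide evenly.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


MB = 1024 * 1024
# Assumed sizes for the example: a 200 MB file cut into 64 MB splits.
for offset, length in compute_splits(200 * MB, 64 * MB):
    print(f"split at {offset // MB} MB, length {length // MB} MB")
```

With these assumed sizes the file yields three full splits and one shorter final split, and each split would later be handed to its own map task.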

Once the files are copied into the DFS and the client has interacted with the DFS, a MapReduce job is run on the splits. The job is submitted through the job tracker, the master daemon that coordinates running jobs on the data nodes. The data may be spread across various data nodes, and it is the job tracker's responsibility to take care of that.

After the client submits the job to the job tracker, the job is initialized in the job queue and the job tracker creates the map and reduce tasks, based on the program contained in the map function and the reduce function. These tasks run on the input splits. Note: taken together, the input splits created by the client cover the whole input data.
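The relationship between a map function, a reduce function, and the input splits can be sketched with a small in-memory word count. This is plain Python, not the Hadoop API; `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, and the whole job runs in a single process here purely for illustration.

```python
from collections import defaultdict


def map_fn(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce step: sum all counts emitted for one word."""
    return word, sum(counts)


def run_job(splits):
    """Run one map pass per split, group by key, then reduce each key."""
    grouped = defaultdict(list)
    for split in splits:  # conceptually, one map task per input split
        for line in split:
            for key, value in map_fn(line):
                grouped[key].append(value)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


# Two assumed input splits, each a list of lines.
splits = [["the job tracker", "the task tracker"], ["the data node"]]
result = run_job(splits)
print(result["the"], result["tracker"])
```

In a real cluster the map passes would run as separate tasks on different nodes; the grouping step stands in for the shuffle that routes map output to the reduce tasks.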

Each input split has a map task running on it, and the output of the map tasks goes into the reduce tasks. The job tracker decides where each task runs. Because a piece of data can have multiple replicas, the job tracker picks a node where the data is local and runs the task on that node's task tracker. The task tracker is the one that actually runs the task on the data node: the job tracker passes the task information to the task tracker, and the task tracker runs the task on the data node.
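The preference for local data can be illustrated with a small selection function. This is a simplified sketch, not Hadoop's scheduler; `pick_task_tracker`, the node names, and the slot counts are all assumptions made for the example.

```python
def pick_task_tracker(replica_nodes, free_slots):
    """Choose a node for a task, preferring nodes that hold a replica.

    replica_nodes: nodes that store a copy of the task's input data.
    free_slots: mapping of node name to number of free task slots.
    Returns the chosen node name, or None if no node has a free slot.
    """
    # First choice: a node that already holds the data and has capacity.
    for node in replica_nodes:
        if free_slots.get(node, 0) > 0:
            return node
    # Fallback: any node with capacity (the data must then be read remotely).
    for node, slots in free_slots.items():
        if slots > 0:
            return node
    return None


free_slots = {"node1": 0, "node2": 2, "node4": 1}
print(pick_task_tracker(["node1", "node2", "node3"], free_slots))
```

Here node1 holds a replica but is busy, so the task goes to node2, which also holds a replica and has a free slot.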

Once tasks have been assigned to the task trackers, a heartbeat is maintained between each task tracker and the job tracker. Each task tracker sends periodic signals to the job tracker, which show whether the data nodes are still alive. The heartbeat keeps the two in sync, since nodes can drop out at any time.
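The heartbeat mechanism can be sketched as a table of last-seen times with a timeout. The class below is a hypothetical illustration, not Hadoop code; the name `HeartbeatMonitor`, the tracker names, and the 600-second timeout are assumptions, and timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Track the last heartbeat from each task tracker (illustrative only)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # tracker name -> time of last heartbeat

    def record_heartbeat(self, tracker, now):
        """Note that a tracker reported in at time `now` (in seconds)."""
        self.last_seen[tracker] = now

    def dead_trackers(self, now):
        """Return trackers whose last heartbeat is older than the timeout."""
        return sorted(
            tracker
            for tracker, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=600)  # assumed timeout
monitor.record_heartbeat("tracker-a", now=0)
monitor.record_heartbeat("tracker-b", now=500)
print(monitor.dead_trackers(now=700))
```

At time 700, tracker-a has been silent for longer than the assumed timeout and is reported as dead, while tracker-b is still considered alive. A master in this position could then reschedule the silent tracker's tasks elsewhere.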

