Speculative Execution in Hadoop

What do you know about speculative execution?
Aug 27, 2018 in Big Data Hadoop by Ashish


In Hadoop, speculative execution is a mechanism that kicks in when a task runs more slowly than expected on a node. The master node then starts another instance of the same task on a different node. Whichever instance finishes first is accepted, and the other one is killed.
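
To make the "first to finish wins, the other is killed" idea concrete, here is a small stand-alone simulation. It is only a sketch and does not use any Hadoop API: `run_attempt` and `speculate` are made-up names, threads stand in for nodes, and both attempts start at the same time for brevity, whereas Hadoop launches the backup only after it has judged the original attempt to be slow.

```python
import concurrent.futures as cf
import threading
import time


def run_attempt(name, seconds, cancelled):
    """Pretend to process a split; stop early if this attempt was killed."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        if cancelled.is_set():
            return None  # attempt was killed before it could finish
        time.sleep(0.01)
    return name


def speculate(attempts):
    """Run all attempts, accept the first one to finish, kill the rest.

    attempts: dict mapping attempt name -> simulated run time in seconds.
    """
    cancelled = threading.Event()
    with cf.ThreadPoolExecutor(max_workers=len(attempts)) as pool:
        futures = [pool.submit(run_attempt, name, secs, cancelled)
                   for name, secs in attempts.items()]
        done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        winner = next(iter(done)).result()
        cancelled.set()  # signal the losing attempts to stop
    return winner


winner = speculate({"attempt_0 (slow node)": 0.5, "attempt_1 (backup)": 0.05})
print("accepted:", winner)
```

The shared `cancelled` event plays the role of the kill signal sent to the losing attempt.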

answered Aug 27, 2018 by kurt_cobain
@kurt_cobain,

If speculative execution is set to false for mappers and a node goes down while it is processing a task, will another node run another instance of that task or not?

In other words, does speculative execution only come into play when a task is slow, or also when a node goes down?

Hello, @Kamboj,

Hadoop does not diagnose or fix slow-running tasks. Instead, it tries to detect when a task is running slower than expected and launches an equivalent task as a backup. The backup is called a speculative task, and the process is called speculative execution.

Tasks can slow down for various reasons, including hardware degradation and software misconfiguration. The cause can be hard to detect because the tasks still complete successfully, just later than expected.
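
As a rough illustration of "detecting a task that is slower than expected", the sketch below flags tasks whose progress rate is well below the average of their peers. This is a simplified heuristic of my own, not Hadoop's actual speculator, which is more involved. The function name `find_stragglers` and the `slowdown` threshold are assumptions made up for the example.

```python
from statistics import mean


def find_stragglers(progress, elapsed_s, slowdown=0.5):
    """Return task ids whose progress rate is below `slowdown` x the mean rate.

    progress:  dict task id -> fraction complete (0.0 to 1.0)
    elapsed_s: dict task id -> seconds the task has been running
    """
    rates = {t: progress[t] / elapsed_s[t]
             for t in progress if elapsed_s[t] > 0}
    if len(rates) < 2:
        return []  # nothing to compare against
    avg = mean(rates.values())
    return sorted(t for t, rate in rates.items() if rate < slowdown * avg)


progress = {"m_0": 0.90, "m_1": 0.85, "m_2": 0.20}
elapsed = {"m_0": 60, "m_1": 60, "m_2": 60}
stragglers = find_stragglers(progress, elapsed)
print(stragglers)
```

Note that the check only looks at how fast a task is progressing, not at why it is slow, which matches the point above that Hadoop does not try to diagnose the cause.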

I hope this explanation answers your question.

Hi @Kamboj,

As far as I know, speculative execution only takes effect when a task is taking much longer than expected. It does not check whether a node has gone down for some reason. When a task is running slower than expected, Hadoop launches an equivalent task as a backup. This process is called speculative execution.

Thanks for the update, MD.

I was expecting the same, but I ran a job with speculative execution set to false, and the map tasks started running on different nodes. I then stopped one node partway through and observed that the map task running on that node was not picked up by any other working node; it stayed stuck in the running state.

The same scenario worked fine when speculative execution was set to true, and the task completed.
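
For anyone who wants to reproduce this, one way to switch speculation off for map tasks only is to pass the per-task-type properties on the command line. The helper below just builds the `-D` options as strings. It assumes the Hadoop 2+ property names `mapreduce.map.speculative` and `mapreduce.reduce.speculative`, and that the job driver accepts generic options (for example through `ToolRunner`); `speculation_flags` is a made-up name for illustration.

```python
def speculation_flags(map_enabled, reduce_enabled):
    """Build -D generic options that toggle speculation per task type."""
    def as_flag(key, value):
        return f"-D{key}={'true' if value else 'false'}"

    return [
        # Assumed Hadoop 2+ property names; older releases used other keys.
        as_flag("mapreduce.map.speculative", map_enabled),
        as_flag("mapreduce.reduce.speculative", reduce_enabled),
    ]


flags = speculation_flags(map_enabled=False, reduce_enabled=True)
print(" ".join(flags))
```

The printed options would go after the main class on a `hadoop jar` command line; the same two properties can also be set in the job configuration instead.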

Hi Kamboj,

OK, understood. But this should happen automatically: if a node goes down, its failed tasks are reassigned to another node. Have a look at the blog below. It has a good discussion of task failure and recovery.

http://timepasstechies.com/handling-failures-hadoopmapreduce-yarn/
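
The difference from speculation is that failure recovery is driven by timeouts, not by slowness. The toy model below shows the idea: a node is only declared lost after its heartbeat has been missing for an expiry interval, and only then are its tasks moved elsewhere. Everything here is hypothetical (`NodeTracker` is not a Hadoop class), and the 600-second expiry is an assumption that mirrors what I recall as the default node liveness interval in YARN; check your cluster's configuration for the real values.

```python
class NodeTracker:
    """Toy failure detector: a node is lost if its heartbeat is too old."""

    def __init__(self, expiry_s):
        self.expiry_s = expiry_s
        self.last_seen = {}  # node -> time of last heartbeat (seconds)
        self.running = {}    # node -> set of task ids assigned to it

    def heartbeat(self, node, now):
        self.last_seen[node] = now
        self.running.setdefault(node, set())

    def assign(self, task, node):
        self.running[node].add(task)

    def reschedule_lost(self, now):
        """Move tasks from expired nodes to the least-loaded live node."""
        live = sorted(n for n, seen in self.last_seen.items()
                      if now - seen <= self.expiry_s)
        lost = sorted(n for n in self.last_seen if n not in live)
        moved = {}
        if not live:
            return moved  # nowhere to move work; leave state untouched
        for node in lost:
            del self.last_seen[node]
            for task in sorted(self.running.pop(node)):
                target = min(live, key=lambda n: len(self.running[n]))
                self.running[target].add(task)
                moved[task] = target
        return moved


tracker = NodeTracker(expiry_s=600)  # assumed expiry interval
tracker.heartbeat("node-a", now=0)
tracker.heartbeat("node-b", now=0)
tracker.assign("map_3", "node-a")

tracker.heartbeat("node-b", now=300)
early = tracker.reschedule_lost(now=300)  # node-a silent for only 300 s
tracker.heartbeat("node-b", now=700)
late = tracker.reschedule_lost(now=700)   # node-a silent for 700 s
print(early, late)
```

In this model the task on the stopped node looks "stuck in running state" until the expiry passes, so a wait of several minutes before recovery is what timeout-based detection would be expected to produce.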

Thanks, MD.

The shared link has detailed information about node failure and recovery. The waiting time it describes helped me understand what actually happened. I re-ran the same job with speculative execution set to false and found that the jobs from the failed node completed after a wait of 17 minutes. On the RM, however, the completed jobs are still shown against the nodes they were initially submitted to, which are the nodes that went down.
