What happens when a datanode that is dead becomes active again?

If a DataNode that was marked dead becomes active again, how does Hadoop get rid of the extra replica that now exists? Is the copy on the returning node deleted, or is a copy deleted from one of the other nodes?
Jun 20 in Big Data Hadoop by Firoz

1 answer to this question.

When the NameNode has not received a heartbeat from a DataNode for a certain amount of time (10 minutes 30 seconds with the default settings), it marks that DataNode as dead. The blocks that were stored on it are now under-replicated, so the system begins re-replicating them from the surviving copies.
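As a rough illustration of where the default timeout comes from, the sketch below computes it from the two HDFS settings that control it, `dfs.namenode.heartbeat.recheck-interval` (milliseconds, default 300000) and `dfs.heartbeat.interval` (seconds, default 3). The function name is made up for this example; it is not part of any Hadoop API.

```python
def dead_node_timeout_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Time without heartbeats after which a DataNode is considered dead.

    Hypothetical helper: mirrors the formula
    2 * recheck interval + 10 * heartbeat interval.
    """
    return 2 * recheck_interval_ms + 10 * heartbeat_interval_s * 1000


timeout_ms = dead_node_timeout_ms()
print(timeout_ms)            # 630000
print(timeout_ms / 60_000)   # 10.5 (minutes)
```

Lowering either setting makes the cluster react faster to a lost node, at the cost of more re-replication traffic when a node is only briefly unreachable.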

The NameNode schedules this re-replication but does not move the data itself: it instructs a DataNode that holds a good replica to copy the block to another DataNode. The transfer happens directly between DataNodes, and the data never passes through the NameNode.

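When the dead DataNode comes back to the cluster and reports its blocks, those blocks become over-replicated. HDFS automatically deletes the excess replicas so that each block returns to its configured replication factor (3 by default). The replica that gets removed is not necessarily the one on the returning DataNode: the NameNode picks the copy to delete according to its block placement policy, which takes things like rack distribution and free disk space into account, so the excess copy may be removed from any node that holds the block.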
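The sequence described in this answer can be walked through with a small toy model. This is only a sketch: the node names, the free-space figures, and the rule for choosing the excess replica (least free space first) are assumptions made for illustration, not Hadoop's actual code or policy.

```python
def replication_state(holders, live_nodes, target=3):
    """Classify a block by how many of its replicas sit on live nodes."""
    live_count = sum(1 for node in holders if node in live_nodes)
    if live_count < target:
        return "under-replicated"
    if live_count > target:
        return "over-replicated"
    return "ok"


def pick_excess(holders, live_nodes, free_space, target=3):
    """Toy rule (an assumption): drop replicas from the fullest nodes first."""
    live_holders = [node for node in holders if node in live_nodes]
    excess = len(live_holders) - target
    if excess <= 0:
        return []
    return sorted(live_holders, key=lambda node: free_space[node])[:excess]


# Hypothetical cluster: one block stored on dn1, dn2 and dn3.
holders = ["dn1", "dn2", "dn3"]
live = {"dn1", "dn2"}                      # dn3 has stopped sending heartbeats
print(replication_state(holders, live))    # under-replicated

holders.append("dn4")                      # block is copied to dn4
live.add("dn4")
print(replication_state(holders, live))    # ok

live.add("dn3")                            # dn3 comes back with its old copy
print(replication_state(holders, live))    # over-replicated

free_space = {"dn1": 50, "dn2": 10, "dn3": 80, "dn4": 70}
print(pick_excess(holders, live, free_space))   # ['dn2']
```

With these made-up numbers the copy on dn2 is dropped rather than the one on the returning dn3, which shows that the outcome depends on the selection rule and not on which node rejoined.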
answered Jun 20 by Rishi
