Difference between ZKFC and ZooKeeper

Jul 30 in Big Data Hadoop by anonymous

1 answer to this question.

Hey,

The key difference is that ZooKeeper is the coordination service, while the ZKFC is a ZooKeeper client process that runs alongside each NameNode.

Automatic failover adds two new components to an HDFS deployment: a ZooKeeper quorum and the ZKFailoverController process (abbreviated as ZKFC).

  • Apache ZooKeeper is a highly available service for maintaining small amounts of coordination data, notifying clients of changes in that data, and monitoring clients for failures. Automatic HDFS failover relies on ZooKeeper for the following:
      1. Failure detection: Each NameNode machine in the cluster maintains a persistent session in ZooKeeper. If the machine crashes, the ZooKeeper session expires, notifying the other NameNode that a failover should be triggered.
      2. Active NameNode election: ZooKeeper provides a simple mechanism to exclusively elect one node as active. If the current active NameNode crashes, another node may take a special exclusive lock in ZooKeeper indicating that it should become the next active.
  • The ZKFailoverController (ZKFC) is a ZooKeeper client that also monitors and manages the state of the NameNode. Each machine that runs a NameNode also runs a ZKFC, and that ZKFC is responsible for:
      1. Health monitoring: The ZKFC periodically pings its local NameNode with a health-check command. As long as the NameNode responds promptly with a healthy status, the ZKFC considers the node healthy.
      2. ZooKeeper session management: While the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper. If the local NameNode is active, the ZKFC also holds a special "lock" znode. If the session expires, the lock znode is automatically deleted.
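To make the ZooKeeper side more concrete, here is a rough sketch of how a lock znode tied to a session could behave. It is a toy in-memory stand-in, not the real ZooKeeper client API: the `ToyCoordinator` class, its method names, the session ids, and the `/toy-ha/active-lock` path are all made up for illustration.

```python
from typing import Callable, Dict, List, Optional


class ToyCoordinator:
    """In-memory stand-in for a ZooKeeper quorum (illustrative only)."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}  # znode path -> owning session id
        self._watchers: Dict[str, List[Callable[[str], None]]] = {}

    def create_ephemeral(self, path: str, session: str) -> bool:
        """Create the znode if it is absent; return True only for the creator."""
        if path in self._owner:
            return False
        self._owner[path] = session
        return True

    def holder(self, path: str) -> Optional[str]:
        """Return the session that currently owns the znode, if any."""
        return self._owner.get(path)

    def watch_delete(self, path: str, callback: Callable[[str], None]) -> None:
        """Register a one-shot callback fired when the znode is deleted."""
        self._watchers.setdefault(path, []).append(callback)

    def expire_session(self, session: str) -> None:
        """Simulate a crash: delete every znode the session owned."""
        for path in [p for p, s in self._owner.items() if s == session]:
            del self._owner[path]
            for callback in self._watchers.pop(path, []):
                callback(path)


LOCK = "/toy-ha/active-lock"  # hypothetical path, not the one Hadoop uses

zk = ToyCoordinator()
events: List[str] = []

assert zk.create_ephemeral(LOCK, "zkfc-nn1")      # nn1 takes the lock
assert not zk.create_ephemeral(LOCK, "zkfc-nn2")  # nn2 loses, so it watches
zk.watch_delete(LOCK, lambda path: events.append("nn2 notified"))

zk.expire_session("zkfc-nn1")                     # nn1's machine crashes
assert zk.create_ephemeral(LOCK, "zkfc-nn2")      # nn2 can now take the lock
```

The point of the sketch is only that the lock disappears together with the session that created it, which is what lets the other node learn about the failure and take over.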

If the local NameNode is healthy, and the ZKFC sees that no other node currently holds the lock znode, it will itself try to acquire the lock. If it succeeds, then it has "won the election", and is responsible for running a failover to make its local NameNode active. 
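The ZKFC's decision described above can be summarized as a small function. This is a simplified sketch of the logic in this answer, not Hadoop's actual code: the function name, its parameters, and the returned action strings are hypothetical.

```python
from typing import Optional


def zkfc_next_action(me: str, local_healthy: bool, lock_holder: Optional[str]) -> str:
    """Return the step a ZKFC would take next (simplified, hypothetical names)."""
    if not local_healthy:
        # The session is only held open while the local NameNode is healthy,
        # so closing it also removes any lock znode this ZKFC holds.
        return "close_session"
    if lock_holder is None:
        # Nobody holds the lock: try to win the election.
        return "try_acquire_lock"
    if lock_holder == me:
        # Election won: make (or keep) the local NameNode active.
        return "run_failover_to_local_namenode"
    # Another node holds the lock: the local NameNode stays standby.
    return "remain_standby"


print(zkfc_next_action("nn2", True, None))
print(zkfc_next_action("nn2", True, "nn1"))
```

Each ZKFC would evaluate something like this whenever the health status or the lock state changes; the real controller also handles details such as fencing that are left out here.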

answered Jul 31 by Sunny
