This is one of the most important features of Hadoop 2.0. Before discussing the Namenode High Availability feature, it is essential to know what Quorum is. Quorum is a generic term used in clustering where we say a particular cluster is stable. Quorum gives a list of machines and helps to determine the health of the cluster. There are two types of Quorum: Expected Quorum and Calculated Quorum.
NameNode High Availability with Quorum Journal Manager (QJM)
Prior to Hadoop 2.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster. Each cluster had a single NameNode, and if that machine was unavailable, the cluster on the whole would be unavailable until the NameNode was either restarted or started on a separate machine. In a classic HA cluster, two separate machines are configured as NameNodes. At any point, one of the NameNodes will be in Active state and the other will be in a Standby state. The Active NameNode is responsible for all client operations in the cluster, while the Standby is simply acting as a slave, maintaining enough state to provide a fast failover.
In order for the Standby node to keep its state coordinated with the Active node, both nodes communicate with a group of separate daemons called ‘JournalNodes’ (JNs). When any namespace modification is performed by the Active node, it logs a record of the changes made, in the JournalNodes. The Standby node is capable of reading the amended information from the JNs, and is regularly monitoring them for changes. As the Standby Node sees the changes, it then applies them to its own namespace. In case of a failover, the Standby will make sure that it has read all the changes from the JounalNodes before changing its state to ‘Active state’. This guarantees that the namespace state is fully synched before a failover occurs.
To provide a fast failover, it is essential that the Standby node have to have the updated and current information regarding the location of blocks in the cluster. For this to happen, the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.
It is essential that only one of the NameNodes must be Active at a time. Otherwise, the namespace state would deviate between the two and lead to data loss or erroneous results. In order to avoid this, the JournalNodes will only permit a single NameNode to a writer at a time. During a failover, the NameNode which is to become active will take over the responsibility of writing to the JournalNodes.
Got a question for us? Please mention them in the comments section and we will get back to you.