Why are a minimum of 3 JournalNodes required in the Hadoop HA architecture?

0 votes
I have installed Hadoop in multi-node distributed mode and need high availability (HA architecture) for my cluster, so I am planning to set up HA using Quorum Journals. While going through the official documentation, I found that there must be at least 3 JournalNode daemons. Can anyone help me understand why we need 3 JournalNodes?
Apr 20, 2018 in Big Data Hadoop by Shubham
• 13,290 points
1,538 views

1 answer to this question.

0 votes
In Hadoop 1.x, the NameNode was a single point of failure: once the NameNode went down, the whole cluster went down. That is why Hadoop 2.x introduced the High Availability architecture, which runs 2 NameNodes: one is the active NameNode and the other is the passive (standby) NameNode.

To make the cluster highly available, both NameNodes must stay in sync. JournalNodes were introduced for this purpose. The active NameNode writes every edit log change to the JournalNodes, and the passive NameNode reads those edits from the JournalNodes and applies them to its own namespace.

Now imagine a setup with only one JournalNode, and that JournalNode fails. The whole purpose of high availability is defeated, because the JournalNode itself becomes the single point of failure.

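Edit log changes must be written to a majority of the JournalNodes, so more than half of the total JournalNodes must be healthy and running. With N JournalNodes, the system can tolerate at most (N - 1) / 2 failures. With 2 JournalNodes, more than half means both must be up and running, so you cannot tolerate any node failure in that setup.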
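The majority arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `majority` and `tolerated_failures` are made up for this example and are not part of any Hadoop API.

```python
def majority(n: int) -> int:
    """Smallest number of JournalNodes that is more than half of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many JournalNodes can be down while a majority is still up."""
    return n - majority(n)  # same value as (n - 1) // 2


# Compare a few cluster sizes (hypothetical counts, for illustration only).
for n in range(1, 6):
    print(f"{n} JournalNodes: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Note that 4 JournalNodes tolerate no more failures than 3 do, which is why an odd number of JournalNodes is usually recommended.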

Thus, a minimum of 3 JournalNodes is suggested, as that setup can handle the failure of one JournalNode.
answered Apr 20, 2018 by kurt_cobain
• 9,240 points
