Commissioning and Decommissioning Nodes in a Hadoop Cluster | Edureka


One of the most attractive features of the Hadoop framework is that it runs on commodity hardware. The trade-off is that DataNodes in a Hadoop cluster fail fairly often. Another striking feature of Hadoop is how easily it scales as data volume grows. For these two reasons, one of the most common tasks of a Hadoop administrator is to commission (add) and decommission (remove) DataNodes in a Hadoop cluster.

Commissioning and Decommissioning Nodes in a Hadoop Cluster:

The diagram above shows the step-by-step process to decommission a DataNode in the cluster.

The first task is to update the ‘exclude’ files for both HDFS and MapReduce. Their locations are set in hdfs-site.xml and mapred-site.xml respectively.

The ‘exclude’ file:

for the JobTracker contains the list of hosts that the JobTracker should exclude. If the value is empty, no hosts are excluded.
for the NameNode contains the list of hosts that are not permitted to connect to the NameNode. If the value is empty, no hosts are excluded.

Here is the sample configuration for the exclude file in hdfs-site.xml and mapred-site.xml:

hdfs-site.xml

<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>

mapred-site.xml

<property>
<name>mapred.hosts.exclude</name>
<value>/home/hadoop/excludes</value>
<final>true</final>
</property>

Note: The full pathname of the files must be specified.
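
The exclude file itself is a plain text file with one hostname per line. A minimal sketch of adding a node to it follows. It assumes the /home/hadoop/excludes path from the configuration above; the helper name add_to_excludes and the hostname are hypothetical and used only for illustration.

```shell
# Hypothetical helper: append a host to an exclude file, one hostname per line.
# The append is skipped if the host is already listed.
add_to_excludes() {
    host="$1"
    file="$2"
    touch "$file"
    # -x matches the whole line, -F treats the host as a literal string
    if ! grep -qxF "$host" "$file"; then
        echo "$host" >> "$file"
    fi
}

# Example call (assumed hostname, path taken from the sample configuration):
# add_to_excludes datanode05.example.com /home/hadoop/excludes
```

Checking for an existing entry first keeps the file free of duplicates when the same node is processed more than once.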

Similarly, we have the ‘include’ files (set through the dfs.hosts and mapred.hosts properties):

for the JobTracker, containing the list of nodes that may connect to the JobTracker. If the value is empty, all hosts are permitted.
for the NameNode, containing the list of hosts that are permitted to connect to the NameNode. If the value is empty, all hosts are permitted.

The ‘dfsadmin’ and ‘mradmin’ commands, run with the -refreshNodes option, make the NameNode and the JobTracker re-read the include and exclude files so that they become aware of the change.
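
As a rough sketch, the refresh step could be scripted as follows. The run wrapper and the DRY_RUN variable are assumptions added for illustration so that the commands are only printed by default; the hadoop subcommands are the Hadoop 1.x forms, matching the JobTracker-based setup described here.

```shell
# Print each command by default; set DRY_RUN=0 to actually run it on the master.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

refresh_nodes() {
    run hadoop dfsadmin -refreshNodes   # NameNode re-reads its include/exclude files
    run hadoop mradmin -refreshNodes    # JobTracker re-reads its include/exclude files
    run hadoop dfsadmin -report         # lists each DataNode with its decommission status
}

refresh_nodes
```

After a node is added to the exclude file, the report shows it moving from "Decommission in progress" to "Decommissioned" once its blocks have been replicated elsewhere.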

The ‘slaves’ file on the master server contains the list of all DataNodes. It must also be updated to avoid issues when the Hadoop daemons are started or stopped in the future.

An important step in the DataNode commissioning process is to run the cluster balancer.

>hadoop balancer -threshold 40

The balancer moves block data from heavily used DataNodes to under-used ones, such as newly commissioned nodes, until each node's disk utilization is within the given threshold (a percentage) of the cluster-wide utilization.
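
To make the threshold concrete, here is a small arithmetic sketch of the balance condition: a node counts as balanced when its utilization differs from the cluster-wide utilization by no more than the threshold, in percentage points. The function name is_balanced and the numbers are illustrative assumptions, not output from a real cluster.

```shell
# Succeeds if a node's utilization is within `threshold` percentage points
# of the cluster-wide utilization (all values are integer percentages).
is_balanced() {
    node="$1"
    cluster="$2"
    threshold="$3"
    diff=$((node - cluster))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$threshold" ]
}

# Illustrative numbers: cluster at 50%, new node at 5%, threshold 40.
# |5 - 50| = 45, which exceeds 40, so the node still needs blocks moved to it.
if is_balanced 5 50 40; then
    echo "balanced"
else
    echo "needs rebalancing"
fi
```

A threshold of 40 is therefore quite loose. A smaller value gives a more even spread of data at the cost of a longer balancer run.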

So, this is how you commission and decommission nodes in a Hadoop cluster.


Got a question for us? Please mention it in the comments section and we will get back to you.


Comments

15 Comments

Masked says:
Feb 19, 2018 at 4:22 am GMT
Hi, I am gonna start Hadoop cluster setup project. Could anyone help me in identifying the list of task/activities involved in this. I need to create a project plan for this.
- EdurekaSupport says:
  Feb 22, 2018 at 6:59 am GMT
  Greetings, we’d recommend you to go through our blogs on Hadoop single node installation first and then our blog on Hadoop multi node installation. This will help you in getting a better idea and draw out a project plan better. Here is the link:
  1. https://www.edureka.co/blog/install-hadoop-single-node-hadoop-cluster
  2. https://www.edureka.co/blog/setting-up-a-multi-node-cluster-in-hadoop-2.X
  You can revert back to us for any other query. Hope this helps :)
dsv dinesh says:
Mar 22, 2017 at 8:01 pm GMT
/home/hadoop/excludes – In this path is it EXCLUDES or EXCLUDE?
- EdurekaSupport says:
  Mar 24, 2017 at 2:46 pm GMT
  Hey Dinesh, thanks for checking out our blog.
  Well, dfs.hosts.exclude is just a property in hdfs-site.xml (mapred-site.xml uses mapred.hosts.exclude). You can check these files by going to the place where Hadoop is installed. They are configuration files, so you will find them in the etc folder of Hadoop. Basically, the property name dfs.hosts.exclude is fixed, and Hadoop expects only the word “exclude” there. Its value is the file that holds the list. In our example the filename is excludes. Its name can even be /home/hadoop/myCoolName, and then you will have to write
  <property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hadoop/myCoolName</value>
  <final>true</final>
  </property>
  Hope this helps. Cheers!
  - dsv dinesh says:
    Mar 25, 2017 at 6:32 am GMT
    Thanks for reverting me back
Raj Kiran says:
Jul 19, 2016 at 4:05 pm GMT
I got enough information from your website. But there is a mistake in the flow diagram for “Commissioning”: the 3rd block should show hadoop mradmin -refreshNodes. Please check and update it so that everyone knows the exact flow.
- EdurekaSupport says:
  Aug 4, 2016 at 7:18 am GMT
  Thanks for taking the time out to go through our blog in detail. We are sorry you encountered an error. We have flagged this to the concerned folks and they are working on fixing it ASAP. Thanks again and do keep checking back in whenever you have the time.
Akshat says:
Mar 17, 2015 at 7:37 pm GMT
Beautiful !! Thanks a ton !!
- EdurekaSupport says:
  Mar 18, 2015 at 10:37 am GMT
  You are welcome Akshat!!
K Sandeep says:
Nov 6, 2014 at 5:49 am GMT
very good explanation..!
- EdurekaSupport says:
  Nov 12, 2014 at 6:35 am GMT
  Thanks Sandeep. Feel free to check out our other posts as well.
Santosh says:
Aug 7, 2014 at 8:46 am GMT
Good blog
- EdurekaSupport says:
  Aug 8, 2014 at 4:33 am GMT
  Thanks Santosh!
Bala Chandar says:
Jul 29, 2014 at 1:56 pm GMT
Super Explanation.
- EdurekaSupport says:
  Jul 30, 2014 at 8:32 am GMT
  Thanks Bala! Feel free to go through our other blog posts as well.
