How to configure Hosts file for Hadoop Eco-System?

This question may seem obvious, but I have run into it many times because of a badly configured hosts file on a Hadoop cluster.

Can anyone describe how to set up the hosts file and the related network configuration for Hadoop and similar environments (such as Cloudera)?

In particular, how should the entries look when I have to add both the hostname and the FQDN?

Here is the hosts file from one of the machines, hostname cdh4hdm, which has the role of Hadoop master:

 127.0.0.1       cdh4hdm        localhost
  #127.0.1.1      cdh4hdm 

 # The following lines are desirable for IPv6 capable hosts

   172.26.43.40    cdh4hdm.imp.co.in            kdc1
   172.26.43.41    cdh4hbm.imp.co.in   
   172.26.43.42    cdh4s1.imp.co.in    
   172.26.43.43    cdh4s2.imp.co.in    
   172.26.43.44    cdh4s3.imp.co.in    
   172.26.43.45    cdh4s4.imp.co.in    

   ::1     ip6-localhost ip6-loopback
   fe00::0 ip6-localnet
   ff00::0 ip6-mcastprefix
   ff02::1 ip6-allnodes
   ff02::2 ip6-allrouters 


On this cluster, some nodes resolve to the FQDN and others to the short hostname.

Also, the hostname resolves to 127.0.0.1 instead of the host's actual IP.

Please suggest a fix.

Sep 25, 2018 in Big Data Hadoop by Neha
• 6,280 points
259 views

1 answer to this question.

For Ubuntu

Hosts file and other configuration for a Hadoop cluster

Give every machine in the cluster a hostname by putting it in the /etc/hostname file:

hostname-of-machine

On every host, the hosts file should look like this:

/etc/hosts

127.0.0.1       localhost
#127.0.1.1      localhost

# IP address    FQDN                hostname    other_name
172.26.43.10    cdh4hdm.domain.com  cdh4hdm     kdc1
172.26.43.11    cdh4hbm.domain.com  cdh4hbm
172.26.43.12    cdh4s1.domain.com   cdh4s1
172.26.43.13    cdh4s2.domain.com   cdh4s2
172.26.43.14    cdh4s3.domain.com   cdh4s3
172.26.43.15    cdh4s4.domain.com   cdh4s4


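Make sure the 127.0.1.1 line is commented out (Ubuntu adds it by default, pointing the machine's own hostname at a loopback address); it can cause problems for ZooKeeper and the rest of the cluster.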
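
To catch the loopback problem before starting any daemons, something like the following could be run against the contents of /etc/hosts. This is only a minimal sketch: the function names (parse_hosts, find_problems) and the set of allowed loopback names are my own assumptions, not part of any Hadoop tooling. It flags any non-localhost name on a loopback address, and any line where a short name is listed before the FQDN (the first name on a line is the canonical one, which is what hostname -f reports).

```python
import ipaddress

# Names that may legitimately sit on a loopback address (assumed list).
LOOPBACK_NAMES = {"localhost", "localhost.localdomain",
                  "ip6-localhost", "ip6-loopback"}


def parse_hosts(text):
    """Return a list of (ip, [names]) tuples from hosts-file text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with an IP address
        entries.append((fields[0], fields[1:]))
    return entries


def find_problems(text):
    """Return human-readable warnings for suspicious hosts entries."""
    problems = []
    for ip, names in parse_hosts(text):
        if ipaddress.ip_address(ip).is_loopback:
            # A cluster hostname on 127.x.x.x makes daemons bind to loopback.
            for name in names:
                if name not in LOOPBACK_NAMES:
                    problems.append(
                        f"{name} is mapped to loopback address {ip}")
        elif (names and "." not in names[0]
              and any("." in n for n in names[1:])):
            # The FQDN should come first so it becomes the canonical name.
            problems.append(
                f"{ip}: short name {names[0]} is listed before the FQDN")
    return problems
```

For example, find_problems(open("/etc/hosts").read()) should return an empty list on a host configured as shown above.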

Add the DNS server's IP address to /etc/resolv.conf:

/etc/resolv.conf

search domain.com
nameserver 10.0.1.1

To verify the configuration, check the hosts file; you should be able to ping every machine by its hostname.

To check the hostname and FQDN, run the following commands on every machine:

hostname        # should return the short hostname
hostname -f     # fully qualified domain name (FQDN)
hostname -d     # domain name

The same steps apply on RHEL, except for how the hostname itself is set.
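
The same resolution check can be scripted for a list of node names. The sketch below is only an illustration: check_names is a hypothetical helper of mine, and the resolve parameter exists just so the lookup can be swapped out. It reports, for each name, the resolved IP, a loopback warning, or a lookup failure.

```python
import ipaddress
import socket


def check_names(names, resolve=socket.gethostbyname):
    """Map each name to its resolved IPv4 address or to a warning string."""
    report = {}
    for name in names:
        try:
            ip = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            report[name] = f"unresolved: {exc}"
            continue
        if ipaddress.ip_address(ip).is_loopback:
            # This is the symptom from the question: hostname -> 127.0.0.1
            report[name] = f"loopback: {ip}"
        else:
            report[name] = ip
    return report
```

On a node, calling check_names([socket.gethostname(), socket.getfqdn()]) covers roughly what hostname and hostname -f print; both names should map to the host's real IP, not to a loopback address.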

AND

If you mean the /etc/hosts file, here is how I have set it up in my Hadoop cluster:

127.0.0.1       localhost
192.168.0.5     master
192.168.0.6     slave1
192.168.0.7     slave2
192.168.0.18    slave3
192.168.0.3     slave4
192.168.0.4     slave5  nameOfCurrentMachine

where nameOfCurrentMachine is the hostname of the machine this file lives on, which in this example acts as slave5. Some people say that the first line (127.0.0.1 localhost) should be removed, but I have not had any issues with it, nor have I tried removing it.

Then, the $HADOOP_CONF_DIR/masters file on the master node should be:

master

and the $HADOOP_CONF_DIR/slaves file on the master node should be:

slave1
slave2
slave3
slave4
slave5

On every other node, I have simply set these two files to contain just:

localhost
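
Since the masters and slaves files refer to nodes by name, every name in them has to be resolvable through /etc/hosts (or DNS). A rough cross-check could look like the sketch below; names_in_hosts and missing_workers are hypothetical helper names, and the check only looks at the hosts file, so names that are served by DNS alone would show up as missing.

```python
def names_in_hosts(hosts_text):
    """Collect every hostname or alias that appears in hosts-file text."""
    names = set()
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # strip comments, then split
        names.update(fields[1:])  # everything after the IP is a name
    return names


def missing_workers(workers_text, hosts_text):
    """Return names from a masters/slaves file that have no hosts entry."""
    known = names_in_hosts(hosts_text)
    workers = [w.strip() for w in workers_text.splitlines() if w.strip()]
    return [w for w in workers if w not in known]
```

On the master, passing the contents of $HADOOP_CONF_DIR/slaves and /etc/hosts should give an empty list when the two files agree.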

You should also make sure that you can ssh from the master to every other node (using its name, not its IP) without a password. This post describes how to achieve that.
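
Whether passwordless login actually works can be tested without typing anything by running ssh in batch mode, which makes it fail instead of prompting for a password. The following is only a sketch under that assumption: ssh_check_command and can_ssh_without_password are names I made up, and the run parameter is there only so the call can be replaced in a dry run.

```python
import subprocess


def ssh_check_command(host, timeout=5):
    """Build an ssh command that fails rather than asking for a password."""
    return ["ssh",
            "-o", "BatchMode=yes",               # never prompt for a password
            "-o", f"ConnectTimeout={timeout}",   # give up quickly if down
            host, "true"]                        # run a no-op on the remote


def can_ssh_without_password(host, timeout=5, run=subprocess.run):
    """Return True if a key-based login to host succeeds."""
    result = run(ssh_check_command(host, timeout), capture_output=True)
    return result.returncode == 0
```

Running can_ssh_without_password(name) from the master for each name in the slaves file shows which nodes still need the master's public key.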

answered Sep 25, 2018 by Frankie
• 9,810 points
