Can anyone help me install and configure a Hadoop multi-node cluster?

Hi, I am new to Hadoop and was recently certified by an online training institute. I want to set up a Hadoop cluster of my own on four systems, one master and three workers, all running Windows. Despite numerous rounds of trial and error, I still cannot get the cluster set up. I am ready to switch to Linux as well. Can anyone please help or guide me in setting up my Hadoop cluster?
May 28 in Big Data Hadoop by nitinrawat895

edited May 28 by nitinrawat895

1 answer to this question.


If you are ready to switch to Linux, follow these steps.

Step 1: Get rid of Windows. Hadoop is best supported on Linux machines. You can use Ubuntu 14.04 or later (or CentOS, Red Hat, etc.).

Step 2: Install and set up Java

$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:ferramroberto/java
$ sudo apt-get update
$ sudo apt-get install sun-java6-jdk

# Select Sun's Java as the default on your machine.
# See 'sudo update-alternatives --config java' for more information.    
#
$ sudo update-java-alternatives -s java-6-sun

Step 3: Set the path in the .bashrc file (open the file in a text editor such as vi or nano and append the text below; adjust JAVA_HOME to the directory where your JDK is actually installed)

export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin
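
A quick way to sanity-check the JAVA_HOME setting above before moving on is a small shell function like the sketch below. The name check_java_home is made up for illustration; all it does is test that the directory contains an executable bin/java.

```shell
# check_java_home DIR: succeed only if DIR contains an executable bin/java.
# (Hypothetical helper, not part of Hadoop or the JDK.)
check_java_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$dir/bin/java" ]; then
    echo "no executable java found in $dir/bin" >&2
    return 1
  fi
  echo "JAVA_HOME looks valid: $dir"
}

# Usage, after reloading .bashrc:
#   check_java_home "$JAVA_HOME"
```

If the check fails, fix the path in .bashrc and open a new shell before continuing.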

Step 4: Add a dedicated user (not required, but recommended). Run these as root:

# useradd hadoop 
# passwd hadoop

Step 5: Edit the hosts file in the /etc/ folder on all nodes, specifying the IP address of each system followed by its host name (open the file with vi /etc/hosts and append the text below):

<ip address of master node> hadoop-master 
<ip address of slave node 1> hadoop-slave-1 
<ip address of slave node 2> hadoop-slave-2
<ip address of slave node 3> hadoop-slave-3
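
To confirm that every node's hosts file really contains all four names, something like the following sketch could be run on each machine. The function name missing_hosts is hypothetical, and the check only looks at the file itself; it does not test that the addresses are reachable.

```shell
# missing_hosts FILE NAME...: print each NAME that has no entry in FILE.
# (Hypothetical helper; FILE is expected to use /etc/hosts syntax.)
missing_hosts() {
  file="$1"
  shift
  for name in "$@"; do
    # An entry is a non-comment line whose 2nd or later field equals NAME.
    awk -v n="$name" '
      /^[ \t]*#/ { next }
      { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }
    ' "$file" || echo "$name"
  done
}

# Usage:
#   missing_hosts /etc/hosts hadoop-master hadoop-slave-1 hadoop-slave-2 hadoop-slave-3
```

No output means all the names are present.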

Step 6: Set up SSH on every node so that the nodes can communicate with one another without prompting for a password.

$ su hadoop
$ ssh-keygen -t rsa 
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master 
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-2
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-3
$ chmod 0600 ~/.ssh/authorized_keys 
$ exit

For more information on SSH, see https://www.ssh.com/ssh/
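
Before going further it is worth confirming that the key-based logins actually work. The sketch below is one possible check, run as the hadoop user. The function name check_passwordless_ssh is made up; it relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```shell
# check_passwordless_ssh HOST...: report which hosts accept key-based login.
# (Hypothetical helper; returns non-zero if any host fails.)
check_passwordless_ssh() {
  failed=0
  for host in "$@"; do
    # BatchMode=yes: never prompt; ConnectTimeout=5: give up after 5 seconds.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
      echo "ok: $host"
    else
      echo "FAILED: $host"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_passwordless_ssh hadoop-master hadoop-slave-1 hadoop-slave-2 hadoop-slave-3
```

Any host reported as FAILED needs its ssh-copy-id step repeated.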

Step 7: On the master server, download and install Hadoop.

# mkdir /opt/hadoop
# cd /opt/hadoop/
# wget http://apache.mesi.com.ar/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
# tar -xzf hadoop-1.2.1.tar.gz
# mv hadoop-1.2.1 hadoop
# chown -R hadoop /opt/hadoop
# cd /opt/hadoop/hadoop/
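
If the tar step fails, the download may be incomplete. A minimal sketch for testing the archive before unpacking it is shown below. The function name verify_tarball is hypothetical, and it only checks that the file can be listed as a gzip-compressed tar archive, not that it is an authentic Hadoop release.

```shell
# verify_tarball FILE: succeed only if FILE is a readable gzip'd tar archive.
# (Hypothetical helper; it lists the archive without extracting anything.)
verify_tarball() {
  if tar -tzf "$1" >/dev/null 2>&1; then
    echo "archive OK: $1"
  else
    echo "archive is missing or corrupt: $1" >&2
    return 1
  fi
}

# Usage:
#   verify_tarball path/to/the-downloaded-archive.tar.gz
```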

Installation is finished here!

Next step is: Configuring Hadoop

Step 1: Open core-site.xml (in /opt/hadoop/hadoop/conf, like the other configuration files below) and edit it as below:

<configuration>
<property> 
  <name>fs.default.name</name> 
  <value>hdfs://hadoop-master:9000/</value> 
</property> 
<property> 
  <name>dfs.permissions</name> 
  <value>false</value> 
</property> 
</configuration>

Step 2: Open hdfs-site.xml and edit it as below:

<configuration>
<property> 
  <name>dfs.data.dir</name> 
  <value>/opt/hadoop/hadoop/dfs/name/data</value> 
  <final>true</final> 
</property> 

<property> 
  <name>dfs.name.dir</name> 
  <value>/opt/hadoop/hadoop/dfs/name</value> 
  <final>true</final> 
</property> 

<property> 
  <name>dfs.replication</name> 
  <value>3</value> 
</property> 
</configuration>
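
After editing, it can help to read a setting back out of the file to catch typos. The sketch below is a rough check, not a real XML parser: it assumes each name and value tag sits on its own line, as in the snippets above. The function name get_property is made up for illustration.

```shell
# get_property FILE NAME: print the <value> that follows <name>NAME</name>.
# (Hypothetical helper; assumes one tag per line, as in the snippets above.)
get_property() {
  awk -v want="$2" '
    /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
    /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                if (n == want) { print v; exit } }
  ' "$1"
}

# Usage:
#   get_property /opt/hadoop/hadoop/conf/hdfs-site.xml dfs.replication
#   get_property /opt/hadoop/hadoop/conf/core-site.xml fs.default.name
```

An empty result means the property was not found in that file.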

Step 3: Open mapred-site.xml and edit it as below:

<configuration>
<property> 
  <name>mapred.job.tracker</name> 
  <value>hadoop-master:9001</value> 
</property> 
</configuration>

Step 4: Append the text below to hadoop-env.sh (JAVA_HOME must point to the directory where your JDK is actually installed)

export JAVA_HOME=/opt/jdk1.7.0_17
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_CONF_DIR=/opt/hadoop/hadoop/conf
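
If you repeat this step after a failed attempt, the same export lines can end up in the file several times. One way to avoid that is to append a line only when it is not already there, as in this sketch. The function name append_once is hypothetical.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
# (Hypothetical helper; creates FILE if it does not exist yet.)
append_once() {
  file="$1"
  line="$2"
  # -x: match the whole line, -F: treat LINE as a fixed string, -q: quiet.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   append_once /opt/hadoop/hadoop/conf/hadoop-env.sh \
#     'export HADOOP_CONF_DIR=/opt/hadoop/hadoop/conf'
```

The same helper would work for the .bashrc and /etc/hosts edits in the earlier steps.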

Step 5: Configure the master (Hadoop 1.x keeps this file in the conf directory) --

$ vi conf/masters
hadoop-master

Step 6: Install it on the slave nodes as well --

# su hadoop
$ cd /opt/hadoop
$ scp -r hadoop hadoop-slave-1:/opt/hadoop
$ scp -r hadoop hadoop-slave-2:/opt/hadoop
$ scp -r hadoop hadoop-slave-3:/opt/hadoop

Step 7: Configure slaves --

$ vi conf/slaves
hadoop-slave-1
hadoop-slave-2
hadoop-slave-3

Step 8: Format the NameNode (ONLY ONCE; REFORMATTING PERMANENTLY ERASES ALL DATA IN HDFS)

# su hadoop
$ cd /opt/hadoop/hadoop
$ bin/hadoop namenode -format
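
Because formatting twice wipes HDFS, you may want a guard around the format command. The sketch below refuses to run it when the name directory already looks formatted. It assumes a formatted Hadoop 1.x name directory contains a current/VERSION file; the function names are hypothetical, and the directory is the dfs.name.dir value from hdfs-site.xml.

```shell
# name_dir_is_formatted DIR: succeed if DIR already holds NameNode metadata.
# (Assumption: a formatted name directory contains current/VERSION.)
name_dir_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# format_once DIR CMD...: run CMD only if DIR has not been formatted yet.
format_once() {
  dir="$1"
  shift
  if name_dir_is_formatted "$dir"; then
    echo "refusing to format: $dir already contains current/VERSION" >&2
    return 1
  fi
  "$@"
}

# Usage (as the hadoop user, from /opt/hadoop/hadoop):
#   format_once /opt/hadoop/hadoop/dfs/name bin/hadoop namenode -format
```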

You are all set!!

You can start the services as follows --

$ cd /opt/hadoop/hadoop
$ bin/start-all.sh

This should resolve your issue.
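
To see whether everything came up, you can run jps on each node and compare the result against the daemons you expect there. The sketch below does that comparison. The function name missing_daemons is made up; the daemon names in the usage lines are the ones a Hadoop 1.x master and slave normally run.

```shell
# missing_daemons NAME...: read `jps` output on stdin, print absent daemons.
# (Hypothetical helper; jps prints one "PID ClassName" pair per line.)
missing_daemons() {
  running=$(awk '{ print $2 }')
  for name in "$@"; do
    # -x: whole-line match, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$running" | grep -qx -- "$name" || echo "$name"
  done
}

# Usage on the master:
#   jps | missing_daemons NameNode SecondaryNameNode JobTracker
# Usage on each slave:
#   jps | missing_daemons DataNode TaskTracker
```

No output means every expected daemon is running; otherwise check the logs directory under /opt/hadoop/hadoop for the daemon that is listed.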

answered May 28 by ravikiran