Can anyone help me with the installation and configuration procedures of Hadoop Multi-Node Cluster

Question

Hi, Am a fresher to the Hadoop Technology and I have got recently certified from an online training institute. I am looking for setting up a Hadoop Cluster of my own with a handful of four systems which includes one master and three workers and all are running on Windows operating systems. Regardless of numerous trial and errors, am still unable to get the Hadoop Cluster set up fixed. I am ready to switch to Linux as well. Can anyone please help me or guide me to set up my Hadoop Cluster?

ravikiran · Answer 1 · May 28, 2019

If you are ready to switch to Linux then follow these steps

Step 1: Get rid of windows. Currently, Hadoop is available for Linux machines. You can have ubuntu 14.04 or later versions (or CentOS, Redhat etc)

Step 2: Install and setup Java $ sudo apt-get install python-software-properties $ sudo add-apt-repository ppa:ferramroberto/java $ sudo apt-get update $ sudo apt-get install sun-java6-jdk

# Select Sun's Java as the default on your machine.
# See 'sudo update-alternatives --config java' for more information.    
#
$ sudo update-java-alternatives -s java-6-sun

Step 3: Set the path in .bashrc file (open this file using a text editor(vi/nano) and append the below text)

export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=PATH:$JAVA_HOME/bin

Step 4: Add a dedicated user (While that’s not required it is recommended)

# useradd hadoop 
# passwd hadoop

Step 5: Edit hosts file in /etc/ folder on all nodes, specify the IP address of each system followed by their host names.( open the file in using vi /etc/hosts and append the text below --

<ip address of master node> hadoop-master 
<ip address of slave node 1> hadoop-slave-1 
<ip address of slave node 2> hadoop-slave-2
<ip address of slave node 3> hadoop-slave-3

Step 6: Setup ssh in every node such that they can communicate with one another without any prompt for password.

$ su hadoop
$ ssh-keygen -t rsa 
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master 
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop_tp1@hadoop-slave-1 
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop_tp2@hadoop-slave-2
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop_tp3@hadoop-slave-3
$ chmod 0600 ~/.ssh/authorized_keys 
$ exit

for more information on SSH go to [https://www.ssh.com/ssh/][1]

Step 7: In master, server download and install Hadoop.

# mkdir /opt/hadoop 
# cd /opt/hadoop/ 
# wget http://apache.mesi.com.ar/hadoop/common/hadoop-1.2.1/hadoop-1.2.0.tar.gz 
# tar -xzf hadoop-1.2.0.tar.gz 
# mv hadoop-1.2.0 hadoop
# chown -R hadoop /opt/hadoop 
# cd /opt/hadoop/hadoop/

Installation is finished here!

Next step is: Configuring Hadoop

Step 1: Open core-site.xml and edit it as below :

<configuration>
<property> 
  <name>fs.default.name</name> 
  <value>hdfs://hadoop-master:9000/</value> 
</property> 
<property> 
  <name>dfs.permissions</name> 
  <value>false</value> 
</property> 
</configuration>

Step 2: open hdfs-site.xml and edit it as below :

<configuration>
<property> 
  <name>dfs.data.dir</name> 
  <value>/opt/hadoop/hadoop/dfs/name/data</value> 
  <final>true</final> 
</property> 

<property> 
  <name>dfs.name.dir</name> 
  <value>/opt/hadoop/hadoop/dfs/name</value> 
  <final>true</final> 
</property> 
 <property> 
  <name>dfs.name.dir</name> 
  <value>/opt/hadoop/hadoop/dfs/name</value> 
  <final>true</final> 
</property> 

<property> 
  <name>dfs.replication</name> 
  <value>3</value> 
</property> 
</configuration>

Step 3: open MapRed-site.xml and edit --

<configuration>
<property> 
  <name>mapred.job.tracker</name> 
  <value>hadoop-master:9001</value> 
</property> 
</configuration>

Step 4: Append below text in Hadoop-env.sh

export JAVA_HOME=/opt/jdk1.7.0_17 export 
HADOOP_OPTS=Djava.net.preferIPv4Stack=true export 
HADOOP_CONF_DIR=/opt/hadoop/hadoop/conf

Step 5: Configure master --

$ vi etc/hadoop/masters 
hadoop-master

Step 5: Install it on slave nodes as well --

# su hadoop 
$ cd /opt/hadoop 
$ scp -r hadoop hadoop-slave-1:/opt/hadoop 
$ scp -r hadoop hadoop-slave-2:/opt/hadoop
$ scp -r hadoop hadoop-slave-3:/opt/hadoop

Step 6: Configure slaves --

$ vi etc/hadoop/slaves
hadoop-slave-1 
hadoop-slave-2
hadoop-slave-3

Step 7: format the nodes (ONLY ONE TIME OTHERWISE ALL THE DATA WILL BE LOST PERMANENTLY)

# su hadoop 
$ cd /opt/hadoop/hadoop 
$ bin/hadoop namenode –format

You are all set!!

You can start the services as follows --

$ cd $HADOOP_HOME/sbin
$ start-all.sh

This must resolve your issue.

Can anyone help me with the installation and configuration procedures of Hadoop Multi-Node Cluster

Your comment on this question:

1 answer to this question.

Your answer

Your comment on this answer:

Related Questions In Big Data Hadoop

when i tried to run the hdfs namenode -format command...its failed to run..can anyone help me with this???

while executing this iam getting this error can anyone please help me with the solution please

How to install and configure a multi-node Hadoop cluster?

What are some of the famous visualization tools which can be integrated with Hadoop & Hive?

How do I verify the kernel parameters to check for the pre-requisities of Oracle 11 + versions?

I get an error stating- Hadoop: «ERROR : JAVA_HOME is not set». How to resolve this?

How to run Map Reduce program using Ubuntu terminal?

Copy files to all Hadoop DFS directories

Can anyone help me in installing and configuring a Multi-Node Hadoop Cluster?

Can anyone help me out to install the following packages R-MR, R-HDFS, and R-HBase on R-HAdoop?

Subscribe to our Newsletter, and get personalized recommendations.

TRENDING CERTIFICATION COURSES

TRENDING MASTERS COURSES

COMPANY

WORK WITH US

DOWNLOAD APP

CATEGORIES

CATEGORIES

TRENDING BLOG ARTICLES

TRENDING BLOG ARTICLES