The total number of Data Nodes running in this operation are 0 and note that there are no nodes excluded in this process.

0 votes

I have set up a multi-node Hadoop Cluster. The NameNode and Secondary name node runs on the same machine and the cluster has only one Datanode. All the nodes are configured on Amazon EC2 machines.

Following are the configuration files on the master node:

masters
54.68.218.192 (public IP of the master node)

slaves
54.68.169.62 (public IP of the slave node)

core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>

Now are the configuration files on the data node:

core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://54.68.218.192:10001</value>
</property>
</configuration>

mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>54.68.218.192:10002</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>

the jps run on the Namenode give the following:

5696 NameNode
6504 Jps
5905 SecondaryNameNode
6040 ResourceManager

and jps on data node:

2883 DataNode
3496 Jps
3381 NodeManager

which to me seems right.

Now when I try to run a put command:

hadoop fs -put count_inputfile /test/input/

It gives me the following error:

put: File /count_inputfile._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

The logs on the data node say the following:

hadoop-datanode log
INFO org.apache.hadoop.ipc.Client: Retrying connect to server:      54.68.218.192/54.68.218.192:10001. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

yarn-nodemanager log:

INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

The web UI of node manager(50070) shows that there are 0 live nodes and 0 dead nodes and the dfs used is 100%

I have also disabled IPV6.

On a few websites, I found out that I should also edit the /etc/hosts file. I have also edited them and they look like this:

127.0.0.1 localhost
172.31.25.151 ip-172-31-25-151.us-west-2.compute.internal
172.31.25.152 ip-172-31-25-152.us-west-2.compute.internal

Why I am still geting the error?

Jul 31 in Big Data Hadoop by nitinrawat895
• 10,490 points

retagged Jul 31 by nitinrawat895 18 views

1 answer to this question.

0 votes

Follow these steps:

STEP 1 : stop hadoop and clean temp files from hduser

sudo rm -R /tmp/*

also, you may need to delete and recreate /app/hadoop/tmp (mostly when I change hadoop version from 2.2.0 to 2.7.0)

sudo rm -r /app/hadoop/tmp
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
sudo chmod 750 /app/hadoop/tmp

STEP 2: format namenode

hdfs namenode -format

Now, I can see DataNode

hduser@prayagupd:~$ jps
19135 NameNode
20497 Jps
19477 DataNode
20447 NodeManager
19902 SecondaryNameNode
20106 ResourceManager
answered Jul 31 by ravikiran
• 4,200 points

Related Questions In Big Data Hadoop

0 votes
1 answer

How to find the running namenodes and secondary name nodes in hadoop?

Name nodes: hdfs getconf -namenodes Secondary name nodes: hdfs getconf ...READ MORE

answered Nov 26, 2018 in Big Data Hadoop by Omkar
• 67,290 points
58 views
0 votes
0 answers

Number of tasks submitted and executed within the day in hadoop

Tasks Write a software application, service, daemon or ...READ MORE

Dec 11, 2018 in Big Data Hadoop by Irfan
• 120 points
27 views
0 votes
1 answer

Is Kafka and Zookeeper are required in a Big Data Cluster?

Apache Kafka is one of the components ...READ MORE

answered Mar 22, 2018 in Big Data Hadoop by nitinrawat895
• 10,490 points
350 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,490 points
2,386 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,490 points
243 views
0 votes
10 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest> copy syntax: copyFr ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Aditya
12,164 views
0 votes
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,240 points
894 views
0 votes
1 answer