Copy files to all Hadoop DFS directories

0 votes

I am trying to copy files to HDFS. I need to copy all files from a local folder (myData) into every directory in my HDFS. I am using wildcards for this.

Here is my code:

bin/hdfs dfs -copyFromLocal myData/* /*/

This is my error:

copyFromLocal: `/var/': No such file or directory 

Normally, when I specify a single HDFS directory, all my files are copied to it successfully.
But when I use a wildcard (*) to target every directory in my HDFS, I get the error above.

Is there any way to get around this?
Thank you. 

Feb 23, 2019 in Big Data Hadoop by Bhavish
• 370 points
2,977 views

1 answer to this question.

+1 vote
Best answer

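Hi @Bhavish. There is no Hadoop command that copies local files to multiple HDFS directories at once. The unquoted /*/ is also expanded by your local shell against the local filesystem before hdfs runs, which is most likely why the error names /var/. You can get the result you want with a loop in a bash script.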
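To see why the wildcard fails, here is a minimal local sketch (no Hadoop needed). show_args is a hypothetical helper that only prints the arguments a command receives, and the temporary directory stands in for the local filesystem root. It illustrates that an unquoted pattern is expanded by the local shell, while a quoted one is passed through unchanged:

```shell
#!/bin/bash
# Hypothetical helper: print each received argument on its own line.
show_args() { printf '%s\n' "$@"; }

# Throwaway local tree so the demo does not depend on the real filesystem root.
demo=$(mktemp -d)
mkdir "$demo/etc" "$demo/var"

# Unquoted: the local shell replaces the pattern with the matching local
# directories (two paths, ending in etc/ and var/) before the command runs.
show_args "$demo"/*/

# Quoted: the pattern reaches the command literally.
show_args '/*/'
```

In the command from the question the pattern was unquoted, so hdfs most likely received a list of local directories and treated the last one, /var/, as the HDFS destination.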

Create a new bash script. I named mine copytoallhdfs.sh:

$ nano copytoallhdfs.sh

I came up with the following script for your problem statement:

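#!/bin/bash
# List the entries under the parent directory; -C prints full paths only.
myarray=$(hdfs dfs -ls -C /path/to/directory)
# Copy the source into each listed path. Each path already starts with "/".
# $myarray is left unquoted on purpose so it splits into one path per word.
for name in $myarray
do hdfs dfs -copyFromLocal /path/to/source "$name"; done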

Save (Ctrl+O) and close (Ctrl+X) the file.

Now make this script executable:

$ chmod +x copytoallhdfs.sh

and run the script:

$ ./copytoallhdfs.sh

This worked for me. 
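If the parent directory also contains plain files, -ls -C returns them too, and the loop would try to copy into them. A possible variant, as a sketch only: it keeps just the entries whose permission string starts with d in the regular -ls output. The function names list_hdfs_dirs and copy_to_all_dirs are made up for illustration, and the sketch assumes the usual eight-column -ls layout and paths without spaces:

```shell
#!/bin/bash
# Print only the directories directly under the HDFS path given as $1.
# Assumes the usual `hdfs dfs -ls` layout: permissions first, path as 8th field.
list_hdfs_dirs() {
    hdfs dfs -ls "$1" | awk '$1 ~ /^d/ { print $8 }'
}

# Copy local source $1 into every directory under HDFS parent $2.
copy_to_all_dirs() {
    list_hdfs_dirs "$2" | while IFS= read -r dir; do
        # stdin is redirected so hdfs cannot consume the remaining directory list.
        hdfs dfs -copyFromLocal "$1" "$dir" < /dev/null \
            || echo "copy to $dir failed" >&2
    done
}

# Example call with placeholder paths (uncomment to run):
# copy_to_all_dirs /path/to/source /path/to/directory
```

Reporting a failed copy and moving on to the next directory is a design choice here; stopping at the first failure would be equally reasonable.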

answered Feb 24, 2019 by Omkar
• 69,210 points

selected Feb 24, 2019 by Bhavish

Hello @Omkar,

Your answer works perfectly for me. I just made a few changes to the script.

Here is my script:

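#!/bin/bash

# List every entry directly under the HDFS root; -C prints paths only.
myarray=$(bin/hdfs dfs -ls -C /)

echo $myarray;
# Copy all files in the local myData folder into each listed path.
for name in $myarray
do bin/hdfs dfs -copyFromLocal myData/* "$name";
done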

The line echo $myarray; prints /folder1 /folder2 /folder3.

Since the output already includes the leading slash '/', I use $name without an extra slash in front of it. I also removed the slash before the source path, since it was not working with it.
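A likely reason the leading slash on the source did not work is that /myData is an absolute local path, while myData is resolved against the current working directory. The sketch below shows the difference locally, without Hadoop. The temporary directory and the files a.txt and b.txt are made up for illustration:

```shell
#!/bin/bash
# Throwaway working directory containing a myData folder with two files.
work=$(mktemp -d)
mkdir "$work/myData"
touch "$work/myData/a.txt" "$work/myData/b.txt"
cd "$work" || exit 1

# Relative pattern: resolved against the current directory, so it matches both files.
relative_matches=$(ls -d myData/* 2>/dev/null | wc -l | tr -d ' ')

# Absolute pattern: looks for /myData at the filesystem root, which normally does not exist.
absolute_matches=$(ls -d /myData/* 2>/dev/null | wc -l | tr -d ' ')

echo "relative: $relative_matches, absolute: $absolute_matches"
```

On a machine without a /myData directory the absolute pattern matches nothing, so the script has to be run from the folder that contains myData, or be given the full local path to it.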

Many thanks.

Happy to be of help @Bhavish!
