Copy files to all Hadoop DFS directories

0 votes

I am trying to copy files to HDFS. I need to copy all files from a specified local folder (myData) to every directory available in my HDFS. I am using wildcards for this.

Here is my code:

bin/hdfs dfs -copyFromLocal myData/* /*/

This is my error:

copyFromLocal: `/var/': No such file or directory 

Normally, when I specify a single directory in HDFS, all my files are copied successfully to that directory.
But when I use a wildcard (*) to copy to all directories in my HDFS, I get the above error.

Is there any way to get around this?
Thank you. 

Feb 23 in Big Data Hadoop by Bhavish
• 310 points
89 views

1 answer to this question.

+1 vote
Best answer

Hi @Bhavish. There is no Hadoop command that copies a local file to all (or multiple) directories in HDFS. The wildcard in your destination is expanded by your local shell against the local filesystem, not against HDFS, which is why /var/ shows up in the error. You can, however, do it with a loop in a bash script.

Create a new bash script; I named my file copytoallhdfs.sh:

$ nano copytoallhdfs.sh

I came up with the following script for your problem statement:

#!/bin/bash
# Collect the HDFS listing; -C prints just the paths, one per line
myarray=$(hdfs dfs -ls -C /path/to/directory)
# Copy the local source into each listed HDFS path
for name in $myarray
do
    hdfs dfs -copyFromLocal /path/to/source /$name
done

Save (Ctrl+O) and close (Ctrl+X) the file.

Now make the script executable:

$ chmod +x copytoallhdfs.sh

and run the script:

$ ./copytoallhdfs.sh
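
For reference, the -C option makes hdfs dfs -ls print only the path of each entry, one per line, which is exactly what the loop iterates over. Assuming two hypothetical subdirectories dirA and dirB under the placeholder path, the listing would look something like this:

$ hdfs dfs -ls -C /path/to/directory
/path/to/directory/dirA
/path/to/directory/dirB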

This worked for me. 
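
Side note: if the listed path can contain plain files as well as directories, a slightly more defensive variant (just a sketch along the same lines, using the same placeholder paths) quotes each path and skips anything that is not a directory:

#!/bin/bash
# Read the HDFS listing line by line so each path is handled safely
hdfs dfs -ls -C /path/to/directory | while read -r name
do
    # -test -d exits with 0 only if the path is an existing directory
    if hdfs dfs -test -d "$name"; then
        hdfs dfs -copyFromLocal /path/to/source "$name"
    fi
done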

answered Feb 23 by Omkar
• 65,850 points

selected Feb 23 by Bhavish

Hello @Omkar,

Your answer works perfectly for me. I just made some changes to the script.

Here is my script:

#!/bin/bash

myarray=$(bin/hdfs dfs -ls -C /)

echo $myarray
for name in $myarray
do
    bin/hdfs dfs -copyFromLocal myData/* $name
done

The echo $myarray line gives me /folder1 /folder2 /folder3.

Since the leading slash '/' is already part of each path in that output, I removed the slash before the variable $name. I also removed the slash before the source path, since the absolute path was not working for me.

Many thanks.

Happy to be of help @Bhavish!
