How can I put a file into HDFS directly, without copying it to local disk?

0 votes

There is a dataset at a web location that is around 31 GB, compressed in .gz format. I have a wordcount program that I want to run over it, and a remote Hadoop cluster that I connect to over ssh.

The main problem is that my home directory on the remote machine cannot hold this dataset because of a disk usage quota. So I was wondering: is there a way to wget the dataset straight into my HDFS directory? Can anyone help me out?

Apr 15, 2018 in Big Data Hadoop by Shubham

1 answer to this question.

0 votes
You can pipe the output of wget straight into HDFS.
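
For example (a minimal sketch; the URL and HDFS path below are placeholders, not from the question), run this on a cluster node so nothing is staged in your home directory. wget -qO- streams the download to stdout, and hdfs dfs -put - reads from stdin:

wget -qO- http://example.com/dataset.gz | hdfs dfs -put - FolderName/dataset.gz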

You might face a problem, though: .gz files are not splittable, so a single gzipped file will be processed by one mapper instead of being distributed across the cluster.

If you need splittable input, I would suggest downloading the file to a machine with enough disk space, unzipping it, and then piping the uncompressed file into HDFS over ssh:

cat test123.txt | ssh uname@master "hdfs dfs -put - FolderName/test123.txt"
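
If local disk is not an option at all, you could also decompress in flight by adding gunzip to the same pipeline (again with placeholder names). This stores the uncompressed, splittable text in HDFS without ever touching local disk:

wget -qO- http://example.com/dataset.gz | gunzip | ssh uname@master "hdfs dfs -put - FolderName/dataset.txt"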
answered Apr 15, 2018 by kurt_cobain

