How can I put a file into HDFS directly without copying it to local disk?

0 votes
There is a dataset at a web location that is around 31 GB and compressed in .gz format. I have a wordcount program that I want to run over it. I have a remote Hadoop cluster, which I connect to using ssh.

The main problem is that my home directory on the remote machine cannot hold this dataset because of a disk usage quota. So I was wondering if there is a way to wget the dataset straight into my HDFS directory. Can anyone help me out?
Apr 15, 2018 in Big Data Hadoop by Shubham
• 13,490 points

edited Jul 11, 2023 by Khan Sarfaraz 4,256 views

1 answer to this question.

0 votes
You can pipe the output of wget straight into HDFS.

You might face a problem, though: gz files are not splittable, so a single .gz file is read by one mapper and your MapReduce job will not be distributed over it.

I would suggest downloading the file to a local system, unzipping it, and then using the pipe operator to send it to the cluster over ssh:

cat test123.txt | ssh uname@master "hadoop fs -put - FolderName/test123.txt"
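
If a local download is not an option (as with the quota in the question), the wget-to-HDFS pipe mentioned at the top of this answer could look roughly like the sketch below. It assumes you run it on a cluster node where wget, gunzip, and hadoop are on the PATH; the function name stream_gz_to_hdfs, the URL, and the destination path are made up for illustration. It also decompresses the stream on the fly, so the file lands in HDFS as plain text and the splittability problem does not apply.

```shell
# Sketch: stream a remote .gz file into HDFS without writing it to local disk.
# Assumption: wget, gunzip, and hadoop are all available on this machine.
stream_gz_to_hdfs() {
    url="$1"    # hypothetical source URL of the .gz dataset
    dest="$2"   # hypothetical HDFS destination path

    # wget -q -O - writes the download to stdout instead of a file;
    # gunzip -c decompresses the stream as it arrives;
    # "hadoop fs -put -" reads from stdin and writes to the HDFS path.
    wget -q -O - "$url" | gunzip -c | hadoop fs -put - "$dest"
}

# Example call (placeholder URL and path, not from the question):
# stream_gz_to_hdfs "http://example.com/dataset.gz" "FolderName/dataset.txt"
```

Nothing is written to the home directory this way, but the transfer cannot be resumed: if the download breaks partway, the whole pipeline has to be run again. In a plain pipeline the exit status is that of the last command, so a failed download may go unnoticed; under bash, set -o pipefail makes that failure visible.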
answered Apr 15, 2018 by kurt_cobain
• 9,390 points
