How can I put a file into HDFS directly without copying it to local disk?

0 votes

There is a dataset at a web location that is around 31 GB and is compressed in .gz format. I have a wordcount program that I want to run over it. I have a remote Hadoop cluster, which I connect to using ssh.

The main problem is that my home directory on the remote machine cannot hold this dataset because of a disk usage quota. So I was wondering if there is a way to wget the dataset straight into my HDFS directory. Can anyone help me out?

Apr 15, 2018 in Big Data Hadoop by Shubham
• 13,350 points
1,049 views

1 answer to this question.

0 votes
You can pipe the output of wget directly into HDFS.
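
A minimal sketch of that pipe, assuming it runs on a cluster node where both wget and the hadoop client are on the PATH. The function name stream_to_hdfs and the URL and destination path in the usage line are hypothetical placeholders, not values from the question.

```shell
# Stream a remote file into HDFS without writing it to local disk.
# stream_to_hdfs is a hypothetical helper name used only for illustration.
stream_to_hdfs() {
    url="$1"    # source URL of the dataset
    dest="$2"   # destination path in HDFS
    # "wget -O -" writes the download to stdout;
    # "hadoop fs -put -" reads the file contents from stdin.
    wget -q -O - "$url" | hadoop fs -put - "$dest"
}
```

It could be called as stream_to_hdfs "http://example.com/dataset.gz" "FolderName/dataset.gz". If the download fails partway, a truncated file may be left in HDFS, so it is worth checking the size afterwards.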

You might run into a problem, though: gz files are not splittable, so a single .gz file is read by one mapper and the MapReduce job over it will not run in a distributed fashion.

I would suggest downloading the file to a local system, unzipping it, and then piping the uncompressed file to the cluster over ssh:

cat test123.txt | ssh uname@master "hadoop fs -put - FolderName/test123.txt"
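
If no machine has room for the download, the same pipe idea can decompress in flight, so the file stored in HDFS is plain text and can be split across mappers. This is a sketch under assumptions: it runs on a cluster node with wget, gunzip and the hadoop client available, the function name stream_gunzip_to_hdfs is hypothetical, and HDFS must have space for the uncompressed data, which will be larger than 31 GB.

```shell
# Download, decompress, and store in HDFS in one pass, with no local copy.
# stream_gunzip_to_hdfs is a hypothetical helper name used only for illustration.
stream_gunzip_to_hdfs() {
    url="$1"    # source URL of the .gz dataset
    dest="$2"   # destination path in HDFS for the uncompressed file
    # gunzip -c decompresses stdin to stdout, so nothing touches local disk.
    wget -q -O - "$url" | gunzip -c | hadoop fs -put - "$dest"
}
```

It could be called as stream_gunzip_to_hdfs "http://example.com/dataset.gz" "FolderName/dataset.txt", where both arguments are placeholders.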
answered Apr 15, 2018 by kurt_cobain
• 9,280 points
