How should I provide a URL for the HDFS file system?

0 votes

I have some data in HDFS at /user/Cloudera/Test/. I can see the records by running "hdfs dfs -cat Test/".

Now I need to read the same file as an RDD in Scala. I tried the following in the Scala shell:

val file = sc.textFile("hdfs://quickstart.cloudera:8020/user/Cloudera/Test")

Then I wrote a filter and a for loop to read the words. But when I call println at the end, it says the file was not found.

Can anyone please tell me what the HDFS URL should be in this case? Note: I am using the Cloudera CDH 5.0 VM.

Sep 10, 2018 in Big Data Hadoop by Neha
• 6,300 points
3,305 views

1 answer to this question.

0 votes

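If you are accessing the file from a Spark job, you can simply use the path without a scheme or host:

val file = sc.textFile("/user/Cloudera/Test")

Spark will resolve this path automatically. You do not need to add a host such as localhost as a prefix, because when Spark picks up the cluster's Hadoop configuration, a path without a scheme is read from the default file system (fs.defaultFS), which is HDFS here.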
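How a path without the hdfs:// prefix ends up pointing at HDFS can be sketched in plain Scala. This is only an illustration, assuming the default file system is the one from the URL in the question (hdfs://quickstart.cloudera:8020). It uses java.net.URI as a stand-in for the qualification Hadoop performs, and the names HdfsUrlSketch and qualify are made up for this example; they are not part of Spark or Hadoop.

```scala
import java.net.URI

object HdfsUrlSketch {
  // Assumption: the value of fs.defaultFS on the VM, taken from the question's URL.
  val defaultFs: URI = new URI("hdfs://quickstart.cloudera:8020")

  // Qualify a path against the default file system.
  // A path that already carries a scheme (hdfs://...) is returned unchanged.
  def qualify(path: String): URI = defaultFs.resolve(path)

  def main(args: Array[String]): Unit = {
    // prints hdfs://quickstart.cloudera:8020/user/Cloudera/Test
    println(qualify("/user/Cloudera/Test"))
  }
}
```

Note that sc.textFile is lazy, so a wrong path is only reported once an action runs. That would explain why the error in the question shows up at the final println rather than at the textFile call.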

Or you can also try this:

Instead of using "quickstart.cloudera" and the port, use just the IP address of the NameNode host:

val file = sc.textFile("hdfs://<ip>/user/Cloudera/Test")
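To see what changes between the two URL forms, the sketch below splits an hdfs:// URL into host, port and path using java.net.URI. It rests on some assumptions: 10.0.0.5 is a hypothetical address standing in for <ip>, and 8020 is assumed to be the NameNode port that applies when the URL leaves the port out (it is the port used in the question). HdfsUrlParts and parts are illustrative names, not a Hadoop API.

```scala
import java.net.URI

object HdfsUrlParts {
  // Assumption: NameNode RPC port used when the URL does not specify one.
  val AssumedDefaultPort = 8020

  // Split an hdfs:// URL into (host, port, path).
  // java.net.URI reports a missing port as -1, so fall back to the assumed default.
  def parts(url: String): (String, Int, String) = {
    val uri = new URI(url)
    val port = if (uri.getPort == -1) AssumedDefaultPort else uri.getPort
    (uri.getHost, port, uri.getPath)
  }

  def main(args: Array[String]): Unit = {
    println(parts("hdfs://quickstart.cloudera:8020/user/Cloudera/Test"))
    println(parts("hdfs://10.0.0.5/user/Cloudera/Test")) // hypothetical IP
  }
}
```

Under these assumptions both forms name the same port and path, so the only difference is whether the host is given by name or by address. HDFS paths are case-sensitive, so it is also worth checking that the directory spelling matches what exists on the cluster.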
answered Sep 10, 2018 by Frankie
• 9,830 points
