How should I specify the URL for an HDFS file system?

I have some data in HDFS at /user/Cloudera/Test/. I can see the records by running "hdfs dfs -cat Test/".

Now I need to read the same file as an RDD in Scala. I tried the following in the Scala shell:

val file = sc.textFile("hdfs://quickstart.cloudera:8020/user/Cloudera/Test")

Then I wrote a filter and a for loop to read the words. But when I call println at the end, it says the file was not found.

Can anyone tell me what the HDFS URL should be in this case? Note: I am using the Cloudera CDH 5.0 VM.

Sep 10, 2018 in Big Data Hadoop by Neha

1 answer to this question.

If you are reading the file from a Spark job, you can simply pass the path without a scheme or host:

val file = sc.textFile("/user/Cloudera/Test")

Spark resolves a path that has no scheme against the default file system in its Hadoop configuration, which on a setup like the Cloudera VM is normally HDFS. You therefore do not need to add a host name such as localhost as a prefix.

Alternatively, instead of using "quickstart.cloudera" and the port, use just the IP address of the machine (shown here as the placeholder <ip>):

val file = sc.textFile("hdfs://<ip>/user/Cloudera/Test")
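
To see which of these forms applies on a given machine, it can help to ask Spark what it treats as the default file system and whether the path exists before building the RDD. The following is a minimal sketch, assuming it is run inside spark-shell (where sc is already defined) and using the path from the question. Because sc.textFile is lazy, a wrong path usually surfaces only when an action such as println over the results runs, which would match the late "file not found" described above. HDFS paths are also case-sensitive, so /user/Cloudera and /user/cloudera are different directories.

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is already defined.
import org.apache.hadoop.fs.{FileSystem, Path}

// Default file system from the Hadoop configuration that Spark picked up.
val defaultFs = sc.hadoopConfiguration.get("fs.defaultFS")
println(s"fs.defaultFS = $defaultFs")

// Path taken from the question; adjust the case to match your HDFS layout.
val fs = FileSystem.get(sc.hadoopConfiguration)
val path = new Path("/user/Cloudera/Test")

// Fully qualified form of the path, i.e. the URL a scheme-less path resolves to.
val qualified = fs.makeQualified(path)
println(s"qualified path = $qualified")

// Read only if the path exists, instead of failing later when an action runs.
if (fs.exists(path)) {
  val file = sc.textFile(path.toString)
  file.take(5).foreach(println)
} else {
  println(s"$path does not exist on $defaultFs")
}
```

If the printed qualified path starts with hdfs://, that string is the full URL that could also be passed to sc.textFile directly.
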
answered Sep 10, 2018 by Frankie
