How to save a Spark RDD to HDFS and read it back

0 votes
I am performing some analytics in Spark. I want to save an RDD to HDFS so that I can read it back later for further processing. How can I do that?
May 29, 2018 in Apache Spark by code799
13,001 views

1 answer to this question.

+1 vote

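You can save the RDD with saveAsTextFile or saveAsObjectFile (or saveAsSequenceFile for key-value RDDs). To read it back, use the matching SparkContext method: textFile for text output, objectFile for object files, and sequenceFile for sequence files.

// Save each element as one line of text; the target directory must not already exist.
rdd.saveAsTextFile("hdfs://localhost:9000/abc/")
// Read the lines back as an RDD[String]; parse each line if you need the original types.
val loadRdds = sparkContext.textFile("hdfs://localhost:9000/abc/")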
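
For a fuller picture, here is a minimal sketch of both round trips. It is only an illustration: the object name RddHdfsRoundTrip, the helper names viaTextFile and viaObjectFile, the sample data, and the sub-directory names are all made up for this example, and the default base path assumes the same hdfs://localhost:9000/abc location used above. The text variant assumes the keys contain no tab characters.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark.
object RddHdfsRoundTrip {

  // Text round trip: textFile only returns strings, so format on the way
  // out and parse on the way back in. Assumes keys contain no tabs.
  def viaTextFile(sc: SparkContext, rdd: RDD[(String, Int)], path: String): RDD[(String, Int)] = {
    rdd.map { case (k, v) => s"$k\t$v" }.saveAsTextFile(path)
    sc.textFile(path).map { line =>
      val Array(k, v) = line.split("\t", 2)
      (k, v.toInt)
    }
  }

  // Object-file round trip: elements are Java-serialized, so the element
  // type is recovered directly by objectFile[T].
  def viaObjectFile(sc: SparkContext, rdd: RDD[(String, Int)], path: String): RDD[(String, Int)] = {
    rdd.saveAsObjectFile(path)
    sc.objectFile[(String, Int)](path)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("rdd-hdfs-round-trip")
      .setIfMissing("spark.master", "local[*]") // only used when no master is supplied
    val sc = new SparkContext(conf)
    try {
      // Assumed base directory; pass a different one as the first argument.
      val base = args.headOption.getOrElse("hdfs://localhost:9000/abc")
      val counts = sc.parallelize(Seq(("spark", 3), ("hdfs", 5)))

      // Both target directories must not exist yet, or the save fails.
      viaTextFile(sc, counts, s"$base/counts-text").collect().foreach(println)
      viaObjectFile(sc, counts, s"$base/counts-object").collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The text format is readable with other tools (for example hadoop fs -cat) but needs manual parsing. The object file is easier to reload from Spark, but it relies on Java serialization, so it is mainly useful when the same application classes read it back.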
answered May 29, 2018 by Shubham
• 13,490 points
