How to save and retrieve the Spark RDD from HDFS?

0 votes
I am performing some analytics in Spark. I want to save an RDD in HDFS, which can read back later from HDFS for further processing. How can I do that?
May 29, 2018 in Apache Spark by code799
1,274 views

1 answer to this question.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

You can save the RDD using saveAsObjectFile and saveAsTextFile method. Whereas you can read the RDD using textFile and sequenceFile function from SparkContext.

​rdd.saveAsTextFile ("hdfs:/localhost:9000/abc/");
val loadRdds = sparkContext.textFile("hdfs:/localhost:9000/abc/").map (x => {})
answered May 29, 2018 by Shubham
• 12,890 points

Related Questions In Apache Spark

0 votes
7 answers

How to print the contents of RDD in Apache Spark?

Simple and easy: line.foreach(println) READ MORE

answered Dec 10, 2018 in Apache Spark by Kuber
6,223 views
0 votes
1 answer

How to stop messages from being displayed on spark console?

In your log4j.properties file you need to ...READ MORE

answered Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,240 points
759 views
0 votes
1 answer

How is Apache Spark different from the Hadoop approach?

In Hadoop MapReduce the input data is ...READ MORE

answered May 7, 2018 in Apache Spark by BD Master
52 views
0 votes
1 answer

How to convert rdd object to dataframe in spark

SqlContext has a number of createDataFrame methods ...READ MORE

answered May 30, 2018 in Apache Spark by nitinrawat895
• 9,670 points
925 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 9,670 points
1,884 views
0 votes
10 answers

hadoop fs -put command?

copy command can be used to copy files ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Sujay
9,416 views
0 votes
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,240 points
678 views
+1 vote
1 answer
0 votes
1 answer

How can I write a text file in HDFS not from an RDD, in Spark program?

Yes, you can go ahead and write ...READ MORE

answered May 29, 2018 in Apache Spark by Shubham
• 12,890 points
762 views
0 votes
4 answers

How to change the spark Session configuration in Pyspark?

You can dynamically load properties. First create ...READ MORE

answered Dec 10, 2018 in Apache Spark by Vini
9,289 views

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
"PMP®","PMI®", "PMI-ACP®" and "PMBOK®" are registered marks of the Project Management Institute, Inc. MongoDB®, Mongo and the leaf logo are the registered trademarks of MongoDB, Inc.