What is persist() in Spark?

0 votes

Can anyone explain what persist() function in Spark?

Jul 2, 2019 in Apache Spark by Tina

recategorized Jul 4, 2019 by Gitika 87 views

1 answer to this question.

0 votes

Hi,

Spark’s RDDs are by default recomputed each time you run an action on them. If you would like to reuse an RDD in multiple actions, you can ask Spark to persist it using RDD.persist().

After computing, it the first time, Spark will store the RDD contents in memory (partitioned across the machines in your cluster), and reuse them in future actions. Persisting RDDs on disk instead of memory is also possible.

answered Jul 2, 2019 by Gitika
• 33,150 points

Related Questions In Apache Spark

+1 vote
1 answer

Can anyone explain what is RDD in Spark?

RDD is a fundamental data structure of ...READ MORE

answered May 24, 2018 in Apache Spark by Shubham
• 13,380 points
908 views
0 votes
1 answer

Spark 2.3? What is new in it?

Here are the changes in new version ...READ MORE

answered May 28, 2018 in Apache Spark by kurt_cobain
• 9,310 points
129 views
+1 vote
3 answers

What is the difference between rdd and dataframes in Apache Spark ?

Comparison between Spark RDD vs DataFrame 1. Release ...READ MORE

answered Aug 27, 2018 in Apache Spark by shams
• 3,600 points
27,939 views
0 votes
1 answer

What is Executor Memory in a Spark application?

Every spark application has same fixed heap ...READ MORE

answered Jan 4, 2019 in Apache Spark by Frankie
• 9,810 points
2,500 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,920 points
5,457 views
+1 vote
11 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest> copy syntax: copyF ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Aditya
33,858 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,310 points
2,061 views
0 votes
1 answer
0 votes
1 answer

What is the difference between persist() and cache() in apache spark?

Hi, persist () allows the user to specify ...READ MORE

answered Jul 3, 2019 in Apache Spark by Gitika
• 33,150 points
1,604 views
0 votes
1 answer

What is RDD in Apache spark?

Hi, RDD in spark stands for REsilient distributed ...READ MORE

answered Jul 1, 2019 in Apache Spark by Gitika
• 33,150 points
217 views