What do we mean by an RDD in Spark?

I am learning Spark and have come across something called Resilient Distributed Datasets. Can anyone explain what exactly RDDs are?
Jun 18, 2018 in Apache Spark by kurt_cobain

1 answer to this question.

RDD stands for Resilient Distributed Dataset. It is a representation of data spread across the nodes of a cluster, with three defining properties:

Immutable – You can apply an operation to an RDD to produce a new RDD, but you cannot alter an existing one.
Partitioned / Parallel – The data in an RDD is split into partitions, and operations on the RDD run on those partitions in parallel, typically across multiple nodes.
Resilient – If a node hosting a partition fails, Spark rebuilds the lost partition on another node by replaying the recorded transformations (the RDD's lineage) that produced it.

You can think of an RDD as one big array that, under the hood, is spread over many computers, with the distribution details abstracted away from you. An RDD is made up of many partitions, and those partitions can live on different computers.
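
To make the three properties concrete, here is a minimal sketch in plain Python. It is a toy model and not Spark's implementation: the class `ToyRDD` and its methods `lose_partition` and `recover_partition` are hypothetical names invented for this illustration, and everything runs in a single process instead of on a cluster.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical class, not part of Spark)."""

    def __init__(self, partitions, parent=None, fn=None):
        # Partitions are stored as tuples so they cannot be modified in place.
        self._partitions = [tuple(p) for p in partitions]
        self._parent = parent  # lineage: the ToyRDD this one was derived from
        self._fn = fn          # lineage: the function that was applied to the parent

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the data into at most num_partitions roughly equal chunks.
        data = list(data)
        size = max(1, -(-len(data) // num_partitions))  # ceiling division
        return cls([data[i:i + size] for i in range(0, len(data), size)])

    def map(self, fn):
        # A transformation returns a new ToyRDD; the original is left untouched.
        new_parts = [[fn(x) for x in part] for part in self._partitions]
        return ToyRDD(new_parts, parent=self, fn=fn)

    def lose_partition(self, index):
        # Simulate the failure of the node that held this partition.
        self._partitions[index] = None

    def recover_partition(self, index):
        # Rebuild the lost partition from lineage by reapplying fn
        # to the matching partition of the parent.
        parent_part = self._parent._partitions[index]
        self._partitions[index] = tuple(self._fn(x) for x in parent_part)

    def collect(self):
        # Gather all partitions back into one flat list.
        return [x for part in self._partitions for x in part]


numbers = ToyRDD.parallelize(range(1, 9), num_partitions=4)
doubled = numbers.map(lambda x: x * 2)  # new dataset, numbers is unchanged

doubled.lose_partition(1)      # pretend a node failed
doubled.recover_partition(1)   # recompute only the missing piece

print(numbers.collect())
print(doubled.collect())
```

In real PySpark the corresponding calls would be `sc.parallelize(data, numSlices)`, `rdd.map(f)` and `rdd.collect()` on a `SparkContext` named `sc`. Recovery of lost partitions is handled by Spark itself, so there is no user-facing equivalent of the two failure methods in this sketch.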

answered Jun 18, 2018 by nitinrawat895
