What do we mean by an RDD in Spark?

0 votes
I am learning Spark and I heard about something called Resilient Distributed Datasets in Spark. Can anyone explain to me what exactly are RDD's?
Jun 18, 2018 in Apache Spark by kurt_cobain
• 9,260 points
91 views

1 answer to this question.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

The full form of RDD is a resilient distributed dataset. It is a representation of data located on a network which is:

Immutable – You can operate on the RDD to produce another RDD but you can’t alter it.
Partitioned / Parallel – The data located on RDD is operated in parallel. Any operation on RDD is done using multiple nodes.
Resilience – If one of the nodes hosting the partition fails, other nodes takes its data.
You can always think of RDD as a big array which is under the hood spread over many computers which are completely abstracted. So, RDD is made up many partitions each partition on different computers.

answered Jun 18, 2018 by nitinrawat895
• 9,030 points

Related Questions In Apache Spark

0 votes
1 answer

Can anyone explain what is RDD in Spark?

RDD is a fundamental data structure of ...READ MORE

answered May 24, 2018 in Apache Spark by Shubham
• 12,150 points
466 views
0 votes
1 answer

How can I write a text file in HDFS not from an RDD, in Spark program?

Yes, you can go ahead and write ...READ MORE

answered May 29, 2018 in Apache Spark by Shubham
• 12,150 points
637 views
0 votes
1 answer

Spark 2.3? What is new in it?

Here are the changes in new version ...READ MORE

answered May 28, 2018 in Apache Spark by kurt_cobain
• 9,260 points
31 views
0 votes
1 answer

How RDD persist the data in Spark?

There are two methods to persist the ...READ MORE

answered Jun 18, 2018 in Apache Spark by nitinrawat895
• 9,030 points
92 views
+1 vote
1 answer
0 votes
1 answer

Writing File into HDFS using spark scala

The reason you are not able to ...READ MORE

answered Apr 5, 2018 in Big Data Hadoop by kurt_cobain
• 9,260 points
3,362 views
0 votes
1 answer

Is there any way to check the Spark version?

There are 2 ways to check the ...READ MORE

answered Apr 19, 2018 in Apache Spark by nitinrawat895
• 9,030 points
493 views
0 votes
1 answer

What's the difference between 'filter' and 'where' in Spark SQL?

Both 'filter' and 'where' in Spark SQL ...READ MORE

answered May 23, 2018 in Apache Spark by nitinrawat895
• 9,030 points
2,902 views
+1 vote
3 answers

What is the difference between rdd and dataframes in Apache Spark ?

Comparison between Spark RDD vs DataFrame 1. Release ...READ MORE

answered Aug 27, 2018 in Apache Spark by shams
• 3,580 points
7,230 views
0 votes
1 answer

How to convert rdd object to dataframe in spark

SqlContext has a number of createDataFrame methods ...READ MORE

answered May 30, 2018 in Apache Spark by nitinrawat895
• 9,030 points
769 views

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
"PMP®","PMI®", "PMI-ACP®" and "PMBOK®" are registered marks of the Project Management Institute, Inc. MongoDB®, Mongo and the leaf logo are the registered trademarks of MongoDB, Inc.