What happens to RDD when one of the nodes goes down?

0 votes
In Spark, data is distributed across the different nodes in the form of RDDs. What happens if one of the nodes holding an RDD's partitions goes down?

Thanks in advance!
Sep 3, 2018 in Apache Spark by Shubham
• 13,290 points
195 views

1 answer to this question.

0 votes
Whenever a node goes down, Spark can rebuild the lost data because every RDD keeps track of the lineage of transformations and actions that produced it, in the form of a DAG. Spark simply re-applies those same transformations to the source data to recompute the partitions that were lost on the failed node.
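
To make the lineage idea concrete, here is a minimal Scala sketch (the object name, master setting, data, and checkpoint directory are placeholders for illustration): toDebugString prints the chain of transformations Spark would replay to rebuild a lost partition, and checkpoint() optionally saves the RDD to reliable storage so recovery does not have to replay the whole chain.

import org.apache.spark.{SparkConf, SparkContext}

object LineageDemo {
  def main(args: Array[String]): Unit = {
    // Local example; app name and master are placeholders.
    val conf = new SparkConf().setAppName("lineage-demo").setMaster("local[2]")
    val sc   = new SparkContext(conf)

    // Build an RDD through a chain of transformations.
    val base     = sc.parallelize(1 to 1000000, numSlices = 8)
    val doubled  = base.map(_ * 2)
    val filtered = doubled.filter(_ % 3 == 0)

    // The lineage (DAG of transformations) Spark would replay
    // to recompute any partition lost with a failed node.
    println(filtered.toDebugString)

    // Optional: checkpointing writes the RDD to reliable storage
    // (e.g. HDFS) so recovery need not replay the full lineage.
    sc.setCheckpointDir("/tmp/spark-checkpoints")  // placeholder path
    filtered.checkpoint()

    println(filtered.count())  // action triggers computation (and the checkpoint)
    sc.stop()
  }
}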

Hope this will help!
answered Sep 3, 2018 by nitinrawat895
• 10,670 points
