Why are partitions immutable in Spark?

0 votes

Can anyone explain why partitions are immutable in Spark?

Jul 3, 2019 in Apache Spark by neha

recategorized Jul 4, 2019 by Gitika • 88 views

1 answer to this question.

0 votes

Hi,

Partitions are immutable because the RDDs they belong to are immutable: every transformation produces a new RDD with its own set of partitions instead of modifying the existing ones. This immutability is what makes Spark fault tolerant, since a lost partition can always be recomputed from its lineage, and it fits naturally with distributed storage such as HDFS, whose blocks are likewise read-only. Partitioning is also aware of data locality, so tasks are scheduled close to where the data lives.
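To illustrate, here is a minimal, self-contained Scala sketch (the object name PartitionImmutability and the local[4] master are just for this example):

```scala
import org.apache.spark.sql.SparkSession

object PartitionImmutability {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PartitionImmutability")
      .master("local[4]")
      .getOrCreate()
    val sc = spark.sparkContext

    // An RDD split into 4 partitions.
    val rdd = sc.parallelize(1 to 100, 4)

    // A transformation returns a NEW RDD with NEW partitions;
    // the original RDD and its partitions are never modified.
    val doubled = rdd.map(_ * 2)

    println(rdd.getNumPartitions)     // 4 -- original partitions untouched
    println(doubled.getNumPartitions) // 4 -- a separate set of partitions

    spark.stop()
  }
}
```

Because `rdd` is never modified, Spark can rebuild any lost partition of `doubled` simply by re-running the map over the corresponding parent partition, using the recorded lineage.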

answered Jul 3, 2019 by Gitika
• 25,440 points
