Can anyone explain why partitions are immutable in Spark?
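Every transformation produces a new RDD with new partitions rather than modifying the existing ones. When the data comes from HDFS, Spark reads it through the Hadoop API, and the resulting partitions are immutable, distributed, and fault-tolerant: because a partition never changes after it is created, a lost partition can be recomputed from its lineage. Partitions are also aware of data locality, so tasks can be scheduled close to the data they read.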
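To make the link between immutability and fault tolerance concrete, here is a minimal toy model in plain Python. It is not Spark's API: `ToyRDD`, `map`, and `recompute_partition` are hypothetical names made up for this sketch, and real Spark tracks lineage in a far more general way. It only illustrates the idea that a transformation builds new partitions and that an unchanged parent lets a lost partition be rebuilt.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ToyRDD:
    """Hypothetical stand-in for an RDD: immutable partitions plus lineage."""
    partitions: Tuple[Tuple[int, ...], ...]
    parent: Optional["ToyRDD"] = None
    fn: Optional[Callable[[int], int]] = None

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # A transformation never edits existing partitions; it builds new ones
        # and remembers where they came from (the lineage).
        new_parts = tuple(tuple(fn(x) for x in part) for part in self.partitions)
        return ToyRDD(new_parts, parent=self, fn=fn)

    def recompute_partition(self, index: int) -> Tuple[int, ...]:
        # Because the parent partition cannot have changed, re-running fn on it
        # reproduces the lost partition exactly.
        if self.parent is None or self.fn is None:
            raise ValueError("base data has no lineage to recompute from")
        return tuple(self.fn(x) for x in self.parent.partitions[index])


base = ToyRDD(((1, 2), (3, 4)))       # two partitions
doubled = base.map(lambda x: x * 2)   # new object, new partitions

print(base.partitions)     # the original partitions are untouched
print(doubled.partitions)  # the transformed partitions
print(doubled.recompute_partition(1))  # rebuilt from the parent, as after a failure
```

If partitions could be modified in place, the parent data might no longer match what the transformation originally saw, and recomputing from lineage would not be reliable.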