Not able to preserve shuffle files in Spark

Question

I am trying to run a MapReduce-style job with Spark and am facing a problem with the shuffle. When executors are removed, I lose all the shuffle files they wrote. How can I preserve these files? Please help.

score 0 · Answer 1 · Feb 24, 2019

You lose the files because, by default, shuffle files are written and served by the executors themselves, so they become unavailable once an executor is removed. To keep the files available after executors are removed, you have to enable the external shuffle service. The property for this is spark.shuffle.service.enabled (false by default), and the external shuffle service must also be running on each worker node. The command will look like this:

./bin/spark-submit <all your existing options> --conf spark.shuffle.service.enabled=true
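If you would rather not pass the flag on every submit, the same property can probably be set once in Spark's defaults file. This is only a minimal sketch, assuming a standard install where the file lives at conf/spark-defaults.conf; your cluster may keep its configuration elsewhere.

```
# conf/spark-defaults.conf (assumed location; adjust to your install)
# Serve shuffle files from the external shuffle service instead of the executor
spark.shuffle.service.enabled    true
```

A --conf value given on the spark-submit command line takes precedence over this file, so the command above still works unchanged.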

answered Feb 24, 2019 by Rana


