Not able to preserve shuffle files in Spark

0 votes
I am trying to run a MapReduce-style job with Spark and I am facing a problem with the shuffle. The job runs on executors, and when executors are removed I lose all of their shuffle files. How can I preserve these files?
Feb 24, 2019 in Apache Spark by Rohit
1,246 views

1 answer to this question.

0 votes

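You lose the files because, by default, each executor writes its shuffle files to local disk and serves them itself, so the files become unavailable once that executor is removed. To keep the files available after executors are removed, enable the external shuffle service, which serves shuffle files from a separate long-running process on each worker node. The property for this is spark.shuffle.service.enabled, and the external shuffle service must also be running on each worker node. The command will look like this: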

./bin/spark-submit <all your existing options> --conf spark.shuffle.service.enabled=true
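
If you would rather not pass the flag on every submit, the same property can go into conf/spark-defaults.conf. The fragment below is only a sketch: the second line assumes your executors are being removed by dynamic allocation, which the question does not confirm, so leave it out if that is not your setup.

```properties
# conf/spark-defaults.conf (sketch; adjust to your cluster)

# Serve shuffle files from the external shuffle service instead of the executor
spark.shuffle.service.enabled      true

# Assumption: executors are added and removed by dynamic allocation
spark.dynamicAllocation.enabled    true
```

Setting the property only tells Spark to use the service. On YARN the service itself typically has to be registered on every NodeManager as the spark_shuffle auxiliary service (class org.apache.spark.network.yarn.YarnShuffleService) before this takes effect.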
answered Feb 24, 2019 by Rana
