Not able to preserve shuffle files in Spark

0 votes
I am trying to run a MapReduce-style job with Spark, and I am having a problem with the shuffle. When executors are removed, all of the shuffle files they wrote are lost. How can I preserve them?
Feb 23 in Apache Spark by Rohit
16 views

1 answer to this question.

0 votes

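You lose the files because, by default, shuffle files are served by the executor that wrote them, so they become unavailable as soon as that executor is removed. To keep the files available after an executor is removed, enable the external shuffle service, which serves shuffle files from a separate long-running process on each worker node. The property for this is spark.shuffle.service.enabled, which is false by default. The service itself must also be set up on the worker nodes (on YARN, for example, it runs inside the NodeManager). You can pass the property on the command line like this: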

./bin/spark-submit <all your existing options> --conf spark.shuffle.service.enabled=true
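
If you would rather not pass the flag on every submit, the same property can be set once in the Spark defaults file. The fragment below is a minimal sketch, not a complete setup. It assumes the external shuffle service is already running on every worker node, and it assumes the executors are being removed by dynamic allocation, which the question does not actually state.

```properties
# conf/spark-defaults.conf (sketch)

# Serve shuffle files from the external shuffle service instead of the executor,
# so they remain available after the executor is removed.
spark.shuffle.service.enabled     true

# Assumption: executors are removed by dynamic allocation.
# Drop this line if you remove executors some other way.
spark.dynamicAllocation.enabled   true
```

Settings passed with --conf on the spark-submit command line take precedence over values in this file.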
answered Feb 23 by Rana
