How to increase the amount of data to be transferred to shuffle service at the same time

Question

I am facing a problem with the shuffle service. The data being sent to the shuffle service is large and is not transferred completely; when the client retries, after some time I get a fetch failure error. While looking for a solution, I found that this can be avoided by increasing the amount of data sent at a time. How can I do that?

Omkar · Answer 1 · Mar 1, 2019

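The number of chunks that can be transferred at the same time on the shuffle service is controlled by the property spark.shuffle.maxChunksBeingTransferred. Its default is very high, so you wouldn't usually reach that limit unless you have a really huge dataset. I am not sure the solution you have in mind will actually work, but here's how you can change it. Create the context with an empty SparkConf so that nothing is hardcoded in the application, then pass the property when you submit the job: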

import org.apache.spark.{SparkConf, SparkContext}
val sc = new SparkContext(new SparkConf())

./bin/spark-submit <all your existing options> --conf spark.shuffle.maxChunksBeingTransferred=NEW_VALUE
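
Since the fetch failure in the question only shows up after the client has retried, it may be worth looking at the retry settings together with the chunk limit. Below is a minimal sketch of setting all three in code instead of on the command line. The app name and every value are illustrative placeholders, not recommendations; spark.shuffle.io.maxRetries and spark.shuffle.io.retryWait are the retry properties that the Spark configuration documentation mentions alongside this limit.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative values only; tune them for your own workload.
// The master URL is expected to come from spark-submit.
val conf = new SparkConf()
  .setAppName("shuffle-tuning-example") // hypothetical app name
  // Max number of chunks the shuffle service transfers at the same time.
  .set("spark.shuffle.maxChunksBeingTransferred", "100000")
  // How many times a failed fetch is retried before the task fails.
  .set("spark.shuffle.io.maxRetries", "10")
  // How long to wait between those retries.
  .set("spark.shuffle.io.retryWait", "15s")

val sc = new SparkContext(conf)
```

Values set in code take precedence over the ones passed to spark-submit, so use one approach or the other. If you run an external shuffle service, the limit is likely enforced on the service side, so it may also need to be set in that service's own configuration; treat that as something to verify for your deployment.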
