How to increase the amount of data to be transferred to shuffle service at the same time?

I am facing a problem with the shuffle service. The data being sent to the shuffle service is large, and not all of it gets sent. The client retries, and after some time I get a fetch failure error. While looking for a solution, I found that this can be avoided by increasing the amount of data sent at a time. How can I do that?
Mar 1, 2019 in Apache Spark by Yashita

1 answer to this question.

The limit on how much can be transferred at the same time is already set very high by default, so you would not usually reach it unless you have a really huge dataset. I am not sure the fix you have in mind will actually solve the fetch failures, but here is how to change it anyway. The property is spark.shuffle.maxChunksBeingTransferred, and you can supply it when you submit the application instead of hard-coding it:

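// In your application, create the context from an empty SparkConf so that properties can be supplied at submit time:
val sc = new SparkContext(new SparkConf())
# Then pass the property with --conf when you submit:
./bin/spark-submit <all your existing options> --conf spark.shuffle.maxChunksBeingTransferred=NEW_VALUE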
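
Since the question mentions client retries followed by a fetch failure, it may also be worth looking at the shuffle retry properties at the same time. The following is only a rough sketch, not a tested recipe: it assembles the spark-submit command and prints it as a dry run. The retry values (6 and 10s), the class name com.example.MyApp and the file my-app.jar are all made up for illustration, and NEW_VALUE is still a placeholder you have to replace with a number. The properties spark.shuffle.io.maxRetries and spark.shuffle.io.retryWait are not part of the answer above; they are taken from the Spark configuration docs, so check them for your version. If you run the external shuffle service, the chunk limit may need to be set where that service is configured rather than per application, which is an assumption to verify for your deployment.

```shell
#!/bin/sh
# Sketch: build a spark-submit command that sets the shuffle-related properties.
# All values are placeholders for illustration, not recommendations.

MAX_CHUNKS="${MAX_CHUNKS:-NEW_VALUE}"   # placeholder from the answer; replace with a number
MAX_RETRIES="${MAX_RETRIES:-6}"         # illustrative; Spark's documented default is 3
RETRY_WAIT="${RETRY_WAIT:-10s}"         # illustrative; Spark's documented default is 5s

build_submit_cmd() {
  # Print the command instead of running it, so it can be inspected first.
  # Extra arguments (your existing options, then the application JAR) go last.
  printf '%s ' ./bin/spark-submit \
    --conf "spark.shuffle.maxChunksBeingTransferred=${MAX_CHUNKS}" \
    --conf "spark.shuffle.io.maxRetries=${MAX_RETRIES}" \
    --conf "spark.shuffle.io.retryWait=${RETRY_WAIT}" \
    "$@"
  printf '\n'
}

# Dry run with a hypothetical application class and JAR.
build_submit_cmd --class com.example.MyApp my-app.jar
```

The --conf flags are placed before the application JAR on purpose: anything that comes after the JAR on the spark-submit command line is handed to the application as its own arguments rather than read as a Spark option.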
answered Mar 1, 2019 by Omkar
