How to increase the amount of data transferred to the shuffle service at the same time?

0 votes
I am facing a problem with the shuffle service. The data being sent to the shuffle service is large, the complete data is not being sent, and when the client retries I eventually get a fetch-fail error. While looking for a solution, I found that this can be avoided by increasing the amount of data sent at a time. How can I do that?
Mar 1, 2019 in Apache Spark by Yashita
81 views

1 answer to this question.

0 votes

The amount of data that can be transferred at the same time is controlled by spark.shuffle.maxChunksBeingTransferred, which is set very high by default (Long.MAX_VALUE), so you wouldn't usually reach that limit unless you have a really huge dataset. I am not sure the solution you have in mind will actually work, but here's how you can change it. Set it on the SparkConf:

val conf = new SparkConf().set("spark.shuffle.maxChunksBeingTransferred", "NEW_VALUE")
val sc = new SparkContext(conf)

or pass it when submitting the application:

./bin/spark-submit <all your existing options> --conf spark.shuffle.maxChunksBeingTransferred=NEW_VALUE
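
If you set it through SparkConf, you can read the value back from the running context as a quick check (a minimal sketch; getOption returns None if the property was never set):

// returns Some("NEW_VALUE") if the property was set, None otherwise
sc.getConf.getOption("spark.shuffle.maxChunksBeingTransferred")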
answered Mar 1, 2019 by Omkar
• 68,860 points

Related Questions In Apache Spark

0 votes
1 answer

How to increase wait time to launch data-local task?

You can increase the locality wait time ...READ MORE
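
For reference, the relevant setting is spark.locality.wait (default 3s). A minimal sketch of passing it at submit time, assuming 10s is the wait you want:

./bin/spark-submit <your existing options> --conf spark.locality.wait=10s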

answered Mar 11, 2019 in Apache Spark by Raj
55 views
0 votes
1 answer

How to get the number of elements in a partition?

rdd.mapPartitions(iter => Array(iter.size).iterator, true)
This command will ...READ MORE
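
A variant of the same idea that also reports which partition each count came from (a sketch, assuming rdd is already defined):

// (partition index, element count) for every partition
rdd.mapPartitionsWithIndex((i, iter) => Iterator((i, iter.size))).collect()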

answered May 8, 2018 in Apache Spark by kurt_cobain
• 9,290 points
312 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,840 points
3,915 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the old API; org.apache.hadoop.mapreduce is the ...READ MORE
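
To make the difference concrete, the two APIs live in different packages. A minimal illustration using one well-known class from each:

import org.apache.hadoop.mapred.JobConf   // old API
import org.apache.hadoop.mapreduce.Job    // new API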

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,840 points
538 views
+1 vote
11 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest>
copy syntax: copyFr ...READ MORE
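
For example (hypothetical local file and HDFS destination, just to show the shape of the command):

hadoop fs -put localfile.txt /user/hadoop/input/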

answered Dec 7, 2018 in Big Data Hadoop by Aditya
20,817 views
0 votes
1 answer

How to find the number of nulls contained in a dataframe?

Hey there! You can use the select method of the ...READ MORE
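
A sketch of that approach, assuming df is your DataFrame; count(when(...)) counts only the rows where the condition holds, giving a per-column null count:

import org.apache.spark.sql.functions.{col, count, when}
// one count expression per column, each counting that column's nulls
df.select(df.columns.map(c => count(when(col(c).isNull, c)).alias(c)): _*).show()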

answered May 3, 2019 in Apache Spark by Omkar
• 68,860 points
357 views
0 votes
1 answer

How to select all columns with group by?

You can use the following to print ...READ MORE
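
One common pattern is to compute the aggregate and join it back so every original column is kept (a sketch, assuming df has a grouping column named "key"):

// per-group counts, attached back to every row
val counts = df.groupBy("key").count()
df.join(counts, Seq("key")).show()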

answered Feb 18, 2019 in Apache Spark by Omkar
• 68,860 points
80 views