How to increase the amount of data transferred to the shuffle service at the same time

I am facing a problem with the shuffle service. The data being sent to the shuffle service is large; the complete data is not transferred, and when the client retries, after some time I get a fetch-failed error. While looking for a solution, I found that this can be avoided by increasing the amount of data sent at a time. How can I do that?
Mar 1, 2019 in Apache Spark by Yashita

1 answer to this question.


The limit on the amount of data transferred at the same time is controlled by the property spark.shuffle.maxChunksBeingTransferred, and its default is very high, so you wouldn't usually hit it unless you have a really huge dataset. I am not sure the change you have in mind will actually fix the fetch failures, but anyway, here's how you can do it. You can set it when building the context or when submitting the application.

To set it when creating the context:

val conf = new SparkConf().set("spark.shuffle.maxChunksBeingTransferred", "NEW_VALUE")
val sc = new SparkContext(conf)

Or pass it with spark-submit (note the --conf flag; there is no --spark.* option):

./bin/spark-submit <all your existing options> --conf spark.shuffle.maxChunksBeingTransferred=NEW_VALUE
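To check that the setting actually took effect, you can read it back from the running context. A minimal Scala sketch, assuming a local Spark installation; the value 500000 and the app name are only illustrative placeholders, not recommendations (the default is effectively unlimited). Also note that if you run the external shuffle service, this property may need to be set in the shuffle service's own configuration as well, not only per application.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Set the limit before the context is created; shuffle-related
// properties are read at startup, not re-read afterwards.
val conf = new SparkConf()
  .setAppName("shuffle-chunk-limit-demo")   // placeholder app name
  .setMaster("local[*]")                    // local master so the sketch runs without a cluster
  .set("spark.shuffle.maxChunksBeingTransferred", "500000")

val sc = new SparkContext(conf)

// Read the effective value back from the context's configuration.
println(sc.getConf.get("spark.shuffle.maxChunksBeingTransferred"))

sc.stop()
```

This requires a Spark runtime on the classpath, so treat it as a configuration sketch rather than a standalone program.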
answered Mar 1, 2019 by Omkar
• 69,170 points
