Why is Spark map output compressed

Question

I am using Spark for MapReduce-style jobs, and I see that the output file written after the map phase is always compressed. Why is this happening?

score 0 · Answer 1 · Feb 24, 2019

Spark compresses map output files by default, and this is usually the right choice: the map output is written to disk and then transferred over the network during the shuffle, so smaller files generally mean less disk and network I/O. The behavior is controlled by the property spark.shuffle.compress, which decides whether map output files are compressed and is set to true by default. If you do not want the output to be compressed, you can override this property when you submit the job:

./bin/spark-submit --conf spark.shuffle.compress=false <application-jar> [application-arguments]
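
To get a feel for why compressing map output is usually worthwhile, here is a rough sketch in plain Python. It does not use Spark at all: the simulate_map_output helper and its word list are hypothetical, and zlib is only a stand-in for whichever codec Spark is configured with through spark.io.compression.codec. It simply shows that repetitive key/value records of the kind a map phase often emits shrink considerably when compressed, and that the data is recovered unchanged afterwards.

```python
import zlib


def simulate_map_output(num_records: int = 10_000) -> bytes:
    """Build fake map-side output: repetitive "key<TAB>1" records (hypothetical data)."""
    words = ["spark", "shuffle", "map", "reduce", "compress"]
    lines = [f"{words[i % len(words)]}\t1\n" for i in range(num_records)]
    return "".join(lines).encode("utf-8")


# Uncompressed bytes, roughly what would be written with spark.shuffle.compress=false.
raw = simulate_map_output()

# zlib is an assumption used for illustration; Spark applies its own configured codec.
compressed = zlib.compress(raw)

# Fraction of the original size that remains after compression.
ratio = len(compressed) / len(raw)

print(f"raw bytes:        {len(raw)}")
print(f"compressed bytes: {len(compressed)}")
print(f"compressed/raw:   {ratio:.3f}")

# Compression is lossless: the reduce side sees exactly the same records.
assert zlib.decompress(compressed) == raw
```

The trade-off is extra CPU time for compressing and decompressing, so disabling the property may only make sense when the shuffle data is small or already compressed.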

answered Feb 24, 2019 by Wasim
