How to compress serialized RDD partition

Question

Hi. I am facing memory issues with Spark serialization. The serialized RDDs are huge, and I want to know if there is any way to compress them. Please help.

score 0 · Answer 1 · Mar 7, 2019

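Yes, you can do this by enabling RDD compression with the spark.rdd.compress property. It applies only to partitions that are stored in serialized form (for example, with StorageLevel.MEMORY_ONLY_SER), and it trades some extra CPU time for the space saved. To enable it, create the SparkContext in your application with an empty SparkConf so that properties can be supplied at launch:

val sc = new SparkContext(new SparkConf())

Then pass the property with --conf when you submit the job. Keep it ahead of the application JAR, because anything after the JAR is passed to your application as an argument:

./bin/spark-submit --conf spark.rdd.compress=true <all your existing options>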
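If you would rather set the property in code than on the command line, a minimal sketch could look like the one below. The object name, app name, and sample data are made up for illustration, and the local master is only a fallback so the sketch can run outside a cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical example object; the names here are illustrative only.
object CompressedRddExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("CompressedRddExample")        // assumed app name
      .setIfMissing("spark.master", "local[*]")  // fallback for a local run
      .set("spark.rdd.compress", "true")         // compress serialized partitions

    val sc = new SparkContext(conf)

    // Sample data standing in for your real RDD.
    val numbers = sc.parallelize(1 to 1000000)

    // Compression only takes effect for serialized storage levels.
    numbers.persist(StorageLevel.MEMORY_ONLY_SER)

    // An action materializes the RDD and populates the compressed cache.
    println(numbers.count())

    sc.stop()
  }
}
```

The compression codec is taken from spark.io.compression.codec, so you can compare the cached size in the Storage tab of the Spark UI with and without the setting to see whether the extra CPU cost is worth it for your data.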

answered Mar 7, 2019 by Pavitra
