How to compress serialized RDD partitions?

0 votes
Hi. I am facing memory issues with Spark serialization. The serialized RDD partitions are huge, and I want to know if there is any way to compress them. Please help.
Mar 7 in Apache Spark by Rajath
120 views

1 answer to this question.

0 votes

Yes, you can do this by enabling RDD compression via the spark.rdd.compress property. Either pass it to spark-submit with --conf:

./bin/spark-submit <all your existing options> --conf spark.rdd.compress=true

or set it on the SparkConf before creating the context, for example in the Spark shell:

val sc = new SparkContext(new SparkConf().set("spark.rdd.compress", "true"))

Note that this property only compresses serialized RDD partitions (e.g. those persisted with StorageLevel.MEMORY_ONLY_SER), trading some extra CPU time for a smaller memory footprint.
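For reference, here is a minimal, self-contained sketch of how this fits together in a standalone Scala application. The application name, the example data, and the codec choice are illustrative assumptions, not requirements; the point is that spark.rdd.compress must be set before the context is created, and the RDD must be persisted with a serialized (*_SER) storage level for the setting to have any effect.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object RddCompressExample {
  def main(args: Array[String]): Unit = {
    // Enable compression of serialized RDD partitions before creating the context.
    val conf = new SparkConf()
      .setAppName("rdd-compress-example") // hypothetical app name
      .set("spark.rdd.compress", "true")
      // Optional: pick the codec used for internal compression
      // (lz4 is the default in recent Spark versions).
      .set("spark.io.compression.codec", "lz4")
    val sc = new SparkContext(conf)

    // spark.rdd.compress only applies to partitions stored in serialized form,
    // so persist with a *_SER storage level.
    val rdd = sc.parallelize(1 to 1000000)
    rdd.persist(StorageLevel.MEMORY_ONLY_SER)

    // Trigger an action so the RDD is actually computed and cached.
    println(rdd.count())

    sc.stop()
  }
}

With compression enabled you can compare the cached size under the Storage tab of the Spark UI before and after; whether it helps depends on how compressible your data is.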
answered Mar 7 by Pavitra
