How to change the spark Session configuration in Pyspark?

0 votes

I am trying to change the default configuration of Spark Session. But it is not working.

spark_session  = SparkSession.builder
                      .master("ip")
                      .enableHiveSupport()
                      .getOrCreate()

spark_session.conf.set("spark.executor.memory", '8g')
spark_session.conf.set('spark.executor.cores', '3')
spark_session.conf.set('spark.cores.max', '3')
spark_session.conf.set("spark.driver.memory",'8g')
sc = spark_session.sparkContext

But if I put the configuration in Spark submit, then it works fine for me.

spark-submit --master ip --executor-cores=3 --diver 8G sample.py
May 29, 2018 in Apache Spark by code799
7,674 views

4 answers to this question.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

You are not changing the configuration of PySpark. Just open pyspark shell and check the settings:

sc.getConf().getAll()

Now you can execute the code and again check the setting of the Pyspark shell.

You first have to create conf and then you can create the Spark Context using that configuration object.

config = pyspark.SparkConf().setAll([('spark.executor.memory', '8g'), ('spark.executor.cores', '3'), ('spark.cores.max', '3'), ('spark.driver.memory','8g')])
sc.stop()
sc = pyspark.SparkContext(conf=config)
answered May 29, 2018 by Shubham
• 12,270 points
0 votes

Adding to Shubham's answer, after updating the configuration, you have to stop the spark session and create a new spark session.

spark.sparkContext.stop()
spark = SparkSession.builder.config(conf=conf).getOrCreate()
answered Dec 10, 2018 by Hilight
0 votes

This should work

spark = SparkSession.builder.config(conf=conf1).getOrCreate()
sc = spark.sparkContext
answered Dec 10, 2018 by Shikar
0 votes

You can dynamically load properties. First create a new empty conf and then pass your conf on run-time:

val sc = new SparkContext(new SparkConf())
spark-submit --master ip --executor-cores=3 --diver 8G sample.py
answered Dec 10, 2018 by Vini

Related Questions In Apache Spark

0 votes
7 answers

How to print the contents of RDD in Apache Spark?

Simple and easy: line.foreach(println) READ MORE

answered Dec 10, 2018 in Apache Spark by Kuber
5,054 views
0 votes
1 answer

How to change the location of Spark event logs?

You can change the location where you ...READ MORE

answered Mar 6 in Apache Spark by Rohit
21 views
0 votes
1 answer

How to change commiter algorithm version in Spark?

To change to version 2, run the ...READ MORE

answered Mar 10 in Apache Spark by Siri
60 views
0 votes
1 answer

How to change scheduling mode in Spark?

You can change the scheduling mode as ...READ MORE

answered Mar 11 in Apache Spark by Raj
58 views
0 votes
0 answers
0 votes
1 answer

Is it possible to run Apache Spark without Hadoop?

Though Spark and Hadoop were the frameworks designed ...READ MORE

answered May 2 in Big Data Hadoop by ravikiran
• 1,540 points
18 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 9,070 points
1,684 views
0 votes
10 answers

hadoop fs -put command?

copy command can be used to copy files ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Sujay
8,182 views
0 votes
1 answer
0 votes
1 answer

In a Spark DataFrame how can I flatten the struct?

You can go ahead and use the ...READ MORE

answered May 24, 2018 in Apache Spark by Shubham
• 12,270 points
344 views

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
"PMP®","PMI®", "PMI-ACP®" and "PMBOK®" are registered marks of the Project Management Institute, Inc. MongoDB®, Mongo and the leaf logo are the registered trademarks of MongoDB, Inc.