Difference between sparkContext, JavaSparkContext, SQLContext, & SparkSession?

I'm new to Spark and I'm completely confused about the difference between SparkContext, JavaSparkContext, SQLContext, and SparkSession.

Can anyone answer the questions below?

  1. Is there any method to convert or create a context using SparkSession?
  2. Can I completely replace all the contexts with one single entry point, SparkSession?
  3. Are all the functions in SQLContext, SparkContext, JavaSparkContext, etc. included in SparkSession?
  4. Some functions, like parallelize, have different usages in SparkContext and JavaSparkContext. How do I use such functions with SparkSession?
  5. How do I create the following using SparkSession?
  • RDD
  • JavaRDD
  • JavaPairRDD
  • Dataset
Thanks in advance!
Jul 4, 2018 in Apache Spark by Shubham

Yes, there is a difference between SparkContext, JavaSparkContext, SQLContext, and SparkSession.

Let me answer your questions one by one.

SparkContext is the Scala entry point to Spark, and JavaSparkContext is a Java wrapper around SparkContext.

SQLContext is the entry point of Spark SQL, and can be obtained from a SparkContext. Prior to Spark 2.x, RDD, DataFrame, and Dataset were three different data abstractions. Since Spark 2.x, all three data abstractions are unified, and SparkSession is the unified entry point of Spark.

An additional note: RDDs are meant for unstructured, strongly typed data, while DataFrames are for structured, loosely typed data.

Is there any method to convert or create a context using SparkSession?

Yes. It's sparkSession.sparkContext (sparkSession.sparkContext() in Java), and for SQL, sparkSession.sqlContext.
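A minimal Scala sketch of this (the app name and local master are illustrative; in Java the same fields are accessed as method calls):

```scala
import org.apache.spark.sql.SparkSession

// Build (or reuse) a SparkSession -- the single entry point since Spark 2.x
val spark = SparkSession.builder()
  .appName("context-demo")  // illustrative name
  .master("local[*]")       // local mode, for illustration only
  .getOrCreate()

// The underlying contexts are exposed on the session:
val sc = spark.sparkContext    // org.apache.spark.SparkContext
val sqlCtx = spark.sqlContext  // org.apache.spark.sql.SQLContext (kept for backward compatibility)
```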

Can I completely replace all the contexts with one single entry point, SparkSession?

Yes. You can get the respective contexts from the SparkSession.

Are all the functions in SQLContext, SparkContext, JavaSparkContext, etc. included in SparkSession?

Not directly. You have to get the respective context and use it, as a kind of backward compatibility.

How do I use such functions with SparkSession?

Get the respective context and use it.
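For example, JavaSparkContext's parallelize takes a java.util.List, while SparkContext's takes a Seq. A sketch of getting at the Java-style API from a SparkSession (session setup here is illustrative):

```scala
import org.apache.spark.api.java.{JavaRDD, JavaSparkContext}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("parallelize-demo")  // illustrative name
  .master("local[*]")
  .getOrCreate()

// Wrap the underlying Scala SparkContext to get the Java-friendly API
val jsc = new JavaSparkContext(spark.sparkContext)

// JavaSparkContext.parallelize expects a java.util.List, not a Seq
val javaRdd: JavaRDD[Integer] = jsc.parallelize(java.util.Arrays.asList(1, 2, 3))
```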

How do I create the following using SparkSession?

  • RDD: can be created with sparkSession.sparkContext.parallelize()
  • JavaRDD: the same applies, but through the Java implementation (wrap the context in a JavaSparkContext)
  • JavaPairRDD: sparkSession.sparkContext.parallelize() followed by a map() that turns your data into key-value pairs is one way
  • Dataset: what SparkSession returns is a Dataset if it is structured data
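The points above can be sketched in Scala as follows (the sample data and app name are illustrative):

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

val spark = SparkSession.builder()
  .appName("create-demo")  // illustrative name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._   // brings in the encoders needed for toDS()

// RDD: parallelize through the underlying SparkContext
val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3))

// Pair RDD: map each element to a key-value tuple
val pairRdd = rdd.map(n => (n % 2, n))   // key = parity, value = the number

// Dataset: SparkSession hands back Datasets for structured data,
// e.g. from a local Seq via toDS(), or from spark.read for files
val ds: Dataset[Int] = Seq(1, 2, 3).toDS()
```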

Hope this helps!

answered Jul 4, 2018 by nitinrawat895
