Difference between createOrReplaceTempView and registerTempTable

0 votes
I'm new to Spark and I cannot seem to find a difference between createOrReplaceTempView and registerTempTable.

They seem almost the same.

Help needed
Apr 25, 2018 in Apache Spark by Ashish
• 2,650 points
5,820 views

2 answers to this question.

0 votes

createOrReplaceTempView() creates a local temporary view from the given DataFrame, replacing any existing view with the same name. The lifetime of this view is tied to the SparkSession that created it. If you want to drop this view:

spark.catalog.dropTempView("name")


createGlobalTempView() creates a global temporary view from the given DataFrame. The lifetime of this view is tied to the Spark application itself. If you want to drop it:

spark.catalog.dropGlobalTempView("name")
answered Apr 25, 2018 by kurt_cobain
• 9,320 points
0 votes

I am pretty sure createOrReplaceTempView simply replaced registerTempTable in Spark 2.0. I don't know exactly what changed in the underlying code, but you can use them in basically the same way. From the PySpark 2.4.5 docs:

https://spark.apache.org/docs/2.4.5/api/python/pyspark.sql.html?highlight=registertemptable#pyspark.sql.DataFrame.registerTempTable

registerTempTable(name)

Registers this DataFrame as a temporary table using the given name.

The lifetime of this temporary table is tied to the SparkSession that was used to create this DataFrame.

>>> df.registerTempTable("people")
>>> df2 = spark.sql("select * from people")
>>> sorted(df.collect()) == sorted(df2.collect())
True
>>> spark.catalog.dropTempView("people")

Note

Deprecated in 2.0, use createOrReplaceTempView instead.

answered Sep 18 by Nathan Mott
