Difference between createOrReplaceTempView and registerTempTable

I'm new to Spark and I cannot seem to find a difference between createOrReplaceTempView and registerTempTable.

They seem almost the same.

Help needed
Apr 25, 2018 in Apache Spark by Ashish

2 answers to this question.


createOrReplaceTempView() creates a local temporary view from the given DataFrame, or replaces it if one with that name already exists. The lifetime of this view is tied to the SparkSession that created it. If you want to drop the view:

spark.catalog.dropTempView("name")


createGlobalTempView() creates a global temporary view from the given DataFrame. The lifetime of this view is tied to the Spark application itself. If you want to drop it:

spark.catalog.dropGlobalTempView("name")
answered Apr 25, 2018 by kurt_cobain

I am pretty sure createOrReplaceTempView simply replaced registerTempTable in Spark 2.0. I don't know exactly what changed in the underlying code, but you can use the two in basically the same way. The PySpark 2.4.5 documentation for registerTempTable says:

https://spark.apache.org/docs/2.4.5/api/python/pyspark.sql.html?highlight=registertemptable#pyspark.sql.DataFrame.registerTempTable

registerTempTable(name)

Registers this DataFrame as a temporary table using the given name.

The lifetime of this temporary table is tied to the SparkSession that was used to create this DataFrame.

>>> df.registerTempTable("people")
>>> df2 = spark.sql("select * from people")
>>> sorted(df.collect()) == sorted(df2.collect())
True
>>> spark.catalog.dropTempView("people")

Note: Deprecated in 2.0, use createOrReplaceTempView instead.

answered Sep 18, 2020 by Nathan Mott
