Spark in-memory processing on a non-temporary table

0 votes
When not using a temporary table, I assume the data is written to a file in HDFS. Since Spark does in-memory processing, this would differ from Spark's regular approach, because the data would now be read from the file for further processing. Is this assumption correct?
Jul 14 in Apache Spark by Kunal
22 views

1 answer to this question.

0 votes
A temporary table is more like a view over a DataFrame: Spark doesn't persist any metadata for it, which is why it is called temporary, and a temp table is always created from a DataFrame. Once that data is saved into a Hive table, it is stored in the Hive warehouse, which lives on HDFS, so that part of your assumption is absolutely correct. Whenever you read it afterwards, the SQLContext object reads it from HDFS instead of from memory. Since you are only reading the data, e.g. "select * from table_hive", there won't be much difference in processing time whether it is read from memory or from HDFS; but if you measure the difference in microseconds, in-memory processing takes the lead.
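A minimal sketch of that difference, assuming the Spark 1.x SQLContext/HiveContext API; the paths and table names (events.json, events_temp, table_hive) are illustrative only. Registering a temp table just binds a name to the DataFrame, while saveAsTable materialises the data in the Hive warehouse on HDFS, so later reads of that table go through HDFS:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object TempVsHiveTable {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("temp-vs-hive"))
    val sqlContext = new HiveContext(sc)

    // Hypothetical source DataFrame read from HDFS
    val df = sqlContext.read.json("hdfs:///data/events.json")

    // 1) Temporary table: only a name bound to the DataFrame for this
    //    SQLContext; nothing is written to the metastore or to HDFS.
    df.registerTempTable("events_temp")
    sqlContext.sql("SELECT * FROM events_temp").show()

    // 2) Hive table: the data is materialised in the Hive warehouse on HDFS,
    //    so subsequent queries read it back from HDFS rather than from the
    //    in-memory DataFrame.
    df.write.saveAsTable("table_hive")
    sqlContext.sql("SELECT * FROM table_hive").show()

    sc.stop()
  }
}
```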
answered Jul 14 by Suri
