What's the difference between 'filter' and 'where' in Spark SQL?

I have a file that contains employee data, and I want to filter the results using Spark SQL. I tried both filter and where, and found that they behave the same way.

Example:
val items = List(1, 2, 3)

Using filter:
employees.filter($"emp_id".isin(items: _*)).show

Using where:
employees.where($"emp_id".isin(items: _*)).show

I got the same result in both cases.
Can anyone tell me why filter and where return the same result?
May 23, 2018 in Apache Spark by kurt_cobain
• 9,260 points

recategorized May 23, 2018 by kurt_cobain 2,955 views

1 answer to this question.

Both 'filter' and 'where' in Spark SQL give the same result. There is no difference between the two: in the Dataset/DataFrame API, where is an alias for filter.

// The following pairs are equivalent:
employees.filter($"age" > 15)
employees.where($"age" > 15)

employees.filter($"emp_id".isin(items: _*)).show
employees.where($"emp_id".isin(items: _*)).show

filter is simply the standard Scala name for this kind of function, while where is provided for people who prefer SQL terminology.
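
To check the equivalence yourself, something like the following minimal sketch may help. It assumes a local SparkSession and uses a small in-memory DataFrame with made-up rows in place of the employee file from the question; the object name FilterVsWhere and the sample data are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

object FilterVsWhere {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption: Spark SQL is on the classpath)
    val spark = SparkSession.builder()
      .appName("filter-vs-where")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows standing in for the employee file
    val employees = Seq((1, "Ann", 34), (2, "Bob", 14), (4, "Cid", 51))
      .toDF("emp_id", "name", "age")
    val items = List(1, 2, 3)

    val viaFilter = employees.filter($"emp_id".isin(items: _*))
    val viaWhere  = employees.where($"emp_id".isin(items: _*))

    // Print both physical plans so they can be compared by eye
    viaFilter.explain()
    viaWhere.explain()

    // Compare the collected rows as sets so row order cannot matter
    require(viaFilter.collect().toSet == viaWhere.collect().toSet,
      "filter and where returned different rows")

    // Both methods also accept a SQL expression string
    val viaFilterSql = employees.filter("emp_id IN (1, 2, 3)")
    val viaWhereSql  = employees.where("emp_id IN (1, 2, 3)")
    require(viaFilterSql.count() == viaWhereSql.count(),
      "string-expression forms returned different row counts")

    spark.stop()
  }
}
```

If the two require checks pass and the printed plans look the same, that supports the point above: the choice between filter and where is a matter of style.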
answered May 23, 2018 by nitinrawat895
• 9,070 points

