What's the difference between 'filter' and 'where' in Spark SQL?

I have a file that contains employee data, and I want to filter the results using Spark SQL. I tried both filter and where, and they appear to work the same way.

Example:
val items = List(1, 2, 3)

Using filter:
employees.filter($"emp_id".isin(items: _*)).show()

Using where:
employees.where($"emp_id".isin(items: _*)).show()

I got the same result in both cases.
Can anyone tell me why filter and where give the same result?
May 23, 2018 in Apache Spark by kurt_cobain
• 9,240 points

recategorized May 23, 2018 by kurt_cobain 4,071 views

1 answer to this question.

Both 'filter' and 'where' in Spark SQL give the same result. There is no difference between the two: in the Dataset/DataFrame API, where is an alias for filter.

// The following pairs are equivalent:
employees.filter($"age" > 15)
employees.where($"age" > 15)

employees.filter($"emp_id".isin(items: _*)).show()
employees.where($"emp_id".isin(items: _*)).show()

 
filter is simply the standard Scala name for this kind of function, and where is provided for people who prefer SQL terminology.
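To check the equivalence yourself, here is a minimal sketch. It assumes a local SparkSession and a small, made-up employees DataFrame (the sample rows and the FilterVsWhere object name are illustrative, not taken from the question). It compares the rows returned by both calls and prints both query plans so they can be compared by eye.

```scala
import org.apache.spark.sql.SparkSession

object FilterVsWhere {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark SQL is on the classpath and a local master is acceptable.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-vs-where")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the employee file.
    val employees = Seq((1, "Ann"), (2, "Bob"), (4, "Cid")).toDF("emp_id", "name")
    val items = List(1, 2, 3)

    val viaFilter = employees.filter($"emp_id".isin(items: _*))
    val viaWhere  = employees.where($"emp_id".isin(items: _*))

    // Both calls should return exactly the same rows.
    assert(viaFilter.collect().toSet == viaWhere.collect().toSet)

    // Print the plans for both queries for a side-by-side comparison.
    viaFilter.explain()
    viaWhere.explain()

    // where (like filter) also accepts a SQL expression string.
    employees.where("emp_id IN (1, 2, 3)").show()

    spark.stop()
  }
}
```

Since where simply delegates to filter, the two plans should match; which one to use is a matter of style.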
answered May 23, 2018 by nitinrawat895
• 10,030 points
