How to groupBy/count then filter on count in Scala

I'm using Spark 2.1. I was trying to filter on the "count" column produced by groupBy, but it throws an exception.

Code: 

df.groupBy("travel").count()
  .filter("count >= 1000")
  .show()

java.lang.RuntimeException: [1.15] failure: ``('' expected but `>=' found count >= 1000
Apr 19, 2018 in Big Data Hadoop by Shubham

1 answer to this question.


I think the exception is caused because you used the keyword count in the filter expression.

When you pass a string to the filter function, it is parsed as a SQL expression in the background, and count is a keyword in SQL, so it gets misinterpreted here.

You can either refer to it as a column using the $ syntax, keeping the comparison outside the quotes:

import spark.implicits._ // required for the $ column syntax

df.groupBy("travel").count()
  .filter($"count" >= 1000)
  .show()
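Another way (not part of the original answer, but the Spark SQL expression parser should accept it) is to escape the column name with backticks inside the string, so count is read as an identifier rather than a keyword:

df.groupBy("travel").count()
  .filter("`count` >= 1000")
  .show()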

Alternatively, you can rename the generated count column and filter on the new name:

df.groupBy("travel").count().withColumnRenamed("count", "x")
  .filter("x >= 1000")
  .show()
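To tie it together, here is a minimal, self-contained sketch of the first approach; the sample data and the threshold of 2 are made up just to keep the output small:

import org.apache.spark.sql.SparkSession

object TravelCountFilter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("GroupByCountFilter")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with a single "travel" column
    val df = Seq("bus", "bus", "train", "bus", "train").toDF("travel")

    // Group, count, then filter on the generated "count" column
    df.groupBy("travel")
      .count()
      .filter($"count" >= 2) // threshold lowered to 2 for this tiny sample
      .show()

    spark.stop()
  }
}

Running this should print only the "bus" row, since "train" appears fewer than 2 times.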
answered Apr 19, 2018 by kurt_cobain

