How to select all columns with group by

0 votes

How do I select all columns when using group by in Spark?

df.select(*).groupby("id").agg(sum("salary"))

I tried using select as shown above, but could not make it work.

Feb 18, 2019 in Apache Spark by Ishan
4,538 views

1 answer to this question.

0 votes

groupBy returns only the grouping column and the aggregate, so to keep all the columns you can compute the sum per id and join it back to the original DataFrame:

resultset = df.groupBy("id").sum("salary");
joinedDS = df.join(resultset, "id");

answered Feb 18, 2019 by Omkar
• 69,170 points
