How to select all columns with group by

How do I select all columns with groupBy in Spark?

df.select(*).groupby("id").agg(sum("salary"))

I tried using select but could not make it work.

Feb 19, 2019 in Apache Spark by Ishan
• 15,757 views

1 answer to this question.

groupBy returns only the grouping column and the aggregates. To get all the columns, join the aggregated result back to the original DataFrame on id:

val resultset = df.groupBy("id").sum("salary")
val joinedDS = df.join(resultset, "id")

answered Feb 19, 2019 by Omkar
• 69,180 points
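
The join-back approach in the answer above can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: the sample rows, the name and dept columns, the total_salary alias, and the local master setting are all assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

// Local session for illustration only (assumed setup).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("groupby-all-columns")
  .getOrCreate()

// Hypothetical sample data; only id and salary come from the question.
val df = spark.createDataFrame(Seq(
  (1, "Asha", "eng", 100.0),
  (1, "Asha", "ops", 50.0),
  (2, "Ravi", "eng", 70.0)
)).toDF("id", "name", "dept", "salary")

// Aggregate per id; this DataFrame has only id and total_salary.
val totals = df.groupBy("id").agg(sum("salary").as("total_salary"))

// Join back on id so every original column sits next to the aggregate.
val joined = df.join(totals, "id")
joined.show()
```

Each original row is kept, and the per-id total is repeated on every row that shares that id.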

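Try the following. The method is spelled groupBy in Scala. Note that the result still contains only id and sum(salary), because groupBy drops the non-grouped columns even when they are selected first:

df.select(df("*")).groupBy("id").agg(sum("salary"))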

answered Sep 17, 2021 by Parimi Pavan

edited Mar 5
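
Neither answer mentions it, but a window function is another way to keep every column next to a per-id sum without a separate join. The sketch below is an alternative technique, not what either answer proposes. The sample rows, the name and dept columns, the total_salary column name, and the local master setting are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum}

// Local session for illustration only (assumed setup).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("window-all-columns")
  .getOrCreate()

// Hypothetical sample data; only id and salary come from the question.
val df = spark.createDataFrame(Seq(
  (1, "Asha", "eng", 100.0),
  (1, "Asha", "ops", 50.0),
  (2, "Ravi", "eng", 70.0)
)).toDF("id", "name", "dept", "salary")

// Window covering all rows that share the same id.
val byId = Window.partitionBy("id")

// Add the per-id salary sum as a new column; existing columns are untouched.
val withTotal = df.withColumn("total_salary", sum(col("salary")).over(byId))
withTotal.show()
```

As with the join, every input row survives and carries the total for its id. Which of the two performs better depends on the data and the cluster, so it is worth comparing both on a real workload.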
