How to select all columns with group by

0 votes

How to select all columns with group by in spark

df.select(*).groupby("id").agg(sum("salary"))

I tried using select but could not make it work.

Feb 18, 2019 in Apache Spark by Ishan
3,198 views

1 answer to this question.

0 votes

You can use the following to print all the columns:

resultset = df.groupBy("id").sum("salary");
joinedDS = studentDataset.join(resultset, "id");
answered Feb 18, 2019 by Omkar
• 69,130 points

Related Questions In Apache Spark

0 votes
1 answer

Unable to run select query with selected columns on a temp view registered in spark application

from pyspark.sql.types import FloatType fname = [1.0,2.4,3.6,4.2,45.4] df=spark.createDataFrame(fname, ...READ MORE

answered Mar 28, 2020 in Apache Spark by GAURAV
• 140 points
1,273 views
0 votes
1 answer
+2 votes
14 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...READ MORE

answered Apr 4, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar 69,164 views
0 votes
2 answers

How to use RDD filter with other function?

val x = sc.parallelize(1 to 10, 2)   // ...READ MORE

answered Aug 16, 2018 in Apache Spark by zombie
• 3,790 points
5,730 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
7,249 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
1,178 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
53,326 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
2,749 views
0 votes
1 answer

How to increase the amount of data to be transferred to shuffle service at the same time?

The amount of data to be transferred ...READ MORE

answered Mar 1, 2019 in Apache Spark by Omkar
• 69,130 points
256 views
0 votes
1 answer

How to find the number of null contain in dataframe?

Hey there! You can use the select method of the ...READ MORE

answered May 3, 2019 in Apache Spark by Omkar
• 69,130 points
1,891 views