How to transpose Spark DataFrame?

0 votes

I have Spark 2.1. My Spark Dataframe is as follows:

COLUMN                          VALUE
Column-1                       value-1
Column-2                       value-2
Column-3                       value-3
Column-4                       value-4
Column-5                       value-5

I need to transpose these columns and values. The result should look like this:

Column-1  Column-2  Column-3  Column-4  Column-5
value-1   value-2   value-3   value-4   value-5

Can anyone help me out with this? Preferably in Scala.

May 24, 2018 in Apache Spark by anonymous
6,045 views

3 answers to this question.

0 votes

In this situation, first collect all the entries of the COLUMN column to build the schema of the new DataFrame, then collect the entries of the VALUE column to form its single row.

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.sql.functions.collect_list

// collect the COLUMN entries and turn each one into a StringType field
val new_schema = StructType(
  df.select(collect_list("COLUMN")).first().getAs[Seq[String]](0)
    .map(name => StructField(name, StringType)))

// collect the VALUE entries into a single Row
val new_values = sc.parallelize(Seq(
  Row.fromSeq(df.select(collect_list("VALUE")).first().getAs[Seq[String]](0))))

sqlContext.createDataFrame(new_values, new_schema).show(false)
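The collect-then-rebuild idea above can be illustrated with a small pure-Python sketch (no Spark needed; the in-memory list of pairs below is a made-up stand-in for the two-column DataFrame from the question):

```python
# Stand-in for the two-column DataFrame: (COLUMN, VALUE) pairs.
rows = [("Column-1", "value-1"), ("Column-2", "value-2"),
        ("Column-3", "value-3"), ("Column-4", "value-4"),
        ("Column-5", "value-5")]

header = [c for c, _ in rows]  # plays the role of collect_list("COLUMN")
values = [v for _, v in rows]  # plays the role of collect_list("VALUE")

# One wide row keyed by the old column labels: the transposed result.
wide_row = dict(zip(header, values))
```

The Scala snippet does the same thing: one collect produces the field names for the schema, the other produces the values for the single row.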

Hope this helps.

answered May 24, 2018 by Shubham
• 13,300 points
0 votes
Here's how to do it in Python:

dt1 = {'one': [<insert data>], 'two': [<insert data>]}

# each dict entry becomes one row: (key, value1, value2, ...)
dt = sc.parallelize([(k,) + tuple(v) for k, v in dt1.items()]).toDF()
dt.show()
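The row construction fed to sc.parallelize above can be checked in plain Python (the small integer lists here are made-up values standing in for the <insert data> placeholders):

```python
# Hypothetical data in place of the <insert data> placeholders.
dt1 = {'one': [1, 2, 3], 'two': [4, 5, 6]}

# Same comprehension as the answer: each dict entry becomes one tuple,
# with the key prepended to its list of values.
rows = [(k,) + tuple(v) for k, v in dt1.items()]
```

Each tuple then becomes one row of the resulting DataFrame, so the dict keys end up as the first column.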
answered Dec 7, 2018 by shri
+1 vote
Please check the links below for dynamic transpose and reverse transpose:
1. https://dzone.com/articles/how-to-use-reverse-transpose-in-spark
2. https://dzone.com/articles/dynamic-transpose-in-spark
answered Dec 31, 2018 by anonymous
Hey! Thanks for those links. Do you know how to implement a static transpose?
