How to transpose Spark DataFrame

0 votes

I am using Spark 2.1. My Spark DataFrame is as follows:

COLUMN                          VALUE
Column-1                       value-1
Column-2                       value-2
Column-3                       value-3
Column-4                       value-4
Column-5                       value-5

I need to transpose these columns and values. The result should look like this:

Column-1  Column-2  Column-3  Column-4  Column-5
value-1   value-2   value-3   value-4   value-5

Can anyone help me out with this? Preferably in Scala

May 24, 2018 in Apache Spark by anonymous

3 answers to this question.

0 votes

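In this situation, collect all the entries of the Column column to build the schema of the new DataFrame, then collect all the entries of the Value column to form its single row. The code below assumes the original DataFrame is held in a variable named df:

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.collect_list
import org.apache.spark.sql.types.{StringType, StructField, StructType}
val new_schema = StructType(df.select(collect_list("Column")).first().getAs[Seq[String]](0).map(z => StructField(z, StringType)))
val new_values = sc.parallelize(Seq(Row.fromSeq(df.select(collect_list("Value")).first().getAs[Seq[String]](0))))
sqlContext.createDataFrame(new_values, new_schema).show(false)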
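
As an alternative, here is a minimal sketch that uses Spark's built-in pivot instead of collecting the names and values separately, so each name stays paired with its value rather than relying on two separately collected lists lining up. The object name TransposeSketch, the helper transpose, the variable df, and the local SparkSession are assumptions made for illustration; the column names COLUMN and VALUE are taken from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

object TransposeSketch {
  // Hypothetical helper: turns (COLUMN, VALUE) rows into one wide row.
  // groupBy() with no keys puts every row into a single group, pivot makes
  // one output column per distinct COLUMN entry, and first picks the
  // VALUE that belongs to that entry.
  def transpose(df: DataFrame): DataFrame =
    df.groupBy().pivot("COLUMN").agg(first("VALUE"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell a session already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("transpose-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data shaped like the table in the question.
    val df = Seq(
      ("Column-1", "value-1"),
      ("Column-2", "value-2"),
      ("Column-3", "value-3"),
      ("Column-4", "value-4"),
      ("Column-5", "value-5")
    ).toDF("COLUMN", "VALUE")

    transpose(df).show(false)
    spark.stop()
  }
}
```

This only suits a modest number of distinct names, since Spark limits how many distinct pivot values it will accept (the spark.sql.pivotMaxValues setting), and the output columns come out in sorted order of the names.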

Hope this helps.

answered May 24, 2018 by Shubham
0 votes
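Here's how to do it in Python:

import numpy as np
from pyspark.sql import SQLContext
from pyspark.sql.functions import lit

# Each key of the dict holds the list of values for one original column
dt1 = {'one':[<insert data>],'two':[<insert data>]}

# Build one row per key: the key first, followed by that key's values
dt = sc.parallelize([(k,) + tuple(v) for k, v in dt1.items()]).toDF()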
answered Dec 7, 2018 by shri
+1 vote
Please check the below mentioned links for Dynamic Transpose and Reverse Transpose
answered Jan 1, 2019 by anonymous
Hey! Thanks for those links. Do you know how to implement a static transpose?

