How to transpose Spark DataFrame?

0 votes

I have Spark 2.1. My Spark Dataframe is as follows:

COLUMN                          VALUE
Column-1                       value-1
Column-2                       value-2
Column-3                       value-3
Column-4                       value-4
Column-5                       value-5

I have to transpose these columns & values. It should look like:

Column-1  Column-2  Column-3  Column-4  Column-5
value-1   value-2   value-3   value-4   value-5

Can anyone help me out with this? Preferably in Scala.

May 24, 2018 in Apache Spark by anonymous

3 answers to this question.

0 votes

In this situation, first collect all the entries of the COLUMN column, which will give you the schema of the new dataframe. Then collect the entries of the VALUE column to form the single data row.

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.sql.functions.collect_list

val new_schema = StructType(df.select(collect_list("COLUMN")).first().getAs[Seq[String]](0).map(z => StructField(z, StringType)))
val new_values = sc.parallelize(Seq(Row.fromSeq(df.select(collect_list("VALUE")).first().getAs[Seq[String]](0))))
sqlContext.createDataFrame(new_values, new_schema).show(false)

Hope this helps.
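If it helps to see what those three lines are doing, here is the same collect-and-rebuild logic on plain Python lists, with no Spark involved. The sample rows mirror the dataframe in the question; the variable names are just chosen to match the Scala answer above.

```python
# Plain-Python sketch of the collect-and-rebuild transpose above (no Spark).
# The sample rows mirror the (COLUMN, VALUE) dataframe in the question.
rows = [
    ("Column-1", "value-1"),
    ("Column-2", "value-2"),
    ("Column-3", "value-3"),
    ("Column-4", "value-4"),
    ("Column-5", "value-5"),
]

# "Collect" the COLUMN entries: these become the new schema (column names).
new_schema = [col for col, _ in rows]
# "Collect" the VALUE entries: these become the single data row.
new_values = tuple(val for _, val in rows)

print(new_schema)  # the five column names, in order
print(new_values)  # the five values, forming one row
```

In Spark, `createDataFrame(new_values, new_schema)` then stitches these two collected lists back together into a one-row dataframe.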

answered May 24, 2018 by Shubham
• 13,450 points
0 votes
Here's how to do it in Python:

from pyspark.sql import SQLContext

sqlContext = SQLContext(sc)  # needed so that rdd.toDF() is available

# Each dict key becomes one row: (key, value1, value2, ...),
# so a dict of columns is reshaped into rows.
dt1 = {'one': [<insert data>], 'two': [<insert data>]}
dt = sc.parallelize([(k,) + tuple(v[0:]) for k, v in dt1.items()]).toDF()
answered Dec 7, 2018 by shri
+1 vote
Please check the links below for Dynamic Transpose and Reverse Transpose.
answered Dec 31, 2018 by anonymous
Hey! Thanks for those links. Do you know how to implement a static transpose?
