How to transpose Spark DataFrame

0 votes

I have Spark 2.1. My Spark Dataframe is as follows:

COLUMN                          VALUE
Column-1                       value-1
Column-2                       value-2
Column-3                       value-3
Column-4                       value-4
Column-5                       value-5

I have to transpose these columns and values. The result should look like this:

Column-1  Column-2  Column-3  Column-4  Column-5
value-1   value-2   value-3   value-4   value-5

Can anyone help me out with this? Preferably in Scala.

May 24, 2018 in Apache Spark by anonymous
15,378 views

3 answers to this question.

0 votes

In this situation, first collect all the entries of the Column column; they become the field names in the schema of the new DataFrame. Then collect all the entries of the Value column; they form the single row of the new DataFrame.

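import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.collect_list
import org.apache.spark.sql.types.{StringType, StructField, StructType}
val new_schema = StructType(df.select(collect_list("Column")).first().getAs[Seq[String]](0).map(z => StructField(z, StringType)))  // df is the input DataFrame; its Column entries become field names
val new_values = sc.parallelize(Seq(Row.fromSeq(df.select(collect_list("Value")).first().getAs[Seq[String]](0))))  // its Value entries become one Row
sqlContext.createDataFrame(new_values, new_schema).show(false)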
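
The same idea can be sketched without Spark. The following is a minimal illustration in plain Python, not the answerer's code: `transpose_pairs` is a hypothetical helper name, and the sample rows simply mirror the table in the question. It reads each name and value from the same input row, so the header and the single output row stay aligned by construction.

```python
from typing import List, Sequence, Tuple


def transpose_pairs(rows: Sequence[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Turn (name, value) rows into a header list and one row of values."""
    header = [name for name, _ in rows]
    values = [value for _, value in rows]
    return header, values


# Sample rows mirroring the table in the question.
rows = [(f"Column-{i}", f"value-{i}") for i in range(1, 6)]
header, values = transpose_pairs(rows)

print(header)
print(values)
```

With PySpark, rows like these could come from `df.collect()`, and the result could probably be passed on as `spark.createDataFrame([values], schema=header)`, where `spark` is assumed to be an existing SparkSession. This only makes sense when the input is small enough to collect on the driver.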

Hope this helps.

answered May 24, 2018 by Shubham
• 13,480 points
0 votes
Here's how to do it in Python:

import numpy as np
from pyspark.sql import SQLContext
from pyspark.sql.functions import lit

dt1 = {'one':[<insert data>],'two':[<insert data>]}
# Each dictionary entry becomes one row: the key followed by its values
dt = sc.parallelize([(k,) + tuple(v[0:]) for k, v in dt1.items()]).toDF()
dt.show()
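
To see what the list comprehension in this answer builds before it is handed to Spark, here is a small sketch in plain Python. The dictionary contents are made-up sample data standing in for the elided values, not something from the original answer.

```python
# Hypothetical sample data standing in for the elided values in the answer.
dt1 = {"one": [1, 2, 3], "two": [4, 5, 6]}

# Same shape as the comprehension in the answer: each key is prepended to
# its list of values, so every dictionary entry becomes one row tuple.
rows = [(k,) + tuple(v) for k, v in dt1.items()]

for row in rows:
    print(row)
```

Each tuple in `rows` is what `sc.parallelize(...).toDF()` would turn into one DataFrame row, so the dictionary keys end up in the first column rather than in the header.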
answered Dec 7, 2018 by shri
+1 vote
Please check the links below for dynamic transpose and reverse transpose:
1. https://dzone.com/articles/how-to-use-reverse-transpose-in-spark
2. https://dzone.com/articles/dynamic-transpose-in-spark
answered Dec 31, 2018 by anonymous
Hey! Thanks for those links. Do you know how to implement a static transpose?
