Spark: Can we add a column to a DataFrame?

+1 vote
Can we add a column to a DataFrame? If yes, please share the code.
Aug 9, 2019 in Apache Spark by Chirag

2 answers to this question.

+1 vote

Yes, you can add a column using withColumn together with a column expression, as shown below.

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._ // for `when`

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._ // for `toDF` and $""

val df = sc.parallelize(Seq((4, "blah", 2), (2, "", 3), (56, "foo", 3), (100, null, 5)))
  .toDF("A", "B", "C")

val newDf = df.withColumn("D", when($"B".isNull or $"B" === "", 0).otherwise(1))

newDf.show() prints:

+---+----+---+---+
|  A|   B|  C|  D|
+---+----+---+---+
|  4|blah|  2|  1|
|  2|    |  3|  0|
| 56| foo|  3|  1|
|100|null|  5|  0|
+---+----+---+---+
answered Aug 9, 2019 by Shirish
+1 vote

Yes, you can add columns to an existing DataFrame. Note, though, that the example below uses pandas rather than Spark — in pandas a new column can be added by direct assignment:

import pandas as pd

data = {'Name': ['Indis', 'Sachin', 'Rohit', 'Dhoni'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Team', 'Opener', 'Hitman', 'Keeper']}
df = pd.DataFrame(data)

address = ['India', 'Mumbai', 'Chennai', 'Patna']
df['Address'] = address
df


answered Oct 24, 2019 by Siva
• 160 points
