Spark: Can we add a column to a DataFrame?

+1 vote
Can we add a column to a DataFrame in Spark? If so, please share the code.
Aug 9, 2019 in Apache Spark by Chirag
3,153 views

2 answers to this question.

+1 vote

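Yes. You can add a column with withColumn, which takes the name of the new column and a Column expression that computes its value. The example below adds a column D that is 0 when B is null or empty and 1 otherwise.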

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._ // for `when`

// `sc` is the SparkContext that spark-shell provides.
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._ // for `toDF` and $""

val df = sc.parallelize(Seq((4, "blah", 2), (2, "", 3), (56, "foo", 3), (100, null, 5)))
    .toDF("A", "B", "C")

// D is 0 when B is null or empty, otherwise 1.
val newDf = df.withColumn("D", when($"B".isNull or $"B" === "", 0).otherwise(1))

newDf.show() prints:

+---+----+---+---+
|  A|   B|  C|  D|
+---+----+---+---+
|  4|blah|  2|  1|
|  2|    |  3|  0|
| 56| foo|  3|  1|
|100|null|  5|  0|
+---+----+---+---+
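
The answer above uses SQLContext, the entry point from Spark 1.x. In Spark 2.0 and later, SparkSession is the usual entry point. The following is a minimal sketch of the same idea under that assumption; it also adds a constant column with lit and a column computed from existing ones. The object name AddColumnSketch and the column names source and A_plus_C are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}

// Hypothetical example object; not part of the original answer.
object AddColumnSketch {
  // Returns a new DataFrame; the input DataFrame is left unchanged.
  def addColumns(df: DataFrame): DataFrame =
    df
      // Same rule as above: 0 when B is null or empty, otherwise 1.
      .withColumn("D", when(col("B").isNull || col("B") === "", 0).otherwise(1))
      // A constant column: every row gets the same literal value.
      .withColumn("source", lit("demo"))
      // A column computed from two existing columns.
      .withColumn("A_plus_C", col("A") + col("C"))

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs without a cluster.
    val spark = SparkSession.builder()
      .appName("add-column-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // for `toDF`

    val df = Seq[(Int, String, Int)]((4, "blah", 2), (2, "", 3), (56, "foo", 3), (100, null, 5))
      .toDF("A", "B", "C")

    addColumns(df).show()
    spark.stop()
  }
}
```

Each withColumn call returns a new DataFrame, so the calls can be chained; if the column name already exists, withColumn replaces that column instead of adding a new one.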
answered Aug 9, 2019 by Shirish
+1 vote

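Yes, you can add columns to an existing DataFrame. Note that the example below uses pandas rather than Spark: it builds a pandas DataFrame and adds an Address column by assigning a list with one entry per row.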

import pandas as pd

data = {'Name': ['Indis', 'Sachin', 'Rohit', 'Dhoni'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Team', 'Opener', 'Hitman', 'Keeper']}

df = pd.DataFrame(data)

address = ['India', 'Mumbai', 'Chennai', 'Patna']

df['Address'] = address

df
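
Spark DataFrames are immutable and have no positional assignment like df['Address'] = address, so the pandas example above does not carry over directly. One common approach is to put the new values in a second DataFrame that shares a key column and join the two. The following is a minimal sketch in Scala, assuming Name is unique per row; the object name AddAddressColumn and the helper withAddress are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object; not part of the original answer.
object AddAddressColumn {
  // Left join keeps every player, even one without a matching address.
  def withAddress(players: DataFrame, addresses: DataFrame): DataFrame =
    players.join(addresses, Seq("Name"), "left")

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption so the sketch runs without a cluster.
    val spark = SparkSession.builder()
      .appName("add-address-column")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // for `toDF`

    val players = Seq(
      ("Indis", 5.1, "Team"),
      ("Sachin", 6.2, "Opener"),
      ("Rohit", 5.1, "Hitman"),
      ("Dhoni", 5.2, "Keeper")
    ).toDF("Name", "Height", "Qualification")

    // The new values, keyed by Name instead of by row position.
    val addresses = Seq(
      ("Indis", "India"),
      ("Sachin", "Mumbai"),
      ("Rohit", "Chennai"),
      ("Dhoni", "Patna")
    ).toDF("Name", "Address")

    withAddress(players, addresses).show()
    spark.stop()
  }
}
```

The join matches rows by key rather than by position, so the row order of the result is not guaranteed; add an orderBy if the order matters.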

answered Oct 24, 2019 by Siva
• 160 points
