How to create a not-null column in a case class in Spark

Question

How do I create a not-null (non-nullable) column in a case class?

package com.spark.sparkpkg

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.Encoders
import org.apache.log4j.Logger
import org.apache.log4j.Level

object CaseClassSample extends App {
  // Silence Spark's own logging so only the output below is printed
  Logger.getLogger("org").setLevel(Level.OFF)
  val spark = SparkSession.builder().master("local[*]").appName("caseClass").getOrCreate()

  import spark.implicits._
  case class test(empid: String, userName: String)

  // Read the CSV with a header row and map each row onto the case class
  val ds = spark.read
    .option("header", "true")
    .csv("C:/Users/dkumar77.EAD/Desktop/SparkData/11.csv")
    .as[test]

  ds.show()
  ds.printSchema()
}

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
+-----+--------+
|empid|userName|
+-----+--------+
|    1|  Deepak|
| null|    Test|
+-----+--------+

root
 |-- empid: string (nullable = true)    <-- this should be nullable = false
 |-- userName: string (nullable = true)

How can we do this? I tried a few things, such as Option[String], but they did not work. Can you please help?

MD · Answer 1 · May 14, 2020

Hi @Deepak,

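In your test class, empid is declared as a String, and the CSV reader marks every column it reads as nullable, which is why the schema shows nullable = true. To work with the column types, import the types package:

import org.apache.spark.sql.types._

You can then cast the columns in your program like this (ds is the Dataset from your code):

ds.withColumn("empid", $"empid".cast(IntegerType))
ds.withColumn("userName", $"userName".cast(StringType))

Note that a cast only changes the data type of a column. It does not change the nullable flag, so the schema itself has to be adjusted to get nullable = false.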
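
Since the nullable flag belongs to the schema rather than to the column type, one commonly used way to get nullable = false is to drop the rows that hold null in that column and then rebuild the DataFrame with an adjusted schema. The following is only a minimal sketch of that idea, not the only approach. The object name NotNullColumnSketch and the helper withNotNull are hypothetical names made up for illustration, and a small in-memory DataFrame stands in for the CSV file from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.StructType

// Hypothetical helper object, for illustration only.
object NotNullColumnSketch {

  // Returns a DataFrame in which the given columns are marked nullable = false.
  // Rows that contain null in those columns are dropped first, because the
  // flag is a promise to Spark about the data, not a validation step.
  def withNotNull(spark: SparkSession, df: DataFrame, columns: Seq[String]): DataFrame = {
    val cleaned = df.na.drop(columns)
    val strictSchema = StructType(cleaned.schema.map { field =>
      if (columns.contains(field.name)) field.copy(nullable = false) else field
    })
    // Re-create the DataFrame from the same rows with the stricter schema.
    spark.createDataFrame(cleaned.rdd, strictSchema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("notNullSketch").getOrCreate()
    import spark.implicits._

    // In-memory stand-in for the CSV data shown in the question.
    val df = Seq[(String, String)](("1", "Deepak"), (null, "Test")).toDF("empid", "userName")

    val strict = withNotNull(spark, df, Seq("empid"))
    strict.printSchema() // empid should now be reported as nullable = false
    strict.show()        // the row with a null empid has been dropped

    spark.stop()
  }
}
```

With the Dataset from the question, the call would look like NotNullColumnSketch.withNotNull(spark, ds.toDF(), Seq("empid")).as[test]. Dropping the null rows is a design choice in this sketch; if those rows must be kept, the column cannot honestly be declared non-nullable.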

