How to replace null values in Spark DataFrame

I want to remove null values from a CSV file, so I tried the following.

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/usr/local/spark/cars.csv")

After loading the file, the DataFrame contains null values. Now I want to remove them.

So I do this: df.na.fill("e", Seq("blank"))
But the null values didn't change. Can anyone help me?

May 31, 2018 in Apache Spark by kurt_cobain
edited Dec 15, 2020 by MD

This is basically very simple. You'll need to create a new DataFrame. I'm using the DataFrame df that you have defined earlier.

val newDf = df.na.fill("e", Seq("blank"))

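DataFrames are immutable structures. A transformation such as fill never modifies the DataFrame it is called on; it returns a new one. Each time you perform a transformation whose result you need to keep, you'll need to assign the transformed DataFrame to a new value.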
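The point about immutability can be checked with a small sketch. It assumes Spark 2.x or later with a local SparkSession, and the two-row sample data and the object name NaFillSketch are hypothetical stand-ins for the CSV in the question; only the column name "blank" and the fill call come from the thread.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NaFillSketch {
  // Hypothetical sample data: one row has a null in the "blank" column.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(String, Option[String])](
      ("audi", None),
      ("bmw", Some("x"))
    ).toDF("make", "blank")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("na-fill-sketch").getOrCreate()

    val df = sample(spark)

    // fill returns a new DataFrame; df itself is left untouched.
    val newDf = df.na.fill("e", Seq("blank"))

    df.show()    // the null in "blank" is still there
    newDf.show() // the null in "blank" has been replaced by "e"

    spark.stop()
  }
}
```

Calling df.na.fill(...) without keeping the result, as in the question, builds the new DataFrame and then discards it, which is why df appeared unchanged.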

answered May 31, 2018 by nitinrawat895
val map = Map("comment" -> "a", "blank" -> "a2")
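A map like the one above can presumably be passed to na.fill to give each column its own replacement value. The following is a minimal sketch under that assumption: the sample rows and the object name NaFillMapSketch are hypothetical, while the column names "comment" and "blank" and the replacement values come from the map above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NaFillMapSketch {
  // Hypothetical sample data with nulls in both columns.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(Option[String], Option[String])](
      (None, Some("x")),
      (Some("ok"), None)
    ).toDF("comment", "blank")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("na-fill-map-sketch").getOrCreate()

    val df = sample(spark)

    // Keys are column names, values are the per-column replacements.
    val map = Map("comment" -> "a", "blank" -> "a2")
    val filled = df.na.fill(map)

    filled.show()

    spark.stop()
  }
}
```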
answered Dec 10, 2018 by Sute
df1 = df.na.fill("e", Seq("blank"));
answered Dec 10, 2018 by Shanti
String[] colNames = {"NameOfColumn"};
dataframe = dataframe.na().fill("ValueToBeFilled", colNames);
answered Dec 10, 2018 by Sada
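import org.apache.spark.sql.functions.udf

// Returns None for a null input instead of throwing a NullPointerException.
def isEvenOption(n: Integer): Option[Boolean] = {
  val num = Option(n).getOrElse(return None)
  Some(num % 2 == 0)
}

val isEvenOptionUdf = udf[Option[Boolean], Integer](isEvenOption)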

Source: Dealing with null in Spark

answered Dec 10, 2018 by Mohan
To remove null values, we have to use drop().

df.na.drop() will remove every row that contains a null from the DF.
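As a rough illustration of drop(), the sketch below uses df.na.drop() and two of its variants from DataFrameNaFunctions. The three-row sample data, the column names, and the object name NaDropSketch are hypothetical and chosen only to make the row counts easy to follow.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NaDropSketch {
  // Hypothetical sample: one complete row, one partly null row, one all-null row.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(Option[String], Option[String])](
      (Some("bmw"), Some("x")),
      (Some("audi"), None),
      (None, None)
    ).toDF("make", "blank")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("na-drop-sketch").getOrCreate()

    val df = sample(spark)

    // Drop rows that contain a null in any column.
    val noNulls = df.na.drop()

    // Drop only rows in which every column is null.
    val notAllNull = df.na.drop("all")

    // Drop rows that have a null in the listed columns only.
    val makePresent = df.na.drop(Seq("make"))

    println(noNulls.count())     // 1
    println(notAllNull.count())  // 2
    println(makePresent.count()) // 2

    spark.stop()
  }
}
```

Note that drop() removes whole rows rather than individual null cells, so it suits cases where incomplete records should be discarded instead of filled.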
Hi, I hope this helps.

val df = sqlContext.read.format("com.databricks.spark.csv").option("nullValue","defaultvalue").option("header", "true").load("/usr/local/spark/cars.csv"

answered Feb 5, 2019 by Srinivasreddy
Is a closed parenthesis missing at the end of the command?
Sir, can you please explain this code?
In Spark 2.x you can use df.dropna() directly (this is the PySpark method; the Scala equivalent is df.na.drop()) to drop rows with nulls from the DataFrame.
answered Mar 29, 2020 by gaurav
In Spark, the fill() function of the DataFrameNaFunctions class is used to replace NULL values in DataFrame columns with zero (0), an empty string, a space, or any constant literal value.

//Replace all integer and long columns
df.na.fill(0)

//Replace with specific columns
df.na.fill(0, Array("population"))
answered Dec 15, 2020 by MD
