How to replace null values in Spark DataFrame?

0 votes

I want to remove the null values from a CSV file, so I tried the following.

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/usr/local/spark/cars.csv")

After loading the file, the DataFrame contains null values, which I now want to remove.

So I do this: df.na.fill("e", Seq("blank"))
But the null values didn't change. Can anyone help me?

May 31, 2018 in Apache Spark by kurt_cobain
• 9,260 points

6 answers to this question.

0 votes
This is basically very simple. You need to create a new DataFrame. I'm using the DataFrame df that you defined earlier.

val newDf = df.na.fill("e", Seq("blank"))

DataFrames are immutable structures. Each time you apply a transformation whose result you want to keep, you need to assign the transformed DataFrame to a new value.
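
To see the immutability in action, here is a minimal sketch. It assumes a local SparkSession and a made-up two-column DataFrame: the column name `name`, the sample rows, and the helper `nullCount` are hypothetical, and only the `na.fill("e", Seq("blank"))` call comes from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object FillNullsSketch {
  // Hypothetical helper: counts the rows whose `column` is null.
  def nullCount(df: DataFrame, column: String): Long =
    df.filter(df(column).isNull).count()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fill-nulls-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up data: the "blank" column is null in the second row.
    val df = Seq[(String, String)](("ford", "x"), ("tesla", null)).toDF("name", "blank")

    // na.fill returns a new DataFrame; df itself is not modified.
    val newDf = df.na.fill("e", Seq("blank"))

    println(s"nulls in df:    ${nullCount(df, "blank")}")
    println(s"nulls in newDf: ${nullCount(newDf, "blank")}")

    spark.stop()
  }
}
```

Calling `df.na.fill(...)` without keeping the result, as in the question, builds the filled DataFrame and then discards it, which is why `df` still shows the nulls.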
answered May 31, 2018 by nitinrawat895
• 9,310 points
0 votes
val map = Map("comment" -> "a", "blank" -> "a2")
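
The answer stops at the map itself. Presumably the next step is to pass it to `na.fill`, which accepts a map from column name to replacement value. A minimal sketch under that assumption (the sample rows and the helper name `fillDefaults` are made up for illustration):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object FillWithMapSketch {
  // Hypothetical helper: replaces nulls per column.
  // Keys are column names, values are the replacements for that column.
  def fillDefaults(df: DataFrame): DataFrame = {
    val map = Map("comment" -> "a", "blank" -> "a2")
    df.na.fill(map)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fill-with-map-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up rows with one null in each column.
    val df = Seq[(String, String)]((null, "x"), ("ok", null)).toDF("comment", "blank")

    // As with any transformation, the result has to be kept in a new value.
    val filled = fillDefaults(df)
    filled.show()

    spark.stop()
  }
}
```

This form is handy when different columns need different replacement values.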
answered Dec 10, 2018 by Sute
0 votes
val df1 = df.na.fill("e", Seq("blank"))
answered Dec 10, 2018 by Shanti
0 votes
String[] colNames = {"NameOfColumn"};
dataframe = dataframe.na().fill("ValueToBeFilled", colNames);
answered Dec 10, 2018 by Sada
0 votes
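import org.apache.spark.sql.functions.udf

// Returns None for a null input instead of failing on it.
def isEvenOption(n: Integer): Option[Boolean] = {
  val num = Option(n).getOrElse(return None)
  Some(num % 2 == 0)
}

val isEvenOptionUdf = udf[Option[Boolean], Integer](isEvenOption)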

Source: Dealing with null in Spark

answered Dec 10, 2018 by Mohan
0 votes

Hi, I hope this helps.


val df = sqlContext.read.format("com.databricks.spark.csv").option("nullValue","defaultvalue").option("header", "true").load("/usr/local/spark/cars.csv"
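
A note on what this option does, as far as I understand the CSV reader: `nullValue` sets which string in the file is read as null. It does not replace nulls with a default, so a `na.fill` is still needed afterwards. Below is a minimal sketch of the two steps together. It uses the built-in CSV reader on in-memory lines instead of a file, and the token `"NA"`, the sample lines, and the helper `readCsv` are assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NullValueOptionSketch {
  // Hypothetical helper: parses in-memory CSV lines and treats the
  // literal token "NA" as null while reading.
  def readCsv(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._
    spark.read
      .option("header", "true")
      .option("nullValue", "NA")
      .csv(lines.toDS())
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("null-value-sketch")
      .getOrCreate()

    // Made-up file contents, inlined so the sketch needs no file on disk.
    val lines = Seq("name,blank", "ford,NA", "tesla,x")

    // Step 1: "NA" in the input is read as null.
    val df = readCsv(spark, lines)
    df.show()

    // Step 2: the nulls are replaced in a new DataFrame.
    val filled = df.na.fill("defaultvalue", Seq("blank"))
    filled.show()

    spark.stop()
  }
}
```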

answered Feb 5 by Srinivasreddy
• 140 points
Is a closing parenthesis missing at the end of the command?
Sir, can you please explain this code?
