Is there an efficient way of dealing with null values in the concat functionality of PySpark SQL (version 2.3.4)?

+1 vote
Nov 5, 2019 in Apache Spark by aizhar
• 130 points
16,172 views

1 answer to this question.

+1 vote

In Spark SQL, concatenating a string with a NULL value yields NULL. To avoid this, you can wrap each column in the COALESCE function.

spark.sql("SELECT CONCAT(COALESCE(Name, ''), ' ', COALESCE(Column2, '')) AS Result FROM table_test").show()

The COALESCE function returns its first non-null argument. So when a column holds a non-null value, that value is concatenated as-is, and when the column is null, an empty string is concatenated instead. This keeps a single null column from turning the whole result into NULL.


Thanks.

answered Nov 6, 2019 by Rishi
Can you please suggest how I can concatenate a date column if it contains null values?

Thanks in advance.

Pravin

Hi @Pravin,

You can replace the null values with a meaningful placeholder, for example 0, which avoids the null problem. See the example below.

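from pyspark.sql.functions import unix_timestamp, when

# Use 0 for null or empty dates; otherwise parse the date string into a Unix timestamp (seconds).
formatted = (
    df
    .withColumn('Created-formatted',
                when(df.Created.isNull() | (df.Created == ''), 0)
                .otherwise(unix_timestamp(df.Created, 'yyyy-MM-dd')))
    .withColumn('EventDate-formatted',
                when(df.EventDate.isNull() | (df.EventDate == ''), 0)
                .otherwise(unix_timestamp(df.EventDate, 'yyyy-MM-dd')))
    .drop('Created', 'EventDate')
)
formatted.show()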

Before running this, check the date format used in your dataset and adjust the 'yyyy-MM-dd' pattern accordingly.
