Use the length function inside substring in Spark

+2 votes
I'm using Spark 2.1.

Using the length function inside substring on a DataFrame gives me a type mismatch error.

val SSDF = testDF.withColumn("newcol", substring($"col", 1, length($"col")-1))

May 3, 2018 in Apache Spark by Data_Nerd
• 2,390 points
42,883 views

4 answers to this question.

+1 vote

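You can use the expr function. It parses a SQL expression string, so length(value) - 1 is evaluated per row inside substring:

import org.apache.spark.sql.functions.expr
import spark.implicits._  // needed for toDF; `spark` is the active SparkSession

val data = List("..", "...", "...")
val df = spark.sparkContext.parallelize(data).toDF("value")
val result = df.withColumn("cutted", expr("substring(value, 1, length(value)-1)"))
result.show(false)

Hope this helps.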
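
A minimal, self-contained sketch of the expr approach follows. The sample strings, the column name "trimmed", and the local SparkSession settings are assumptions made for illustration; in spark-shell a session named `spark` already exists, so the builder lines would not be needed there.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Local session for illustration only (assumed settings).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("substring-with-expr")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq("spark", "scala", "sql").toDF("value")

// The whole expression is parsed as SQL, so length(value) - 1 is computed per row.
val trimmed = df.withColumn("trimmed", expr("substring(value, 1, length(value) - 1)"))
trimmed.show(false)
```

Both the original column and the new one are kept, so show prints two columns: value and trimmed.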

answered May 3, 2018 by kurt_cobain
• 9,390 points
Can you provide some working examples?
This method shows only the last column.
0 votes

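You can try this. Column.substr has an overload that takes Column arguments for both the start position and the length, so wrap the start position in lit:

val substrDF = testDF.withColumn("newcol", $"col".substr(lit(1), length($"col") - 1))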
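
For anyone who wants to run it end to end, here is a small sketch of the same idea. The sample rows and the local SparkSession settings are assumptions; only the substr call itself comes from the line above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, length, lit}

// Local session for illustration only (assumed settings).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("column-substr")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with the column name used in the question.
val testDF = Seq("hello", "ab").toDF("col")

// Both arguments are Columns: lit(1) for the start, length(col) - 1 for the length.
val substrDF = testDF.withColumn("newcol", col("col").substr(lit(1), length(col("col")) - 1))
substrDF.show(false)
```

Mixing an Int start with a Column length does not match either substr overload, which is why the literal 1 is wrapped in lit.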

answered Dec 10, 2018 by Devatha
Thank you so much!
0 votes

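You have passed the wrong parameter types. Here is the signature of substring:

substring(str: Column, pos: Int, len: Int): Column

Both pos and len must be plain Int values, but length($"col") - 1 is a Column, which is what causes the mismatch.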
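
To make the difference concrete, the sketch below contrasts a call that fits this signature with the kind of value that does not. The sample data and session settings are assumptions, and the note about the rejected call applies to the Int-only signature quoted above.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, length, substring}

// Local session for illustration only (assumed settings).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("substring-signature")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val testDF = Seq("hello", "world").toDF("col")

// Fits the signature: pos and len are Int literals.
val fixedDF = testDF.withColumn("newcol", substring(col("col"), 1, 3))

// length(...) - 1 is a Column, not an Int, so it cannot be passed as `len`
// when only the (Column, Int, Int) signature is available.
val lenMinusOne: Column = length(col("col")) - 1
```

So substring from functions is fine for a fixed length, while a per-row length needs a form that accepts a column expression.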
answered Dec 10, 2018 by Saloni
0 votes

You can also use the built-in regexp_replace function (no UDF is needed) to strip the last character:

testDF.withColumn("newcol", regexp_replace($"col", ".$", "")).show
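
A runnable sketch of the regex approach, with assumed sample data and session settings, might look like this. The pattern ".$" matches the single character right before the end of the string and replaces it with an empty string.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace}

// Local session for illustration only (assumed settings).
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("regexp-replace-last-char")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val testDF = Seq("alice", "bob").toDF("col")

// Drop the last character by replacing it with "".
val strippedDF = testDF.withColumn("newcol", regexp_replace(col("col"), ".$", ""))
strippedDF.show(false)
```

This avoids computing the length at all, at the cost of running a regular expression on every row.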

answered Dec 10, 2018 by Foane
