Use the length function inside substring in Spark

+2 votes
I'm using Spark 2.1.

Using the length function inside substring on a DataFrame column gives me a type mismatch error:

val SSDF = testDF.withColumn("newcol", substring($"col", 1, length($"col") - 1))

May 3, 2018 in Apache Spark by Data_Nerd
• 2,360 points
11,203 views

4 answers to this question.

+1 vote

You can use the expr function, which parses a SQL expression string so that length can be evaluated per row:

import org.apache.spark.sql.functions.expr
import spark.implicits._

val data = List("..", "...", "...")
val df = data.toDF("value")
val result = df.withColumn("cutted", expr("substring(value, 1, length(value) - 1)"))
result.show(false)

This might help
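For instance, with concrete sample data (hypothetical values, assuming a spark-shell session where spark.implicits._ is in scope), the expression drops the last character of each value:

```scala
import org.apache.spark.sql.functions.expr

// Hypothetical sample values for illustration
val df = List("spark!", "scala!").toDF("value")
val result = df.withColumn("cutted", expr("substring(value, 1, length(value) - 1)"))
result.show(false)
// cutted column holds "spark" and "scala"
```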

answered May 3, 2018 by kurt_cobain
• 9,240 points
Can you provide some working examples?
This method only shows the last column.
0 votes

You can try this. The substr method on Column has an overload that accepts Column arguments, so length fits directly:

import org.apache.spark.sql.functions.{lit, length}

val substrDF = testDF.withColumn("newcol", $"col".substr(lit(1), length($"col") - 1))

answered Dec 10, 2018 by Devatha
Thank you so much!
0 votes

You have passed the wrong parameter types. The standalone substring function expects plain Int arguments for pos and len, which is why the Column returned by length does not fit:

substring(str: Column, pos: Int, len: Int): Column
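Because len is a plain Int in this signature, substring type-checks when the length is a constant; a Column-valued length needs expr or Column.substr instead. A minimal sketch (assuming testDF has a string column col and that a fixed length of 3 is wanted):

```scala
import org.apache.spark.sql.functions.substring

// Fixed-length prefix: len is an Int literal, so the types match
val prefixed = testDF.withColumn("newcol", substring($"col", 1, 3))
```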
answered Dec 10, 2018 by Saloni
0 votes

You can also skip substring entirely and strip the last character with regexp_replace:

testDF.withColumn("newcol", regexp_replace($"col", ".$", "")).show

answered Dec 10, 2018 by Foane
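If you do want a literal UDF rather than a built-in function, a minimal sketch (assuming testDF has a string column col; the dropLast name is made up for illustration):

```scala
import org.apache.spark.sql.functions.udf

// Hypothetical UDF that drops the last character of a string, guarding against null/empty
val dropLast = udf((s: String) => if (s == null || s.isEmpty) s else s.dropRight(1))

testDF.withColumn("newcol", dropLast($"col")).show()
```

Note that built-ins like regexp_replace are generally preferred over UDFs, since UDFs are opaque to Spark's Catalyst optimizer.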
