Use the length function inside substring in Spark

+2 votes
I'm using Spark 2.1.

Calling length inside substring on a DataFrame column gives me a type mismatch error:

val SSDF = testDF.withColumn("newcol", substring($"col", 1, length($"col")-1))

May 3, 2018 in Apache Spark by Data_Nerd
• 2,370 points
20,088 views

4 answers to this question.

+1 vote

You can use the expr function. It takes the whole call as a SQL expression string, so length(value)-1 is evaluated per row:

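import org.apache.spark.sql.functions.expr
import spark.implicits._  // needed for toDF; assumes a SparkSession named spark, as in spark-shell

val data = List("..", "...", "...")
val df = spark.sparkContext.parallelize(data).toDF("value")
// substring and length are parsed together as one SQL expression
val result = df.withColumn("cutted", expr("substring(value, 1, length(value)-1)"))
result.show(false)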

This might help
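Here is a minimal self-contained sketch of the expr approach. It assumes a local SparkSession, and the sample values are hypothetical rather than taken from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// Assumption: a local session for experimenting; in spark-shell `spark` already exists
val spark = SparkSession.builder().master("local[*]").appName("expr-substring").getOrCreate()
import spark.implicits._

// Hypothetical sample data, not from the question
val df = Seq("abc", "de", "f").toDF("value")

// The whole call is parsed as SQL, so length(value) - 1 is computed per row
val result = df.withColumn("cutted", expr("substring(value, 1, length(value) - 1)"))

// Collect to the driver so the values can be inspected directly
val cutted = result.select("cutted").as[String].collect().toSeq
println(cutted)
```

With this data, "abc" becomes "ab", "de" becomes "d", and the single-character value "f" becomes an empty string.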

answered May 3, 2018 by kurt_cobain
• 9,310 points
Can you provide some working examples?
Only the last column is shown by this method.
0 votes

You can try Column.substr, which accepts Column arguments for both the start position and the length:

val substrDF = testDF.withColumn("newcol", $"col".substr(lit(1), length($"col") - 1))
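A minimal runnable sketch of the substr approach, assuming a local SparkSession and a hypothetical testDF with a single string column named col. It also shows the (Int, Int) overload for comparison. The two overloads cannot be mixed, which is why the start position is wrapped in lit.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{length, lit}

// Assumption: a local session for experimenting; in spark-shell `spark` already exists
val spark = SparkSession.builder().master("local[*]").appName("column-substr").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the testDF in the question
val testDF = Seq("hello", "spark").toDF("col")

// Both arguments are Columns, so the length can depend on each row's value
val substrDF = testDF.withColumn("newcol", $"col".substr(lit(1), length($"col") - 1))

// The (Int, Int) overload is enough when the bounds are fixed
val fixedDF = testDF.withColumn("first3", $"col".substr(1, 3))

val newcol = substrDF.select("newcol").as[String].collect().toSeq
val first3 = fixedDF.select("first3").as[String].collect().toSeq
println(newcol)
println(first3)
```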

answered Dec 10, 2018 by Devatha
Thank you so much!
0 votes

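You have passed the wrong parameter types. substring expects Int values for pos and len, but length($"col")-1 is a Column, which causes the type mismatch. Here is the signature:

substring(str: Column, pos: Int, len: Int): Column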
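A short sketch of what this signature accepts, assuming a local SparkSession and a hypothetical one-row testDF. The call from the question is left as a comment because it does not compile.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.substring

// Assumption: a local session for experimenting; in spark-shell `spark` already exists
val spark = SparkSession.builder().master("local[*]").appName("substring-signature").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the testDF in the question
val testDF = Seq("abcdef").toDF("col")

// Compiles: pos and len are plain Ints
val fixed = testDF.withColumn("newcol", substring($"col", 1, 3))

// Does not compile, because length($"col") - 1 is a Column rather than an Int:
// testDF.withColumn("newcol", substring($"col", 1, length($"col") - 1))

val values = fixed.select("newcol").as[String].collect().toSeq
println(values)
```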
answered Dec 10, 2018 by Saloni
0 votes

You can also use the built-in regexp_replace function (no UDF needed) to strip the last character:

testDF.withColumn("newcol", regexp_replace($"col", ".$", "")).show
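A minimal sketch of the regexp_replace approach, assuming a local SparkSession and a hypothetical testDF whose string column is named col.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.regexp_replace

// Assumption: a local session for experimenting; in spark-shell `spark` already exists
val spark = SparkSession.builder().master("local[*]").appName("regexp-trim").getOrCreate()
import spark.implicits._

// Hypothetical sample data, not from the question
val testDF = Seq("abc", "de", "f").toDF("col")

// ".$" matches the single character right before the end of the string
val trimmed = testDF.withColumn("newcol", regexp_replace($"col", ".$", ""))

val newcol = trimmed.select("newcol").as[String].collect().toSeq
println(newcol)
```

One caveat: in Java regular expressions `.` does not match line terminators by default, so a value that ends in a newline may not lose the character you expect.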

answered Dec 10, 2018 by Foane
