Delimiter on the data

0 votes

I have a file with records as below.

s.no,name,Country
101,Raju,India,IN
102,Reddy,UnitedStates,US

Here the Country column holds values such as "India,IN", which is a single value that itself contains a comma. How can I handle this data when reading the file with a comma delimiter in Spark (Scala)? I tried split(","), but it did not give the expected output.

For example, the expected output for the first record is:

S.no: 101
name: Raju
Country: India,IN
Jul 25, 2019 in Big Data Hadoop by Karan
45 views

1 answer to this question.

0 votes

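You can combine the columns into a single column with struct, then drop the originals:

// toDF needs spark.implicits._ (imported automatically in spark-shell)
import org.apache.spark.sql.functions.struct

val df = Seq((1,2), (3,4), (5,3)).toDF("a", "b")

// "new" is a reserved word in Scala, so the result gets a different name
val combined = df.withColumn("NewColumn", struct(df("a"), df("b")))

combined.show()

+---+---+---------+
|a  |b  |NewColumn|
+---+---+---------+
|1  |2  |[1,2]    |
|3  |4  |[3,4]    |
|5  |3  |[5,3]    |
+---+---+---------+

// drop both source columns, keeping only the combined one
val data = combined.drop("a", "b")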
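
For the exact record layout in the question, another option is to limit the number of splits so that any extra commas stay inside the last field. This is only a minimal sketch in plain Scala. It assumes every record has exactly three logical fields and that only the last one can contain commas. parseRecord is a hypothetical helper name, not something from the original post.

```scala
// Hypothetical helper: split a line into at most `fieldCount` fields.
// With a limit, String.split stops after fieldCount - 1 delimiters,
// so the remaining text (commas included) ends up in the last field.
def parseRecord(line: String, fieldCount: Int = 3): Array[String] =
  line.split(",", fieldCount)

// Sample records taken from the question (header line left out)
val lines = Seq("101,Raju,India,IN", "102,Reddy,UnitedStates,US")
val records = lines.map(line => parseRecord(line))

records.foreach { case Array(sNo, name, country) =>
  println(s"s.no: $sNo, name: $name, Country: $country")
}
```

In Spark the same function could probably be applied per line, for example in a map over lines read with spark.read.textFile after skipping the header. If commas can also appear in other columns, this approach will not work and the file would need quoted values instead.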
answered Jul 25, 2019 by Vinay
