Scala save filtered data row by row using saveAsTextFile


Hello Team,

I want to save filtered data row by row using saveAsTextFile. Kindly help.

I tried flatMap, but it flattens every column into its own element.

val rdd1 = sc.textFile("/user/edureka_40114/AppleStore.csv")

val rdd3 = rdd1.map(x => x.split(",")).filter(x => x(12).equals("\"Games\"")).map(x => (x(0), x(1)))

rdd3.collect()

I want the output saved to an HDFS file as shown below, but with flatMap every value ends up on a separate line.

"1","281656475"

"6","283619399"

"10","284736660"
Aug 2, 2019 in Apache Spark by Hari
840 views

1 answer to this question.


Try this code, it worked for me:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import sqlContext.implicits._

val data = sqlContext.read.format("csv").option("header", "false").load("/user/edureka_425640/AppleStore.csv")

data.write.format("csv").save("mobil_out")
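
Note that the code above copies the whole CSV without filtering. If you want one line per filtered row with saveAsTextFile, as asked in the question, one possible approach is to turn each row into a single String before saving, because saving the tuple directly writes it with surrounding parentheses. This is only a sketch. It assumes you are in spark-shell (so sc already exists), that the fields are quoted in the file as in the question, and that no quoted field contains a comma. The helper formatRow and the output directory games_out are names made up for illustration.

```scala
// Assumes spark-shell, where `sc` (the SparkContext) is predefined.

// Hypothetical helper: join the first two columns back into one output line.
def formatRow(cols: Array[String]): String = cols(0) + "," + cols(1)

val rdd1 = sc.textFile("/user/edureka_40114/AppleStore.csv")

// Naive split on commas: this breaks if a quoted field itself contains a comma.
// The length check skips short or malformed lines instead of throwing.
val games = rdd1
  .map(_.split(","))
  .filter(cols => cols.length > 12 && cols(12) == "\"Games\"")
  .map(cols => formatRow(cols))

// "games_out" is a hypothetical output directory; it must not already exist.
games.saveAsTextFile("games_out")
```

Each element of games is one String, so every part file under games_out holds one line per row.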


Hope this helps!
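
On why flatMap gave one value per line: flatMap expects a collection for each input element and concatenates all of them, so the columns of every row become separate elements. map keeps exactly one element per row. The small sketch below shows the difference on a plain Scala Seq, so it runs without Spark; RDDs behave the same way for these two operations. The two rows are sample data modeled on the expected output in the question, not read from the real file.

```scala
// Two sample rows, invented for illustration.
val rows = Seq(
  "\"1\",\"281656475\",\"Games\"",
  "\"6\",\"283619399\",\"Games\""
)

// flatMap: every column of every row becomes its own element.
val flattened = rows.flatMap(_.split(","))

// map: one element per row; join the wanted columns back into a single String.
val perRow = rows.map(_.split(",")).map(cols => cols.take(2).mkString(","))

flattened.foreach(println) // prints 6 lines, one value each
perRow.foreach(println)    // prints 2 lines, one row each
```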


Thanks!!

answered Aug 2, 2019 by Karan
