Spark: saving output to a CSV file

0 votes

It would be great if you could tell me what I am doing wrong in the code below. I just want to save the output in Ans3AppleStore.csv. I think it is the last part of the code that needs to change.

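import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

def main(args: Array[String]): Unit = {
  val conf = new SparkConf().setAppName("mod5sol")
  val sc: SparkContext = new SparkContext(conf)
  val sqlContext: SQLContext = new SQLContext(sc)
  // Read the CSV, treating the first row as column names
  val df: DataFrame = sqlContext.read.format("csv").option("header", "true").load("AppleStore.csv")
  df.registerTempTable("Apple5")
  // Note: size_bytes/1024 is kilobytes; dividing by 1024 again gives megabytes
  // save() returns Unit, so the result is not assigned to a variable
  sqlContext.sql("select size_bytes Size, (size_bytes/1024) In_MB, ((size_bytes/1024)/1024) In_GB from Apple5")
    .write.format("csv").save("Ans3AppleStore.csv")
}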
May 22, 2019 in Apache Spark by Jai
2,381 views

1 answer to this question.

0 votes

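save() writes a folder of part files, one per partition, rather than a single file. If you need a single output file (still inside a folder), you can repartition to one partition before writing (preferred if the upstream data is large, but it requires a shuffle):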

df
      .repartition(1)
      .write.format("com.databricks.spark.csv")
      .option("header", "true")
      .save("mydata.csv")
      

or coalesce (avoids a full shuffle, but can reduce the parallelism of the upstream stages):

df
      .coalesce(1)
      .write.format("com.databricks.spark.csv")
      .option("header", "true")
      .save("mydata.csv")

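With a single partition, all data is written to one part file inside the folder, for example mydata.csv/part-00000 (newer Spark versions add a unique ID and a .csv extension to the part file name). Both approaches funnel the whole dataset through a single task, so they only suit output that fits comfortably on one executor.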
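
If a real single file with a chosen name is needed, one option is to let Spark write to a temporary folder and then move the lone part file out of it. The following is only a minimal sketch, assuming the output lives on the local filesystem (not HDFS) and that exactly one part file was written. SinglePartFile and promote are hypothetical names used for illustration; they are not part of Spark.

```scala
import java.nio.file.{Files, Path, StandardCopyOption}
import scala.collection.mutable.ListBuffer

object SinglePartFile {
  /** Moves the only part-* file found in `outputDir` to `target` and returns `target`.
    * Hypothetical helper for local filesystems only; not part of Spark's API.
    */
  def promote(outputDir: Path, target: Path): Path = {
    // The glob skips _SUCCESS and the hidden .crc checksum files Spark may write
    val stream = Files.newDirectoryStream(outputDir, "part-*")
    val parts = ListBuffer.empty[Path]
    try {
      val it = stream.iterator()
      while (it.hasNext) parts += it.next()
    } finally {
      stream.close()
    }
    require(parts.size == 1, s"expected exactly one part file, found ${parts.size}")
    // Replaces an existing file at `target`; fails if `target` is a non-empty directory
    Files.move(parts.head, target, StandardCopyOption.REPLACE_EXISTING)
  }
}
```

For example, after saving to a folder such as Ans3AppleStore_tmp (an illustrative name), calling SinglePartFile.promote(java.nio.file.Paths.get("Ans3AppleStore_tmp"), java.nio.file.Paths.get("Ans3AppleStore.csv")) would leave a plain file named Ans3AppleStore.csv. The target must not be the Spark output folder itself. On HDFS the Hadoop FileSystem API would be needed instead.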

answered May 22, 2019 by Rishi
