spark.read.csv to convert an RDD of a case class into a DataFrame


I am trying to parse a file into an RDD of a case class and then convert that RDD into a DataFrame, but I couldn't get it to work. I am trying to use spark.read.csv. Please help.

Jan 21 in Big Data Hadoop by slayer

1 answer to this question.


You can define a case class, build an RDD of it, and then convert that RDD to a DataFrame.
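
A minimal sketch of that approach, assuming a hypothetical case class `Airport` and a local SparkSession (the class name, fields, and sample data are illustrations, not from the question):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case class used only for illustration
case class Airport(airportId: Int, name: String)

object RddToDf {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-to-df")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // enables .toDF() on RDDs of case classes

    // Build an RDD of the case class, then convert it to a DataFrame;
    // column names are taken from the case class field names.
    val rdd = spark.sparkContext.parallelize(
      Seq(Airport(1, "SFO"), Airport(2, "JFK")))
    val df = rdd.toDF()
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

The key step is `import spark.implicits._`, which brings in the implicit encoders that make `.toDF()` available on an RDD of a case class.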

The common syntax to create a DataFrame directly from a file, if you want to rely on the CSV file's built-in header and inferred schema, is shown below for your reference:

val df = spark.read.option("header", "true").option("inferSchema", "true").csv("")

And if you don't, you can define a schema explicitly, such as:

import org.apache.spark.sql.types._

val schema = StructType(Array(StructField("AirportID", IntegerType, true)))
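
Putting the explicit schema to use looks roughly like this; the file path and the second column are assumptions added for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder()
  .appName("csv-explicit-schema")
  .master("local[*]")
  .getOrCreate()

// Explicit schema: avoids the extra pass over the data that
// inferSchema requires. "Name" is a hypothetical extra column.
val schema = StructType(Array(
  StructField("AirportID", IntegerType, true),
  StructField("Name", StringType, true)
))

val df = spark.read
  .option("header", "true")
  .schema(schema)                 // use our schema instead of inferring one
  .csv("path/to/airports.csv")    // hypothetical path

df.printSchema()
```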
answered Jan 21 by Omkar
