Using spark.read.csv to convert an RDD into a DataFrame

0 votes

I am trying to parse a file into an RDD of a case class and convert the RDD into a DataFrame, but I couldn't get it to work. I am trying to use spark.read.csv. Please help.

Jan 21 in Big Data Hadoop by slayer
• 29,040 points
372 views

1 answer to this question.

0 votes

You can define a case class, parse the file into an RDD of that class, and then convert the RDD to a DataFrame.

Alternatively, the common syntax to create a DataFrame directly from a file, if you are relying on the CSV file's header row and letting Spark infer the schema, is shown below for your reference (note that "inferSchema" must be quoted, and you need to supply the path to your file):

val df = spark.read.option("header", "true").option("inferSchema", "true").csv("path/to/file.csv")

And if you don't want to rely on schema inference, you can define a schema explicitly and pass it via spark.read.schema(schema):

import org.apache.spark.sql.types._

val schema = StructType(Array(StructField("AirportID", IntegerType, true)))
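Putting the two ideas together, here is a minimal sketch of the case-class route: parse each line of the file into an RDD, map it into the case class, and call toDF. The file name airports.csv and the Airport fields are illustrative assumptions, not from your post; adjust them to match your data.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative case class; replace the fields with your file's columns
case class Airport(AirportID: Int, Name: String)

object RddToDfExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-to-df")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // needed for .toDF()

    // Assumes a headerless CSV like: 1,Goroka
    val airportsDF = spark.sparkContext
      .textFile("airports.csv")
      .map(_.split(","))
      .map(cols => Airport(cols(0).toInt, cols(1)))
      .toDF()

    airportsDF.printSchema()
    spark.stop()
  }
}
```

If your file has a header line, filter it out before the map, or use the spark.read.csv route instead, which handles headers for you.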
answered Jan 21 by Omkar
• 65,810 points

