How can I remove the header row from a DataFrame?

Hi,
I have been trying to remove the header row from a DataFrame. Below is my code:

val file_source_read1=spark.read.option("header",false).option("delimiter",source_del).csv(source_path)
val file_source_read2=file_source_read1.first()
val file_source_read3 = file_source_read1.except(file_source_read2)

But it throws an error. Could you please help with this? If possible, please also tell me how to do it with PySpark.

Feb 14 in Apache Spark by Dinesh
1,838 views

1 answer to this question.

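The error most likely comes from except: first() returns a single Row, while except expects another DataFrame. You can use filter instead. Something like this: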

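// data is the DataFrame that was read with header=false; first() returns its first Row
val header = data.first()
// Keep every row that is not equal to the header row
val rows = data.filter(line => line != header)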
answered Feb 14 by Aryan
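
To make the filter idea above concrete, here is a minimal sketch of two ways to end up with a DataFrame that has no header row. It is not taken from the original answer: the object name, the two helper functions, the sample file, and the local master setting are all hypothetical and only there for illustration. The first helper relies on the CSV reader's header option, which tells Spark to treat the first line as column names rather than as data. The second keeps the approach from the question (read everything, then drop the first row) but uses filter, since first() gives back a Row.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

// Hypothetical names throughout; a sketch, not a definitive implementation.
object DropHeaderSketch {

  // Option 1: let the CSV reader consume the first line as column names,
  // so the header never shows up as a data row.
  def readWithoutHeader(spark: SparkSession, path: String, delimiter: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("delimiter", delimiter)
      .csv(path)

  // Option 2: read the header as an ordinary row, then filter it out.
  // first() returns a Row, so each row is compared against it.
  // Note: this also drops any data row identical to the header.
  def filterOutHeader(spark: SparkSession, path: String, delimiter: String): DataFrame = {
    val raw = spark.read
      .option("header", "false")
      .option("delimiter", delimiter)
      .csv(path)
    val header: Row = raw.first()
    raw.filter((row: Row) => row != header)
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimenting; on a cluster the master is set elsewhere.
    val spark = SparkSession.builder()
      .appName("drop-header-sketch")
      .master("local[*]")
      .getOrCreate()

    // Small made-up sample file with one header line and two data lines.
    val sample = Files.createTempFile("sample", ".csv")
    Files.write(sample, "id|name\n1|alice\n2|bob\n".getBytes("UTF-8"))

    readWithoutHeader(spark, sample.toString, "|").show()
    filterOutHeader(spark, sample.toString, "|").show()

    spark.stop()
  }
}
```

With the first option the columns are named after the header fields, while the second keeps Spark's default column names (_c0, _c1, ...). The same header reader option is available on spark.read in PySpark, so the first approach should carry over there with only syntax changes.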
