How to skip headers when reading a CSV file in S3 and creating a table in AWS Athena?

0 votes

I am trying to read a CSV file from an S3 bucket and create a table in AWS Athena. When the table is created, it does not skip the header row of my CSV file.

Example query:

    CREATE EXTERNAL TABLE IF NOT EXISTS table_name (
      `event_type_id` string,
      `customer_id` string,
      `date` string,
      `email` string
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES (
      "separatorChar" = "|",
      "quoteChar" = "\""
    )
    LOCATION 's3://location/'
    TBLPROPERTIES ("skip.header.line.count"="1");

This doesn't seem to work. Is there another way to skip the header row?
Sep 4, 2018 in AWS by datageek

2 answers to this question.

0 votes

This is a known deficiency. The best method I've seen was tweeted by Eric Hammond:

...WHERE date NOT LIKE '#%'

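This appears to skip header lines during a query. The filter drops every row whose date value starts with '#', so it only helps when the header line begins with that character. Rows where date is NULL are dropped as well, because the comparison evaluates to NULL rather than true.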
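
To make the workaround concrete, here is a minimal sketch of how the filter might be applied to the table from the question. The table and column names are taken from the question's DDL. The second variant is an assumption on my part, not something from the tweet: it presumes the header row simply repeats the column names, so the date column of that row holds the literal text 'date'. Run each statement separately.

```sql
-- Sketch only: table_name and its columns come from the question's DDL.
-- Variant 1: the filter from the answer above.
-- Drops rows whose "date" value starts with '#'.
SELECT event_type_id, customer_id, "date", email
FROM table_name
WHERE "date" NOT LIKE '#%';

-- Variant 2 (assumption: the header row repeats the column names verbatim,
-- so its "date" field contains the string 'date').
SELECT event_type_id, customer_id, "date", email
FROM table_name
WHERE "date" <> 'date';
```

Both variants also exclude rows where date is NULL, so add OR "date" IS NULL to the WHERE clause if those rows need to be kept. The column name is double-quoted in the queries to avoid any clash with the DATE keyword.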

answered Sep 4, 2018 by Archana
+2 votes
Thanks for the answer.

This should be clear and prominent in the AWS docs, but unfortunately it is not!
answered 4 days ago by athenauserz
