Memory leak error on AWS EMR when saving a DataFrame into a Hive external table on S3


Hello, I am writing a Spark job in Python and trying to write a DataFrame into a table. The table is a Hive external table stored on AWS S3.

Below is the command:

sqlContext.sql(selectQuery).write.mode("overwrite").format(trgFormat).option("compression", trgCompression).save(trgDataFileBase)

Below is the error:

ERROR ResourceLeakDetector: LEAK: ByteBuf.release() was not called before it's garbage-collected. Enable advanced leak reporting to find out where the leak occurred. To enable advanced leak reporting, specify the JVM option '-Dio.netty.leakDetection.level=advanced' or call ResourceLeakDetector.setLevel() See for more information.

spark-submit command:

spark-submit \
  --master yarn \
  --queue default \
  --deploy-mode client \
  --num-executors 10 \
  --executor-memory 12g \
  --executor-cores 2 \
  --conf spark.debug.maxToStringFields=100 \
  --conf spark.yarn.executor.memoryOverhead=2048

Aug 9, 2018 in AWS by bug_seeker
• 15,550 points

1 answer to this question.


You can start by creating a temp table, say trgDataFileBasetmp, and then create the table on S3 using the same definition. The definition needs all of its parameters, such as SERDEPROPERTIES and TBLPROPERTIES. The difference in my version is that it uses saveAsTable:

sqlContext.sql(selectQuery).write.mode("overwrite").format(trgFormat).option("compression", trgCompression).saveAsTable(trgDataFileBase)
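As a rough sketch of the saveAsTable variant, the helper below wraps the same writer chain and additionally sets the path option, which is one way to have Spark register the table at an external location. The function name save_as_external_table and its parameters are hypothetical and not part of the original post; it assumes df is the DataFrame returned by sqlContext.sql(selectQuery), and it may not match the exact table definition you need.

```python
def save_as_external_table(df, table_name, fmt, compression, location=None):
    """Write df with saveAsTable, optionally at an explicit location.

    Hypothetical helper: table_name, fmt, compression and location stand in
    for the trgDataFileBase / trgFormat / trgCompression values in the post.
    """
    writer = (
        df.write
        .mode("overwrite")                  # replace any existing data
        .format(fmt)                        # e.g. "parquet" or "orc"
        .option("compression", compression)
    )
    if location is not None:
        # With a custom path, Spark keeps the files when the table is dropped.
        writer = writer.option("path", location)
    writer.saveAsTable(table_name)
    return table_name
```

A call would then look like save_as_external_table(sqlContext.sql(selectQuery), "mydb.mytable", trgFormat, trgCompression, trgDataFileBase), where "mydb.mytable" is a placeholder table name.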

If this does not work then you can start with:


Hopefully one of the two helps.
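Separately, the error message itself suggests enabling advanced leak reporting to see where the ByteBuf was allocated. A minimal sketch of adding that JVM option to the spark-submit arguments from the question is shown here. The helper name with_leak_detection is hypothetical; it assumes the option is passed to the driver through --driver-java-options and to the executors through spark.executor.extraJavaOptions, and the application script and its arguments still have to be appended after these options.

```python
def with_leak_detection(submit_args, level="advanced"):
    """Return a copy of the spark-submit options with Netty leak reporting set.

    Hypothetical helper; level is passed to -Dio.netty.leakDetection.level.
    """
    jvm_flag = "-Dio.netty.leakDetection.level=" + level
    return list(submit_args) + [
        "--driver-java-options", jvm_flag,                        # driver JVM
        "--conf", "spark.executor.extraJavaOptions=" + jvm_flag,  # executor JVMs
    ]


# Options taken from the question; the application script is not shown there.
base_args = [
    "spark-submit",
    "--master", "yarn",
    "--queue", "default",
    "--deploy-mode", "client",
    "--num-executors", "10",
    "--executor-memory", "12g",
    "--executor-cores", "2",
    "--conf", "spark.debug.maxToStringFields=100",
    "--conf", "spark.yarn.executor.memoryOverhead=2048",
]

args = with_leak_detection(base_args)
print(" ".join(args))
```

Advanced reporting adds overhead, so it is probably best treated as a temporary diagnostic setting rather than a permanent one.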

answered Aug 9, 2018 by Priyaj
• 58,140 points
