Need to load 40 GB data to elasticsearch using spark

+1 vote
I am working in psedo distributed spark cluster on system with 2 cores, 4 logical processor and 30 GB RAM. Data is in 80 csv file where each one is 500 mb. With default configuration, simple spark job is taking 2 hrs. Please advise the things to consider for performance improvement.
Jul 16, 2019 in Apache Spark by Amit
• 130 points
233 views
Probably because your data is too large?

1 answer to this question.

–1 vote
Did you find any documents or example for this issue? I have the same situaiton and i try to find something for that. However i didnt find anything.
answered Nov 5, 2019 by Begum

Related Questions In Apache Spark

0 votes
1 answer

When running Spark on Yarn, do I need to install Spark on all nodes of Yarn Cluster?

No, it is not necessary to install ...READ MORE

answered Jun 14, 2018 in Apache Spark by nitinrawat895
• 10,840 points
1,516 views
0 votes
1 answer

How to authenticate Spark internal connections using a secret key?

You need to set the secret key ...READ MORE

answered Mar 13, 2019 in Apache Spark by Venu
167 views
0 votes
1 answer

How to get SQL configuration in Spark using Python?

You can get the configuration details through ...READ MORE

answered Mar 18, 2019 in Apache Spark by John
129 views
0 votes
1 answer

Using R to display configuration of Spark SQL

Try the below-mentioned code. sparkR.session() properties <- sql("SET -v") showDF(properties, ...READ MORE

answered Mar 18, 2019 in Apache Spark by John
30 views
+1 vote
1 answer
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,840 points
4,147 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,840 points
586 views
+1 vote
11 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest> copy syntax: copyFr ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Aditya
22,414 views
0 votes
1 answer

Need to disable unpersist in Spark

You can dynamically change this function by ...READ MORE

answered Mar 19, 2019 in Apache Spark by Jai
101 views
0 votes
1 answer

How to use ftp scheme using Yarn in Spark application?

In case Yarn does not support schemes ...READ MORE

answered Mar 28, 2019 in Apache Spark by Raj
187 views