Need to load 40 GB of data into Elasticsearch using Spark

+1 vote
I am working on a pseudo-distributed Spark cluster on a system with 2 cores, 4 logical processors, and 30 GB of RAM. The data is in 80 CSV files of 500 MB each. With the default configuration, a simple Spark job takes 2 hours. Please advise on what to consider to improve performance.
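The job itself is not shown here. As a rough illustration only, a load of this shape might look like the sketch below, assuming PySpark and the elasticsearch-hadoop (elasticsearch-spark) connector jar on the classpath. The function names, the CSV header assumption, and every tuning value are hypothetical placeholders, not settings taken from this setup. The connector keys shown (`es.batch.size.entries`, `es.batch.size.bytes`, `es.batch.write.refresh`) are the bulk-write settings usually looked at first.

```python
# Sketch only: assumes PySpark plus the elasticsearch-hadoop connector jar.
# Paths, index name, and tuning values are hypothetical placeholders.

def es_write_options(nodes="localhost", port="9200"):
    """Connector settings commonly tuned for bulk loads (values are illustrative)."""
    return {
        "es.nodes": nodes,
        "es.port": port,
        "es.batch.size.entries": "5000",    # docs per bulk request, per task
        "es.batch.size.bytes": "5mb",       # bytes per bulk request, per task
        "es.batch.write.refresh": "false",  # skip the index refresh after each bulk write
    }


def throughput_mb_per_s(files, mb_per_file, hours):
    """Average load rate implied by a file count, file size, and run time."""
    return files * mb_per_file / (hours * 3600)


def load_csvs(csv_glob, index):
    """Read CSV files and append them to an Elasticsearch index."""
    # Imported here so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-to-es").getOrCreate()
    # Assumption: the CSV files have a header row.
    df = spark.read.option("header", "true").csv(csv_glob)
    (df.write
       .format("org.elasticsearch.spark.sql")
       .options(**es_write_options())
       .mode("append")
       .save(index))
    spark.stop()


if __name__ == "__main__":
    # 80 files x 500 MB in 2 hours
    print(f"{throughput_mb_per_s(80, 500, 2):.1f} MB/s")
```

With the numbers above, the current run averages under 6 MB/s, so it may be worth checking whether the bottleneck is Spark reading the files or Elasticsearch indexing them before changing any setting.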
Jul 16 in Apache Spark by Amit
Probably because your data is too large?

1 answer to this question.

–1 vote
Did you find any documentation or examples for this issue? I have the same situation and have been trying to find something on it, but I haven't found anything.
answered Nov 5 by Begum
