What are the job optimization techniques in Spark and Scala?

Mar 17, 2019 in Apache Spark by satish kumar

1 answer to this question.

A Spark job can be optimized in several areas, for example:

  • Data Serialization
  • Memory Management
  • Memory Consumption
  • Data Structure Tuning
  • Garbage Collection
  • Parallelism
  • Data Locality

To learn more about these optimization techniques, see the Spark tuning guide: https://spark.apache.org/docs/latest/tuning.html
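
As a rough illustration of how a few of the techniques listed above look in Scala, here is a minimal sketch. It assumes a local run (`local[*]`), and the `Event` case class, the application name, and the value `8` for parallelism are hypothetical, chosen only for the example. The right settings depend on your data and cluster, so treat this as a starting point rather than a recommended configuration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical record type, used only for this illustration.
// Data structure tuning: primitive fields (Long) are cheaper than boxed objects.
case class Event(userId: Long, action: String)

object TuningSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("tuning-sketch")   // hypothetical name
      .setMaster("local[*]")         // assumption: local run for the example
      // Data serialization: use Kryo instead of Java serialization.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[Event]))
      // Parallelism: default number of partitions for RDD shuffles (example value).
      .set("spark.default.parallelism", "8")
      // Data locality: how long to wait for a data-local slot before falling back.
      .set("spark.locality.wait", "3s")

    val spark = SparkSession.builder().config(conf).getOrCreate()
    val sc = spark.sparkContext

    val events = sc.parallelize(Seq(
      Event(1L, "click"), Event(2L, "view"), Event(1L, "view")
    ))

    // Memory consumption / garbage collection: cache in serialized form,
    // which stores fewer objects on the heap at the cost of extra CPU.
    events.persist(StorageLevel.MEMORY_ONLY_SER)

    // Parallelism can also be set per operation via the numPartitions argument.
    val countsByUser = events
      .map(e => (e.userId, 1))
      .reduceByKey(_ + _, 8)

    countsByUser.collect().foreach(println)

    spark.stop()
  }
}
```

The same properties can be passed on the command line with `spark-submit --conf key=value` instead of being hard-coded, which makes it easier to try different values per job.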

answered Mar 18, 2019 by Veer
