what are the job optimization Technics in spark and scala ?

0 votes
Mar 17 in Apache Spark by satish kumar
• 180 points
259 views

1 answer to this question.

0 votes

There are different methods to achieve optimization in Spark, for example:

  • Data Serialization
  • Memory Management
  • Memory Consumption
  • Data Structure Tuning
  • Garbage Collection
  • Parallelism
  • Data Locality

To know more on the optimization techniques, visit the documentation: https://spark.apache.org/docs/latest/tuning.html

answered Mar 18 by Veer

Related Questions In Apache Spark

0 votes
1 answer

what are the spark job and spark task and spark staging ?

In a Spark application, when you invoke ...READ MORE

answered Mar 18 in Apache Spark by Pavan
401 views
0 votes
0 answers

what are the memory issues in spark ?

Mar 17 in Apache Spark by satish kumar
• 180 points
272 views
0 votes
1 answer

What is the difference between persist() and cache() in apache spark?

Hi, persist () allows the user to specify ...READ MORE

answered Jul 3 in Apache Spark by Gitika
• 25,340 points
247 views
0 votes
1 answer
0 votes
1 answer

What's the difference between 'filter' and 'where' in Spark SQL?

Both 'filter' and 'where' in Spark SQL ...READ MORE

answered May 23, 2018 in Apache Spark by nitinrawat895
• 10,670 points
5,892 views
0 votes
1 answer

What are the parameters in local[a,b,c] explains?

SparkContext.createTaskScheduler property parses the master parameter Local: 1 ...READ MORE

answered May 29, 2018 in Apache Spark by Shubham
• 13,290 points
80 views
0 votes
1 answer

where can i get spark-terasort.jar and not .scala file, to do spark terasort in windows.

Hi! I found 2 links on github where ...READ MORE

answered Feb 13 in Apache Spark by Omkar
• 67,480 points
102 views
0 votes
1 answer

what are the spark real time issues ?

Some of the issues I have faced ...READ MORE

answered Mar 18 in Apache Spark by Sharman
576 views
+1 vote
3 answers

What is the difference between rdd and dataframes in Apache Spark ?

Comparison between Spark RDD vs DataFrame 1. Release ...READ MORE

answered Aug 27, 2018 in Apache Spark by shams
• 3,580 points
13,914 views
0 votes
1 answer

What are the levels of parallelism in spark streaming ?

> In order to reduce the processing ...READ MORE

answered Jul 26, 2018 in Apache Spark by zombie
• 3,690 points
413 views