What are the job optimization techniques in Spark and Scala?

Mar 17, 2019 in Apache Spark by satish kumar

1 answer to this question.

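Spark jobs can be optimized in several ways, for example:

  • Data Serialization: use Kryo instead of the default Java serialization to get smaller, faster-to-process serialized objects.
  • Memory Management: understand how memory is shared between execution (shuffles, joins, sorts, aggregations) and storage (cached data).
  • Memory Consumption: measure how much memory a dataset needs, for instance by caching it and checking the Storage page of the Spark web UI.
  • Data Structure Tuning: prefer arrays and primitive types over pointer-heavy collections and wrapper objects.
  • Garbage Collection: reduce the number of objects the JVM has to track, for example by caching data in serialized form.
  • Parallelism: set the number of partitions high enough to keep every core in the cluster busy.
  • Data Locality: run tasks as close as possible to the data they read.

To learn more about these optimization techniques, see the Spark tuning guide: https://spark.apache.org/docs/latest/tuning.html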
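
Several of the techniques listed above (serialization, parallelism, data locality, serialized caching) come down to configuration settings and a storage level. The following is only a minimal sketch of how they might look in Scala. The object name `TuningSketch`, the application name, the local master, and all of the values are assumptions chosen for illustration, not recommendations; suitable values depend on the cluster and the workload.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object TuningSketch {
  // Build a SparkConf with a few tuning-related settings.
  // Every value here is an illustrative assumption, not a recommendation.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("tuning-sketch") // hypothetical application name
      .setMaster("local[*]")       // local mode, for experimentation only
      // Data serialization: switch from Java serialization to Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Parallelism: default partition count for RDD operations
      // such as parallelize and reduceByKey.
      .set("spark.default.parallelism", "8")
      // Data locality: how long to wait for a data-local slot
      // before falling back to a less local one.
      .set("spark.locality.wait", "3s")

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(buildConf())
    try {
      val numbers = sc.parallelize(1L to 1000000L)
      // Garbage collection / memory: cache in serialized form so the
      // JVM tracks fewer objects, at the cost of extra CPU on access.
      numbers.persist(StorageLevel.MEMORY_ONLY_SER)
      val total = numbers.reduce(_ + _)
      println(s"partitions = ${numbers.getNumPartitions}, total = $total")
    } finally {
      sc.stop()
    }
  }
}
```

The same properties can also be passed with `--conf` on `spark-submit` instead of being hard-coded, which keeps tuning changes out of the application code.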

answered Mar 18, 2019 by Veer
