Most voted questions in Apache Spark

0 votes
1 answer

How to disable executor from fetching file from cache?

When a Spark application is running, the ...

Mar 10, 2019 in Apache Spark by Siri
2,097 views
0 votes
1 answer

How to make driver update metrics quickly to executor?

There's a heartbeat signal sent to the ...

Mar 10, 2019 in Apache Spark by Siri
524 views
0 votes
1 answer

How to disable broadcast checksum?

Run the following in the Spark shell: val ...

Mar 9, 2019 in Apache Spark by Siri
629 views
0 votes
1 answer

cache() vs persist() in Spark

cache() uses only the default storage level ...

Mar 8, 2019 in Apache Spark by Raj
10,728 views
0 votes
1 answer

Array of RDD

You can create an array of RDDs ...

Mar 8, 2019 in Apache Spark by Raj
1,481 views
0 votes
1 answer

What is Spark Core?

It is not like a CPU to ...

Mar 8, 2019 in Apache Spark by Raj
3,268 views
0 votes
1 answer

Components of Spark

Spark core: The base engine that offers ...

Mar 8, 2019 in Apache Spark by Raj
460 views
0 votes
1 answer

How to increase Garbage Collection speed?

The time interval between Garbage Collection is ...

Mar 8, 2019 in Apache Spark by Pavitra
1,358 views
0 votes
1 answer

How to increase Spark memory for execution?

Probably the spill is because you have ...

Mar 7, 2019 in Apache Spark by Pavitra

edited Mar 8, 2019
952 views
0 votes
1 answer

How to compress serialized RDD partition?

Yes, you can do this by enabling ...

Mar 7, 2019 in Apache Spark by Pavitra
1,393 views
0 votes
1 answer

Getting "buffer limit exceeded" exception inside Kryo.

Seems like the object being sent for ...

Mar 7, 2019 in Apache Spark by Pavitra
1,363 views
0 votes
1 answer

How to change default Spark dashboard port?

You can change it dynamically while using ...

Mar 6, 2019 in Apache Spark by Rohit
923 views
0 votes
1 answer

What port does the Spark dashboard run on?

Spark dashboard by default runs on port ...

Mar 6, 2019 in Apache Spark by Rohit
539 views
0 votes
1 answer

How to delay live entity updates on Spark?

You can do this by increasing the ...

Mar 6, 2019 in Apache Spark by Rohit
460 views
0 votes
1 answer

Prevent jobs from being killed from the Web UI

You need to be careful with this. ...

Mar 6, 2019 in Apache Spark by Rohit
428 views
0 votes
1 answer

Disable Web UI for Spark Application

You can disable it like this: val sc ...

Mar 6, 2019 in Apache Spark by Rohit
3,033 views
0 votes
1 answer

Spark logs not overwriting

Spark does not allow you to overwrite ...

Mar 6, 2019 in Apache Spark by Rohit
819 views
0 votes
1 answer

How to enable Spark event logging?

To make Spark store the event logs, ...

Mar 6, 2019 in Apache Spark by Rohit
2,623 views
0 votes
1 answer

How to change the location of Spark event logs?

You can change the location where you ...

Mar 6, 2019 in Apache Spark by Rohit
4,050 views
0 votes
1 answer

Spark event log location

Unless you have changed ...

Mar 6, 2019 in Apache Spark by Rohit
455 views
0 votes
1 answer

Log every block update in Spark

By default, Spark does not log all ...

Mar 6, 2019 in Apache Spark by Rohit
754 views
0 votes
1 answer

How to increase the amount of data to be transferred to shuffle service at the same time?

The amount of data to be transferred ...

Mar 1, 2019 in Apache Spark by Omkar
• 69,210 points
716 views
0 votes
1 answer

Spark shuffle service port number

The default port that shuffle service runs ...

Mar 1, 2019 in Apache Spark by Omkar
• 69,210 points
634 views
0 votes
1 answer

Spark workers are not accepting any job (Kubernetes-Docker-Spark)

When Kubernetes picks the 10.*.*.*/16 network as its ...

Mar 1, 2019 in Apache Spark by Hamza
• 200 points
1,814 views
0 votes
1 answer

Spark2-submit does not generate output file.

To generate the output file, you can ...

Feb 24, 2019 in Apache Spark by Esha
4,522 views
0 votes
1 answer

Companion objects in Scala

When a singleton object is named the ...

Feb 24, 2019 in Apache Spark by Uma
625 views
0 votes
1 answer

Spark SQL in Databricks

In Spark SQL, we can use CASE WHEN ...

Feb 24, 2019 in Apache Spark by Rishi
2,093 views
0 votes
1 answer

Not able to preserve shuffle files in Spark

You lose the files because by default, ...

Feb 24, 2019 in Apache Spark by Rana
1,243 views
0 votes
1 answer

Why is Spark map output compressed?

Spark thinks that it is a good ...

Feb 24, 2019 in Apache Spark by Wasim
875 views
0 votes
1 answer

How do Spark extra listeners work?

Yes. You can use extra listeners by setting ...

Feb 24, 2019 in Apache Spark by Rishi
2,631 views
0 votes
1 answer

Increase number of cores in Spark

Now that the job is already running, ...

Feb 23, 2019 in Apache Spark by Reshma
1,805 views
0 votes
1 answer

Loading Spark properties dynamically

First, create an empty conf using this ...

Feb 22, 2019 in Apache Spark by Mansoor
1,264 views
0 votes
0 answers

Why doesn't my Spark YARN client run on all available worker machines?

I am running an application on Spark ...

Feb 22, 2019 in Apache Spark by Uzair Ahmad

edited Feb 22, 2019 by Omkar
7,711 views
0 votes
1 answer

Apache Spark, usage of yield.

Yield is used in sequence comprehensions. It is ...

Feb 22, 2019 in Apache Spark by Saruj
2,738 views
0 votes
1 answer

Installing Spark on Ubuntu

Hey. Follow these steps to install Spark ...

Feb 20, 2019 in Apache Spark by Omkar
• 69,210 points
1,591 views
0 votes
1 answer

Passing condition dynamically to Spark application.

You can try this: d.filter(col("value").isin(desiredThings: _*)) and if you ...

Feb 19, 2019 in Apache Spark by Omkar
• 69,210 points
8,401 views
0 votes
1 answer

How to select all columns with group by?

You can use the following to print ...

Feb 19, 2019 in Apache Spark by Omkar
• 69,210 points
13,436 views
0 votes
1 answer

Parquet to ORC format in Spark

I appreciate that you want to try ...

Feb 15, 2019 in Apache Spark by Anjali
2,099 views
0 votes
1 answer

How can I remove headers from dataframe?

You can use filter to do this. ...

Feb 15, 2019 in Apache Spark by Aryan
19,624 views
0 votes
1 answer

Spark context (sc) not found

Maybe the Hadoop service didn't start properly. Try ...

Feb 14, 2019 in Apache Spark by John
1,626 views
0 votes
1 answer

Where can I get spark-terasort.jar (not the .scala file) to run Spark TeraSort on Windows?

Hi! I found 2 links on GitHub where ...

Feb 13, 2019 in Apache Spark by Omkar
• 69,210 points
1,148 views
0 votes
1 answer

Multidimensional Array in Scala

A multidimensional array is an array which stores ...

Feb 11, 2019 in Apache Spark by Omkar
• 69,210 points
1,657 views
0 votes
1 answer

Error using double map.

You have forgotten to mention the case ...

Feb 11, 2019 in Apache Spark by Omkar
• 69,210 points
436 views
0 votes
1 answer

Query regarding Spark split logic

First, import the data in Spark and ...

Feb 9, 2019 in Apache Spark by Omkar
• 69,210 points
384 views
0 votes
1 answer

Error reading Avro dataset in Spark

For Avro, you need to download and ...

Feb 4, 2019 in Apache Spark by Omkar
• 69,210 points
1,925 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...

Feb 4, 2019 in Apache Spark by Omkar
• 69,210 points
561 views
0 votes
1 answer

Invalid syntax in Spark

There's a problem with your syntax. There ...

Jan 31, 2019 in Apache Spark by Omkar
• 69,210 points
1,839 views
0 votes
1 answer

Sliding function in Spark

The sliding function is used when you ...

Jan 29, 2019 in Apache Spark by Omkar
• 69,210 points
2,470 views
0 votes
1 answer

Spark and Scala auxiliary constructor doubt

println("Slayer") is an anonymous block and gets ...

Jan 8, 2019 in Apache Spark by Omkar
• 69,210 points
531 views
0 votes
1 answer

Is there an API for implementing graphs in Spark?

GraphX is the Spark API for graphs and ...

Jan 5, 2019 in Apache Spark by Frankie
• 9,830 points
500 views