Most voted questions in Apache Spark

0 votes
1 answer

How to disable executor from fetching file from cache?

When a Spark application is running, the ...

Mar 10, 2019 in Apache Spark by Siri
2,090 views
0 votes
1 answer

How to make driver update metrics quickly to executor?

There's a heartbeat signal sent to the ...

Mar 10, 2019 in Apache Spark by Siri
522 views
0 votes
1 answer

How to disable broadcast checksum?

Run the following in the Spark shell: val ...

Mar 9, 2019 in Apache Spark by Siri
626 views
0 votes
1 answer

cache() vs persist() in Spark

cache() uses only the default storage level ...

Mar 8, 2019 in Apache Spark by Raj
10,726 views
0 votes
1 answer

Array of RDD

You can create an array of RDDs ...

Mar 8, 2019 in Apache Spark by Raj
1,475 views
0 votes
1 answer

What is Spark Core?

It is not like a CPU to ...

Mar 8, 2019 in Apache Spark by Raj
3,267 views
0 votes
1 answer

Components of Spark

Spark core: The base engine that offers ...

Mar 8, 2019 in Apache Spark by Raj
457 views
0 votes
1 answer

How to increase Garbage Collection speed?

The time interval between Garbage Collection is ...

Mar 8, 2019 in Apache Spark by Pavitra
1,355 views
0 votes
1 answer

How to increase Spark memory for execution?

Probably the spill is because you have ...

Mar 7, 2019 in Apache Spark by Pavitra

edited Mar 8, 2019
947 views
0 votes
1 answer

How to compress serialized RDD partition?

Yes, you can do this by enabling ...

Mar 7, 2019 in Apache Spark by Pavitra
1,389 views
0 votes
1 answer

Getting "buffer limit exceeded" exception inside Kryo.

Seems like the object being sent for ...

Mar 7, 2019 in Apache Spark by Pavitra
1,359 views
0 votes
1 answer

How to change default Spark dashboard port?

You can change it dynamically while using ...

Mar 6, 2019 in Apache Spark by Rohit
917 views
0 votes
1 answer

What port does the Spark dashboard run on?

Spark dashboard by default runs on port ...

Mar 6, 2019 in Apache Spark by Rohit
535 views
0 votes
1 answer

How to delay live entity updates on Spark?

You can do this by increasing the ...

Mar 6, 2019 in Apache Spark by Rohit
458 views
0 votes
1 answer

Prevent jobs from being killed from the Web UI

You need to be careful with this. ...

Mar 6, 2019 in Apache Spark by Rohit
426 views
0 votes
1 answer

Disable Web UI for Spark Application

You can disable it like this: val sc ...

Mar 6, 2019 in Apache Spark by Rohit
3,029 views
0 votes
1 answer

Spark logs not overwriting

Spark does not allow you to overwrite ...

Mar 6, 2019 in Apache Spark by Rohit
811 views
0 votes
1 answer

How to enable Spark event logging?

To make Spark store the event logs, ...

Mar 6, 2019 in Apache Spark by Rohit
2,622 views
0 votes
1 answer

How to change the location of Spark event logs?

You can change the location where you ...

Mar 6, 2019 in Apache Spark by Rohit
4,042 views
0 votes
1 answer

Spark event log location

Unless you have changed ...

Mar 6, 2019 in Apache Spark by Rohit
454 views
0 votes
1 answer

Log every block update in Spark

By default, Spark does not log all ...

Mar 6, 2019 in Apache Spark by Rohit
752 views
0 votes
1 answer

How to increase the amount of data to be transferred to shuffle service at the same time?

The amount of data to be transferred ...

Mar 1, 2019 in Apache Spark by Omkar
• 69,210 points
715 views
0 votes
1 answer

Spark shuffle service port number

The default port that shuffle service runs ...

Mar 1, 2019 in Apache Spark by Omkar
• 69,210 points
632 views
0 votes
1 answer

Spark workers are not accepting any job (Kubernetes-Docker-Spark)

When Kubernetes picks 10.*.*.*/16 network as its ...

Mar 1, 2019 in Apache Spark by Hamza
• 200 points
1,809 views
0 votes
1 answer

Spark2-submit does not generate output file.

To generate the output file, you can ...

Feb 24, 2019 in Apache Spark by Esha
4,516 views
0 votes
1 answer

Companion objects in Scala

When a singleton object is named the ...

Feb 24, 2019 in Apache Spark by Uma
623 views
0 votes
1 answer

Spark SQL in Databricks

In Spark SQL, we can use CASE WHEN ...

Feb 24, 2019 in Apache Spark by Rishi
2,092 views
0 votes
1 answer

Not able to preserve shuffle files in Spark

You lose the files because by default, ...

Feb 24, 2019 in Apache Spark by Rana
1,242 views
0 votes
1 answer

Why is Spark map output compressed?

Spark thinks that it is a good ...

Feb 24, 2019 in Apache Spark by Wasim
872 views
0 votes
1 answer

How do Spark extra listeners work?

Yes. You can use extra listeners by setting ...

Feb 24, 2019 in Apache Spark by Rishi
2,621 views
0 votes
1 answer

Increase number of cores in Spark

Now that the job is already running, ...

Feb 23, 2019 in Apache Spark by Reshma
1,798 views
0 votes
1 answer

Loading Spark properties dynamically

First, create an empty conf using this ...

Feb 22, 2019 in Apache Spark by Mansoor
1,261 views
0 votes
0 answers

Why doesn't my Spark YARN client run on all available worker machines?

I am running an application on Spark ...

Feb 22, 2019 in Apache Spark by Uzair Ahmad

edited Feb 22, 2019 by Omkar
7,701 views
0 votes
1 answer

Apache Spark, usage of yield.

Yield is used in sequence comprehensions. It is ...

Feb 22, 2019 in Apache Spark by Saruj
2,732 views
0 votes
1 answer

Installing Spark on Ubuntu

Hey. Follow these steps to install Spark ...

Feb 20, 2019 in Apache Spark by Omkar
• 69,210 points
1,589 views
0 votes
1 answer

Passing condition dynamically to Spark application.

You can try this: d.filter(col("value").isin(desiredThings: _*)) and if you ...

Feb 19, 2019 in Apache Spark by Omkar
• 69,210 points
8,392 views
0 votes
1 answer

How to select all columns with group by?

You can use the following to print ...

Feb 19, 2019 in Apache Spark by Omkar
• 69,210 points
13,410 views
0 votes
1 answer

Parquet to ORC format in Spark

I appreciate that you want to try ...

Feb 15, 2019 in Apache Spark by Anjali
2,096 views
0 votes
1 answer

How can I remove headers from dataframe?

You can use filter to do this. ...

Feb 15, 2019 in Apache Spark by Aryan
19,607 views
0 votes
1 answer

Spark context (sc) not found

Maybe the Hadoop service didn't start properly. Try ...

Feb 14, 2019 in Apache Spark by John
1,615 views
0 votes
1 answer

Where can I get spark-terasort.jar, and not the .scala file, to run Spark TeraSort on Windows?

Hi! I found 2 links on GitHub where ...

Feb 13, 2019 in Apache Spark by Omkar
• 69,210 points
1,145 views
0 votes
1 answer

Multidimensional Array in Scala

A multidimensional array is an array that stores ...

Feb 11, 2019 in Apache Spark by Omkar
• 69,210 points
1,654 views
0 votes
1 answer

Error using double map.

You have forgotten to mention the case ...

Feb 11, 2019 in Apache Spark by Omkar
• 69,210 points
433 views
0 votes
1 answer

Query regarding Spark split logic

First, import the data in Spark and ...

Feb 9, 2019 in Apache Spark by Omkar
• 69,210 points
382 views
0 votes
1 answer

Error reading Avro dataset in Spark

For Avro, you need to download and ...

Feb 4, 2019 in Apache Spark by Omkar
• 69,210 points
1,923 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...

Feb 4, 2019 in Apache Spark by Omkar
• 69,210 points
557 views
0 votes
1 answer

Invalid syntax in Spark

There's a problem with your syntax. There ...

Jan 31, 2019 in Apache Spark by Omkar
• 69,210 points
1,834 views
0 votes
1 answer

Sliding function in Spark

The sliding function is used when you ...

Jan 29, 2019 in Apache Spark by Omkar
• 69,210 points
2,459 views
0 votes
1 answer

Spark and Scala auxiliary constructor doubt

println("Slayer") is an anonymous block and gets ...

Jan 8, 2019 in Apache Spark by Omkar
• 69,210 points
529 views
0 votes
1 answer

Is there an API for implementing graphs in Spark?

GraphX is the Spark API for graphs and ...

Jan 5, 2019 in Apache Spark by Frankie
• 9,830 points
499 views