Most voted questions in Apache Spark

0 votes
2 answers

map() vs flatMap() in Spark

Spark map function expresses a one-to-one transformation. ...READ MORE

Jun 17 in Apache Spark by vishal
• 160 points
3,581 views
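
To make the one-to-one versus one-to-many distinction in the question above concrete, here is a minimal sketch. It assumes plain Scala collections instead of RDDs so that it runs without a cluster; `RDD.map` and `RDD.flatMap` follow the same shape. The object name and sample input are hypothetical.

```scala
object MapVsFlatMap {
  // Hypothetical sample input: two lines of text.
  val lines: Seq[String] = Seq("hello spark", "hi")

  // map: exactly one output element per input element.
  val lengths: Seq[Int] = lines.map(_.length)

  // map with a function that returns a collection keeps the nesting,
  // so the result still has one (nested) element per input line.
  val nested: Seq[Seq[String]] = lines.map(_.split(" ").toSeq)

  // flatMap: each input may produce zero or more outputs,
  // and the results are flattened into a single sequence.
  val words: Seq[String] = lines.flatMap(_.split(" ").toSeq)

  def main(args: Array[String]): Unit = {
    println(lengths)
    println(nested)
    println(words)
  }
}
```

With an RDD, the same two calls would typically be written as `rdd.map(_.length)` and `rdd.flatMap(_.split(" "))`.
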
0 votes
1 answer

cache() vs persist() in Spark

cache() uses only the default storage level ...READ MORE

Mar 8 in Apache Spark by Raj
733 views
0 votes
1 answer

Array of RDD

You can create an array of RDDs ...READ MORE

Mar 8 in Apache Spark by Raj
45 views
0 votes
1 answer

What is Spark Core?

It is not like a CPU to ...READ MORE

Mar 8 in Apache Spark by Raj
85 views
0 votes
1 answer

Components of Spark

Spark Core: The base engine that offers ...READ MORE

Mar 8 in Apache Spark by Raj
38 views
0 votes
1 answer

How to increase Garbage Collection speed?

The time interval between Garbage Collection is ...READ MORE

Mar 7 in Apache Spark by Pavitra
24 views
0 votes
1 answer

How to increase Spark memory for execution?

Probably the spill is because you have ...READ MORE

Mar 7 in Apache Spark by Pavitra

edited Mar 7
37 views
0 votes
1 answer

How to compress serialized RDD partition?

Yes, you can do this by enabling ...READ MORE

Mar 7 in Apache Spark by Pavitra
124 views
0 votes
1 answer

Getting "buffer limit exceeded" exception inside Kryo.

Seems like the object being sent for ...READ MORE

Mar 7 in Apache Spark by Pavitra
81 views
0 votes
1 answer

How to change default Spark dashboard port?

You can change it dynamically while using ...READ MORE

Mar 6 in Apache Spark by Rohit
32 views
0 votes
1 answer

What port does the Spark dashboard run on?

Spark dashboard by default runs on port ...READ MORE

Mar 6 in Apache Spark by Rohit
84 views
0 votes
1 answer

How to delay live entity updates in Spark?

You can do this by increasing the ...READ MORE

Mar 6 in Apache Spark by Rohit
42 views
0 votes
1 answer

Prevent jobs from being killed from the Web UI

You need to be careful with this. ...READ MORE

Mar 6 in Apache Spark by Rohit
17 views
0 votes
1 answer

Disable Web UI for Spark Application

You can disable it like this: val sc ...READ MORE

Mar 6 in Apache Spark by Rohit
518 views
0 votes
1 answer

Spark logs not overwriting

Spark does not allow you to overwrite ...READ MORE

Mar 6 in Apache Spark by Rohit
21 views
0 votes
1 answer

How to enable Spark event logging?

To make Spark store the event logs, ...READ MORE

Mar 6 in Apache Spark by Rohit
221 views
0 votes
1 answer

How to change the location of Spark event logs?

You can change the location where you ...READ MORE

Mar 6 in Apache Spark by Rohit
266 views
0 votes
1 answer

Spark event log location

Unless you have changed ...READ MORE

Mar 6 in Apache Spark by Rohit
21 views
0 votes
1 answer

Log every block update in Spark

By default, Spark does not log all ...READ MORE

Mar 6 in Apache Spark by Rohit
23 views
0 votes
1 answer

How to increase the amount of data to be transferred to shuffle service at the same time?

The amount of data to be transferred ...READ MORE

Mar 1 in Apache Spark by Omkar
• 67,660 points
61 views
0 votes
1 answer

Spark shuffle service port number

The default port that shuffle service runs ...READ MORE

Mar 1 in Apache Spark by Omkar
• 67,660 points
31 views
0 votes
1 answer

Spark workers are not accepting any job (Kubernetes-Docker-Spark)

When Kubernetes picks the 10.*.*.*/16 network as its ...READ MORE

Mar 1 in Apache Spark by Hamza
• 180 points
214 views
0 votes
1 answer

Spark2-submit does not generate output file.

To generate the output file, you can ...READ MORE

Feb 23 in Apache Spark by Esha
751 views
0 votes
1 answer

Companion objects in Scala

When a singleton object is named the ...READ MORE

Feb 23 in Apache Spark by Uma
53 views
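
As an illustration of the companion-object idea in the question above, here is a small sketch. The `Account` class, its fields, and the factory method are hypothetical names chosen for the example; the point is that an object sharing a class's name (and source file) can reach the class's private members.

```scala
// Hypothetical class with a private constructor.
class Account private (val id: Int, val owner: String)

// Companion object: same name as the class, defined in the same file.
object Account {
  private var nextId = 0

  // Factory method. The companion is allowed to call the private constructor.
  def apply(owner: String): Account = {
    nextId += 1
    new Account(nextId, owner)
  }
}

object CompanionDemo {
  def main(args: Array[String]): Unit = {
    val a = Account("Ann") // sugar for Account.apply("Ann")
    val b = Account("Bob")
    println(s"${a.id} ${a.owner}, ${b.id} ${b.owner}")
  }
}
```
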
0 votes
1 answer

Spark SQL in Databricks

In Spark SQL, we can use CASE WHEN ...READ MORE

Feb 23 in Apache Spark by Rishi
248 views
0 votes
1 answer

Not able to preserve shuffle files in Spark

You lose the files because by default, ...READ MORE

Feb 23 in Apache Spark by Rana
53 views
0 votes
1 answer

Why is Spark map output compressed?

Spark thinks that it is a good ...READ MORE

Feb 23 in Apache Spark by Wasim
70 views
0 votes
1 answer

How do Spark extra listeners work?

Yes. You can use extra listeners by setting ...READ MORE

Feb 23 in Apache Spark by Rishi
334 views
0 votes
1 answer

Increase number of cores in Spark

Now that the job is already running, ...READ MORE

Feb 22 in Apache Spark by Reshma
177 views
0 votes
1 answer

Loading Spark properties dynamically

First, create an empty conf using this ...READ MORE

Feb 22 in Apache Spark by Mansoor
40 views
0 votes
0 answers

Why doesn't my Spark YARN client run on all available worker machines?

I am running an application on Spark ...READ MORE

Feb 22 in Apache Spark by Uzair Ahmad

edited Feb 22 by Omkar
783 views
0 votes
1 answer

Apache Spark, usage of yield.

Yield is used in sequence comprehensions. It is ...READ MORE

Feb 21 in Apache Spark by Saruj
239 views
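
The role of `yield` in a sequence comprehension, as raised in the question above, can be sketched as follows. The object name and sample numbers are hypothetical; only the Scala standard library is used.

```scala
object YieldDemo {
  // Hypothetical sample data.
  val numbers: Seq[Int] = Seq(1, 2, 3, 4, 5)

  // for ... yield collects the value produced by each iteration
  // into a new collection of the same kind as the source.
  val evenSquares: Seq[Int] =
    for (n <- numbers if n % 2 == 0) yield n * n

  // The compiler rewrites the comprehension roughly into this form.
  val desugared: Seq[Int] =
    numbers.withFilter(_ % 2 == 0).map(n => n * n)

  def main(args: Array[String]): Unit = {
    println(evenSquares)
    println(desugared)
  }
}
```

Without `yield`, the same `for` loop runs only for its side effects and returns `Unit`.
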
0 votes
1 answer

Installing Spark on Ubuntu

Hey. Follow these steps to install Spark ...READ MORE

Feb 20 in Apache Spark by Omkar
• 67,660 points
272 views
0 votes
1 answer

Passing condition dynamically to Spark application.

You can try this: d.filter(col("value").isin(desiredThings: _*)) and if you ...READ MORE

Feb 19 in Apache Spark by Omkar
• 67,660 points
250 views
0 votes
1 answer

How to select all columns with group by?

You can use the following to print ...READ MORE

Feb 18 in Apache Spark by Omkar
• 67,660 points
66 views
0 votes
1 answer

Parquet to ORC format in Spark

I appreciate that you want to try ...READ MORE

Feb 14 in Apache Spark by Anjali
307 views
0 votes
1 answer

How can I remove headers from dataframe?

You can use filter to do this. ...READ MORE

Feb 14 in Apache Spark by Aryan
1,754 views
0 votes
1 answer

Spark context (sc) not found

Maybe the Hadoop service didn't start properly. Try ...READ MORE

Feb 13 in Apache Spark by John
71 views
0 votes
1 answer

Where can I get spark-terasort.jar, not the .scala file, to run Spark TeraSort on Windows?

Hi! I found 2 links on GitHub where ...READ MORE

Feb 13 in Apache Spark by Omkar
• 67,660 points
139 views
0 votes
1 answer

Multidimensional Array in Scala

A multidimensional array is an array that stores ...READ MORE

Feb 11 in Apache Spark by Omkar
• 67,660 points
187 views
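
A short sketch of a two-dimensional array in Scala, to accompany the question above. The object name, dimensions, and fill values are hypothetical; `Array.ofDim` is the standard-library way to allocate the nested array.

```scala
object MatrixDemo {
  // Allocate a 2 x 3 array of Int; every cell starts at 0.
  val grid: Array[Array[Int]] = Array.ofDim[Int](2, 3)

  // Fill each cell with a value derived from its row and column.
  for (r <- 0 until 2; c <- 0 until 3) {
    grid(r)(c) = r * 3 + c
  }

  // A nested literal is an alternative when the values are known up front.
  val literal: Array[Array[Int]] = Array(Array(1, 2), Array(3, 4))

  def main(args: Array[String]): Unit = {
    // Print one row per line, cells separated by spaces.
    grid.foreach(row => println(row.mkString(" ")))
  }
}
```
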
0 votes
1 answer

Error using double map.

You have forgotten to mention the case ...READ MORE

Feb 11 in Apache Spark by Omkar
• 67,660 points
32 views
0 votes
1 answer

Query regarding Spark split logic

First, import the data in Spark and ...READ MORE

Feb 9 in Apache Spark by Omkar
• 67,660 points
34 views
0 votes
1 answer

Error reading Avro dataset in Spark

For Avro, you need to download and ...READ MORE

Feb 4 in Apache Spark by Omkar
• 67,660 points
350 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...READ MORE

Feb 4 in Apache Spark by Omkar
• 67,660 points
35 views
0 votes
1 answer

Invalid syntax in Spark

There's a problem with your syntax. There ...READ MORE

Jan 31 in Apache Spark by Omkar
• 67,660 points
53 views
0 votes
1 answer

Sliding function in Spark

The sliding function is used when you ...READ MORE

Jan 29 in Apache Spark by Omkar
• 67,660 points
252 views
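
For the sliding question above, here is a minimal sketch of windowing with `sliding` from the Scala collections library. Whether the truncated answer refers to this method or to an RDD equivalent is not visible in the excerpt, so treat the choice of plain collections as an assumption. The object name and sample data are hypothetical.

```scala
object SlidingDemo {
  // Hypothetical sample data.
  val readings: List[Int] = List(1, 2, 3, 4, 5)

  // Windows of size 3, advancing one element at a time.
  val windows: List[List[Int]] = readings.sliding(3).toList

  // Windows of size 2, advancing two elements at a time.
  // The final window is shorter when the data runs out.
  val stepped: List[List[Int]] = readings.sliding(2, 2).toList

  // A common use: a moving sum over each window.
  val movingSums: List[Int] = readings.sliding(3).map(_.sum).toList

  def main(args: Array[String]): Unit = {
    println(windows)
    println(stepped)
    println(movingSums)
  }
}
```
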
0 votes
1 answer

Spark and Scala auxiliary constructor doubt

println("Slayer") is an anonymous block and gets ...READ MORE

Jan 8 in Apache Spark by Omkar
• 67,660 points
45 views
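
The behaviour hinted at in the answer above can be sketched as follows: statements written directly in a class body belong to the primary constructor, so they run on every instantiation, including when an auxiliary constructor is used. The `Hero` class and its log are hypothetical; the original poster's code is not shown in the excerpt.

```scala
import scala.collection.mutable.ListBuffer

class Hero(val name: String) {
  // A statement in the class body is part of the primary constructor.
  Hero.log += s"body:$name"

  // Auxiliary constructor: its first action must be a call to another constructor.
  def this() = {
    this("Slayer")
    Hero.log += "aux"
  }
}

object Hero {
  // Records the order in which constructor code runs.
  val log: ListBuffer[String] = ListBuffer.empty[String]
}

object ConstructorDemo {
  def main(args: Array[String]): Unit = {
    new Hero()
    println(Hero.log.mkString(", "))
  }
}
```
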
0 votes
1 answer

Is there an API for implementing graphs in Spark?

GraphX is the Spark API for graphs and ...READ MORE

Jan 4 in Apache Spark by Frankie
• 9,810 points
30 views
0 votes
1 answer

What is Executor Memory in a Spark application?

Every Spark application has the same fixed heap ...READ MORE

Jan 4 in Apache Spark by Frankie
• 9,810 points
641 views
0 votes
1 answer

Unresolved dependency issue on sbt package command

Check if you are able to access ...READ MORE

Jan 3 in Apache Spark by Omkar
• 67,660 points
279 views