Trending questions in Apache Spark

0 votes
1 answer

How to restrict a group to view-only access in Spark?

You can do it dynamically by setting ...READ MORE

Mar 15, 2019 in Apache Spark by Raj
1,038 views
0 votes
1 answer

How to add modify access for Web UI user?

For a user to have modification access ...READ MORE

Mar 14, 2019 in Apache Spark by Raj
1,067 views
0 votes
1 answer

Enable encryption for local Input and Output

You can enable local I/O encryption like ...READ MORE

Mar 14, 2019 in Apache Spark by Raj
1,066 views
0 votes
1 answer

How to relaunch tasks that are running slowly?

The technical term for what you want ...READ MORE

Mar 12, 2019 in Apache Spark by Veer
1,140 views
0 votes
1 answer

How to increase the amount of data to be transferred to shuffle service at the same time?

The amount of data to be transferred ...READ MORE

Mar 1, 2019 in Apache Spark by Omkar
• 69,180 points
1,608 views
0 votes
1 answer

How to make driver update metrics quickly to executor?

There's a heartbeat signal sent to the ...READ MORE

Mar 10, 2019 in Apache Spark by Siri
1,218 views
0 votes
1 answer

How to disable broadcast checksum?

Run the following in the Spark shell: val ...READ MORE

Mar 9, 2019 in Apache Spark by Siri
1,223 views
0 votes
1 answer

Companion objects in Scala

When a singleton object is named the ...READ MORE

Feb 24, 2019 in Apache Spark by Uma
1,806 views
0 votes
1 answer

How to disable existing directory check?

To disable this, run the below commands: val ...READ MORE

Mar 10, 2019 in Apache Spark by Siri
1,181 views
0 votes
1 answer

What port does the Spark dashboard run on?

Spark dashboard by default runs on port ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
1,262 views
0 votes
1 answer

Spark shuffle service port number

The default port that shuffle service runs ...READ MORE

Mar 1, 2019 in Apache Spark by Omkar
• 69,180 points
1,467 views
0 votes
1 answer

Prevent jobs from being killed from the Web UI

You need to be careful with this. ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
1,243 views
0 votes
1 answer

Key factory algorithms used for encryption

The default key factory algorithm used is PBKDF2WithHmacSHA1. You ...READ MORE

Mar 13, 2019 in Apache Spark by Venu
924 views
0 votes
1 answer

Increasing retry before blacklisting a node

You can do it dynamically using the ...READ MORE

Mar 12, 2019 in Apache Spark by Raj
1,002 views
0 votes
1 answer

Components of Spark

Spark core: The base engine that offers ...READ MORE

Mar 8, 2019 in Apache Spark by Raj
1,142 views
0 votes
1 answer

How to increase wait time to launch data-local task?

You can increase the locality wait time ...READ MORE

Mar 11, 2019 in Apache Spark by Raj
996 views
0 votes
1 answer

Delay requesting new executor in dynamic allocation

You can set the duration like this: val ...READ MORE

Mar 13, 2019 in Apache Spark by Venu
912 views
0 votes
1 answer

How to set time for task speculation?

By default, the check for task speculation ...READ MORE

Mar 12, 2019 in Apache Spark by Veer
941 views
0 votes
1 answer

Invalid syntax in spark

There's a problem with your syntax. There ...READ MORE

Jan 31, 2019 in Apache Spark by Omkar
• 69,180 points
2,662 views
0 votes
1 answer

How to delay live entity updates in Spark?

You can do this by increasing the ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
1,166 views
0 votes
1 answer

Spark event log location

Unless you have changed ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
1,138 views
0 votes
1 answer

Changing port for Block Managers

By default, the port on which the ...READ MORE

Mar 10, 2019 in Apache Spark by Siri
982 views
0 votes
1 answer

Why is Spark map output compressed?

Spark thinks that it is a good ...READ MORE

Feb 24, 2019 in Apache Spark by Wasim
1,521 views
0 votes
1 answer

Unresolved dependency issue on sbt package command

Check if you are able to access ...READ MORE

Jan 3, 2019 in Apache Spark by Omkar
• 69,180 points
3,696 views
0 votes
2 answers

How to use RDD filter with other function?

val x = sc.parallelize(1 to 10, 2)   // ...READ MORE

Aug 17, 2018 in Apache Spark by zombie
• 3,790 points
10,563 views
0 votes
1 answer

Where can I get spark-terasort.jar, not the .scala file, to run Spark TeraSort on Windows?

Hi! I found 2 links on GitHub where ...READ MORE

Feb 13, 2019 in Apache Spark by Omkar
• 69,180 points
1,798 views
0 votes
1 answer

Changing Column position in spark dataframe

Yes, you can reorder the dataframe elements. You need ...READ MORE

Apr 19, 2018 in Apache Spark by Ashish
• 2,650 points
14,721 views
+1 vote
1 answer

Spark interview

Preparing for an interview? We have something ...READ MORE

Feb 7, 2019 in Apache Spark by Edureka
• 2,960 points
1,520 views
0 votes
1 answer

Error using double map.

You have forgotten to mention the case ...READ MORE

Feb 11, 2019 in Apache Spark by Omkar
• 69,180 points
1,281 views
0 votes
1 answer

Languages supported by Apache Spark?

Apache Spark supports the following four languages:  Scala, ...READ MORE

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
7,967 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...READ MORE

Feb 4, 2019 in Apache Spark by Omkar
• 69,180 points
1,243 views
0 votes
1 answer

Query regarding a spark split logic

First, import the data in Spark and ...READ MORE

Feb 9, 2019 in Apache Spark by Omkar
• 69,180 points
922 views
–1 vote
1 answer

Not able to use sc in spark shell

Seems like master and worker are not ...READ MORE

Jan 3, 2019 in Apache Spark by Omkar
• 69,180 points
2,392 views
0 votes
1 answer

How to get ID of a map task in Spark?

You can access task information using TaskContext: import org.apache.spark.TaskContext sc.parallelize(Seq[Int](), ...READ MORE

Nov 20, 2018 in Apache Spark by Frankie
• 9,830 points
4,050 views
–1 vote
1 answer

Deciding number of spark context objects

How many spark context objects you should ...READ MORE

Jan 16, 2019 in Apache Spark by Omkar
• 69,180 points
1,243 views
0 votes
1 answer

How to add third party java jars for use in PySpark?

You can add external jars as arguments ...READ MORE

Jul 4, 2018 in Apache Spark by nitinrawat895
• 11,380 points

edited Nov 19, 2021 by Sarfaraz
9,429 views
0 votes
1 answer

Is there an API for implementing graphs in Spark?

GraphX is the Spark API for graphs and ...READ MORE

Jan 5, 2019 in Apache Spark by Frankie
• 9,830 points
1,353 views
0 votes
1 answer

Spark and Scala auxiliary constructor doubt

println("Slayer") is an anonymous block and gets ...READ MORE

Jan 8, 2019 in Apache Spark by Omkar
• 69,180 points
1,174 views
0 votes
1 answer

How to open/stream .zip files through Spark?

You can try and check this below ...READ MORE

Nov 20, 2018 in Apache Spark by Frankie
• 9,830 points
3,049 views
0 votes
1 answer

Filter, Option or FlatMap in spark

If, for option 2, you mean have ...READ MORE

Nov 9, 2018 in Apache Spark by Frankie
• 9,830 points
3,365 views
+1 vote
2 answers

Apache Spark vs Apache Spark 2

Spark 2 doesn't differ much architecture-wise from ...READ MORE

Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,350 points
10,377 views
+1 vote
1 answer

How can I write a text file in HDFS not from an RDD, in Spark program?

Yes, you can go ahead and write ...READ MORE

May 29, 2018 in Apache Spark by Shubham
• 13,490 points
9,338 views
0 votes
1 answer

Is 'sparkline' a method?

I suggest you check two things: that jquery.sparkline.js is actually ...READ MORE

Nov 9, 2018 in Apache Spark by Frankie
• 9,830 points
1,962 views
0 votes
1 answer

How can I minimize data transfers when working with Spark?

Minimizing data transfers and avoiding shuffling helps ...READ MORE

Sep 19, 2018 in Apache Spark by zombie
• 3,790 points
3,764 views
0 votes
1 answer

How to find max value in pair RDD?

Use Array.maxBy method: val a = Array(("a",1), ("b",2), ...READ MORE

May 26, 2018 in Apache Spark by nitinrawat895
• 11,380 points
8,631 views
0 votes
1 answer

What are the levels of parallelism in Spark Streaming?

In order to reduce the processing ...READ MORE

Jul 27, 2018 in Apache Spark by zombie
• 3,790 points
5,787 views
0 votes
1 answer

Is there any way to check the Spark version?

There are 2 ways to check the ...READ MORE

Apr 19, 2018 in Apache Spark by nitinrawat895
• 11,380 points
9,874 views
0 votes
1 answer

When running Spark on YARN, do I need to install Spark on all nodes of the YARN cluster?

No, it is not necessary to install ...READ MORE

Jun 14, 2018 in Apache Spark by nitinrawat895
• 11,380 points
7,257 views
+1 vote
2 answers

Hadoop 3 compatibility with older versions of Hive, Pig, Sqoop and Spark

Hadoop 3 is not widely used in ...READ MORE

Apr 20, 2018 in Apache Spark by kurt_cobain
• 9,350 points
7,258 views
0 votes
1 answer

Difference between SparkContext, JavaSparkContext, SQLContext, and SparkSession?

Yes, there is a difference between the ...READ MORE

Jul 4, 2018 in Apache Spark by nitinrawat895
• 11,380 points
5,971 views