Most viewed questions in Apache Spark

0 votes
1 answer

What happens to RDD when one of the nodes goes down?

Whenever a node goes down, Spark knows ...

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,601 views
+1 vote
1 answer

Kafka Features

Here are some of the important features of ...

Jun 7, 2018 in Apache Spark by Data_Nerd
• 2,390 points
1,597 views
0 votes
1 answer

Installing Spark on Ubuntu

Hey. Follow these steps to install Spark ...

Feb 20, 2019 in Apache Spark by Omkar
• 69,210 points
1,589 views
0 votes
1 answer

How to create singleton classes in Scala?

Hey, Scala introduces a new object keyword, which is used ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,569 views
0 votes
1 answer

How to read Avro Partition Data?

Hi @akhtar, when we try to retrieve the data ...

Nov 4, 2020 in Apache Spark by MD
• 95,440 points
1,560 views
0 votes
1 answer

Why are partitions immutable in Spark?

Hi, Every transformation generates a new partition. Partitions ...

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
1,556 views
0 votes
1 answer

Which syntax should be used to take the sum of a list collection in Scala?

Hi, You can see this example to get ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,910 points
1,554 views
+1 vote
1 answer

Facing out-of-memory errors in Spark driver

I am guessing that the configuration set ...

Feb 23, 2019 in Apache Spark by Rishab
1,551 views
0 votes
1 answer

How to create an RDD from an external file source in Scala?

Hi, To create an RDD from an external file ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
1,544 views
0 votes
1 answer

Which gives better performance: a join in SQL or the Dataset API?

DataFrames and Spark SQL performed about the ...

Apr 19, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,542 views
0 votes
1 answer

env: R: No such file or directory

Hi @akhtar, I also got this error. I am able to ...

Jul 22, 2020 in Apache Spark by MD
• 95,440 points
1,536 views
0 votes
1 answer

Spark Kill Running Application

You can copy the application id from ...

Apr 25, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,517 views
0 votes
1 answer

Is there any way to uncache RDD?

An RDD can be uncached using unpersist(). So use ...

May 30, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,494 views
0 votes
1 answer

Array of RDD

You can create an array of RDDs ...

Mar 8, 2019 in Apache Spark by Raj
1,481 views
0 votes
1 answer

How to run Spark in standalone client mode?

Hi, These are the steps to run Spark in ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,910 points
1,479 views
0 votes
1 answer

Scala: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream, nor an OOXML stream

Try executing the below code, def readExcel(file: String): ...

Jul 30, 2019 in Apache Spark by Raman
1,462 views
+1 vote
1 answer

How to install Scala Build Tool (SBT) on ubuntu?

Hey, To install SBT on Ubuntu, first you need ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,910 points
1,450 views
0 votes
1 answer

How to set keys & access tokens for Twitter Spark streaming?

Either you have to create a Twitter4j.properties ...

May 24, 2018 in Apache Spark by Shubham
• 13,490 points
1,445 views
0 votes
1 answer

Spark Submit: class does not exist

In the command, you have mentioned the ...

Jul 26, 2019 in Apache Spark by Jimmy
1,443 views
0 votes
1 answer

Spark to Hive Table creation

There's an easier way to achieve your ...

Jul 23, 2019 in Apache Spark by Dinesh
1,435 views
+1 vote
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...

Aug 25, 2019 in Apache Spark by anonymous
• 130 points

closed Aug 26, 2019 by Omkar
1,424 views
–1 vote
1 answer

Not able to use sc in spark shell

Seems like master and worker are not ...

Jan 3, 2019 in Apache Spark by Omkar
• 69,210 points
1,412 views
0 votes
1 answer

How to find values common to two sets in Scala

Hey, There are two ways to find the ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,411 views
0 votes
1 answer

How to enable dynamic resource allocation in Spark?

To dynamically enable dynamic resource allocation, you ...

Mar 12, 2019 in Apache Spark by veer
1,394 views
0 votes
1 answer

How to compress serialized RDD partition?

Yes, you can do this by enabling ...

Mar 7, 2019 in Apache Spark by Pavitra
1,392 views
0 votes
1 answer

Why does sortBy transformation trigger a Spark job?

Actually, sortBy/sortByKey depends on RangePartitioner (JVM). So ...

May 8, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,381 views
0 votes
1 answer

Query regarding Operator Overloading in Scala

All prefix operators' symbols are predefined: +, -, ...

Jul 10, 2019 in Apache Spark by Karan
1,371 views
0 votes
1 answer

How can we apply a function to each element using the "foreach" function in Scala?

Hi, Yes, you can use the "foreach" function because it will ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,910 points
1,368 views
0 votes
1 answer

Getting "buffer limit exceeded" exception inside Kryo.

Seems like the object being sent for ...

Mar 7, 2019 in Apache Spark by Pavitra
1,362 views
0 votes
1 answer

How to implement my own clustering algorithm in PySpark (without using a ready-made library implementation such as k-means)?

Hi @dani, as you said you are a beginner ...

Oct 14, 2020 in Apache Spark by MD
• 95,440 points
1,359 views
0 votes
1 answer

How to increase Garbage Collection speed?

The time interval between Garbage Collection is ...

Mar 8, 2019 in Apache Spark by Pavitra
1,358 views
0 votes
1 answer

Does Spark provide the storage layer too?

No, it doesn’t provide a storage layer but ...

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,351 views
0 votes
1 answer

Load/save in Spark

The reason why you are able to ...

Jul 5, 2019 in Apache Spark by Firoz
1,349 views
0 votes
1 answer

How to increase HDFS replication level in Spark?

Hi @Raunak. You can change the replication ...

Mar 27, 2019 in Apache Spark by Yash
1,347 views
0 votes
1 answer

How to set executors for static allocation in Spark on YARN?

Open Spark shell and run the following ...

Mar 28, 2019 in Apache Spark by Raj
1,342 views
0 votes
1 answer

How to create an RDD from a parallelized collection in Scala?

Hi, You can check this example in your ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
1,339 views
0 votes
1 answer

In how many modes can Apache Spark run?

Hey, You can launch a Spark application in four ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
1,334 views
0 votes
1 answer

Spark memory processing on a non-temporary table

A temporary table is more like an index ...

Jul 14, 2019 in Apache Spark by Suri
1,307 views
0 votes
1 answer

How is an RDD in Spark different from Distributed Storage Management? Can anyone help me with this?

Some of the key differences between an RDD and ...

Jul 26, 2018 in Apache Spark by zombie
• 3,790 points
1,301 views
0 votes
1 answer

How to concatenate sets in Scala?

Hey, Yes, there are two ways of doing ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,295 views
+1 vote
1 answer

Scala: Save data from a CSV file into HBase

Check the reference code mentioned below: def main(args: ...

Jul 25, 2019 in Apache Spark by Hari
1,292 views
0 votes
1 answer

Spark cannot access local file anymore?

By default it will access the HDFS. ...

May 3, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,289 views
0 votes
0 answers

From the given choices, identify the value returned by $"whatever"

From the given choices, identify the value ...

Nov 25, 2020 in Apache Spark by ritu
• 960 points
1,276 views
0 votes
1 answer

How to use Spark jars for YARN distribution?

First, upload this archive to HDFS and ...

Mar 28, 2019 in Apache Spark by Raj
1,269 views
0 votes
1 answer

How to give a user view-only access to a Spark application?

You can give users only view permission ...

Mar 14, 2019 in Apache Spark by Raj
1,265 views
0 votes
1 answer

Loading Spark properties dynamically

First, create an empty conf using this ...

Feb 22, 2019 in Apache Spark by Mansoor
1,264 views
0 votes
1 answer

How to create a uniform list in Scala?

Hey, The method List.fill() creates a list and ...

Aug 1, 2019 in Apache Spark by Gitika
• 65,910 points
1,255 views
0 votes
1 answer

Functions of Spark SQL?

Spark SQL is capable of: Loading data from ...

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,248 views
0 votes
1 answer

Not able to preserve shuffle files in Spark

You lose the files because by default, ...

Feb 24, 2019 in Apache Spark by Rana
1,243 views
0 votes
1 answer

What are “Unit” and “()” in Scala?

Hey, Unit is a subtype of scala.AnyVal and ...

Jul 24, 2019 in Apache Spark by Gitika
• 65,910 points
1,241 views