Most viewed questions in Apache Spark

0 votes
1 answer

How to access private key password with Spark?

Spark allows you to retrieve the key ...

Mar 15, 2019 in Apache Spark by Karan
1,605 views
+1 vote
1 answer

Kafka Feature

Here are some of the important features of ...

Jun 7, 2018 in Apache Spark by Data_Nerd
• 2,390 points
1,600 views
0 votes
1 answer

Installing Spark on Ubuntu

Hey. Follow these steps to install Spark ...

Feb 20, 2019 in Apache Spark by Omkar
• 69,210 points
1,592 views
0 votes
1 answer

Why are partitions immutable in Spark?

Hi, every transformation generates a new partition. Partitions ...

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
1,573 views
0 votes
1 answer

How to create singleton classes in Scala?

Hey, Scala introduces a new object keyword, which is used ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,569 views
0 votes
1 answer

How to read Avro Partition Data?

Hi @akhtar, when we try to retrieve the data ...

Nov 4, 2020 in Apache Spark by MD
• 95,440 points
1,564 views
0 votes
1 answer

Which syntax to use to sum the elements of a list in Scala?

Hi, you can see this example to get ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,910 points
1,556 views
+1 vote
1 answer

Facing out-of-memory errors in Spark driver

I am guessing that the configuration set ...

Feb 23, 2019 in Apache Spark by Rishab
1,554 views
0 votes
1 answer

How to create an RDD from an external file source in Scala?

Hi, to create an RDD from an external file ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
1,549 views
0 votes
1 answer

env: R: No such file or directory

Hi @akhtar, I also got this error. I am able to ...

Jul 22, 2020 in Apache Spark by MD
• 95,440 points
1,544 views
0 votes
1 answer

Which performs better: a join in SQL or the Dataset API?

DataFrames and Spark SQL performed about the ...

Apr 19, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,544 views
0 votes
1 answer

Spark Kill Running Application

You can copy the application ID from ...

Apr 25, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,524 views
0 votes
1 answer

Is there any way to uncache RDD?

An RDD can be uncached using unpersist(). So, use ...

May 30, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,497 views
0 votes
1 answer

Array of RDD

You can create an array of RDDs ...

Mar 8, 2019 in Apache Spark by Raj
1,486 views
0 votes
1 answer

How to run Spark in standalone client mode?

Hi, these are the steps to run Spark in ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,910 points
1,483 views
0 votes
1 answer

Scala: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream, nor an OOXML stream

Try executing the code below: def readExcel(file: String): ...

Jul 30, 2019 in Apache Spark by Raman
1,463 views
+1 vote
1 answer

How to install Scala Build Tool (SBT) on Ubuntu?

Hey, to install SBT on Ubuntu, you first need ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,910 points
1,457 views
0 votes
1 answer

How to set keys & access tokens for Twitter Spark streaming?

You either have to create a twitter4j.properties ...

May 24, 2018 in Apache Spark by Shubham
• 13,490 points
1,450 views
0 votes
1 answer

Spark Submit: class does not exist

In the command, you have mentioned the ...

Jul 26, 2019 in Apache Spark by Jimmy
1,444 views
0 votes
1 answer

Spark to Hive Table creation

There's an easier way to achieve your ...

Jul 23, 2019 in Apache Spark by Dinesh
1,443 views
+1 vote
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...

Aug 25, 2019 in Apache Spark by anonymous
• 130 points

closed Aug 26, 2019 by Omkar
1,430 views
–1 vote
1 answer

Not able to use sc in Spark shell

Seems like master and worker are not ...

Jan 3, 2019 in Apache Spark by Omkar
• 69,210 points
1,414 views
0 votes
1 answer

How to find values common to two sets in Scala

Hey, there are two ways to find the ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,412 views
0 votes
1 answer

How to enable dynamic resource allocation in Spark?

To enable dynamic resource allocation, you ...

Mar 12, 2019 in Apache Spark by veer
1,400 views
0 votes
1 answer

How to compress serialized RDD partition?

Yes, you can do this by enabling ...

Mar 7, 2019 in Apache Spark by Pavitra
1,396 views
0 votes
1 answer

Why does sortBy transformation trigger a Spark job?

Actually, sortBy/sortByKey depend on RangePartitioner (JVM). So ...

May 8, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,385 views
0 votes
1 answer

How can we iterate any function using the "foreach" function in Scala?

Hi, yes, you use the "foreach" function because it will ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,910 points
1,376 views
0 votes
1 answer

Query regarding Operator Overloading in Scala

All prefix operators' symbols are predefined: +, -, ...

Jul 10, 2019 in Apache Spark by Karan
1,371 views
0 votes
1 answer

Getting "buffer limit exceeded" exception inside Kryo.

Seems like the object being sent for ...

Mar 7, 2019 in Apache Spark by Pavitra
1,368 views
0 votes
1 answer

How to implement my clustering algorithm in PySpark (without using a ready-made library, for example k-means)?

Hi @dani, as you said you are a beginner ...

Oct 14, 2020 in Apache Spark by MD
• 95,440 points
1,365 views
0 votes
1 answer

How to increase Garbage Collection speed?

The time interval between garbage collections is ...

Mar 8, 2019 in Apache Spark by Pavitra
1,363 views
0 votes
1 answer

Does Spark provide the storage layer too?

No, it doesn’t provide a storage layer, but ...

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,354 views
0 votes
1 answer

How to increase HDFS replication level in Spark?

Hi @Raunak. You can change the replication ...

Mar 27, 2019 in Apache Spark by Yash
1,352 views
0 votes
1 answer

Load/save in Spark

The reason why you are able to ...

Jul 5, 2019 in Apache Spark by Firoz
1,351 views
0 votes
1 answer

How to create an RDD from a parallelized collection in Scala?

Hi, you can check this example in your ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
1,348 views
0 votes
1 answer

How to set executors for static allocation in Spark Yarn?

Open Spark shell and run the following ...

Mar 28, 2019 in Apache Spark by Raj
1,344 views
0 votes
1 answer

In how many modes can Apache Spark run?

Hey, you can launch a Spark application in four ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
1,338 views
0 votes
1 answer

Spark memory processing on a non-temporary table

A temporary table is more like an index ...

Jul 14, 2019 in Apache Spark by Suri
1,308 views
0 votes
1 answer

How is an RDD in Spark different from Distributed Storage Management? Can anyone help me with this?

Some of the key differences between an RDD and ...

Jul 26, 2018 in Apache Spark by zombie
• 3,790 points
1,306 views
0 votes
1 answer

How to concatenate sets in Scala?

Hey, yes, there are two ways of doing ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,301 views
+1 vote
1 answer

Scala: Save data from a CSV file into HBase

Check the reference code mentioned below: def main(args: ...

Jul 25, 2019 in Apache Spark by Hari
1,296 views
0 votes
1 answer

Spark cannot access local file anymore?

By default, it will access HDFS. ...

May 3, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,291 views
0 votes
0 answers

From the given choices, identify the value returned by $"whatever"

From the given choices, identify the value ...

Nov 25, 2020 in Apache Spark by ritu
• 960 points
1,281 views
0 votes
1 answer

How to use Spark jars for Yarn distribution?

First, upload this archive to HDFS and ...

Mar 28, 2019 in Apache Spark by Raj
1,271 views
0 votes
1 answer

How to give user only view access for Spark application?

You can give users only view permission ...

Mar 14, 2019 in Apache Spark by Raj
1,270 views
0 votes
1 answer

Loading Spark properties dynamically

First, create an empty conf using this ...

Feb 22, 2019 in Apache Spark by Mansoor
1,267 views
0 votes
1 answer

How to use uniform list in Scala?

Hey, the method List.fill() creates a list and ...

Aug 1, 2019 in Apache Spark by Gitika
• 65,910 points
1,260 views
0 votes
1 answer

Functions of Spark SQL?

Spark SQL is capable of: Loading data from ...

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
1,249 views
0 votes
1 answer

What is “Unit” and “()” in Scala?

Hey, Unit is a subtype of scala.AnyVal and ...

Jul 24, 2019 in Apache Spark by Gitika
• 65,910 points
1,247 views
0 votes
1 answer

How do I access the Map Task ID in Spark?

You can access task information using TaskContext: import org.apache.spark.TaskContext sc.parallelize(Seq[Int](), ...

Jul 23, 2019 in Apache Spark by ravikiran
• 4,620 points
1,244 views