Latest questions in Apache Spark

0 votes
1 answer

How is SparkSQL different from HQL and SQL?

Hi, SparkSQL is a special component on the ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
3,449 views
0 votes
1 answer

What is Piping in Spark?

Hi, Spark provides a pipe() method on RDDs. ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
2,702 views
0 votes
1 answer

What is the difference between persist() and cache() in Apache Spark?

Hi, persist() allows the user to specify ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
3,336 views
0 votes
1 answer

What is meant by Transformation? Give some examples.

Hi, transformations are the functions that are ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
557 views
0 votes
0 answers

When we create an RDD, does it bring the data and load it into memory?

Can anyone suggest when we create an ...READ MORE

Jul 3, 2019 in Apache Spark by monalisa

recategorized Jul 4, 2019 by Gitika
1,109 views
+1 vote
2 answers

What is SparkContext?

SparkContext sets up internal services and establishes ...READ MORE

Dec 5, 2019 in Apache Spark by anonymous
1,647 views
+1 vote
1 answer

What is reduce() action in Spark?

Hey, It takes a function that operates on two ...READ MORE

Jul 2, 2019 in Apache Spark by Gitika
• 65,910 points
5,273 views
0 votes
1 answer

What is persist() in Spark?

Hi, Spark’s RDDs are by default recomputed each ...READ MORE

Jul 2, 2019 in Apache Spark by Gitika
• 65,910 points
570 views
0 votes
1 answer

What is a Parquet file in Spark?

Hey, Parquet is a columnar file format supported ...READ MORE

Jul 2, 2019 in Apache Spark by Gitika
• 65,910 points
1,090 views
0 votes
2 answers

How to execute a function in Scala?

Function definition: def test(): Unit = { var a = 10; var b = 20; var c = a + b } calling ...READ MORE

Aug 5, 2020 in Apache Spark by Ramkumar Ramasamy
679 views
0 votes
1 answer

How to calculate the result of a formula with Scala?

Hi, You can use a simple mathematical calculation ...READ MORE

Jul 1, 2019 in Apache Spark by Gitika
• 65,910 points
1,013 views
0 votes
1 answer

Which components make up the Spark ecosystem libraries?

Hi, Spark ecosystem libraries are composed of various ...READ MORE

Jul 1, 2019 in Apache Spark by Gitika
• 65,910 points
497 views
0 votes
1 answer

What is polyglot in Spark?

Hi, Spark provides a high-level API in Java, ...READ MORE

Jul 1, 2019 in Apache Spark by Gitika
• 65,910 points
2,096 views
0 votes
1 answer

What is an RDD in Apache Spark?

Hi, RDD in Spark stands for Resilient Distributed ...READ MORE

Jul 1, 2019 in Apache Spark by Gitika
• 65,910 points
1,169 views
0 votes
1 answer

Doubt in display(id, name, salary) before display function

The statement display(id, name, salary) is written before the display function ...READ MORE

Jun 19, 2019 in Apache Spark by Ritu
403 views
0 votes
1 answer

Scala pass input data as arguments

Please refer to the below code as ...READ MORE

Jun 19, 2019 in Apache Spark by Lisa
2,127 views
0 votes
1 answer

Scala: Add user input to array

You can try this:  object printarray { ...READ MORE

Jun 19, 2019 in Apache Spark by Dinesha
4,271 views
0 votes
1 answer

Spark CLI issue

For spark.read.textFile we need spark-2.x. Please try ...READ MORE

Jun 19, 2019 in Apache Spark by Maahi
517 views
0 votes
1 answer

Spark foldByKey doubt

Please have a look below for your ...READ MORE

Jun 19, 2019 in Apache Spark by Tina
2,246 views
+1 vote
1 answer

_spark_metadata/0 doesn't exist while Compacting batch 9 Structured streaming error

Please check https://kb.databricks.com/streaming/file-sink-str ...READ MORE

Nov 20, 2019 in Apache Spark by anonymous
3,627 views
0 votes
1 answer

Starting Spark Scala console

To get a command prompt for Scala, open ...READ MORE

May 24, 2019 in Apache Spark by Cassy
563 views
0 votes
1 answer

Spark Error: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

There seems to be a problem with ...READ MORE

May 24, 2019 in Apache Spark by Jishan
10,312 views
0 votes
1 answer

Difference between RDD as val and var

Variable declaration can be done in two ...READ MORE

May 23, 2019 in Apache Spark by Arun
2,217 views
0 votes
1 answer

Error while reading multiline JSON

peopleDF: org.apache.spark.sql.DataFrame = [_corrupt_record: string] The above that ...READ MORE

May 23, 2019 in Apache Spark by Conny
2,633 views
0 votes
1 answer

Copy all files from local (Windows) to HDFS with Scala code

Please try the following Scala code: import org.apache.hadoop.conf.Configuration import ...READ MORE

May 22, 2019 in Apache Spark by Karan
3,752 views
0 votes
1 answer

Starting Spark in Windows

Run the commands below: spark-class org.apache.spark.deploy.master.Master, then spark-class org.apache.spark.deploy.worker.Worker spark://192.168.254.1:7077 NOTE: The ...READ MORE

May 22, 2019 in Apache Spark by Reshma
807 views
0 votes
1 answer

Spark: Saving a CSV file

If you need a single output file ...READ MORE

May 22, 2019 in Apache Spark by Rishi
2,405 views
0 votes
0 answers

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable [closed]

Hi All I am running Scala program on ...READ MORE

May 5, 2019 in Apache Spark by Vishal

closed May 6, 2019 by Omkar
5,274 views
0 votes
1 answer

How to find the number of nulls in a DataFrame?

Hey there! You can use the select method of the ...READ MORE

May 3, 2019 in Apache Spark by Omkar
• 69,210 points
4,665 views
0 votes
1 answer

How can we use the Spark shell for Scala without a cluster?

You can run the Spark shell for ...READ MORE

Apr 28, 2019 in Apache Spark by Giri
502 views
0 votes
1 answer

Spark: Comparing two big data files using Scala

Try this and see if this does ...READ MORE

Apr 2, 2019 in Apache Spark by Omkar
• 69,210 points
6,660 views
0 votes
1 answer

Spark Yarn: Changing the maximum number of times to submit an application

By default, the maximum number of times ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,678 views
0 votes
1 answer

Set Library to launch Yarn master

You can make use of a special library path to ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
438 views
0 votes
1 answer

How to set extra JVM options for Spark application?

You can set extra JVM options that ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
3,740 views
0 votes
1 answer

Thread to use Yarn application master is limited

This is because the maximum number of ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,014 views
0 votes
1 answer

How to use Spark jars for Yarn distribution?

First, upload this archive to HDFS and ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,268 views
0 votes
1 answer

Changing Yarn queue in Spark application

To change the default queue to which ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
4,987 views
0 votes
1 answer

How to set executors for static allocation in Spark Yarn?

Open Spark shell and run the following ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,340 views
0 votes
1 answer

How to use ftp scheme using Yarn in Spark application?

In case Yarn does not support schemes ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
935 views
0 votes
1 answer

How to store files in executor's working directory?

You have to specify a comma-separated list ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
3,704 views
0 votes
1 answer

Set archives to be extracted in executor directory

I don't think you can copy and ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,719 views
0 votes
1 answer

Need help setting Spark yarn history server address

If you are running history server and ...READ MORE

Mar 27, 2019 in Apache Spark by Neha
1,647 views
0 votes
1 answer

How to retain Spark jar and app jar after staging?

By default, Spark jar, app jar, and ...READ MORE

Mar 27, 2019 in Apache Spark by Ginni
812 views
0 votes
1 answer

How to increase HDFS replication level in Spark?

Hi @Raunak. You can change the replication ...READ MORE

Mar 27, 2019 in Apache Spark by Yash
1,345 views
0 votes
1 answer

Increase Yarn wait time for SparkContext

The default time that the Yarn application waits ...READ MORE

Mar 27, 2019 in Apache Spark by Rohit
1,938 views
0 votes
1 answer

Increase cores for Yarn in Spark application

By default, only one core is used for ...READ MORE

Mar 26, 2019 in Apache Spark by Bhuvan
975 views
0 votes
1 answer

Increasing memory to use for Yarn application master?

You can increase the memory dynamically by ...READ MORE

Mar 26, 2019 in Apache Spark by Tina
1,111 views
0 votes
1 answer

How to clean up application work directories faster?

By default, the cleanup time is set ...READ MORE

Mar 26, 2019 in Apache Spark by Jyoti
507 views
0 votes
1 answer

How to change worker cleanup interval?

The default interval time is 1800 seconds ...READ MORE

Mar 25, 2019 in Apache Spark by Hari
604 views
0 votes
1 answer

How to enable worker cleanup in Spark?

To enable cleanup, open the Spark shell ...READ MORE

Mar 25, 2019 in Apache Spark by Hari
2,298 views