Most viewed questions in Apache Spark

0 votes
1 answer

start-master and start-all?

sbin/start-master.sh : Starts a master instance on ...READ MORE

May 7, 2018 in Apache Spark by kurt_cobain
• 9,390 points
2,011 views
0 votes
1 answer

How to set keystore path?

You have to set the path to ...READ MORE

Mar 15, 2019 in Apache Spark by Karan
1,986 views
0 votes
0 answers

Unable to get the Job status and Group ID java- spark standalone program with databricks

package com.dataguise.test; import java.io.IOException; import java.util.concurrent.CountDownLatch; import java.util.concurrent.TimeUnit; import org.apache.spark.SparkContext; import org.apache.spark.SparkJobInfo; import ...READ MORE

Jul 23, 2020 in Apache Spark by kamboj
• 140 points

recategorized Jul 28, 2020 by Gitika 1,963 views
0 votes
1 answer

How to get the number of elements in partition?

rdd.mapPartitions(iter => Array(iter.size).iterator, true) This command will ...READ MORE

May 8, 2018 in Apache Spark by kurt_cobain
• 9,390 points
1,955 views
0 votes
1 answer

which one of the following commands is used to see the structure of the Dataframe?

Hi @Ritu If you want to see the ...READ MORE

Nov 25, 2020 in Apache Spark by Gitika
• 65,910 points
1,943 views
0 votes
1 answer

Increase Yarn wait time for Sparkcontext

The default time that the Yarn application waits ...READ MORE

Mar 27, 2019 in Apache Spark by Rohit
1,938 views
0 votes
1 answer

error:error: only classes can have declared but undefined members.

Hi, This happens in Scala whenever you won't ...READ MORE

Jul 24, 2019 in Apache Spark by Gitika
• 65,910 points
1,927 views
0 votes
1 answer

Error reading avro dataset in spark

For avro, you need to download and ...READ MORE

Feb 4, 2019 in Apache Spark by Omkar
• 69,210 points
1,924 views
0 votes
1 answer

Spark + Hive connectivity

The problem is probably with the command. ...READ MORE

Aug 1, 2019 in Apache Spark by Rishni
1,920 views
0 votes
1 answer

How to authenticate Spark internal connections using a secret key?

You need to set the secret key ...READ MORE

Mar 13, 2019 in Apache Spark by Venu
1,916 views
0 votes
1 answer

Akka in Spark

Spark uses Akka basically for scheduling. All ...READ MORE

May 31, 2018 in Apache Spark by Data_Nerd
• 2,390 points
1,916 views
0 votes
1 answer

what are the job optimization Technics in spark and scala ?

There are different methods to achieve optimization ...READ MORE

Mar 18, 2019 in Apache Spark by Veer
1,909 views
0 votes
2 answers

Parquet Files Advantages

Parquet is a columnar format supported by ...READ MORE

Jul 4, 2018 in Apache Spark by zombie
• 3,790 points
1,897 views
0 votes
1 answer

What is Map and flatMap in Spark?

Hi, The map is a specific line or ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
1,884 views
0 votes
0 answers
0 votes
1 answer

How to declare a Empty Scala Map?

Hi, You can either declare an empty Scala ...READ MORE

Jul 29, 2019 in Apache Spark by Gitika
• 65,910 points

edited Jul 29, 2019 by Gitika 1,847 views
0 votes
1 answer

Can the executor core be greater than the total number of spark tasks?

Hi@Rishi, Yes, it is possible. If executor no. ...READ MORE

Jun 17, 2020 in Apache Spark by MD
• 95,440 points
1,846 views
0 votes
1 answer

Difference between Spark ML & Spark MLlib package

org.apache.spark.mllib is the old Spark API while ...READ MORE

Jul 5, 2018 in Apache Spark by Shubham
• 13,490 points
1,846 views
0 votes
1 answer

Spark Core How to fetch max n rows of an RDD function without using Rdd.max()

Hi@Prasant, If Spark Streaming is not supporting tuple, ...READ MORE

Dec 3, 2020 in Apache Spark by MD
• 95,440 points
1,845 views
0 votes
1 answer

How do you load this multiline data in spark as a single record?

Hi@Ruben, I think you can add an escape ...READ MORE

Nov 23, 2020 in Apache Spark by MD
• 95,440 points
1,843 views
0 votes
1 answer

Invalid syntax in spark

There's a problem with your syntax. There ...READ MORE

Jan 31, 2019 in Apache Spark by Omkar
• 69,210 points
1,836 views
+1 vote
0 answers

how to access hive view using spark2

We do not have access to hive ...READ MORE

Dec 29, 2019 in Apache Spark by anonymous
• 130 points
1,827 views
0 votes
1 answer

How to import the dependencies of Spark MLlib into eclipse project?

I would recommend you create & build ...READ MORE

May 31, 2018 in Apache Spark by Shubham
• 13,490 points
1,812 views
0 votes
1 answer

Spark workers are not accepting any job (Kubernetes-Docker-Spark)

When kubernetes picks 10.*.*.*/16 network as it's ...READ MORE

Mar 1, 2019 in Apache Spark by Hamza
• 200 points
1,811 views
0 votes
1 answer

Spark Processing Internals

Spark uses a master/slave architecture. As you ...READ MORE

Jul 15, 2019 in Apache Spark by Jimmy

edited Jun 9, 2020 by MD 1,804 views
–1 vote
0 answers

How to parse an S3 XML file to find tags using apache spark

How can one parse an S3 XML ...READ MORE

Mar 18, 2020 in Apache Spark by anonymous
• 110 points
1,800 views
0 votes
1 answer

Increase number of cores in Spark

Now that the job is already running, ...READ MORE

Feb 23, 2019 in Apache Spark by Reshma
1,799 views
0 votes
1 answer

What does the following code print?

error: expected class or object definition sc.parallelize (Array(1L, ...READ MORE

Nov 25, 2020 in Apache Spark by Gitika
• 65,910 points
1,789 views
0 votes
1 answer

Unable to submit the spark job in deployment mode - multinode cluster(using ubuntu machines) with yarn master

Hi@Ganendra, As you said you launched a multinode cluster, ...READ MORE

Jul 29, 2020 in Apache Spark by MD
• 95,440 points
1,773 views
0 votes
1 answer

Explain the for loop for printing the Map values in Scala in Apache Spark?

Hey, You can see this following code to ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,910 points
1,766 views
0 votes
1 answer

Set archives to be extracted in executor directory

I don't think you can copy and ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,719 views
0 votes
0 answers

what are the memory issues in spark ?

Mar 18, 2019 in Apache Spark by satish kumar
• 180 points
1,716 views
0 votes
1 answer

Load custom delimited file in Spark

Refer to the following code: val sqlContext = ...READ MORE

Jul 24, 2019 in Apache Spark by Ritu
1,695 views
0 votes
1 answer

How to use yield keyword in scala and why it is used instead of println?

Hi, The yield keyword is used because the ...READ MORE

Jul 6, 2019 in Apache Spark by Gitika
• 65,910 points
1,693 views
0 votes
2 answers

Which cluster type should I choose for Spark?

Spark is agnostic to the underlying cluster ...READ MORE

Aug 21, 2018 in Apache Spark by zombie
• 3,790 points
1,688 views
0 votes
1 answer

Spark Yarn: Changing maximum number of time to submit application

By default, the maximum number of times ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
1,677 views
0 votes
1 answer

ERROR thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING, org.apache.spark.sql.catalyst.errors.package$TreeNodeException

Hi@akhtar, You may resolve this exception, by increasing the ...READ MORE

Apr 29, 2020 in Apache Spark by MD
• 95,440 points
1,676 views
0 votes
1 answer

How to load data of .csv file in MySQL Database Table?

You can do it using a code ...READ MORE

Jul 22, 2019 in Apache Spark by Vishwa
1,668 views
0 votes
1 answer

Multidimensional Array in Scala

Multidimensional array is an array which store ...READ MORE

Feb 11, 2019 in Apache Spark by Omkar
• 69,210 points
1,654 views
0 votes
1 answer

16)What allows spark to periodically persist data about an application such that it can recover from failures?

Hi@Edureka, Checkpointing is a process of truncating RDD ...READ MORE

Nov 26, 2020 in Apache Spark by MD
• 95,440 points
1,653 views
+1 vote
2 answers

What is sparkContext?

SparkContext sets up internal services and establishes ...READ MORE

Dec 5, 2019 in Apache Spark by anonymous
1,647 views
0 votes
1 answer

Need help setting Spark yarn history server address

If you are running history server and ...READ MORE

Mar 27, 2019 in Apache Spark by Neha
1,647 views
0 votes
1 answer

Can number of Spark task be greater than the executor core?

Hi@Rishi, Yes, number of spark tasks can be ...READ MORE

Jun 17, 2020 in Apache Spark by MD
• 95,440 points
1,638 views
+1 vote
1 answer

How to read .mp4 (video file) stored at HDFS using pyspark?

Hi@Amey, You can enable WebHDFS to do this ...READ MORE

May 29, 2020 in Apache Spark by MD
• 95,440 points
1,628 views
0 votes
1 answer

Spark context (sc) not found

Maybe the hadoop service didn't start properly. Try ...READ MORE

Feb 14, 2019 in Apache Spark by John
1,617 views
0 votes
1 answer

Scala: save filtered data row by row using saveAsTextFile

Try this code, it worked for me: val ...READ MORE

Aug 2, 2019 in Apache Spark by Karan
1,615 views
–2 votes
1 answer

What is the difference in Java’s “If..Else” and Scala’s “If..Else”? [closed]

Hey, Java’s “If. Else”: In Java, “If. Else” is a statement, ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,615 views
0 votes
1 answer

How to set max executors for dynamic allocation?

You can set it by assigning the ...READ MORE

Mar 13, 2019 in Apache Spark by Venu
1,608 views
0 votes
1 answer

How to access private key password with Spark?

Spark allows you to retrieve the key ...READ MORE

Mar 15, 2019 in Apache Spark by Karan
1,600 views
0 votes
1 answer

7)From Schema RDD, data can be cache by which one of the given choices?

Hi, @Ritu, According to the official documentation of Spark 1.2, ...READ MORE

Nov 23, 2020 in Apache Spark by Gitika
• 65,910 points
1,599 views