Most viewed questions in Apache Spark

0 votes
0 answers

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable [closed]

Hi All I am running Scala program on ...READ MORE

May 5, 2019 in Apache Spark by Vishal

closed May 6, 2019 by Omkar 5,302 views
0 votes
1 answer

error: identifier expected but ']' found.

Hi, You can try this remove brackets from ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,910 points
5,134 views
0 votes
1 answer
0 votes
1 answer

How to stop messages from being displayed on spark console?

In your log4j.properties file you need to ...READ MORE

Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,390 points
5,114 views
+1 vote
2 answers

How do I get number of columns in each line from a delimited file??

Instead of spliting on '\n'. You should ...READ MORE

Aug 7, 2019 in Apache Spark by ashish
5,081 views
–1 vote
0 answers
0 votes
1 answer

what are the spark real time issues ?

Some of the issues I have faced ...READ MORE

Mar 18, 2019 in Apache Spark by Sharman
5,028 views
0 votes
1 answer

Changing Yarn queue in Spark application

To change the default queue to which ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
5,020 views
0 votes
1 answer

Difference between sparkContext, JavaSparkContext, SQLContext, & SparkSession?

Yes, there is a difference between the ...READ MORE

Jul 4, 2018 in Apache Spark by nitinrawat895
• 11,380 points
4,984 views
0 votes
0 answers
0 votes
1 answer

How to launch spark application in cluster mode in Spark?

Hi, To launch spark application in cluster mode, ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
4,750 views
+1 vote
1 answer

How to read a data from text file in Spark?

Hey, You can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,910 points
4,744 views
0 votes
1 answer

How to find the number of null contain in dataframe?

Hey there! You can use the select method of the ...READ MORE

May 3, 2019 in Apache Spark by Omkar
• 69,230 points
4,706 views
0 votes
1 answer

How to create a not null column in case class in spark

Hi@Deepak, In your test class you passed empid ...READ MORE

May 14, 2020 in Apache Spark by MD
• 95,440 points
4,604 views
0 votes
1 answer

Spark2-submit does not generate output file.

To generate the output file, you can ...READ MORE

Feb 24, 2019 in Apache Spark by Esha
4,563 views
0 votes
1 answer

Spark:error:throws stack overflow when union a lot.

Hey, Use SparkContext.union(...) instead to union many RDDs at once You ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
4,529 views
0 votes
1 answer

What are the levels of parallelism in spark streaming ?

> In order to reduce the processing ...READ MORE

Jul 27, 2018 in Apache Spark by zombie
• 3,790 points
4,511 views
0 votes
1 answer

Spark-shell not working

First, reboot the system. And after reboot, ...READ MORE

Jul 15, 2019 in Apache Spark by Mahesh
4,505 views
0 votes
1 answer

How to get Spark dataset metadata?

There are a bunch of functions that ...READ MORE

Apr 26, 2018 in Apache Spark by kurt_cobain
• 9,390 points
4,480 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...READ MORE

Aug 1, 2019 in Apache Spark by Karan
4,461 views
+1 vote
2 answers

Spark: Can we add column to dataframe?

Yes we can add columns to the ...READ MORE

Oct 24, 2019 in Apache Spark by Siva
• 160 points
4,460 views
0 votes
1 answer

What is the difference between Apache Spark SQLContext vs HiveContext?

Spark 2.0+ Spark 2.0 provides native window functions ...READ MORE

May 26, 2018 in Apache Spark by nitinrawat895
• 11,380 points
4,405 views
0 votes
1 answer

How to check if a particular keyword exists in Apache Spark?

Hey, You can try this code to get ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,910 points
4,385 views
0 votes
1 answer

Scala: Add user input to array

You can try this:  object printarray { ...READ MORE

Jun 19, 2019 in Apache Spark by Dinesha
4,320 views
0 votes
1 answer

Caused by: java.lang.NumberFormatException: Empty String

Hi@akhtar, As we know text files are in ...READ MORE

Jan 31, 2020 in Apache Spark by MD
• 95,440 points
4,298 views
0 votes
1 answer

What is ofDim in Scala?

Hey, ofDim() is a method in Scala that ...READ MORE

Jul 24, 2019 in Apache Spark by Gitika
• 65,910 points
4,253 views
0 votes
1 answer

Spark: How can i create temp views in user defined database instead of default database?

You can try the below code: df.registerTempTable(“airports”) sqlContext.sql(" create ...READ MORE

Jul 14, 2019 in Apache Spark by Ishan
4,179 views
0 votes
1 answer

PySpark not starting: No active sparkcontext

Seems like Spark hadoop daemons are not ...READ MORE

Jul 30, 2019 in Apache Spark by Jishan
4,177 views
0 votes
1 answer

How to set cpu cores for spark task?

By default, each task is allocated with ...READ MORE

Mar 12, 2019 in Apache Spark by Veer
4,175 views
0 votes
1 answer

File not found exception while processing the spark job in yarn cluster mode with multinode hadoop cluster

Hi@Ganendra, I am not sure what's the issue, ...READ MORE

Jul 30, 2020 in Apache Spark by MD
• 95,440 points
4,150 views
0 votes
1 answer

Spark Installation problem

After downloading Spark, you need to set ...READ MORE

Jul 5, 2019 in Apache Spark by Rishi
4,140 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, It already has SparkContent.union and it does know how to ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
4,139 views
0 votes
1 answer

How do I connect to a HIVE Meta store through a program in SparkSQL?

In spark 2.0.+ it should look something ...READ MORE

Sep 5, 2019 in Apache Spark by ravikiran
• 4,620 points
4,111 views
0 votes
1 answer

How to change the location of Spark event logs?

You can change the location where you ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
4,088 views
0 votes
1 answer

Load .xlsx files to hive tables with spark scala

This should work: def readExcel(file: String): DataFrame = ...READ MORE

Jul 22, 2019 in Apache Spark by Kishan
4,087 views
0 votes
1 answer

What is RDD Lineage in Spark?

Hey, Lineage is an RDD process to reconstruct ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
4,080 views
0 votes
1 answer

how create distance vector in pyspark (Euclidean distance)

Hi@dani, You can find the euclidean distance using ...READ MORE

Oct 16, 2020 in Apache Spark by MD
• 95,440 points
4,070 views
+1 vote
1 answer

How to convert a json file structure with values in single quotes to quoteless ?

You can do this by turning off ...READ MORE

Oct 4, 2019 in Apache Spark by Jisha
3,994 views
+1 vote
1 answer

Error: value textfile is not a member of org.apache.spark.SparkContext

Hi, Regarding this error, you just need to change ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
3,994 views
0 votes
1 answer

How do find Max and Min values in a set in Scala?

Hey, Here is the example of which will return ...READ MORE

Jul 30, 2019 in Apache Spark by Gitika
• 65,910 points
3,920 views
0 votes
1 answer

Ways to create RDD in Apache Spark

There are two popular ways using which ...READ MORE

Jun 19, 2018 in Apache Spark by nitinrawat895
• 11,380 points
3,878 views
0 votes
1 answer

What do we mean by an RDD in Spark?

The full form of RDD is a ...READ MORE

Jun 18, 2018 in Apache Spark by nitinrawat895
• 11,380 points
3,842 views
0 votes
1 answer

How to remove the elements with a key present in any other RDD?

Hey, You can use the subtractByKey () function to ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,910 points
3,833 views
0 votes
1 answer

Copy all files from local (Windows) to HDFS with Scala code

Please try the following Scala code: import org.apache.hadoop.conf.Configuration import ...READ MORE

May 22, 2019 in Apache Spark by Karan
3,806 views
0 votes
1 answer

error: Caused by: org.apache.spark.SparkException: Failed to execute user defined function.

Hi@akhtar, I think you got this error due to version mismatch ...READ MORE

Apr 22, 2020 in Apache Spark by MD
• 95,440 points
3,792 views
0 votes
1 answer

How to set extra JVM options for Spark application?

You cans set extra JVM options that ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
3,769 views
0 votes
1 answer

How to store files in executor's working directory?

You have to specify a comma-separated list ...READ MORE

Mar 28, 2019 in Apache Spark by Raj
3,766 views
0 votes
1 answer

How to convert rdd object to dataframe in spark

SqlContext has a number of createDataFrame methods ...READ MORE

May 30, 2018 in Apache Spark by nitinrawat895
• 11,380 points
3,732 views
0 votes
1 answer

How to parse a textFile to csv in pyspark?

Hi, Use this below given code, it will ...READ MORE

Apr 13, 2020 in Apache Spark by MD
• 95,440 points
3,719 views