Trending questions in Apache Spark

0 votes
1 answer

How Foreach Operation works in Apache Spark?

Hi, foreach() operation is an action. It does not ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,870 points
2,853 views
0 votes
1 answer

How to append a list in Scala?

Hey, For this purpose, we use the single ...READ MORE

Jul 26, 2019 in Apache Spark by Gitika
• 65,870 points
3,041 views
0 votes
1 answer

Read multiple xml files in Spark

You can do this using globbing. See ...READ MORE

Jul 25, 2019 in Apache Spark by Jack
2,907 views
0 votes
1 answer

Cache() vs persist() in Spark

The cache() is used only the default storage level ...READ MORE

Mar 8, 2019 in Apache Spark by Raj
8,905 views
+1 vote
1 answer

Need to load 40 GB data to elasticsearch using spark

Did you find any documents or example ...READ MORE

Nov 5, 2019 in Apache Spark by Begum
588 views
+1 vote
1 answer

How to add package com.databricks.spark.avro in spark?

Start spark shell using below line of ...READ MORE

Jul 10, 2019 in Apache Spark by Jishnu
3,265 views
0 votes
1 answer

Can anyone explain the sparse vector in Spark?

Hey, A sparse vector is used for storing ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,870 points
2,246 views
0 votes
1 answer

Spark Error: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

There seems to be a problem with ...READ MORE

May 24, 2019 in Apache Spark by Jishan
5,216 views
+1 vote
1 answer

_spark_metadata/0 doesn't exist while Compacting batch 9 Structured streaming error

Please check https://kb.databricks.com/streaming/file-sink-str ...READ MORE

Nov 20, 2019 in Apache Spark by anonymous
1,111 views
0 votes
1 answer

How do find Max and Min values in a set in Scala?

Hey, Here is the example of which will return ...READ MORE

Jul 30, 2019 in Apache Spark by Gitika
• 65,870 points
2,265 views
+1 vote
1 answer

How to convert JSON file to AVRO file and vise versa

Try including the package while starting the ...READ MORE

Aug 26, 2019 in Apache Spark by Karan
1,109 views
0 votes
1 answer

How to compute the square root of sum of squares of numbers?

Hey, You need to follow some steps to complete ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,870 points
2,583 views
+1 vote
0 answers

Difference Between rdd dataframe dataset [closed]

Sep 12, 2019 in Apache Spark by Rajesh pagadala

closed Sep 13, 2019 by Omkar 292 views
0 votes
1 answer

How to call the Debug Mode in PySpark?

As far as I understand your intentions ...READ MORE

Jul 26, 2019 in Apache Spark by ravikiran
• 4,620 points
2,357 views
0 votes
1 answer

Spark Null Pointer Exception.

I used Spark 1.5.2 with Hadoop 2.6 ...READ MORE

Jul 19, 2019 in Apache Spark by ravikiran
• 4,620 points
2,621 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, It already has SparkContent.union and it does know how to ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,870 points
2,086 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...READ MORE

Sep 11, 2019 in Apache Spark by ravikiran
• 4,620 points
272 views
+1 vote
1 answer

How to extract record from one RDD using another RDD

Hey, you can use "contains" filter to extract ...READ MORE

Aug 23, 2019 in Apache Spark by Karan
1,056 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...READ MORE

Aug 1, 2019 in Apache Spark by Karan
2,025 views
0 votes
1 answer

What is Spark UI and how to monitor a spark job?

Hey, Jobs- to view all the spark jobs Stages- ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,870 points
1,784 views
+1 vote
1 answer

How do I turn off INFO Logging in Spark?

Hi, You need to edit one property in ...READ MORE

Jul 12, 2019 in Apache Spark by ravikiran
• 4,620 points

edited Dec 20, 2020 by MD 2,821 views
0 votes
1 answer

Load .xlsx files to hive tables with spark scala

This should work: def readExcel(file: String): DataFrame = ...READ MORE

Jul 22, 2019 in Apache Spark by Kishan
2,334 views
0 votes
1 answer

load/save text file in spark

The reason you are able to load ...READ MORE

Jul 22, 2019 in Apache Spark by Giri
2,146 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting text file to Orc: Using Spark, the ...READ MORE

Aug 1, 2019 in Apache Spark by Esha
1,588 views
+1 vote
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...READ MORE

Aug 24, 2019 in Apache Spark by anonymous
• 130 points

closed Aug 26, 2019 by Omkar 596 views
+1 vote
1 answer

What is reduce() action in Spark?

Hey, It takes a function that operates on two ...READ MORE

Jul 2, 2019 in Apache Spark by Gitika
• 65,870 points
2,858 views
0 votes
1 answer

How fault tolerance is achieved in Apache Spark?

Hey, In Apache Spark, the data storage model is ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,870 points
1,931 views
0 votes
1 answer

PySpark not starting: No active sparkcontext

Seems like Spark hadoop daemons are not ...READ MORE

Jul 30, 2019 in Apache Spark by Jishan
1,577 views
0 votes
1 answer

Copy file from local to hdfs from the spark job in yarn mode

Refer to the below code: import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem import ...READ MORE

Jul 24, 2019 in Apache Spark by Yogi
1,835 views
–1 vote
0 answers
0 votes
1 answer

How to save RDD in Apache Spark?

Hey, There are few methods provided by the ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,870 points
1,809 views
+1 vote
1 answer

By default how many partitions are created in RDD in Apache spark?

Well, it depends on the block of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,870 points
1,296 views
0 votes
1 answer

Spark:error:throws stack overflow when union a lot.

Hey, Use SparkContext.union(...) instead to union many RDDs at once You ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,870 points
1,413 views
0 votes
1 answer

How to add package com.databricks.spark.avro in spark?

Start spark shell using below line of ...READ MORE

Jul 23, 2019 in Apache Spark by Ritu
1,742 views
0 votes
1 answer

How to start spark history server?

Hey, You can use this command to start​ ...READ MORE

Jul 24, 2019 in Apache Spark by Gitika
• 65,870 points
1,518 views
0 votes
1 answer

Spark-shell not working

First, reboot the system. And after reboot, ...READ MORE

Jul 15, 2019 in Apache Spark by Mahesh
1,913 views
0 votes
1 answer

Spark: How can i create temp views in user defined database instead of default database?

You can try the below code: df.registerTempTable(“airports”) sqlContext.sql(" create ...READ MORE

Jul 14, 2019 in Apache Spark by Ishan
1,976 views
0 votes
1 answer

Spark: Read from Hive, store in HDFS

Below is an example of reading data ...READ MORE

Jul 26, 2019 in Apache Spark by Lohit
1,382 views
0 votes
1 answer

Error : split value is not a member of org.apache.spark.sql.Row

spark.read.csv is used when loading into a ...READ MORE

Jul 22, 2019 in Apache Spark by Firoz
1,541 views
0 votes
1 answer

How to concatenate Maps in Scala?

Hey, You can concatenate/join two Maps in more than ...READ MORE

Jul 29, 2019 in Apache Spark by Gitika
• 65,870 points

edited Jul 29, 2019 by Gitika 1,207 views
0 votes
1 answer

What are these in scala : _* & @_*

As is widely used, and has different ...READ MORE

Jul 31, 2019 in Apache Spark by Turic
1,109 views
0 votes
1 answer

How to remove the elements with a key present in any other RDD?

Hey, You can use the subtractByKey () function to ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,870 points
1,486 views
0 votes
1 answer

How to create paired RDD using subString method in Spark?

Hi, If you have a file with id ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,870 points
992 views
0 votes
1 answer

What is ofDim in Scala?

Hey, ofDim() is a method in Scala that ...READ MORE

Jul 24, 2019 in Apache Spark by Gitika
• 65,870 points
1,369 views
0 votes
1 answer

What is the difference between persist() and cache() in apache spark?

Hi, persist () allows the user to specify ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,870 points
2,220 views
0 votes
1 answer

Which File System is supported by Apache Spark?

Hi, Apache Spark is an advanced data processing ...READ MORE

Jul 5, 2019 in Apache Spark by Gitika
• 65,870 points
2,129 views
0 votes
3 answers

How to transpose Spark DataFrame?

Please check the below mentioned links for ...READ MORE

Dec 31, 2018 in Apache Spark by anonymous
14,831 views
0 votes
1 answer

Spark + Hive connectivity

The problem is probably with the command. ...READ MORE

Aug 1, 2019 in Apache Spark by Rishni
922 views
0 votes
1 answer

Create dataframe for Avro file

Yes, we can work with Avro files ...READ MORE

Jul 22, 2019 in Apache Spark by Rishi
1,277 views
+1 vote
1 answer

Error: value textfile is not a member of org.apache.spark.SparkContext

Hi, Regarding this error, you just need to change ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,870 points
2,020 views