Trending questions in Apache Spark

0 votes
1 answer

How is fault tolerance achieved in Apache Spark?

Hey, In Apache Spark, the data storage model is ...

Jul 22, 2019 in Apache Spark by Gitika
• 65,850 points
4,251 views
0 votes
1 answer

How to launch a Spark application in cluster mode?

Hi, To launch a Spark application in cluster mode, ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,850 points
3,689 views
0 votes
1 answer

Read multiple xml files in Spark

You can do this using globbing. See ...

Jul 25, 2019 in Apache Spark by Jack
3,948 views
0 votes
1 answer

How to compute the square root of sum of squares of numbers?

Hey, You need to follow some steps to complete ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,850 points
4,015 views
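One way to picture the computation asked about above, as a minimal sketch in plain Python rather than the Spark API (the steps in the truncated answer are not shown here, so this is an assumption about the general shape): square each number, add the squares with a two-argument reduction, then take the square root. The function name `root_sum_of_squares` is hypothetical.

```python
import math
from functools import reduce


def root_sum_of_squares(numbers):
    """Return sqrt(x1^2 + x2^2 + ...) for an iterable of numbers."""
    # Square each element.
    squares = map(lambda x: x * x, numbers)
    # Add the squares two at a time, starting from 0.
    total = reduce(lambda a, b: a + b, squares, 0)
    return math.sqrt(total)


print(root_sum_of_squares([3, 4]))  # prints 5.0
```

In Spark the same shape would typically be a `map` over an RDD followed by `reduce`, with the square root taken on the driver.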
0 votes
1 answer

cache() vs persist() in Spark

cache() uses only the default storage level ...

Mar 8, 2019 in Apache Spark by Raj
9,852 views
0 votes
1 answer

Spark Null Pointer Exception.

I used Spark 1.5.2 with Hadoop 2.6 ...

Jul 19, 2019 in Apache Spark by ravikiran
• 4,620 points
4,048 views
0 votes
1 answer

Can anyone explain the sparse vector in Spark?

Hey, A sparse vector is used for storing ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,850 points
3,391 views
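To make the idea of a sparse vector concrete, here is a small sketch in plain Python. It is not Spark's own class: `ToySparseVector` is a hypothetical name, and the layout (a size plus parallel lists of indices and non-zero values) is an illustration of the general technique.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToySparseVector:
    """Stores only the non-zero entries of a vector (illustrative, not Spark's class)."""
    size: int             # total length of the vector
    indices: List[int]    # positions of the non-zero entries
    values: List[float]   # the non-zero entries, aligned with indices

    def to_dense(self) -> List[float]:
        # Start from all zeros, then fill in the stored entries.
        dense = [0.0] * self.size
        for i, v in zip(self.indices, self.values):
            dense[i] = v
        return dense


vec = ToySparseVector(size=5, indices=[1, 3], values=[2.0, 7.0])
print(vec.to_dense())  # prints [0.0, 2.0, 0.0, 7.0, 0.0]
```

Only two of the five entries are stored, which is where the space saving comes from when most entries are zero.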
0 votes
1 answer

How to call the Debug Mode in PySpark?

As far as I understand your intentions ...

Jul 26, 2019 in Apache Spark by ravikiran
• 4,620 points
3,633 views
+1 vote
1 answer

How to add package com.databricks.spark.avro in Spark?

Start the Spark shell using the below line of ...

Jul 10, 2019 in Apache Spark by Jishnu
4,277 views
+1 vote
2 answers

What is SparkContext?

SparkContext sets up internal services and establishes ...

Dec 5, 2019 in Apache Spark by anonymous
1,149 views
+1 vote
1 answer

How do I turn off INFO Logging in Spark?

Hi, You need to edit one property in ...

Jul 12, 2019 in Apache Spark by ravikiran
• 4,620 points

edited Dec 20, 2020 by MD 3,939 views
0 votes
1 answer

What are Spark jobs, tasks, and stages?

In a Spark application, when you invoke ...

Mar 18, 2019 in Apache Spark by Pavan
8,934 views
+1 vote
1 answer

How to convert a JSON file to an Avro file and vice versa

Try including the package while starting the ...

Aug 26, 2019 in Apache Spark by Karan
1,962 views
0 votes
3 answers

How to transpose Spark DataFrame?

Please check the below mentioned links for ...

Jan 1, 2019 in Apache Spark by anonymous
16,888 views
+1 vote
1 answer

_spark_metadata/0 doesn't exist while Compacting batch 9 Structured streaming error

Please check https://kb.databricks.com/streaming/file-sink-str ...

Nov 20, 2019 in Apache Spark by anonymous
1,891 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, It already has SparkContext.union and it does know how to ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,850 points
2,983 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...

Aug 1, 2019 in Apache Spark by Karan
2,921 views
0 votes
1 answer

How do I find max and min values in a set in Scala?

Hey, Here is an example which will return ...

Jul 30, 2019 in Apache Spark by Gitika
• 65,850 points
2,937 views
0 votes
1 answer

Which File System is supported by Apache Spark?

Hi, Apache Spark is an advanced data processing ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,850 points
3,860 views
+1 vote
1 answer

What is reduce() action in Spark?

Hey, It takes a function that operates on two ...

Jul 2, 2019 in Apache Spark by Gitika
• 65,850 points
3,920 views
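The reason reduce() wants a two-argument function can be sketched in plain Python (not the Spark API; `partitioned_reduce` and the sample data are hypothetical). The data is split into partitions, each partition is reduced on its own, and the partial results are then reduced again. This only gives a well-defined answer when the function is associative and commutative, which is what Spark's reduce expects.

```python
from functools import reduce


def partitioned_reduce(partitions, func):
    """Reduce each non-empty partition, then reduce the partial results."""
    # Step 1: one partial result per non-empty partition.
    partials = [reduce(func, part) for part in partitions if part]
    # Step 2: combine the partial results with the same function.
    # Raises TypeError if every partition is empty.
    return reduce(func, partials)


parts = [[1, 2, 3], [4, 5], [6]]
print(partitioned_reduce(parts, lambda a, b: a + b))  # prints 21
```

Addition and `max` are safe choices here; subtraction is not, because the result would depend on how the data happened to be partitioned.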
0 votes
1 answer

Load .xlsx files to hive tables with spark scala

This should work: def readExcel(file: String): DataFrame = ...

Jul 22, 2019 in Apache Spark by Kishan
3,059 views
+1 vote
1 answer

Need to load 40 GB data to elasticsearch using spark

Did you find any documents or example ...

Nov 5, 2019 in Apache Spark by Begum
775 views
0 votes
1 answer

Spark error: stack overflow when unioning many RDDs

Hey, Use SparkContext.union(...) instead to union many RDDs at once. You ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,850 points
2,556 views
+1 vote
1 answer

How to extract record from one RDD using another RDD

Hey, you can use "contains" filter to extract ...

Aug 23, 2019 in Apache Spark by Karan
1,538 views
0 votes
1 answer

What is Spark UI and how to monitor a spark job?

Hey, Jobs: to view all the Spark jobs. Stages: ...

Aug 6, 2019 in Apache Spark by Gitika
• 65,850 points
2,259 views
0 votes
1 answer

PySpark not starting: No active sparkcontext

Seems like Spark hadoop daemons are not ...

Jul 30, 2019 in Apache Spark by Jishan
2,530 views
0 votes
1 answer

Spark-shell not working

First, reboot the system. And after reboot, ...

Jul 15, 2019 in Apache Spark by Mahesh
3,168 views
0 votes
1 answer

What is ofDim in Scala?

Hey, ofDim() is a method in Scala that ...

Jul 24, 2019 in Apache Spark by Gitika
• 65,850 points
2,650 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting text file to Orc: Using Spark, the ...

Aug 1, 2019 in Apache Spark by Esha
2,248 views
+1 vote
0 answers

Difference between RDD, DataFrame, and Dataset [closed]

Sep 13, 2019 in Apache Spark by Rajesh pagadala

closed Sep 13, 2019 by Omkar 390 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...

Sep 11, 2019 in Apache Spark by ravikiran
• 4,620 points
437 views
0 votes
1 answer

Load/save text file in Spark

The reason you are able to load ...

Jul 22, 2019 in Apache Spark by Giri
2,596 views
0 votes
1 answer

How to check if a particular keyword exists in Apache Spark?

Hey, You can try this code to get ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,850 points
2,549 views
0 votes
1 answer

error: identifier expected but integer literal found.

Hi, You can resolve this error with a ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,850 points
3,323 views
0 votes
1 answer

Spark: How can I create temp views in a user-defined database instead of the default database?

You can try the below code: df.registerTempTable("airports") sqlContext.sql(" create ...

Jul 14, 2019 in Apache Spark by Ishan
2,848 views
0 votes
1 answer

How to remove the elements with a key present in any other RDD?

Hey, You can use the subtractByKey() function to ...

Jul 22, 2019 in Apache Spark by Gitika
• 65,850 points
2,466 views
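The behavior the question asks for can be sketched in plain Python, without Spark. The helper name `subtract_by_key` and the sample pairs are hypothetical; the sketch keeps every (key, value) pair from the first collection whose key does not appear in the second, which is the semantics of subtractByKey().

```python
def subtract_by_key(left, right):
    """Keep pairs from left whose key is absent from right."""
    # Collect the keys to exclude; the values on the right are ignored.
    right_keys = {key for key, _ in right}
    return [(key, value) for key, value in left if key not in right_keys]


left = [("a", 1), ("b", 2), ("c", 3)]
right = [("b", 99)]
print(subtract_by_key(left, right))  # prints [('a', 1), ('c', 3)]
```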
0 votes
1 answer

error: identifier expected but ']' found.

Hi, You can try this: remove the brackets from ...

Jul 3, 2019 in Apache Spark by Gitika
• 65,850 points
3,254 views
0 votes
1 answer

Copy file from local to hdfs from the spark job in yarn mode

Refer to the below code: import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem import ...

Jul 24, 2019 in Apache Spark by Yogi
2,340 views
0 votes
1 answer

How to add package com.databricks.spark.avro in Spark?

Start the Spark shell using the below line of ...

Jul 23, 2019 in Apache Spark by Ritu
2,386 views
0 votes
1 answer

How to save RDD in Apache Spark?

Hey, There are a few methods provided by the ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,850 points
2,401 views
+1 vote
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...

Aug 25, 2019 in Apache Spark by anonymous
• 130 points

closed Aug 26, 2019 by Omkar 870 views
+1 vote
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,850 points
1,686 views
0 votes
1 answer

How to start spark history server?

Hey, You can use this command to start ...

Jul 25, 2019 in Apache Spark by Gitika
• 65,850 points
2,085 views
+1 vote
1 answer

Error: value textfile is not a member of org.apache.spark.SparkContext

Hi, Regarding this error, you just need to change ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,850 points
2,897 views
0 votes
1 answer

How to concatenate Maps in Scala?

Hey, You can concatenate/join two Maps in more than ...

Jul 29, 2019 in Apache Spark by Gitika
• 65,850 points

edited Jul 29, 2019 by Gitika 1,828 views
0 votes
1 answer

Error : split value is not a member of org.apache.spark.sql.Row

spark.read.csv is used when loading into a ...

Jul 22, 2019 in Apache Spark by Firoz
2,121 views
0 votes
1 answer

What are these in Scala: _* and @_*

As is widely used, and has different ...

Jul 31, 2019 in Apache Spark by Turic
1,562 views
0 votes
1 answer

What is the difference between persist() and cache() in Apache Spark?

Hi, persist() allows the user to specify ...

Jul 3, 2019 in Apache Spark by Gitika
• 65,850 points
2,757 views