Most answered questions in Apache Spark

0 votes
1 answer

Does Spark Streaming provide checkpointing?

Hi @akhtar, yes, Spark Streaming uses checkpointing. Checkpoint is ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,440 points
1,192 views
0 votes
1 answer

Does Spark SQL provide indexing to improve processing speed?

Hi @akhtar, there is no concept of indexing in ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,440 points
617 views
0 votes
1 answer

What is the difference between Spark Streaming and Spark Structured Streaming?

Hi @akhtar, generally, Spark Streaming is used for real-time ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,440 points
3,481 views
0 votes
1 answer

What are DStreams?

Hi @akhtar, DStreams are the basic abstraction that is ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,440 points
765 views
0 votes
1 answer

Cannot create directory /hive/xzxz/_temporary/0. Name node is in safe mode.

Hi @akhtar, here you are trying to save CSV ...READ MORE

Feb 3, 2020 in Apache Spark by MD
• 95,440 points
627 views
+1 vote
1 answer

is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [51, 53, 10, 10]

Hi @akhtar, here you are trying to read a ...READ MORE

Feb 3, 2020 in Apache Spark by MD
• 95,440 points
17,407 views
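
The byte values quoted in the Parquet error above can be decoded to see what the reader expected and what it actually found at the end of the file. This is a minimal plain-Python sketch (no Spark required); the variable names are illustrative only.

```python
# Byte values copied from the error message in the question title.
expected_tail = bytes([80, 65, 82, 49])
found_tail = bytes([51, 53, 10, 10])

# Parquet files end with the 4-byte ASCII marker "PAR1".
print(expected_tail.decode("ascii"))  # PAR1

# The bytes actually found are printable digits followed by two newlines,
# which suggests the file is plain text rather than Parquet.
print(repr(found_tail.decode("ascii")))  # '35\n\n'
```

If the tail decodes to ordinary text like this, the file was most likely written in a text format (for example CSV) and should be read with the matching reader instead.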
0 votes
1 answer

What is PageRank in GraphX?

Hi @akhtar, the PageRank algorithm outputs a probability distribution ...READ MORE

Jul 22, 2020 in Apache Spark by MD
• 95,440 points
969 views
0 votes
1 answer

env : R : No such file or directory

Hi @akhtar, I also got this error. I am able to ...READ MORE

Jul 22, 2020 in Apache Spark by MD
• 95,440 points
1,564 views
0 votes
1 answer

Not enough space to cache rdd_80_1 in memory!

Hi @akhtar, currently you are running with the default ...READ MORE

Jul 22, 2020 in Apache Spark by MD
• 95,440 points
2,403 views
0 votes
1 answer

Caused by: java.lang.NumberFormatException: Empty String

Hi @akhtar, as we know, text files are in ...READ MORE

Jan 31, 2020 in Apache Spark by MD
• 95,440 points
4,300 views
0 votes
1 answer

Difference between the map() and mapPartitions() functions in Spark

Hi @akhtar, both map() and mapPartitions() are the ...READ MORE

Jan 29, 2020 in Apache Spark by MD
• 95,440 points
6,150 views
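
As a rough illustration of the map() versus mapPartitions() distinction asked about above, here is a plain-Python model, not the Spark API. An RDD is modelled as a list of partitions: `map_model` calls the function once per element, while `map_partitions_model` calls it once per partition and hands it an iterator over that partition. All names in this sketch are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

Partitions = List[List[int]]

def map_model(parts: Partitions, f: Callable[[int], int]) -> Partitions:
    # The function is invoked once for every element.
    return [[f(x) for x in part] for part in parts]

def map_partitions_model(
    parts: Partitions, f: Callable[[Iterator[int]], Iterable[int]]
) -> Partitions:
    # The function is invoked once per partition with an iterator over it.
    return [list(f(iter(part))) for part in parts]

# Counters to show how often each user function is called.
calls = {"per_element": 0, "per_partition": 0}

def double(x: int) -> int:
    calls["per_element"] += 1
    return x * 2

def double_all(it: Iterator[int]) -> Iterator[int]:
    calls["per_partition"] += 1
    for x in it:
        yield x * 2

data = [[1, 2, 3], [4, 5]]  # two partitions, five elements
mapped = map_model(data, double)
mapped_parts = map_partitions_model(data, double_all)

print(mapped == mapped_parts)  # True
print(calls)  # {'per_element': 5, 'per_partition': 2}
```

Both approaches produce the same result here; the difference is how many times the user function runs, which matters when each call carries expensive setup such as opening a connection.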
+1 vote
1 answer

How to assign a column in Spark Dataframe (PySpark) as a Primary Key?

Spark does not have any concept of ...READ MORE

Jan 12, 2020 in Apache Spark by Sirish
• 160 points
12,433 views
+2 votes
1 answer

Spark code takes too much time to run on cluster

Hi @asif, please share with us the application ...READ MORE

Jan 22, 2020 in Apache Spark by Alexandru
• 510 points
1,011 views
+1 vote
1 answer

Is there an efficient way of dealing with null values when using the concat function in pyspark.sql version 2.3.4?

When you concatenate any string with a ...READ MORE

Nov 6, 2019 in Apache Spark by Rishi
38,408 views
+1 vote
1 answer

How to convert a JSON file structure with values in single quotes to quoteless?

You can do this by turning off ...READ MORE

Oct 4, 2019 in Apache Spark by Jisha
3,997 views
+1 vote
1 answer

"Cannot resolve" error in Spark when filtering records with two where conditions

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points

edited Dec 13, 2019 by Alexandru
2,347 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...READ MORE

Sep 11, 2019 in Apache Spark by ravikiran
• 4,620 points
1,168 views
0 votes
1 answer

How do I connect to a Hive metastore through a program in Spark SQL?

In Spark 2.0+ it should look something ...READ MORE

Sep 5, 2019 in Apache Spark by ravikiran
• 4,620 points
4,117 views
+1 vote
1 answer

How to convert a JSON file to an Avro file and vice versa

Try including the package while starting the ...READ MORE

Aug 26, 2019 in Apache Spark by Karan
3,247 views
+1 vote
1 answer

How to extract records from one RDD using another RDD

Hey, you can use the "contains" filter to extract ...READ MORE

Aug 23, 2019 in Apache Spark by Karan
2,179 views
+2 votes
1 answer

Type mismatch error in Scala

Hello, Your problem is here: val df_merge_final = df_merge .withColumn("version_key", ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
11,795 views
0 votes
1 answer

Monitoring Spark application

Spark-submit jobs are also run from client/edge ...READ MORE

Aug 9, 2019 in Apache Spark by Umesh
860 views
+1 vote
1 answer

Primary keys in Apache Spark

import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...READ MORE

Aug 9, 2019 in Apache Spark by ravikiran
• 4,620 points
5,736 views
+1 vote
1 answer

How to read data from a text file in Spark?

Hey, You can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,910 points
4,746 views
0 votes
1 answer

How to start the Spark history server?

Hi, You can use this command to start ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,910 points
589 views
0 votes
1 answer

What is the Spark UI and how to monitor a Spark job?

Hey, Jobs: to view all the Spark jobs. Stages: ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,910 points
2,918 views
0 votes
1 answer

How to handle data shuffle in Spark

Hi, You can do it using map partition ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,910 points
1,206 views
0 votes
1 answer

How does the foreach() operation work in Apache Spark?

Hi, foreach() operation is an action. It does not ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
6,065 views
0 votes
1 answer

How does the sortByKey() operation work in Spark?

Hey, sortByKey() is a transformation. It returns an RDD sorted ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
5,865 views
0 votes
1 answer

In how many modes can Apache Spark run?

Hey, you can launch a Spark application in four ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
1,363 views
0 votes
1 answer

How to launch a Spark application in cluster mode?

Hi, to launch a Spark application in cluster mode, ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
4,756 views
0 votes
1 answer

How to create a paired RDD using the substring method in Spark?

Hi, If you have a file with id ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
2,375 views
0 votes
1 answer

What is a paired RDD and how to create one in Spark?

Hi, a paired RDD is a distributed collection of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
9,123 views
+1 vote
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
2,580 views
0 votes
1 answer

Join in RDD using keys

Suppose you have two datasets, results(id, ...READ MORE

Aug 2, 2019 in Apache Spark by Trisha
8,007 views
0 votes
1 answer

Scala: save filtered data row by row using saveAsTextFile

Try this code, it worked for me: val ...READ MORE

Aug 2, 2019 in Apache Spark by Karan
1,634 views
0 votes
1 answer

What is Hive on Spark?

Hi, Hive contains significant support for Apache Spark, ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
499 views
0 votes
1 answer

Can anyone explain the sparse vector in Spark?

Hey, A sparse vector is used for storing ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,910 points
5,355 views
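
To make the sparse-vector idea concrete: a sparse vector is commonly described by its total size plus parallel lists holding the indices and values of its non-zero entries. The sketch below is plain Python, not Spark's MLlib classes, and `to_dense` is a hypothetical helper written only for illustration.

```python
from typing import List, Sequence

def to_dense(size: int, indices: Sequence[int], values: Sequence[float]) -> List[float]:
    """Expand a (size, indices, values) sparse description into a dense list."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense

# A 5-element vector with non-zero entries at positions 1 and 3.
dense = to_dense(5, [1, 3], [2.0, 4.0])
print(dense)  # [0.0, 2.0, 0.0, 4.0, 0.0]
```

Only two index/value pairs are stored instead of five numbers, which is where the memory saving comes from when most entries are zero.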
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...READ MORE

Aug 1, 2019 in Apache Spark by Karan
4,468 views
0 votes
1 answer

PySpark DataFrame with random values

Hey @Esha, you can use this code. ...READ MORE

Aug 1, 2019 in Apache Spark by Zed
8,562 views
0 votes
1 answer

Spark + Hive connectivity

The problem is probably with the command. ...READ MORE

Aug 1, 2019 in Apache Spark by Rishni
1,944 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting a text file to ORC: Using Spark, the ...READ MORE

Aug 1, 2019 in Apache Spark by Esha
3,386 views
0 votes
1 answer

How to reverse a Scala list?

Hi, This reverses the order of elements in ...READ MORE

Aug 1, 2019 in Apache Spark by Gitika
• 65,910 points
886 views
0 votes
1 answer

How to use uniform list in Scala?

Hey, The method List.fill() creates a list and ...READ MORE

Aug 1, 2019 in Apache Spark by Gitika
• 65,910 points
1,278 views
0 votes
1 answer

How is a shallow copy carried out in Scala?

Hey, Scala uses the method copy() to carry ...READ MORE

Aug 1, 2019 in Apache Spark by Gitika
• 65,910 points
521 views
0 votes
1 answer

How to access variables in s-string interpolation in Scala?

Hey, you can use the code below to access variables ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
896 views
–2 votes
1 answer

What is the difference between Java’s “If..Else” and Scala’s “If..Else”? [closed]

Hey, Java’s “If..Else”: In Java, “If..Else” is a statement, ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
1,646 views
0 votes
1 answer

How to work with Matrix Multiplication in Apache Spark?

Hey, you can follow the solution below for ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,910 points
7,349 views