Trending questions in Apache Spark

0 votes
1 answer

Spark: java.sql.SQLException: No suitable driver

The missing driver is the JDBC one ...READ MORE

Jul 24, 2019 in Apache Spark by John
10,770 views
0 votes
1 answer

How to create multiple producers in Apache Kafka?

Hi @akhtar, to create multiple producers you have to ...READ MORE

Feb 6, 2020 in Apache Spark by MD
• 95,360 points
2,223 views
0 votes
0 answers

One Hot Encoding in Apache Spark

The following code that I wrote for ...READ MORE

Feb 11, 2020 in Apache Spark by Manish
• 120 points
1,948 views
+2 votes
1 answer

Type mismatch error in Scala

Hello, Your problem is here: val df_merge_final = df_merge .withColumn("version_key", ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
6,841 views
0 votes
1 answer

What is the difference between Spark Streaming and Spark Structured Streaming?

Hi @akhtar, generally, Spark Streaming is used for real-time ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,360 points
2,043 views
0 votes
1 answer

Caused by: java.lang.NumberFormatException: Empty String

Hi @akhtar, as we know, text files are in ...READ MORE

Jan 31, 2020 in Apache Spark by MD
• 95,360 points
1,729 views
0 votes
2 answers

Difference between createOrReplaceTempView and registerTempTable

I am pretty sure createOrReplaceTempView just replaced ...READ MORE

Sep 18, 2020 in Apache Spark by Nathan Mott
10,431 views
+1 vote
0 answers

How to create a list of RDDs (or an RDD of RDDs, if possible) from a single JavaRDD<List<Integers>> in Java?

Hi, I have the input RDD as a ...READ MORE

Jan 11, 2020 in Apache Spark by itsroops
• 130 points
2,103 views
0 votes
1 answer

How can I remove headers from a DataFrame?

You can use filter to do this. ...READ MORE

Feb 15, 2019 in Apache Spark by Aryan
16,273 views
0 votes
1 answer

Does Spark Streaming provide checkpointing?

Hi @akhtar, yes, Spark Streaming uses checkpointing. A checkpoint is ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,360 points
669 views
0 votes
1 answer

Cannot load file to spark: "org.apache.spark.sql.AnalysisException: Path does not exist"

Since the file is in HDFS, ...READ MORE

Jul 31, 2019 in Apache Spark by Tina
8,519 views
0 votes
0 answers

Not able to get output in Spark Streaming?

Hi everyone, I tried to count individual words ...READ MORE

Feb 4, 2020 in Apache Spark by akhtar
• 38,230 points
352 views
0 votes
1 answer

What are DStreams?

Hi @akhtar, DStreams are the basic abstraction that is ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,360 points
289 views
0 votes
1 answer

Does Spark SQL provide indexing to improve processing speed?

Hi @akhtar, there is no concept of indexing in ...READ MORE

Feb 4, 2020 in Apache Spark by MD
• 95,360 points
248 views
0 votes
1 answer

Cannot create directory /hive/xzxz/_temporary/0. Name node is in safe mode.

Hi @akhtar, here you are trying to save CSV ...READ MORE

Feb 3, 2020 in Apache Spark by MD
• 95,360 points
267 views
0 votes
0 answers

Error: Package: R-core-devel-3.6.0-1el7.x86_64 (epel) Requires: pcre2-devel

Hi, I am getting this error when I try ...READ MORE

Jan 31, 2020 in Apache Spark by Hasid
• 370 points
386 views
+1 vote
0 answers

How to access a Hive view using Spark2

We do not have access to Hive ...READ MORE

Dec 29, 2019 in Apache Spark by anonymous
• 130 points
1,259 views
0 votes
2 answers

How to execute a function in apache-scala?

Function definition: def test(): Unit = { var a = 10; var b = 20; var c = a + b } calling ...READ MORE

Aug 5, 2020 in Apache Spark by Ramkumar Ramasamy
313 views
+2 votes
1 answer

Spark code takes too much time to run on cluster

Hi @asif, please share with us the application ...READ MORE

Jan 22, 2020 in Apache Spark by Alexandru
• 510 points
495 views
0 votes
1 answer

What's the difference between 'filter' and 'where' in Spark SQL?

Both 'filter' and 'where' in Spark SQL ...READ MORE

May 23, 2018 in Apache Spark by nitinrawat895
• 11,380 points
26,125 views
0 votes
1 answer

Difference between cogroup and full outer join in Spark

Please go through the explanation below: Full ...READ MORE

Jul 14, 2019 in Apache Spark by Kiran
7,807 views
0 votes
1 answer

What is the use of App class in Scala?

Hi, Scala provides a helper class, called App, that ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,850 points
7,019 views
0 votes
1 answer

PySpark DataFrame with random values

Hey @Esha, you can use this code. ...READ MORE

Aug 1, 2019 in Apache Spark by Zed
6,867 views
0 votes
1 answer

What is a paired RDD and how to create a paired RDD in Spark?

Hi, a paired RDD is a distributed collection of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,850 points
6,406 views
0 votes
1 answer

What is an action in Spark?

Hi, actions are RDD operations that return a value ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,850 points
7,684 views
0 votes
1 answer

Join in RDD using keys

Suppose you have two datasets: results(id, ...READ MORE

Aug 2, 2019 in Apache Spark by Trisha
6,232 views
+1 vote
2 answers

Spark: Can we add a column to a DataFrame?

Yes, we can add columns to the ...READ MORE

Oct 24, 2019 in Apache Spark by Siva
• 160 points
4,005 views
0 votes
1 answer

Spark, Scala: Load custom delimited file

You can load a DAT file into ...READ MORE

Jul 16, 2019 in Apache Spark by Shri
6,740 views
+1 vote
2 answers

How do I get the number of columns in each line from a delimited file?

Instead of splitting on '\n', you should ...READ MORE

Aug 7, 2019 in Apache Spark by ashish
3,562 views
+1 vote
1 answer

"Cannot resolve" error in Spark when filtering records with two where conditions

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points

edited Dec 13, 2019 by Alexandru
1,537 views
0 votes
1 answer

Spark: Error while instantiating "org.apache.spark.sql.hive.HiveSessionState"

Seems like you have not started the ...READ MORE

Jul 25, 2019 in Apache Spark by Rohit
5,986 views
+1 vote
1 answer

How to convert a JSON file structure with values in single quotes to quoteless?

You can do this by turning off ...READ MORE

Oct 4, 2019 in Apache Spark by Jisha
2,881 views
0 votes
1 answer

How to work with Matrix Multiplication in Apache Spark?

Hey, you can follow the solution below for ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,850 points
5,618 views
0 votes
1 answer

What does the command df.registerTempTable() do?

df.registerTempTable("airports") This command is used to register ...READ MORE

Jul 14, 2019 in Apache Spark by James
6,103 views
0 votes
1 answer

Removing the header of a text file in a Spark RDD

1) First we loaded the data to ...READ MORE

Jul 31, 2019 in Apache Spark by Namitha
5,161 views
0 votes
1 answer

Date formats: how to cast a string to a date?

Try this, it should work: > from pyspark.sql.functions ...READ MORE

Jul 29, 2019 in Apache Spark by Niall
5,208 views
0 votes
1 answer

Spark Error: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

There seems to be a problem with ...READ MORE

May 24, 2019 in Apache Spark by Jishan
7,827 views
+1 vote
1 answer

Spark: java.io.FileNotFoundException

Hello, from the error I gather that the ...READ MORE

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
2,235 views
0 votes
1 answer

How does the sortByKey() operation work in Spark?

Hey, sortByKey() is a transformation. It returns an RDD sorted ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,850 points
4,761 views
0 votes
1 answer

How to create an RDD from an existing RDD in Scala?

scala> val rdd1 = sc.parallelize(List(1,2,3,4,5)) - Creating ...READ MORE

Feb 29, 2020 in Apache Spark by anonymous
671 views
0 votes
1 answer

How do I connect to a Hive metastore through a program in Spark SQL?

In Spark 2.0+ it should look something ...READ MORE

Sep 5, 2019 in Apache Spark by ravikiran
• 4,620 points
3,094 views
+1 vote
2 answers

sparkstream.textfilstreaming(localpathdirectory). I am getting empty results

Hey @c.kothamasu, you should copy your file to ...READ MORE

Nov 7, 2019 in Apache Spark by Manas
234 views
+1 vote
1 answer

Primary keys in Apache Spark

import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...READ MORE

Aug 9, 2019 in Apache Spark by ravikiran
• 4,620 points
4,043 views
0 votes
1 answer

Spark error: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable.

Give read-write permissions to the C:\tmp\hive folder. cd to the winutils bin folder ...READ MORE

Jul 11, 2019 in Apache Spark by Rajiv
5,326 views
0 votes
1 answer

How does the foreach() operation work in Apache Spark?

Hi, the foreach() operation is an action. It does not ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,850 points
4,323 views
0 votes
1 answer

"main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream

1. We will check whether master and ...READ MORE

Jul 29, 2019 in Apache Spark by Yogi
4,433 views
+1 vote
1 answer

How to read data from a text file in Spark?

Hey, you can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,850 points
3,869 views
0 votes
1 answer

How to append a list in Scala?

Hey, for this purpose, we use the single ...READ MORE

Jul 26, 2019 in Apache Spark by Gitika
• 65,850 points
4,277 views