Trending questions in Apache Spark

+5 votes
11 answers

Concatenate columns in Apache Spark DataFrame

It's late, but this is how you can ...

Mar 21, 2019 in Apache Spark by anonymous
45,147 views
0 votes
11 answers

How to create a new column with a function in Spark DataFrame?

val coder: (Int => String) = v ...

Apr 4, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar
41,609 views
0 votes
1 answer

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as follows: var notFollowingList=List(9.8,7,6,3, ...

Jun 5, 2018 in Apache Spark by Shubham
• 13,380 points
47,654 views
0 votes
6 answers

How to replace null values in Spark DataFrame?

Hi, I hope this will help for ...

Feb 5, 2019 in Apache Spark by Srinivasreddy
• 140 points
35,112 views
0 votes
4 answers

How to change the SparkSession configuration in PySpark?

You can dynamically load properties. First create ...

Dec 10, 2018 in Apache Spark by Vini
24,713 views
0 votes
0 answers

Unable to run a select query with selected columns on a temp view registered in a Spark application

Hi, I am using Hive JDBC to load ...

2 days ago in Apache Spark by sid
• 120 points
35 views
0 votes
5 answers

groupByKey vs reduceByKey in Apache Spark.

reduceByKey is the best for production.

Mar 3, 2019 in Apache Spark by anonymous
20,024 views
0 votes
0 answers

How to parse an S3 XML file to find tags using Apache Spark

How can one parse an S3 XML ...

Mar 18 in Apache Spark by anonymous
• 120 points
43 views
0 votes
2 answers

java.lang.StringIndexOutOfBoundsException: String index out of range: 1

When using the Java substring() method, a ...

Mar 13 in Apache Spark by evanbung
• 180 points
159 views
0 votes
0 answers

One Hot Encoding in Apache Spark

The following code that I wrote for ...

Feb 11 in Apache Spark by Manish
• 120 points
114 views
0 votes
1 answer

is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [51, 53, 10, 10]

Hi @akhtar, here you are trying to read a ...

Feb 3 in Apache Spark by MD
• 7,150 points
417 views
0 votes
1 answer

How to create multiple producers in Apache Kafka?

Hi @akhtar, to create multiple producers you have to ...

Feb 6 in Apache Spark by MD
• 7,150 points
91 views
0 votes
1 answer

What is the difference between Spark Streaming and Spark Structured Streaming?

Hi @akhtar, generally, Spark Streaming is used for real-time ...

Feb 4 in Apache Spark by MD
• 7,150 points
114 views
0 votes
1 answer

Does Spark Streaming provide checkpointing?

Hi @akhtar, yes, Spark Streaming uses checkpointing. Checkpoint is ...

Feb 4 in Apache Spark by MD
• 7,150 points
70 views
0 votes
1 answer

Does Spark SQL provide indexing to improve processing speed?

Hi @akhtar, there is no concept of indexing in ...

Feb 4 in Apache Spark by MD
• 7,150 points
70 views
0 votes
1 answer

What are DStreams?

Hi @akhtar, DStreams are the basic abstraction that is ...

Feb 4 in Apache Spark by MD
• 7,150 points
27 views
0 votes
0 answers

Not able to get output in Spark Streaming?

Hi everyone, I tried to count individual words ...

Feb 4 in Apache Spark by akhtar
• 3,600 points
48 views
0 votes
1 answer

Cannot create directory /hive/xzxz/_temporary/0. Name node is in safe mode.

Hi @akhtar, here you are trying to save CSV ...

Feb 3 in Apache Spark by MD
• 7,150 points
33 views
0 votes
1 answer

Caused by: java.lang.NumberFormatException: Empty String

Hi @akhtar, as we know, text files are in ...

Jan 31 in Apache Spark by MD
• 7,150 points
76 views
0 votes
0 answers

env : R : No such file or directory

Hi, I tried to set SparkR. But I ...

Jan 31 in Apache Spark by Hasid
• 330 points
60 views
0 votes
0 answers

Error: Package: R-core-devel-3.6.0-1el7.x86_64 (epel) Requires: pcre2-devel

Hi, I am getting this error when try ...

Jan 31 in Apache Spark by Hasid
• 330 points
56 views
0 votes
0 answers

What is PageRank in GraphX?

Hi, I am new to Spark. Can somebody ...

Jan 31 in Apache Spark by akhtar
• 3,600 points
46 views
0 votes
1 answer

Difference between map() and mapPartitions() functions in Spark?

Hi @akhtar, both map() and mapPartitions() are the ...

Jan 29 in Apache Spark by MD
• 7,150 points
211 views
0 votes
0 answers

Not enough space to cache rdd_80_1 in memory!

Hi everyone, I'm new to Spark. I am working ...

Jan 29 in Apache Spark by akhtar
• 3,600 points
48 views
0 votes
7 answers

How to print the contents of RDD in Apache Spark?

Save it to a text file: line.saveAsTextFile("alicia.txt") Print contains ...

Dec 10, 2018 in Apache Spark by Akshay
20,749 views
+2 votes
4 answers

Use length function in substring in Spark

You can use the function expr val data ...

May 3, 2018 in Apache Spark by kurt_cobain
• 9,310 points
22,125 views
+1 vote
1 answer

Reading a text file through Spark DataFrame

Try this: val df = sc.textFile("HDFS://nameservice1/user/edureka_168049/Structure_IT/samplefile.txt") df.collect() val df = ...

Jul 24, 2019 in Apache Spark by Suri
7,729 views
+1 vote
1 answer

How to assign a column in Spark DataFrame (PySpark) as a primary key?

Spark does not have any concept of ...

Jan 12 in Apache Spark by Sirish
• 160 points
344 views
+2 votes
1 answer

Spark code takes too much time to run on cluster

Hi @asif, please share with us the application ...

Jan 21 in Apache Spark by Alexandru
• 510 points
90 views
+1 vote
3 answers

What is the difference between RDD and DataFrames in Apache Spark?

Comparison between Spark RDD vs DataFrame 1. Release ...

Aug 27, 2018 in Apache Spark by shams
• 3,580 points
22,404 views
0 votes
1 answer

Spark: Dataframe vs Dataset

Recently, there are two new data abstractions ...

Jul 29, 2019 in Apache Spark by Jackie
7,303 views
+1 vote
0 answers

How to access a Hive view using spark2

We do not have access to Hive ...

Dec 29, 2019 in Apache Spark by anonymous
• 130 points
161 views
0 votes
2 answers

map() vs flatMap() in Spark

Spark map function expresses a one-to-one transformation. ...

Jun 17, 2019 in Apache Spark by vishal
• 160 points
10,620 views
+1 vote
1 answer

Is there an efficient way of dealing with null values in the concat functionality of pyspark.sql version 2.3.4?

When you concatenate any string with a ...

Nov 6, 2019 in Apache Spark by Rishi
1,651 views
0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve "`id`" given input columns

I have used a header-less CSV file ...

Jul 13, 2019 in Apache Spark by Puneet
6,222 views
+1 vote
1 answer

"Cannot resolve" error in Spark when filtering records with two where conditions

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points

edited Dec 13, 2019 by Alexandru
252 views
+1 vote
2 answers

sparkstream.textfilstreaming(localpathdirectory). I am getting empty results

Hey @c.kothamasu, you should copy your file to ...

Nov 7, 2019 in Apache Spark by Manas
102 views
+2 votes
1 answer

Type mismatch error in Scala

Hello, your problem is here: val df_merge_final = df_merge .withColumn("version_key", ...

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
783 views
0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve given input columns

The string Productivity has to be enclosed between single ...

Jul 10, 2019 in Apache Spark by Tina
4,962 views
0 votes
1 answer

How to create an RDD from an existing RDD in Scala?

scala> val rdd1 = sc.parallelize(List(1,2,3,4,5)) - Creating ...

Feb 28 in Apache Spark by anonymous
69 views
+1 vote
1 answer

Spark: java.io.FileNotFoundException

Hello, from the error I gather that the ...

Dec 13, 2019 in Apache Spark by Alexandru
• 510 points
706 views
+1 vote
1 answer

How to convert a JSON file structure with values in single quotes to quoteless?

You can do this by turning off ...

Oct 4, 2019 in Apache Spark by Jisha
388 views
+1 vote
2 answers

What is SparkContext?

SparkContext sets up internal services and establishes ...

Dec 5, 2019 in Apache Spark by anonymous
639 views
+1 vote
2 answers

Spark: Can we add a column to a DataFrame?

Yes, we can add columns to the ...

Oct 24, 2019 in Apache Spark by Siva
• 160 points
315 views
0 votes
1 answer

Cannot load file to spark: "org.apache.spark.sql.AnalysisException: Path does not exist"

Since the file is in HDFS, ...

Jul 31, 2019 in Apache Spark by Tina
2,386 views
0 votes
1 answer

How do I connect to a Hive metastore through a program in Spark SQL?

In Spark 2.0+ it should look something ...

Sep 5, 2019 in Apache Spark by ravikiran
• 4,600 points
682 views