Trending questions in Apache Spark

+5 votes
11 answers

Concatenate columns in Apache Spark DataFrame

It's late, but this is how you can ...

Mar 21, 2019 in Apache Spark by anonymous
38,929 views
0 votes
11 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...

Apr 4, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar
33,267 views
0 votes
1 answer

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as follows: var notFollowingList=List(9.8,7,6,3, ...

Jun 5, 2018 in Apache Spark by Shubham
• 13,370 points
38,550 views
0 votes
6 answers

How to replace null values in Spark DataFrame?

Hi, I hope this will help for ...

Feb 5, 2019 in Apache Spark by Srinivasreddy
• 140 points
28,749 views
+1 vote
1 answer

How to assign a column in Spark Dataframe (PySpark) as a Primary Key?

Spark does not have any concept of ...

Jan 12 in Apache Spark by Sirish
• 160 points
80 views
+2 votes
0 answers

Spark code takes too much time to run on cluster

I have written a Spark application. My ...

Jan 3 in Apache Spark by asif
• 140 points
38 views
+1 vote
0 answers

How to access Hive view using spark2

We do not have access to Hive ...

Dec 29, 2019 in Apache Spark by anonymous
• 130 points
43 views
0 votes
5 answers

groupByKey vs reduceByKey in Apache Spark.

reduceByKey is generally the better choice for production, because it combines values within each partition before shuffling.

Mar 3, 2019 in Apache Spark by anonymous
15,596 views
0 votes
4 answers

How to change the SparkSession configuration in PySpark?

You can dynamically load properties. First create ...

Dec 10, 2018 in Apache Spark by Vini
19,629 views
+1 vote
2 answers

sparkstream.textfilstreaming(localpathdirectory). I am getting empty results

Hey @c.kothamasu, you should copy your file to ...

Nov 7, 2019 in Apache Spark by Manas
89 views
+1 vote
1 answer

"Cannot resolve" error in Spark when filtering records with two where conditions

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...

Dec 13, 2019 in Apache Spark by Alexandru
• 450 points

edited Dec 13, 2019 by Alexandru
120 views
+1 vote
1 answer

Reading a text file through a Spark DataFrame

Try this: val df = sc.textFile("HDFS://nameservice1/user/edureka_168049/Structure_IT/samplefile.txt") df.collect() val df = ...

Jul 24, 2019 in Apache Spark by Suri
4,602 views
+1 vote
3 answers

What is the difference between RDDs and DataFrames in Apache Spark?

Comparison between Spark RDD vs DataFrame 1. Release ...

Aug 27, 2018 in Apache Spark by shams
• 3,580 points
19,352 views
+2 votes
4 answers

Use length function in substring in Spark

You can use the function expr val data ...

May 3, 2018 in Apache Spark by kurt_cobain
• 9,290 points
18,470 views
+2 votes
1 answer

Type mismatch error in Scala

Hello, your problem is here: val df_merge_final = df_merge .withColumn("version_key", ...

Dec 13, 2019 in Apache Spark by Alexandru
• 450 points
487 views
0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve "`id`" given input columns

I have used a header-less CSV file ...

Jul 13, 2019 in Apache Spark by Puneet
4,368 views
+1 vote
1 answer

Spark: java.io.FileNotFoundException

Hello, from the error I get that the ...

Dec 13, 2019 in Apache Spark by Alexandru
• 450 points
587 views
+1 vote
1 answer

How to convert a JSON file structure with values in single quotes to quoteless?

You can do this by turning off ...

Oct 4, 2019 in Apache Spark by Jisha
233 views
0 votes
7 answers

How to print the contents of RDD in Apache Spark?

Save it to a text file: line.saveAsTextFile("alicia.txt") Print contains ...

Dec 10, 2018 in Apache Spark by Akshay
16,013 views
+1 vote
2 answers

What is SparkContext?

SparkContext sets up internal services and establishes ...

Dec 5, 2019 in Apache Spark by anonymous
398 views
0 votes
1 answer

Spark: Dataframe vs Dataset

Recently, there are two new data abstractions ...

Jul 29, 2019 in Apache Spark by Jackie
2,547 views
0 votes
2 answers

map() vs flatMap() in Spark

Spark map function expresses a one-to-one transformation. ...

Jun 17, 2019 in Apache Spark by vishal
• 160 points
6,454 views
+1 vote
2 answers

Spark: Can we add a column to a DataFrame?

Yes, we can add columns to the ...

Oct 24, 2019 in Apache Spark by Siva
• 160 points
93 views
+1 vote
1 answer

Need to load 40 GB of data to Elasticsearch using Spark

Did you find any documents or example ...

Nov 5, 2019 in Apache Spark by Begum
202 views
+1 vote
0 answers

Difference between RDD, DataFrame, and Dataset [closed]

Sep 12, 2019 in Apache Spark by Rajesh pagadala

closed Sep 13, 2019 by Omkar
137 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...

Sep 11, 2019 in Apache Spark by ravikiran
• 4,600 points
106 views
0 votes
1 answer

How do I connect to a Hive metastore through a program in Spark SQL?

In Spark 2.0.+ it should look something ...

Sep 5, 2019 in Apache Spark by ravikiran
• 4,600 points
338 views
0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve given input columns

The string Productivity has to be enclosed between single ...

Jul 10, 2019 in Apache Spark by Tina
2,567 views
+1 vote
1 answer

_spark_metadata/0 doesn't exist while Compacting batch 9 Structured streaming error

Please check https://kb.databricks.com/streaming/file-sink-stre ...

Nov 20, 2019 in Apache Spark by anonymous
318 views
+1 vote
1 answer

How to convert a JSON file to an Avro file and vice versa

Try including the package while starting the ...

Aug 26, 2019 in Apache Spark by Karan
307 views
0 votes
1 answer

Cannot load file to spark: "org.apache.spark.sql.AnalysisException: Path does not exist"

Since the file is in HDFS so ...

Jul 31, 2019 in Apache Spark by Tina
1,363 views
0 votes
1 answer

How to work with Matrix Multiplication in Apache Spark?

Hey, you can follow the solution below for ...

Jul 31, 2019 in Apache Spark by Gitika
• 25,440 points
1,326 views
+1 vote
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...

Aug 24, 2019 in Apache Spark by anonymous
• 130 points

closed Aug 26, 2019 by Omkar
115 views
0 votes
1 answer

PySpark DataFrame with random values

Hey @Esha, you can use this code. ...

Aug 1, 2019 in Apache Spark by Zed
1,094 views
+1 vote
1 answer

How to read data from a text file in Spark?

Hey, you can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...

Aug 6, 2019 in Apache Spark by Gitika
• 25,440 points
817 views
+1 vote
1 answer

How to extract record from one RDD using another RDD

Hey, you can use the "contains" filter to extract ...

Aug 23, 2019 in Apache Spark by Karan
98 views
0 votes
1 answer

What is a paired RDD and how to create a paired RDD in Spark?

Hi, a paired RDD is a distributed collection of ...

Aug 2, 2019 in Apache Spark by Gitika
• 25,440 points
893 views
0 votes
1 answer

Difference between cogroup and full outer join in Spark

Please go through the explanation below: Full ...

Jul 13, 2019 in Apache Spark by Kiran
1,719 views
0 votes
1 answer

Spark: java.sql.SQLException: No suitable driver

The missing driver is the JDBC one ...

Jul 24, 2019 in Apache Spark by John
1,061 views
+1 vote
1 answer

Primary keys in Apache Spark

import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...

Aug 9, 2019 in Apache Spark by ravikiran
• 4,600 points
303 views
0 votes
1 answer

Spark: Error while instantiating "org.apache.spark.sql.hive.HiveSessionState"

Seems like you have not started the ...

Jul 25, 2019 in Apache Spark by Rohit
890 views
0 votes
1 answer

How does the sortByKey() operation work in Spark?

Hey, sortByKey() is a transformation. It returns an RDD sorted ...

Aug 2, 2019 in Apache Spark by Gitika
• 25,440 points
501 views
0 votes
1 answer

What is the Spark UI and how to monitor a Spark job?

Hey, Jobs: to view all the Spark jobs; Stages: ...

Aug 6, 2019 in Apache Spark by Gitika
• 25,440 points
311 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting a text file to ORC: Using Spark, the ...

Aug 1, 2019 in Apache Spark by Esha
434 views
0 votes
1 answer

Read multiple XML files in Spark

You can do this using globbing. See ...

Jul 25, 2019 in Apache Spark by Jack
754 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...

Aug 1, 2019 in Apache Spark by Karan
442 views