Trending questions in Apache Spark

+5 votes
11 answers

Concatenate columns in an Apache Spark DataFrame

It's late, but this is how you can ...READ MORE

Mar 21 in Apache Spark by anonymous
28,118 views
0 votes
11 answers

How to create a new column with a function in a Spark DataFrame?

val coder: (Int => String) = v ...READ MORE

Apr 4 in Apache Spark by anonymous

edited Apr 5 by Omkar
20,175 views
0 votes
1 answer

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as follows: var notFollowingList=List(9.8,7,6,3, ...READ MORE

Jun 5, 2018 in Apache Spark by Shubham
• 13,290 points
24,442 views
0 votes
6 answers

How to replace null values in Spark DataFrame?

Hi, I hope this will help for ...READ MORE

Feb 5 in Apache Spark by Srinivasreddy
• 140 points
18,305 views
0 votes
0 answers

Difference between RDD, DataFrame, and Dataset [closed]

4 days ago in Apache Spark by Rajesh pagadala

closed 3 days ago by Omkar
15 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...READ MORE

6 days ago in Apache Spark by ravikiran
• 4,560 points
17 views
0 votes
1 answer

How do I connect to a Hive metastore through a program in Spark SQL?

In Spark 2.0.+ it should look something ...READ MORE

Sep 5 in Apache Spark by ravikiran
• 4,560 points
21 views
+1 vote
1 answer

How to convert a JSON file to an Avro file and vice versa

Try including the package while starting the ...READ MORE

Aug 26 in Apache Spark by Karan
35 views
0 votes
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...READ MORE

Aug 24 in Apache Spark by anonymous
• 120 points

closed Aug 26 by Omkar
41 views
0 votes
1 answer

How to extract record from one RDD using another RDD

Hey, you can use "contains" filter to extract ...READ MORE

Aug 23 in Apache Spark by Karan
32 views
+1 vote
0 answers

Type mismatch error in Scala

import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.sql.hive.HiveContext import org.apache.spark.sql.functions.{col, ...READ MORE

Aug 16 in Apache Spark by anonymous
56 views
0 votes
1 answer

Primary keys in Apache Spark

import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...READ MORE

Aug 9 in Apache Spark by ravikiran
• 4,560 points
52 views
0 votes
1 answer

Spark: Can we add a column to a DataFrame?

Yes, we can add a column using withColumn with ...READ MORE

Aug 9 in Apache Spark by Shirish
39 views
0 votes
1 answer

Monitoring Spark application

Spark-submit jobs are also run from client/edge ...READ MORE

Aug 9 in Apache Spark by Umesh
29 views
0 votes
1 answer

How to read data from a text file in Spark?

Hey, you can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...READ MORE

Aug 6 in Apache Spark by Gitika
• 25,340 points
87 views
0 votes
1 answer

What is the Spark UI and how to monitor a Spark job?

Hey, Jobs: to view all the Spark jobs. Stages: ...READ MORE

Aug 6 in Apache Spark by Gitika
• 25,340 points
39 views
0 votes
1 answer

How to handle data shuffle in Spark

Hi, you can do it using map partition ...READ MORE

Aug 6 in Apache Spark by Gitika
• 25,340 points
35 views
0 votes
1 answer

How to start the Spark history server?

Hi, you can use this command to start ...READ MORE

Aug 6 in Apache Spark by Gitika
• 25,340 points
28 views
0 votes
1 answer

org.apache.spark.sql.AnalysisException: cannot resolve "`id`" given input columns

I have used a headerless CSV file ...READ MORE

Jul 13 in Apache Spark by Puneet
995 views
0 votes
1 answer

What is a paired RDD and how to create a paired RDD in Spark?

Hi, a paired RDD is a distributed collection of ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
58 views
0 votes
1 answer

How does the sortByKey() operation work in Spark?

Hey, sortByKey() is a transformation. It returns an RDD sorted ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
33 views
0 votes
1 answer

How does the foreach() operation work in Apache Spark?

Hi, the foreach() operation is an action. It does not ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
30 views
0 votes
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
35 views
0 votes
1 answer

How to launch a Spark application in cluster mode?

Hi, to launch a Spark application in cluster mode, ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
29 views
0 votes
1 answer

Can anyone explain the sparse vector in Spark?

Hey, a sparse vector is used for storing ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
34 views
0 votes
1 answer

How to create a paired RDD using the substring method in Spark?

Hi, if you have a file with id ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
26 views
0 votes
1 answer

Scala: save filtered data row by row using saveAsTextFile

Try this code, it worked for me: val ...READ MORE

Aug 2 in Apache Spark by Karan
25 views
0 votes
1 answer

In how many modes can Apache Spark run?

Hey, you can launch a Spark application in four ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
20 views
0 votes
1 answer

Join in RDD using keys

Suppose you have two dataset results( id, ...READ MORE

Aug 2 in Apache Spark by Trisha
23 views
0 votes
1 answer

What is Hive on Spark?

Hi, Hive contains significant support for Apache Spark, ...READ MORE

Aug 2 in Apache Spark by Gitika
• 25,340 points
17 views
0 votes
1 answer

How to work with Matrix Multiplication in Apache Spark?

Hey, you can follow the solution below for ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
92 views
0 votes
1 answer

Scala: Convert text file data into ORC format using data frame

Converting a text file to ORC: Using Spark, the ...READ MORE

Aug 1 in Apache Spark by Esha
43 views
0 votes
1 answer

Pyspark dataframe with random values

Hey @Esha, you can use this code. ...READ MORE

Aug 1 in Apache Spark by Zed
39 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...READ MORE

Aug 1 in Apache Spark by Karan
33 views
0 votes
1 answer

Spark + Hive connectivity

The problem is probably with the command. ...READ MORE

Aug 1 in Apache Spark by Rishni
23 views
0 votes
1 answer

What is the use of the App class in Scala?

Hi, Scala provides a helper class, called App, that ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
60 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, it already has SparkContext.union and it does know how to ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
49 views
0 votes
1 answer

How to use a uniform list in Scala?

Hey, the method List.fill() creates a list and ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
31 views
0 votes
1 answer

Spark error: stack overflow thrown when performing many unions

Hey, use SparkContext.union(...) instead to union many RDDs at once. You ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
37 views
0 votes
1 answer

How to access variables in s-string interpolation in Scala?

Hey, you can use the code below to access variables ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
25 views
0 votes
1 answer

How to reverse a Scala list?

Hi, this reverses the order of elements in ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
19 views
0 votes
1 answer

How is a shallow copy carried out in Scala?

Hey, Scala uses the method copy() to carry ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
19 views
0 votes
1 answer

How to create singleton classes in Scala?

Hey, Scala introduces a new object keyword, which is used ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
29 views
0 votes
0 answers

How to define SparkConf?

Can anyone explain how to define SparkConf? READ MORE

Aug 1 in Apache Spark by Danish
18 views
0 votes
1 answer

What are these in Scala: _* & @_*

As is widely used, and has different ...READ MORE

Jul 31 in Apache Spark by Turic
27 views
0 votes
1 answer

What are traits in Scala?

Hi, traits are basically Scala's workaround for the ...READ MORE

Jul 31 in Apache Spark by Gitika
• 25,340 points
25 views
0 votes
1 answer

Removing the header of a text file in a Spark RDD

1) First we loaded the data to ...READ MORE

Jul 31 in Apache Spark by Namitha
27 views
0 votes
1 answer

Scala: Loading a CSV file

Refer to the below command: val input_df = ...READ MORE

Jul 31 in Apache Spark by Emma
23 views