Most voted questions in Apache Spark

+5 votes
11 answers

Concatenate columns in Apache Spark DataFrame

It's late, but this is how you can ...

Mar 21 in Apache Spark by anonymous
30,854 views
+2 votes
4 answers

Use length function in substring in Spark

You can use the function expr: val data ...

May 3, 2018 in Apache Spark by kurt_cobain
• 9,240 points
14,065 views
+1 vote
1 answer

How to convert a JSON file to an Avro file and vice versa

Try including the package while starting the ...

Aug 26 in Apache Spark by Karan
62 views
+1 vote
0 answers

Type mismatch error in Scala

import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.sql.hive.HiveContext import org.apache.spark.sql.functions.{col, ...

Aug 16 in Apache Spark by anonymous
122 views
+1 vote
1 answer

How to install Scala Build Tool (SBT) on Ubuntu?

Hey, to install SBT on Ubuntu, first you need ...

Jul 23 in Apache Spark by Gitika
• 25,340 points
121 views
+1 vote
1 answer

Facing out-of-memory errors in Spark driver

I am guessing that the configuration set ...

Feb 22 in Apache Spark by Rishab
38 views
+1 vote
1 answer

Spark interview

Preparing for an interview? We have something ...

Feb 7 in Apache Spark by Edureka
• 1,280 points
141 views
+1 vote
3 answers

What is the difference between RDDs and DataFrames in Apache Spark?

Comparison of Spark RDD vs DataFrame: 1. Release ...

Aug 27, 2018 in Apache Spark by shams
• 3,580 points
15,654 views
+1 vote
1 answer

Getting null values in Spark DataFrame while reading data from HBase

Can you share the screenshots for the ...

Jul 31, 2018 in Apache Spark by kurt_cobain
• 9,240 points
411 views
+1 vote
3 answers

Which cluster type should I choose for Spark?

In my opinion, start with a standalone ...

Jun 27, 2018 in Apache Spark by nitinrawat895
• 10,690 points
141 views
+1 vote
2 answers

Apache Spark vs Apache Spark 2

Spark 2 doesn't differ much architecture-wise from ...

Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,240 points
3,586 views
+1 vote
2 answers

Hadoop 3 compatibility with older versions of Hive, Pig, Sqoop and Spark

Hadoop 3 is not widely used in ...

Apr 20, 2018 in Apache Spark by kurt_cobain
• 9,240 points
1,843 views
0 votes
1 answer

How to convert a JSON file structure with values in single quotes to quoteless?

You can do this by turning off ...

Oct 4 in Apache Spark by Jisha
62 views
0 votes
0 answers

"Cannot resolve" error in Spark when filtering records with two where conditions

Spark 1.6, Scala, Maven: I have created a ...

Sep 30 in Apache Spark by anonymous
• 120 points
27 views
0 votes
0 answers

Difference between RDD, DataFrame, and Dataset [closed]

Sep 12 in Apache Spark by Rajesh pagadala

closed Sep 13 by Omkar
54 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...

Sep 11 in Apache Spark by ravikiran
• 4,560 points
47 views
0 votes
1 answer

How do I connect to a Hive metastore through a program in Spark SQL?

In Spark 2.0+ it should look something ...

Sep 5 in Apache Spark by ravikiran
• 4,560 points
53 views
0 votes
0 answers

What is the use case of map and flatMap? [closed]

What is the major use case for ...

Aug 24 in Apache Spark by anonymous
• 120 points

closed Aug 26 by Omkar
65 views
0 votes
1 answer

How to extract records from one RDD using another RDD

Hey, you can use a "contains" filter to extract ...

Aug 23 in Apache Spark by Karan
47 views
0 votes
1 answer

Spark: Can we add a column to a DataFrame?

Yes, we can add a column using withColumn with ...

Aug 9 in Apache Spark by Shirish
45 views
0 votes
1 answer

Monitoring Spark application

Spark-submit jobs are also run from client/edge ...

Aug 9 in Apache Spark by Umesh
35 views
0 votes
1 answer

Primary keys in Apache Spark

import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...

Aug 9 in Apache Spark by ravikiran
• 4,560 points
62 views
0 votes
1 answer

How to read data from a text file in Spark?

Hey, you can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...

Aug 6 in Apache Spark by Gitika
• 25,340 points
149 views
0 votes
1 answer

How to start the Spark history server?

Hi, you can use this command to start ...

Aug 6 in Apache Spark by Gitika
• 25,340 points
35 views
0 votes
1 answer

What is the Spark UI and how to monitor a Spark job?

Hey, Jobs: to view all the Spark jobs. Stages: ...

Aug 6 in Apache Spark by Gitika
• 25,340 points
55 views
0 votes
1 answer

How to handle data shuffle in Spark

Hi, you can do it using map partition ...

Aug 6 in Apache Spark by Gitika
• 25,340 points
48 views
0 votes
1 answer

How does the foreach operation work in Apache Spark?

Hi, the foreach() operation is an action. It does not ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
37 views
0 votes
1 answer

How does the sortByKey() operation work in Spark?

Hey, sortByKey() is a transformation. It returns an RDD sorted ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
75 views
0 votes
1 answer

In how many modes can Apache Spark run?

Hey, you can launch a Spark application in four ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
24 views
0 votes
1 answer

How to launch a Spark application in cluster mode?

Hi, to launch a Spark application in cluster mode, ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
38 views
0 votes
1 answer

How to create a paired RDD using the substring method in Spark?

Hi, if you have a file with id ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
48 views
0 votes
1 answer

What is a paired RDD and how to create one in Spark?

Hi, a paired RDD is a distributed collection of ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
561 views
0 votes
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
46 views
0 votes
1 answer

Join in RDD using keys

Suppose you have two datasets, results(id, ...

Aug 2 in Apache Spark by Trisha
31 views
0 votes
1 answer

Scala: save filtered data row by row using saveAsTextFile

Try this code, it worked for me: val ...

Aug 2 in Apache Spark by Karan
37 views
0 votes
1 answer

What is Hive on Spark?

Hi, Hive contains significant support for Apache Spark, ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
20 views
0 votes
1 answer

Can anyone explain the sparse vector in Spark?

Hey, a sparse vector is used for storing ...

Aug 2 in Apache Spark by Gitika
• 25,340 points
54 views
0 votes
0 answers

How to define SparkConf?

Can anyone explain how to define SparkConf?

Aug 1 in Apache Spark by Danish
20 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...

Aug 1 in Apache Spark by Karan
96 views
0 votes
1 answer

Pyspark dataframe with random values

Hey @Esha, you can use this code. ...

Aug 1 in Apache Spark by Zed
158 views
0 votes
1 answer

Spark + Hive connectivity

The problem is probably with the command. ...

Aug 1 in Apache Spark by Rishni
38 views
0 votes
1 answer

Scala: Convert text file data into ORC format using data frame

Converting a text file to ORC: Using Spark, the ...

Aug 1 in Apache Spark by Esha
126 views
0 votes
1 answer

How to reverse a Scala list?

Hi, this reverses the order of elements in ...

Jul 31 in Apache Spark by Gitika
• 25,340 points
23 views
0 votes
1 answer

How to use a uniform list in Scala?

Hey, the method List.fill() creates a list and ...

Jul 31 in Apache Spark by Gitika
• 25,340 points
51 views
0 votes
1 answer

How is a shallow copy carried out in Scala?

Hey, Scala uses the method copy() to carry ...

Jul 31 in Apache Spark by Gitika
• 25,340 points
24 views
0 votes
1 answer

How to access variables with the s string interpolator in Scala?

Hey, you can use the code below to access variables ...

Jul 31 in Apache Spark by Gitika
• 25,340 points
32 views
0 votes
1 answer

How to work with Matrix Multiplication in Apache Spark?

Hey, you can follow the solution below for ...

Jul 31 in Apache Spark by Gitika
• 25,340 points
360 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, it already has SparkContext.union and it does know how to ...

Jul 31 in Apache Spark by Gitika
• 25,340 points
120 views