Trending questions in Apache Spark

0 votes
2 answers

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as follows: var notFollowingList=List(9.8,7,6,3 ...READ MORE

Jun 5, 2018 in Apache Spark by Shubham
• 13,450 points
68,884 views
0 votes
12 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...READ MORE

Apr 4, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar 60,975 views
0 votes
7 answers

How to replace null values in Spark DataFrame?

In Spark 2.x you can directly use ...READ MORE

Mar 28 in Apache Spark by gaurav
50,944 views
+5 votes
11 answers

Concatenate columns in apache spark dataframe

It's late, but this is how you can ...READ MORE

Mar 21, 2019 in Apache Spark by anonymous
58,855 views
0 votes
4 answers

How to change the Spark session configuration in PySpark?

You can dynamically load properties. First create ...READ MORE

Dec 10, 2018 in Apache Spark by Vini
44,250 views
0 votes
5 answers

groupByKey vs reduceByKey in Apache Spark.

ReduceByKey is the best for production. READ MORE

Mar 3, 2019 in Apache Spark by anonymous
35,543 views
0 votes
7 answers

How to print the contents of RDD in Apache Spark?

Save it to a text file: line.saveAsTextFile("alicia.txt") Print contains ...READ MORE

Dec 10, 2018 in Apache Spark by Akshay
36,560 views
0 votes
1 answer

Spark: Dataframe vs Dataset

Recently, there are two new data abstractions ...READ MORE

Jul 29, 2019 in Apache Spark by Jackie
22,635 views
0 votes
1 answer

What will be printed when the below code is executed?

Option D)  runtime error READ MORE

4 days ago in Apache Spark by Gitika
• 51,880 points
19 views
0 votes
1 answer

What will be printed when the below code is executed?

Option b) List(0,3,5) The takeOrdered method returns the smallest n elements in a ...READ MORE

4 days ago in Apache Spark by Gitika
• 51,880 points
15 views
0 votes
1 answer

Which one of the following commands is used to start python-spark?

Hi@ritu, To start your python spark shell, you ...READ MORE

4 days ago in Apache Spark by MD
• 79,930 points
17 views
0 votes
1 answer

In AWS, if a user wants to run Spark, on top of which one of the following can they run it?

Hi@ritu, AWS has lots of services. For spark ...READ MORE

4 days ago in Apache Spark by MD
• 79,930 points
16 views
0 votes
1 answer

What will be printed when the below code is executed?

Option a) List(5,100,10) The take method returns the first n elements in an ...READ MORE

4 days ago in Apache Spark by Gitika
• 51,880 points
11 views
0 votes
1 answer

What class is declared in the below code?

Option D: String class READ MORE

4 days ago in Apache Spark by Gitika
• 51,880 points
14 views
0 votes
1 answer

The number of stages in a job is equal to the number of RDDs in the DAG. However, under one of the given conditions, the scheduler can truncate the lineage. Identify it.

Hi@Edureka, Spark's internal scheduler may truncate the lineage of the RDD graph ...READ MORE

4 days ago in Apache Spark by MD
• 79,930 points
33 views
0 votes
1 answer

What is the output of the following code?

error: expected class or object definition sc.parallelize(Array(1L,("SFO")),(2L,("ORD")),(3L,("DFW")))) ^ one error ...READ MORE

4 days ago in Apache Spark by Gitika
• 51,880 points
14 views
0 votes
1 answer

16) What allows Spark to periodically persist data about an application such that it can recover from failures?

Hi@Edureka, Checkpointing is a process of truncating RDD ...READ MORE

4 days ago in Apache Spark by MD
• 79,930 points
19 views
0 votes
1 answer

Which one of the following commands is used to see the structure of the DataFrame?

Hi @Ritu If you want to see the ...READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
20 views
0 votes
1 answer

What are some of the things you can monitor in the Spark Web UI?

Option c) Mapr Jobs that are submitted READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
18 views
0 votes
1 answer

13) Refer to the input and identify the output if the below code is run

Option c)  Run time error - A READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
15 views
0 votes
1 answer

What is the output of the following code?

After executing your code, there is an ...READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
16 views
0 votes
1 answer

What does the below code print?

Option d) Run time error. READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
16 views
0 votes
1 answer

The number of stages in a job is equal to the number of RDDs in the DAG. However, under one of the given conditions, the scheduler can truncate the lineage. Identify it.

Hi@ritu, Spark's internal scheduler may truncate the lineage of the RDD graph if ...READ MORE

5 days ago in Apache Spark by akhtar
• 33,720 points
17 views
0 votes
1 answer

From the following graph code, which code snippet will return the number of flight routes?

Hey, @Ritu, I am getting error in your ...READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
12 views
0 votes
1 answer

From the below code, what is the most appropriate next step in the ML process?

Hi@ritu, The most appropriate step according to me ...READ MORE

5 days ago in Apache Spark by MD
• 79,930 points
15 views
0 votes
1 answer

What does the following code print?

error: expected class or object definition sc.parallelize (Array(1L, ...READ MORE

5 days ago in Apache Spark by Gitika
• 51,880 points
9 views
0 votes
0 answers

What allows Spark to periodically persist data about an application such that it can recover from failures? [closed]

What allows Spark to periodically persist data ...READ MORE

5 days ago in Apache Spark by ritu
• 700 points

closed 4 days ago by MD 19 views
0 votes
2 answers

5) Using which one of the given choices will you create an RDD with specific partitioning?

Hi, @Ritu, option b for you, as Hash Partitioning ...READ MORE

Nov 23 in Apache Spark by Gitika
• 51,880 points
17 views
0 votes
0 answers

17) From the given choices, identify the value returned by $"whatever"?

17) From the given choices, identify the value ...READ MORE

6 days ago in Apache Spark by ritu
• 700 points
15 views
0 votes
0 answers

What does the below code print? [closed]

What does the below code print? val AgeDs ...READ MORE

6 days ago in Apache Spark by ritu
• 700 points

closed 5 days ago by Gitika 13 views
0 votes
0 answers

What is the output of the following code? [closed]

What is the output of the following ...READ MORE

6 days ago in Apache Spark by Edureka
• 200 points

closed 4 days ago by MD 11 views
0 votes
1 answer

12) Which one of the given flows correctly describes the Spark Streaming architecture?

Hi@ritu, You need to learn the Architecture of ...READ MORE

Nov 23 in Apache Spark by MD
• 79,930 points
18 views
0 votes
1 answer

Spark - how to solve the below question?

Option d, runtime error READ MORE

Nov 23 in Apache Spark by Gitika
• 51,880 points
16 views
0 votes
1 answer

2) What will be printed when the below code is executed?

Hi, @Ritu, List(5,100,10) is printed. The take method returns the first n elements in ...READ MORE

Nov 23 in Apache Spark by Gitika
• 51,880 points
16 views
0 votes
1 answer

4) Spark Streaming converts streaming data into DStreams. Which one of the given statements about DStreams is true?

Hi@ritu, Spark DStream (Discretized Stream) is the basic ...READ MORE

Nov 23 in Apache Spark by MD
• 79,930 points
15 views
0 votes
1 answer

7) From a SchemaRDD, data can be cached by which one of the given choices?

Hi, @Ritu, According to the official documentation of Spark 1.2, ...READ MORE

Nov 23 in Apache Spark by Gitika
• 51,880 points
16 views
0 votes
1 answer

How do you load this multiline data in spark as a single record?

Hi@Ruben, I think you can add an escape ...READ MORE

Nov 23 in Apache Spark by MD
• 79,930 points
14 views
0 votes
0 answers

6) What allows Spark Streaming to provide fault tolerance for network sources of data?

6) What allows Spark Streaming to provide fault ...READ MORE

Nov 22 in Apache Spark by ritu
• 700 points
24 views
+1 vote
1 answer

How to write Spark DataFrame to Avro Data File?

Hi@akhtar, Since Avro library is external to Spark, ...READ MORE

Nov 4 in Apache Spark by MD
• 79,930 points
55 views
0 votes
1 answer

How to read Avro Partition Data?

Hi@akhtar, When we try to retrieve the data ...READ MORE

Nov 4 in Apache Spark by MD
• 79,930 points
36 views
0 votes
1 answer

How to read a dataframe based on an avro schema?

Hi, I am able to understand your requirement. ...READ MORE

Oct 30 in Apache Spark by MD
• 79,930 points
86 views
0 votes
1 answer

How to create a distance vector in PySpark (Euclidean distance)

Hi@dani, You can find the euclidean distance using ...READ MORE

Oct 16 in Apache Spark by MD
• 79,930 points
108 views
0 votes
1 answer

How to implement my own clustering algorithm in PySpark (without using a ready-made library implementation such as k-means)?

Hi@dani, As you said you are a beginner ...READ MORE

Oct 14 in Apache Spark by MD
• 79,930 points
125 views
0 votes
1 answer

Facing an issue while reading a TSV file in PySpark

Hi@khyati, You are getting this type of output ...READ MORE

Sep 28 in Apache Spark by MD
• 79,930 points
157 views