map() and flatMap()

0 votes
What is the difference between map() and flatMap()?
Jun 20, 2018 in Apache Spark by Ashish
• 2,630 points
90 views

2 answers to this question.

0 votes
The map() transformation takes a function, applies it to each element of the RDD, and uses the function's result as the corresponding element of the new RDD, so the output has exactly one element per input element. Sometimes we want to produce multiple output elements (or none) for each input element. The operation for this is flatMap(). As with map(), the function passed to flatMap() is called once for each element of the input RDD, but instead of returning a single value it returns an iterator (or collection) of values. Spark then flattens these, so the resulting RDD contains the individual values rather than the iterators.
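To make the contrast concrete, here is a minimal sketch that can be pasted into the Scala REPL or spark-shell. It assumes plain Scala collections as a stand-in for an RDD (they share the map and flatMap semantics described above), and the sample lines are made up for illustration.

```scala
// Plain Scala collections stand in for an RDD here (an assumption made so
// the sketch runs without a Spark cluster); the sample lines are made up.
val lines = List("hello world", "hi")

// map: exactly one output element per input element.
val lengths = lines.map(line => line.length)

// flatMap: each input yields a collection, and the results are flattened.
val words = lines.flatMap(line => line.split(" "))

// map with the same function keeps the nesting: one array per line.
val nested = lines.map(line => line.split(" "))

println(lengths)  // List(11, 2)
println(words)    // List(hello, world, hi)
```

With an actual RDD the calls would look the same, for example on the result of sc.parallelize(lines), followed by collect to bring the values back to the driver.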
answered Jun 20, 2018 by kurt_cobain
• 9,260 points
0 votes

map(): Returns a new distributed dataset formed by passing each element of the source through a function.

Example: 

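// Distribute five strings across 3 partitions
val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)

// map produces exactly one length per string
val b = a.map(_.length)

// Pair each string with its length
val c = a.zip(b)

c.collect
// Array((dog,3), (salmon,6), (salmon,6), (rat,3), (elephant,8))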

flatMap(): Similar to map(), but each input item can be mapped to zero or more output items, which are flattened into a single dataset.

Example:

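// Distribute the numbers 1 to 10 across 5 partitions
val a = sc.parallelize(1 to 10, 5)

// Each element x expands to the range 1..x, and the ranges are flattened
a.flatMap(1 to _).collect
// Array(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, ...)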
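The shorthand 1 to _ in the example above may be easier to follow when written out. The sketch below assumes a small local list in place of the RDD (so no SparkContext is needed) and shows what the same function yields under flatMap versus map.

```scala
// A local list stands in for the RDD (assumption: no SparkContext needed).
val small = (1 to 3).toList

// `1 to _` is shorthand for the function x => 1 to x.
val expanded = small.flatMap(x => 1 to x)

// With map, the per-element ranges are kept as separate nested collections.
val kept = small.map(x => (1 to x).toList)

println(expanded) // List(1, 1, 2, 1, 2, 3)
println(kept)     // List(List(1), List(1, 2), List(1, 2, 3))
```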

answered Jul 3, 2018 by zombie
• 3,690 points
