map and flatMap

What is the difference between map() and flatMap()?
Jun 20, 2018 in Apache Spark by Ashish

2 answers to this question.

The map() transformation takes a function and applies it to each element of the RDD. The function's return value becomes the corresponding element of the resulting RDD, so the output has exactly one element per input element. Sometimes we want to produce multiple output elements (or none) for each input element. The operation for this is flatMap(). As with map(), the function we pass to flatMap() is called once for each element of the input RDD, but instead of returning a single element it returns a collection (an iterator or sequence) of values. The resulting RDD contains the elements of all those collections flattened together, not an RDD of collections.
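
As a rough illustration of the difference, here is a minimal sketch that splits lines of text into words. It assumes a local Spark installation on the classpath; the object name MapVsFlatMap, the helper names wordsPerLine and allWords, and the sample strings are made up for this example and do not come from any library.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical example object; the names are illustrative only.
object MapVsFlatMap {
  // map: exactly one output element per input line (an Array of its words)
  def wordsPerLine(lines: RDD[String]): RDD[Array[String]] =
    lines.map(_.split(" "))

  // flatMap: each line yields several words, flattened into a single RDD[String]
  def allWords(lines: RDD[String]): RDD[String] =
    lines.flatMap(_.split(" "))

  def main(args: Array[String]): Unit = {
    // Local session for demonstration purposes (assumed setup)
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("map-vs-flatMap")
      .getOrCreate()
    try {
      val lines = spark.sparkContext.parallelize(Seq("hello world", "map and flatMap"))
      println(wordsPerLine(lines).count())            // 2: one array per line
      println(allWords(lines).count())                // 5: one element per word
      println(allWords(lines).collect().mkString(", "))
    } finally {
      spark.stop()
    }
  }
}
```

With map() the result keeps the shape of the input (two lines in, two arrays out), while flatMap() returns the individual words, which is usually what you want before counting or filtering them.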
answered Jun 20, 2018 by kurt_cobain

map(): Returns a new distributed dataset formed by passing each element of the source through a function.

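Example:

// Distribute five strings across 3 partitions
val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)
// map: one output (the length) per input string
val b = a.map(_.length)
// Pair each string with its length
val c = a.zip(b)
c.collect
// Array((dog,3), (salmon,6), (salmon,6), (rat,3), (elephant,8))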

flatMap(): Similar to map, but each input item can be mapped to zero or more output items, so the function returns a sequence rather than a single item.

Example:

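val a = sc.parallelize(1 to 10, 5)
// flatMap: each n expands to the range 1 to n, and all ranges are flattened into one RDD
a.flatMap(1 to _).collect
// Array(1, 1, 2, 1, 2, 3, ...): 55 elements in total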

answered Jul 4, 2018 by zombie
