map() and flatMap()

What is the difference between map() and flatMap()?
Jun 20, 2018 in Apache Spark by Ashish

2 answers to this question.

The map() transformation takes a function and applies it to each element of the RDD. The function's result becomes the corresponding element of the resulting RDD, so the output has exactly one element per input element. Sometimes we want to produce multiple output elements for each input element. The operation for this is flatMap(). As with map(), the function we provide to flatMap() is called individually for each element of the input RDD. Instead of returning a single element, it returns a collection (or iterator) of values, and Spark flattens these into one RDD, so each input element can contribute zero, one, or many output elements.
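To make the difference concrete, here is a minimal sketch of both transformations applied to the same RDD. The object name, the sample strings, and the local[2] master are assumptions chosen so the example can run as a standalone program. In spark-shell, sc already exists and only the body of the try block is needed.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MapVsFlatMap {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and a made-up app name, for a standalone run
    val conf = new SparkConf().setAppName("MapVsFlatMap").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val lines = sc.parallelize(Seq("hello world", "hi"))

      // map: one output element per input element (here, one Array per line)
      val nested = lines.map(_.split(" "))     // RDD[Array[String]]

      // flatMap: the collection returned for each line is flattened into the result
      val words = lines.flatMap(_.split(" "))  // RDD[String]

      println(nested.count())                  // 2
      println(words.collect().mkString(", "))  // hello, world, hi
    } finally {
      sc.stop()
    }
  }
}
```

With map() the result still has two elements, each of them an array of words. With flatMap() the result has three elements, one per word.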
answered Jun 20, 2018 by kurt_cobain

map(): Returns a new distributed dataset formed by passing each element of the source through a function.

Example: 

val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)
val b = a.map(_.length)   // length of each string
val c = a.zip(b)          // pair each string with its length
c.collect
// Array((dog,3), (salmon,6), (salmon,6), (rat,3), (elephant,8))

flatMap(): Similar to map(), but each input item can be mapped to zero or more output items, so the function returns a sequence rather than a single item.

Example:

val a = sc.parallelize(1 to 10, 5)
a.flatMap(1 to _).collect   // each n expands to the range 1..n
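The flatMap() example above is easier to follow on a smaller range, where the flattened output fits on one line. The sketch below uses 1 to 3 instead of 1 to 10, and also shows that returning an empty collection drops an element entirely. The object name and the local[2] master are assumptions for a standalone run.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FlatMapRange {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and a made-up app name, for a standalone run
    val conf = new SparkConf().setAppName("FlatMapRange").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val a = sc.parallelize(1 to 3)

      // Each n expands to 1..n; the ranges are concatenated in input order
      val expanded = a.flatMap(n => 1 to n).collect()
      println(expanded.mkString(", "))  // 1, 1, 2, 1, 2, 3

      // An empty result removes the element, so flatMap can also act as a filter
      val evens = a.flatMap(n => if (n % 2 == 0) Seq(n) else Seq.empty[Int]).collect()
      println(evens.mkString(", "))     // 2
    } finally {
      sc.stop()
    }
  }
}
```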

answered Jul 3, 2018 by zombie
