map vs flatMap in Spark

Question

Please explain to me the difference between map() and flatMap() in Spark.

Thanks

score +1 · Answer 1 · Mar 8, 2019

Both map() and flatMap() are used for transformations.

The map() transformation takes a function and applies it to each element in the RDD; the function's result becomes the corresponding element of the resulting RDD. flatMap() is used when each input element can produce multiple output elements. As with map(), the function we provide to flatMap() is called individually for each element in the input RDD. Instead of returning a single element, however, it returns an iterator (or other collection) of values, and the resulting RDD contains the elements from all of those iterators rather than the iterators themselves.
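To make the difference concrete, here is a minimal sketch that mimics the two behaviours with plain Python lists instead of a real RDD, so it runs without a Spark installation. The helper name flat_map and the sample lines are made up for illustration; with PySpark the corresponding calls would be rdd.map(f) and rdd.flatMap(f).

```python
from itertools import chain


def flat_map(func, items):
    """Hypothetical stand-in for flatMap: apply func, then flatten one level."""
    return list(chain.from_iterable(func(item) for item in items))


# Sample input (assumed for illustration): two lines of text.
lines = ["hello world", "map vs flatMap"]

# map-style: one output per input, so each line becomes ONE list of words.
mapped = list(map(str.split, lines))

# flatMap-style: the per-line lists are merged into a single list of words.
flattened = flat_map(str.split, lines)

print(mapped)     # a list containing two lists
print(flattened)  # a single flat list of five words
```

Note that the map-style result keeps the same number of elements as the input (two), while the flatMap-style result has as many elements as there are words.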

answered Mar 8, 2019 by Raj

vishal · Answer 2 · Jun 17, 2019

Spark's map function expresses a one-to-one transformation: it transforms each element of a collection into exactly one element of the resulting collection. Spark's flatMap function expresses a one-to-many transformation: it transforms each element into zero or more elements.
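The "zero or more" part can be sketched with ordinary Python comprehensions standing in for Spark (no Spark is used here). The function expand and the sample numbers are assumptions chosen only to show that an input may contribute nothing, or several outputs.

```python
numbers = [1, 2, 3, 4]

# One-to-one (map-style): every input yields exactly one output.
squares = [n * n for n in numbers]


def expand(n):
    """Hypothetical function: odd n gives no output, even n gives n copies."""
    return [n] * n if n % 2 == 0 else []


# One-to-many (flatMap-style): outputs of expand are concatenated.
expanded = [out for n in numbers for out in expand(n)]

print(len(numbers), len(squares))  # lengths match for the one-to-one case
print(expanded)                    # odd inputs have disappeared entirely
```

Because a function may return an empty collection, a flatMap-style transformation can also act as a filter, which a one-to-one map cannot do.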

answered Jun 17, 2019 by vishal
• 180 points

MD · Answer 3 · Dec 16, 2020

Hi,

map processes the data one line or row at a time and produces exactly one output item for each input item. In flatMap, each input item can be mapped to multiple output items (so the function should return a Seq rather than a single item). It is therefore most often used when the function returns an array or sequence of elements for each input.
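The requirement that the function return a sequence has a practical consequence, sketched below in plain Python as a stand-in for Spark. The helper flat_map is hypothetical; the point it illustrates (a string is itself a sequence of characters, and a bare number is not a sequence at all) should carry over to PySpark's flatMap as well, but treat that as an assumption to verify against your Spark version.

```python
from itertools import chain


def flat_map(func, items):
    """Hypothetical stand-in for flatMap: apply func, then flatten one level."""
    return list(chain.from_iterable(func(item) for item in items))


words = ["ab", "cd"]

# Returning a bare string: the string is iterated, so it is split into characters.
chars = flat_map(lambda w: w.upper(), words)

# Wrapping the result in a list keeps each string whole.
whole = flat_map(lambda w: [w.upper()], words)

# Returning a non-iterable value (an int here) cannot be flattened.
try:
    flat_map(lambda w: len(w), words)
    failed = False
except TypeError:
    failed = True

print(chars, whole, failed)
```

If the function naturally produces a single value per input, map is the better fit; flatMap makes sense only when the function returns a collection.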