map() vs flatMap() in Spark

Please explain to me the difference between map() and flatMap() in Spark.

Thanks

Mar 8 in Apache Spark by Tina

1 answer to this question.

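Both map() and flatMap() are transformations: each applies a function to every element of an RDD and returns a new RDD.

The map() transformation takes a function, applies it to each element of the RDD, and uses the function's return value as the corresponding element of the resulting RDD, so the output has exactly one element per input element. The flatMap() transformation is used when each input element should produce zero or more output elements. As with map(), the function we provide to flatMap() is called individually for each element in our input RDD, but instead of returning a single element it returns an iterator over its return values. The resulting RDD contains the elements of all those iterators, flattened into one RDD, rather than an RDD of iterators.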
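To make the difference concrete, here is a minimal sketch that models the two transformations on an ordinary Python list rather than on a real RDD, so it runs without a Spark installation. The sample data and the helper name split_words are assumptions made up for illustration; the comments show what should be the equivalent PySpark calls, assuming an existing SparkContext named sc.

```python
from itertools import chain

# Hypothetical sample data standing in for the elements of an RDD.
lines = ["hello world", "apache spark"]


def split_words(line):
    """Return the list of words in one line (the function passed to both calls)."""
    return line.split(" ")


# Model of map(): exactly one output element per input element.
# Here each output element is itself a list of words.
# PySpark equivalent (assuming a SparkContext `sc`):
#   sc.parallelize(lines).map(split_words).collect()
mapped = list(map(split_words, lines))

# Model of flatMap(): the per-element results are flattened into one sequence.
# PySpark equivalent (assuming a SparkContext `sc`):
#   sc.parallelize(lines).flatMap(split_words).collect()
flat_mapped = list(chain.from_iterable(map(split_words, lines)))

print(mapped)       # [['hello', 'world'], ['apache', 'spark']]
print(flat_mapped)  # ['hello', 'world', 'apache', 'spark']
```

With the same function, map() keeps the nesting (two elements, each a list), while flatMap() yields four individual words.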

answered Mar 8 by Raj

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.