Working of map function on data

Question

Suppose we have a "customer" file with the following data:
1 vishal
2 vijay
3 vinay

If I create an RDD:

val cust = sc.textFile("home/customer.txt").map(_.split(" "))

What operation are map and split going to perform? Can you please explain this to me?

score 0 · Answer 1 · Jul 11, 2019

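The map function applies the function you pass to every element of the RDD, here every line of the file, and returns a new RDD of the results. The split function breaks each line into an array of strings at the given delimiter. The result, cust, is therefore an RDD[Array[String]] with one array per line, for example Array("1", "vishal") for the first line.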

Because the dataset is space-delimited, the call is split(" "). If the dataset were tab-delimited, you would pass "\t" instead, as in split("\t").
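To see what map and split do to each line, here is a minimal sketch in plain Scala. It assumes a local Seq of strings as a stand-in for the lines that sc.textFile would read, so it runs without a Spark installation. The object name MapSplitSketch and the tab-delimited copy of the data are made up for illustration. On a real RDD the same map(_.split(" ")) call behaves the same way per element, but gives you an RDD[Array[String]] rather than a Seq.

```scala
// Plain-Scala sketch: a local Seq stands in for the RDD of lines
// (an assumption, so that the example runs without Spark).
object MapSplitSketch {
  // The three lines of the "customer" file from the question.
  val lines: Seq[String] = Seq("1 vishal", "2 vijay", "3 vinay")

  // map applies the given function to every line;
  // split(" ") turns one line into an Array[String] of its fields.
  val cust: Seq[Array[String]] = lines.map(_.split(" "))

  // Hypothetical tab-delimited copy of the same data.
  val tabLines: Seq[String] = Seq("1\tvishal", "2\tvijay", "3\tvinay")
  val custTab: Seq[Array[String]] = tabLines.map(_.split("\t"))

  def main(args: Array[String]): Unit = {
    // Prints one bracketed array per line:
    // [1, vishal]
    // [2, vijay]
    // [3, vinay]
    cust.foreach(fields => println(fields.mkString("[", ", ", "]")))
  }
}
```

Each element of cust is one array of fields, so cust(0)(1) is "vishal". With Spark you would typically inspect the result with cust.collect() or cust.take(3) instead of indexing directly.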

answered Jul 11, 2019 by Krish
