How to extract record from one RDD using another RDD

+1 vote

I have a input file who's data is as below:

1,Prashant,IT,Developer,27,8000
2,Saurav,Training,Trainer,38,18000
6,Vibhor,Training,Manager,40,20000
3,Vineet,Finance,Accountant,31,16000
5,Atul,IT,Manager,39,25000
7,Raman,Sales,Manager,43,23000
8,Rakesh,Finance,Manager,45,26000
4,Kartik,Sales,Salesman,29,11000

I need another RDD where the records are extracted if it contains word as Manager.

Can anyone please help me on this?

Aug 23, 2019 in Apache Spark by anonymous
2,233 views

1 answer to this question.

+1 vote

Hey, you can use "contains" filter to extract those lines. Something like this:

val newRDD = oldRDD.filter(line => line.contains(“Manager”))
answered Aug 23, 2019 by Karan

Related Questions In Apache Spark

0 votes
1 answer

How to save and retrieve the Spark RDD from HDFS?

You can save the RDD using saveAsObjectFile and saveAsTextFile method. ...READ MORE

answered May 29, 2018 in Apache Spark by Shubham
• 13,490 points
13,238 views
0 votes
1 answer

How to create RDD from parallelized collection in scala?

Hi, You can check this example in your ...READ MORE

answered Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
1,438 views
0 votes
1 answer

How to create RDD from existing RDD in scala?

scala> val rdd1 = sc.parallelize(List(1,2,3,4,5))                           -  Creating ...READ MORE

answered Feb 29, 2020 in Apache Spark by anonymous
1,295 views
0 votes
1 answer

How to create RDD from an external file source in scala?

Hi, To create an RDD from external file ...READ MORE

answered Jul 4, 2019 in Apache Spark by Gitika
• 65,910 points
1,620 views
+1 vote
2 answers
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
10,758 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
2,309 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
106,709 views
+1 vote
1 answer

How can I write a text file in HDFS not from an RDD, in Spark program?

Yes, you can go ahead and write ...READ MORE

answered May 29, 2018 in Apache Spark by Shubham
• 13,490 points
8,123 views
+1 vote
8 answers

How to print the contents of RDD in Apache Spark?

Save it to a text file: line.saveAsTextFile("alicia.txt") Print contains ...READ MORE

answered Dec 10, 2018 in Apache Spark by Akshay
61,221 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP