Spark Core How to fetch max n rows of an RDD function without using Rdd max

0 votes

I have an RDD having below elements:
('09', [25, 66, 67])
('17', [66, 67, 39])
('04', [25])
('08', [120, 122])
('28', [25, 67])
('30', [122])

I need to fetch the elements having max number of elements in the list which is 3 in the above RDD
O/p should be filtered into another RDD and not use the max function and **spark dataframes**:
('09', [25, 66, 67])
('17', [66, 67, 39])

max_len = uniqueRDD.max(lambda x: len(x[1]))
maxRDD = uniqueRDD.filter(lambda x : (len(x[1]) == len(max_len[1])))

I am able to do with above lines of code but spark streaming won't support this as max_len is a tuple and not RDD

Can someone suggest? Thanks in advance,

Dec 3, 2020 in Apache Spark by Prashant
• 120 points
2,208 views

1 answer to this question.

0 votes
Hi@Prasant,

If Spark Streaming is not supporting tuple, then you need to convert the tuple to RDD.
answered Dec 3, 2020 by MD
• 95,460 points

Related Questions In Apache Spark

0 votes
1 answer

How to create paired RDD using subString method in Spark?

Hi, If you have a file with id ...READ MORE

answered Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
2,719 views
–1 vote
0 answers

How to parse an S3 XML file to find tags using apache spark

How can one parse an S3 XML ...READ MORE

Mar 18, 2020 in Apache Spark by anonymous
• 110 points
2,039 views
0 votes
1 answer

How can I compare the elements of the RDD using MapReduce?

You have to use the comparison operator ...READ MORE

answered May 24, 2018 in Apache Spark by Shubham
• 13,490 points
3,454 views
0 votes
1 answer

How to find max value in pair RDD?

Use Array.maxBy method: val a = Array(("a",1), ("b",2), ...READ MORE

answered May 26, 2018 in Apache Spark by nitinrawat895
• 11,380 points
7,997 views
+1 vote
2 answers
0 votes
1 answer

Is it possible to run Apache Spark without Hadoop?

Though Spark and Hadoop were the frameworks designed ...READ MORE

answered May 3, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
1,277 views
0 votes
1 answer

What do we exactly mean by “Hadoop” – the definition of Hadoop?

The official definition of Apache Hadoop given ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by Shubham
1,906 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
11,065 views
+1 vote
8 answers

How to print the contents of RDD in Apache Spark?

Save it to a text file: line.saveAsTextFile("alicia.txt") Print contains ...READ MORE

answered Dec 10, 2018 in Apache Spark by Akshay
61,844 views
+2 votes
14 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...READ MORE

answered Apr 5, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar 88,838 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP