Spark Core: How to fetch the rows of an RDD with the most list elements without using RDD max

0 votes

I have an RDD with the following elements:
('09', [25, 66, 67])
('17', [66, 67, 39])
('04', [25])
('08', [120, 122])
('28', [25, 67])
('30', [122])

I need to fetch the elements whose list has the maximum number of elements, which is 3 in the above RDD.
The output should be filtered into another RDD, without using the max function or **Spark DataFrames**:
('09', [25, 66, 67])
('17', [66, 67, 39])

max_len = uniqueRDD.max(lambda x: len(x[1]))
maxRDD = uniqueRDD.filter(lambda x: len(x[1]) == len(max_len[1]))

I can do this with the lines of code above, but Spark Streaming does not support it because max_len is a tuple and not an RDD.

Can someone suggest an approach? Thanks in advance.

Dec 3, 2020 in Apache Spark by Prashant
• 120 points
266 views

1 answer to this question.

0 votes
Hi @Prashant,

If Spark Streaming does not support the tuple, then you need to convert the tuple to an RDD.
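
One possible reading of this suggestion, as a rough sketch and not a tested streaming solution: instead of converting the tuple afterwards, keep the maximum length inside an RDD from the start, so that no action such as max is called. The helper name keep_longest is hypothetical; it only assumes the standard RDD transformations map, keys, reduceByKey and join.

```python
def keep_longest(rdd):
    """Return an RDD holding only the (key, list) pairs whose list is longest.

    Uses transformations only, so the maximum length never leaves the
    cluster as a plain Python value. The order of the result is not guaranteed.
    """
    # Key every pair by the length of its list: (length, (key, values)).
    by_len = rdd.map(lambda kv: (len(kv[1]), kv))

    # Reduce all lengths under one constant key to get a single-element
    # RDD with the largest length, then re-key it as (largest, None).
    longest = (
        by_len.keys()
        .map(lambda n: (0, n))
        .reduceByKey(lambda a, b: a if a >= b else b)
        .map(lambda kv: (kv[1], None))
    )

    # Joining on length keeps only the pairs whose length equals the largest.
    return by_len.join(longest).map(lambda kv: kv[1][0])
```

With the RDD from the question this would be called as maxRDD = keep_longest(uniqueRDD). For a DStream (here a hypothetical variable named stream), it could probably be applied per batch with stream.transform(keep_longest), since transform passes each batch to the function as an RDD. The join causes a shuffle, so this trades some performance for staying within RDD transformations.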
answered Dec 3, 2020 by MD
• 95,300 points
