Spark Machine Learning pipeline works fine in Spark 1.6 but gives an error when executed on Spark 2.x

0 votes

I have written code in Spark 1.6 which was working fine. However, when I converted it to Spark 2.0, I get the following error:

 <console>:56: error: type mismatch;
 found   : Array[org.apache.spark.ml.PipelineStage with org.apache.spark.ml.param.shared.HasOutputCol with org.apache.spark.ml.util.DefaultParamsWritable{def copy(extra: org.apache.spark.ml.param.ParamMap): ...}]
 required: Array[org.apache.spark.ml.PipelineStage]
 Note: the found element type is a subtype of org.apache.spark.ml.PipelineStage, but class Array is invariant in type T.
 You may wish to investigate a wildcard type such as `_ <: org.apache.spark.ml.PipelineStage`. (SLS 3.2.10)
May 31, 2018 in Apache Spark by hack236

1 answer to this question.

0 votes

In Spark 2.x, Pipeline.setStages expects an Array[PipelineStage]. Scala infers a narrower refinement type for the array you build, and because Array is invariant in its element type, the inferred array type is not accepted. Ascribe the element type explicitly:

val stages: Array[PipelineStage] = discretizers ++ Array(assembler, selector)
val pipeline = new Pipeline().setStages(stages)
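The root cause can be reproduced without Spark at all: Scala's Array is invariant, so an array of a subtype (or of an inferred refinement type) is not an array of the supertype. A minimal sketch, using hypothetical StageLike/DiscretizerLike names standing in for Spark's PipelineStage and feature transformers:

```scala
// Stand-ins for org.apache.spark.ml.PipelineStage and its subclasses
// (hypothetical names, for illustration only).
trait StageLike
class DiscretizerLike extends StageLike { def outputCol: String = "binned" }
class AssemblerLike extends StageLike { def outputCol: String = "features" }
class SelectorLike extends StageLike { def outputCol: String = "selected" }

object InvarianceDemo {
  def main(args: Array[String]): Unit = {
    val discretizers: Array[DiscretizerLike] =
      Array(new DiscretizerLike, new DiscretizerLike)

    // Without an explicit element type, ++ infers a common refinement of the
    // operands' types; because Array is invariant, the result is not an
    // Array[StageLike] -- this is what setStages rejects in Spark 2.x.
    // Ascribing the element type forces the upcast at construction:
    val stages: Array[StageLike] =
      discretizers ++ Array[StageLike](new AssemblerLike, new SelectorLike)

    println(stages.length) // prints 4
  }
}
```

The same pattern applies to the pipeline above: declare the combined array as Array[PipelineStage] (or cast with .asInstanceOf[Array[PipelineStage]]) before passing it to setStages.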


answered May 31, 2018 by Shubham
• 13,490 points

Related Questions In Apache Spark

0 votes
1 answer

What is new in Spark 2.3?

Here are the changes in new version ...READ MORE

answered May 28, 2018 in Apache Spark by kurt_cobain
• 9,390 points
+1 vote
1 answer

Cannot resolve error in Spark when filtering records with two where conditions

Try df.where($"cola".isNotNull && $"cola" =!= "" && !$"colb".isin(2,3)) your ...READ MORE

answered Dec 13, 2019 in Apache Spark by Alexandru
• 510 points

edited Dec 13, 2019 by Alexandru
0 votes
1 answer

Difference between Spark ML & Spark MLlib package

org.apache.spark.mllib is the old Spark API while ...READ MORE

answered Jul 5, 2018 in Apache Spark by Shubham
• 13,490 points
0 votes
1 answer

Is it possible to run Apache Spark without Hadoop?

Though Spark and Hadoop were the frameworks designed ...READ MORE

answered May 3, 2019 in Big Data Hadoop by ravikiran
• 4,620 points
0 votes
1 answer

Getting error while connecting zookeeper in Kafka - Spark Streaming integration

I guess you need to provide this kafka.bootstrap.servers ...READ MORE

answered May 24, 2018 in Apache Spark by Shubham
• 13,490 points
0 votes
3 answers

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as follows:
var notFollowingList = List(9.8, 7, 6, 3, 1)
df.filter(col("uid").isin(notFollowingList:_*))
You can ...READ MORE

answered Jun 6, 2018 in Apache Spark by Shubham
• 13,490 points