Spark Machine Learning pipeline works fine in Spark 1.6 but gives an error when executed on Spark 2.x


I wrote some code in Spark 1.6 that was working fine. However, after converting it to Spark 2.0, I get the following error:

<console>:56: error: type mismatch;
found   : Array[]
required: Array[ with m.shared.HasOutputCol with{def copy(extra: with with l.DefaultParamsWritable{def copy(extra: org with l with}}]
Note: <: elineStage with with org.apache.sp {def copy(extra: mMap): with asOutputCol with{def copy(extra: with org. with aramsWritable}}, but class Array is invariant in type T.
You may wish to investigate a wildcard type such as `_ <: pelineStage with with org.apache.s {def copy(extra: amMap): with HasOutputCol with{def copy(extra: with org with ParamsWritable}}`. (SLS 3.2.10)
May 31, 2018 in Apache Spark by hack236

1 answer to this question.


You need to change the following line, where the pipeline stages are assembled:

val pipeline = new Pipeline().setStages(discretizers ++ Array(assembler, selector))
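One possible way to make that change is sketched below. It assumes the mismatch comes from the compound element type that Scala infers for `discretizers ++ Array(assembler, selector)`, and avoids it by giving every array the explicit type `Array[PipelineStage]` before concatenating. The stage classes (`QuantileDiscretizer`, `VectorAssembler`, `ChiSqSelector`) are only guessed from the variable names in the question, and the column names are hypothetical.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{ChiSqSelector, QuantileDiscretizer, VectorAssembler}

// Hypothetical input columns, for illustration only.
val inputCols = Array("age", "income")

// Assumed stand-ins for the stages in the question.
val discretizers: Array[QuantileDiscretizer] = inputCols.map { c =>
  new QuantileDiscretizer()
    .setInputCol(c)
    .setOutputCol(s"${c}_bucket")
    .setNumBuckets(4)
}

val assembler = new VectorAssembler()
  .setInputCols(discretizers.map(_.getOutputCol))
  .setOutputCol("features")

val selector = new ChiSqSelector()
  .setNumTopFeatures(1)
  .setFeaturesCol("features")
  .setLabelCol("label")
  .setOutputCol("selectedFeatures")

// Upcast both sides to Array[PipelineStage] so that ++ has nothing left to infer.
val discretizerStages: Array[PipelineStage] = discretizers.map(d => (d: PipelineStage))
val otherStages: Array[PipelineStage] = Array[PipelineStage](assembler, selector)
val stages: Array[PipelineStage] = discretizerStages ++ otherStages

val pipeline = new Pipeline().setStages(stages)
```

With your own `discretizers`, `assembler` and `selector` already defined, only the last four statements are needed. Spelling out the element type is more verbose than the original one-liner, but it does not depend on how the compiler resolves the common supertype of the different stage classes.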

answered May 31, 2018 by Shubham
• 13,490 points

