TypeError: Invalid param value given for param "inputCols". Could not convert DataFrame[R&D Spend: double, Administration: double, Marketing Spend: double] to list of strings

Hi Guys,

I am trying to build an ML model using PySpark, but it shows me the error below.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/home/spark/spark/python/pyspark/ml/param/__init__.py in _set(self, **kwargs)
    438                 try:
--> 439                     value = p.typeConverter(value)
    440                 except TypeError as e:
~/home/spark/spark/python/pyspark/ml/param/__init__.py in toListString(value)
    156                 return [TypeConverters.toString(v) for v in value]
--> 157         raise TypeError("Could not convert %s to list of strings" % value)
    158
TypeError: Could not convert DataFrame[R&D Spend: double, Administration: double, Marketing Spend: double, State_Florida: bigint, State_New York: bigint] to list of strings
During handling of the above exception, another exception occurred:
TypeError                                 Traceback (most recent call last)
<ipython-input-46-e14adf296775> in <module>
----> 1 assembler = VectorAssembler(inputCols=features,outputCol="features")
~/home/spark/spark/python/pyspark/__init__.py in wrapper(self, *args, **kwargs)
    108             raise TypeError("Method %s forces keyword arguments." % func.__name__)
    109         self._input_kwargs = kwargs
--> 110         return func(self, **kwargs)
    111     return wrapper
    112
~/home/spark/spark/python/pyspark/ml/feature.py in __init__(self, inputCols, outputCol, handleInvalid)
   2795         self._setDefault(handleInvalid="error")
   2796         kwargs = self._input_kwargs
-> 2797         self.setParams(**kwargs)
   2798
   2799     @keyword_only
~/home/spark/spark/python/pyspark/__init__.py in wrapper(self, *args, **kwargs)
    108             raise TypeError("Method %s forces keyword arguments." % func.__name__)
    109         self._input_kwargs = kwargs
--> 110         return func(self, **kwargs)
    111     return wrapper
    112
~/home/spark/spark/python/pyspark/ml/feature.py in setParams(self, inputCols, outputCol, handleInvalid)
   2805         """
   2806         kwargs = self._input_kwargs
-> 2807         return self._set(**kwargs)
   2808
   2809
~/home/spark/spark/python/pyspark/ml/param/__init__.py in _set(self, **kwargs)
    439                     value = p.typeConverter(value)
    440                 except TypeError as e:
--> 441                     raise TypeError('Invalid param value given for param "%s". %s' % (p.name, e))
    442             self._paramMap[p] = value
    443         return self
TypeError: Invalid param value given for param "inputCols". Could not convert DataFrame[R&D Spend: double, Administration: double, Marketing Spend: double, State_Florida: bigint, State_New York: bigint] to list of strings


How can I solve this?

May 7, 2020 in Apache Spark by akhtar

1 answer to this question.

Hi @akhtar,

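The error says that you passed a DataFrame as inputCols, but VectorAssembler expects a list of column names (strings). In your code, features is a DataFrame, so pass the names of its columns instead of the DataFrame itself. You can get the names from the DataFrame's columns attribute, which is a plain Python list, for example:

feature_cols = df.columns[0:3]

Here df is your DataFrame, and the slice [0:3] selects the first three column names. Adjust the slice to cover the columns you want to use as features, then pass the resulting list as inputCols.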
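
To make the idea concrete, here is a minimal sketch in plain Python. The helper name feature_columns and the FakeFrame stand-in class are hypothetical and only for illustration; they are not part of PySpark. The helper accepts either an object with a columns attribute (as a Spark DataFrame has) or a list of names, and returns the list of strings that inputCols needs.

```python
def feature_columns(source, exclude=()):
    """Return column names as a list of strings.

    `source` may be an object exposing `.columns` (for example a Spark
    DataFrame), an iterable of column names, or a single column name.
    Names listed in `exclude` are dropped (for example a label column).
    """
    if isinstance(source, str):
        names = [source]
    else:
        # Use .columns when it exists, otherwise treat source as the names.
        names = getattr(source, "columns", source)
    excluded = set(exclude)
    return [str(name) for name in names if name not in excluded]


class FakeFrame:
    """Hypothetical stand-in for a DataFrame; only mimics `.columns`."""

    def __init__(self, columns):
        self.columns = list(columns)


# Column names taken from the error message in the question.
features = FakeFrame([
    "R&D Spend", "Administration", "Marketing Spend",
    "State_Florida", "State_New York",
])

all_cols = feature_columns(features)
numeric_cols = feature_columns(
    features, exclude=["State_Florida", "State_New York"]
)
print(all_cols)
print(numeric_cols)

# With PySpark installed and `features` being a real DataFrame, the list
# can be passed to VectorAssembler like this:
#   from pyspark.ml.feature import VectorAssembler
#   assembler = VectorAssembler(inputCols=feature_columns(features),
#                               outputCol="features")
#   assembled = assembler.transform(features)
```

The key point is that inputCols receives only names; the DataFrame itself is passed later, to transform.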

Hope this will work.

answered May 7, 2020 by MD
