TypeError: Invalid param value given for param "inputCols". Could not convert DataFrame[R&D Spend: double, Administration: double, Marketing Spend: double] to list of strings

0 votes

Hi Guys,

I am trying to build an ML model using PySpark, but I get the error below.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/home/spark/spark/python/pyspark/ml/param/__init__.py in _set(self, **kwargs)
    438                 try:
--> 439                     value = p.typeConverter(value)
    440                 except TypeError as e:
~/home/spark/spark/python/pyspark/ml/param/__init__.py in toListString(value)
    156                 return [TypeConverters.toString(v) for v in value]
--> 157         raise TypeError("Could not convert %s to list of strings" % value)
    158
TypeError: Could not convert DataFrame[R&D Spend: double, Administration: double, Marketing Spend: double, State_Florida: bigint, State_New York: bigint] to list of strings
During handling of the above exception, another exception occurred:
TypeError                                 Traceback (most recent call last)
<ipython-input-46-e14adf296775> in <module>
----> 1 assembler = VectorAssembler(inputCols=features,outputCol="features")
~/home/spark/spark/python/pyspark/__init__.py in wrapper(self, *args, **kwargs)
    108             raise TypeError("Method %s forces keyword arguments." % func.__name__)
    109         self._input_kwargs = kwargs
--> 110         return func(self, **kwargs)
    111     return wrapper
    112
~/home/spark/spark/python/pyspark/ml/feature.py in __init__(self, inputCols, outputCol, handleInvalid)
   2795         self._setDefault(handleInvalid="error")
   2796         kwargs = self._input_kwargs
-> 2797         self.setParams(**kwargs)
   2798
   2799     @keyword_only
~/home/spark/spark/python/pyspark/__init__.py in wrapper(self, *args, **kwargs)
    108             raise TypeError("Method %s forces keyword arguments." % func.__name__)
    109         self._input_kwargs = kwargs
--> 110         return func(self, **kwargs)
    111     return wrapper
    112
~/home/spark/spark/python/pyspark/ml/feature.py in setParams(self, inputCols, outputCol, handleInvalid)
   2805         """
   2806         kwargs = self._input_kwargs
-> 2807         return self._set(**kwargs)
   2808
   2809
~/home/spark/spark/python/pyspark/ml/param/__init__.py in _set(self, **kwargs)
    439                     value = p.typeConverter(value)
    440                 except TypeError as e:
--> 441                     raise TypeError('Invalid param value given for param "%s". %s' % (p.name, e))
    442             self._paramMap[p] = value
    443         return self
TypeError: Invalid param value given for param "inputCols". Could not convert DataFrame[R&D Spend: double, Administration: double, Marketing Spend: double, State_Florida: bigint, State_New York: bigint] to list of strings


How can I solve this?

May 7, 2020 in Apache Spark by akhtar
• 38,230 points
2,157 views

1 answer to this question.

0 votes

Hi @akhtar,

The error shows that you passed a DataFrame as inputCols, but VectorAssembler expects a list of column names (strings). So you need to pass the column names, not the DataFrame itself. To get the column names of a Spark DataFrame as a list, you can use its columns attribute:

feature_cols = df.columns[0:3]

The slice controls which columns are included; here [0:3] takes the first three column names. Then pass this list as inputCols.
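
Putting this together, here is a minimal sketch of how the column names could be collected and handed to VectorAssembler. The helper names select_feature_columns and build_assembler, and the "Profit" label column mentioned in the comments, are assumptions made for illustration; they do not come from the question. Only the column names are taken from the error message.

```python
def select_feature_columns(columns, exclude=()):
    """Return the column names to assemble, skipping any listed in `exclude`."""
    excluded = set(exclude)
    return [name for name in columns if name not in excluded]


def build_assembler(df, exclude=()):
    """Create a VectorAssembler from the column names of the DataFrame `df`.

    Hypothetical helper: pass exclude=["Profit"] (an assumed label column)
    if the DataFrame still contains the target column.
    """
    # Imported here so the helper above can be tried without Spark installed.
    from pyspark.ml.feature import VectorAssembler

    feature_names = select_feature_columns(df.columns, exclude)
    # inputCols must be a list of strings, not the DataFrame itself.
    return VectorAssembler(inputCols=feature_names, outputCol="features")


# Column names as they appear in the error message from the question.
columns = [
    "R&D Spend",
    "Administration",
    "Marketing Spend",
    "State_Florida",
    "State_New York",
]

# All columns as features.
print(select_feature_columns(columns))
# Only the first three, as in the slice above.
print(columns[0:3])
```

In the question, features is already a DataFrame that holds only the feature columns, so the shortest fix there would probably be VectorAssembler(inputCols=features.columns, outputCol="features").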

Hope this will work.


answered May 7, 2020 by MD
• 95,440 points
