ValueError: could not convert string to float in machine learning

+1 vote

Hi Guys,

I am trying to filter my dataset using the constant variable method, but it shows me the error below.

ValueError                                Traceback (most recent call last)
<ipython-input-10-d28793719248> in <module>
----> 1 model.fit(dataset)
~\anaconda3\lib\site-packages\sklearn\feature_selection\_variance_threshold.py in fit(self, X, y)
     67         """
     68         X = check_array(X, ('csr', 'csc'), dtype=np.float64,
---> 69                         force_all_finite='allow-nan')
     70 
     71         if hasattr(X, "toarray"):   # sparse matrix
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    529                     array = array.astype(dtype, casting="unsafe", copy=False)
    530                 else:
--> 531                     array = np.asarray(array, order=order, dtype=dtype)
    532             except ComplexWarning:
    533                 raise ValueError("Complex data not supported\n"
~\anaconda3\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 
ValueError: could not convert string to float: '208 Michael Ferry Apt. 674\nLaurabury, NE 37010-5101'

How can I solve this error?

Apr 14, 2020 in Machine Learning by akhtar
• 38,170 points
10,788 views

1 answer to this question.

0 votes

Hi @akhtar,

You are trying to filter your dataset with the constant variable method, but your dataset contains string values, as the error shows (the value it failed on is an address). The constant and quasi-constant methods only work on columns that contain numeric values.

To avoid this error, exclude the string columns before fitting, or use the correlation method to filter your features instead.
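As a rough illustration of restricting the constant variable filter to numeric columns, here is a minimal sketch. The DataFrame contents and column names (`age`, `flag`, `address`) are made up for the example and are not taken from the original dataset; `threshold=0` is assumed to be the setting wanted for removing constant columns.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: "address" is a string column, "flag" is constant.
dataset = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "flag": [1, 1, 1, 1],
    "address": ["208 Michael Ferry", "12 Oak St", "9 Elm Rd", "77 Pine Ave"],
})

# Keep only the numeric columns; VarianceThreshold cannot convert strings to float.
numeric = dataset.select_dtypes(include="number")

# threshold=0 drops columns whose variance is zero, i.e. constant columns.
selector = VarianceThreshold(threshold=0)
selector.fit(numeric)

# get_support() returns a boolean mask over the columns that were kept.
kept_columns = numeric.columns[selector.get_support()].tolist()
print(kept_columns)
```

Calling `model.fit(dataset)` on the full frame would raise the same ValueError as in the question, because the `address` column cannot be cast to float.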

Hope this will help.

answered Apr 14, 2020 by MD
• 95,060 points


logmodel.fit(X_train,y_train)
ValueError                                Traceback (most recent call last)
<ipython-input-39-0b508b2e1562> in <module>
----> 1 logmodel.fit(X_train,y_train)

~\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
   1525 
   1526         X, y = check_X_y(X, y, accept_sparse='csr', dtype=_dtype, order="C",
-> 1527                          accept_large_sparse=solver != 'liblinear')
   1528         check_classification_targets(y)
   1529         self.classes_ = np.unique(y)

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    753                     ensure_min_features=ensure_min_features,
    754                     warn_on_dtype=warn_on_dtype,
--> 755                     estimator=estimator)
    756     if multi_output:
    757         y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    529                     array = array.astype(dtype, casting="unsafe", copy=False)
    530                 else:
--> 531                     array = np.asarray(array, order=order, dtype=dtype)
    532             except ComplexWarning:
    533                 raise ValueError("Complex data not supported\n"

~\anaconda3\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: could not convert string to float: 'SOTON/O.Q. 3101307'

As mentioned above, you have to convert your string data to numeric values. For that, you can treat the column as a categorical variable: remove the string column from your features and pass it to the dummy variable function.

pd.get_dummies(X_train["string_column"])
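One possible way to put this together is sketched below. The data and the column names (`fare`, `embarked`) are hypothetical stand-ins, not the commenter's actual columns, and `drop_first=True` and `max_iter=1000` are choices made for this example only.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data; "embarked" stands in for a string column with few distinct values.
X_train = pd.DataFrame({
    "fare": [7.25, 71.28, 8.05, 53.10, 8.46, 51.86],
    "embarked": ["S", "C", "S", "S", "Q", "C"],
})
y_train = pd.Series([0, 1, 0, 1, 0, 1])

# Turn the string column into 0/1 indicator columns.
dummies = pd.get_dummies(
    X_train["embarked"], prefix="embarked", drop_first=True, dtype=int
)

# Remove the original string column and attach the indicator columns.
X_encoded = pd.concat([X_train.drop(columns="embarked"), dummies], axis=1)

# Every column is now numeric, so fit() no longer fails on string conversion.
logmodel = LogisticRegression(max_iter=1000)
logmodel.fit(X_encoded, y_train)
print(X_encoded.columns.tolist())
```

Dummy variables suit columns with a small number of distinct values. For a column where almost every value is unique, such as a ticket number or an address, dropping the column is probably the more practical choice, since encoding it would create roughly one new column per row.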
