ValueError help with Simple Exponential Smoothing analysis on my data set.

I'm very new, and attempting to teach myself Python through online resources.

I've attempted the following code I found online to conduct a Single Exponential Smoothing Analysis on my time series data:

from statsmodels.tsa.api import SimpleExpSmoothing

# Simple Exponential Smoothing
fit1 = SimpleExpSmoothing(Data).fit(smoothing_level=0.2,optimized=False)
fcast1 = fit1.forecast(12).rename(r'$\alpha=0.2$')
# plot
fcast1.plot(marker='o', color='blue', legend=True)
fit1.fittedvalues.plot(marker='o',  color='blue')

fit2 = SimpleExpSmoothing(Data).fit(smoothing_level=0.6,optimized=False)
fcast2 = fit2.forecast(12).rename(r'$\alpha=0.6$')
# plot
fcast2.plot(marker='o', color='red', legend=True)
fit2.fittedvalues.plot(marker='o', color='red')

fit3 = SimpleExpSmoothing(Data).fit()
fcast3 = fit3.forecast(12).rename(r'$\alpha=%s$'%fit3.model.params['smoothing_level'])
# plot
fcast3.plot(marker='o', color='green', legend=True)
fit3.fittedvalues.plot(marker='o', color='green')

This code returns the error: 

"Pandas data cast to numpy dtype of object. Check input data with np.asarray(data)."

I have researched online and attempted to create dummy variables within my dataset, however the result remains the same.

Month      Product Key  Income
01-Apr-18  A101         8289.05
01-Apr-18  A102         0.00

Above is a 3-row snippet of the dataset (the header plus two rows). Could the structure of the data be causing the forecast/SES code to fail?

Full ValueError traceback:

ValueError                                Traceback (most recent call last)
<ipython-input-6-c73edf2e1e6d> in <module>
      1 # Simple Exponential Smoothing
----> 2 fit1 = SimpleExpSmoothing(Data).fit(smoothing_level=0.2,optimized=False)
      3 fcast1 = fit1.forecast(12).rename(r'$\alpha=0.2$')
      4 # plot
      5 fcast1.plot(marker='o', color='blue', legend=True)

~\Anaconda3\lib\site-packages\statsmodels\tsa\ in __init__(self, endog)
   1005
   1006     def __init__(self, endog):
-> 1007         super(SimpleExpSmoothing, self).__init__(endog)
   1008
   1009     def fit(self, smoothing_level=None, optimized=True, start_params=None,

~\Anaconda3\lib\site-packages\statsmodels\tsa\ in __init__(self, endog, trend, damped, seasonal, seasonal_periods, dates, freq, missing)
    484                  seasonal_periods=None, dates=None, freq=None, missing='none'):
    485         super(ExponentialSmoothing, self).__init__(
--> 486             endog, None, dates, freq, missing=missing)
    487         if trend in ['additive', 'multiplicative']:
    488             trend = {'additive': 'add', 'multiplicative': 'mul'}[trend]

~\Anaconda3\lib\site-packages\statsmodels\tsa\base\ in __init__(self, endog, exog, dates, freq, missing, **kwargs)
     47                  missing='none', **kwargs):
     48         super(TimeSeriesModel, self).__init__(endog, exog, missing=missing,
---> 49                                               **kwargs)
     50
     51         # Date handling in indexes

~\Anaconda3\lib\site-packages\statsmodels\base\ in __init__(self, endog, exog, **kwargs)
    214
    215     def __init__(self, endog, exog=None, **kwargs):
--> 216         super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
    217         self.initialize()
    218

~\Anaconda3\lib\site-packages\statsmodels\base\ in __init__(self, endog, exog, **kwargs)
     66         hasconst = kwargs.pop('hasconst', None)
     67         = self._handle_data(endog, exog, missing, hasconst,
---> 68                             **kwargs)
     69         self.k_constant =
     70         self.exog =

~\Anaconda3\lib\site-packages\statsmodels\base\ in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
     89
     90     def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
---> 91         data = handle_data(endog, exog, missing, hasconst, **kwargs)
     92         # kwargs arrays could have changed, easier to just attach here
     93         for key in kwargs:

~\Anaconda3\lib\site-packages\statsmodels\base\ in handle_data(endog, exog, missing, hasconst, **kwargs)
    633     klass = handle_data_class_factory(endog, exog)
    634     return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
--> 635                  **kwargs)

~\Anaconda3\lib\site-packages\statsmodels\base\ in __init__(self, endog, exog, missing, hasconst, **kwargs)
     74         self.orig_endog = endog
     75         self.orig_exog = exog
---> 76         self.endog, self.exog = self._convert_endog_exog(endog, exog)
     77
     78         self.const_idx = None

~\Anaconda3\lib\site-packages\statsmodels\base\ in _convert_endog_exog(self, endog, exog)
    473         exog = exog if exog is None else np.asarray(exog)
    474         if endog.dtype == object or exog is not None and exog.dtype == object:
--> 475             raise ValueError("Pandas data cast to numpy dtype of object. "
    476                              "Check input data with np.asarray(data).")
    477         return super(PandasData, self)._convert_endog_exog(endog, exog)

ValueError: Pandas data cast to numpy dtype of object. Check input data with np.asarray(data).
Jul 30 in Python by Declan

edited Jul 31 125 views
Which part of the code is giving you this error?

I'll be honest: I don't know.

One line in the ValueError traceback, marked with a hash, is: # kwargs arrays could have changed, easier to just attach here

Does this mean anything to you? Alternatively, I could paste the entire ValueError into the post.

Thank you for your help.

It would be very helpful if you posted the entire error here.

Try converting your data to float using .astype().
I have edited the post to include the ValueError.
Try converting your data to float using .astype() before you use .fit()
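
A minimal sketch of that suggestion, assuming `Data` is a DataFrame with the three columns shown in the question. Only the first two rows come from the post; the other values are made up for illustration, and `Income` is stored as text here on purpose. The whole frame cannot be cast to float because `Month` and `Product Key` hold strings, so only the `Income` column is converted:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the poster's data; only the first two rows
# come from the question, the rest is invented for illustration.
Data = pd.DataFrame({
    "Month": ["01-Apr-18", "01-Apr-18", "01-May-18", "01-May-18"],
    "Product Key": ["A101", "A102", "A101", "A102"],
    "Income": ["8289.05", "0.00", "7912.40", "15.25"],  # text on purpose
})

# The whole frame converts to an object array, the condition that
# triggers the "cast to numpy dtype of object" ValueError.
whole_frame_dtype = np.asarray(Data).dtype
print(whole_frame_dtype)

# Cast only the numeric column to float.
income = Data["Income"].astype(float)
print(income.dtype)
```

A float Series like `income` is the kind of input the model expects, rather than the full DataFrame.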
Unfortunately this hasn't stopped the error, even after I removed the 'Product Key' column to simplify the source data.

Could this error be related to a library that I might need to install/import?
I don't think it's a missing library issue. It has to be a data type issue. Give me some time; I'll try the same on my system and get back to you.

Are you running this in PyCharm? (I just need this for information; it won't matter much.)
Hey @Declan, what form is your data in? Is it a DataFrame?
Hi, I'm currently using Jupyter, not PyCharm, and no, I haven't used the DataFrame format. The example I was following didn't say to apply it.

I will try it now and see if it changes anything
@Kyraa I have just checked: the pd.read_csv function creates a DataFrame from the source data, so if that is correct, my data is already in a DataFrame.
@Kyraa, I think I have found the problem. I tried a similar time series forecast in Power BI with the same dataset, and it returned 'not enough data points'. I'm unsure how many points are required for the SES model, so I'll run the same code on another set with more data points to see if the result changes.
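
For reference, a small sketch of what reading such a file might produce and how one numeric value per month could be extracted. The CSV text is hypothetical (only the first two data rows come from the question), and summing `Income` per month is an assumption about what the forecast should be based on:

```python
import io
import pandas as pd

# Hypothetical CSV text; only the first two data rows come from the question.
csv_text = """Month,Product Key,Income
01-Apr-18,A101,8289.05
01-Apr-18,A102,0.00
01-May-18,A101,7912.40
01-May-18,A102,15.25
"""

Data = pd.read_csv(io.StringIO(csv_text))
print(Data.dtypes)  # Month and Product Key are read as text, Income as float64

# Parse the dates, then total Income per month so that each period
# has a single numeric value.
Data["Month"] = pd.to_datetime(Data["Month"], format="%d-%b-%y")
monthly_income = Data.groupby("Month")["Income"].sum()
print(monthly_income)
```

Here `monthly_income` is a float Series indexed by date, unlike the full DataFrame, which still contains text columns.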


Yeah, try using a dataset with more data points. Also, try converting your data to a NumPy array as below before calling .fit():

fit2 = SimpleExpSmoothing(np.asarray(Data)).fit(smoothing_level=0.6,optimized=False)

Let me know what happens. 
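
Note that `np.asarray(Data)` only helps when every value passed in is numeric; forcing `dtype=float` makes a bad value fail immediately. The sketch below does that and then applies the textbook simple exponential smoothing recursion by hand. It is not statsmodels' implementation (its initialization and fitted values may differ), and the function name `simple_exp_smoothing` and the sample values are hypothetical:

```python
import numpy as np

def simple_exp_smoothing(values, alpha):
    """Return levels l_t = alpha * y_t + (1 - alpha) * l_{t-1}, with l_0 = y_0."""
    # dtype=float raises ValueError if any entry is not numeric.
    y = np.asarray(values, dtype=float)
    levels = np.empty_like(y)
    levels[0] = y[0]
    for t in range(1, len(y)):
        levels[t] = alpha * y[t] + (1 - alpha) * levels[t - 1]
    return levels

# Hypothetical monthly totals, for illustration only.
income = [10.0, 20.0, 30.0]
levels = simple_exp_smoothing(income, alpha=0.6)
print(levels)  # approximately 10.0, 16.0, 24.4

# Non-numeric input is rejected up front.
try:
    simple_exp_smoothing(["01-Apr-18", "A101"], alpha=0.6)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

A larger `alpha` weights recent observations more heavily, which is why the question's plots compare 0.2 and 0.6.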
