Linear regression returning bad fit with large x values

Question

I'm trying to use a linear regression to estimate the depletion date of a particular resource. I have a dataset containing a column of dates and several columns of data, all of which always decrease. A linear regression using scikit-learn's LinearRegression() yields a bad fit.

I converted the date column to ordinal values, which are around 700,000. Relative to the y values, which lie between 0 and 200, this is rather large. I imagine that the regression function is starting at low values and working its way up, eventually giving up before it finds a good enough fit. If I could assign starting values to the parameters (a large intercept and a small slope), perhaps that would fix the problem. I don't know how to do this, and I am very curious about other solutions.
Here is the code:

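from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# dates: 2-D array of ordinal dates, shape (n_samples, 1); y: observed values
model = LinearRegression().fit(dates, y)
print(model.score(dates, y))  # R^2 on the data used for fitting

y_pred = model.predict(dates)

# Plot the raw data and the fitted line on top of it
plt.scatter(dates, y)
plt.plot(dates, y_pred, color='red')
plt.show()

print(model.intercept_)
print(model.coef_)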

This code plots the linear model over the data, and the fit is strikingly inaccurate. I would share the plot in this post, but I am not sure how to post an image from my desktop.

My original data is dates, which I convert to ordinal in code I have not shared here. If there is an easier way to do this that would be more accurate, I would appreciate a suggestion.

Nandini · Answer 1 · Mar 23, 2022

To make the date values start at zero, subtract the minimum date (X) value from all date values.

There's no need to scale up by a factor of 1000. The above solution should work fine for your question.
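
As a rough sketch of the shift described above, something like the following might work. The sample data, the 10-day spacing, and the names raw_dates, x0, days and depletion_date are made up for illustration and are not taken from the question. It assumes the y values fall roughly linearly, so the depletion date is taken as the point where the fitted line crosses zero.

```python
from datetime import date, timedelta

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical sample data: one reading every 10 days, dropping 2 units each time.
start = date(2022, 1, 1)
raw_dates = [start + timedelta(days=10 * i) for i in range(10)]
y = np.array([200.0 - 2.0 * i for i in range(10)])

# Ordinals are in the hundreds of thousands; subtract the minimum so x starts at 0.
ordinals = np.array([d.toordinal() for d in raw_dates])
x0 = ordinals.min()
days = (ordinals - x0).reshape(-1, 1)  # scikit-learn expects a 2-D X

model = LinearRegression().fit(days, y)
slope = model.coef_[0]
intercept = model.intercept_

# The fitted line reaches y = 0 at -intercept / slope (only meaningful if slope < 0).
days_to_zero = -intercept / slope

# Add the offset back to turn the shifted x value into a calendar date again.
depletion_date = date.fromordinal(int(x0) + int(round(days_to_zero)))
print(depletion_date)
```

Because only x is shifted, the slope is unchanged by the subtraction; the intercept simply becomes the fitted value on the first date, which keeps both parameters on a scale comparable to y.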