Efficient online linear regression algorithm in Python

0 votes

I have a 2-D dataset with two columns, x and y. I would like to update the linear regression coefficient and intercept dynamically as new data comes in. Using scikit-learn, I can fit all currently available data like this:

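import numpy as np
from sklearn.linear_model import LinearRegression

regr = LinearRegression()
# scikit-learn expects a 2-D feature array of shape (n_samples, n_features)
x = np.arange(100).reshape(-1, 1)
y = np.arange(100) + 10 * np.random.random_sample((100,))
regr.fit(x, y)
print(regr.coef_)       # slope
print(regr.intercept_)  # intercept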

However, my dataset is fairly large (more than 10k rows in total), and I want to recompute the coefficient and intercept as fast as possible whenever new rows come in. Currently, fitting 10k rows takes about 600 microseconds, and I want to speed this up.

scikit-learn does not appear to have an online update function for its linear regression module. Is there a better way to do this?

Mar 21, 2022 in Machine Learning by Dev
• 6,000 points
1,233 views

1 answer to this question.

0 votes

scikit-learn's LinearRegression is already fast at this size: refitting all 10k samples takes well under a millisecond, so you can refit on the full data whenever new rows arrive.
The code and the timing I measured are below.

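import numpy as np
from sklearn.linear_model import LinearRegression

# reshape x into a single-feature column, as fit() requires
x = np.arange(10000).reshape(-1, 1)
y = np.arange(10000) + 100 * np.random.random_sample((10000,))
regr = LinearRegression()
# %timeit is an IPython/Jupyter magic, so run this line in a notebook or IPython shell
%timeit regr.fit(x, y)
# 419 µs ± 14.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)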
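
If refitting on every new row is still too slow, one possible alternative for the single-feature case in the question is to keep running statistics and update them per sample. The sketch below is not a scikit-learn API: the class name OnlineLinearRegression and its update, coef and intercept members are hypothetical names chosen for illustration. It assumes one x column, ordinary least squares, and at least two distinct x values before the coefficients are read.

```python
import numpy as np


class OnlineLinearRegression:
    """Hypothetical helper: one-feature least squares, updated per sample."""

    def __init__(self):
        self.n = 0          # number of samples seen so far
        self.mean_x = 0.0   # running mean of x
        self.mean_y = 0.0   # running mean of y
        self.s_xx = 0.0     # running sum of (x - mean_x)^2
        self.s_xy = 0.0     # running sum of (x - mean_x) * (y - mean_y)

    def update(self, x, y):
        """Fold one new (x, y) pair into the running statistics in O(1)."""
        self.n += 1
        dx = x - self.mean_x                  # deviation from the OLD mean of x
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        self.s_xx += dx * (x - self.mean_x)   # uses the NEW mean of x
        self.s_xy += dx * (y - self.mean_y)   # uses the NEW mean of y

    @property
    def coef(self):
        # Slope; needs at least two distinct x values, otherwise s_xx is 0.
        return self.s_xy / self.s_xx

    @property
    def intercept(self):
        return self.mean_y - self.coef * self.mean_x


# Usage: stream the same kind of data as above, one row at a time.
rng = np.random.default_rng(0)
x = np.arange(10000, dtype=float)
y = x + 100 * rng.random(10000)

model = OnlineLinearRegression()
for xi, yi in zip(x, y):
    model.update(xi, yi)

print(model.coef, model.intercept)
```

Each update costs a handful of arithmetic operations regardless of how many rows have been seen, and the result should agree with a full least-squares refit up to floating-point rounding. Tracking means and centered sums rather than raw sums of x*x and x*y is a design choice meant to limit rounding error when x grows large.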
answered Mar 23, 2022 by Nandini
• 5,480 points
