Logistic regression coefficient meaning

Question

I'm trying to write my own logistic regressor (using batch/mini-batch gradient descent) for practice purposes.

I generated a random dataset (see below) with normally distributed inputs and a binary output (0, 1). I chose the coefficients for the inputs by hand and was hoping to be able to reproduce them (see below for the code snippet). However, to my surprise, neither my own code nor sklearn's LogisticRegression was able to reproduce the actual numbers (although the sign and order of magnitude are in line). Moreover, the coefficients my algorithm produced differ from those produced by sklearn. Am I misinterpreting what the coefficients of a logistic regression are?

I would appreciate any insight into this discrepancy.

Thank you!

Edit: I tried using statsmodels Logit and got yet a third set of slightly different values for the coefficients.

Some more info that might be relevant: I wrote a linear regressor using almost identical code and it worked perfectly, so I am fairly confident this is not a problem in the code. Also, my regressor actually outperformed the sklearn one on the training set, and the two have exactly the same accuracy on the test set, so I have no reason to believe the regressors are wrong.

Code snippets for the generation of the dataset:

o1 = 2
o2 = -3
x[:,1]=np.random.rand(size)*2
x[:,2]=np.random.rand(size)*3
y = np.vectorize(sigmoid)(x[:,1]*o1+x[:,2]*o2 + np.random.normal(size=size))

As can be seen, the input coefficients are +2 and -3 (intercept 0). The sklearn coefficients were ~2.8 and ~-4.8; my coefficients were ~1.7 and ~-2.6.

Code snippet for the regressor (the most relevant parts of it):

for j in range(bin_size):
    xs = x[i]
    y_real = y[i]
    z = np.dot(self.coeff,xs)
    h = sigmoid(z)
    dc+= (h-y_real)*xs
self.coeff-= dc * (learning_rate/n)
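
As a reference point for the update rule above, the same step can be written in vectorized form. This is a minimal sketch, assuming the gradient is averaged over the mini-batch (the snippet divides by n, whose definition is not shown); minibatch_step and the demo arrays are hypothetical names introduced here, not part of the original code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def minibatch_step(coeff, x_batch, y_batch, learning_rate):
    """One gradient-descent step on the mean log-loss of a mini-batch.

    Hypothetical helper: x_batch has shape (m, d), y_batch has shape (m,)
    with values in {0, 1}, and coeff has shape (d,).
    """
    h = sigmoid(x_batch @ coeff)                      # predicted probabilities, shape (m,)
    grad = x_batch.T @ (h - y_batch) / len(y_batch)   # sum of (h - y) * x, averaged over the batch
    return coeff - learning_rate * grad

# Tiny usage example with made-up numbers (first column acts as the intercept).
x_demo = np.array([[1.0, 0.5, 2.0], [1.0, 1.5, 0.3]])
y_demo = np.array([0.0, 1.0])
coeff_demo = minibatch_step(np.zeros(3), x_demo, y_demo, learning_rate=0.1)
```

Accumulating (h - y_real) * xs in a loop and then scaling by learning_rate / n gives the same step, provided n equals the number of samples summed over.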

Nandini · Answer 1 · Mar 23, 2022

What intercept did the model learn? The result is hardly surprising, given that your y is a third-degree polynomial while your model has only two coefficients; three coefficients plus an intercept would be required to model the response variable from the predictors.
Furthermore, the values may differ because of SGD, for example: different coefficients can still yield the correct y for a finite number of points.
To rule out a failure of the iterative approach, use np.linalg.inv to solve the normal equation and inspect the coefficients. Also, check whether regularization was applied in statsmodels and/or sklearn.
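
One way to act on the two checks above (a solve that does not rely on SGD, and the effect of regularization) is sketched below. It is a minimal sketch, not how sklearn or statsmodels work internally. Logistic regression has no closed-form normal equation, so the sketch swaps in Newton's method (IRLS), where each step solves a weighted normal equation via np.linalg.solve. The name fit_logistic_newton, the penalty strength l2, the seed, and the Bernoulli sampling of y are assumptions made for illustration.

```python
import numpy as np

def fit_logistic_newton(X, y, l2=0.0, n_iter=25):
    """Fit logistic regression with Newton's method (IRLS).

    Each step solves the weighted normal equation
        (X^T W X + l2 * I) delta = X^T (y - p) - l2 * beta.
    l2 is a hypothetical penalty strength; l2=0.0 gives the plain
    maximum-likelihood fit. For simplicity the intercept is penalized too.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # predicted probabilities
        w = p * (1.0 - p)                       # diagonal of W
        hessian = X.T @ (X * w[:, None]) + l2 * np.eye(X.shape[1])
        gradient = X.T @ (y - p) - l2 * beta
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Synthetic data in the spirit of the question (assumed setup, fixed seed).
rng = np.random.default_rng(0)
size = 5000
X = np.column_stack([np.ones(size), rng.random(size) * 2, rng.random(size) * 3])
true_beta = np.array([0.0, 2.0, -3.0])
p_true = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (rng.random(size) < p_true).astype(float)   # Bernoulli draws, so y is binary

beta_mle = fit_logistic_newton(X, y)            # no penalty
beta_l2 = fit_logistic_newton(X, y, l2=50.0)    # with an L2 penalty
print("unpenalized: ", beta_mle)
print("L2-penalized:", beta_l2)
```

With y drawn this way, the unpenalized fit should land near the generating coefficients, while the penalized fit is pulled toward zero. For reference, sklearn's LogisticRegression applies an L2 penalty by default (C=1.0), whereas statsmodels' Logit.fit is unpenalized, so refitting sklearn with a very large C is a reasonable way to see how much of the gap comes from regularization.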