Linear Regression Normalization Vs Standardization

Question

I am using linear regression to make predictions, but I get completely different results depending on whether I normalize or standardize the variables.

Normalization = (x - xmin) / (xmax - xmin)

Z-score standardization = (x - xmean) / xstd
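For concreteness, the two formulas can be sketched in plain Python as below. The function names and sample data are hypothetical, and the population standard deviation (`pstdev`) is an assumption, since the question does not say which variant of the standard deviation is meant.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values to [0, 1] using (x - xmin) / (xmax - xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; min-max scaling is undefined")
    return [(x - lo) / (hi - lo) for x in values]


def z_score_standardize(values):
    """Rescale values to mean 0 and std 1 using (x - xmean) / xstd."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return [(x - mu) / sigma for x in values]


data = [2.0, 4.0, 6.0, 8.0]  # made-up sample
normalized = min_max_normalize(data)
standardized = z_score_standardize(data)
```

After normalization the smallest value maps to 0 and the largest to 1; after standardization the values have mean 0 and standard deviation 1.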

a) When should I normalize versus standardize?
b) How does normalization affect linear regression?
c) Is it okay if I don't normalize all the attributes/labels in the linear regression?

Dev · Answer 1 · Mar 8, 2022

Normalization (min-max scaling) rescales your data into a range between 0 and 1.

Standardization is the process of transforming your data so that it has a mean of 0 and a standard deviation of 1.

It's worth noting that the outcomes might not be so dissimilar. It's possible that the two methods merely require different hyperparameters to produce equivalent outcomes.
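As a rough illustration of why the outcomes need not differ, here is a minimal sketch for the simplest case: one feature, ordinary least squares with an intercept, and no regularization. The data and helper names are made up. In this setting the fitted coefficients change with the scaling, but the predictions do not. With regularization (ridge, lasso) or a gradient-based fit using fixed hyperparameters, the results can differ between the two scalings.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Closed-form simple least squares; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(x, slope, intercept):
    return [slope * xi + intercept for xi in x]


# Made-up data for illustration only.
x = [10.0, 20.0, 35.0, 50.0, 80.0]
y = [1.2, 2.1, 3.9, 5.2, 8.1]

# The same feature under the two scalings.
x_norm = [(xi - min(x)) / (max(x) - min(x)) for xi in x]
x_std = [(xi - mean(x)) / pstdev(x) for xi in x]

fit_norm = fit_line(x_norm, y)
fit_std = fit_line(x_std, y)

pred_norm = predict(x_norm, *fit_norm)
pred_std = predict(x_std, *fit_std)

# Largest gap between the two sets of predictions (floating-point noise only).
max_gap = max(abs(a - b) for a, b in zip(pred_norm, pred_std))
```

If your own results differ a lot between the two scalings, it may be worth checking whether the model is regularized, or whether the scaler was fitted on different data in each run.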

The best thing to do is to try both and see which works better for your situation. If you can't afford to try both for any reason, standardization will probably be more beneficial than normalization for most algorithms.

Normalization and standardization are both intended to achieve the same goal: to develop features with similar ranges. This allows us to ensure that we are capturing the genuine information in a feature and that we are not over-weighting a feature just because its values are considerably higher than those of other features.

There's no need to standardize/normalize if all of your features are within a reasonable range of one another. Normalization/standardization is required if some features naturally take on values that are substantially larger/smaller than others.
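To make the over-weighting concern concrete, here is a small sketch with made-up (age, income) rows. It uses Euclidean distance, which is where the effect is easiest to see. All names and numbers are hypothetical.

```python
import math

# Hypothetical rows: (age in years, income in dollars).
rows = {
    "A": (25.0, 50_000.0),
    "B": (60.0, 50_500.0),  # very different age, similar income
    "C": (26.0, 52_000.0),  # similar age, slightly different income
    "D": (40.0, 90_000.0),  # widens the income range
}


def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def min_max_columns(table):
    """Min-max scale each column of a {name: tuple} table to [0, 1]."""
    names = list(table)
    cols = list(zip(*table.values()))
    scaled_cols = [
        [(v - min(col)) / (max(col) - min(col)) for v in col] for col in cols
    ]
    return dict(zip(names, zip(*scaled_cols)))


# On the raw data, income dominates the distance because its values are larger.
raw_ab = euclidean(rows["A"], rows["B"])
raw_ac = euclidean(rows["A"], rows["C"])

# After scaling, both features contribute on a comparable footing.
scaled = min_max_columns(rows)
scaled_ab = euclidean(scaled["A"], scaled["B"])
scaled_ac = euclidean(scaled["A"], scaled["C"])
```

On the raw values, B looks closer to A than C does, purely because income is measured in larger numbers. After scaling, C is the closer row.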

If you're going to normalize at least one variable/feature, I'd recommend doing it for all of them as well.

For some examples of when one should be preferred over the other, see the following:

Standardization may be especially important in clustering analysis, where similarities between features are compared using certain distance measurements. Another well-known example is Principal Component Analysis, where we favor standardization over Min-Max scaling because we're looking for the components that maximize variance (depending on the question, and on whether the PCA computes the components via the correlation matrix instead of the covariance matrix).

This isn't to say, though, that Min-Max scaling isn't useful! Image processing is a common application in which pixel intensities must be normalized to fit within a given range (e.g., 0 to 255 for the RGB color range). Furthermore, many neural network algorithms expect data on a 0-1 scale.

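Compared with standardization, normalization has the disadvantage of losing some information in the data, particularly about outliers: because every value is squeezed into a fixed range, a single extreme value compresses all the other values into a narrow band.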
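A quick sketch of the outlier effect under min-max scaling, using made-up numbers:

```python
# Hypothetical data: four ordinary values and one extreme outlier.
values = [10.0, 12.0, 11.0, 13.0, 1000.0]

lo, hi = min(values), max(values)
normalized = [(v - lo) / (hi - lo) for v in values]

# The outlier is pinned to 1.0 and the four ordinary values are
# squeezed into a tiny slice at the bottom of the [0, 1] range.
inliers = normalized[:4]
inlier_spread = max(inliers) - min(inliers)
```

Here the four ordinary values end up occupying well under 1% of the range. Note that z-score standardization is also influenced by outliers through the mean and standard deviation, but it does not force the result into a fixed interval.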
