Linear Regression Normalization Vs Standardization

0 votes
I am using linear regression to predict data, but I get completely different results when I normalize versus standardize the variables.

Normalization = (x - xmin) / (xmax - xmin)
Z-score standardization = (x - xmean) / xstd
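As a minimal sketch of the two formulas above, using only the Python standard library: the function names and the `ages` data are made up for illustration, and the population standard deviation is an assumption (the sample standard deviation is also commonly used).

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values to the range 0 to 1: (x - xmin) / (xmax - xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant column")
    return [(x - lo) / (hi - lo) for x in values]


def z_score_standardize(values):
    """Shift to mean 0 and scale to standard deviation 1: (x - xmean) / xstd."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population std, not sample std
    if sigma == 0:
        raise ValueError("z-score scaling is undefined for a constant column")
    return [(x - mu) / sigma for x in values]


# Hypothetical feature column, for illustration only.
ages = [20.0, 30.0, 40.0, 50.0, 60.0]

normalized = min_max_normalize(ages)
standardized = z_score_standardize(ages)

print("min-max:", normalized)
print("z-score:", standardized)
```

Both transformations only shift and rescale the column; they do not change the order of the values or the shape of their distribution.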

a) When should I normalize and when should I standardize?
b) How does normalization affect linear regression?
c) Is it okay if I don't normalize all the attributes/labels in the linear regression?
Mar 4, 2022 in Machine Learning by Nandini
• 5,480 points

1 answer to this question.

0 votes
Your data is transformed into a range between 0 and 1 after normalization.

Standardization is the process of transforming your data into a distribution with a mean of 0 and a standard deviation of 1.

It's worth noting that the outcomes might not be so dissimilar. It's possible that the two methods merely require different hyperparameters to produce equivalent outcomes.
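On the point that the outcomes might not be so dissimilar: for plain least-squares linear regression with an intercept and no regularization, both scalings are just a shift and a rescale of the feature, so the fitted predictions should coincide even though the coefficients look very different. The sketch below illustrates this with one feature and made-up numbers; `fit_simple_ols` and `predict` are hypothetical helpers written for this example, not a library API. If you use regularization (ridge, lasso) or a gradient-descent solver, the two scalings can genuinely give different results.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to new feature values."""
    intercept, slope = model
    return [intercept + slope * xi for xi in x]


# Made-up data for illustration: one feature, one target.
x = [1200.0, 1500.0, 1700.0, 2100.0, 2600.0]
y = [200.0, 240.0, 275.0, 330.0, 410.0]

# Min-max normalization of the feature.
lo, hi = min(x), max(x)
x_norm = [(xi - lo) / (hi - lo) for xi in x]

# Z-score standardization of the feature (population std assumed).
mu = mean(x)
sigma = (sum((xi - mu) ** 2 for xi in x) / len(x)) ** 0.5
x_std = [(xi - mu) / sigma for xi in x]

model_norm = fit_simple_ols(x_norm, y)
model_std = fit_simple_ols(x_std, y)
pred_norm = predict(model_norm, x_norm)
pred_std = predict(model_std, x_std)

# Coefficients differ, but the predictions agree up to rounding error.
print("coefficients (min-max):", model_norm)
print("coefficients (z-score):", model_std)
print("max prediction gap:", max(abs(a - b) for a, b in zip(pred_norm, pred_std)))
```

So if only the coefficients differ between the two runs, that is expected: each coefficient is expressed in the units of its scaled feature. If the predictions differ as well, check whether the target was also scaled, or whether the model is regularized.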

The best thing to do is to try both and see which works better for your problem. If you can't afford to experiment for any reason, standardization will probably be more beneficial than normalization for most algorithms.

Normalization and standardization are both intended to achieve the same goal: to develop features with similar ranges. This allows us to ensure that we are capturing the genuine information in a feature and that we are not over-weighting a feature just because its values are considerably higher than those of other features.

There's no need to standardize/normalize if all of your features are within a reasonable range of one another. Normalization/standardization is required if some features naturally take on values that are substantially larger/smaller than others.

If you're going to normalize at least one variable/feature, I'd recommend doing it for all of them as well.

For some examples of when one should be preferred over the other, see the following:

Standardization may be especially important in clustering analysis, where similarities between features are compared using distance measures. Another well-known example is Principal Component Analysis, where standardization is usually favored over min-max scaling because we are looking for the components that maximize variance (this depends on the question, and on whether the PCA computes the components via the correlation matrix instead of the covariance matrix).

This isn't to say that min-max scaling isn't useful. Image processing is a common application, where pixel intensities have to be normalized to fit within a given range (e.g., 0 to 255 for the RGB color range). In addition, many neural network algorithms expect input data on a 0-1 scale.

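Compared to standardization, normalization has the disadvantage of being more sensitive to outliers: a single extreme value sets the minimum or maximum, so the remaining values get squeezed into a small part of the 0-1 range. Standardization does not bound the values, so outliers stay visible as large z-scores, although they still influence the mean and standard deviation.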
answered Mar 8, 2022 by Dev
• 6,000 points
