I am using linear regression to predict data, but I am getting very different results when I normalize versus standardize the variables.

Normalization: x' = (x − x_min) / (x_max − x_min)
Z-score standardization: x' = (x − x_mean) / x_std

a) When should I normalize versus standardize?
b) How does normalization affect linear regression?
c) Is it okay if I don't normalize all the attributes/labels in the linear regression?
Mar 4, 2022

## 1 answer to this question.

Normalization transforms your data into a range between 0 and 1.

Standardization transforms your data into a distribution with a mean of 0 and a standard deviation of 1. It's worth noting that the outcomes might not be very different: the two methods may simply require different hyperparameters to produce equivalent results.
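As a minimal sketch in plain Python (no libraries needed; the sample values are made up for illustration), the two transforms from the question's formulas look like this:

```python
from statistics import mean, pstdev

x = [1.0, 2.0, 3.0, 4.0, 5.0]

def normalize(values):
    """Min-max normalization: rescale values into the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score standardization: shift to mean 0, scale to std 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

print(normalize(x))    # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardize(x))  # mean ~0, population std ~1
```

In practice you would use `MinMaxScaler` and `StandardScaler` from scikit-learn, which also remember the fitted parameters so the same transform can be applied to test data.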

The best thing to do is to test both and see what works for your situation. If you can't test both for some reason, standardization will probably be more beneficial than normalization for most algorithms.

Normalization and standardization are both intended to achieve the same goal: to produce features with similar ranges. This ensures that we capture the genuine information in a feature and do not over-weight a feature simply because its values are considerably larger than those of other features.

There's no need to standardize/normalize if all of your features are within a reasonable range of one another. Normalization/standardization is required if some features naturally take on values that are substantially larger/smaller than others.
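A quick illustration of why differing ranges matter (the income/age numbers here are hypothetical): any computation that mixes raw feature values, such as a Euclidean distance, is dominated by the large-valued feature.

```python
import math

# Two people described by (annual income, age).
a = (50_000.0, 25.0)
b = (52_000.0, 60.0)

# The unscaled distance is almost entirely the income difference;
# the 35-year age gap barely registers.
d = math.dist(a, b)
print(round(d, 1))  # 2000.3
```

After normalizing or standardizing both features, the age difference would contribute on the same order as the income difference.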

If you're going to normalize at least one variable/feature, I'd recommend doing it for all of them.

For some examples of when one should be preferred over the other, see the following:

- Standardization may be especially important in clustering analysis, where similarities between features are computed from distance measurements. Another well-known example is Principal Component Analysis, where we favor standardization over min-max scaling because we're looking for the components that maximize variance (depending on the question, and on whether the PCA computes the components via the correlation matrix instead of the covariance matrix).
- This isn't to say, though, that min-max scaling isn't useful! Image processing is a common application in which pixel intensities must be rescaled to fit within a given range (e.g., 0 to 255 for the RGB color range). Furthermore, most neural network algorithms work best with data on a 0–1 scale.

Normalization has the disadvantage of losing some data information, particularly about outliers, when compared to standardization.
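The outlier problem is easy to see with made-up numbers: a single extreme value sets the min-max range, crushing every other value into a narrow band and losing resolution.

```python
# One outlier (100.0) compresses the normalized values of everything else.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

lo, hi = min(data), max(data)
normalized = [(v - lo) / (hi - lo) for v in data]
print(normalized)
# The first four values end up squeezed into ~3% of the [0, 1] range.
```

Standardization is also shifted by the outlier, but it does not hard-clip the rest of the data into such a tiny interval.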
