Screening for (multi)collinearity in a regression model

Question

I hope this doesn't turn into an "ask-and-answer" question, but here goes: (multi)collinearity refers to extremely high correlations between predictors in a regression model. How do you cure it? Well, sometimes you don't need to "cure" collinearity at all, since it doesn't affect the fit of the regression model as a whole, only the interpretation of the effects of individual predictors.

One way to spot collinearity is to treat each predictor in turn as the dependent variable, regress it on the remaining predictors, and compute R²; if it is larger than .9 (or .95), the predictor can be considered redundant. That is one "method"... what about other approaches? Some of them are time-consuming, such as excluding predictors from the model one at a time and watching how the b-coefficients change; under collinearity they should change noticeably.
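For concreteness, here is a minimal sketch of the auxiliary-regression check described above. The data frame `d` and its columns `x1` to `x4` are made up for illustration, and the .9 cutoff is just the rule of thumb mentioned in the question, not a standard.

```r
set.seed(42)
# Hypothetical data: x3 is almost an exact linear combination of x1 and x2
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$x3 <- d$x1 + 2 * d$x2 + rnorm(100) * 0.1
d$x4 <- rnorm(100)                      # unrelated predictor

# Regress each predictor on all the others and keep the R-squared
aux_r2 <- sapply(names(d), function(v) {
  fit <- lm(reformulate(setdiff(names(d), v), response = v), data = d)
  summary(fit)$r.squared
})

round(aux_r2, 3)
flagged <- names(aux_r2)[aux_r2 > 0.9]  # predictors above the .9 cutoff
flagged
```

Note that in this made-up data x1, x2 and x3 all get flagged, because each of them is nearly determined by the other two; the check tells you which predictors are involved, not which one to drop. The same R² is what the variance inflation factor, 1/(1 - R²), is built on.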

Of course, we must always bear in mind the specific context and goal of the analysis... Sometimes the only remedy is to repeat the study, but right now I'm interested in the various ways of screening for redundant predictors when (multi)collinearity occurs in a regression model.

Dev · Answer 1 · Mar 30, 2022

The kappa() function can help: it computes (or estimates) the condition number of a matrix, here the model matrix. Here's a simulated example. See help(kappa) for details.

> set.seed(42)
> y1 <- rnorm(100)
> y2 <- rnorm(100)
> y3 <- y1 + 2*y2 + rnorm(100)*0.0001    # y3 is approx. a linear comb. of y1 and y2
> mm12 <- model.matrix(~ y1 + y2)        # normal model, two indep. regressors
> mm123 <- model.matrix(~ y1 + y2 + y3)  # bad model with near collinearity
> kappa(mm12)                            # a 'low' kappa is good
[1] 1.166029
> kappa(mm123)                           # a 'high' kappa not good
[1] 121530.7

We can go even further by making the third regressor more collinear:

> y4 <- y1 + 2*y2 + rnorm(100)*0.000001  # even more collinear
> mm124 <- model.matrix(~ y1 + y2 + y4)
> kappa(mm124)
[1] 13955982
> y5 <- y1 + 2*y2                        # now y5 is linear comb of y1,y2
> mm125 <- model.matrix(~ y1 + y2 + y5)
> kappa(mm125)
[1] 1.067568e+16
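
To make the numbers above less of a black box, here is a small sketch of what the condition number measures: the ratio of the largest to the smallest singular value of the model matrix. It rebuilds mm123 from the example above. As far as I know, kappa() by default returns a fast estimate rather than the exact value, so I pass exact = TRUE for the comparison; the names sv, cond_svd and cond_exact are just mine.

```r
set.seed(42)
y1 <- rnorm(100)
y2 <- rnorm(100)
y3 <- y1 + 2*y2 + rnorm(100)*0.0001      # nearly a linear comb. of y1 and y2
mm123 <- model.matrix(~ y1 + y2 + y3)

sv <- svd(mm123, nu = 0, nv = 0)$d       # singular values only
cond_svd   <- max(sv) / min(sv)          # 2-norm condition number by hand
cond_exact <- kappa(mm123, exact = TRUE) # same quantity via kappa()

all.equal(cond_svd, cond_exact)          # the two should agree
```

A tiny smallest singular value means some linear combination of the columns is almost zero, which is exactly the near-collinearity built into y3. The exact value need not match the default kappa() output printed earlier, since that one is an estimate.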