I hope this one is not going to be an "ask-and-answer" question... here goes: (multi)collinearity refers to extremely high correlations between predictors in a regression model. How to cure it... well, sometimes you don't need to "cure" collinearity, since it doesn't affect the regression model as a whole, only the interpretation of the effects of individual predictors.

One way to spot collinearity is to take each predictor in turn as the dependent variable, with the other predictors as independent variables, and determine R². If it's larger than .9 (or .95), we can consider that predictor redundant. This is one "method"... what about other approaches? Some of them are time-consuming, like excluding predictors from the model one at a time and watching the b-coefficients of the remaining predictors: when collinearity is present, they should change noticeably.
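The R² screen described above can be sketched in base R roughly as follows. The data are simulated and the names (`x1`, `x2`, `x3`, `aux_r2`, `flagged`) are made up for illustration; the 0.9 cutoff is the one mentioned in the question. Note that this kind of screen tends to flag every predictor involved in a near-linear relation, not only the one you think of as "extra".

```r
# Simulated predictors (hypothetical): x3 is nearly a linear combination of x1 and x2
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + 2 * x2 + rnorm(n) * 0.01
X  <- data.frame(x1, x2, x3)

# Regress each predictor on all the others and keep the R-squared
aux_r2 <- sapply(names(X), function(v) {
  fit <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  summary(fit)$r.squared
})

# Variance inflation factor: the same information on a different scale
vif <- 1 / (1 - aux_r2)

# Predictors above the cutoff from the text (0.9; use 0.95 to be stricter)
flagged <- names(aux_r2)[aux_r2 > 0.9]

print(round(aux_r2, 4))
print(flagged)
```

An R² of .9 corresponds to a variance inflation factor of 10, so this is the same screen as the common "VIF above 10" rule of thumb.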

Of course, we must always bear in mind the specific context and goal of the analysis... Sometimes the only remedy is to repeat the study, but right now I'm interested in the various ways of screening for redundant predictors when (multi)collinearity occurs in a regression model.

Asked Mar 26

## 1 answer to this question.

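The kappa() function can help: it computes (by default, estimates) the condition number of a matrix, which becomes large when the columns are nearly linearly dependent. Here's a simulated example; see help(kappa) for details.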

```
> set.seed(42)
> y1 <- rnorm(100)
> y2 <- rnorm(100)
> y3 <- y1 + 2*y2 + rnorm(100)*0.0001    # y3 is approx. a linear comb. of y1 and y2
> mm12 <- model.matrix(~ y1 + y2)        # normal model, two indep. regressors
> mm123 <- model.matrix(~ y1 + y2 + y3)  # bad model with near collinearity
> kappa(mm12)                            # a 'low' kappa is good
[1] 1.166029
> kappa(mm123)                           # a 'high' kappa is not good
[1] 121530.7
```
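A large kappa says that some columns are nearly dependent, but not which ones. One possible way to see that, sketched here as an assumption rather than as part of the original answer, is to look at the right singular vector belonging to the smallest singular value of the model matrix: its large entries point at the columns involved. The names `kappa_exact`, `v_min` and `relation` are made up for this sketch.

```r
# Rebuild the near-collinear model matrix from the example above
set.seed(42)
y1 <- rnorm(100)
y2 <- rnorm(100)
y3 <- y1 + 2*y2 + rnorm(100)*0.0001
mm123 <- model.matrix(~ y1 + y2 + y3)

# exact = TRUE computes the 2-norm condition number from the singular values
# instead of the cheaper default estimate
kappa_exact <- kappa(mm123, exact = TRUE)

# Singular vector for the smallest singular value: the combination of
# columns that is closest to zero
s <- svd(mm123)
v_min <- s$v[, which.min(s$d)]
names(v_min) <- colnames(mm123)

# Rescale so the y3 entry is 1; the other entries then give the
# approximate linear relation among the columns
relation <- v_min / v_min["y3"]
print(kappa_exact)
print(round(relation, 3))
```

In this simulated case the rescaled vector should come out close to -1 for y1, -2 for y2 and about 0 for the intercept, which is the relation y3 ≈ y1 + 2*y2 that was built into the data.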

We can go even further by making the third regressor more and more collinear, up to an exact linear combination:

```
> y4 <- y1 + 2*y2 + rnorm(100)*0.000001  # even more collinear
> mm124 <- model.matrix(~ y1 + y2 + y4)
> kappa(mm124)
[1] 13955982
> y5 <- y1 + 2*y2                        # now y5 is an exact linear comb. of y1 and y2
> mm125 <- model.matrix(~ y1 + y2 + y5)
> kappa(mm125)
[1] 1.067568e+16
```

Answered Mar 30