I have a data.frame that includes factor and numeric variables, as may be seen below.

testFrame <- data.frame(First = sample(1:10, 20, replace = TRUE), Second = sample(1:20, 20, replace = TRUE), Third = sample(1:10, 20, replace = TRUE), Fourth = rep(c("Alice", "Bob", "Charlie", "David"), 5), Fifth = rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4))

I want to construct a matrix that assigns dummy variables to the factors and leaves the numeric variables alone.

model.matrix(~ First + Second + Third + Fourth + Fifth, data = testFrame)

As expected when running lm, this drops one level of each factor as the reference level. But I want a matrix that includes a dummy (indicator) variable for every level of every factor. I am not concerned about multicollinearity because I am building this matrix for glmnet.

Is there a way to have model.matrix create the dummy for every level of the factor?
Jul 9, 2022 414 views

## 1 answer to this question.

Yes. By default, model.matrix() uses treatment contrasts, which create a dummy variable for every level of a factor except one (the reference level). To get a dummy variable for every level, pass full (identity) contrast matrices through the contrasts.arg argument of model.matrix(). Here's an example:

```
# Numeric columns: 20 draws each, sampled with replacement
testFrame <- data.frame(First = sample(1:10, 20, replace = TRUE),
                        Second = sample(1:20, 20, replace = TRUE),
                        Third = sample(1:10, 20, replace = TRUE))
Fourth <- rep(c("Alice", "Bob", "Charlie", "David"), 5)
Fifth <- rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4)
testFrame$Fourth <- as.factor(Fourth)
testFrame$Fifth <- as.factor(Fifth)

# Build a full (identity) contrast matrix for every factor column so no level is dropped
factorCols <- testFrame[sapply(testFrame, is.factor)]
dummyMatrix <- model.matrix(~ ., data = testFrame,
                            contrasts.arg = lapply(factorCols, contrasts, contrasts = FALSE))
```

In this example, Fourth and Fifth are converted to factors and added to testFrame, which is then passed to model.matrix(). The contrasts.arg argument receives a list, built with lapply(), that holds the result of contrasts() with contrasts = FALSE for each factor column in testFrame. Each of these is an identity matrix with one column per level, so a dummy variable is created for every level of each factor.
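To see what contrasts = FALSE changes, it may help to compare the two contrast matrices for a single factor. This is a minimal sketch in base R; the factor `f` and the names `treatment` and `full` are made up for illustration and are not part of the original data.

```r
# Hypothetical three-level factor, used only for illustration
f <- factor(c("a", "b", "c"))

# Default treatment coding: the first level is the reference and gets no column
treatment <- contrasts(f)

# contrasts = FALSE returns an identity matrix: one indicator column per level
full <- contrasts(f, contrasts = FALSE)

dim(treatment)  # 3 rows, 2 columns
dim(full)       # 3 rows, 3 columns
```

Passing the second form to contrasts.arg is what keeps the reference level in the design matrix.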

The resulting dummyMatrix will include dummy variables for every level of each factor variable, while leaving the numeric variables unchanged.
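As a quick check, the sketch below rebuilds the example data and confirms that every factor level gets its own column. It uses base R only. The fixed seed and the names `fullContrasts` and `x` are assumptions added for illustration. Dropping the `(Intercept)` column before passing the matrix to glmnet is a suggestion, based on glmnet fitting its own intercept by default, and is not something the question asked for.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible

testFrame <- data.frame(
  First  = sample(1:10, 20, replace = TRUE),
  Second = sample(1:20, 20, replace = TRUE),
  Third  = sample(1:10, 20, replace = TRUE),
  Fourth = factor(rep(c("Alice", "Bob", "Charlie", "David"), 5)),
  Fifth  = factor(rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4))
)

# One identity ("no contrast") matrix per factor column
isFactor <- sapply(testFrame, is.factor)
fullContrasts <- lapply(testFrame[isFactor], contrasts, contrasts = FALSE)

dummyMatrix <- model.matrix(~ ., data = testFrame, contrasts.arg = fullContrasts)

# Columns: an intercept, the three numeric variables,
# then one indicator per level of Fourth (4) and Fifth (5)
colnames(dummyMatrix)

# Drop the intercept column before handing the matrix to glmnet
x <- dummyMatrix[, colnames(dummyMatrix) != "(Intercept)"]
dim(x)
```

Each row of `x` has exactly one 1 among the Fourth columns and one among the Fifth columns, which is an easy property to check if the output looks wrong.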
