How to export regression equations for grouped data

I have a data frame PlotData_df with three columns: Velocity (numeric), Height (numeric), and Gender (categorical).

   Velocity Height Gender
1       4.1    3.0   Male
2       3.1    4.0 Female
3       3.9    2.4 Female
4       4.6    2.8   Male
5       4.1    3.3 Female
6       3.1    3.2 Female
7       3.7    3.0   Male
8       3.6    2.4   Male
9       3.2    2.7 Female
10      4.2    2.5   Male

I used the following to get the regression equation for the complete data:

c <- lm(Height ~ Velocity, data = PlotData_df)

summary(c)
#             Estimate Std. Error t value Pr(>|t|)   
# (Intercept)   4.1283     1.0822   3.815  0.00513 **
# Velocity     -0.3240     0.2854  -1.135  0.28915   
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# Residual standard error: 0.4389 on 8 degrees of freedom
# Multiple R-squared:  0.1387,  Adjusted R-squared:  0.03108 
# F-statistic: 1.289 on 1 and 8 DF,  p-value: 0.2892

a <- signif(coef(c)[1], digits = 2)
b <- signif(coef(c)[2], digits = 2)
# The model above regresses Height on Velocity, so the label follows that direction
Regression <- paste0("Height = ", b, " * Velocity + ", a)
print(Regression)
# [1] "Height = -0.32 * Velocity + 4.1"

How can I extend this to display two regression equations (depending on whether Gender is Male or Female)?

Mar 7 in Machine Learning by Nandini

1 answer to this question.

First, fit a linear model with an interaction between Height and Gender, so that each gender gets its own intercept and slope. Try:

fit <- lm(formula = Velocity ~ Height * Gender, data = PlotData_df)
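
To see why this single model gives one line per gender, it can help to write out what the formula above expands to. The sketch below assumes R's default treatment coding with Female as the baseline level (factor levels are sorted alphabetically unless set otherwise), so Male is a 0/1 indicator; the symbols \beta_0 to \beta_3 are just labels for the four coefficients in the order R reports them.

```latex
\begin{aligned}
\text{Velocity} &= \beta_0 + \beta_1\,\text{Height} + \beta_2\,\text{Male}
                 + \beta_3\,(\text{Height} \times \text{Male}) + \varepsilon \\
\text{Female } (\text{Male} = 0):\quad
\text{Velocity} &= \beta_0 + \beta_1\,\text{Height} \\
\text{Male } (\text{Male} = 1):\quad
\text{Velocity} &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,\text{Height}
\end{aligned}
```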

What comes next depends on how you want to display the fitted regression equation. Because the coefficients are plugged in as numbers, you need two separate equations, one for males and one for females. The code below shows how to obtain them.

## formatted coefficients
theta <- signif(coef(fit), digits = 2)
# (Intercept)  Height  GenderMale  Height:GenderMale
#        4.42   -0.30       -1.01               0.54 

## equation for Female:
eqn_Female <- paste0("Velocity = ", theta[2], " * Height + ", theta[1])
# [1] "Velocity = -0.30 * Height + 4.42"

## equation for Male:
eqn_Male <- paste0("Velocity = ", theta[2] + theta[4], " * Height + ", theta[1] + theta[3])
# [1] "Velocity = 0.24 * Height + 3.41"


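Female is the baseline level of Gender, so its equation uses theta[1] and theta[2] directly. For Male, the slope is theta[2] + theta[4] and the intercept is theta[1] + theta[3]. For background, read up on treatment contrasts for factor variables in linear models and ANOVA.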
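
As a cross-check on the sums above, here is a minimal sketch that fits one simple regression per gender and confirms that the interaction model reproduces the same intercepts and slopes. It uses only the ten rows shown in the question, so the printed numbers need not match the output quoted in this thread. The helper make_eqn is a hypothetical name introduced for illustration, not part of any package.

```r
# The ten sample rows from the question
PlotData_df <- data.frame(
  Velocity = c(4.1, 3.1, 3.9, 4.6, 4.1, 3.1, 3.7, 3.6, 3.2, 4.2),
  Height   = c(3.0, 4.0, 2.4, 2.8, 3.3, 3.2, 3.0, 2.4, 2.7, 2.5),
  Gender   = factor(c("Male", "Female", "Female", "Male", "Female",
                      "Female", "Male", "Male", "Female", "Male"))
)

# Hypothetical helper: build the equation string from an intercept and a slope
make_eqn <- function(intercept, slope, digits = 2) {
  paste0("Velocity = ", signif(slope, digits), " * Height + ",
         signif(intercept, digits))
}

# One simple regression per gender (list names come from the factor levels)
fits <- lapply(split(PlotData_df, PlotData_df$Gender),
               function(d) lm(Velocity ~ Height, data = d))

# One equation string per gender
eqns <- vapply(fits,
               function(m) make_eqn(coef(m)[[1]], coef(m)[[2]]),
               character(1))
print(eqns)

# Cross-check against the interaction model: baseline (Female) coefficients
# are used as-is, Male adds the GenderMale and Height:GenderMale terms
fit <- lm(Velocity ~ Height * Gender, data = PlotData_df)
b <- unname(coef(fit))
stopifnot(
  isTRUE(all.equal(unname(coef(fits$Female)), b[1:2])),
  isTRUE(all.equal(unname(coef(fits$Male)), c(b[1] + b[3], b[2] + b[4])))
)
```

Note that this sketch rounds after adding the coefficients, whereas summing already-rounded values can shift the last digit slightly. A negative intercept would print as "+ -1.2"; adjust the string formatting if that matters for display.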

answered Mar 14 by Dev