Plotting logistic regression in R with the Smarket dataset

0 votes

I'm trying to plot a simple logistic regression in R for the Smarket data set (in the "ISLR" library). I've fit the model with glm and obtained the deviance residuals and coefficients, but I can't find a simple way to visualize the fitted logistic regression.

For more context, I'm using "An Introduction to Statistical Learning". This is an extension of the exercise on pages 156-160 (by the way, this is not a school requirement, just me trying to figure it out). I imagine this is a relatively easy problem, but I'm new to R and can't seem to get it.

Thanks.

The code I've used is provided below:

glm.fit=glm(Direction~Lag1+Lag2+Lag3+Lag4+Lag5+Volume , data = Smarket, family=binomial)
summary(glm.fit)
coef(glm.fit)

Is there a way to create a visualization of the logistic regression? I have not included the code that loads the library; if it is needed, let me know and I will add it.

Thanks!

Apr 11, 2022 in Machine Learning by Nandini
• 5,480 points
969 views

1 answer to this question.

0 votes

Here are two ways to visualize the results of a logistic regression. The first is borrowed from the code for Chapter 5 of Gelman and Hill's Data Analysis Using Regression and Multilevel/Hierarchical Models. The code below uses some functions from the arm package (which accompanies the book):

library(arm)   # provides invlogit()
library(ISLR)  # provides the Smarket data

Plot the probability of Direction="Up" over a range of Lag1 values for three different Volume values (the other Lag values are set to 0, but you can change them if you want):

# Function to jitter class category values
jitter.binary <- function(a, jitt=.05){
  ifelse (a-1==0, runif(length(a), 0, jitt), runif(length(a), 1-jitt, 1))
}

# Sequence of Lag1 values for plotting
x = seq(-5,5.7,length.out=100)

# Plot jittered Direction vs. Lag 1. This shows the actual distribution of the data.
with(Smarket, plot(Lag1, jitter.binary(as.numeric(Direction)), pch=16,cex=0.7,
                   ylab="Pr(Up)", xlab="Lag 1"))

# Add model prediction curves. These show the probability of Direction="Up" vs. Lag 1
# for three fixed values of Volume: 1.48, 0.36, and 3.15 (roughly its mean, minimum, and maximum).
curve(expr=invlogit(cbind(1, x,0,0,0,0,1.48) %*% coef(glm.fit)), 
      from=-5, to=5.7, lwd=.5, add=TRUE)
curve(expr=invlogit(cbind(1, x, 0,0,0,0, 0.36) %*% coef(glm.fit)), 
      from=-5, to=5.7, lwd=.5, add=TRUE, col="red", lty=2)
curve(expr=invlogit(cbind(1, x, 0,0,0,0, 3.15) %*% coef(glm.fit)), 
      from=-5, to=5.7, lwd=.5, add=TRUE, col="blue", lty=2)
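
The curves above are built by multiplying a hand-assembled design matrix by the coefficient vector. As a hedged alternative, the same kind of curve can be obtained by letting predict() do that work on a grid of predictor values. The sketch below assumes the ISLR package is installed (for Smarket); the data frame name `grid`, the choice of 0 for the other lags, and the mean of Volume are illustrative choices, not part of the original answer.

```r
# Assumes the ISLR package (which provides Smarket) is installed.
library(ISLR)

# Same model as in the question
glm.fit <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
               data = Smarket, family = binomial)

# Grid of Lag1 values; the other lags are held at 0 and Volume at its mean
# (illustrative fixed values, chosen to mirror the plot above)
grid <- data.frame(Lag1 = seq(-5, 5.7, length.out = 100),
                   Lag2 = 0, Lag3 = 0, Lag4 = 0, Lag5 = 0,
                   Volume = mean(Smarket$Volume))

# Predicted probability of Direction == "Up" at each grid point
grid$prob <- predict(glm.fit, newdata = grid, type = "response")

# Draw the fitted curve on its own axes
plot(grid$Lag1, grid$prob, type = "l", ylim = c(0, 1),
     xlab = "Lag 1", ylab = "Pr(Up)")
```

Because predict() matches columns by name, this version does not depend on the order of the coefficients, which makes it harder to misalign a value with the wrong predictor.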


Plot Lag1 vs. Volume, color the points according to the value of Direction, and add the decision boundary. Because the decision boundary for your actual regression is a five-dimensional hyperplane, I've fit a new regression with just these two predictors. (You can still plot the decision boundary in two dimensions for models with many predictors by taking 2D slices through the multidimensional predictor space.)

# New regression model
fit2 = glm(Direction ~ Lag1 + Volume , data = Smarket, family=binomial)

# Probability of Direction="Up" for this model
Smarket$Pred2 = predict(fit2, type="response")

# Set Prediction to "Up" for probability > 0.5; "Down" otherwise.
Smarket$PredCat2 = cut(Smarket$Pred2, c(0,0.5,1), include.lowest=TRUE, labels=c("Down","Up"))

# Graph Lag1 vs. Volume with coloring and point-style based on value
# of Direction
with(Smarket, plot(Lag1, Volume, pch=ifelse(Direction=="Down", 3, 1), 
                   col=ifelse(Direction=="Down", "red", "blue"), cex=0.6))

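# Add the decision boundary: the line where the linear predictor is 0,
# i.e. Volume = -(b0 + b1*Lag1)/b2, so the predicted probability is 0.5
curve(expr= -(cbind(1, x) %*% coef(fit2)[1:2])/coef(fit2)[3],
      from=-5,to=5.7, add=TRUE)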
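
For the full six-predictor model, the 2D slice mentioned above can be sketched by predicting on a Lag1 by Volume grid while the remaining predictors are held fixed, then drawing the contour where the predicted probability equals 0.5. This is only an illustration: it assumes the ISLR package is installed, and the names `lag1.seq`, `vol.seq`, `grid`, and `prob`, the Volume range, and the choice of 0 for Lag2 through Lag5 are assumptions made for the example.

```r
# Assumes the ISLR package (which provides Smarket) is installed.
library(ISLR)

# Full model from the question
glm.fit <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
               data = Smarket, family = binomial)

# Grid over the two predictors shown on the axes
lag1.seq <- seq(-5, 5.7, length.out = 100)
vol.seq  <- seq(0.3, 3.2, length.out = 100)
grid <- expand.grid(Lag1 = lag1.seq, Volume = vol.seq)

# Hold the remaining predictors at 0 (one illustrative slice)
grid$Lag2 <- 0; grid$Lag3 <- 0; grid$Lag4 <- 0; grid$Lag5 <- 0

# Predicted Pr(Up), reshaped so rows follow lag1.seq and columns follow vol.seq
prob <- matrix(predict(glm.fit, newdata = grid, type = "response"),
               nrow = length(lag1.seq))

# Scatter plot of the data, styled as in the plot above
with(Smarket, plot(Lag1, Volume, pch = ifelse(Direction == "Down", 3, 1),
                   col = ifelse(Direction == "Down", "red", "blue"), cex = 0.6))

# The 0.5 contour is the decision boundary within this slice
contour(lag1.seq, vol.seq, prob, levels = 0.5, add = TRUE, drawlabels = FALSE)
```

Changing the fixed values of Lag2 through Lag5 shifts the line, since each slice cuts the hyperplane at a different position.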


Hope this helps.


answered Apr 12, 2022 by Dev
• 6,000 points
