Multiple logistic regression models in ggplot2 - r

The following example builds on the following link, which covers the basics of creating a logistic regression model.
data(mtcars)
dat <- subset(mtcars, select=c(mpg, am, vs))
logr_vm <- glm(vs ~ mpg, data=dat, family=binomial)
library(ggplot2)
ggplot(dat, aes(x=mpg, y=vs)) + geom_point() +
stat_smooth(method="glm", method.args=list(family="binomial"), se=T) +
theme_bw()
Now I want to create a second logistic model that predicts a new outcome, vs2.
How can I use ggplot2 to show the two models in different colours?
dat$vs2 <- with(dat, ifelse(mpg > 20, 1, vs))
so that the second logistic model is:
logr_vm2 <- glm(vs2 ~ mpg, data=dat, family=binomial)

When you fit the models within ggplot itself and have only a few of them, you can easily add a legend by manually mapping a model name to an aesthetic (here, fill) inside aes(). ggplot2 takes care of the rest.
Additionally, I use geom_count instead of geom_point to show that we have overlapping values here, and add some colors to show the different categories:
ggplot(dat, aes(x = mpg)) +
  geom_count(aes(y = vs, col = mpg > 20), alpha = 0.3) +
  stat_smooth(aes(y = vs, fill = 'm1'), col = 'black',
              method = "glm", method.args = list(family = "binomial")) +
  stat_smooth(aes(y = vs2, fill = 'm2'), col = 'black',
              method = "glm", method.args = list(family = "binomial")) +
  scale_size_area() +
  scale_color_discrete(h.start = 90) +
  theme_bw()
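If you would rather plot the two glm objects you already fitted instead of letting stat_smooth refit them, one option is to predict both on a common grid and map the model name to colour. This is only a sketch: the names mpg_grid, pred and p_models are made up for illustration, and no confidence bands are drawn.

```r
library(ggplot2)

# Recreate the data and both models from the question
dat <- subset(mtcars, select = c(mpg, am, vs))
dat$vs2 <- with(dat, ifelse(mpg > 20, 1, vs))
logr_vm  <- glm(vs  ~ mpg, data = dat, family = binomial)
logr_vm2 <- glm(vs2 ~ mpg, data = dat, family = binomial)

# Grid of mpg values on which both models are evaluated (hypothetical helper)
mpg_grid <- data.frame(mpg = seq(min(dat$mpg), max(dat$mpg), length.out = 100))

# One row per grid point and model, with the predicted probability
pred <- rbind(
  data.frame(mpg_grid, model = "m1",
             prob = predict(logr_vm, newdata = mpg_grid, type = "response")),
  data.frame(mpg_grid, model = "m2",
             prob = predict(logr_vm2, newdata = mpg_grid, type = "response"))
)

# Mapping model to colour produces the legend automatically
p_models <- ggplot(pred, aes(x = mpg, y = prob, colour = model)) +
  geom_line() +
  theme_bw()
```

Printing p_models draws one curve per model; the raw points can be added with a geom_count layer that uses data = dat.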

Related

How to add legend into ggplot with both by-group and combined effects?

I'm trying to create a plot that has an overall effect from the regression as well as by-group effects for the same regression. As an example, I have used this plot for the mtcars dataset:
#### Group x Total Effect Plot ####
mtcars %>%
ggplot(aes(x=disp,
y=wt))+
geom_point()+
geom_smooth(method = "lm",
se=F,
color = "gray",
aes(group=factor(am)))+
geom_smooth(method = "lm",
se=F)
Which looks like this:
However, I'd like to add a legend to the plot, as you normally get when mapping a factor inside aes(), but I'm unsure how to do this for the given example, because the total effect gets lost in the legend when I try:
#### Group by Total Effect Plot ####
mtcars %>%
ggplot(aes(x=disp,
y=wt))+
geom_point()+
geom_smooth(method = "lm",
se=F,
aes(color=factor(am)))+
geom_smooth(method = "lm",
se=F)
Is there a way I can artificially add the legend in some way? Or is there some workaround I'm not considering? My desired result is below:
You can do something like this:
library(ggplot2)
mtcars |>
ggplot(aes(x = disp, y = wt)) +
geom_point() +
geom_smooth(
aes(group = factor(am), colour = "Automatic/Manual Transmission"),
method = "lm",
se = FALSE
) +
geom_smooth(
aes(colour = "Total Effect"),
method = "lm",
se = FALSE
) +
scale_colour_manual(
values = c(
"Automatic/Manual Transmission" = "grey",
"Total Effect" = "blue"
)
) +
labs(colour = "Legend")
#> `geom_smooth()` using formula 'y ~ x'
#> `geom_smooth()` using formula 'y ~ x'
Created on 2022-10-18 with reprex v2.0.2

Is there a way to plot regressed out data?

I have a model:
lm(Y ~ A + B + X) where A and B are covariates.
I have plotted the raw data using:
ggplot(data = data, aes(x=X, y=Y)) + geom_point() + geom_smooth(method = "lm", se= FALSE)
I would like to plot the data such that the A + B covariates have been regressed out. Is there a way to do this?
Here is an example with one covariate. I leave it to you as an exercise to create a nice visualization with two covariates. I would probably use facets to illustrate this for different (constant) values of the second covariate.
fit <- lm(mpg ~ I(1/hp) + wt, data = mtcars)
summary(fit)
newdata <- expand.grid(hp = seq(50, 350, by = 1),
wt = 2:5)
newdata$mpg <- predict(fit, newdata = newdata)
library(ggplot2)
ggplot(mtcars, aes(x = hp, y = mpg, color = wt)) +
geom_point() +
geom_line(data = newdata, aes(group = wt))
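Another way to read "regressed out" is an added-variable (partial regression) plot: take the residuals of the outcome and of the predictor after regressing each on the covariates, then plot one against the other. The sketch below assumes mtcars as stand-in data, with hp playing the role of X and wt and qsec playing A and B; these choices and the object names are illustrative only.

```r
library(ggplot2)

# Full model: hp is the predictor of interest, wt and qsec are the covariates
full_fit <- lm(mpg ~ wt + qsec + hp, data = mtcars)

# Regress the covariates out of both the outcome and the predictor
resid_dat <- data.frame(
  y_resid = resid(lm(mpg ~ wt + qsec, data = mtcars)),
  x_resid = resid(lm(hp ~ wt + qsec, data = mtcars))
)

# The slope of this residual-on-residual fit equals the hp coefficient in full_fit
partial_fit <- lm(y_resid ~ x_resid, data = resid_dat)

p_partial <- ggplot(resid_dat, aes(x = x_resid, y = y_resid)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "hp | wt, qsec", y = "mpg | wt, qsec")
```

The line in p_partial has the same slope as the hp term of the full model, so the plot shows the relationship that remains once the covariates are accounted for.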

How can I add a layer showing the distribution on a conditional variable in a probability plot in R studio?

I am fitting the following regression:
model <- glm(DV ~ conditions + predictor + conditions*predictor, family = binomial(link = "probit"), data = d)
I use 'sjPlot' (and 'ggplot2') to make the following plot:
library("ggplot2")
library("sjPlot")
plot_model(model, type = "pred", terms = c("predictor", "conditions")) +
xlab("Xlab") +
ylab("Ylab") +
theme_minimal() +
ggtitle("Title")
But I can't figure out how to add a layer showing the distribution on the conditioning variable like I can easily do by setting "hist = TRUE" using 'interplot':
library("interplot")
interplot(model, var1 = "conditions", var2 = "predictor", hist = TRUE) +
xlab("Xlab") +
ylab("Ylab") +
theme_minimal() +
ggtitle("Title")
I have tried a bunch of layers using just ggplot as well, with no success:
ggplot(d, aes(x=predictor, y=DV, color=conditions))+
geom_smooth(method = "glm") +
xlab("Xlab") +
ylab("Ylab") +
theme_minimal() +
ggtitle("Title")
I am open to any suggestions!
I've obviously had to try to recreate your data to get this to work, so it won't be faithful to your original, but if we assume your plot is something like this:
p <- plot_model(model, type = "pred", terms = c("predictor [all]", "conditions")) +
xlab("Xlab") +
ylab("Ylab") +
theme_minimal() +
ggtitle("Title")
p
Then we can add a histogram of the predictor variable like this:
p + geom_histogram(data = d, inherit.aes = FALSE,
                   # after_stat(count) replaces the deprecated ..count.. notation
                   aes(x = predictor, y = after_stat(count) / 1000),
                   fill = "gray85", colour = "gray50", alpha = 0.3)
And if you wanted to do the whole thing in ggplot, you need to remember to tell geom_smooth that your glm is a probit model, otherwise it will just fit a normal linear regression. I've copied the color palette over too for this example, though note the smoothing lines for the groups start at their lowest x value rather than extrapolating back to 0.
ggplot(d, aes(x = predictor, y = DV, color = conditions))+
geom_smooth(method = "glm", aes(fill = conditions),
method.args = list(family = binomial(link = "probit")),
alpha = 0.15, size = 0.5) +
xlab("Xlab") +
scale_fill_manual(values = c("#e41a1c", "#377eb8")) +
scale_colour_manual(values = c("#e41a1c", "#377eb8")) +
ylab("Ylab") +
theme_minimal() +
ggtitle("Title") +
geom_histogram(aes(y = after_stat(count) / 1000),
fill = "gray85", colour = "gray50", alpha = 0.3)
Data
set.seed(69)
n_each <- 500
predictor <- rgamma(2 * n_each, 2.5, 3)
predictor <- 1 - predictor/max(predictor)
log_odds <- c((1 - predictor[1:n_each]) * 5 - 3.605,
predictor[n_each + 1:n_each] * 0 + 0.57)
DV <- rbinom(2 * n_each, 1, exp(log_odds)/(1 + exp(log_odds)))
# The factor needs two distinct levels; these labels are placeholders
conditions <- factor(rep(c("condition 1", "condition 2"), each = n_each))
d <- data.frame(DV, predictor, conditions)
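If you want the fitted curves to span the whole x range rather than stopping at each group's smallest observed value, you can predict from the probit model on a grid yourself. The sketch below simulates its own small data set (not the data above), and the names grid, prob and p_probit are made up for illustration.

```r
library(ggplot2)

# Small simulated data set, for illustration only
set.seed(1)
d <- data.frame(
  predictor  = runif(400),
  conditions = factor(rep(c("a", "b"), each = 200))
)
d$DV <- rbinom(400, 1, pnorm(-1 + 2 * d$predictor * (d$conditions == "a")))

model <- glm(DV ~ conditions * predictor,
             family = binomial(link = "probit"), data = d)

# Evaluate the model over the full range 0..1 for both conditions
grid <- expand.grid(predictor = seq(0, 1, length.out = 101),
                    conditions = levels(d$conditions))
grid$prob <- predict(model, newdata = grid, type = "response")

# For a probit link, the response is pnorm() of the linear predictor
link_scale <- predict(model, newdata = grid, type = "link")

p_probit <- ggplot(grid, aes(x = predictor, y = prob, colour = conditions)) +
  geom_line() +
  theme_minimal()
```

A histogram layer like the one above can be added to p_probit in the same way, as long as it gets data = d and inherit.aes = FALSE.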

Show R2 and p-value in ggplot for y~log(x) function

I want to make a ggplot with a logarithmic regression (y ~ log(x)) and show the R2 and p-value.
I tried stat_cor, but it only shows the R2 and p-value for a linear regression. I tried to pass "formula=y~log(x)" to stat_cor, but it says unknown parameter: formula. Do I have to use a different function to make that happen?
ggplot(data = Data,aes(x=Carbon_per,y=Pyrite_per,col=Ecosystem,shape=Ecosystem)) +
geom_smooth(method='lm', formula=y~log(x))+
geom_point() +
stat_cor(aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")))
Cheers,
Gloria
Are you looking for something like this?
library(ggpubr)
library(ggplot2)
ggplot(data = mtcars, aes(x = log(wt), y = mpg)) +
geom_smooth(method = "lm",
formula = y ~ x) +
geom_point() +
stat_cor(label.y = 40)+
stat_regline_equation(label.y = 45)
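If you prefer to keep x on its original scale, you could also compute R2 and the p-value from lm() yourself and place them with annotate(). This is a sketch for a single group using mtcars as stand-in data; with several groups, as in the question, you would need one fit per group. The object names are illustrative.

```r
library(ggplot2)

# Fit the same model that geom_smooth draws
fit <- lm(mpg ~ log(wt), data = mtcars)
fit_summary <- summary(fit)

# R2 and the p-value of the log(wt) slope
r2 <- fit_summary$r.squared
p_value <- coef(fit_summary)["log(wt)", "Pr(>|t|)"]
label <- sprintf("R2 = %.2f, p = %.2g", r2, p_value)

p_log <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ log(x)) +
  # Inf places the label in the top-right corner of the panel
  annotate("text", x = Inf, y = Inf, hjust = 1.1, vjust = 1.5, label = label)
```

For a single predictor, this R2 is the squared correlation between log(wt) and mpg, which is why log-transforming x before calling stat_cor gives the same numbers.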

How to plot raw data but use predicted values for line fit in ggplot2 R?

I have a data set (dat) with raw data (raw_x and raw_y). I have fitted a model, and its predictions are stored in dat$predict.
I wish to plot the raw data and overlay a geom_smooth (here a quadratic function) that uses the predicted data. This is my attempt at the basic code. I am not sure how to use predicted values in geom_smooth yet.
ggplot(dat, aes(x = raw_x, y = raw_y, colours = "red")) +
geom_point() +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2))
The following plots the original points, the fitted curve from the linear model (quadratic in x) and the fitted points. I use made-up data since you have posted none.
set.seed(1234)
x <- cumsum(rnorm(100))
y <- x + x^2 + rnorm(100, sd = 50)
dat <- data.frame(raw_x = x, raw_y = y)
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)
dat$predict <- predict(fit)
ggplot(dat, aes(x = raw_x, y = raw_y)) +
geom_point(colour = "blue") +
theme_bw() +
geom_smooth(method = "lm", formula = y ~ x + I(x^2), colour = "red") +
geom_point(aes(y = predict), colour = "black")
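To draw the line from the stored predictions themselves, rather than having geom_smooth refit the model, a geom_line layer on dat$predict is one option. A minimal sketch, again with made-up data; p_pred is just an illustrative name.

```r
library(ggplot2)

# Made-up data with the same column names as in the question
set.seed(1234)
x <- cumsum(rnorm(100))
dat <- data.frame(raw_x = x, raw_y = x + x^2 + rnorm(100, sd = 50))

# Fit the quadratic model and store its predictions in the data frame
fit <- lm(raw_y ~ raw_x + I(raw_x^2), data = dat)
dat$predict <- predict(fit)

# geom_line connects the predictions in order of raw_x
p_pred <- ggplot(dat, aes(x = raw_x)) +
  geom_point(aes(y = raw_y), colour = "blue") +
  geom_line(aes(y = predict), colour = "red") +
  theme_bw()
```

Because the predictions sit exactly on the fitted curve, the line matches what geom_smooth would draw for the same formula, but it also works for models geom_smooth cannot fit.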
