Plotting a number of regression lines in a single plot - r

How do I show 2 regression lines on the same plot?
Here are both models:
data(mtcars)
a <- lm(mpg~wt+hp, data = mtcars)
b <- lm(mpg~wt+hp+wt*hp, data = mtcars)
I plot wt on the x-axis, mpg on the y-axis and hp as the colour.
Here it is in base R:
cr <- colorRamp(c("yellow", "red"))
with(mtcars, {
plot(wt, mpg, col = rgb(cr(hp / max(hp)), maxColorValue=255),
xlab="Weight", ylab="Miles per Gallon", pch=20)
})
Also, please show how to accomplish this in ggplot2.
Here's the plot:
library(ggplot2)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(aes(col = hp))
p + scale_colour_gradientn(colours=c("green","black"))
Thanks in advance!

The documentation for geom_smooth practically tells you how to do this.
One can use the regression models to predict new values for y and then plot these on the same graph using geom_smooth().
Below is code for ggplot2 that produces what I think you want. The two lines overlap so much that it looks like only one line is plotted and I've set one linetype to dashed to demonstrate this.
I don't know how to achieve this in base R though.
data(mtcars)
library(ggplot2)
a <- lm(mpg~wt+hp, data = mtcars)
b <- lm(mpg~wt+hp+wt*hp, data = mtcars)
mtcars$pred.a <- predict(a)
mtcars$pred.b <- predict(b)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(aes(col = hp)) +
scale_colour_gradientn(colours=c("green","black")) +
geom_smooth(aes(x = wt, y = pred.a), method = "lm", colour = "black", fill = NA) +
geom_smooth(aes(x = wt, y = pred.b), method = "lm", colour = "red", fill = NA, linetype = 4)
p

A base R solution:
a <- lm(mpg~wt+hp, data=mtcars)
b <- lm(mpg~wt+hp+wt*hp, data=mtcars)
wt <- mtcars[, "wt"]
idx <- sort(wt, index.return=TRUE)$ix
plot(mpg~wt, data=mtcars)
lines(wt[idx], predict(a)[idx], col="red")
lines(wt[idx], predict(b)[idx], col="blue")
However, it is not the best visualisation conceivable.

You are asking how to add a regression line, but your regression models produce a regression plane and a regression surface, both higher dimensional than a line. You can find a regression line by conditioning on a chosen value of hp, or show multiple lines for different values of hp.
Using base graphics you can use the Predict.Plot function in the TeachingDemos package to add prediction lines/curves to a plot for a fitted model (or 2). The interactive TkPredict function in the same package will let you interact with the plot to choose conditioning values, and will then produce the call to Predict.Plot that creates the current line. You can then combine the generated commands to include them on the same plot.
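One way to read the conditioning idea above is to hold hp fixed at a few values and draw the fitted mpg-versus-wt line for each. The following is a minimal base R sketch of that idea, not the TeachingDemos approach; the choice of hp quartiles and the names hp_values, wt_grid, pred and line_cols are assumptions made for the example.

```r
# Interaction model from the question
fit <- lm(mpg ~ wt * hp, data = mtcars)

# Assumed conditioning values: the quartiles of hp
hp_values <- unname(quantile(mtcars$hp, c(0.25, 0.5, 0.75)))
wt_grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 50)

# One column of predicted mpg per conditioning value of hp
pred <- sapply(hp_values, function(h) {
  predict(fit, newdata = data.frame(wt = wt_grid, hp = h))
})

line_cols <- c("gold", "orange", "red")
plot(mpg ~ wt, data = mtcars, pch = 20,
     xlab = "Weight", ylab = "Miles per Gallon")
matlines(wt_grid, pred, col = line_cols, lty = 1, lwd = 2)
legend("topright", legend = paste("hp =", round(hp_values)),
       col = line_cols, lwd = 2)
```

Because the model contains the wt:hp interaction, the three lines have different slopes; with the additive model mpg ~ wt + hp they would be parallel.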

Related

R || Adjusting x-axis in sjPlot::plot_model()

I want to graph an interaction effect between two variables with one outcome in R. While I can successfully produce a graph using sjPlot::plot_model(), the interaction plot does not resize when I adjust the x-axis values. Instead, the plotted lines always keep their original extent while the x- and y-axes adjust. Below is an example using the mtcars data in R.
library(sjPlot)
library(sjmisc)
library(ggplot2)
mtcars.df <- mtcars
fit <- lm(mpg ~ hp * disp, data = mtcars.df)
plot_model(fit, type = "pred", terms = c("hp", "disp"))
I can get a graph like this in my own code. However, when I attempt to alter the x- and y-axes as seen below, the grid expands, but the graph itself does not.
plot_model(fit, type = "pred", terms = c("hp", "disp"), axis.lim = list(c(0,150),c(0,200)))
[Figure: the interaction plot with wildly exaggerated adjustments to the axes. The plotted lines do not extend, but the grid does.]
What code can I use to adjust both the lines of my interaction effect AND those of the grid? Adjusting post-hoc with
plot_model(fit, type = "pred", terms = c("hp", "disp"))+xlim(0,150)
creates the same issue.
plot_model will only plot interactions over the range of your original data. It's really not difficult to do it directly in ggplot though by feeding whatever x values you want into predict:
library(ggplot2)
mtcars.df <- mtcars
fit <- lm(mpg ~ hp * disp, data = mtcars.df)
new_df <- expand.grid(hp = 0:300, disp = c(106.78, 230.72, 354.66))
predictions <- predict(fit, new_df, se = TRUE)
new_df$mpg <- predictions$fit
new_df$upper <- new_df$mpg + 1.96 * predictions$se.fit
new_df$lower <- new_df$mpg - 1.96 * predictions$se.fit
new_df$disp <- factor(new_df$disp)
ggplot(new_df, aes(hp, mpg)) +
geom_ribbon(aes(ymax = upper, ymin = lower, fill = disp), alpha = 0.3) +
geom_line(aes(color = disp)) +
scale_fill_brewer(palette = "Set1") +
scale_color_brewer(palette = "Set1")
Created on 2022-05-21 by the reprex package (v2.0.1)
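The three disp values hard-coded in the answer above match the mean of disp and one standard deviation either side of it. As a hedged sketch, they can be computed rather than typed in; the name disp_levels is an assumption introduced here.

```r
# Mean of disp and one standard deviation below / above it
disp_levels <- mean(mtcars$disp) + c(-1, 0, 1) * sd(mtcars$disp)
round(disp_levels, 2)

fit <- lm(mpg ~ hp * disp, data = mtcars)

# Prediction grid: every hp from 0 to 300 at each of the three disp levels
new_df <- expand.grid(hp = 0:300, disp = disp_levels)
new_df$mpg <- predict(fit, newdata = new_df)
head(new_df)
```

Computing the levels this way keeps the grid in step with the data if the model is refitted on a different subset.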
plot_model allows you to choose the range of the plot by adding the range in square brackets next to the selected variable, in the form [min,max].
I think the easiest way would be the following:
plot_model(fit, type = "pred", terms = c("hp [0,300]", "disp"))
You can find more details here:
https://strengejacke.github.io/sjPlot/articles/plot_marginal_effects.html

ggplot to see model fit and scatterplot data at the same time [duplicate]

I'm trying hard to add a regression line on a ggplot. I first tried with abline but I didn't manage to make it work. Then I tried this...
data = data.frame(x.plot=rep(seq(1,5),10),y.plot=rnorm(50))
ggplot(data,aes(x.plot,y.plot))+stat_summary(fun.data=mean_cl_normal) +
geom_smooth(method='lm',formula=data$y.plot~data$x.plot)
But it is not working either.
In general, to provide your own formula you should use arguments x and y that will correspond to values you provided in ggplot() - in this case x will be interpreted as x.plot and y as y.plot. You can find more information about smoothing methods and formula via the help page of function stat_smooth() as it is the default stat used by geom_smooth().
ggplot(data,aes(x.plot, y.plot)) +
stat_summary(fun.data=mean_cl_normal) +
geom_smooth(method='lm', formula= y~x)
If you are using the same x and y values that you supplied in the ggplot() call and need to plot the linear regression line then you don't need to use the formula inside geom_smooth(), just supply the method="lm".
ggplot(data,aes(x.plot, y.plot)) +
stat_summary(fun.data= mean_cl_normal) +
geom_smooth(method='lm')
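To illustrate the point about x and y in the formula: inside geom_smooth() the formula refers to the aesthetics, not to the column names, so the same pattern works for other model forms. The sketch below assumes a quadratic fit purely as an example and uses geom_point() instead of stat_summary() so that it needs only ggplot2.

```r
library(ggplot2)

set.seed(1)
data <- data.frame(x.plot = rep(seq(1, 5), 10), y.plot = rnorm(50))

# y and x stand for the y and x aesthetics (here y.plot and x.plot)
p <- ggplot(data, aes(x.plot, y.plot)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE)
p
```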
As I just found out, if your model is a multiple linear regression, the solution above won't work.
You have to build the line manually, as a data frame containing the predicted values for your original data frame (in your case, data).
It would look like this:
# read dataset
df = mtcars
# create multiple linear model
lm_fit <- lm(mpg ~ cyl + hp, data=df)
summary(lm_fit)
# save predictions of the model in the new data frame
# together with variable you want to plot against
predicted_df <- data.frame(mpg_pred = predict(lm_fit, df), hp=df$hp)
# this is the predicted line of multiple linear regression
ggplot(data = df, aes(x = mpg, y = hp)) +
geom_point(color='blue') +
geom_line(color='red',data = predicted_df, aes(x=mpg_pred, y=hp))
# this is predicted line comparing only chosen variables
ggplot(data = df, aes(x = mpg, y = hp)) +
geom_point(color='blue') +
geom_smooth(method = "lm", se = FALSE)
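A possible variation on the approach above, offered as a sketch rather than as the answerer's method: predict over a regular grid while holding cyl at each observed value, which gives one straight line per cyl level. Note that this version puts hp on the x-axis and mpg on the y-axis; the name grid is an assumption for the example.

```r
library(ggplot2)

lm_fit <- lm(mpg ~ cyl + hp, data = mtcars)

# Prediction grid: 50 hp values for each observed number of cylinders
grid <- expand.grid(
  hp  = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  cyl = sort(unique(mtcars$cyl))
)
grid$mpg <- predict(lm_fit, newdata = grid)

# The line layer inherits x, y and colour from the main aes() call
p <- ggplot(mtcars, aes(x = hp, y = mpg, colour = factor(cyl))) +
  geom_point() +
  geom_line(data = grid)
p
```

Since the model has no interaction term, the three lines are parallel and differ only in their intercepts.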
The simple and versatile solution is to draw a line using slope and intercept from geom_abline. Example usage with a scatterplot and lm object:
library(tidyverse)
petal.lm <- lm(Petal.Length ~ Petal.Width, iris)
ggplot(iris, aes(x = Petal.Width, y = Petal.Length)) +
geom_point() +
geom_abline(slope = coef(petal.lm)[["Petal.Width"]],
intercept = coef(petal.lm)[["(Intercept)"]])
coef is used to extract the coefficients of the formula provided to lm. If you have some other linear model object or line to plot, just plug in the slope and intercept values similarly.
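As a quick check of the idea above, the line defined by the extracted intercept and slope should coincide with the model's fitted values. This only holds for a model with a single predictor; the names intercept, slope and manual are assumptions for the sketch, which also shows the base R shortcut abline() for the same model.

```r
petal.lm <- lm(Petal.Length ~ Petal.Width, data = iris)

intercept <- coef(petal.lm)[["(Intercept)"]]
slope     <- coef(petal.lm)[["Petal.Width"]]

# Points on the line built by hand from the coefficients
manual <- intercept + slope * iris$Petal.Width
all.equal(manual, unname(fitted(petal.lm)))

# Base R equivalent: abline() accepts a fitted simple regression directly
plot(Petal.Length ~ Petal.Width, data = iris)
abline(petal.lm, col = "red")
```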
I found this function on a blog:
ggplotRegression <- function (fit) {
require(ggplot2)
# fit$model holds the response in column 1 and the first predictor in column 2
ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) +
geom_point() +
stat_smooth(method = "lm", col = "red") +
labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
"Intercept =",signif(fit$coef[[1]],5 ),
" Slope =",signif(fit$coef[[2]], 5),
" P =",signif(summary(fit)$coef[2,4], 5)))
}
Once you have loaded the function, you can simply call
ggplotRegression(fit)
or pass the model in directly, e.g. ggplotRegression(lm(y ~ x + z + Q, data)).
Hope this helps.
If you want to fit other types of models, such as a dose-response curve using logistic models, you also need to create more data points with predict() to get a smoother regression line:
# fit: your fitted logistic regression (dose-response) model
#Create a range of doses:
mm <- data.frame(DOSE = seq(0, max(data$DOSE), length.out = 100))
#Create a new data frame for ggplot using predict and your range of new
#doses:
fit.ggplot=data.frame(y=predict(fit, newdata=mm),x=mm$DOSE)
ggplot(data=data,aes(x=log10(DOSE),y=log(viability)))+geom_point()+
geom_line(data=fit.ggplot,aes(x=log10(x),y=log(y)))
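The snippet above depends on a data set and a fit that are not shown. The following self-contained sketch illustrates the same predict-on-a-dense-grid idea under loud assumptions: the data are simulated, the model is a plain binomial glm() on log10(DOSE), and the column name response is hypothetical.

```r
library(ggplot2)

set.seed(42)
# Simulated data (assumption): 10 subjects at each of 6 doses
dose_df <- data.frame(DOSE = rep(c(0.1, 0.3, 1, 3, 10, 30), each = 10))
dose_df$response <- rbinom(nrow(dose_df), size = 1,
                           prob = plogis(2 * log10(dose_df$DOSE)))

fit <- glm(response ~ log10(DOSE), family = binomial, data = dose_df)

# Dense, log-spaced dose grid so the fitted curve looks smooth
mm <- data.frame(DOSE = 10^seq(log10(0.1), log10(30), length.out = 100))
mm$response <- predict(fit, newdata = mm, type = "response")

p <- ggplot(dose_df, aes(x = DOSE, y = response)) +
  geom_point(alpha = 0.4) +
  geom_line(data = mm) +
  scale_x_log10()
p
```

The grid here starts at the smallest positive dose rather than at 0, because a log-scaled axis cannot show a dose of 0.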
Another way to add a regression line with geom_line() is to use the broom package to get the fitted values, as shown here:
https://cmdlinetips.com/2022/06/add-regression-line-to-scatterplot-ggplot2/

Different colours in ggplot based on geom_smooth

I created a ggplot with a linear geom_smooth. Now I would like the points from geom_point to have different colours depending on whether they fall below or above the fitted line.
I know I can colour the points with geom_point(aes(x, y, colour = z)). My problem is how to determine whether a point in the plot is below or above the line.
Can ggplot2 do this, or do I have to create a new column in the data frame first?
Below is the sample code with geom_smooth but without the different colours above and below the line.
Any help is appreciated.
library(ggplot2)
df <- data.frame(x = rnorm(100),
y = rnorm(100))
ggplot(df, aes(x,y)) +
geom_point() +
geom_smooth(method = "lm")
I believe ggplot2 can't do this for you. As you say, you could create a new variable in df to make the colouring. You can do so, based on the residuals of the linear model.
For example:
library(ggplot2)
set.seed(2015)
df <- data.frame(x = rnorm(100),
y = rnorm(100))
# Fit linear regression
l = lm(y ~ x, data = df)
# Make new group variable based on residuals
df$group = NA
df$group[which(l$residuals >= 0)] = "above"
df$group[which(l$residuals < 0)] = "below"
# Make the plot
ggplot(df, aes(x,y)) +
geom_point(aes(colour = group)) +
geom_smooth(method = "lm")
Note that the colour argument has to be passed to geom_point(), otherwise geom_smooth() will produce a fit to each group separately.
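A more compact way to build the same grouping column, shown as a sketch that relies only on base R's resid() and ifelse(); the name fit is an assumption replacing l from the answer.

```r
set.seed(2015)
df <- data.frame(x = rnorm(100), y = rnorm(100))

fit <- lm(y ~ x, data = df)

# A non-negative residual means the point lies on or above the fitted line
df$group <- ifelse(resid(fit) >= 0, "above", "below")
table(df$group)
```

The resulting df can be passed to the same ggplot() call as in the answer above.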

ScatterPlot of Cars example

I am trying to plot a scatterplot from mtcars of: hp ~ mpg and for each point (x,y) show how many cylinders (cyl) by different colors.
I tried to use the function ScatterPlot , but it's not recognized without adding the 'car' package.
So I tried :
plot(mtcars$mpg ~ mtcars$hp , data=mtcars, xlab="HP", ylab="Hwy.MPG")
How can I add number of cylinders for each point of this graph? (with different colors)
I'm going to assume you're using mtcars from datasets.
The simplest way to add colour is to just add a col argument:
plot(mpg ~ hp , data=mtcars, col=cyl, xlab="HP", ylab="Hwy.MPG")
If you want custom colours, you can use the palette function:
palette(c("red", "blue", "green"))
plot(mpg ~ hp , data=mtcars, col=cyl, xlab="HP", ylab="Hwy.MPG")
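With col=cyl the values 4, 6 and 8 are used as indices into the current palette, so which colour each group gets depends on the palette's order and length. A hedged alternative is an explicit lookup from cylinder count to colour, plus a legend; the names cyl_cols and point_cols and the colours themselves are assumptions for this sketch.

```r
# Explicit lookup table: cylinder count -> colour (colours are arbitrary)
cyl_cols <- c("4" = "red", "6" = "blue", "8" = "green")

# One colour per car, looked up by its number of cylinders
point_cols <- cyl_cols[as.character(mtcars$cyl)]

plot(mpg ~ hp, data = mtcars, col = point_cols, pch = 19,
     xlab = "HP", ylab = "Hwy.MPG")
legend("topright", legend = paste(names(cyl_cols), "cyl"),
       col = cyl_cols, pch = 19)
```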
Here's an example in lattice. It's a little more "Oo-Lala", and fairly straightforward.
library(lattice)
xyplot(mpg ~ hp, data = mtcars, groups = cyl, pch = 19,
xlab = "HP", ylab = "Hwy.MPG", auto.key = list(columns = 3))
And to complete the picture, here's the ggplot example
library(ggplot2)
ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point(aes(color = factor(cyl)), size = 4)
