Adding a simple lm trend line to a ggplot boxplot - r

When adding a linear model trend line to a boxplot using standard R graphics I use:
boxplot(iris[,2]~iris[,1],col="LightBlue",main="Quartile1 (Rare)")
modelQ1<-lm(iris[,2]~iris[,1])
abline(modelQ1,lwd=2)
However, when using this in ggplot2:
a <- ggplot(iris,aes(factor(iris[,1]),iris[,2]))
a + geom_boxplot() +
geom_smooth(method = "lm", se=FALSE, color="black", formula=iris[,2]~iris[,1])
I get the following error:
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
And the line does not appear on my plot.
The models used in both of these scenarios are identical. If anyone could point out where I'm going wrong, that would be great.
EDIT: Used the iris dataset as an example.

The error message is pretty much self-explanatory: Add aes(group=1) to geom_smooth:
ggplot(iris, aes(factor(Sepal.Length), Sepal.Width)) +
geom_boxplot() +
geom_smooth(method = "lm", se=FALSE, color="black", aes(group=1))
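One caveat worth checking, as a rough sketch rather than a definitive statement: because the x aesthetic is a factor, the line drawn with group = 1 appears to be fit against the integer positions of the factor levels, not against the raw Sepal.Length values. The names slope_plot, slope_index and slope_raw below are only illustrative.

```r
library(ggplot2)

# Same plot as above, with the formula spelled out to avoid the message.
p <- ggplot(iris, aes(factor(Sepal.Length), Sepal.Width)) +
  geom_boxplot() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE,
              color = "black", aes(group = 1))

# Pull the computed smooth (layer 2) and get its slope from the endpoints.
smooth <- layer_data(p, 2)
n <- nrow(smooth)
slope_plot <- (smooth$y[n] - smooth$y[1]) /
  (as.numeric(smooth$x[n]) - as.numeric(smooth$x[1]))

# Slope of a regression on the factor level positions (1, 2, 3, ...).
level_index <- as.integer(factor(iris$Sepal.Length))
slope_index <- unname(coef(lm(iris$Sepal.Width ~ level_index))[2])

# Slope of the regression on the raw numeric values, for comparison.
slope_raw <- unname(coef(lm(Sepal.Width ~ Sepal.Length, data = iris))[2])

print(c(plot = slope_plot, index = slope_index, raw = slope_raw))
```

If the plotted slope matches the index-based one and not the raw one, the line is a trend across boxes rather than the lm(Sepal.Width ~ Sepal.Length) fit.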

FYI, this error can also be encountered (and fixed) using the simple qplot interface to ggplot2.
The error message is not explanatory enough for a few people at least :-).
In this case, the key is to pass only the contents of the suggested aesthetic, i.e. group = 1, without the aes() wrapper:
library(ggplot2)
qplot(factor(Sepal.Length), Sepal.Width, geom = c("smooth"), data= iris)
# error, needs aes(group=1)
qplot(factor(Sepal.Length), Sepal.Width, geom = c("smooth"), group = 1, data= iris)

Related

slope of lines in interaction plot in ggplot2 does not match estimates

I am trying to plot the interaction effects from a multiple linear regression using ggplot2. However, the slopes of the plotted lines do not match what they should be based on the estimates returned by the lm function.
Here is my code:
lm.sense <- lm(sense_of_belonging ~ active*mathEAL + MathID + comfort_speaking, data=Data)
library(ggplot2)
p.sense <- ggplot(lm.sense, aes(y=sense_of_belonging, x=active, color=mathEAL)) + geom_smooth(method="lm", se=FALSE)
Does ggplot not hold the other variables constant?
ggplot2 works with data.frames and doesn't naturally know what to do with an lm object. (Try plot(lm.sense) to see what base R offers here.)
Your ggplot call is using the underlying data from Data (tucked away inside your lm.sense object) to make a plot where x = active and y = sense_of_belonging. geom_smooth then runs its own linear regression of y on x from that data, so the MathID and comfort_speaking terms of your model play no part in the plotted lines. Compare these two plots, which give the same result:
lm.mtcars <- lm(mpg ~ wt + cyl, data = mtcars)
ggplot(lm.mtcars, aes(mpg, wt)) +
geom_point() + geom_smooth(method="lm", se=FALSE)
ggplot(mtcars, aes(mpg, wt)) +
geom_point() + geom_smooth(method="lm", se=FALSE)
Depending on what you want to do, you could show some of the impact of other variables within your geom_smooth by referencing those:
ggplot(mtcars, aes(mpg, wt, color = as.character(cyl))) +
geom_point() + geom_smooth(method="lm", se=FALSE, fullrange = TRUE)
It would help to understand what kind of output you're hoping to generate to give more specific suggestions.
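If the goal is to plot the lines implied by the fitted model itself, with the other covariates held constant, one possible approach is to predict over a grid and draw those predictions. This is only a sketch on mtcars: the model formula, the choice to hold hp at its mean, and the names fit and grid are assumptions for illustration, not the asker's setup.

```r
library(ggplot2)

# A model with an interaction plus one extra covariate, loosely mirroring
# the structure y ~ a * b + other in the question.
fit <- lm(mpg ~ wt * factor(cyl) + hp, data = mtcars)

# Prediction grid: vary wt within each cyl group, hold hp at its mean.
grid <- expand.grid(
  wt  = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
  cyl = sort(unique(mtcars$cyl)),
  hp  = mean(mtcars$hp)
)
grid$mpg <- predict(fit, newdata = grid)

# Points are the raw data; lines are the model predictions, so their
# slopes follow the lm coefficients rather than per-group refits.
p <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point() +
  geom_line(data = grid)
print(p)
```

For the reference level (cyl = 4 here) the slope of the drawn line is the wt coefficient; for the other levels it is that coefficient plus the matching interaction term.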

Add a manually designed non-linear line in ggplot2?

I would like to add a non-linear model line to a graph in R, but instead of letting ggplot find the best fit, I just want to preset its parameters and thus be able to see how multiple manually designed models fit on top of the data. I tried the following:
ggplot(cars, aes(x = speed, y = dist)) +
geom_point() +
geom_smooth(method = "nls", method.args = list(formula = y ~ 0.76*exp(x*0.5), color = blue, data = data)
But got the error:
Computation failed in 'stat_smooth()':
formal argument "data" matched by multiple actual arguments
With slight adjustments, I also get the error 'what' must be a function or character string. Does anyone know if manually designating a line like this is possible? I could not find any other Stack Overflow post about this specific topic.
You might be looking for geom_function():
gg0 <- ggplot(cars, aes(x = speed, y = dist)) + geom_point()
gg0 + geom_function(fun = function(x) 0.76*exp(x*0.5), colour = "blue") +
coord_cartesian(ylim=c(0,100))
I added coord_cartesian because the specified function attains really large values for the upper end of the x-range of this graph ...
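To compare several hand-picked curves at once, one option is a small function factory plus one geom_function layer per candidate. This is a minimal sketch: the helper exp_model and the parameter values are made up for illustration, and the residual sum of squares is just one rough way to compare the candidates.

```r
library(ggplot2)

# Returns a function x -> a * exp(b * x) with fixed, hand-chosen parameters.
exp_model <- function(a, b) function(x) a * exp(b * x)

# Hypothetical candidate parameter sets.
candidates <- list(
  "a = 2, b = 0.15" = exp_model(2, 0.15),
  "a = 5, b = 0.12" = exp_model(5, 0.12)
)

# One layer per candidate; a constant inside aes() gives each curve
# its own legend entry.
p <- ggplot(cars, aes(speed, dist)) +
  geom_point() +
  geom_function(fun = candidates[[1]], aes(colour = "a = 2, b = 0.15")) +
  geom_function(fun = candidates[[2]], aes(colour = "a = 5, b = 0.12"))
print(p)

# Residual sum of squares of each manual curve against the data.
rss <- sapply(candidates, function(f) sum((cars$dist - f(cars$speed))^2))
print(rss)
```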

stat_smooth with different colors using geom_point

I want to plot two numeric variables against each other in a scatterplot and the points should have different colors for each category of another binary variable. I also want to have regression lines.
This is my straightforward code:
library(MASS)
library(ggplot2)
ggplot(cats, aes(Bwt, Hwt, color = Sex)) +
geom_point() +
stat_smooth(method = "lm")
However these are lines from two separate regressions.
I want to have the regression lines from the following regression:
lm(Hwt ~ Bwt + Sex, data = cats)
I've tried the following, but this doesn't work:
ggplot(cats, aes(Bwt, Hwt, color = Sex)) +
geom_point() +
stat_smooth(method = "lm", formula = Hwt ~ Bwt + Sex)
Is there an easy (!) way to achieve this?
It would be no problem for me to write more complex code, but that's not what I'm looking for.
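One possible approach, sketched here rather than given as the definitive answer: fit the additive model yourself, store its fitted values next to the data, and draw them with geom_line instead of stat_smooth. The column name Hwt_hat and the object names are illustrative.

```r
library(MASS)     # provides the cats data
library(ggplot2)

# The single regression with a common slope and a Sex-specific intercept.
fit <- lm(Hwt ~ Bwt + Sex, data = cats)

# Keep the fitted values alongside the original rows.
cats_fit <- cats
cats_fit$Hwt_hat <- predict(fit)

# The colour aesthetic groups the line by Sex, giving two parallel lines.
p <- ggplot(cats_fit, aes(Bwt, Hwt, color = Sex)) +
  geom_point() +
  geom_line(aes(y = Hwt_hat))
print(p)
```

Because the fitted values come from one model, both lines share the Bwt slope and differ only by the Sex coefficient.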

Geom_smooth line not showing up on one of the panels in facet_grid

Hoping to get some insight on this ... currently creating some plots with ggplot and using facets, as well as adding fits using geom_smooth. I have two fits, a non-linear and linear.
For some reason, the non-linear one is not showing up in the top facet ... it used to work for me and now has stopped!
Here is the graph code:
ggplot(example_data, aes(x,y))+
geom_point(col="black", size=3)+
facet_grid(k~loc,labeller = as_labeller(loc.labels))+
geom_smooth(method = "lm", formula = y~x,col="blue", se=FALSE)+
geom_smooth(method = "nls", formula = y~A*x^ B, se=FALSE,col="red")
This is the output I get:
and for some reason the red line is missing from the top panel ..
I needed to specify start values for the non-linear line ..
updated code:
ggplot(example_data, aes(x,y))+
geom_point(col="black", size=3)+
facet_grid(k~loc)+
geom_smooth(method = "lm", formula = y~x,col="blue", se=FALSE)+
geom_smooth(method = "nls", formula = y~A*x^ B, se=FALSE,col="red",
method.args =list(start=c(A=400,B=0)))
Thanks for the speedy help!
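If guessing start values by hand is awkward, one common trick for a power law y = A*x^B is to take them from a log-log linear fit. The sketch below uses synthetic data (the real example_data is not shown in the post), so the data, the seed, and the names loglog and start are all assumptions. Note that a single start list is shared by every facet.

```r
library(ggplot2)

# Synthetic stand-in for example_data: y = 400 * x^0.5 with small noise.
set.seed(1)
example_data <- data.frame(x = seq(1, 20, length.out = 40))
example_data$y <- 400 * example_data$x^0.5 * exp(rnorm(40, sd = 0.05))

# log(y) = log(A) + B * log(x), so a linear fit gives rough A and B.
loglog <- lm(log(y) ~ log(x), data = example_data)
start <- list(A = exp(coef(loglog)[[1]]), B = coef(loglog)[[2]])

# Check outside ggplot that nls converges from these start values.
check <- nls(y ~ A * x^B, data = example_data, start = start)

p <- ggplot(example_data, aes(x, y)) +
  geom_point(col = "black", size = 3) +
  geom_smooth(method = "nls", formula = y ~ A * x^B, se = FALSE, col = "red",
              method.args = list(start = start))
print(p)
```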

How can I use Theil-Sen method with geom_smooth

I am trying to implement a theil-sen operator in ggplot's geom_smooth. In an ideal world it would read something like: geom_smooth(..., methods= "mblm"). I cannot seem to find an answer to this, nor can I figure out how I would customize the methods for this. Any advice, pointers, or code help would be greatly appreciated.
I would like to effectively add "mblm" to the method options in geom_smooth:
library(tidyverse)
library(mblm)
# Option 1 - adding 'mblm' into the methods directly
ggplot(mtcars, aes(qsec, wt))+
geom_point() +
geom_smooth(method='mblm')
# Option 2 - defining the Theil-Sen function outside
ts_fit <- mblm(qsec ~ wt, data = mtcars)
ggplot(mtcars, aes(qsec, wt))+
geom_point() +
geom_smooth( alpha=0,method=ts_fit)
Neither works. I get the following warning:
Warning message:
Computation failed in stat_smooth(): unused argument (weights = weight)
which is essentially an error in the geom_smooth line. Any help would be appreciated.
Thanks in advance,
Nate
I figured it out. Here is the answer for completeness.
# Option 2 - fitting the Theil-Sen model outside ggplot
# The response must be the plot's y variable (wt) and the predictor its x variable (qsec)
ts_fit <- mblm(wt ~ qsec, data = mtcars)
ggplot(mtcars, aes(qsec, wt))+
geom_point() +
geom_abline(intercept = coef(ts_fit)[1], slope = coef(ts_fit)[2])
Update:
Figured out a more repeatable way to accomplish this.
sen <- function(..., weights = NULL) {
mblm::mblm(...)
}
mtcars %>%
ggplot(aes(qsec, wt)) +
geom_point() +
geom_smooth(method = sen)
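For reference, the plain Theil-Sen estimate can also be computed by hand in base R and drawn with geom_abline, which avoids the method plumbing entirely. This is a minimal sketch: theil_sen is a hypothetical helper, not part of mblm, and if I read the mblm documentation correctly its default (repeated = TRUE) uses Siegel's repeated medians, so its numbers may differ slightly from this one.

```r
library(ggplot2)

# Theil-Sen: slope is the median of all pairwise slopes,
# intercept is the median of y - slope * x.
theil_sen <- function(x, y) {
  idx <- combn(length(x), 2)          # all pairs of observations
  dx <- x[idx[2, ]] - x[idx[1, ]]
  dy <- y[idx[2, ]] - y[idx[1, ]]
  slope <- median((dy / dx)[dx != 0]) # skip pairs with tied x values
  intercept <- median(y - slope * x)
  c(intercept = intercept, slope = slope)
}

# x is qsec and y is wt, matching the plot axes.
ts <- theil_sen(mtcars$qsec, mtcars$wt)

p <- ggplot(mtcars, aes(qsec, wt)) +
  geom_point() +
  geom_abline(intercept = ts[["intercept"]], slope = ts[["slope"]])
print(p)
```

On a toy input such as x = 1:5, y = c(3, 5, 7, 9, 100), the single outlier leaves the estimate at intercept 1 and slope 2, which is the robustness the method is used for.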

Resources