I am trying to use a Theil-Sen estimator in ggplot's geom_smooth. In an ideal world it would read something like: geom_smooth(..., method = "mblm"). I cannot find an answer to this, nor can I figure out how to customize the method for this. Any advice, pointers, or code help would be greatly appreciated.
I would effectively like to add "mblm" to the method options in geom_smooth:
library(tidyverse)
library(mblm)
# Option 1 - adding 'mblm' into the methods directly
ggplot(mtcars, aes(qsec, wt))+
geom_point() +
geom_smooth(method='mblm')
# Option 2 - defining the Theil-Sen function outside
ts_fit <- mblm(qsec ~ wt, data = mtcars)
ggplot(mtcars, aes(qsec, wt))+
geom_point() +
geom_smooth( alpha=0,method=ts_fit)
Neither works. I get the following warning:
Warning message: Computation failed in stat_smooth(): unused argument (weights = weight)
This is essentially an error in the geom_smooth line. Any help would be appreciated.
Thanks in advance,
Nate
I figured it out. Here is the answer for completeness.
# Fit the Theil-Sen model outside ggplot and draw it with geom_abline.
# The formula is wt ~ qsec so the coefficients match the plot's y and x axes.
ts_fit <- mblm(wt ~ qsec, data = mtcars)
ggplot(mtcars, aes(qsec, wt))+
geom_point() +
geom_abline(intercept = coef(ts_fit)[1], slope = coef(ts_fit)[2])
Update:
Figured out a more repeatable way to accomplish this.
sen <- function(..., weights = NULL) {
mblm::mblm(...)
}
mtcars %>%
ggplot(aes(qsec, wt)) +
geom_point() +
geom_smooth(method = sen)
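For anyone wondering why the wrapper is needed: as far as I can tell, stat_smooth() calls the fitting function roughly as method(formula, data = data, weights = weight), and mblm() has no weights argument, hence the "unused argument" failure. The wrapper swallows weights and forwards the rest; data appears to reach mblm()'s dataframe parameter through R's partial argument matching. Here is a minimal sketch that mimics that call by hand. The data frame d and the exact call shape are my assumptions, not taken from the ggplot2 source.

```r
library(mblm)

# Wrapper from above: accept and ignore `weights`, forward everything else
sen <- function(..., weights = NULL) {
  mblm::mblm(...)
}

# Assumed shape of the data stat_smooth() passes on: columns named x and y
d <- data.frame(x = mtcars$qsec, y = mtcars$wt)

# Roughly the call that geom_smooth(method = sen) ends up making
fit <- sen(y ~ x, data = d, weights = NULL)

# The same model fitted directly on the original columns, for comparison
direct <- mblm::mblm(wt ~ qsec, dataframe = mtcars)

coef(fit)
coef(direct)
```

If the two sets of coefficients agree, the wrapper is doing nothing more than dropping the weights argument.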
Related
I am trying to plot the interaction effects from a multiple linear regression using ggplot2. However, the slopes of the plotted lines do not match what they should be based on the estimates returned by the lm function.
Here is my code:
lm.sense <- lm(sense_of_belonging ~ active*mathEAL + MathID + comfort_speaking, data=Data)
library(ggplot2)
p.sense <- ggplot(lm.sense, aes(y=sense_of_belonging, x=active, color=mathEAL)) + geom_smooth(method="lm", se=FALSE)
Does ggplot not hold the other variables constant?
ggplot2 works with data.frames and doesn't naturally know what to do with an lm object. (Try plot(lm.sense) to see what base R offers here.)
Your ggplot call is using the underlying data from Data (tucked away inside your lm.sense object) to make a plot where x = active and y = sense_of_belonging. geom_smooth then uses that data to fit its own simple linear regression of y on x, which ignores the MathID and comfort_speaking variables and the interaction in your model. Compare these (they give the same result):
lm.mtcars <- lm(mpg ~ wt + cyl, data = mtcars)
ggplot(lm.mtcars, aes(mpg, wt)) +
geom_point() + geom_smooth(method="lm", se=FALSE)
ggplot(mtcars, aes(mpg, wt)) +
geom_point() + geom_smooth(method="lm", se=FALSE)
Depending on what you want to do, you could show some of the impact of other variables within your geom_smooth by referencing those:
ggplot(mtcars, aes(mpg, wt, color = as.character(cyl))) +
geom_point() + geom_smooth(method="lm", se=FALSE, fullrange = TRUE)
It would help to understand what kind of output you're hoping to generate to give more specific suggestions.
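If the goal is a line that does hold the other variables constant, one option is to predict from the fitted model yourself and draw that line. This is only a sketch on mtcars: the choice to fix cyl at its mean, and the name grid, are my assumptions rather than anything from the question.

```r
library(ggplot2)

fit <- lm(mpg ~ wt + cyl, data = mtcars)

# Prediction grid: vary wt over its range, hold cyl fixed at its mean
grid <- data.frame(
  wt  = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
  cyl = mean(mtcars$cyl)
)
grid$mpg <- predict(fit, newdata = grid)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # black: effect of wt from the multiple regression, cyl held constant
  geom_line(data = grid, colour = "black") +
  # blue (default): the simple mpg ~ wt fit that geom_smooth computes itself
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```

The black line has the slope reported by coef(fit)["wt"], while the geom_smooth line has the slope of the one-predictor regression, which is why the two can differ.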
I want to create a colored scatter plot and display a (multiple) linear regression. At the moment my code looks like this (using the mtcars dataset as an example):
my.formula <- y ~ x
ggplot(mtcars, aes(x=mpg, y=cyl, color=(disp))) +
geom_point() +
geom_smooth(method=lm, se=FALSE) +
ggpmisc::stat_poly_eq(formula = my.formula,
aes(label = paste(..rr.label.., sep = "~~~")),
parse = TRUE) +
scale_colour_gradientn(colours=RColorBrewer::brewer.pal(9,"YlOrRd")) +
theme_bw()
However, I would also like to include the second variable (the one mapped to colour) in the regression model. Does anybody have a suggestion on how to achieve this?
The idea would be to use a formula like my.formula <- y ~ x1 + x2, where x1 is mpg and x2 is disp, and to create a plot showing the regression and the corresponding data, in 2D if possible (subplots would also be fine if needed to show all the information).
You can create the plot manually using stat_function and the fit from your model, as described in this ggiraphExtra vignette. However, that package also has a convenient wrapper that does exactly this.
library(ggiraphExtra)
mdl <- lm(data = mtcars, cyl ~ mpg + disp)
ggPredict(mdl)
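For the manual route without ggiraphExtra, here is a rough sketch that uses predict() and geom_line() instead of stat_function. Drawing one line per disp quartile is my own choice for illustration, and grid is a hypothetical helper name.

```r
library(ggplot2)

mdl <- lm(cyl ~ mpg + disp, data = mtcars)

# Prediction grid: full mpg range at three representative disp values
grid <- expand.grid(
  mpg  = seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 50),
  disp = unname(quantile(mtcars$disp, c(0.25, 0.5, 0.75)))
)
grid$cyl <- predict(mdl, newdata = grid)

p <- ggplot(mtcars, aes(mpg, cyl, colour = disp)) +
  geom_point() +
  # one fitted line per disp value, coloured on the same scale as the points
  geom_line(data = grid, aes(group = disp)) +
  theme_bw()
p
```

Each line is the model's prediction along mpg with disp held at one quartile, so the 2D plot shows how the second predictor shifts the fit.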
I want to add how many samples were added to a graph, next to my stat_cor (ggpubr) text.
I'm using the following code to generate the graph:
library(ggplot2)
library(ggpubr)
dataset <- mtcars
ggplot(dataset, aes(wt, disp)) +
geom_jitter() +
geom_smooth(level=0.95, method = "loess") +
stat_cor(method="spearman") +
theme_classic()
But when I plot multiple graphs in one figure from a real data set where different variables have different missing values, it would be nice to show the sample size actually used for the geom_jitter points in each graph.
It's a little hacky (and limited in its options), but you can use the label.sep argument to insert the sample size between the correlation coefficient and the p-value. (Note that somewhat older versions of ggpubr have a bug with label.sep. If this doesn't work for you, try updating your package.)
ggplot(mtcars, aes(wt, disp)) +
geom_jitter() +
geom_smooth(level = 0.95, method = "loess") +
stat_cor(method = "spearman", label.sep = sprintf(", n = %s, ", nrow(mtcars))) +
theme_classic()
If your concern is missing values, you might need to use a different function than nrow, but I'll leave that to you. This also will not work with facets (you'll get the same number in each facet).
For a fully flexible solution, I think you could use a geom_text, or maybe a stat_summary with geom = "text" would be possible?
Or go hardcore like this answer, if nothing else works
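The geom_text idea could look roughly like this. It is a sketch, not tested against your real data: faceting by cyl, the helper data frame n_df, and the label position are all assumptions for illustration.

```r
library(ggplot2)

# Keep only rows where both plotted variables are present
d <- mtcars[complete.cases(mtcars[c("wt", "disp")]), ]

# Count rows per facet (here: per cyl value)
n_df <- aggregate(wt ~ cyl, data = d, FUN = length)
names(n_df)[2] <- "n"

p <- ggplot(d, aes(wt, disp)) +
  geom_jitter() +
  # one label per facet, pinned to the top-left corner of each panel
  geom_text(
    data = n_df,
    aes(x = -Inf, y = Inf, label = paste("n =", n)),
    hjust = -0.2, vjust = 1.5,
    inherit.aes = FALSE
  ) +
  facet_wrap(~ cyl) +
  theme_classic()
p
```

Because n_df carries the facetting variable, each panel gets its own count rather than the overall nrow().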
Just for completeness on missing values:
ggplot(mtcars, aes(wt, disp)) +
geom_jitter() +
geom_smooth(level = 0.95, method = "loess") +
stat_cor(method = "spearman", label.sep =
sprintf(", n = %s, ",
sum(complete.cases(mtcars[c("wt","disp")]))
)) +
theme_classic()
This computes n from the complete cases of wt and disp only, so rows with a missing value in either variable are not counted.
I'm trying to find an easy and intuitive way to calculate and display the peaks of a ggplot2::geom_density() object.
This blog explains how to do it in base R, but it is a multistep process.
But it seems much more intuitive to use the stat_peaks() function of the ggpmisc package.
However, when running the code below, I get the error: stat_peaks requires the following missing aesthetics: y
library(tidyverse)
library(ggpmisc)
ggplot(iris, aes(x = Petal.Length)) +
geom_density() +
stat_peaks(colour = "red")
When creating a geom_density() you don't need to supply a y aesthetic.
So if indeed stat_peaks is the way to go, is there a work around to this issue? Perhaps there is a better solution to my problem.
Here is a simple workaround. The idea is to call ggplot_build, let ggplot do the calculations for you and then extract the needed y aesthetic from the resulting object, which is density in your case.
library(ggplot2)
library(ggpmisc)
p <- ggplot(iris, aes(x = Petal.Length)) +
geom_density()
pb <- ggplot_build(p)
p + stat_peaks(
data = pb[['data']][[1]], # take a look at this object
aes(x = x, y = density),
colour = "red",
size = 3
)
I'm sure that this approach can be improved by one of the ggplot2 wizards around that can explain why this is not working...
ggplot(iris, aes(x = Petal.Length, y = stat(density))) +
geom_density() +
stat_peaks()
error: stat_peaks requires the following missing aesthetics: y
... which was my first guess.
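If you would rather avoid ggpmisc altogether, the peaks can also be found with base R's density() and a sign-change test on the slope. This is a sketch under the assumption that geom_density() and density() use the same default kernel and bandwidth. geom_density() evaluates the curve over a slightly different x range, so the points may sit marginally off the drawn line.

```r
library(ggplot2)

# Kernel density estimate with base R defaults
d <- density(iris$Petal.Length)

# A local maximum is where the slope changes from positive to negative
is_peak <- which(diff(sign(diff(d$y))) == -2) + 1
peaks <- data.frame(x = d$x[is_peak], y = d$y[is_peak])

p <- ggplot(iris, aes(x = Petal.Length)) +
  geom_density() +
  geom_point(data = peaks, aes(x = x, y = y), colour = "red", size = 3)
p
```

The peaks data frame holds the x position and density height of each local maximum, so it can also be reused for labels with geom_text.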
When adding a linear model trend line to a boxplot using standard R graphics I use:
boxplot(iris[,2]~iris[,1],col="LightBlue",main="Quartile1 (Rare)")
modelQ1<-lm(iris[,2]~iris[,1])
abline(modelQ1,lwd=2)
However, when using this in ggplot2:
a <- ggplot(iris,aes(factor(iris[,1]),iris[,2]))
a + geom_boxplot() +
geom_smooth(method = "lm", se=FALSE, color="black", formula=iris[,2]~iris[,1])
I get the following error:
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
And the line does not appear on my plot.
The models used in both of these scenarios are identical. If anyone could point out where I'm going wrong, that would be great.
EDIT: Used the iris dataset as an example.
The error message is pretty much self-explanatory: Add aes(group=1) to geom_smooth:
ggplot(iris, aes(factor(Sepal.Length), Sepal.Width)) +
geom_boxplot() +
geom_smooth(method = "lm", se=FALSE, color="black", aes(group=1))
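One caveat, as far as I understand how ggplot2 handles discrete axes: with factor(Sepal.Length) on x, the smooth is fitted to the integer positions of the factor levels rather than to the original numeric values, so its slope need not match lm(Sepal.Width ~ Sepal.Length). Here is a sketch of an alternative that keeps x numeric and groups only the boxplots. This is my own variation, not something the question asked for.

```r
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  # one box per distinct Sepal.Length value, placed at its numeric position
  geom_boxplot(aes(group = Sepal.Length)) +
  # x stays continuous here, so the line is fitted on the real numeric scale
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")
p
```

Since the grouping is set only inside geom_boxplot, geom_smooth sees a single group and no aes(group = 1) is needed.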
FYI, this error can also be encountered (and fixed) using the simple qplot interface to ggplot2
The error message is not explanatory enough for a few people at least :-).
In this case, the key is to include only the contents of the suggested aesthetic
library(ggplot2)
qplot(factor(Sepal.Length), Sepal.Width, geom = c("smooth"), data= iris)
# error, needs aes(group=1)
qplot(factor(Sepal.Length), Sepal.Width, geom = c("smooth"), group = 1, data= iris)