Add a manually designed non-linear line in ggplot2? - r

I would like to add a non-linear model line to a graph in R, but instead of letting ggplot find the best fit, I just want to preset its parameters and thus be able to see how multiple manually designed models fit on top of the data. I tried the following:
ggplot(cars, aes(x = speed, y = dist)) +
geom_point() +
geom_smooth(method = "nls", method.args = list(formula = y ~ 0.76*exp(x*0.5), color = blue, data = data)
But got the error:
Computation failed in 'stat_smooth()':
formal argument "data" matched by multiple actual arguments
With slight adjustments, I also get the error: 'what' must be a function or character string. Does anyone know if manually designating a line like this is possible? I could not find any other Stack Overflow post about this specific topic.

You might be looking for geom_function():
gg0 <- ggplot(cars, aes(x = speed, y = dist)) + geom_point()
gg0 + geom_function(fun = function(x) 0.76*exp(x*0.5), colour = "blue") +
coord_cartesian(ylim=c(0,100))
I added coord_cartesian because the specified function attains really large values for the upper end of the x-range of this graph ...
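Since the question asks about comparing several hand-specified models, here is a minimal sketch of how that could look with one geom_function() layer per model. Only model_a comes from the question; model_b and model_c, their parameter values, and the legend labels are illustrative assumptions.

```r
library(ggplot2)

# Candidate models with preset parameters (nothing is fitted here).
model_a <- function(x) 0.76 * exp(x * 0.5)  # from the question
model_b <- function(x) 2.0 * exp(x * 0.15)  # assumed alternative
model_c <- function(x) 0.17 * x^2           # assumed alternative

p <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_point() +
  # Mapping colour to a constant string inside aes() gives each curve a legend entry
  geom_function(aes(colour = "0.76 * exp(0.5 x)"), fun = model_a) +
  geom_function(aes(colour = "2 * exp(0.15 x)"), fun = model_b) +
  geom_function(aes(colour = "0.17 * x^2"), fun = model_c) +
  # Zoom without dropping data, because model_a grows very quickly
  coord_cartesian(ylim = c(0, 130)) +
  labs(colour = "Manual model")

print(p)
```

geom_function() needs ggplot2 3.3.0 or later; on older versions stat_function() takes the same fun argument.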

Related

What are the differences in use of labeller as function and as argument in ggplot

I have solved my particular use case, but I hope that posing this question helps others with similar issues and gives me a bookmark to come back to if I need to make similar graphics. With library(ggplot2), faceting plots into multi-panel figures is essential to many analyses. However, getting the labels right with the labeller argument and/or function can be finicky when the underlying data are not formatted correctly. Below is a simple data set that demonstrates my use case, some variations that caused me problems, and the solutions I used. I also ask how the labeller function is used within the labeller argument to facet_wrap and/or facet_grid. The starting data:
dat <- tibble(label1 = c('one','two','three','four','five','six',
'six','five','four','three','two','one'),
label2 = rep('T[alpha]'),
fit = c(25,50,75,40,60,90,40,50,70,30,90,100),
upr = c(35,60,85,50,70,100,50,60,80,40,100,110),
lwr = c(15,40,65,30,50,80,30,40,60,20,80,90),
var1 = rep(c(seq(1990,1995,1)),2))
Notice that label2 here is a character string that can be parsed as an expression. This appears to work the same way as plotting in base R:
plot(dat$var1, dat$fit)
mtext(expression(paste('T'[alpha])))
Essentially, as long as the expression can be evaluated within the expression function, it will work with label_parsed as an argument to facet_grid or facet_wrap. Here is what the code looks like for the actual plot I want:
p1 <- ggplot(dat, aes(x = var1, y = fit)) +
geom_point() +
geom_segment(aes(xend = var1, y = lwr, yend = upr)) +
facet_grid(label1~label2, scales = "fixed",
labeller = label_parsed)
p1
Now if I want to use label1 with a space, I can use this solution as well, but need to include a tilde ~ anywhere the space should exist so that it is still able to be evaluated as an expression.
dat1 <- dat %>%
mutate(label1 = paste(label1, '~x', sep = ''))
dat1
p2 <- ggplot(dat1, aes(x = var1, y = fit)) +
geom_point() +
geom_segment(aes(xend = var1, y = lwr, yend = upr)) +
facet_grid(label1~label2, scales = "fixed",
labeller = label_parsed)
p2
This is as far as I had to go to solve my use case. However, to better understand the functionality, I would like to ask the community to clarify another way of using labeller as a function. The question posed here:
label_parsed of facet_grid in ggplot2 mixed with spaces and expressions
eventually helped me arrive at my solution. However, if we use the code as written in that question, where the labeller argument is labeller = labeller(type=label_parsed), the result is not what is desired: we get the character string as written instead of it being evaluated as an expression.
p3 <- ggplot(dat1, aes(x = var1, y = fit)) +
geom_point() +
geom_segment(aes(xend = var1, y = lwr, yend = upr)) +
facet_grid(label1~label2, scales = "fixed",
labeller = labeller(type=label_parsed)) # this is the change
p3
Can anyone explain when it would be appropriate to use labeller as a function within the call to labeller from facet_grid? My hope is that a solution exists that doesn't require re-formatting of the entire dataset to reflect expressions and perhaps that utility lies within the labeller function itself.
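One possible explanation, offered as a sketch rather than a definitive answer: as far as I understand the ggplot2 documentation, the named arguments of labeller() are matched against the names of the faceting variables. Since type is not a faceting variable here, it is not applied to any strip and the default label_value is used instead. Under that assumption, naming the actual variables (or setting .default) lets you choose a labeller per variable without reformatting the data. The small data frame below is a cut-down, hypothetical version of the one in the question.

```r
library(ggplot2)

# Cut-down stand-in for the question's data (values are illustrative)
dat <- data.frame(
  label1 = rep(c("one~x", "two~x"), each = 3),
  label2 = "T[alpha]",
  var1   = rep(1990:1992, 2),
  fit    = c(25, 50, 75, 40, 60, 90)
)

base <- ggplot(dat, aes(x = var1, y = fit)) + geom_point()

# Argument names match the faceting variables:
# label1 is shown verbatim, label2 is parsed as plotmath.
p4 <- base +
  facet_grid(label1 ~ label2,
             labeller = labeller(label1 = label_value, label2 = label_parsed))

# Alternatively, change the fallback used for every variable not named explicitly.
p5 <- base +
  facet_grid(label1 ~ label2,
             labeller = labeller(.default = label_parsed))

print(p4)
print(p5)
```

With this approach only the columns that really contain expressions need plotmath syntax; a column with plain text and spaces can keep label_value.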

How to add sample size used in plotting geom_jitter

I want to add how many samples were added to a graph, next to my stat_cor (ggpubr) text.
I'm using the following code to generate the graph:
dataset = mtcars
ggplot(dataset, aes(dataset$wt, dataset$disp)) +
geom_jitter() +
geom_smooth(level=0.95, method = "loess") +
stat_cor(method="spearman") +
theme_classic()
But, if I want to plot multiple graphs in one figure, which uses a real data set where different variables have different missing values, it would be nice to have my sample size used to plot the geom_jitter.
It's a little hacky (and limited in its options), but you can use the label.sep argument to insert the sample size between the correlation coefficient and the p-value. Note that somewhat older versions of ggpubr have a bug with label.sep; if this doesn't work for you, try updating your package.
ggplot(mtcars, aes(wt, disp)) +
geom_jitter() +
geom_smooth(level = 0.95, method = "loess") +
stat_cor(method = "spearman", label.sep = sprintf(", n = %s, ", nrow(mtcars))) +
theme_classic()
If your concern is missing values, you might need to use a different function than nrow, but I'll leave that to you. This also will not work with facets (you'll get the same number in each facet).
For a fully flexible solution, I think you could use a geom_text, or maybe a stat_summary with geom = "text" would be possible?
Or go hardcore like this answer, if nothing else works
Just for completeness on missing values:
ggplot(mtcars, aes(wt, disp)) +
geom_jitter() +
geom_smooth(level = 0.95, method = "loess") +
stat_cor(method = "spearman", label.sep =
sprintf(", n = %s, ",
sum(complete.cases(mtcars[c("wt","disp")]))
)) +
theme_classic()
This reports n as the number of complete cases of wt and disp, as the example shows.
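For the faceted case, the geom_text idea mentioned above could be sketched as follows. This is only an illustration under assumptions: faceting by cyl, the n_per_facet data frame, and the label placement are choices made for the example, and stat_cor is left out so the sketch depends on ggplot2 alone.

```r
library(ggplot2)

# Keep only rows where both plotted variables are present
d <- mtcars[complete.cases(mtcars[c("wt", "disp")]), ]

# One row per facet holding that facet's sample size
n_per_facet <- aggregate(wt ~ cyl, data = d, FUN = length)
names(n_per_facet)[2] <- "n"
n_per_facet$label <- paste0("n = ", n_per_facet$n)

p <- ggplot(d, aes(wt, disp)) +
  geom_jitter() +
  # Inf / -Inf pin the label to the bottom-right corner of each panel
  geom_text(data = n_per_facet,
            aes(x = Inf, y = -Inf, label = label),
            hjust = 1.1, vjust = -0.5, inherit.aes = FALSE) +
  facet_wrap(~cyl) +
  theme_classic()

print(p)
```

Because the counts live in their own data frame with the faceting column, each panel receives its own n.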

Problems with ggplot2 and geom_errorbar()

Greetings,
I'm having a hard time with ggplot2 and the geom_errorbar function.
I have a data frame with individuals (rows), size (column 1) and density (column 2). My aim is to plot the influence of density on size in a quadratic model.
lm(size ~ poly(density, 2, raw=TRUE))
For that, I used:
ggplot(df, aes(x = density, y = size, col = Sexo)) +
geom_smooth(method = lm, formula = y ~ x + I(x^2), size = 1)+
geom_point()
It went fine. But now I want to plot the same data set with geom_errorbar. I tried.
ggplot(cg.cvic, aes(x = as.factor(density), y = size, col = sex)) +
geom_errorbar(ymin = size-sd, ymax = size + sd)
And I'm getting the response:
Error in size - sd : non-numeric argument to binary operator
What am I doing wrong?
First, there is no column sd in your data frame. Moreover, R has a built-in function sd, which is a function, not a variable or a number. So from R's perspective you are trying to subtract a function from a variable, and R tells you that one of the arguments is non-numeric: you are applying an operation that only works on numbers. You have to extract the standard deviation of your model predictions somehow, store it in your data frame, and then use it in ggplot. And don't name it sd; use something else.
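A minimal sketch of that advice, assuming the error bars should show the mean plus or minus one standard deviation of size at each density. The simulated df and the column names size_mean and size_sd are hypothetical stand-ins for the asker's data. Note that ymin and ymax are placed inside aes() so that they are looked up in the data frame rather than in the calling environment.

```r
library(ggplot2)

# Simulated stand-in for the asker's data (values are made up)
set.seed(1)
df <- data.frame(
  density = rep(c(1, 2, 4), each = 10),
  size    = rnorm(30, mean = 10, sd = 2)
)

# Summarise per density; the SD column is deliberately not called "sd"
summ <- data.frame(
  density   = sort(unique(df$density)),
  size_mean = as.vector(tapply(df$size, df$density, mean)),
  size_sd   = as.vector(tapply(df$size, df$density, sd))
)

p <- ggplot(summ, aes(x = factor(density), y = size_mean)) +
  geom_point() +
  # ymin / ymax refer to columns of summ because they sit inside aes()
  geom_errorbar(aes(ymin = size_mean - size_sd,
                    ymax = size_mean + size_sd),
                width = 0.2)

print(p)
```

If the bars should instead reflect uncertainty of the fitted quadratic model, the same pattern applies with the standard errors from predict(..., se.fit = TRUE) stored in a column.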

Suppress message from geom_line with only one point

I'm iterating through multiple data sets to produce line plots for each set. How can I prevent ggplot from complaining when I use geom_line over one point?
Take, for example, the following data:
mydata = data.frame(
x = c(1, 2),
y = c(2, 2),
group = as.factor(c("foo", "foo"))
)
Creating a line graph looks and works just fine because there are two points in the line:
ggplot(mydata, aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
However, plotting only the first row gives the message:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
ggplot(mydata[1,], aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
Some of my figures will only have one point and the messages cause hangups in the greater script that produces these figures. I know the plots still work, so my concern is avoiding the message. I'd also like to avoid using suppressWarnings() if possible in case another legitimate and unexpected issue arises.
Per an answer to this question: suppressMessages(ggplot()) fails because you have to wrap it around a print() call of the ggplot object--not the ggplot object itself. This is because the warning/message only occurs when the object is drawn.
So, to view your plot without a warning message run:
p <- ggplot(mydata[1,], aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
suppressMessages(print(p))
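If silencing every message feels too broad, one option is to muffle only the messages whose text matches a pattern and let everything else through. The sketch below uses base R's withCallingHandlers(); print_muffling is a hypothetical helper name, and the pattern assumes the message still contains the phrase "only one observation", which may differ between ggplot2 versions.

```r
library(ggplot2)

# Hypothetical helper: print a plot, muffling only messages that match `pattern`
print_muffling <- function(plot, pattern = "only one observation") {
  withCallingHandlers(
    print(plot),
    message = function(m) {
      # Silence the expected message; any other message propagates as usual
      if (grepl(pattern, conditionMessage(m))) invokeRestart("muffleMessage")
    }
  )
  invisible(plot)
}

mydata <- data.frame(x = 1, y = 2, group = factor("foo"))

p <- ggplot(mydata, aes(x = x, y = y)) +
  geom_point() +
  geom_line(aes(group = group))

print_muffling(p)
```

Warnings and unrelated messages are untouched, so an unexpected problem elsewhere in the script would still surface.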
I think the following if-else solution should resolve the problem:
if (nrow(mydata) > 1) {
ggplot(mydata, aes(x = x, y = y)) +
geom_point() +
geom_line(aes(group = group))
} else {
ggplot(mydata, aes(x = x, y = y)) +
geom_point()
}
On the community.RStudio.com, John Mackintosh suggests a solution which worked for me:
Freely quoting:
Rather than suppress warnings, change the plot layers slightly.
Facet wrap to create empty plot
Add geom_point for entire data frame
Subset the dataframe by creating a vector of groups with more than one data point, and filtering the original data for those groups. Only plot lines for this subset.
Details and example code in the followup of the link above.
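The subsetting step quoted above might be sketched like this in base R. It is not the code from the linked post; the three-row mydata and the names multi and line_data are assumptions made for illustration.

```r
library(ggplot2)

# "foo" has two points, "bar" has only one
mydata <- data.frame(
  x     = c(1, 2, 1),
  y     = c(2, 2, 3),
  group = factor(c("foo", "foo", "bar"))
)

# Names of the groups that have more than one observation
counts <- table(mydata$group)
multi  <- names(counts)[counts > 1]

# Only these rows are handed to geom_line()
line_data <- mydata[mydata$group %in% multi, ]

p <- ggplot(mydata, aes(x = x, y = y)) +
  geom_point() +                                     # all points are drawn
  geom_line(data = line_data, aes(group = group))    # lines only where possible

print(p)
```

Every point is still shown, while geom_line() never sees a single-observation group, so the message is not triggered in the first place.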

Adding a simple lm trend line to a ggplot boxplot

When adding a linear model trend line to a boxplot using standard R graphics I use:
boxplot(iris[,2]~iris[,1],col="LightBlue",main="Quartile1 (Rare)")
modelQ1<-lm(iris[,2]~iris[,1])
abline(modelQ1,lwd=2)
However, when using this in ggplot2:
a <- ggplot(iris,aes(factor(iris[,1]),iris[,2]))
a + geom_boxplot() +
geom_smooth(method = "lm", se=FALSE, color="black", formula=iris[,2]~iris[,1])
I get the following error:
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
And the line does not appear on my plot.
The models used in both of these scenarios are identical. If anyone could point out where I'm going wrong, that would be great.
EDIT: Used the iris dataset as an example.
The error message is pretty much self-explanatory: Add aes(group=1) to geom_smooth:
ggplot(iris, aes(factor(Sepal.Length), Sepal.Width)) +
geom_boxplot() +
geom_smooth(method = "lm", se=FALSE, color="black", aes(group=1))
FYI, this error can also be encountered (and fixed) using the simple qplot interface to ggplot2
The error message is not explanatory enough for a few people at least :-).
In this case, the key is to include only the contents of the suggested aesthetic
library(ggplot2)
qplot(factor(Sepal.Length), Sepal.Width, geom = c("smooth"), data= iris)
# error, needs aes(group=1)
qplot(factor(Sepal.Length), Sepal.Width, geom = c("smooth"), group = 1, data= iris)
