I am fairly new to modelling. I have three groups of data (by period), which I want to display as fitted lines over a scatter plot.
I figured out how to put my method and formula in geom_smooth, and I am able to display a single line.
However, when I try to add one line per group with ggplot(.., aes(.., group = period)), I get a warning:
Warning message:
Computation failed in `stat_smooth()`:
number of iterations exceeded maximum of 50
and the line is not displayed.
My working code:
ggplot(tab, aes(x = distance, y = grad)) +
  geom_point() + theme_bw() +
  geom_smooth(method = "nls",
              formula = y ~ a*x^(-b),
              method.args = list(start = c(a = 20, b = 0.01)),
              se = FALSE)
results:
Code producing the warning (with group = period added to aes) and not displaying lines per group:
ggplot(tab, aes(x = distance, y = grad, group = period)) +
  geom_point() + theme_bw() +
  geom_smooth(method = "nls",
              formula = y ~ a*x^(-b),
              method.args = list(start = c(a = 20, b = 0.01)),
              se = FALSE)
Do you have any idea how I can increase the number of iterations used by geom_smooth in ggplot2?
I found that in base R modelling the number of iterations can be increased with control=nls.control(maxiter=200) (https://stat.ethz.ch/pipermail/r-help/2006-June/107606.html), but I can't find a solution or directions for ggplot2.
Based on @Axeman's comment, I added control=nls.control(maxiter=200) to method.args:
method.args = list(start=c(a=20, b=0.01),
control=nls.control(maxiter=200))
The whole script is thus:
ggplot(tab, aes(x = distance, y = grad, group = period, col = period)) +
  geom_point(col = "grey") + theme_bw() +
  geom_smooth(method = "nls",
              formula = y ~ a*x^(-b),
              method.args = list(start = c(a = 20, b = 0.01),
                                 control = nls.control(maxiter = 200)),
              se = FALSE)
And the result is:
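If the warning comes back for one of the groups, it can help to run the per-group fits outside ggplot2 first and see which one struggles. The following is only a minimal sketch on made-up data: sim_tab, true_par and fit_one are hypothetical names (this is not the tab from the question), and the starting values come from a log-log linear fit instead of the fixed start used above.

```r
# Hypothetical data standing in for `tab`: three periods, power-law decay plus noise.
set.seed(42)
true_par <- data.frame(period = c("p1", "p2", "p3"),
                       a = c(20, 15, 10),
                       b = c(0.4, 0.5, 0.6))
sim_tab <- do.call(rbind, lapply(seq_len(nrow(true_par)), function(i) {
  distance <- seq(1, 50, length.out = 60)
  data.frame(period = true_par$period[i],
             distance = distance,
             grad = true_par$a[i] * distance^(-true_par$b[i]) + rnorm(60, sd = 0.1))
}))

# Fit grad = a * distance^(-b) separately for one period.
fit_one <- function(d) {
  # log(grad) = log(a) - b * log(distance) gives rough starting values.
  lin <- lm(log(grad) ~ log(distance), data = d)
  start <- list(a = exp(coef(lin)[[1]]), b = -coef(lin)[[2]])
  nls(grad ~ a * distance^(-b), data = d, start = start,
      control = nls.control(maxiter = 200))
}
fits <- lapply(split(sim_tab, sim_tab$period), fit_one)

# Matrix of estimates: rows "a" and "b", one column per period.
est <- sapply(fits, coef)
print(round(est, 2))
```

A group that fails here would fail inside geom_smooth too, so this is a quick way to tell whether more iterations or better starting values are needed.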
I have a data.frame with observed success/failure outcomes per two groups along with expected probabilities:
library(dplyr)
observed.probability.df <- data.frame(group = c("A","B"), p = c(0.4,0.6))
expected.probability.df <- data.frame(group = c("A","B"), p = qlogis(c(0.45,0.55)))
observed.data.df <- do.call(rbind,lapply(c("A","B"), function(g)
data.frame(group = g, value = c(rep(0,1000*dplyr::filter(observed.probability.df, group != g)$p),rep(1,1000*dplyr::filter(observed.probability.df, group == g)$p)))
)) %>% dplyr::left_join(expected.probability.df)
observed.probability.df$group <- factor(observed.probability.df$group, levels = c("A","B"))
observed.data.df$group <- factor(observed.data.df$group, levels = c("A","B"))
I'm fitting a logistic regression (binomial glm with a logit link function) to these data with the offset term:
fit <- glm(value ~ group + offset(p), data = observed.data.df, family = binomial(link = 'logit'))
Now, I'd like to plot these data as a bar graph using ggplot2's geom_bar, color-coded by group, and to add to that the trend line and shaded standard error area estimated in fit.
I'd use stat_smooth for that, but I don't think it can handle the offset term in its formula, so it looks like I need to assemble this figure in an alternative way.
To get the bars and the trend line I used:
slope.est <- function(x, ests) plogis(ests[1] + ests[2] * x)
library(ggplot2)
ggplot(observed.probability.df, aes(x = group, y = p, fill = group)) +
geom_bar(stat = 'identity') +
stat_function(fun = slope.est,args=list(ests=coef(fit)),size=2,color="black") +
scale_x_discrete(name = NULL,labels = levels(observed.probability.df$group), breaks = sort(unique(observed.probability.df$group))) +
theme_minimal() + theme(legend.title = element_blank()) + ylab("Fraction of cells")
So the question is how to add to that the shaded standard error around the trend line?
Using stat_function I am able to shade the entire area from the upper bound of the standard error all the way down to the X-axis:
ggplot(observed.probability.df, aes(x = group, y = p, fill = group)) +
geom_bar(stat = 'identity') +
stat_function(fun = slope.est,args=list(ests=coef(fit)),size=2,color="black") +
stat_function(fun = slope.est,args=list(ests=summary(fit)$coefficients[,1]+summary(fit)$coefficients[,2]),geom='area',fill="gray",alpha=0.25) +
scale_x_discrete(name = NULL,labels = levels(observed.probability.df$group), breaks = sort(unique(observed.probability.df$group))) +
theme_minimal() + theme(legend.title = element_blank()) + ylab("Fraction of cells")
Which is close but not quite there.
Any idea how to remove from the shaded area the part that lies below the lower bound of the standard error? Perhaps geom_ribbon is the way to go here, but I don't know how to combine it with the slope.est function.
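One possible direction, sketched on made-up data with a continuous predictor: sim_df, grid and the simulation coefficients are assumptions, and the offset term is left out for brevity. The idea is to compute the fit and its standard error on the link scale with predict(..., se.fit = TRUE), back-transform with plogis, and pass the two bounds to geom_ribbon, which fills only between ymin and ymax.

```r
library(ggplot2)

# Made-up binary data with a continuous predictor (not the data from the question).
set.seed(1)
sim_df <- data.frame(x = runif(200, -2, 2))
sim_df$value <- rbinom(200, 1, plogis(0.5 + 1.2 * sim_df$x))

fit <- glm(value ~ x, data = sim_df, family = binomial(link = "logit"))

# Predict on the link scale, then back-transform the fit and the +/- 1 SE bounds.
grid <- data.frame(x = seq(-2, 2, length.out = 100))
pr <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)
grid$fit   <- plogis(pr$fit)
grid$lower <- plogis(pr$fit - pr$se.fit)
grid$upper <- plogis(pr$fit + pr$se.fit)

# The ribbon covers only the band between lower and upper, not down to the axis.
p <- ggplot(grid, aes(x = x)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "gray", alpha = 0.25) +
  geom_line(aes(y = fit), color = "black")
```

For a model with an offset, the newdata passed to predict would also need the offset column.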
How do you use geom_smooth() using a formula with the following form:
log(Unit.Sales_1) ~ log(Price) + A
On top of a ggplot that is using a completely different dataset?
I'm currently able to use either a transformed axis or a different dataset, but not both at the same time. I get the following error message:
Computation failed in stat_smooth(): object 'Unit.Sales_1' not found
And the rest of my ggplot looks like this:
ggplot() +
geom_point(data = hist_data1, aes(x = Price, y = Unit.Sales, color = "Historicals")) +
geom_line(data = est_data1, aes(x = P1, y = Q1, color = "Estimated")) +
geom_smooth(data = wide_oj_data,
formula = Unit.Sales_1 ~ log(Price_1) + log(Price_3) + Promotion_1 + Holiday,
method = "glm",
method.args = list(family = gaussian(link = 'log')),
aes(x = Price_1, y = Unit.Sales_1)
)
Thank you :)
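The error arises because geom_smooth evaluates its formula against the layer's aesthetic columns, so only x and y are visible to it, not Unit.Sales_1 or the other predictors. A model with several predictors can instead be fitted outside ggplot2 and drawn with geom_line. A minimal sketch on invented data follows; sim_oj, its values, and the fixed covariate values used for prediction are assumptions, and only the column names follow the question.

```r
library(ggplot2)

# Invented stand-in for wide_oj_data; only the column names follow the question.
set.seed(1)
n <- 200
sim_oj <- data.frame(Price_1 = runif(n, 1, 4),
                     Price_3 = runif(n, 1, 4),
                     Promotion_1 = rbinom(n, 1, 0.3),
                     Holiday = rbinom(n, 1, 0.1))
sim_oj$Unit.Sales_1 <- exp(5 - 0.8 * log(sim_oj$Price_1) + 0.2 * log(sim_oj$Price_3) +
                             0.3 * sim_oj$Promotion_1 + 0.1 * sim_oj$Holiday) *
  exp(rnorm(n, sd = 0.05))

# Fit the full model outside ggplot2, where every column is available.
fit <- glm(Unit.Sales_1 ~ log(Price_1) + log(Price_3) + Promotion_1 + Holiday,
           data = sim_oj, family = gaussian(link = "log"))

# Predict along Price_1 while holding the other predictors at fixed values.
pred <- data.frame(Price_1 = seq(1, 4, length.out = 100),
                   Price_3 = mean(sim_oj$Price_3),
                   Promotion_1 = 0,
                   Holiday = 0)
pred$fit <- predict(fit, newdata = pred, type = "response")

p <- ggplot() +
  geom_point(data = sim_oj, aes(x = Price_1, y = Unit.Sales_1)) +
  geom_line(data = pred, aes(x = Price_1, y = fit))
```

The drawn curve is a slice of the model at the chosen covariate values, so a different choice for Price_3, Promotion_1 or Holiday shifts it.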
I'm learning ggplot2 and don't understand why the second block of code produces an error. All I had to do was add the aesthetics to the stat_smooth call, as in the third block, and it ran fine, but I don't understand why.
ggplot(df, aes(x=wave.height, y=ship.deploy)) + geom_point() +
stat_smooth(method="glm", method.args=list(family="binomial"), se=FALSE)
ggplot(data = df) +
geom_point(mapping = aes(x = wave.height, y = ship.deploy)) +
stat_smooth(method = "glm", method.args = list(family = "binomial"), se = FALSE)
Error: stat_smooth requires the following missing aesthetics: x, y
ggplot(data = df) +
geom_point(mapping = aes(x = wave.height, y = ship.deploy)) +
stat_smooth(mapping = aes(x = wave.height, y = ship.deploy),method = "glm", method.args = list(family = "binomial"), se = FALSE)
Only aesthetic mappings specified at the top level, ggplot(aes()), are inherited by subsequent layers. Aesthetics specified in a single layer, geom_point(aes()), apply only to that layer.
To avoid re-specifying the same mappings, put them at the top, as in your first code.
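The difference can be checked directly. This is a small sketch using the built-in mtcars data as a stand-in for df (an assumption; the column names differ from the question):

```r
library(ggplot2)

# Mappings given in ggplot() are inherited by every layer.
p_top <- ggplot(mtcars, aes(x = mpg, y = vs)) +
  geom_point() +
  stat_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = "binomial"), se = FALSE)

# Mappings given only to geom_point() stay in that layer,
# so stat_smooth() has no x or y to work with.
p_layer <- ggplot(mtcars) +
  geom_point(aes(x = mpg, y = vs)) +
  stat_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = "binomial"), se = FALSE)

smooth_top  <- layer_data(p_top, 2)                        # builds without error
build_layer <- try(layer_data(p_layer, 2), silent = TRUE)  # fails: missing x and y
```

Here build_layer holds the error instead of stopping the script, which makes it easy to inspect the message.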
ggplot(data = wheatX,
aes(x = No.of.species,
y = Weight.of.weed,
color = Treatment)) +
geom_point(shape = 1) +
scale_colour_hue(l = 50) +
geom_smooth(method = glm,
se = FALSE)
This draws a straight line.
But the weed weight will decrease at some point as the number of species grows, so I want the line to curve. How can I do this? Thanks.
This is going to depend on what you mean by "smooth".
One thing you can do is apply a loess curve:
ggplot() + ... + stat_smooth(method = "loess", formula = y ~ x, size = 1)
Or you can manually build a polynomial model using the regular lm method:
ggplot() + ... + stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 1)
Note that the formula is written in terms of x and y, the aesthetics, not the column names. You'll need to figure out the exact model you want to use for the second case, which is what I originally meant by the definition of the term "smooth".
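As a runnable illustration of both options, here is a sketch on the built-in mtcars data, used only as a stand-in because the wheatX data is not available; hp and mpg are assumptions and have nothing to do with the weed experiment.

```r
library(ggplot2)

# mtcars is stand-in data; mpg falls off with hp along a visibly curved path.
base <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(shape = 1)

# Local regression: flexible, with no explicit functional form.
p_loess <- base + geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# Quadratic polynomial fitted with lm; the formula uses x and y, not column names.
p_quad <- base + geom_smooth(method = "lm", formula = y ~ x + I(x^2), se = FALSE)

# Extract the computed curves to confirm that both layers were fitted.
loess_curve <- layer_data(p_loess, 2)
quad_curve  <- layer_data(p_quad, 2)
```

The loess curve follows the data locally, while the quadratic imposes a single parabola, so the choice depends on whether you want a specific model or just a visual guide.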
I am working on some viscosity experiments and I'm trying to make an Eyring plot with ν vs. θ.
When I create the plot with ggplot2 I can't get my model displayed.
These are the values used:
> theta
[1] 25 30 35 40 45
> nu
[1] 1.448462 1.362730 1.255161 1.167408 1.083005
Here I create the plot with my values from above:
plot <-
ggplot()+
geom_point(mapping = aes(theta, nu), colour = "#0072bd", size = 4, shape = 16)+
theme_bw()+
labs(
x = expression(paste(theta, " ", "[°C]")),
y = expression(paste("ln(", nu, ")", " ", "[mPa*s]")))+
ylim(0, 10)+
xlim(0, 100)
That's what the plot looks like.
Now, I add my model with geom_smooth()
plot +
geom_smooth(
method = "nls",
method.args = list(formula = nu~a*exp(b/theta),
start=list(a=1, b=0.1)))
But nothing happens... Not even an error message and the plot looks just the same as before.
I also tried to put the formula directly as a geom_smooth() argument and the start values as well,
plot +
geom_smooth(
method = "nls",
formula = nu~a*exp(b/theta),
start=list(a=1, b=0.1))
but then I get the
Error:Unknown parameter: start
Can anyone find the mistake I'm making?
Thanks in advance!
Cheers
EDIT
When separating the aesthetics mapping,
plot <-
ggplot()+
aes(theta, nu)+
geom_point(colour = "#0072bd", size = 4, shape = 16)+
theme_bw()+
labs(
x = expression(paste(theta, " ", "[°C]")),
y = expression(paste("ln(", nu, ")", " ", "[mPa*s]")))+
ylim(0, 10)+
xlim(0, 100)
I get the following warnings (and still nothing changes):
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: Computation failed in stat_smooth():
$ operator is invalid for atomic vectors
You have several things going on, many of which were pointed out in the comments.
Once you put your variables in a data.frame for ggplot and define your aesthetics either globally in ggplot or within each geom, the main issue is that the formula in geom_smooth expects you to refer to y and x instead of the variable names. geom_smooth will use the variables you mapped to y and x in aes.
The other complication you will run into is outlined here. Because you don't get standard errors from predict.nls, you need to use se = FALSE in geom_smooth.
Here is what your geom_smooth code might look like:
geom_smooth(method = "nls", se = FALSE,
method.args = list(formula = y~a*exp(b/x), start=list(a=1, b=0.1)))
And here is the full code and plot.
ggplot(df, aes(theta, nu))+
geom_point(colour = "#0072bd", size = 4, shape = 16)+
geom_smooth(method = "nls", se = FALSE,
method.args = list(formula = y~a*exp(b/x), start=list(a=1, b=0.1))) +
theme_bw()+
labs(
x = expression(paste(theta, " ", "[°C]")),
y = expression(paste("ln(", nu, ")", " ", "[mPa*s]")))+
ylim(0, 10) +
xlim(0, 100)
Note that geom_smooth won't fit outside the range of the dataset unless you use fullrange = TRUE instead of the default. This may be pretty questionable if you only have 5 data points.
ggplot(df, aes(theta, nu))+
geom_point(colour = "#0072bd", size = 4, shape = 16)+
geom_smooth(method = "nls", se = FALSE, fullrange = TRUE,
method.args = list(formula = y~a*exp(b/x), start=list(a=1, b=0.1))) +
theme_bw()+
labs(
x = expression(paste(theta, " ", "[°C]")),
y = expression(paste("ln(", nu, ")", " ", "[mPa*s]")))+
ylim(0, 10) +
xlim(0, 100)
I just wrote this answer as @lukeA made the comment.
df<- data.frame(theta = c(25, 30, 35, 40, 45),
nu = c( 1.448462, 1.362730, 1.255161, 1.167408, 1.083005))
myModel <- nls(nu~a*exp(b/theta), data=df, start=list(a=1, b=0.1))
myPredict <- expand.grid(theta = seq(5, 100, by =0.1))
#expand.grid here in case your model has more than one variable
#Caution, extrapolating well beyond the data
myPredict$fit <- predict(myModel, newdata= myPredict)
plot + geom_line(data = myPredict, aes(x= theta, y= fit))