How can I apply geom_smooth() to every group?
The code below uses facet_wrap(), so it plots every group in a separate panel.
I would like to combine the panels and get a single graph.
ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length)) +
geom_point(aes(color = Species)) +
geom_smooth(method = "nls", formula = y ~ a * x + b, se = FALSE,
method.args = list(start = list(a = 0.1, b = 0.1))) +
facet_wrap(~ Species)
You have to put all of your variables in the aes() of the main ggplot() call:
ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
geom_point() +
geom_smooth(method = "nls", formula = y ~ a * x + b, se = FALSE,
method.args = list(start = list(a = 0.1, b = 0.1)))
Adding a mapping aes(group=Species) to the geom_smooth() call will do what you want.
Basic plot:
library(ggplot2); theme_set(theme_bw())
g0 <- ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length)) +
geom_point(aes(color = Species))
geom_smooth:
g0 + geom_smooth(aes(group=Species),
method = "nls", formula = y ~ a * x + b, se = FALSE,
method.args = list(start = list(a = 0.1, b = 0.1)))
formula is always expressed in terms of x and y, no matter what variables are called in the original data set:
the x variable in the formula refers to the variable that is mapped to the x-axis (Sepal.Length)
the y variable to the y-axis variable (Petal.Length)
The model is fitted separately to groups in the data (Species).
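As a rough illustration of the points above (this is a base-R sketch of the idea, not ggplot2's actual internals), the mapped columns can be renamed to x and y by hand and the same nls formula fitted once per group. The names mapped and fits are made up for this example.

```r
# Mimic the aesthetic mapping: Sepal.Length -> x, Petal.Length -> y
mapped <- data.frame(
  x = iris$Sepal.Length,
  y = iris$Petal.Length,
  group = iris$Species
)

# Fit the same formula separately within each group
fits <- lapply(split(mapped, mapped$group), function(d) {
  nls(y ~ a * x + b, data = d, start = list(a = 0.1, b = 0.1))
})

# One column of coefficients (a, b) per species
sapply(fits, coef)
```

Each species gets its own slope and intercept, which is why the plot shows three separate lines.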
Adding a colour mapping (for a factor variable) has the same effect, because groups are implicitly defined by the intersection of all the mappings used to distinguish geoms. As a bonus, the lines will be coloured to match the points.
g0 + geom_smooth(aes(colour=Species),
method = "nls", formula = y ~ a * x + b, se = FALSE,
method.args = list(start = list(a = 0.1, b = 0.1)))
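If you want to check the implicit grouping yourself, one option is to inspect the data that the layer ends up with. The sketch below uses layer_data() and, to keep it short, method = "lm" instead of nls; the object name p is just for illustration.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_smooth(aes(colour = Species), method = "lm", formula = y ~ x, se = FALSE)

# The computed layer data carries a `group` column;
# it should have one distinct value per species
smooth_groups <- unique(layer_data(p, 1)$group)
length(smooth_groups)
```

No group aesthetic was set explicitly here, so the grouping comes from the colour mapping alone.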
As @HubertL points out, if you want to apply the same aesthetics to all of your geoms, you can just put them in the original ggplot call ...
By the way, I assume that in reality you want to use a more complex nls model - otherwise you could just use geom_smooth(...,method="lm") and save yourself trouble ...
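To see why lm would do here, a quick base-R comparison on a single species (setosa is an arbitrary choice for this sketch) shows that the nls model y ~ a * x + b is just a straight line, so both fits should agree up to numerical tolerance:

```r
setosa <- subset(iris, Species == "setosa")

# Straight line fitted by non-linear least squares
fit_nls <- nls(Petal.Length ~ a * Sepal.Length + b,
               data = setosa, start = list(a = 0.1, b = 0.1))

# The same straight line fitted by ordinary least squares
fit_lm <- lm(Petal.Length ~ Sepal.Length, data = setosa)

coef(fit_nls)  # a = slope, b = intercept
coef(fit_lm)   # (Intercept), then the slope for Sepal.Length
```

With lm there are no starting values to choose, and se = TRUE becomes available in geom_smooth().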
I am trying to add the fitted equations and R-squared values, each panel with a different formula. I tried the mapply function, modelled on a previous answer, but nothing happens. There is no error, but no equation is displayed either. I also want to plot the equation on one line and the R-squared on the next line, but I don't know where exactly to add the \n in stat_poly_eq.
library(ggplot2)
library(ggpmisc)
set.seed(14)
df <- data.frame(
var.test = c("T","T","T","T","M","M","M","M","A","A","A","A"),
val.test = rnorm(12,8,5),
x = c(1:12)
)
my.formula <- c(y~x + I(x^2), y~x, y~x + I(x^2))
ggplot(df, aes(x = x, y = val.test)) +
geom_point() +
mapply(function(x, z) geom_smooth(method="glm", data=function(d) subset(d, var.test==z), formula = x,
method.args = list(family = "poisson"), color = "black" ), my.formula, c("A","M","T")) + facet_grid(.~var.test) +
mapply(function(x,z) stat_poly_eq(formula = x, aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), parse = TRUE, size = 2.5, col = "black", data=function(d) subset(d, var.test==z),my.formula, c("A","M","T")))
The issue with your code was a misplaced closing parenthesis, i.e. you passed my.formula and c("A","M","T") as arguments to stat_poly_eq instead of to mapply. That's why no labels were plotted: mapply looped over nothing.
Concerning your second question: to the best of my knowledge, you can't have a line break in a math expression. One way to deal with that is to add the equation and the R^2 via two separate stat_poly_eq layers.
Additionally, I simplified your code a bit. It's not necessary to have multiple mapply calls; one is sufficient. You can return multiple layers by wrapping them inside a list.
library(ggplot2)
library(ggpmisc)
ggplot(df, aes(x = x, y = val.test)) +
geom_point() +
mapply(function(x, z) {
data <- subset(df, var.test == z)
list(
geom_smooth(
method = "glm", data = data, formula = x,
method.args = list(family = "poisson"), color = "black"
),
stat_poly_eq(formula = x, aes(label = ..eq.label..),
parse = TRUE, size = 2.5, col = "black", data = data, vjust = -0.1),
stat_poly_eq(formula = x, aes(label = ..rr.label..),
parse = TRUE, size = 2.5, col = "black", data = data, vjust = 1.1)
)
}, my.formula, c("A", "M", "T")) +
facet_grid(. ~ var.test)
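The "wrap them inside a list" trick relies on ggplot2 accepting a list of layers on the right-hand side of +. A minimal sketch on the built-in mtcars data (the name point_and_line is made up for this example):

```r
library(ggplot2)

# A bundle of layers; `+` adds each element of the list in turn
point_and_line <- list(
  geom_point(),
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + point_and_line

# Number of layers now attached to the plot
length(p$layers)
```

This is why a function called from mapply can return list(geom_smooth(...), stat_poly_eq(...)) and still be added to the plot in one go.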
I want to fit three different functions, one for each level of a factor (var.test). I tried the following method, but I get a warning that reads "Computation failed in stat_smooth(): invalid formula". Is there another way to pass multiple formulae at once?
set.seed(14)
df <- data.frame(
var.test = c("T","T","T","T","M","M","M","M","A","A","A","A"),
val.test = rnorm(12,4,5),
x = c(1:12)
)
my.formula <- c(y~x + I(x^2), y~x, y~x + I(x^2))
ggplot(df, aes(x = x, y = val.test)) + geom_point() +
geom_smooth(method="glm", formula = my.formula,
method.args = list(family = "poisson"), color = "black" ) + facet_grid(.~var.test)
You can only have one formula per geom_smooth(), so you'll need to add three different geom_smooth layers. You can do that manually:
ggplot(df, aes(x = x, y = val.test)) +
geom_point() +
geom_smooth(method="glm", formula = my.formula[[1]], method.args = list(family = "poisson"), color = "black" ) +
geom_smooth(method="glm", formula = my.formula[[2]], method.args = list(family = "poisson"), color = "black" ) +
geom_smooth(method="glm", formula = my.formula[[3]], method.args = list(family = "poisson"), color = "black" ) +
facet_grid(.~var.test)
Or you can use lapply to help:
ggplot(df, aes(x = x, y = val.test)) +
geom_point() +
lapply(my.formula, function(x) geom_smooth(method="glm", formula = x,
method.args = list(family = "poisson"), color = "black" )) +
facet_grid(.~var.test)
If you want a different line in each panel, you can filter the data for each layer. Here we use mapply as a helper and subset the data for each line.
ggplot(df, aes(x = x, y = val.test)) +
geom_point() +
mapply(function(x, z) geom_smooth(method="glm", data=function(d) subset(d, var.test==z), formula = x,
method.args = list(family = "poisson"), color = "black" ),
my.formula, c("A","M","T")) +
facet_grid(.~var.test)
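The data = function(d) ... argument works because a layer's data can be a function, which ggplot2 applies to the plot's default data. The sketch below illustrates this on iris (an assumption for the example, not the data from the question) and counts the rows each point layer receives:

```r
library(ggplot2)

setosa_only <- function(d) subset(d, Species == "setosa")

p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(colour = "grey70") +                      # all rows
  geom_point(data = setosa_only, colour = "black") +   # filtered rows only
  geom_smooth(data = setosa_only, method = "lm",
              formula = y ~ x, se = FALSE, colour = "black")

n_all <- nrow(layer_data(p, 1))  # rows seen by the first point layer
n_sub <- nrow(layer_data(p, 2))  # rows seen by the filtered point layer
c(all = n_all, subset = n_sub)
```

The smooth line is then fitted to the filtered rows only, which is what gives each panel its own formula in the mapply version.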
I would like to add the regression line and R^2 to my ggplot. I am fitting the regression line to different categories, and for each category I get a unique equation. I'd like to set the position of the equation for each category manually, i.e. find the maximum of y for each group and print the equation at ymax + 1.
Here is my code:
library(ggpmisc)
library(dplyr)  # %>%, group_by(), mutate(), do()
library(broom)  # tidy()
df <- data.frame(x = c(1:100))
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))
df <- df %>% group_by(group) %>% mutate(ymax = max(y))
my.formula <- y ~ x
df %>%
group_by(group) %>%
do(tidy(lm(y ~ x, data = .)))
p <- ggplot(data = df, aes(x = x, y = y, colour = group)) +
geom_smooth(method = "lm", se=FALSE, formula = my.formula) +
stat_poly_eq(formula = my.formula,
aes(x = x , y = ymax + 1, label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
parse = TRUE) +
geom_point()
p
Any suggestions on how to do this?
Also, is there any way to print only the slope of the equation (i.e. remove the intercept from the plot)?
Thanks,
I'm pretty sure that adjusting stat_poly_eq() with the geom argument will get you what you want. Doing so centers the equations, which leaves the left half of each one clipped, so we use hjust = 0 to left-align them. Finally, depending on your specific data, the equations may overlap each other, so we use the position argument to have ggplot attempt to separate them.
This adjusted call should get you started, I hope:
p <- ggplot(data = df, aes(x = x, y = y, colour = group)) +
geom_smooth(method = "lm", se=FALSE, formula = my.formula) +
stat_poly_eq(
formula = my.formula,
geom = "text", # or 'label'
hjust = 0, # left-adjust equations
position = position_dodge(), # in case equations now overlap
aes(x = x , y = ymax + 1, label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
parse = TRUE) +
geom_point()
p
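For the slope-only part of the question, one possible workaround (a sketch that swaps stat_poly_eq for a hand-built label table and plain geom_text, not a ggpmisc feature) is to fit each group yourself and place the text at max(y) + 1. The set.seed() call and the names labels and p_slope are assumptions added for this example.

```r
library(ggplot2)

set.seed(1)  # assumed seed so the example is reproducible
df <- data.frame(x = 1:100)
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))

# One row per group: label position (left edge, ymax + 1) and slope text
labels <- do.call(rbind, lapply(split(df, df$group), function(d) {
  fit <- lm(y ~ x, data = d)
  data.frame(
    group = d$group[1],
    x = min(d$x),
    y = max(d$y) + 1,
    label = sprintf("slope = %.2f", coef(fit)[["x"]])
  )
}))

p_slope <- ggplot(df, aes(x = x, y = y, colour = group)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  geom_text(data = labels, aes(label = label), hjust = 0, show.legend = FALSE)
p_slope
```

Because the positions live in an ordinary data frame, they can be edited by hand if two labels end up too close together.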
I am drawing several plots in my Shiny app.
Using geom_smooth(), I fit a smoothing curve to a scatterplot.
I build the plots with ggplot() and render them with ggplotly().
Is there any way I can exclude a particular data profile from geom_smooth()?
For example:
As the fit shows, that profile distorts the curve, which is not desirable. I have tried plotly_click(), plotly_brush() and plotly_select(), but I don't want the user to have to intervene when plotting this fit, because that makes the process much slower and less accurate.
Here is my code to plot this:
#plot
g <- ggplot(data = d_f4, aes_string(x = d_f4$x, y = d_f4$y)) + theme_bw() +
geom_point(colour = "blue", size = 0.1)+
geom_smooth(formula = y ~ splines::bs(x, df = 10), method = "lm", color = "green3", level = 1, size = 1)
Unfortunately, I cannot include my dataset in the question because it is quite big.
You can make an extra data.frame without the "outliers" and use this as the input for geom_smooth:
set.seed(8)
test_data <- data.frame(x = 1:100)
test_data$y <- sin(test_data$x / 10) + rnorm(100, sd = 0.1)
test_data[60:65, "y"] <- test_data[60:65, "y"] + 1
data_plot <- test_data[-c(60:65), ]
library(ggplot2)
ggplot(data = test_data, aes(x = x, y = y)) + theme_bw() +
geom_point(colour = "blue", size = 0.1) +
geom_smooth(formula = y ~ splines::bs(x, df = 10), method = "lm", color = "green3", level = 1, size = 1)
ggplot(data = test_data, aes(x = x, y = y)) + theme_bw() +
geom_point(colour = "blue", size = 0.1) +
geom_smooth(data = data_plot, formula = y ~ splines::bs(x, df = 10), method = "lm", color = "green3", level = 1, size = 1)
Created on 2020-11-27 by the reprex package (v0.3.0)
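If the rows to drop are not known in advance, they could be flagged automatically instead of by index. The sketch below is only one possible heuristic and an assumption on my part: fit the spline once, then drop points whose residual is more than 3 median absolute deviations from the median residual. The threshold of 3 and the names first_fit, keep and data_auto are arbitrary choices for illustration.

```r
set.seed(8)
test_data <- data.frame(x = 1:100)
test_data$y <- sin(test_data$x / 10) + rnorm(100, sd = 0.1)
test_data[60:65, "y"] <- test_data[60:65, "y"] + 1

# First pass: fit the same spline model to all points
first_fit <- lm(y ~ splines::bs(x, df = 10), data = test_data)
res <- residuals(first_fit)

# Keep points whose residual is within 3 MADs of the median residual
keep <- abs(res - median(res)) <= 3 * mad(res)
data_auto <- test_data[keep, ]

# Rows that were dropped by the heuristic
test_data$x[!keep]
```

data_auto can then be passed as data = data_auto to geom_smooth() in the same way as data_plot above. Whether this catches the right points depends on the data, so it is worth checking the dropped rows before trusting the fit.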
BTW: you don't need aes_string (which is deprecated) or d_f4$x; since the data frame is already passed via data, you can just use aes(x = x, y = y).
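If the column names really do arrive as strings (for example from a Shiny input), the .data pronoun inside aes() is one way to avoid aes_string. The variables x_col and y_col below are hypothetical stand-ins for such inputs, and iris stands in for the real data.

```r
library(ggplot2)

x_col <- "Sepal.Length"  # hypothetical, e.g. a value from input$...
y_col <- "Petal.Length"

p <- ggplot(iris, aes(x = .data[[x_col]], y = .data[[y_col]])) +
  theme_bw() +
  geom_point(colour = "blue", size = 0.1)

# Number of points that reach the layer
n_points <- nrow(layer_data(p, 1))
n_points
```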