My data looks like this:
Date      Speed
1/2019    4500
2/2019    3400
3/2019    5300
4/2019    2000
Date is my independent variable and Speed is my dependent variable.
I'm trying to plot a trend line with its regression equation to see whether the trend is increasing or decreasing.
I tried the code below, but it did not show the equation on the graph.
ggscatter(a, x = "Date", y = "Speed", add = "reg.line") +
stat_cor(label.x = 03/2019, label.y = 3700) +
stat_regline_equation(label.x = 03/2019, label.y = 3600)
#> `geom_smooth()` using formula 'y ~ x'
Example output that I want (the Correlation Equation and the Regression Equation)
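A side note on the label positions in the code above: R evaluates 03/2019 as the division 3 / 2019, not as a date, so label.x ends up close to zero. The base R sketch below illustrates this and shows one way to turn a "month/year" string into a Date. The names label_x and march are made up for illustration, and it is only an assumption that this contributes to the missing labels.

```r
# 03/2019 is plain arithmetic: 3 divided by 2019
label_x <- 03/2019
print(label_x < 0.01)   # a tiny number, not March 2019

# One way to get a real date: prepend a day and parse with an explicit format
march <- as.Date(paste0("1/", "3/2019"), format = "%d/%m/%Y")
print(format(march, "%Y-%m"))
```

Once the Date column is a real Date, positions on the x axis can be given as dates too, rather than as bare numbers.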
Here's the method I usually use:
library(tidyverse)
library(lubridate)
library(ggpmisc)
df <- tibble::tribble(
~Date, ~Speed,
"1/2019", 4500L,
"2/2019", 3400L,
"3/2019", 5300L,
"4/2019", 2000L
)
df$Date <- lubridate::my(df$Date)
ggplot(df, aes(x = Date, y = Speed)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
stat_poly_eq(formula = y ~ x,  # same formula as geom_smooth(), so the label matches the line
aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
parse = TRUE)
EDIT
With the p-value:
ggplot(df, aes(x = Date, y = Speed)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
stat_poly_eq(formula = y ~ x,
aes(label = paste(..eq.label.., ..rr.label.., ..p.value.label.., sep = "~~~")),
parse = TRUE)
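If the goal is only to decide whether the trend is increasing or decreasing, the sign of the fitted slope can also be read directly from lm(), without a plot. This is a minimal sketch using base R and the four data points from the question. The names days, slope_per_day and trend are hypothetical, and the dates are assumed to fall on the first day of each month.

```r
# The four observations from the question, with the month start as the date
df <- data.frame(
  Date  = as.Date(c("2019-01-01", "2019-02-01", "2019-03-01", "2019-04-01")),
  Speed = c(4500, 3400, 5300, 2000)
)

# Dates are stored as days since 1970-01-01, so the slope is in Speed units per day
df$days <- as.numeric(df$Date)
fit <- lm(Speed ~ days, data = df)

slope_per_day <- unname(coef(fit)["days"])
trend <- if (slope_per_day > 0) "increasing" else "decreasing"
print(trend)
```

With only four points the fit is very uncertain, so the p-value shown on the plot is worth checking before reading much into the sign.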
Related
The scatterplot is colour-coded by factor z. By default, ggplot2 also plots a separate regression line for each level of the factor. I want to plot a single regression line through all the data. How do I achieve this?
x <- c(1:50)
y <- rnorm(50,4,1)
z <- rep(c("P1", "P2"), each = 25)
df <- data.frame(x,y,z)
my.formula = y ~ x
ggplot(aes(x = x, y = y, color = z), data = df) +
geom_point() + scale_fill_manual(values=c("purple", "blue")) +
geom_smooth(method="lm", formula = y ~ x ) +
stat_poly_eq(formula = my.formula, aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), parse = TRUE, size = 2.5, col = "black")+
theme_classic()
If I understand you correctly, you can set group = 1 in aes() to plot just one regression line. You can use the following code:
library(tidyverse)
library(ggpmisc)
my.formula = y ~ x
ggplot(aes(x = x, y = y, color = z, group = 1), data = df) +
geom_point() + scale_fill_manual(values=c("purple", "blue")) +
geom_smooth(method="lm", formula = y ~ x ) +
stat_poly_eq(formula = my.formula, aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), parse = TRUE, size = 2.5, col = "black")+
theme_classic()
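To see numerically what group = 1 changes, it may help to compare a pooled fit with the per-group fits. The sketch below uses only base R on data shaped like the question's. The seed and the names pooled and by_group are my own additions for illustration.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible
df <- data.frame(
  x = 1:50,
  y = rnorm(50, 4, 1),
  z = rep(c("P1", "P2"), each = 25)
)

# One regression through all points: what a single group (group = 1) fits
pooled <- coef(lm(y ~ x, data = df))

# One regression per level of z: what ggplot2 fits when colour = z defines the groups
by_group <- lapply(split(df, df$z), function(d) coef(lm(y ~ x, data = d)))

print(pooled)
print(by_group)
```

The pooled coefficients correspond to the single line and single equation, while the per-group coefficients correspond to the two lines drawn by default.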
I would like to add the regression line and R^2 to my ggplot. I am fitting a regression line to each category, so each category gets its own equation. I'd like to set the position of each equation manually, i.e. find the maximum value of y for each group and print the equation at ymax + 1.
Here is my code:
library(dplyr)    # %>%, group_by(), mutate()
library(broom)    # tidy()
library(ggpmisc)
df <- data.frame(x = c(1:100))
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))
df <- df %>% group_by(group) %>% mutate(ymax = max(y))
my.formula <- y ~ x
df %>%
group_by(group) %>%
do(tidy(lm(y ~ x, data = .)))
p <- ggplot(data = df, aes(x = x, y = y, colour = group)) +
geom_smooth(method = "lm", se=FALSE, formula = my.formula) +
stat_poly_eq(formula = my.formula,
aes(x = x , y = ymax + 1, label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
parse = TRUE) +
geom_point()
p
Any suggestions on how to do this?
Also, is there a way to print only the slope of the equation (i.e. remove the intercept from the plot)?
Thanks,
I'm pretty sure that adjusting stat_poly_eq() with the geom argument will get you what you want. Doing so centers the equations, leaving the left half of each clipped, so we use hjust = 0 to left-align them. Finally, depending on your specific data, the equations may overlap each other, so we use the position argument to have ggplot attempt to separate them.
This adjusted call should get you started, I hope:
p <- ggplot(data = df, aes(x = x, y = y, colour = group)) +
geom_smooth(method = "lm", se=FALSE, formula = my.formula) +
stat_poly_eq(
formula = my.formula,
geom = "text", # or 'label'
hjust = 0, # left-adjust equations
position = position_dodge(), # in case equations now overlap
aes(x = x , y = ymax + 1, label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
parse = TRUE) +
geom_point()
p
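The per-group label positions asked for in the question (ymax + 1) can also be computed up front and inspected before plotting. This is a minimal base R sketch on data generated as in the question. The seed and the data frame name pos are assumptions for illustration.

```r
set.seed(42)  # arbitrary seed for reproducibility
df <- data.frame(x = 1:100)
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))

# One row per group holding the largest y in that group
pos <- aggregate(y ~ group, data = df, FUN = max)

# Place each label one unit above its group's maximum
pos$y <- pos$y + 1
print(pos)
```

A table like pos could then be supplied as the data of a separate text layer, if fully manual placement is preferred over the automatic positioning.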
I am adding the regression line equation to my ggplot. However, I would like to remove the intercept from the plot and keep only the slope and R^2.
Here is the code I am using to generate the plot and equation. Do you have any idea how I can remove the intercept?
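library(dplyr)    # %>%, group_by(), mutate()
library(broom)    # tidy()
library(ggpubr)   # stat_regline_equation()
library(ggpmisc)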
df <- data.frame(x = c(1:100))
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))
df <- df %>% group_by(group) %>% mutate(ymax = max(y))
df %>%
group_by(group) %>%
do(tidy(lm(y ~ x, data = .)))
p <- ggplot(data = df, aes(x = x, y = y, colour = group)) +
geom_smooth(method = "lm", se=FALSE, formula = y ~ x) +
stat_regline_equation(
aes( x = x, y = y , label = paste(..eq.label..,..rr.label.., sep = "~~~~")),
formula = y ~ x, size = 3
)
p
Thanks,
You can use stat_fit_tidy from the ggpmisc package:
df <- data.frame(x = c(1:100))
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))
library(ggpmisc)
my_formula <- y ~ x
ggplot(df, aes(x = x, y = y, colour = group)) +
geom_point() +
geom_smooth(method = "lm", formula = my_formula, se = FALSE) +
stat_fit_tidy(
method = "lm",
method.args = list(formula = my_formula),
mapping = aes(label = sprintf('slope~"="~%.3g',
after_stat(x_estimate))),
parse = TRUE)
EDIT
If you want the R squared as well:
ggplot(df, aes(x = x, y = y, colour = group)) +
geom_point() +
geom_smooth(method = "lm", formula = my_formula, se = FALSE) +
stat_fit_tidy(
method = "lm",
method.args = list(formula = my_formula),
mapping = aes(label = sprintf('slope~"="~%.3g',
after_stat(x_estimate))),
parse = TRUE) +
stat_poly_eq(formula = my_formula,
aes(label = ..rr.label..),
parse = TRUE,
label.x = 0.6)
EDIT
Another way:
myformat <- "Slope: %s --- R²: %s"
ggplot(df, aes(x, y, colour = group)) +
geom_point() +
geom_smooth(method = "lm", formula = my_formula, se = FALSE) +
stat_poly_eq(
formula = my_formula, output.type = "numeric",
mapping = aes(label =
sprintf(myformat,
formatC(stat(coef.ls)[[1]][[2, "Estimate"]]),
formatC(stat(r.squared)))),
vstep = 0.1
)
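As a cross-check on the values these layers print, the slope and R² per group can be computed directly with lm(). The sketch below uses base R only and reuses the label format from the last example. The seed and the name stats are hypothetical.

```r
set.seed(42)  # arbitrary seed for reproducibility
df <- data.frame(x = 1:100)
df$y <- 20 * c(0, 1) + 3 * df$x + rnorm(100, sd = 40)
df$group <- factor(rep(c("A", "B"), 50))

# Fit y ~ x separately within each group and keep only slope and R-squared
stats <- do.call(rbind, lapply(split(df, df$group), function(d) {
  fit <- lm(y ~ x, data = d)
  data.frame(
    group     = d$group[1],
    slope     = coef(fit)[["x"]],        # coefficient of x, i.e. the slope
    r.squared = summary(fit)$r.squared
  )
}))

# Same label layout as above: slope and R-squared, no intercept
stats$label <- sprintf("Slope: %s --- R²: %s",
                       formatC(stats$slope), formatC(stats$r.squared))
print(stats)
```

The numbers should agree with the labels on the plot up to rounding, since both come from the same per-group linear fits.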
Based on the example in "Adding Regression Line Equation and R2 on graph", I am trying to include the regression line equation for my model in each facet. However, I can't figure out why doing so changes the limits of my x axis.
library(ggplot2)
library(reshape2)
df <- data.frame(year = seq(1979,2010), M02 = runif(32,-4,6),
M06 = runif(32, -2.4, 5.1), M07 = runif(32, -2, 7.1))
df <- melt(df, id = c("year"))
ggplot(data = df, mapping = aes(x = year, y = value)) +
geom_point() +
scale_x_continuous() +
stat_smooth_func(geom = 'text', method = 'lm', hjust = 0, parse = T) +
geom_smooth(method = 'lm', se = T) +
facet_wrap(~ variable) # as you can see, the scale_x_axis goes back to 1800
If I set the x limits explicitly,
scale_x_continuous(limits = c(1979,2010))
the regression coefficient is no longer shown. What am I doing wrong here?
stat_smooth_func available here: https://gist.github.com/kdauria/524eade46135f6348140
You can use the stat_poly_eq function from the ggpmisc package.
library(reshape2)
library(ggplot2)
library(ggpmisc)
#> For news about 'ggpmisc', please, see https://www.r4photobiology.info/
#> For on-line documentation see https://docs.r4photobiology.info/ggpmisc/
df <- data.frame(year = seq(1979,2010), M02 = runif(32,-4,6),
M06 = runif(32, -2.4, 5.1), M07 = runif(32, -2, 7.1))
df <- melt(df, id = c("year"))
formula1 <- y ~ x
ggplot(data = df, mapping = aes(x = year, y = value)) +
geom_point() +
scale_x_continuous() +
geom_smooth(method = 'lm', se = TRUE) +
stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~")),
label.x = "left", label.y = "top",
formula = formula1, parse = TRUE, size = 3) +
facet_wrap(~ variable)
ggplot(data = df, mapping = aes(x = year, y = value)) +
geom_point() +
scale_x_continuous() +
geom_smooth(method = 'lm', se = TRUE) +
stat_poly_eq(aes(label = paste(..eq.label.., sep = "~~~")),
label.x = "left", label.y = 0.15,
eq.with.lhs = "italic(hat(y))~`=`~",
eq.x.rhs = "~italic(x)",
formula = formula1, parse = TRUE, size = 4) +
stat_poly_eq(aes(label = paste(..rr.label.., sep = "~~~")),
label.x = "left", label.y = "bottom",
formula = formula1, parse = TRUE, size = 4) +
facet_wrap(~ variable)
Created on 2019-01-10 by the reprex package (v0.2.1.9000)
Someone will probably suggest a better solution, but as an alternative you can modify stat_smooth_func so that its final line reads
data.frame(x=1979, y=ypos, label=func_string)
instead of
data.frame(x=xpos, y=ypos, label=func_string)
The plot will then look like the one below.
I'm using the R package ggpmisc. I wonder how to put a hat on y in the regression equation, or how to use custom response and explanatory variable names in the regression equation on the graph.
library(ggplot2)
library(ggpmisc)
df <- data.frame(x1 = c(1:100))
set.seed(12345)
df$y1 <- 2 + 3 * df$x1 + rnorm(100, sd = 40)
p <- ggplot(data = df, aes(x = x1, y = y1)) +
geom_smooth(method = "lm", se=FALSE, color="black", formula = y ~ x) +
stat_poly_eq(formula = y ~ x,
aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")),
parse = TRUE) +
geom_point()
p
I would turn off the default value for y that is pasted in and build your own formula. For example
ggplot(data = df, aes(x = x1, y = y1)) +
geom_smooth(method = "lm", se=FALSE, color="black", formula = y ~ x) +
stat_poly_eq(formula = y ~ x, eq.with.lhs=FALSE,
aes(label = paste("hat(italic(y))","~`=`~",..eq.label..,"~~~", ..rr.label.., sep = "")),
parse = TRUE) +
geom_point()
We use eq.with.lhs=FALSE to turn off the automatic inclusion of y= and then we paste() the hat(y) on to the front (with the equals sign). Note that the formatting comes from the ?plotmath help page.
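Because parse = TRUE means the label is interpreted as a plotmath expression, it can be useful to check that a hand-built label parses before putting it on a plot. The base R sketch below does that. The right-hand side "2.5 + 3.1~italic(x)" is a made-up stand-in, not the exact text ggpmisc generates.

```r
# Left-hand side as pasted on in the answer: y with a hat, then an equals sign
lhs <- "hat(italic(y))~`=`~"

# Hypothetical right-hand side, standing in for the generated equation
rhs <- "2.5 + 3.1~italic(x)"

label <- paste0(lhs, rhs)

# parse() fails with an error if the string is not valid plotmath/R syntax
expr <- parse(text = label)
print(expr)
```

The same check works for custom variable names, for example by replacing italic(y) and italic(x) with other plotmath terms.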