I'm using ggplot to visualize many linear regressions and facet them by groups. I'd like geom_smooth() to show the trend line as one color if P < 0.05, a different color if P < 0.10, and not show it at all if P ≥ 0.10.
I managed to do this using a loop to extract P-values from lm() for each regression, then join them with the data used for plotting. Then I add another column of color names to pass to aes(), determined conditionally from the P-values, and use scale_color_identity() to achieve my goal.
Here's an example:
library(tidyverse)
#make mtcars a tibble and cyl a factor, for convenience
mtcars1 <- as_tibble(mtcars) %>% dplyr::mutate(cyl = as.factor(cyl))
#initialize a list to store p-values from lm() for each level of factor
p.list <- vector(mode = "list", length = length(levels(mtcars1$cyl)))
names(p.list) <- levels(mtcars1$cyl)
#loop to calculate p-values for each level of mtcars$cyl
for(i in seq_along(levels(mtcars1$cyl))){
mtcars.sub <- mtcars1 %>% dplyr::filter(cyl == levels(.$cyl)[i])
lm.pval <- mtcars.sub %>%
dplyr::distinct(cyl) %>%
dplyr::mutate(P =
summary(lm(mpg ~ disp, data = mtcars.sub))$coefficients[2,4] ##extract P-value
)
p.list[[i]] <- lm.pval
}
#join p-values to dataset and add column to use with scale_color_identity()
mtcars.p <- mtcars1 %>% dplyr::left_join(dplyr::bind_rows(p.list, .id = "cyl"), by = "cyl") %>%
dplyr::mutate(p.color = ifelse(P < 0.05, "black",
ifelse(P < 0.10, "lightblue", NA)))
#plot
ggplot(data = mtcars.p, aes(x = disp, y = mpg)) +
geom_smooth(method = "lm",
se = FALSE,
aes(color = p.color)) +
geom_point() +
scale_color_identity(name = NULL,
na.translate = FALSE,
labels = c("P < 0.05", "P < 0.10"),
guide = "legend") +
facet_wrap(~cyl, scales = "free")
This seems like too many initial steps for something that should be relatively easy. Are these steps necessary, or is there a more efficient way of doing this? Can ggplot or any other packages out there do this on their own, without having to first extract p-values from lm()?
After fitting the regression, you can draw its line within ggplot using geom_abline(). Note that myline is fitted to all of mtcars, so the same line is drawn in every facet:
myline<-lm(mpg ~ disp, data = mtcars)
ggplot(data = mtcars, aes(x = disp, y = mpg)) +
geom_abline(slope = coef(myline)[[2]], intercept = coef(myline)[[1]], color='blue')+
geom_point(color='red') +
scale_color_identity(name = NULL,
na.translate = FALSE,
labels = c("P < 0.05", "P < 0.10"),
guide = "legend") +
facet_wrap(~cyl, scales = "free")
Alternatively, geom_smooth() can fit and draw the line itself. It takes no slope or intercept arguments, and with facet_wrap() it fits a separate line in each panel:
geom_smooth(method = 'lm', se = FALSE, color = 'blue') +
We can simplify the steps with a grouped operation. Instead of extracting each component by hand, tidy() from broom returns the model output as a tibble:
library(broom)
library(dplyr)
library(tidyr)
mtcars1 %>%
group_by(cyl) %>%
summarise(out = list(tidy(lm(mpg ~ disp, data = cur_data())))) %>%
unnest(out)
Output:
# A tibble: 6 x 6
cyl term estimate std.error statistic p.value
<fct> <chr> <dbl> <dbl> <dbl> <dbl>
1 4 (Intercept) 40.9 3.59 11.4 0.00000120
2 4 disp -0.135 0.0332 -4.07 0.00278
3 6 (Intercept) 19.1 2.91 6.55 0.00124
4 6 disp 0.00361 0.0156 0.232 0.826
5 8 (Intercept) 22.0 3.35 6.59 0.0000259
6 8 disp -0.0196 0.00932 -2.11 0.0568
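To connect this back to the plot in the question, here is a minimal sketch of how the tidied p-values could drive the line colors. It uses group_modify() in place of summarise() plus unnest(), which is my own substitution, and it filters the data passed to geom_smooth() so that facets with P ≥ 0.10 get no line. The object names pvals and mtcars.p are just illustrative.

```r
library(dplyr)
library(broom)
library(ggplot2)

mtcars1 <- as_tibble(mtcars) %>% mutate(cyl = as.factor(cyl))

# One row per cyl level: p-value of the disp slope, mapped to a color name.
# case_when() returns NA when no condition matches (here: P >= 0.10).
pvals <- mtcars1 %>%
  group_by(cyl) %>%
  group_modify(~ tidy(lm(mpg ~ disp, data = .x))) %>%
  ungroup() %>%
  filter(term == "disp") %>%
  transmute(cyl,
            p.color = case_when(p.value < 0.05 ~ "black",
                                p.value < 0.10 ~ "lightblue"))

mtcars.p <- left_join(mtcars1, pvals, by = "cyl")

p <- ggplot(mtcars.p, aes(x = disp, y = mpg)) +
  # only groups with a color get a trend line
  geom_smooth(data = filter(mtcars.p, !is.na(p.color)),
              aes(color = p.color),
              method = "lm", formula = y ~ x, se = FALSE) +
  geom_point() +
  scale_color_identity(name = NULL, guide = "legend",
                       breaks = c("black", "lightblue"),
                       labels = c("P < 0.05", "P < 0.10")) +
  facet_wrap(~cyl, scales = "free")
p
```

The p-values still have to be computed outside ggplot; as far as I know geom_smooth() has no built-in way to condition on them, so this only shortens the extraction step.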
sample_data = read.table("http://freakonometrics.free.fr/db.txt",
header=TRUE, sep=";")
head(sample_data)
model = glm(Y~0+X1+X2+X3,family=binomial,data=sample_data)
summary(model)
exp(coef(model ))
exp(cbind(OR = coef(model ), confint(model )))
The code above fits a logistic regression with a categorical predictor to the sample data.
When I run it, I get the following output:
OR 2.5 % 97.5 %
X1 1.67639337 1.352583976 2.09856514
X2 1.23377720 1.071959330 1.42496949
X3A 0.01157565 0.001429430 0.08726854
X3B 0.06627849 0.008011818 0.54419759
X3C 0.01118084 0.001339984 0.08721028
X3D 0.01254032 0.001545240 0.09539880
X3E 0.10654454 0.013141540 0.87369972
but I am wondering how to extract the OR and CI for the factor levels only. My
desired output is:
OR 2.5 % 97.5 %
X3A 0.01157565 0.001429430 0.08726854
X3B 0.06627849 0.008011818 0.54419759
X3C 0.01118084 0.001339984 0.08721028
X3D 0.01254032 0.001545240 0.09539880
X3E 0.10654454 0.013141540 0.87369972
Can anyone help me with the code to extract it?
Additionally, I want to plot the extracted ORs with their confidence
intervals.
Can you also help me with the code for a plot, such as a box plot?
You could filter out the rows that are the same as variable names in your data frame, since those row names with factor levels appended will not match:
result <- exp(cbind(OR = coef(model ), confint(model )))
result[!rownames(result) %in% names(sample_data),]
#> OR 2.5 % 97.5 %
#> X3A 0.01157565 0.001429430 0.08726854
#> X3B 0.06627849 0.008011818 0.54419759
#> X3C 0.01118084 0.001339984 0.08721028
#> X3D 0.01254032 0.001545240 0.09539880
#> X3E 0.10654454 0.013141540 0.87369972
To extract the necessary rows and plot them, the full reproducible code would be:
library(tidyverse)
sample_data <- read.table("http://freakonometrics.free.fr/db.txt",
header = TRUE, sep = ";")
model <- glm(Y ~ 0 + X1 + X2 + X3,family = binomial, data = sample_data)
result <- exp(cbind(OR = coef(model), confint(model)))
#> Waiting for profiling to be done...
result %>%
as.data.frame(check.names = FALSE) %>%
rownames_to_column(var = "Variable") %>%
filter(!Variable %in% names(sample_data)) %>%
ggplot(aes(x = OR, y = Variable)) +
geom_vline(xintercept = 1, linetype = 2) +
geom_errorbarh(aes(xmin = `2.5 %`, xmax = `97.5 %`), height = 0.1) +
geom_point(size = 2) +
scale_x_log10(name = "Odds ratio (log scale)") +
theme_minimal(base_size = 16)
Created on 2022-06-14 by the reprex package (v2.0.1)
One possibility, using broom to extract the coefficients, dplyr::filter to select the terms you want, and dwplot to plot.
library(broom)
library(dotwhisker)
library(dplyr)
tt <- (tidy(model, exponentiate = TRUE, conf.int = TRUE)
|> filter(stringr::str_detect(term, "^X3"))
)
dwplot(tt)
In addition, I would suggest:
library(ggplot2)
dwplot(tt) + scale_x_log10() + geom_vline(xintercept = 1, lty = 2) +
labs(x="Odds ratio")
To extract all but the first 2 rows, use a negative index on the rows.
I will also coerce to data.frame and add an id, it will be needed to plot the confidence intervals.
ORCI <- exp(cbind(OR = coef(model), confint(model)))[-(1:2), ]
ORCI <- cbind.data.frame(ORCI, id = row.names(ORCI))
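The plotting step is not shown above, so here is one way it might look. To keep the sketch runnable without downloading the data, it fits the same model formula to a simulated stand-in (sim is hypothetical, not the poster's sample_data); the two ORCI lines are the ones from this answer.

```r
library(ggplot2)

# Simulated stand-in for sample_data (an assumption, for illustration only)
set.seed(42)
n <- 500
sim <- data.frame(
  X1 = rnorm(n),
  X2 = rnorm(n),
  X3 = factor(sample(LETTERS[1:5], n, replace = TRUE))
)
sim$Y <- rbinom(n, 1, plogis(0.5 * sim$X1 + 0.2 * sim$X2 - 0.5))
model <- glm(Y ~ 0 + X1 + X2 + X3, family = binomial, data = sim)

# Drop the first two rows (X1, X2) and keep the factor levels
ORCI <- exp(cbind(OR = coef(model), confint(model)))[-(1:2), ]
ORCI <- cbind.data.frame(ORCI, id = row.names(ORCI))

# Point estimate with its confidence interval for each factor level
p <- ggplot(ORCI, aes(x = id, y = OR)) +
  geom_hline(yintercept = 1, linetype = "dashed") +
  geom_pointrange(aes(ymin = `2.5 %`, ymax = `97.5 %`)) +
  scale_y_log10() +
  labs(x = NULL, y = "Odds ratio (log scale)")
p
```

The id column is what lets ggplot put one interval per factor level on the x axis. Note that the negative index relies on X1 and X2 being the first two coefficients.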
Hi everyone. Based on the Wage dataset (with wage as the dependent variable) and the workflow below, I would like to find out the following:
What is the predicted wage of a person with age equal to 30 for each piecewise model?
Considering the flexible pw6_wf_fit model configuration and in particular the six breakpoints above: Exceeding which (approximate) value of age correlates strongest with wage?
I tried versions of extract, but so far I don't know how to apply them in R. I would appreciate any comments.
The code I use is the following:
if (!require("pacman")) install.packages("pacman")
# load (or install if pacman cannot find an existing installation) the relevant packages
pacman::p_load(
tidyverse, tidymodels, ISLR, patchwork,
rpart, rpart.plot, randomForest, gbm, kernlab, parsnip, skimr
)
data(Wage, package = "ISLR")
Wage %>%
tibble::as_tibble() %>%
skimr::skim()
lin_rec <- recipe(wage ~ age, data = Wage)
# Specify as linear regression
lm_spec <-
linear_reg() %>%
set_mode("regression") %>%
set_engine("lm")
plot_model <- function(wf_fit, data) {
predictions <-
tibble::tibble(age = seq(min(data$age), max(data$age))) %>%
dplyr::bind_cols(
predict(wf_fit, new_data = .),
predict(wf_fit, new_data = ., type = "conf_int")
)
p <- ggplot2::ggplot(aes(age, wage), data = data) +
geom_point(alpha = 0.05) +
geom_line(aes(y = .pred),
data = predictions, color = "darkgreen") +
geom_line(aes(y = .pred_lower),
data = predictions, linetype = "dashed", color = "blue") +
geom_line(aes(y = .pred_upper),
data = predictions, linetype = "dashed", color = "blue") +
scale_x_continuous(breaks = seq(20, 80, 5)) +
labs(title = substitute(wf_fit)) +
theme_classic()
return(p)
}
pw3_rec <- lin_rec %>% step_discretize(age, num_breaks = 3, min_unique = 5)
pw4_rec <- lin_rec %>% step_discretize(age, num_breaks = 4, min_unique = 5)
pw5_rec <- lin_rec %>% step_discretize(age, num_breaks = 5, min_unique = 5)
pw6_rec <- lin_rec %>% step_discretize(age, num_breaks = 6, min_unique = 5)
pw3_wf_fit <- workflow(pw3_rec, lm_spec) %>% fit(data = Wage)
pw4_wf_fit <- workflow(pw4_rec, lm_spec) %>% fit(data = Wage)
pw5_wf_fit <- workflow(pw5_rec, lm_spec) %>% fit(data = Wage)
pw6_wf_fit <- workflow(pw6_rec, lm_spec) %>% fit(data = Wage)
(plot_model(pw3_wf_fit, Wage) + plot_model(pw4_wf_fit, Wage)) /
(plot_model(pw5_wf_fit, Wage) + plot_model(pw6_wf_fit, Wage))
The answer to the first question is pretty straightforward:
map(list(pw3_wf_fit, pw4_wf_fit, pw5_wf_fit, pw6_wf_fit),
~predict(.x, new_data=tibble(age=30))) %>%
bind_rows()
# # A tibble: 4 × 1
# .pred
# <dbl>
# 1 99.3
# 2 94.2
# 3 92.3
# 4 89.5
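The second question is harder to pin down, but a reasonable first step is to look at the break points and the fitted coefficients of the six-bin model. The sketch below assumes extract_recipe() and extract_fit_parsnip() from workflows, plus the tidy() methods for a recipe step and a parsnip fit; it refits pw6_wf_fit so that it runs on its own.

```r
library(tidymodels)

data(Wage, package = "ISLR")

lm_spec <- linear_reg() %>%
  set_mode("regression") %>%
  set_engine("lm")

pw6_rec <- recipe(wage ~ age, data = Wage) %>%
  step_discretize(age, num_breaks = 6, min_unique = 5)

pw6_wf_fit <- workflow(pw6_rec, lm_spec) %>% fit(data = Wage)

# Break points chosen by step_discretize() (step number 1 in the recipe)
breaks <- pw6_wf_fit %>%
  extract_recipe() %>%
  tidy(number = 1)

# Coefficients of the underlying lm: one intercept plus one term per
# additional age bin, each measured relative to the first bin
coefs <- pw6_wf_fit %>%
  extract_fit_parsnip() %>%
  tidy()

breaks
coefs
```

Comparing the estimates of neighbouring bins shows at which break point the predicted wage changes the most, which may be what the question is after.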
I compare categorical data from three different groups.
I wonder if it is possible to easily add p-values of chi-squared tests to facet ggplots (since I am analyzing a big data set). I just read that there is a marvelous way to do so when comparing means https://www.r-bloggers.com/add-p-values-and-significance-levels-to-ggplots/. However, I could not find a solution for other tests (like the chisq.test in my case).
d.test <- data.frame(
results = sample(c("A","B","C"), 30, replace =TRUE),
test = sample(c("test1", "test2","test3"), 30, replace = TRUE)
)
chisq.test(d.test$results,d.test$test)
ggplot(d.test, aes(results) ) +
geom_bar() + facet_grid(test ~ .)
Many thanks for your help! ;D
Store your p-value in a variable
pval <- chisq.test(d.test$results,d.test$test)$p.value
Use annotate to plot text manually
ggplot(d.test, aes(results) ) +
geom_bar() + facet_grid(test ~ .) +
annotate("text", x=1, y=5, label=pval)
Change its positioning with x and y
ggplot(d.test, aes(results) ) +
geom_bar() + facet_grid(test ~ .) +
annotate("text", x=2, y=3, label=pval)
Change significant digits displayed with signif
ggplot(d.test, aes(results) ) +
geom_bar() + facet_grid(test ~ .) +
annotate("text", x=1, y=5, label=signif(pval,4))
Add a "p-value: " label with paste0
ggplot(d.test, aes(results) ) +
geom_bar() + facet_grid(test ~ .) +
annotate("text", x=1, y=5, label=paste0("p-value: ", signif(pval,4)))
broom has methods to create tidy dataframes of most statistical test outputs. Then you can use that output as a data = argument within geom_text.
Generate data
library(broom)
library(dplyr)
library(ggplot2)
fakedata <-
data.frame(groups = sample(c("pop1", "pop2", "pop3", "pop4"), 120, replace = T),
results = sample(c("A","B","C"), 120, replace = TRUE),
test = sample(c("test1", "test2","test3"), 120, replace = TRUE))
Conduct and tidy tests
fakedata.test <-
fakedata %>%
group_by(groups) %>%
do(fit = chisq.test(.$results, .$test)) %>%
tidy(fit)
# A tibble: 4 x 5
# Groups: groups [4]
groups statistic p.value parameter method
<fctr> <dbl> <dbl> <int> <fctr>
1 pop1 3.714286 0.44605156 4 Pearson's Chi-squared test
2 pop2 2.321429 0.67687042 4 Pearson's Chi-squared test
3 pop3 2.294897 0.68169829 4 Pearson's Chi-squared test
4 pop4 10.949116 0.02714188 4 Pearson's Chi-squared test
Visualize
fakedata %>%
ggplot(aes(results, test)) +
geom_jitter(width = 0.2, height = 0.2, shape = 1, size = 2) +
geom_text(data = fakedata.test,
aes(3, 3.5,
label = paste0("χ²(", parameter, ")=", round(statistic, 2), "; p=", round(p.value, 2))),
hjust = 1) +
facet_wrap(~groups)
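With recent broom versions the do(fit = ...) %>% tidy(fit) pattern may no longer run, since the tidiers for rowwise data frames were removed. A sketch of the same steps with group_modify(), which is my substitution and not part of the original answer:

```r
library(dplyr)
library(broom)
library(ggplot2)

set.seed(123)
fakedata <- data.frame(
  groups  = sample(c("pop1", "pop2", "pop3", "pop4"), 120, replace = TRUE),
  results = sample(c("A", "B", "C"), 120, replace = TRUE),
  test    = sample(c("test1", "test2", "test3"), 120, replace = TRUE)
)

# One chi-squared test per group, tidied into one row each.
# chisq.test() may warn about small expected counts with data this small.
fakedata.test <- fakedata %>%
  group_by(groups) %>%
  group_modify(~ tidy(chisq.test(.x$results, .x$test))) %>%
  ungroup()

# Same plot as above; geom_text() picks up one label per facet
p <- ggplot(fakedata, aes(results, test)) +
  geom_jitter(width = 0.2, height = 0.2, shape = 1, size = 2) +
  geom_text(data = fakedata.test,
            aes(3, 3.5, label = sprintf("p = %.2f", p.value)),
            hjust = 1) +
  facet_wrap(~groups)
```

Because fakedata.test contains the faceting variable groups, each p-value lands in its own panel.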
Suppose I have 2 data frames, one for 2015 and one for 2016. I want to run a regression for each data frame and plot one of the coefficient for each regression with their respective confidence interval. For example:
set.seed(1020022316)
library(dplyr)
library(stargazer)
df16 <- data.frame(
x1 = rnorm(1000, 0, 2),
t = sample(c(0, 1), 1000, T),
e = rnorm(1000, 0, 10)
) %>% mutate(y = 0.5 * x1 + 2 * t + e) %>%
select(-e)
df15 <- data.frame(
x1 = rnorm(1000, 0, 2),
t = sample(c(0, 1), 1000, T),
e = rnorm(1000, 0, 10)
) %>% mutate(y = 0.75 * x1 + 2.5 * t + e) %>%
select(-e)
lm16 <- lm(y ~ x1 + t, data = df16)
lm15 <- lm(y ~ x1 + t, data = df15)
stargazer(lm15, lm16, type="text", style = "aer", ci = TRUE, ci.level = 0.95)
I want to plot t=1.558, x=2015, and t=2.797, x=2016 with their respective .95 CI. What is the best way of doing this?
I could do it 'by hand', but I hope there is a better way.
library(ggplot2)
df.plot <-
data.frame(
y = c(lm15$coefficients[['t']], lm16$coefficients[['t']]),
x = c(2015, 2016),
lb = c(
confint(lm15, 't', level = 0.95)[1],
confint(lm16, 't', level = 0.95)[1]
),
ub = c(
confint(lm15, 't', level = 0.95)[2],
confint(lm16, 't', level = 0.95)[2]
)
)
df.plot %>% ggplot(aes(x, y)) + geom_point() +
geom_errorbar(aes(ymin = lb, ymax = ub), width = 0.1) +
geom_hline(aes(yintercept=0), linetype="dashed")
By "best" I mean: figure quality (it looks nice), code elegance, and ease of extending it to more than 2 regressions.
This is a bit too long for a comment, so I post it as a partial answer.
It is unclear from your post whether your main problem is getting the data into the right shape or the plotting itself. But just to follow up on one of the comments, let me show you how to run several models using dplyr and broom in a way that makes plotting easy. Consider the mtcars dataset:
library(dplyr)
library(broom)
models <- mtcars %>% group_by(cyl) %>%
do(data.frame(tidy(lm(mpg ~ disp, data = .),conf.int=T )))
head(models) # I have abbreviated the following output a bit
cyl term estimate std.error statistic p.value conf.low conf.high
(dbl) (chr) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
4 (Intercept) 40.8720 3.5896 11.39 0.0000012 32.752 48.99221
4 disp -0.1351 0.0332 -4.07 0.0027828 -0.210 -0.06010
6 (Intercept) 19.0820 2.9140 6.55 0.0012440 11.591 26.57264
6 disp 0.0036 0.0156 0.23 0.8259297 -0.036 0.04360
You see that this gives you all coefficients and confidence intervals in one nice dataframe, which makes plotting with ggplot easier. For instance, if your datasets have identical columns, you could add a year identifier to them (e.g. df1$year <- 2000; df2$year <- 2001 etc.), and bind them together afterwards (e.g. using bind_rows, or you can use bind_rows's .id option). Then you can use the year identifier instead of cyl in the above example.
The plotting then is simple. To use the mtcars data again, let's plot the coefficients for disp only (though you could also use faceting, grouping, etc):
ggplot(filter(models, term=="disp"), aes(x=cyl, y=estimate)) +
geom_point() + geom_errorbar(aes(ymin=conf.low, ymax=conf.high))
To use your data:
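# name the list elements so that .id holds the year rather than "1" and "2"
df <- bind_rows(list("2015" = df15, "2016" = df16), .id = "years")
# pipe straight into ggplot; assigning the result would store the plot, not the models
df %>% group_by(years) %>%
do(data.frame(tidy(lm(y ~ x1+t, data = .),conf.int=T ))) %>%
filter(term == "t") %>%
ggplot(aes(x=years, y=estimate)) + geom_point() +
geom_errorbar(aes(ymin=conf.low, ymax=conf.high))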
Note that you can easily add more models just by binding more data to the main dataframe. You can also use faceting, grouping or position dodging to adjust the look of the plot if you want to show more than one coefficient.
This is the solution I have right now:
gen_df_plot <- function(reg, coef_name){
df <- data.frame(y = reg$coefficients[[coef_name]],
lb = confint(reg, coef_name, level = 0.95)[1],
ub = confint(reg, coef_name, level = 0.95)[2])
return(df)
}
df.plot <- lapply(list(lm15,lm16), gen_df_plot, coef_name = 't')
df.plot <- data.table::rbindlist(df.plot)
df.plot$x <- as.factor(c(2015, 2016))
df.plot %>% ggplot(aes(x, y)) + geom_point(size=4) +
geom_errorbar(aes(ymin = lb, ymax = ub), width = 0.1, linetype="dotted") +
geom_hline(aes(yintercept=0), linetype="dashed") + theme_bw()
I don't love it, but it works.
Here is what might be generalized code. I have made a change to how "x" is defined so that you don't have to worry about alphabetic reordering of the factor.
#
# Paul Gronke and Paul Manson
# Early Voting Information Center at Reed College
#
# August 27, 2019
#
#
# Code to plot a single coefficient from multiple models, provided
# as an easier alternative to "coefplot" and "dotwhisker". Some users
# may find those packages more capable
#
# Code adapted from https://stackoverflow.com/questions/35582052/plot-regression-coefficient-with-confidence-intervals
# gen_df_plot function will create a tidy data frame for your plot
# Currently set up to display 95% confidence intervals
gen_df_plot <- function(reg, coef_name){
df <- data.frame(y = reg$coefficients[[coef_name]],
lb = confint(reg, coef_name, level = 0.95)[1],
ub = confint(reg, coef_name, level = 0.95)[2])
return(df)
}
# Populate the data frame with a list of your model results.
df.plot <- lapply(list(model1, # List your models here
model2),
gen_df_plot,
coef_name = 'x1') # Coefficient name
# Convert the list to a tidy data frame
df.plot <- data.table::rbindlist(df.plot)
# Provide the coefficient or regression labels below, in the
# order that you want them to appear. The "levels=unique(.)" parameter
# overrides R's desire to order the factor alphabetically
df.plot$x <- c("Group 1",
"Group 2") %>%
factor(., levels = unique(.),
ordered = TRUE)
# Create your plot
df.plot %>% ggplot(aes(x, y)) +
geom_point(size=4) +
geom_errorbar(aes(ymin = lb, ymax = ub), width = 0.1, linetype="dotted") +
geom_hline(aes(yintercept=0), linetype="dashed") +
theme_bw() +
ggtitle("Comparing Coefficients") +
ylab("Coefficient Value")
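For comparison, here is a sketch of the same idea when the models are already fitted and kept in a named list, using broom::tidy() instead of a hand-written helper. The make_df() function and the seed are hypothetical stand-ins for the 2015 and 2016 data in the question.

```r
library(dplyr)
library(broom)
library(ggplot2)

set.seed(2016)
# Hypothetical generator mimicking the question's simulated data
make_df <- function(b_x1, b_t, n = 1000) {
  data.frame(x1 = rnorm(n, 0, 2),
             t  = sample(c(0, 1), n, TRUE)) %>%
    mutate(y = b_x1 * x1 + b_t * t + rnorm(n, 0, 10))
}

# The list names become the x-axis labels
models <- list(
  "2015" = lm(y ~ x1 + t, data = make_df(0.75, 2.5)),
  "2016" = lm(y ~ x1 + t, data = make_df(0.50, 2.0))
)

# Tidy every model, stack the results, keep the coefficient of interest
coefs <- lapply(models, tidy, conf.int = TRUE) %>%
  bind_rows(.id = "year") %>%
  filter(term == "t")

p <- ggplot(coefs, aes(x = year, y = estimate)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
  theme_bw() +
  ylab("Coefficient Value")
p
```

Adding a third regression only means adding another named element to models.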
How do we print the equation of a line on a plot?
I have 2 independent variables and would like an equation like this:
y=mx1+bx2+c
where x1=cost, x2 =targeting
I can plot the best-fit line, but how do I print the equation on the plot?
Maybe I can't print the 2 independent variables in one equation, but how do I do it for, say,
y=mx1+c at least?
Here is my code:
fit=lm(Signups ~ cost + targeting)
plot(cost, Signups, xlab="cost", ylab="Signups", main="Signups")
abline(lm(Signups ~ cost))
I tried to automate the output a bit:
fit <- lm(mpg ~ cyl + hp, data = mtcars)
summary(fit)
##Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.90833 2.19080 16.847 < 2e-16 ***
## cyl -2.26469 0.57589 -3.933 0.00048 ***
## hp -0.01912 0.01500 -1.275 0.21253
plot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "Miles per gallon")
abline(coef(fit)[1:2])
## rounded coefficients for better output
cf <- round(coef(fit), 2)
## sign check to avoid having plus followed by minus for negative coefficients
eq <- paste0("mpg = ", cf[1],
ifelse(sign(cf[2])==1, " + ", " - "), abs(cf[2]), " cyl ",
ifelse(sign(cf[3])==1, " + ", " - "), abs(cf[3]), " hp")
## printing of the equation
mtext(eq, 3, line=-2)
Hope it helps,
alex
You use ?text. In addition, you should not use abline(lm(Signups ~ cost)), as this is a different model (see my answer on CV here: Is there a difference between 'controling for' and 'ignoring' other variables in multiple regression). At any rate, consider:
set.seed(1)
Signups <- rnorm(20)
cost <- rnorm(20)
targeting <- rnorm(20)
fit <- lm(Signups ~ cost + targeting)
summary(fit)
# ...
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.1494 0.2072 0.721 0.481
# cost -0.1516 0.2504 -0.605 0.553
# targeting 0.2894 0.2695 1.074 0.298
# ...
dev.new();{ # windows() only exists on Windows; dev.new() opens a device on any platform
plot(cost, Signups, xlab="cost", ylab="Signups", main="Signups")
abline(coef(fit)[1:2])
text(-2, -2, adj=c(0,0), labels="Signups = .15 -.15cost + .29targeting")
}
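The label in text() above is typed by hand. A small sketch of building it from coef(fit) instead, so it follows the model if the data change. eq_label() is a hypothetical helper of mine, it assumes the model has an intercept, and I use mtcars so the example runs on its own; it should work the same way on the Signups fit.

```r
fit <- lm(mpg ~ cyl + hp, data = mtcars)

# Hypothetical helper: "response = intercept +/- slope term ..." as one string.
# Assumes the first coefficient is the intercept.
eq_label <- function(model, digits = 2) {
  cf <- coef(model)
  response <- all.vars(formula(model))[1]
  slopes <- cf[-1]
  rhs <- paste0(ifelse(slopes < 0, " - ", " + "),
                formatC(abs(slopes), format = "f", digits = digits),
                " ", names(slopes), collapse = "")
  paste0(response, " = ",
         formatC(cf[[1]], format = "f", digits = digits), rhs)
}

eq <- eq_label(fit)

plot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "Miles per gallon")
abline(coef(fit)[1:2])
# place the equation just inside the top-right corner of the plot region
text(x = par("usr")[2], y = par("usr")[4], labels = eq, adj = c(1.05, 1.5))
```

Using par("usr") for the position avoids hard-coding coordinates such as (-2, -2), which only suit one particular data range.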
Here's a solution using tidyverse packages.
The key is the broom package, which simplifies the process of extracting model data. For example:
fit1 <- lm(mpg ~ cyl, data = mtcars)
summary(fit1)
fit1 %>%
tidy() %>%
select(estimate, term)
Result
# A tibble: 2 x 2
estimate term
<dbl> <chr>
1 37.9 (Intercept)
2 -2.88 cyl
I wrote a function to extract and format the information using dplyr:
get_formula <- function(object) {
object %>%
tidy() %>%
mutate(
term = if_else(term == "(Intercept)", "", term),
sign = case_when(
term == "" ~ "",
estimate < 0 ~ "-",
estimate >= 0 ~ "+"
),
estimate = as.character(round(abs(estimate), digits = 2)),
term = if_else(term == "", paste(sign, estimate), paste(sign, estimate, term))
) %>%
summarize(terms = paste(term, collapse = " ")) %>%
pull(terms)
}
get_formula(fit1)
Result
[1] " 37.88 - 2.88 cyl"
Then use ggplot2 to plot the line and add a caption
mtcars %>%
ggplot(mapping = aes(x = cyl, y = mpg)) +
geom_point() +
geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
labs(
x = "Cylinders", y = "Miles per Gallon",
caption = paste("mpg =", get_formula(fit1))
)
Plot using geom_smooth()
This approach of plotting a line really only makes sense for visualizing the relationship between two variables. As @Glen_b pointed out in the comments, the slope we get from modelling mpg as a function of cyl (-2.88) doesn't match the slope we get from modelling mpg as a function of cyl and other variables (-1.29). For example:
fit2 <- lm(mpg ~ cyl + disp + wt + hp, data = mtcars)
summary(fit2)
fit2 %>%
tidy() %>%
select(estimate, term)
Result
# A tibble: 5 x 2
estimate term
<dbl> <chr>
1 40.8 (Intercept)
2 -1.29 cyl
3 0.0116 disp
4 -3.85 wt
5 -0.0205 hp
That said, if you want to accurately plot the regression line for a model that includes variables that aren't shown in the plot, use geom_abline() instead and get the slope and intercept using broom package functions. As far as I know, geom_smooth() formulas can't reference variables that aren't already mapped as aesthetics.
mtcars %>%
ggplot(mapping = aes(x = cyl, y = mpg)) +
geom_point() +
geom_abline(
slope = fit2 %>% tidy() %>% filter(term == "cyl") %>% pull(estimate),
intercept = fit2 %>% tidy() %>% filter(term == "(Intercept)") %>% pull(estimate),
color = "blue"
) +
labs(
x = "Cylinders", y = "Miles per Gallon",
caption = paste("mpg =", get_formula(fit2))
)
Plot using geom_abline()