ggplot2: facet_wrap failure for multi-layered plots with two variables - r

I am trying to create a multi-faceted plot with free scales using ggplot2. By design, facet_grid cannot achieve what I need, and facet_wrap fails with a cryptic error. Do you have any suggestions on how to fix the error? A reproducible example is given below.
Let's create sample data:
require(tidyverse)
require(modelr)
d1 <- tibble(
x = 1:100,
y = 1:100 + rnorm(10),
z = y ^ 2,
dataset_name = "d1"
)
d2 <- tibble(
x = 1:1000,
y = 1:1000 + rnorm(10),
z = y ^ 2,
dataset_name = "d2"
)
#these data will be used for the 1st layer
actuals <- bind_rows(d1, d2)
#these data will be used for the 2nd layer
predictions <- bind_rows(
d1 %>% gather_predictions(
"m1" = lm(y ~ x, data = d1),
"m2" = lm(y ~ x + z, data = d1),
.pred = "y"
),
d2 %>% gather_predictions(
"m1" = lm(y ~ x, data = d2),
"m2" = lm(y ~ x + z, data = d2),
.pred = "y"
)
)
facet_grid generated the required graphs:
But it cannot (by design) scale the x-axis:
ggplot(actuals, aes(x, y)) +
geom_point() +
geom_line(data = predictions, colour = "red") +
facet_grid(dataset_name ~ model, scales = "free")
If I want to plot the data only for one dataset (namely, predictions), it works as expected and I get 4 facets:
ggplot(predictions, aes(x, y)) +
geom_point() +
facet_wrap( ~ model + dataset_name, scales = "free")
However, if I try to combine actuals and predictions as follows:
ggplot(actuals, aes(x, y)) +
geom_point() +
geom_line(data = predictions, colour = "red") +
facet_wrap( ~ model + dataset_name, scales = "free")
Then things fall apart with the following error: Error in gList(list(x = 0.5, y = 0.5, width = 1, height = 1, just = "centre", : only 'grobs' allowed in "gList"

Try making a single variable with the interaction of model and dataset_name.
# these two blocks of code are equivalent
library(magrittr)
predictions %<>% mutate(mod_dn = interaction(model, dataset_name))
and
predictions <- predictions %>%
mutate(mod_dn = interaction(model, dataset_name))
Now, this poses a problem for facet_wrap, since mod_dn does not exist in actuals, the data used by the first layer. So we need to merge the two datasets together. Using tidyverse, we can do this with left_join, but we need to be careful about which columns we join by, and then adjust the ggplot call accordingly:
all_data <- left_join(
actuals,
predictions,
by = c("x", "dataset_name"),
suffix = c(".actual", ".pred")
)
all_data %>%
ggplot(aes(x, y.actual)) +
geom_point() +
geom_line(aes(y = y.pred), colour = "red") +
facet_wrap( ~ mod_dn, scales = "free") +
labs(y = "y")
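To see what the join produces before plotting, here is a minimal sketch on toy data. The names actuals_toy, predictions_toy and joined_toy and their values are made up for illustration; only the join arguments mirror the call above.

```r
library(dplyr)

# Toy stand-in for `actuals` (hypothetical values)
actuals_toy <- tibble(x = 1:3, y = c(1.1, 2.0, 2.9), dataset_name = "d1")

# Toy stand-in for `predictions`: two models, each predicting y at every x
predictions_toy <- tibble(
  model = rep(c("m1", "m2"), each = 3),
  x = rep(1:3, times = 2),
  y = c(1.0, 2.0, 3.0, 1.2, 2.1, 3.0),
  dataset_name = "d1"
) %>%
  mutate(mod_dn = interaction(model, dataset_name))

# Each actual row matches one prediction row per model, so rows are duplicated
joined_toy <- left_join(
  actuals_toy,
  predictions_toy,
  by = c("x", "dataset_name"),
  suffix = c(".actual", ".pred")
)

names(joined_toy)  # the shared column y is split into y.actual and y.pred
nrow(joined_toy)   # one row per (x, model) combination
```

Because the joined data carries mod_dn on every row, both the point layer and the line layer can be faceted by it.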

Thank you all for the help! It seems that a more straightforward solution (even though it alters the original question) would be to tweak the predictions as follows:
predictions <- bind_rows(
d1 %>% gather_predictions(
"m1" = lm(y ~ x, data = d1),
"m2" = lm(y ~ x + z, data = d1),
.pred = "y.pred"
),
d2 %>% gather_predictions(
"m1" = lm(y ~ x, data = d2),
"m2" = lm(y ~ x + z, data = d2),
.pred = "y.pred"
)
)
Then we can do the plotting without resorting to joins:
ggplot(predictions, aes(x, y)) +
geom_point() +
geom_line(aes(x, y.pred), colour = "red") +
facet_wrap( ~ model + dataset_name, scales = "free")
This renders the desired plot.

Related

Plot binomial GAM in ggplot

I'm trying to visualize a dataset that has a binomial response variable (proportions). I'm using a GAM to examine the trend, but I'm having difficulty getting it to plot with ggplot. How do I get the smooth added to the plot?
Example:
set.seed(42)
df <- data.frame(y1 = sample.int(100),
y2 = sample.int(100),
x = runif(100, 0, 100))
ggplot(data = df,
aes(y = y1/(y1+y2), x = x)) +
geom_point(shape = 1) +
geom_smooth(method = "gam",
method.args = list(family = binomial),
formula = cbind(y1, y2) ~ s(x))
Warning message:
Computation failed in `stat_smooth()`
Caused by error in `cbind()`:
! object 'y1' not found
The formula in geom_smooth has to be in terms of x and y, representing the variables on your x and y axes, so you can't pass in y1 and y2.
The way round this is to expand the counts into 1s and 0s so that there is only a single y variable, rather than using the cbind-type left-hand side in your gam. Although this takes a little extra pre-processing, it lets you draw your points just as easily using stat = 'summary' inside geom_point and makes your geom_smooth very straightforward:
library(tidyverse)
set.seed(42)
df <- data.frame(y1 = sample.int(100),
y2 = sample.int(100),
x = runif(100, 0, 100))
df %>%
rowwise() %>%
summarize(y = rep(c(1, 0), times = c(y1, y2)), x = x) %>%
ggplot(aes(x, y)) +
geom_point(stat = 'summary', fun = mean, shape = 1) +
geom_smooth(method = "gam",
method.args = list(family = binomial),
formula = y ~ s(x)) +
theme_classic()
Created on 2023-01-20 with reprex v2.0.2
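As a sanity check on the expansion trick, the sketch below compares the two-column count form with the expanded 0/1 form. It swaps in a plain glm with a linear term for the gam so that it runs in base R alone; the names fit_counts, long and fit_long are only illustrative.

```r
set.seed(42)
df <- data.frame(y1 = sample.int(100),
                 y2 = sample.int(100),
                 x = runif(100, 0, 100))

# Aggregated form: successes and failures as a two-column response
fit_counts <- glm(cbind(y1, y2) ~ x, family = binomial, data = df)

# Expanded form: one row per trial, y1 ones followed by y2 zeros for each x
long <- data.frame(
  y = unlist(Map(function(s, f) rep(c(1, 0), times = c(s, f)), df$y1, df$y2)),
  x = rep(df$x, times = df$y1 + df$y2)
)
fit_long <- glm(y ~ x, family = binomial, data = long)

# The coefficient estimates should agree up to numerical tolerance
all.equal(coef(fit_counts), coef(fit_long), tolerance = 1e-4)
```

The expanded data has only x and y, which is the form the formula argument of geom_smooth expects.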

How to plot the result of a regression prediction in R

I am beginning with ML in R, and I really like the idea of visualizing the results of my calculations. I am wondering how to plot a prediction.
library("faraway")
library(tibble)
library(stats)
data("sat")
df<-sat[complete.cases(sat),]
mod_sat_sal <- lm(total ~ salary, data = df)
new_teacher <- tibble(salary = 40)
predict(mod_sat_sal, new_teacher)
Expected result:
Data and Regression Model
data(sat, package = "faraway")
df <- sat[complete.cases(sat), ]
model <- lm(total ~ salary, data = df)
Method (1) : graphics way
# Compute the confidence band
x <- seq(min(df$salary), max(df$salary), length.out = 300)
x.conf <- predict(model, data.frame(salary = x),
interval = 'confidence')
# Plot
plot(total ~ salary, data = df, pch = 16, xaxs = "i")
polygon(c(x, rev(x)), c(x.conf[, 2], rev(x.conf[, 3])),
col = gray(0.5, 0.5), border = NA)
abline(model, lwd = 3, col = "darkblue")
Method (2) : ggplot2 way
library(ggplot2)
ggplot(df, aes(x = salary, y = total)) +
geom_point() +
geom_smooth(method = "lm")
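To mark a single predicted value, as in the question's new_teacher example, one option is to ask predict() for a prediction interval and draw the point on top of the scatterplot. This is a rough sketch that uses the built-in cars data as a stand-in, so it runs without the faraway package; new_obs and the value 21 are arbitrary choices for illustration.

```r
# Stand-in data: `cars` ships with R (the thread itself uses faraway::sat)
model <- lm(dist ~ speed, data = cars)

# One new observation to predict (arbitrary illustrative value)
new_obs <- data.frame(speed = 21)
pred <- predict(model, new_obs, interval = "prediction")

# Scatterplot, fitted line, then the predicted point with its interval
plot(dist ~ speed, data = cars, pch = 16)
abline(model, lwd = 2, col = "darkblue")
points(new_obs$speed, pred[, "fit"], pch = 17, col = "red", cex = 1.5)
arrows(new_obs$speed, pred[, "lwr"], new_obs$speed, pred[, "upr"],
       angle = 90, code = 3, length = 0.05, col = "red")
```

With interval = "prediction", predict() returns a matrix with columns fit, lwr and upr, which is what the points() and arrows() calls index into.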

visreg: overlay two models in a single plot

I'm trying to draw in a single plot crude and adjusted GAM models using library visreg:
# Create DF
set.seed(123)
x1 = rnorm(2000)
z = 1 + 3*x1 + 3*exp(x1)
pr = 1/(1+exp(-z))
y = rbinom(2000,1,pr)
df = data.frame(y=y,x1=x1, x2=exp(x1)*z)
# Fitting GAMs
library(mgcv)
crude <- gam(y ~ s(x1), family=binomial(link=logit), data=df)
adj <- gam(y ~ s(x1) + s(x2), family=binomial(link=logit), data=df)
# Plot results using 'visreg'
library(visreg)
p.crude <- visreg(crude, scale='response', "x1", line.par = list(col = 'red'), gg=TRUE) + theme_bw()
p.adj <- visreg(adj, scale='response', "x1", gg=TRUE) + theme_bw()
Using gridExtra I can produce a two-column plot; however, I would like a single plot that overlays the two model plots.
You can use the plot=FALSE parameter to get the data without the plots:
p.crude <- visreg(crude, scale='response', "x1", line.par = list(col = 'red'), plot=FALSE)
p.adj <- visreg(adj, scale='response', "x1", plot = FALSE)
And, then re-create it by hand:
dplyr::bind_rows(
dplyr::mutate(p.crude$fit, plt = "crude"),
dplyr::mutate(p.adj$fit, plt = "adj")
) -> fits
ggplot() +
geom_ribbon(
data = fits,
aes(x1, ymin=visregLwr, ymax=visregUpr, group=plt), fill="gray90"
) +
geom_line(data = fits, aes(x1, visregFit, group=plt, color=plt)) +
theme_bw()
https://github.com/pbreheny/visreg/blob/master/R/ggFactorPlot.R has all the other computations and geoms/aesthetics you can use in the recreation.

How to plot 3 models in one Figure in R?

I'm new with R and I have fit 3 models for my data as follows:
Model 1: y = a(x) + b
lm1 = lm(data$CBI ~ data$dNDVI)
Model 2: y = a(x)^2 + b(x) + c
lm2 <- lm(CBI ~ dNDVI + I(dNDVI^2), data=data)
Model 3: y = x(a|x| + b)–1
lm3 = nls(CBI ~ dNDVI*(a*abs(dNDVI) + b) - 1, start = c(a = 1.5, b = 2.7), data = data)
Now I would like to plot all three models in R, but I could not find a way to do it. Can you please help me? I have tried with the first two models as follows and it works, but I don't know how to add Model 3 to it:
ggplot(data = data, aes(x = dNDVI, y = CBI)) +
geom_point() +
geom_smooth(method = lm, formula = y ~ x, size = 1, se = FALSE) +
geom_smooth(method = lm, formula = y ~ x + I(x^2), size = 1, se = FALSE ) +
theme_bw()
I would also like to add a legend that shows 3 different colours or line types for the 3 models. Can you please guide me on how to do that in the figure?
Using iris as a dummy set to represent the three models:
library(tidyverse) # provides %>%, gather() and ggplot()
new.dat <- data.frame(Sepal.Length=seq(min(iris$Sepal.Length),
max(iris$Sepal.Length), length.out=50)) #new data.frame to predict the fitted values for each model
m1 <- lm(Petal.Length ~ Sepal.Length, iris)
m2 <- lm(Petal.Length ~ Sepal.Length + I(Sepal.Length^2), data=iris)
m3 <- nls(Petal.Length ~ Sepal.Length*(a*abs(Sepal.Length) + b) - 1,
start = c(a = 1.5, b = 2.7), data = iris)
new.dat$m1.fitted <- predict(m1, new.dat)
new.dat$m2.fitted <- predict(m2, new.dat)
new.dat$m3.fitted <- predict(m3, new.dat)
new.dat <- new.dat %>% gather(var, val, m1.fitted:m3.fitted) #stacked format of fitted data of three models (to automatically generate the legend in ggplot)
ggplot(new.dat, aes(Sepal.Length, val, colour=var)) +
geom_line()
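If you would rather stay close to the geom_smooth layers from the question, a possible alternative is to let geom_smooth fit the nls model itself. This is only a sketch on iris: the legend labels are made up, and se = FALSE is needed here because nls predictions come without standard errors.

```r
library(ggplot2)

# Fit Model 3 directly first, to confirm the start values converge on iris
m3 <- nls(Petal.Length ~ Sepal.Length * (a * abs(Sepal.Length) + b) - 1,
          start = c(a = 1.5, b = 2.7), data = iris)

# A constant colour inside aes() creates one legend entry per layer
p <- ggplot(iris, aes(Sepal.Length, Petal.Length)) +
  geom_point() +
  geom_smooth(aes(colour = "m1: linear"), method = "lm",
              formula = y ~ x, se = FALSE) +
  geom_smooth(aes(colour = "m2: quadratic"), method = "lm",
              formula = y ~ x + I(x^2), se = FALSE) +
  geom_smooth(aes(colour = "m3: nls"), method = "nls",
              formula = y ~ x * (a * abs(x) + b) - 1,
              method.args = list(start = c(a = 1.5, b = 2.7)),
              se = FALSE) +
  labs(colour = "Model") +
  theme_bw()
p
```

The predict-then-stack approach above is more flexible when the models are already fitted; this variant just saves the reshaping step.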

How to plot a linear and quadratic model on the same graph?

So I have 2 models for the data set that I am using:
> Bears1Fit1 <- lm(Weight ~ Neck.G)
>
> Bears2Fit2 <- lm(Weight ~ Neck.G + I(Neck.G^2))
I want to plot these two models on the same scatterplot. I have this so far:
> plot(Neck.G, Weight, pch = c(1), main = "Black Bears Data: Weight Vs Neck Girth", xlab = "Neck Girth (inches) ", ylab = "Weight (pounds)")
> abline(Bears1Fit1)
However, I am unsure of how I should put the quadratic model on the same graph as well. I want to be able to have both lines on the same graph.
Here is an example with the cars data set:
data(cars)
make models:
model_lm <- lm(speed ~ dist, data = cars)
model_lm2 <- lm(speed ~ dist + I(dist^2), data = cars)
make new data:
new.data <- data.frame(dist = seq(from = min(cars$dist),
to = max(cars$dist), length.out = 200))
predict:
pred_lm <- predict(model_lm, newdata = new.data)
pred_lm2 <- predict(model_lm2, newdata = new.data)
plot:
plot(speed ~ dist, data = cars)
lines(pred_lm ~ new.data$dist, col = "red")
lines(pred_lm2 ~ new.data$dist, col = "blue")
legend("topleft", c("linear", "quadratic"), col = c("red", "blue"), lty = 1)
with ggplot2
library(ggplot2)
put all data in one data frame and convert to long format using melt from reshape2
preds <- data.frame(new.data,
linear = pred_lm,
quadratic = pred_lm2)
preds <- reshape2::melt(preds,
id.vars = 1)
plot
ggplot(data = preds)+
geom_line(aes(x = dist, y = value, color = variable ))+
geom_point(data = cars, aes(x = dist, y = speed))+
theme_bw()
EDIT: another way, using just ggplot2, is to add two geom_smooth layers: one with the default formula y ~ x (so it need not be specified) and one with the quadratic formula y ~ x + I(x^2). To get a legend, we can set color inside the aes call, naming each entry as we want it to appear in the legend.
ggplot(cars,
aes(x = dist, y = speed)) +
geom_point() +
geom_smooth(method = "lm",
aes(color = "linear"),
se = FALSE) +
geom_smooth(method = "lm",
formula = y ~ x + I(x^2),
aes(color = "quadratic"),
se = FALSE) +
theme_bw()
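One detail worth checking when writing the quadratic formula is that the square goes inside I(). The sketch below, again on cars, compares the two spellings; the object names wrong and right are only for illustration.

```r
# Square outside I(): the formula interface treats ^ as term crossing,
# so this adds I(dist) itself, not a squared term
wrong <- lm(speed ~ dist + I(dist)^2, data = cars)

# Square inside I(): arithmetic squaring, a genuine quadratic term
right <- lm(speed ~ dist + I(dist^2), data = cars)

attr(terms(wrong), "term.labels")  # no squared term appears here
coef(wrong)  # the I(dist) coefficient is NA because it duplicates dist
coef(right)  # intercept, linear and quadratic coefficients
```

So a formula written as y ~ x + I(x)^2 silently gives the same fit as the linear model.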
