I'm trying to draw crude and adjusted GAM models in a single plot using the visreg library:
# Create DF
set.seed(123)
x1 = rnorm(2000)
z = 1 + 3*x1 + 3*exp(x1)
pr = 1/(1+exp(-z))
y = rbinom(2000,1,pr)
df = data.frame(y=y,x1=x1, x2=exp(x1)*z)
# Fitting GAMs
library(mgcv)
crude <- gam(y ~ s(x1), family=binomial(link=logit), data=df)
adj <- gam(y ~ s(x1) + s(x2), family=binomial(link=logit), data=df)
# Plot results using 'visreg'
library(visreg)
p.crude <- visreg(crude, scale='response', "x1", line.par = list(col = 'red'), gg=TRUE) + theme_bw()
p.adj <- visreg(adj, scale='response', "x1", gg=TRUE) + theme_bw()
Using gridExtra I can produce a two-column plot, but I would like a single plot that overlays the two model fits.
You can use the plot=FALSE parameter to get the data without the plots:
p.crude <- visreg(crude, scale='response', "x1", line.par = list(col = 'red'), plot=FALSE)
p.adj <- visreg(adj, scale='response', "x1", plot = FALSE)
And then re-create the plot by hand:
library(ggplot2)
# Stack the fitted values from both models and label their origin
dplyr::bind_rows(
  dplyr::mutate(p.crude$fit, plt = "crude"),
  dplyr::mutate(p.adj$fit, plt = "adj")
) -> fits
ggplot() +
  geom_ribbon(
    data = fits,
    aes(x1, ymin=visregLwr, ymax=visregUpr, group=plt), fill="gray90"
  ) +
  geom_line(data = fits, aes(x1, visregFit, group=plt, color=plt)) +
  theme_bw()
https://github.com/pbreheny/visreg/blob/master/R/ggFactorPlot.R has all the other computations and geoms/aesthetics you can use in the recreation.
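If you would rather not depend on visreg at all, roughly the same curves can be computed straight from the fitted models. The following is only a sketch, not visreg's actual code: it assumes the band is formed on the link scale as fit ± 1.96 SE and then back-transformed with plogis(), and that the other covariate is held at its median. band() is a hypothetical helper name, and the simulated data are a simplified stand-in for the df in the question.

```r
library(mgcv)

# Simplified stand-in data (assumption: not the df from the question)
set.seed(123)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.5 + x1 - 0.5 * x2))
df <- data.frame(y = y, x1 = x1, x2 = x2)

crude <- gam(y ~ s(x1), family = binomial(link = logit), data = df)
adj   <- gam(y ~ s(x1) + s(x2), family = binomial(link = logit), data = df)

# Hypothetical helper: predict on the link scale, then back-transform
band <- function(model, newdata, label) {
  p <- predict(model, newdata = newdata, type = "link", se.fit = TRUE)
  fit <- as.numeric(p$fit)
  se  <- as.numeric(p$se.fit)
  data.frame(
    x1  = newdata$x1,
    fit = plogis(fit),
    lwr = plogis(fit - 1.96 * se),
    upr = plogis(fit + 1.96 * se),
    plt = label
  )
}

# Grid over x1, with x2 fixed at its median (ignored by the crude model)
grid <- data.frame(x1 = seq(min(df$x1), max(df$x1), length.out = 100),
                   x2 = median(df$x2))
fits <- rbind(band(crude, grid, "crude"), band(adj, grid, "adj"))
```

The fit, lwr and upr columns can then be handed to geom_line and geom_ribbon in the same way as visregFit, visregLwr and visregUpr.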
I have fitted 6 lm() models and 1 gam() model on the same dataset.
Now I want to plot them all in one plot on top of each other. Can I do this without defining the models again in ggplot?
My case is this
I have
model1 <- lm(y~1, data = data) %>% coef()
model2 <- lm(y~x, data = data) %>% coef()
model3 <- lm(y~abs(x), data = data) %>% coef()
...
model7 <- gam(y~s(x), data = data) %>% coef()
Can I feed the stored coefficients of my models to ggplot?
ggplot(data, mapping = aes(x = x, y = y)) +
geom_point() +
geom_abline(model1) +
geom_abline(model2) +
....
Or is the only way to plot the model prediction lines to manually fill out the parameters like this:
ggplot(data, mapping = aes(x = x, y = y)) +
geom_point() +
geom_abline(slope = 0, intercept = model1[1]) +
geom_abline(slope = model2[2], intercept = model2[1]) +
geom_abline(slope = model3[2], intercept = model3[1]) +
...
Example code
set.seed(123)
x <- rnorm(50)
y <- rweibull(50,1)
d <- as.data.frame(cbind(x,y))
model1 <- coef(lm(y~1, data = d))
model2 <- coef(lm(y~x, data = d))
model3 <- coef(lm(y~abs(x), data = d))
Including the SE for each line/model and a legend would be welcome as well.
In order for this to work, you really need to save the whole model. So if we assume you have the entire model
# set.seed(101) used for sample data
model1 <- lm(y~1, data = d)
model2 <- lm(y~x, data = d)
model3 <- lm(y~abs(x), data = d)
We can write a helper function to predict new values from these models over the observed range of x values. Here's such a function:
newvalsforx <- function(x) {
  xrng <- seq(min(x), max(x), length.out=100)
  function(m) data.frame(x=xrng, y=predict(m, data.frame(x=xrng)))
}
pred <- newvalsforx(d$x)
This pred() will make predictions from the models over the observed range of x. We can then pass these as new data to geom_line layers that we add to a plot. For example:
ggplot(d, aes(x,y)) +
geom_point() +
geom_line(data=pred(model1), color="red") +
geom_line(data=pred(model2), color="blue") +
geom_line(data=pred(model3), color="green")
This gives a scatterplot of the points with the three fitted lines overlaid in red, blue, and green.
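The question also asked for standard errors and a legend. One possible way to get both, sketched here under the assumption that a 95% confidence band from predict(..., interval = "confidence") is what is wanted, is to collect the predictions from all models in one data frame and map colour and fill to a model column. The list names constant, linear and absolute are made up for this illustration.

```r
library(ggplot2)

set.seed(123)
d <- data.frame(x = rnorm(50), y = rweibull(50, 1))

# Illustrative names; the whole model objects are kept, not just coef()
models <- list(
  constant = lm(y ~ 1, data = d),
  linear   = lm(y ~ x, data = d),
  absolute = lm(y ~ abs(x), data = d)
)

xrng <- data.frame(x = seq(min(d$x), max(d$x), length.out = 100))

# One row per model and grid point, with a 95% confidence interval
preds <- do.call(rbind, lapply(names(models), function(nm) {
  ci <- predict(models[[nm]], newdata = xrng, interval = "confidence")
  data.frame(x = xrng$x, y = ci[, "fit"], lwr = ci[, "lwr"],
             upr = ci[, "upr"], model = nm)
}))

# Mapping colour/fill to `model` inside aes() is what creates the legend
p <- ggplot(d, aes(x, y)) +
  geom_point() +
  geom_ribbon(data = preds, aes(ymin = lwr, ymax = upr, fill = model),
              alpha = 0.2) +
  geom_line(data = preds, aes(colour = model))
```

Printing p draws the plot. Because the colours are mapped rather than set per layer, adding a fourth model only means adding one entry to the list.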
I am trying to create a multi-faceted plot with free scaling using ggplot2. By design, facet_grid cannot achieve what I need, and facet_wrap fails with a cryptic error. Do you have any suggestions on how to fix the error? A reproducible example is given below.
Let's create sample data:
require(tidyverse)
require(modelr)
d1 <- tibble(
x = 1:100,
y = 1:100 + rnorm(10),
z = y ^ 2,
dataset_name = "d1"
)
d2 <- tibble(
x = 1:1000,
y = 1:1000 + rnorm(10),
z = y ^ 2,
dataset_name = "d2"
)
#these data will be used for the 1st layer
actuals <- bind_rows(d1, d2)
#these data will be used for the 2nd layer
predictions <- bind_rows(
d1 %>% gather_predictions(
"m1" = lm(y ~ x, data = d1),
"m2" = lm(y ~ x + z, data = d1),
.pred = "y"
),
d2 %>% gather_predictions(
"m1" = lm(y ~ x, data = d2),
"m2" = lm(y ~ x + z, data = d2),
.pred = "y"
)
)
facet_grid generates the required graphs, but it cannot (by design) scale the x-axis independently for each panel:
ggplot(actuals, aes(x, y)) +
geom_point() +
geom_line(data = predictions, colour = "red") +
facet_grid(dataset_name ~ model, scales = "free")
If I want to plot the data only for one dataset (namely, predictions), it works as expected and I get 4 facets:
ggplot(predictions, aes(x, y)) +
geom_point() +
facet_wrap( ~ model + dataset_name, scales = "free")
However, if I try to combine actuals and predictions as follows:
ggplot(actuals, aes(x, y)) +
geom_point() +
geom_line(data = predictions, colour = "red") +
facet_wrap( ~ model + dataset_name, scales = "free")
Then things fall apart with the following error:
Error in gList(list(x = 0.5, y = 0.5, width = 1, height = 1, just = "centre", : only 'grobs' allowed in "gList"
Try making a single variable with the interaction of model and dataset_name.
# these two blocks of code are equivalent
library(magrittr)
predictions %<>% mutate(mod_dn = interaction(model, dataset_name))
and
predictions <- predictions %>%
mutate(mod_dn = interaction(model, dataset_name))
Now, this poses a problem for facet_wrap, since mod_dn does not exist in actuals. So we need to merge the two datasets together. Using tidyverse, we can do this with left_join, but we need to be careful about what we join by, and then adjust the ggplot call accordingly:
all_data <- left_join(
actuals,
predictions,
by = c("x", "dataset_name"),
suffix = c(".actual", ".pred")
)
all_data %>%
ggplot(aes(x, y.actual)) +
geom_point() +
geom_line(aes(y = y.pred), colour = "red") +
facet_wrap( ~ mod_dn, scales = "free") +
labs(y = "y")
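An alternative to joining is to give the point layer the missing faceting column directly. The sketch below assumes the trouble comes from actuals having dataset_name but no model column. It uses a small toy data frame in place of d1 and d2, takes the labels m1 and m2 from the gather_predictions calls in the question, and relies on base merge() with by = NULL, which returns the Cartesian product.

```r
# Toy stand-in for `actuals` (assumption: same columns, far fewer rows)
set.seed(1)
actuals <- data.frame(
  x = c(1:5, 1:8),
  y = c(1:5, 1:8) + rnorm(13),
  dataset_name = rep(c("d1", "d2"), times = c(5, 8))
)

# Cartesian product: every actual row is repeated once per model label
models <- data.frame(model = c("m1", "m2"))
actuals_by_model <- merge(actuals, models, by = NULL)

head(actuals_by_model)
```

With the model column present in both data frames, the expanded table can take the place of actuals in the ggplot() call, so that each layer carries both variables used by facet_wrap.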
Thank you all for the help! It seems that a more straightforward solution (even though it alters the original question) would be to tweak the predictions as follows:
predictions <- bind_rows(
d1 %>% gather_predictions(
"m1" = lm(y ~ x, data = d1),
"m2" = lm(y ~ x + z, data = d1),
.pred = "y.pred"
),
d2 %>% gather_predictions(
"m1" = lm(y ~ x, data = d2),
"m2" = lm(y ~ x + z, data = d2),
.pred = "y.pred"
)
)
Then we can do the plotting without resorting to joins:
ggplot(predictions, aes(x, y)) +
geom_point() +
geom_line(aes(x, y.pred), colour = "red") +
facet_wrap( ~ model + dataset_name, scales = "free")
This renders the desired plot.
So I have 2 models for the data set that I am using:
Bears1Fit1 <- lm(Weight ~ Neck.G)
Bears2Fit2 <- lm(Weight ~ Neck.G + I(Neck.G^2))
I want to plot these two models on the same scatterplot. I have this so far:
plot(Neck.G, Weight, pch = c(1), main = "Black Bears Data: Weight Vs Neck Girth", xlab = "Neck Girth (inches) ", ylab = "Weight (pounds)")
abline(Bears1Fit1)
However, I am unsure of how I should put the quadratic model on the same graph as well. I want to be able to have both lines on the same graph.
Here is an example with cars data set:
data(cars)
make models:
model_lm <- lm(speed ~ dist, data = cars)
model_lm2 <- lm(speed ~ dist + I(dist^2), data = cars)
make new data:
new.data <- data.frame(dist = seq(from = min(cars$dist),
to = max(cars$dist), length.out = 200))
predict:
pred_lm <- predict(model_lm, newdata = new.data)
pred_lm2 <- predict(model_lm2, newdata = new.data)
plot:
plot(speed ~ dist, data = cars)
lines(pred_lm ~ new.data$dist, col = "red")
lines(pred_lm2 ~ new.data$dist, col = "blue")
legend("topleft", c("linear", "quadratic"), col = c("red", "blue"), lty = 1)
with ggplot2
library(ggplot2)
put all data in one data frame and convert to long format using melt from reshape2
preds <- data.frame(new.data,
linear = pred_lm,
quadratic = pred_lm2)
preds <- reshape2::melt(preds,
id.vars = 1)
plot
ggplot(data = preds)+
geom_line(aes(x = dist, y = value, color = variable ))+
geom_point(data = cars, aes(x = dist, y = speed))+
theme_bw()
EDIT: another way using just ggplot2 using two geom_smooth layers, one with the default formula y ~ x (so it need not be specified) and one with a quadratic model formula = y ~ x + I(x^2). In order to get a legend we can specify color within the aes call naming the desired entry as we want it to show in the legend.
ggplot(cars,
aes(x = dist, y = speed)) +
geom_point() +
geom_smooth(method = "lm",
aes(color = "linear"),
se = FALSE) +
geom_smooth(method = "lm",
formula = y ~ x + I(x^2),
aes(color = "quadratic"),
se = FALSE) +
theme_bw()
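One pitfall when writing the quadratic term by hand is worth a quick check: the exponent has to sit inside I(). Outside of it, ^ is interpreted as a formula operator rather than as arithmetic. The comparison below is a small illustration on the cars data; the object names wrong and right are made up for this example.

```r
data(cars)

# Exponent outside I(): in formula syntax, I(dist)^2 reduces to I(dist),
# which duplicates dist, so no quadratic term is fitted
wrong <- lm(speed ~ dist + I(dist)^2, data = cars)

# Exponent inside I(): dist^2 is evaluated arithmetically
right <- lm(speed ~ dist + I(dist^2), data = cars)

coef(wrong)  # the duplicated term is reported as NA
coef(right)  # intercept, linear, and quadratic coefficients
```

The NA in the first model is the visible symptom: the fit silently falls back to a straight line.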
I have been trying to fit a polynomial surface to a set of points with 3 coordinates.
Let the data be:
DATA <- with(mtcars, as.data.frame(cbind(1:32, wt,disp,mpg)))
I have been trying to draw a surface using:
plot3d from rgl package,
using rsm package,
scatterplot3d package.
For example:
library(scatterplot3d)
attach(mtcars)
DATA <- as.data.frame(cbind(1:32, wt,disp,mpg))
scatterplot3d(wt,disp,mpg, main="3D Scatterplot")
model <- loess(mpg ~wt + disp, data=DATA)
x <-range(DATA$wt)
x <- seq(x[1], x[2], length.out=50)
y <- range(DATA$disp)
y <- seq(y[1], y[2], length.out=50)
z <- outer(x,y,
function(wt,disp)
predict(model, data.frame(wt,disp)))
z
p <- persp(x,y,z, theta=30, phi=30,
col="lightblue",expand = 0.5,shade = 0.2,
xlab="wt", ylab="disp", zlab="mpg")
I have also tried using surf.ls function:
surf.ls(2,DATA[,2],DATA[,3],DATA[,4])
But what I got looks like this:
I don't really know how to transform it to a 3D plot and more importantly, how to get the formula for the best fit surface obtained.
I would really appreciate your help.
PS I have deleted my last post and included more details in this one.
Try this:
attach(mtcars)
DATA <- as.data.frame(cbind(1:32, wt,disp,mpg))
x_wt <- DATA$wt
y_disp <- DATA$disp
z_mpg <- DATA$mpg
fit <- lm(z_mpg ~ poly(x_wt, y_disp, degree = 2), data = DATA)
To plot with rsm, use the following:
library(rsm)
image(fit, y_disp ~ x_wt)
contour(fit, y_disp ~ x_wt)
persp(fit, y_disp ~ x_wt, zlab = "z_mpg")
To plot with ggplot, use the following:
## ggplot
# Use rsm package to create surface model.
library(rsm)
SurfMod <- contour(fit, y_disp ~ x_wt)
# extract list values from rsm Surface Model
Xvals <- SurfMod$`x_wt ~ y_disp`[1]
Yvals <- SurfMod$`x_wt ~ y_disp`[2]
Zvals <- SurfMod$`x_wt ~ y_disp`[3]
# Construct matrix with col and row names
SurfMatrix <- Zvals$z
colnames(SurfMatrix) <- Yvals$y
rownames(SurfMatrix) <- Xvals$x
# Convert matrix to data frame
library(reshape2)
SurfDF <- melt(SurfMatrix)
library(ggplot2)
gg <- ggplot(data = SurfDF) +
geom_tile(data = SurfDF, aes(Var1, Var2,z = value, fill = value)) +
stat_contour(data = SurfDF, aes(Var1, Var2, z = value, color = ..level..)) +
scale_colour_gradient(low = "green", high = "red") +
geom_point(data = DATA, aes(wt, disp, z = mpg, color = mpg)) +
geom_text(data = DATA, aes(wt, disp,label=mpg),hjust=0, vjust=0) +
scale_fill_continuous(name="mpg") +
xlab("x_wt") +
ylab("y_disp")
library(directlabels)
direct.label.ggplot(gg, "angled.endpoints")
To see all of the available direct.label methods, go to http://directlabels.r-forge.r-project.org/docs/index.html
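The question also asked for the formula of the fitted surface. With poly() at its default settings the coefficients refer to orthogonal polynomials, which are awkward to read off. A sketch of one way around this, assuming a full second-order surface is what is wanted, is to write the terms out explicitly so that each coefficient multiplies a plain monomial. It uses mtcars directly rather than DATA, and the names fit_explicit, fit_poly and surface are made up for this illustration.

```r
# Full second-order surface written out term by term
fit_explicit <- lm(mpg ~ wt + disp + I(wt^2) + I(disp^2) + wt:disp,
                   data = mtcars)

# Same surface via orthogonal polynomials, as in the poly() call above
fit_poly <- lm(mpg ~ poly(wt, disp, degree = 2), data = mtcars)

# Coefficients of the surface in the original units
b <- coef(fit_explicit)
b

# Evaluate the surface by hand from the coefficients
surface <- function(wt, disp) {
  b["(Intercept)"] + b["wt"] * wt + b["disp"] * disp +
    b["I(wt^2)"] * wt^2 + b["I(disp^2)"] * disp^2 +
    b["wt:disp"] * wt * disp
}
manual <- surface(mtcars$wt, mtcars$disp)
```

Both parameterisations span the same set of surfaces, so their fitted values should agree up to rounding. Only the explicit form gives coefficients that can be copied straight into an equation for mpg in terms of wt and disp.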
I'm running a set of models with the same independent variables but different dependent variables, and would like to create a set of coefficient plots in one figure in which each model gets its own panel. The following code gives the idea, but it integrates all of the models into one panel rather than producing 3 separate panels side by side in one figure:
require("coefplot")
set.seed(123)
dat <- data.frame(x = rnorm(100), z = rnorm(100), y1 = rnorm(100), y2 = rnorm(100), y3 = rnorm(100))
mod1 <- lm(y1 ~ x + z, data = dat)
mod2 <- lm(y2 ~ x + z, data = dat)
mod3 <- lm(y3 ~ x + z, data = dat)
multiplot(mod1,mod2, mod3)
Which generates this plot:
Any thoughts on how to get them to panel next to each other in one figure? Thanks!
I haven't used the coefplot package before, but you can create a coefficient plot directly in ggplot2.
set.seed(123)
dat <- data.frame(x = rnorm(100), z = rnorm(100), y1 = rnorm(100), y2 = rnorm(100), y3 = rnorm(100))
mod1 <- lm(y1 ~ x + z, data = dat)
mod2 <- lm(y2 ~ x + z, data = dat)
mod3 <- lm(y3 ~ x + z, data = dat)
## Create data frame of model coefficients and standard errors
# Function to extract what we need
ce = function(model.obj) {
extract = summary(get(model.obj))$coefficients[ ,1:2]
return(data.frame(extract, vars=row.names(extract), model=model.obj))
}
# Run function on the three models and bind into single data frame
coefs = do.call(rbind, sapply(paste0("mod",1:3), ce, simplify=FALSE))
names(coefs)[2] = "se"
# Faceted coefficient plot (error bars show +/- 1 standard error)
library(ggplot2)
ggplot(coefs, aes(vars, Estimate)) +
  geom_hline(yintercept=0, lty=2, lwd=1, colour="grey50") +
  geom_errorbar(aes(ymin=Estimate - se, ymax=Estimate + se, colour=vars),
                lwd=1, width=0) +
  geom_point(size=3, aes(colour=vars)) +
  facet_grid(. ~ model) +
  coord_flip() +
  guides(colour="none") +
  labs(x="Coefficient", y="Value") +
  theme_grey(base_size=15)
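If 95% confidence intervals are preferred over ±1 standard error, the table of coefficients could be built with confint() instead. This is a sketch of that variation, not part of the answer above: it keeps the models in a named list rather than looking them up with get(), and the column names lwr and upr are made up here.

```r
set.seed(123)
dat <- data.frame(x = rnorm(100), z = rnorm(100),
                  y1 = rnorm(100), y2 = rnorm(100), y3 = rnorm(100))

# Keep the models in a named list instead of looking them up by name
mods <- list(
  mod1 = lm(y1 ~ x + z, data = dat),
  mod2 = lm(y2 ~ x + z, data = dat),
  mod3 = lm(y3 ~ x + z, data = dat)
)

# One row per model and coefficient, with 95% confidence limits
coefs <- do.call(rbind, lapply(names(mods), function(nm) {
  ci <- confint(mods[[nm]], level = 0.95)
  data.frame(model = nm, vars = rownames(ci),
             Estimate = unname(coef(mods[[nm]])),
             lwr = ci[, 1], upr = ci[, 2], row.names = NULL)
}))

coefs
```

In the plotting code, ymin and ymax in geom_errorbar would then be mapped to lwr and upr.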