I am trying to create a plot showing odds ratios from an lrm model using splines from the rms package. I've created plots of the log-odds as well as the odds, but in the plot that should show odds ratios, the reference starts at an odds ratio of 0 and not at an odds ratio of 1.
Here is an illustration of what I have using the cgd dataset in the survival package.
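library(survival)  # provides the cgd dataset
library(rms)       # lrm(), rcs(), datadist(), Predict()
library(ggplot2)
summary(cgd)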
dd = datadist(cgd)
options(datadist="dd")
attach(cgd)
fit <- lrm(inherit ~ rcs(age, 4), data=cgd)
fit
summary(fit)
an <- anova(fit)
f1 <- ggplot(Predict(fit), ylab="Log-Odds")+ theme_grey()
f2 <- ggplot(Predict(fit, fun=exp), anova=an, pval = TRUE, ylab="Odds ratio") + theme_grey()
cowplot::plot_grid(f1, f2, nrow = 1, ncol = 2, scale = .9)
These are the plots produced:
Am I doing something wrong with my code? What can I do so that the odds ratios start at 1?
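One thing worth noting: exp() of the linear predictor gives odds, not odds ratios. An odds ratio needs a reference value, and it is exp(lp - lp_ref), which equals 1 at the reference by construction. In rms, Predict() has a ref.zero = TRUE argument that subtracts the prediction at the reference value before fun is applied (untested against the code above). Below is a minimal sketch of the idea in base R, assuming simulated data and a plain glm() with a natural spline instead of lrm() and rcs(); the names fit_glm, grid and ref are made up for the illustration.

```r
library(splines)

# Simulated stand-in data (assumption: not the cgd dataset)
set.seed(1)
n <- 300
age <- runif(n, 1, 40)
y <- rbinom(n, 1, plogis(-1 + 0.05 * age))

# Logistic regression with a natural cubic spline for age
fit_glm <- glm(y ~ ns(age, df = 3), family = binomial)

# Prediction grid and a reference value (here: the median age)
grid <- data.frame(age = seq(1, 40, length.out = 100))
ref <- data.frame(age = median(age))

# Log-odds on the grid and at the reference
lp <- predict(fit_glm, newdata = grid, type = "link")
lp_ref <- predict(fit_glm, newdata = ref, type = "link")

# Odds ratio relative to the reference age
grid$odds_ratio <- exp(lp - lp_ref)

# The odds ratio at the reference itself is exactly 1
or_at_ref <- unname(exp(predict(fit_glm, newdata = ref, type = "link") - lp_ref))
```

Plotting grid$odds_ratio against grid$age then gives a curve that passes through 1 at the reference age.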
Using the 'iris' dataset (slightly modified as below), I plot the results of an LME.
PLEASE NOTE: I am only using the iris dataset as mock data for the purpose of plotting, so please do not critique the appropriateness of this test. I'm not interested in the statistics, rather the plotting.
When I use the ggpredict function and plot the results, the plot extends the predictions beyond the range of the data. Is there a systematic way to plot predictions only within the range of the data in each facet?
I can plot each facet separately, limit the axis per plot manually, and cowplot them back together, but if there is a way to say 'predict only to the max. and min. of the data for that group', this would be great.
Given that these are facets of a single model, perhaps not showing the predictions for different groups is in fact misleading, and I should rather create three different models if I only want predictions within those data subsets?
library(lme4)
library(ggeffects)
library(ggplot2)
library(dplyr)  # glimpse(), %>%, group_by(), mutate()
data(iris)
glimpse(iris)
df = iris
glimpse(df)
# shift Sepal.Length so that the species occupy different ranges
df_ed = df %>% group_by(Species) %>% mutate(Sepal.Length = ifelse(Species == "setosa", Sepal.Length + 10, Sepal.Length + 0))
df_ed = df_ed %>% group_by(Species) %>% mutate(Sepal.Length = ifelse(Species == "versicolor", Sepal.Length - 3, Sepal.Length + 0))
glimpse(df_ed)
m_test =
  lmer(Sepal.Width ~ Sepal.Length * Species +
         (1 | Petal.Width),
       data = df_ed, REML = TRUE)
summary(m_test)
test_plot = ggpredict(m_test, c("Sepal.Length", "Species"), type = "re") %>%
  plot(rawdata = TRUE, dot.alpha = 0.6, facet = TRUE, alpha = 0.3)
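One way to get 'predict only to the max. and min. of the data for that group' is to build the prediction grid per group yourself and pass it to predict(). Here is a minimal sketch in base R, assuming a plain lm() on the unmodified iris data as a stand-in for the mixed model; fit_lm and grid are names made up for the illustration.

```r
# Stand-in model (assumption: lm() instead of lmer(), unmodified iris)
fit_lm <- lm(Sepal.Width ~ Sepal.Length * Species, data = iris)

# For each species, build a grid that spans only that species' observed range
grid <- do.call(rbind, lapply(split(iris, iris$Species), function(d) {
  data.frame(
    Species = d$Species[1],  # factor value, keeps all three levels
    Sepal.Length = seq(min(d$Sepal.Length), max(d$Sepal.Length),
                       length.out = 25)
  )
}))

# Predictions exist only inside each group's own range
grid$pred <- predict(fit_lm, newdata = grid)
```

The resulting grid can be drawn with geom_line() and facet_wrap(~ Species) on top of the raw points, so no facet shows predictions outside its data. For a merMod fit, the same grid could be passed to predict() with re.form = NA to get population-level predictions.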
As per the OP's comment, I think this will provide a solution. In this example, I use data from the sleepstudy dataset that comes with the lme4 package. First, we have to postulate a mixed model, which I generically call fit.
Note that I do not perform any hypothesis test to formally select an appropriate random-effects structure. Of course, this is essential to adequately capture the correlations in the repeated measurements, but falls outside the scope of this post.
library(lme4)
library(splines)
# quantiles of Days
quantile(sleepstudy$Days, c(0.05, 0.95))
# 5% 95%
# 0 9
# mixed model
fit <- lmer(Reaction ~ ns(Days, df = 2, Boundary.knots = c(0, 9)) +
                (Days | Subject), data = sleepstudy)
# new data.frame for prediction
ND <- with(sleepstudy, expand.grid(Days = seq(0L, 9L, length.out = 50)))
Then, we need a function that returns predictions from fit for given values of the covariates. The function effectPlot_lmer() takes the following arguments:
object: the fitted merMod object (the mixed model).
ND: the new data.frame, which specifies the values of the covariates for which we want to obtain predictions.
orig_data: the data.frame on which the mixed model was fitted.
# function to obtain predicted reaction times
effectPlot_lmer <- function (object, ND, orig_data) {
    # fixed-effects formula, coefficients, and their covariance matrix
    form <- formula(object, fixed.only = TRUE)
    namesVars <- all.vars(form)
    betas <- fixef(object)
    V <- vcov(object)
    # rebuild the fixed-effects design matrix for the new data
    orig_data <- orig_data[complete.cases(orig_data[namesVars]), ]
    Terms <- delete.response(terms(form))
    mfX <- model.frame(Terms, data = orig_data)
    Terms_new <- attr(mfX, "terms")
    mfX_new <- model.frame(Terms_new, ND, xlev = .getXlevels(Terms, mfX))
    X <- model.matrix(Terms_new, mfX_new)
    # predictions with pointwise 95% Wald confidence limits
    pred <- c(X %*% betas)
    ses <- sqrt(diag(X %*% V %*% t(X)))
    ND$pred <- pred
    ND$low <- pred - 1.96 * ses
    ND$upp <- pred + 1.96 * ses
    return(ND)
}
Finally, we can make an effect plot with ggplot.
# effect plot
library(ggplot2)
ggplot(effectPlot_lmer(fit, ND, orig_data = sleepstudy),
       aes(x = Days, y = pred)) +
    geom_line(size = 1.2, colour = 'blue4') +
    geom_ribbon(aes(ymin = low, ymax = upp), colour = NA,
                fill = adjustcolor('blue4', 0.2)) +
    theme_bw() + ylab('Expected Reaction (ms)')
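The core of effectPlot_lmer() is the pair of lines that compute X %*% betas and sqrt(diag(X %*% V %*% t(X))). As a sanity check of that formula, here is a minimal sketch on an ordinary lm() fit to the built-in cars data, where the result can be compared with predict(..., se.fit = TRUE). The names fit_lm, nd and ref are made up for the illustration, and lm() is only a stand-in for the mixed model.

```r
# Simple linear model on a built-in dataset (stand-in for the merMod fit)
fit_lm <- lm(dist ~ speed, data = cars)

# New data and the matching design matrix
nd <- data.frame(speed = seq(5, 25, by = 5))
X <- model.matrix(~ speed, data = nd)

# Predictions and standard errors from the coefficients and their covariance
pred <- c(X %*% coef(fit_lm))
ses <- unname(sqrt(diag(X %*% vcov(fit_lm) %*% t(X))))

# Reference values from predict()
ref <- predict(fit_lm, newdata = nd, se.fit = TRUE)

# Both comparisons should report TRUE
all.equal(pred, unname(ref$fit))
all.equal(ses, unname(ref$se.fit))
```

For the mixed model, the same calculation uses fixef() and vcov() of the merMod object, so the band reflects uncertainty in the fixed effects only.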
I have the following data and created a model with the package glmmTMB in R for plant diameters ~ plant density (number of plants) with a random plot effect:
library(glmmTMB)
d <- data.frame(diameter = c(17,16,15,13,11, 19,17,15,11,11, 19,15,14,11,8),
                plant_density = c(1000,2000,3000,4000,5000, 1000,2000,3000,4000,5000, 1000,2000,3000,4000,5000),
                plot = c(1,1,1,1,1, 2,2,2,2,2, 3,3,3,3,3))
glmm.model <- glmmTMB(diameter ~ plant_density + (1|plot),
                      data = d,
                      na.action = na.omit,
                      family = "gaussian",
                      ziformula = ~ 0)
My intention was to create a plot of predicted diameter against plant density that accounts for the random plot effect. So I tried to predict the data:
new.dat <- data.frame(diameter = d$diameter,
                      plant_density = d$plant_density,
                      plot = d$plot)
# the argument is called newdata; a misspelled name is not matched to it
new.dat$prediction <- predict(glmm.model, newdata = new.dat,
                              type = "response", re.form = NA)
Unfortunately, I get an output for every plot, but I wanted a generalized prediction for diameter ~ plant density.
My goal is to create a plot like here, but with a regression model from glmmTMB that accounts for the random effect.
Thanks for your help!
The ggeffects package makes this type of thing very easy to implement and customize.
For example
library('ggplot2')
library('glmmTMB')
library('ggeffects')
d <- data.frame(diameter = c(17,16,15,13,11, 19,17,15,11,11, 19,15,14,11,8),
                plant_density = c(1000,2000,3000,4000,5000, 1000,2000,3000,4000,5000, 1000,2000,3000,4000,5000),
                plotx = as.factor(c(1,1,1,1,1, 2,2,2,2,2, 3,3,3,3,3)))
glmm.model <- glmmTMB(diameter ~ plant_density + (1|plotx),
                      data = d,
                      family = "gaussian")
# basically what you're looking for
plot(ggpredict(glmm.model, terms = "plant_density"))
# additionally, with changed limits on the y-axis
plot(ggpredict(glmm.model, terms = "plant_density")) +
  scale_y_continuous(limits = c(0, 20))
You can do whatever you'd like with it from there: change colors, themes, or scales. The package has some nice vignettes as well.
From the ggsurvplot documentation, I can facet the resulting ggplot object as below.
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Facet ggsurvplot() output by
# a combination of factors
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Fit (complex) survival curves
#++++++++++++++++++++++++++++++++++++
library("survival")
library("survminer")  # provides ggsurvplot() and ggsurvplot_df()
fit3 <- survfit( Surv(time, status) ~ sex + rx + adhere,
                 data = colon )
# Visualize: plot survival curves by sex and facet by rx and adhere
#++++++++++++++++++++++++++++++++++++
ggsurv <- ggsurvplot(fit3, conf.int = TRUE)
ggsurv$plot +theme_bw() + facet_grid(rx ~ adhere)
Now, if I use survreg to fit a Weibull model, I have to use ggsurvplot_df as below.
The challenge is the inconsistency between ggsurvplot and ggsurvplot_df, even though both are built on top of ggplot.
Is there a way to facet the ggsurvplot_df output the same way as the ggsurvplot output?
# Weibull model
wbmod <- survreg(Surv(time, status) ~ sex + rx + adhere,
                 data = colon)
summary(colon)
# Imaginary patients
newdat <- expand.grid(
  rx = levels(colon$rx),
  adhere = unique(colon$adhere),
  sex = unique(colon$sex))
newdat
# Compute survival curves
surv <- seq(.99, .01, by = -.01)
t <- predict(wbmod, type = 'quantile', p = 1-surv,
             newdata = newdat)
# How many rows and columns does t have?
dim(t)
# Use cbind() to combine the information in newdat with t
surv_wbmod_wide <- cbind(newdat, t)
# Use melt() to bring the data.frame to long format
library(reshape2)
surv_wbmod <- melt(surv_wbmod_wide, id.vars = c('rx','adhere', 'sex'), variable.name = 'surv_id', value.name = 'time')
dim(surv_wbmod)
# Use surv_wbmod$surv_id to add the correct survival probabilities surv
surv_wbmod$surv <- surv[as.numeric(surv_wbmod$surv_id)]
# Add columns upper, lower, std.err, and strata to the data.frame
surv_wbmod[, c("upper", "lower", "std.err", "strata")] <- NA
# Take a look at the structure of the object
str(surv_wbmod)
# Plot the survival curves
ggsurvplot_df(surv_wbmod, surv.geom = geom_line, linetype = 'rx', color = 'adhere', legend.title = NULL)
Do you mean something like this?
If this is what you need, then the ggsurvplot_df output was not stored as an object, so no faceting was possible.
# Plot the survival curves
ggsurvplot_df <- ggsurvplot(surv_wbmod, surv.geom = geom_line, linetype = 'rx', color = 'adhere', legend.title = NULL)
ggsurvplot_df + theme_bw() + facet_grid(rx ~ adhere)
I am trying to plot a zero-inflated model and a zero-inflated mixed model with ggplot2, without success. Here is what I tried:
#Packages
library(pscl)
library(glmmTMB)
library(ggplot2)
library(gridExtra)
# Artificial data set
set.seed(007)
n <- 100 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 5 # maximum follow-up time
DF <- data.frame(id = rep(seq_len(n), each = K),
time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))
DF$y <- rnbinom(n * K, size = 2, mu = exp(1.552966))
str(DF)
Using a zero-inflated Poisson model with the pscl package:
time2<-(DF$time)^2
mZIP <- zeroinfl(y~time+time2+sex|time+sex, data=DF)
summary(mZIP)
Assuming that all coefficients are significant:
# Y estimated
pred.data1 = data.frame(
time<-DF$time,
time2<-(DF$time)^2,
sex<-DF$sex)
pred.data1$y = predict(mZIP, newdata=pred.data1, type="response")
Now using a zero-inflated Poisson mixed model with the glmmTMB package:
mZIPmix<- glmmTMB(y~time+time2+sex+(1|id),
data=DF, ziformula=~1,family=poisson)
summary(mZIPmix)
#
# new Y estimated
pred.data2 = data.frame(
time<-DF$time,
time2<-(DF$time)^2,
sex<-DF$sex,
id<-DF$id)
pred.data2$y = predict(mZIPmix, newdata=pred.data2, type="response")
Plot the zero-inflated Poisson model and the zero-inflated Poisson mixed model:
par(mfrow=c(1,2))
plot1<-ggplot(DF, aes(time, y, colour=sex)) +
labs(title="Zero inflated model") +
geom_point() +
geom_line(data=pred.data1) +
stat_smooth(method="glm", family=poisson(link="log"), formula = y~poly(x,2),fullrange=TRUE)
plot2<-ggplot(DF, aes(time, y, colour=sex)) +
labs(title="Zero inflated mixed model") +
geom_point() +
geom_line(data=pred.data2) +
stat_smooth(method="glm", family=poisson(link="log"), formula = y~poly(x,2),fullrange=TRUE) ## here I can't find any method for a mixed GLM
grid.arrange(plot1, plot2, ncol=2)
This doesn't work, of course. Is it possible to do this with ggplot2?
Thanks in advance
I'm not sure, but it looks to me that you're looking for marginal effects. You can do this with the ggeffects-package. Here are two examples, using your simulated data, that create a ggplot-object, one with and one w/o raw data.
library(glmmTMB)
library(ggeffects)
library(magrittr)  # provides %>% in case it is not already attached
mZIPmix<- glmmTMB(y~poly(time,2)+sex+(1|id), data=DF, ziformula=~1,family=poisson)
# compute marginal effects and create a plot.
# the tag "[all]" is useful for polynomial terms, to produce smoother plots
ggpredict(mZIPmix, c("time [all]", "sex")) %>% plot(rawdata = TRUE, jitter = .01)
ggpredict(mZIPmix, c("time [all]", "sex")) %>% plot(rawdata = FALSE)
Created on 2019-05-16 by the reprex package (v0.2.1)
Note that sex only has an "additive" effect. Maybe you want to model an interaction between time and sex?
mZIPmix<- glmmTMB(y~poly(time,2)*sex+(1|id), data=DF, ziformula=~1,family=poisson)
ggpredict(mZIPmix, c("time [all]", "sex")) %>% plot(rawdata = TRUE, jitter = .01)
ggpredict(mZIPmix, c("time [all]", "sex")) %>% plot()
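If you prefer to build the plotting data by hand, the idea behind such a plot is to predict over a grid of time values for each level of sex. Below is a minimal sketch in base R, assuming freshly simulated data and an ordinary Poisson glm() as a stand-in (no zero inflation, no random effect); DF2, fit_pois and grid are names made up for the illustration.

```r
# Simulated stand-in data (assumption: not the DF from the question)
set.seed(7)
DF2 <- data.frame(
  time = runif(400, 0, 5),
  sex = gl(2, 200, labels = c("male", "female"))
)
DF2$y <- rpois(400, exp(1 + 0.3 * DF2$time - 0.05 * DF2$time^2))

# Poisson GLM with a quadratic time trend that may differ by sex
fit_pois <- glm(y ~ poly(time, 2) * sex, family = poisson, data = DF2)

# Grid: 50 time points for each sex
grid <- expand.grid(time = seq(0, 5, length.out = 50),
                    sex = levels(DF2$sex))

# Expected counts on the response scale
grid$pred <- predict(fit_pois, newdata = grid, type = "response")
```

The grid can then go into ggplot(grid, aes(time, pred, colour = sex)) + geom_line(), with the raw points added from the original data.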