Overlay 2 allEffects graphs - r

I have the following model
require(effects)
fit<-lme(x ~ y, data, random= ~1|item)
plot(allEffects(fit)
fit2<-lme(x ~ y, data2, random = ~1|item)
plot(allEffects(fit2)
How can I plot fit and fit2 overlaying? I have tried the par(new=T), but it does not work. The graphs plot fine individually.

I'm not sure there's a very nice way to do this. I usually extract the information from the effects structure and plot it with ggplot (lattice would be possible too).
Here's an example:
library(effects)
library(nlme)
library(plyr) ## utilities
Fit a model to the first and second half of one of the standard example data sets:
fm1 <- lme(distance ~ age, random = ~1|Subject,
data = Orthodont[1:54,])
fm2 <- update(fm1, data = Orthodont[55:108,])
a1 <- allEffects(fm1)
a2 <- allEffects(fm2)
Extract the information from the efflist object. This is the part that isn't completely general ... the hard part is getting out the predictor variable.
as.data.frame.efflist <- function(x) {
ldply(x,
function(z) {
r <- with(z,data.frame(fit,
var=variables[[1]]$levels,
lower,upper))
return(plyr::rename(r,setNames(z$variables[[1]]$name,"var")))
})
}
For convenience, use ldply to put the results of both models together:
comb <- ldply(list(fm1=a1,fm2=a2),as.data.frame,.id="model")
Now plot:
library(ggplot2); theme_set(theme_bw())
ggplot(comb,aes(age,fit,
ymin=lower,ymax=upper,
colour=model,fill=model))+
geom_line()+
geom_ribbon(alpha=0.2,colour=NA)+
geom_rug(sides="b")
The rug plot component is a little silly here.

Related

plotting an interaction term in moderated regression using MICE imputation

I'm using imputed data to test a series of regression models, including some moderation models.
Imputation
imp_data <- mice(data,m=20,maxit=20,meth='cart',seed=12345)
I then convert this to long format so I can recode / sum variables as needed, beore turning back to mids format
impdatlong_mids<-as.mids(impdat_long)
Example model:
model1 <- with(impdatlong_mids,
lm(Outcome ~ p1_sex + p2 + p3 + p4
+ p5+ p6+ p7+ p8+ p9+ p10
+ p11+ p1_sex*p12+ p1_sex*p13 + p14)
in non-imputed data, to create a graphic representation of the significant ineraction, I'd use (e.g.)
interact_plot (model=model1, pred = p1_sex, modx = p12)
This doesn't work with imputed data / mids objects.
Has anyone plotted an interaction using imputed data, and able to help or share examples?
Thanks
EDIT: Reproducible example
library(tidyverse)
library(interactions)
library(mice)
# library(reprex) does not work with this
set.seed(42)
options(warn=-1)
#---------------------------------------#
# Data preparations
# loading an editing data
d <- mtcars
d <- d %>% mutate_at(c('cyl','am'),factor)
# create missing data and impute it
mi_d <- d
nr_of_NAs <- 30
for (i in 1:nr_of_NAs) {
mi_d[sample(nrow(mi_d),1),sample(ncol(mi_d),1)] <- NA
}
mi_d <- mice(mi_d, m=2, maxit=2)
#---------------------------------------#
# regressions
#not imputed
lm_d <- lm(qsec ~ cyl*am + mpg*disp, data=d)
#imputed dataset
lm_mi <- with(mi_d,lm(qsec ~ cyl*am + mpg*disp))
lm_mi_pool <- pool(lm_mi)
#---------------------------------------#
# interaction plots
# not imputed
#continuous
interactions::interact_plot(lm_d, pred=mpg,modx=disp, interval=T,int.width=0.3)
#categorical
interactions::cat_plot(lm_d, pred = cyl, modx = am)
#---------------------------------------#
# interaction plots
# imputed
#continuous
interactions::interact_plot(lm_mi_pool, pred=mpg,modx=disp, interval=T,int.width=0.3)
# Error in model.frame.default(model) : object is not a matrix
#categorical
interactions::cat_plot(lm_mi_pool, pred = cyl, modx = am)
# Error in model.frame.default(model) : object is not a matrix
The problem seems to be that neither interact_plot, cat_plot or any other available package allows for (at least categorical) interaction plotting with objects of class mipo or pooled regression outputs.
I am using the walking data from the mice package as an example. One way to get the interaction plot (well version of one type of interaction plot) is to use the gtsummary package. Under the hood it will take the model1 use pool() from mice to average over the models and then use a combo of tbl_regression() and plot() to output a plot of the coefficients in the model. The tbl_regression() function is what is calling the pool() function.
library(mice)
library(dplyr)
library(gtsummary)
imp_data <- mice(mice::walking,m=20,maxit=20,meth='cart',seed=12345)
model1 <- with(imp_data,
lm(age ~ sex*YA))
model1 %>%
tbl_regression() %>%
plot()
The package emmeans allows you to extract interaction effects from a mira object. Here is a gentle introduction. After that, the interactions can be plotted with appropriate ggplot. This example is for the categorical variables but could be extended to the continous case - after the emmeans part things get relatively straighforward.
library(ggplot2)
library(ggstance)
library(emmeans)
library(khroma)
library(jtools)
lm_mi <- with(mi_d,lm(qsec ~ gear*carb))
#extracting interaction effects
emcatcat <- emmeans(lm_mi, ~gear*carb)
tidy <- as_tibble(emcatcat)
#plotting
pd <- position_dodge(0.5)
ggplot(tidy, aes(y=gear, x=emmean, colour=carb)) +
geom_linerangeh(aes(xmin=lower.CL, xmax=upper.CL), position=pd,size = 2) +
geom_point(position=pd,size = 4)+
ggtitle('Interactions') +
labs (x = "aggreageted interaction effect") +
scale_color_bright() +
theme_nice()
this can be extended to a three-way interaction plot with facet_grid as long as you have a third categorical interaction term.

Plotting the predictions of a mixed model as a line in R

I'm trying to plot the predictions (predict()) of my mixed model below such that I can obtain my conceptually desired plot as a line below.
I have tried to plot my model's predictions, but I don't achieve my desired plot. Is there a better way to define predict() so I can achieve my desired plot?
library(lme4)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
newdata <- with(dat3, expand.grid(pc1=unique(pc1), pc2=unique(pc2), discon=unique(discon)))
y <- predict(m4, newdata=newdata, re.form=NA)
plot(newdata$pc1+newdata$pc2, y)
More sjPlot. See the parameter grid to wrap several predictors in one plot.
library(lme4)
library(sjPlot)
library(patchwork)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3) # Does not converge
m4 <- lmer(math~pc1+pc2+discon+(1|id), data=dat3) # Converges
# To remove discon
a <- plot_model(m4,type = 'pred')[[1]]
b <- plot_model(m4,type = 'pred',title = '')[[2]]
a + b
Edit 1: I had some trouble removing the dropcon term within the sjPlot framework. I gave up and fell back on patchwork. I'm sure Daniel could knows the correct way.
As Magnus Nordmo suggest, this is very simple with sjPlot which has some predefined functions for these types of plot.
library(lme4)
dat3 <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/dat3.csv')
m4 <- lmer(math~pc1+pc2+discon+(pc1+pc2+discon|id), data=dat3)
plot_model(m4, type = 'pred', terms = c('pc1', 'pc2'),
ci.lvl = 0)
which gives the following result.
This plot is designed to include different quantiles of the second term in terms over the axes of pc1 and pred. You could split up these plots and combine them using patchwork and the interval can be changed by using square brackets after the term in terms (eg pc1 [-10:1] for interval between -10 and 1).

Problem combining ranef plots of two models with grid.arrange()

I have two LMEs:
lme1 <- lmer(F1 ~ (phoneme|individual) + (1|word) + frequency,
data = nurse_female)
lme2 <- lmer(F2 ~ (phoneme|individual) + (1|word) +
frequency + age + (1|zduration),
data = nurse_female)
I created simple dotplots dotplot(ranef(lme1)) of the random effects which creates a plot for each random predictor. I am however only interested in the phoneme|individual one which looks like this:
Normally I would use grid.arrange() but I can't get it to only select the phoneme|individual plots. Do you know a way to do this?
(A reproducible example would be useful, I hope this example does what you want ...). I think the key here is to recognize that the dotplot.ranef.mer method returns a list of plots:
library(lme4)
fm1 <- lmer(angle ~ (1|recipe) + (1|recipe:replicate), cake, REML= FALSE)
dd <- dotplot(ranef(fm1))
length(dd) ## 2
They're not necessarily in the same order as in the formula:
names(dd) ## [1] "recipe:replicate" "recipe"
print(dd[["recipe"]])
print(dd[["recipe:replicate"]])
So you would want something like
f <- function(m) dotplot(ranef(m))[["individual"]]
gridExtra::grid.arrange(f(lme1),f(lme2))

Add raw data points to jp.int (sjPlot)

For my manuscript, I plotted a lme with an interaction of two continuous variables:
Create data
mydata <- data.frame( SID=sample(1:150,400,replace=TRUE),age=sample(50:70,400,replace=TRUE), sex=sample(c("Male","Female"),200, replace=TRUE),time= seq(0.7, 6.2, length.out=400), Vol =rnorm(400),HCD =rnorm(400))
mydata$time <- as.numeric(mydata$time)
Run the model:
model <- lme(HCD ~ age*time+sex*time+Vol*time, random=~time|SID, data=mydata)
Make plot:
sjp.int(model, swap.pred=T, show.ci=T, mdrt.values="meansd")
The reviewer now wants me to add the raw data points to this plot. How can I do this? I tried adding geom_point() referring to mydata, but that is not possible.
Any ideas?
Update:
I thought that maybe I could extract the random slope of HCD and then residuals HCD for the covariates and also residuals Vol for the covariates and plot those two to make things easier (then I could plot the points in a 2D plot).
So, I tried to extract the slopes and use these to fit a linear regression, but the results are different (in the reproducible example less significant, but in my data: the interaction became non-significant (and was significant in the lme)). Not sure what that means or whether this just shows that I should not try to plot it this way.
get the slopes:
model <- lme(HCD ~ time, random=~time|SID, data=mydata)
slopes <- rbind(row.names(model$coefficients$random$SID), model$coef$random$SID[,2])
slopes2 <- data.frame(matrix(unlist(slopes), nrow=144, byrow=T))
names(slopes2)[1] <- "SID"
names(slopes2)[2] <- "slopes"
(save the slopes2 and reopen, because somehow R sees it as a factor)
Then create a cross-sectional dataframe and merge the slopes:
mydata$time2 <- round(mydata$time)
new <- reshape(mydata,idvar = "SID", timevar="time2", direction="wide")
newdata <- dplyr::left_join(new, slop, by="SID")
The lm:
modelw <- lm(slop$slopes ~ age.1+sex.1+Vol.1, data=newdata)
Vol now has a p-value of 0.8 (previously this was 0.14)

Using ggplot2 to plot an already-existing linear model

Let's say that I have some data and I have created a linear model to fit the data. Then I plot the data using ggplot2 and I want to add the linear model to the plot. As far as I know, this is the standard way of doing it (using the built-in cars dataset):
library(ggplot2)
fit <- lm(dist ~ speed, data = cars)
summary(fit)
p <- ggplot(cars, aes(speed, dist))
p <- p + geom_point()
p <- p + geom_smooth(method='lm')
p
However, the above violates the DRY principle ('don't repeat yourself'): it involves creating the linear model in the call to lm and then recreating it in the call to geom_smooth. This seems inelegant to me, and it also introduces a space for bugs. For example, if I change the model that is created with lm but forget to change the model that is created with geom_smooth, then the summary and the plot won't be of the same model.
Is there a way of using ggplot2 to plot an already existing linear model, e.g. by passing the lm object itself to the geom_smooth function?
What one needs to do is to create a new data frame with the observations from the old one plus the predicted values from the model, then plot that dataframe using ggplot2.
library(ggplot2)
# create and summarise model
cars.model <- lm(dist ~ speed, data = cars)
summary(cars.model)
# add 'fit', 'lwr', and 'upr' columns to dataframe (generated by predict)
cars.predict <- cbind(cars, predict(cars.model, interval = 'confidence'))
# plot the points (actual observations), regression line, and confidence interval
p <- ggplot(cars.predict, aes(speed,dist))
p <- p + geom_point()
p <- p + geom_line(aes(speed, fit))
p <- p + geom_ribbon(aes(ymin=lwr,ymax=upr), alpha=0.3)
p
The great advantage of doing this is that if one changes the model (e.g. cars.model <- lm(dist ~ poly(speed, 2), data = cars)) then the plot and the summary will both change.
Thanks to Plamen Petrov for making me realise what was needed here. As he points out, this approach will only work if predict is defined for the model in question; if not, one has to define it oneself.
I believe you want to do something along the lines of :
library(ggplot2)
# install.packages('dplyr')
library(dplyr)
fit <- lm(dist ~ speed, data = cars)
cars %>%
mutate( my_model = predict(fit) ) %>%
ggplot() +
geom_point( aes(speed, dist) ) +
geom_line( aes(speed, my_model) )
This will also work for more complex models as long as the corresponding predict method is defined. Otherwise you will need to define it yourself.
In the case of linear model you can add the confidence/prediction bands with slightly more work and reproduce your plot.

Resources