Extract random formula from nlme objects - r

I'm trying to extract the random structure from models constructed using lme, but I can't seem to get anything other than the fixed formula. E.g.,
library(nlme)
fm1 <- lme(distance ~ age, Orthodont, random = ~ age | Subject)
deparse(terms(fm1))
# "distance ~ age"
This is possible for lmer using findbars():
library(lmerTest)
fm2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
findbars(formula(fm2))
# [[1]]
# Days | Subject
I want to be able to extract:
# ~ age | Subject
# (Days | Subject)
I could potentially get at this with regexpr, but I'd like a solution that also handles more complex structures (multiple random slopes, nested random effects, additive combinations of random terms, etc.). Thanks!

You can access both parts from the model's stored call:
fm1$call$fixed
# distance ~ age
fm1$call$random
# ~age | Subject
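If you want a single accessor that covers both model classes, something along these lines could work (a sketch; get_random is a made-up helper, not part of either package). Note that the lme branch returns the unevaluated expression stored in the call, so if random was supplied as a variable rather than a literal formula, you would get back a symbol and need to eval() it.

```r
library(nlme)

# Hypothetical helper: for lme fits, read the unevaluated `random`
# argument back out of the stored call; for merMod fits, delegate to
# lme4::findbars() (requires lme4 to be installed).
get_random <- function(object) {
  if (inherits(object, "lme")) {
    getCall(object)$random
  } else if (inherits(object, "merMod")) {
    lme4::findbars(formula(object))
  } else {
    stop("unsupported model class")
  }
}

fm1 <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)
get_random(fm1)
# ~age | Subject
```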

Related

How to pass a named list to the dots (`...`) argument of a function (specifically `anova`) in R? (alternatives to `do.call`)

I have a list of mixed models (output of lme4::lmer) that I want to pass to anova, whose signature is anova(object, ...), so I do
models_list <- list("lmm1" = lmm1, "lmm2" = lmm2, "lmm3" = lmm3, "lmm4" = lmm4, "lmm5" = lmm5)
do.call(anova, c(models_list[[1]], models_list[-1]))
Warning in anova.merMod(new("lmerMod", resp = new("lmerResp", .xData = <environment>), :
failed to find model names, assigning generic names
I get the result, but with the generic names mentioned in the warning, i.e. the same result as if models_list were unnamed. I also asked on github (https://github.com/lme4/lme4/issues/612), but it seems I won't be able to solve this with do.call. Is there some other way?
Reproducible example
library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm2 <- lmer(Reaction ~ Days + (Days || Subject), sleepstudy)
anova(fm1,fm2)
refitting model(s) with ML (instead of REML)
Data: sleepstudy
Models:
fm2: Reaction ~ Days + ((1 | Subject) + (0 + Days | Subject))
fm1: Reaction ~ Days + (Days | Subject)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
fm2 5 1762.0 1778.0 -876.00 1752.0
fm1 6 1763.9 1783.1 -875.97 1751.9 0.0639 1 0.8004
# so I can see fm1 and fm2, i.e. which model corresponds to each line, but
models_list <- list("fm1" = fm1, "fm2" = fm2)
do.call(anova, c(models_list[[1]], models_list[-1]))
Warning in anova.merMod(new("lmerMod", resp = new("lmerResp", .xData = <environment>), :
failed to find model names, assigning generic names
refitting model(s) with ML (instead of REML)
Data: sleepstudy
Models:
MODEL2: Reaction ~ Days + ((1 | Subject) + (0 + Days | Subject))
MODEL1: Reaction ~ Days + (Days | Subject)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
MODEL2 5 1762.0 1778.0 -876.00 1752.0
MODEL1 6 1763.9 1783.1 -875.97 1751.9 0.0639 1 0.8004
so the model names fm1 and fm2 were replaced with MODEL2 and MODEL1;
this is a problem when the names carry meaning, e.g. when models are numbered with possibly non-consecutive indices.
I have checked possible questions of which these would be a kind of duplicate, as
How to pass a list to a function in R?
Pass partial list of arguments to do.call()
How to pass extra argument to the function argument of do.call in R
but have not found any satisfactory answer.
Thank you!
anova.merMod uses non-standard evaluation (NSE) to get the model names. As is often the case, NSE is more trouble than it's worth. Here is a solution:
eval(
  do.call(
    call,
    c(list("anova"),
      lapply(names(models_list), as.symbol)),
    quote = TRUE),
  models_list)
#refitting model(s) with ML (instead of REML)
#Data: sleepstudy
#Models:
#fm2: Reaction ~ Days + ((1 | Subject) + (0 + Days | Subject))
#fm1: Reaction ~ Days + (Days | Subject)
# npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
#fm2 5 1762.0 1778.0 -876.00 1752.0
#fm1 6 1763.9 1783.1 -875.97 1751.9 0.0639 1 0.8004
The solution builds a call to anova whose arguments are the (quoted) model names as symbols. Because call takes its arguments individually, we assemble them from a list using do.call with quote = TRUE. Evaluating the resulting call with models_list as the environment then binds each symbol to its model, so anova's deparsing sees the names you chose.
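The mechanics can be seen without lme4 at all. In this base-R sketch, show_names is a made-up stand-in for anova's name-deparsing: the constructed call contains the symbols a and b, and evaluating it with the named list as the environment supplies their values.

```r
# Stand-in for anova's NSE: deparse the argument expressions received.
show_names <- function(...) vapply(substitute(list(...))[-1], deparse, "")

vals <- list(a = 1, b = 2)

# Build the call show_names(a, b) from the list's names ...
cl <- do.call(call, c(list("show_names"), lapply(names(vals), as.symbol)),
              quote = TRUE)

# ... and evaluate it with `vals` supplying the bindings for a and b.
eval(cl, vals)
# [1] "a" "b"
```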
I would try to avoid having this in production code because it's quite complex and therefore difficult to maintain.

Residual modeling for mixed models: Any other package than nlme?

Aside from R function nlme::lme(), I'm wondering how else I can model the Level-1 residual variance-covariance structure?
ps. My search showed I could possibly use glmmTMB package but it seems it is not about Level-1 residuals but random-effects themselves (see below code).
glmmTMB::glmmTMB(y ~ times + ar1(times | subjects), data = data) ## DON'T RUN
nlme::lme (y ~ times, random = ~ times | subjects,
correlation = corAR1(), data = data) ## DON'T RUN
glmmTMB can effectively be used to model level-1 residuals by adding an observation-level random effect to the model (and, if necessary, suppressing the level-1 variance via dispformula ~ 0). For example, comparing the same fit in lme and glmmTMB:
library(glmmTMB)
library(nlme)
data("sleepstudy" ,package="lme4")
ss <- sleepstudy
ss$times <- factor(ss$Days) ## needed for glmmTMB
I initially tried with random = ~Days|Subject but neither lme nor glmmTMB were happy (overfitted):
lme1 <- lme(Reaction ~ Days, random = ~ 1 | Subject,
            correlation = corAR1(form = ~ Days | Subject), data = ss)
m1 <- glmmTMB(Reaction ~ Days + (1 | Subject) +
                ar1(times + 0 | Subject),
              dispformula = ~ 0,
              data = ss,
              REML = TRUE,
              start = list(theta = c(4, 4, 1)))
Unfortunately, in order to get a good answer with glmmTMB I did have to tweak the starting values ...
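Assuming both fits above converge, one sanity check is to compare the log-likelihoods and the estimated AR1 parameter from each fit (a sketch; the accessors print in different formats, so this is an eyeball comparison rather than an automated one):

```r
# Log-likelihoods should agree closely if the two parameterizations
# describe the same model.
logLik(lme1)
logLik(m1)

# AR1 correlation as estimated by lme ...
coef(lme1$modelStruct$corStruct, unconstrained = FALSE)

# ... and the covariance parameters (including the AR1 correlation of the
# times term) as estimated by glmmTMB.
VarCorr(m1)
```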

What is "+0" in a simple mixed model code (R)?

I'm trying to understand a difference I encountered in mixed model analyses using the package lme4. Maybe my statistical background is not sharp enough, but I just can't figure out what the "+0" in the code means and how it changes the results relative to a model without +0.
Here my example with the +0:
lmer(Yield ~ Treatment + 0 + (1|Batch) + (1|Irrigation), data = D)
in contrast to:
lmer(Yield ~ Treatment + (1|Batch) + (1|Irrigation), data = D)
Does anyone have a smart explanation for what the +0 is and what it does to the results?
Models with + 0 usually mean "without an overall intercept" (in the fixed effects). By default, models include an intercept; you can also make that explicit using + 1.
Most discussions of regression modelling will recommend including an intercept, unless there's good reason to believe the outcome will be 0 when the predictors are all zero (maybe true of some physical processes?).
Compare:
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm2 <- lmer(Reaction ~ Days + 0 + (Days | Subject), sleepstudy)
summary(fm1)
summary(fm2)
paying attention to the fixed effects
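The reparameterization is easiest to see with a plain lm and a factor predictor (a minimal base-R sketch with toy data, unrelated to sleepstudy): without the intercept, R reports one mean per factor level instead of a baseline level plus differences. The fitted values are identical; only the parameterization changes.

```r
# Toy data: two observations per group.
d <- data.frame(y = c(1, 2, 3, 4), g = factor(c("a", "a", "b", "b")))

coef(lm(y ~ g, data = d))      # (Intercept) = 1.5, gb = 2 (difference from "a")
coef(lm(y ~ g + 0, data = d))  # ga = 1.5, gb = 3.5 (one mean per level)
```

With a numeric predictor (as in the lmer example with Days), + 0 instead forces the fitted line through the origin, which genuinely changes the fit.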

Bacteria on fingers. Syntax for crossed random effects with random slopes but not intercepts in MASS::glmmPQL

I have non-normal data (bacteria on fingers after touching surfaces with and without gloves), so I'm using glmmPQL from the MASS package. I have one categorical predictor (Gloves), a repeated-measures variable (NumberContacts), and Participants who did the experiment both gloved and ungloved, so Participant is crossed with Gloves. I'd like to use Participant as a random effect with a random slope but no random intercept (participants start with 0 bacteria). I can't figure out the syntax for a random slope without a random intercept. Could you show me how to do this, please?
So far I have:
require(MASS)
PQL <- glmmPQL(bacteria ~ Gloves + NumberContacts, random = ~ 1 | Participant,
               family = gaussian(link = "log"),
               # weights = varIdent(form = ~ 1 | NumberContacts),
               # correlation = corAR1(NumberContacts),
               data = na.omit(Ksub),
               verbose = FALSE)
[Figures: bacteria on fingers after each contact; density plots of bacteria on fingers after each contact]
See https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#model-specification, which notes that (0+x|group) or (-1+x|group) specifies "random slope of x within group: no variation in intercept."
The model specifications in the example below are equivalent:
library(MASS)
library(lme4)
fm1 <- lmer(Reaction ~ Days + (0 + Days | Subject), sleepstudy)
fm2 <- glmmPQL(Reaction ~ Days, random = ~ 0 + Days | Subject,
family = gaussian, data = sleepstudy)

drop1 function for lmer

I constructed a mixed effect model with three fixed effects and one random effect.
mdl1 <- lmer(yld.res ~ veg + rep + rip + (1|state),
REML=FALSE,data=data2)
I want to get the most parsimonious model from the above model. To do this, I want to drop one independent variable at a time and see if it improved the fit of the model (by looking at the AICc value). But when I use drop1, it gives me the following error:
drop1(mdl1, test="F")
Error in match.arg(test) : 'arg' should be one of “none”, “Chisq”, “user”
I am not really sure how to go about this and would really appreciate any help.
If you just use drop1() with the default test="none" it will give you the AIC values corresponding to the model with each fixed effect dropped in turn.
Here's a slightly silly example (it probably doesn't make sense to test the model with a quadratic but no linear term):
library('lme4')
fm1 <- lmer(Reaction ~ Days + I(Days^2) + (Days | Subject), sleepstudy)
drop1(fm1)
## Single term deletions
##
## Model:
## Reaction ~ Days + I(Days^2) + (Days | Subject)
## Df AIC
## <none> 1764.3
## Days 1 1769.2
## I(Days^2) 1 1763.9
How badly do you need AICc rather than AIC? That could be tricky/require some hacking ...
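If AICc is really needed, packages such as MuMIn or AICcmodavg provide it; alternatively it can be computed from AIC with the small-sample correction AICc = AIC + 2k(k+1)/(n - k - 1). A base-R sketch (aicc is a made-up helper, and note that for mixed models the appropriate choice of n and k is itself debated):

```r
# Small-sample corrected AIC, computed from the fitted model's
# log-likelihood degrees of freedom and number of observations.
aicc <- function(model) {
  k <- attr(logLik(model), "df")  # number of estimated parameters
  n <- nobs(model)
  AIC(model) + 2 * k * (k + 1) / (n - k - 1)
}

aicc(lm(dist ~ speed, data = cars))
```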
