I'm trying to understand a difference I encountered in mixed-model analyses using the lme4 package. Maybe my statistical background is not sharp enough, but I just can't figure out what the "+0" in the code means and what the resulting difference (relative to a model without +0) implies.
Here is my example with the +0:
lmer(Yield ~ Treatment + 0 + (1|Batch) + (1|Irrigation), data = D)
in contrast to:
lmer(Yield ~ Treatment + (1|Batch) + (1|Irrigation), data = D)
Does anyone have a smart explanation for what the +0 is and what it does to the results?
Models with + 0 mean "without an overall intercept" (in the fixed effects). By default, models include an intercept; you can make that explicit using + 1.
Most discussions of regression modelling recommend including an intercept, unless there's good reason to believe the outcome will be exactly 0 when the predictors are all zero (perhaps true of some physical processes?).
Compare:
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm2 <- lmer(Reaction ~ Days + 0 + (Days | Subject), sleepstudy)
summary(fm1)
summary(fm2)
paying attention to the fixed-effects estimates.
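The sleepstudy example above uses a numeric predictor; with a factor such as Treatment, the effect of + 0 is easiest to see in the fixed-effects design matrix. A minimal sketch (the toy data frame d below is my own, not from the question):

```r
## With a factor, "+ 0" switches the fixed-effect coding from
## (intercept + treatment contrasts) to one column (cell mean) per level.
d <- data.frame(Treatment = factor(rep(c("A", "B", "C"), each = 2)))
colnames(model.matrix(~ Treatment, d))      ## "(Intercept)" "TreatmentB" "TreatmentC"
colnames(model.matrix(~ Treatment + 0, d))  ## "TreatmentA"  "TreatmentB" "TreatmentC"
```

Both parameterizations describe the same fitted means; + 0 just reports one mean per level instead of an intercept plus differences from the reference level.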
I am creating a series of logistic regression models using mgcv and comparing them with AIC(). I have only four variables (socioeconomic class (Socio), sex, year of death (YOD), and age) for each individual and I am curious how these variables explain the likelihood of someone being buried with burial goods (N=c.12,000).
For one model, I ran the following:
model5 <- mgcv::gam(Commemorated ~ s(Age, k=5) + s(YOD, k=5) + Socio + Sex +
ti(Age,YOD, k=5) + s(Age, by=Socio, k=5) + s(YOD, by=Socio, k=5),
family=binomial(link='logit'), data=mydata, method='ML')
AIC(model5) was -1333.434. This was drastically different from what I expected, given models I had run previously. As a test, I ran the following:
model6 <- mgcv::gam(Commemorated ~ s(Age, k=6) + s(YOD, k=5) + Socio + Sex +
ti(Age,YOD, k=5) + s(Age, by=Socio, k=5) + s(YOD, by=Socio, k=5),
family=binomial(link='logit'), data=mydata, method='ML')
gam.check() for both models was fine. For the second model, I only shifted the k value of the first term up by 1, which to my understanding should not have altered the AIC drastically. But when I ran AIC(model6), it was 6048.187, which is as expected given previous models I have run.
Other things I have looked at:
model5$aic: 6047.284
model6$aic: 6047.245
logLik.gam(model5): -3005.652 (df=-3673.87)
logLik.gam(model6): -3005.629 (df=18.46467)
So it would appear that, for some reason, the degrees of freedom for model5 are drastically different from those for model6, for a reason I cannot explain. If anyone has any ideas on how to troubleshoot this problem further, it would be greatly appreciated!
Edit: As commented below, I have also altered model5 from s(YOD, k=5) to ti(YOD,k=5) and this 'fixes' the AIC() result. Running model5 without the term, Sex (which previous models have shown has little effect but I have left in because it is meaningful from a theoretical standpoint), also 'fixes' the AIC() result. So this problem is not specific to the term s(Age), the number of knots, or the smoothed terms in general.
Aside from the R function nlme::lme(), I'm wondering how else I can model the Level-1 residual variance-covariance structure?
ps. My search showed I could possibly use the glmmTMB package, but it seems to be about the random effects themselves rather than the Level-1 residuals (see the code below).
glmmTMB::glmmTMB(y ~ times + ar1(times | subjects), data = data) ## DON'T RUN
nlme::lme (y ~ times, random = ~ times | subjects,
correlation = corAR1(), data = data) ## DON'T RUN
glmmTMB can effectively be used to model level-1 residuals, by adding an observation-level random effect to the model (and, if necessary, suppressing the level-1 variance via dispformula ~ 0). For example, comparing the same fit in lme and glmmTMB:
library(glmmTMB)
library(nlme)
data("sleepstudy", package = "lme4")
ss <- sleepstudy
ss$times <- factor(ss$Days) ## needed for glmmTMB
I initially tried with random = ~Days|Subject, but neither lme nor glmmTMB was happy (overfitted):
lme1 <- lme(Reaction ~ Days, random = ~ 1 | Subject,
            correlation = corAR1(form = ~ Days | Subject), data = ss)
m1 <- glmmTMB(Reaction ~ Days + (1 | Subject) +
                ar1(times + 0 | Subject),  ## observation-level AR1 term stands in for level-1 residuals
              dispformula = ~0,            ## suppress the usual residual variance
              data = ss,
              REML = TRUE,
              start = list(theta = c(4, 4, 1)))
Unfortunately, in order to get a good answer with glmmTMB I did have to tweak the starting values ...
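As a sanity check (a sketch, assuming both fits above converged), the two parameterizations should describe essentially the same model:

```r
## REML log-likelihoods and variance components should agree closely
logLik(lme1)
logLik(m1)
VarCorr(lme1)               ## subject-intercept SD and residual SD from lme
VarCorr(m1)                 ## subject-intercept SD plus the ar1 "residual" term
lme1$modelStruct$corStruct  ## estimated AR1 parameter (Phi) from lme
```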
I built a model in r with the lmer function:
lmer(DV ~ IV1 + IV1:IV2 - 1 + (1|Group/Participant))
This was correctly specified and I got the results I expected.
I'm now trying to replicate these results in SPSS. So far I have:
MIXED DV BY IV1 IV2
/FIXED IV1 IV1*IV2 | NOINT SSTYPE(3)
/METHOD REML
/RANDOM= INTERCEPT | SUBJECT(Group*Participant) COVTYPE(VC)
/PRINT= SOLUTION TESTCOV.
My results are not remotely similar, and I believe it is because of the differences between the : and * terms.
How can I replicate IV1 + IV1:IV2 in SPSS?
If I'm understanding the R formula documentation properly, the ":" there is the same as the "*" or "BY" specification in SPSS. (Note that R's own "*" is different: a*b in R expands to a + b + a:b, i.e. main effects plus the interaction.)
If you want a second, uncorrelated random effect with just Group as the subject specification, simply add a second RANDOM subcommand, such as:
/RANDOM= INTERCEPT | SUBJECT(Group) COVTYPE(VC)
I would like to use the update() function to update the random part of my model, specifically, adding a random effect. Most examples (help("update"), help("update.formula"), lme4:mixed effects modeling with R) focus on the fixed part of the model. How would I go from fm0 to fm1 using update() in the example below?
library(lme4)
(fm0 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy))
(fm1 <- lmer(Reaction ~ Days + (1 + Days | Subject), sleepstudy))
I doubt this will be useful in your case, but you have to remove the old random-effect term and then add the desired one back in:
update(fm0, . ~ . -(1|Subject) + (1 + Days | Subject))
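As a quick check (a sketch; fm1u is just a name I've introduced), the updated fit should reproduce fm1 exactly:

```r
fm1u <- update(fm0, . ~ . - (1 | Subject) + (1 + Days | Subject))
all.equal(fixef(fm1), fixef(fm1u))    ## should be TRUE
all.equal(logLik(fm1), logLik(fm1u))  ## should be TRUE
```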
I constructed a mixed effect model with three fixed effects and one random effect.
mdl1 <- lmer(yld.res ~ veg + rep + rip + (1|state),
REML=FALSE,data=data2)
I want to get the most parsimonious model from the above model. To do this, I want to drop one independent variable at a time and see whether doing so improves the fit of the model (by looking at the AICc value). But when I use drop1, I get the following error:
drop1(mdl1, test="F")
Error in match.arg(test) : 'arg' should be one of “none”, “Chisq”, “user”
I am not really sure how to go about this and would really appreciate any help.
If you just use drop1() with the default test="none" it will give you the AIC values corresponding to the model with each fixed effect dropped in turn.
Here's a slightly silly example (it probably doesn't make sense to test the model with a quadratic but no linear term):
library('lme4')
fm1 <- lmer(Reaction ~ Days + I(Days^2) + (Days | Subject), sleepstudy)
drop1(fm1)
## Single term deletions
##
## Model:
## Reaction ~ Days + I(Days^2) + (Days | Subject)
## Df AIC
## <none> 1764.3
## Days 1 1769.2
## I(Days^2) 1 1763.9
How badly do you need AICc rather than AIC? That could be tricky/require some hacking ...
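If you really do need AICc, one option (a sketch, assuming the MuMIn package is installed; drop1() itself won't compute AICc) is to refit each reduced model by hand:

```r
library(MuMIn)
AICc(fm1)                             ## full model
AICc(update(fm1, . ~ . - I(Days^2)))  ## drop the quadratic term
AICc(update(fm1, . ~ . - Days))       ## drop the linear term (the "silly" comparison)
```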