I would like to use the update() function to update the random part of my model, specifically by adding a random effect. Most examples (help("update"), help("update.formula"), lme4: Mixed-effects modeling with R) focus on the fixed part of the model. How would I go from fm0 to fm1 using update() in the example below?
library(lme4)
(fm0 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy))
(fm1 <- lmer(Reaction ~ Days + (1 + Days | Subject), sleepstudy))
I doubt this will be useful in your case, but you have to remove the random effect and then add the desired one back in:
update(fm0, . ~ . -(1|Subject) + (1 + Days | Subject))
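A quick way to check that the updated model matches a direct fit is to compare the two, as in the minimal sketch below. The name fm1_upd is only illustrative, and the sketch assumes lme4 and its bundled sleepstudy data are available.

```r
library(lme4)

# Random-intercept model and the target random-slope model, fitted directly
fm0 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)
fm1 <- lmer(Reaction ~ Days + (1 + Days | Subject), sleepstudy)

# Drop the old random term and add the new one in a single update() call
fm1_upd <- update(fm0, . ~ . - (1 | Subject) + (1 + Days | Subject))

# Inspect the resulting formula and compare fixed-effect estimates
formula(fm1_upd)
all.equal(fixef(fm1), fixef(fm1_upd))
```

If the comparison holds, update() has produced the same model as writing the full formula out by hand.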
Related
I'm doing a few regressions using state and year fixed effects. There are two ways that I've done it:
reg1 <- lm(x ~ y + z + factor(year) + factor(state) + year:state, data=df)
and:
reg1 <- lm(x ~ y + z + factor(year) + factor(state) + factor(year)*factor(state), data=df)
but I can't explain the difference in results between using each way.
Does anyone know the difference between year:state and factor(year)*factor(state) when using the lm function?
I know that plm does that for you, but in that specific case, I have to add the fixed effects manually.
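One way to see the difference is to compare the design-matrix columns the two formulas generate. The sketch below uses a small made-up data frame (toy, with 3 states and 3 years, is hypothetical) and assumes year is stored as a number and state as a factor. Under that assumption, year:state multiplies the numeric year by state indicators, giving state-specific linear trends, whereas factor(year)*factor(state) adds a separate dummy for year-state combinations.

```r
# Hypothetical balanced panel: 3 years x 3 states
toy <- expand.grid(year = 2001:2003, state = c("A", "B", "C"))
toy$state <- factor(toy$state)

# Numeric year interacted with state: state-specific linear trends in year
mm_slope <- model.matrix(~ factor(year) + factor(state) + year:state, data = toy)

# Factor-by-factor interaction: main effects plus year-by-state dummies
mm_cells <- model.matrix(~ factor(year) * factor(state), data = toy)

colnames(mm_slope)
colnames(mm_cells)
```

Since factor(year)*factor(state) already expands to both main effects plus their interaction, listing factor(year) + factor(state) again in the second formula is redundant but harmless.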
I would appreciate any insights into staggered DiD (difference-in-differences) models.
I wanted to ask whether I am using the correct function to set up the model for a DiD (data structure provided below):
sample$did <- sample$time * sample$treated
didreg = lm(y ~ time + treated + did + x + factor(year) + factor(firm), data = sample)
The data looks like:
I'm not familiar with difference-in-differences modelling, but from skimming the Wiki it seems that what you want is a simple interaction. To fit that, you don't even need to calculate a new variable (did); you can specify it directly in the model. There are a couple of ways to specify that with R formula syntax:
# Simple main effects models, no interactions
main_mod <- lm(y ~ time + treated + x + factor(year) + factor(firm), data = sample)
# Model with the interaction effect explicitly specified
did_mod1 <- lm(y ~ time + treated + time:treated + x + factor(year) + factor(firm), data = sample)
# Model with shortened syntax for specifying interactions
did_mod2 <- lm(y ~ time * treated + x + factor(year) + factor(firm), data = sample)
did_mod1 and did_mod2 are identical; did_mod2 is just a more compact way of writing the same model. The * indicates that you want both the main effects and the interaction of the variables to its left and right. It's recommended to always fit main effects when you fit interactions, so the second way of writing the model saves time and space.
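The equivalence can be checked without fitting anything, by asking R which terms each formula expands to. This is a minimal sketch using only base R; the variable names are taken from the formulas above and no data are needed.

```r
# Expand both formulas into their term labels
labs_star <- attr(terms(y ~ time * treated + x), "term.labels")
labs_long <- attr(terms(y ~ time + treated + time:treated + x), "term.labels")

labs_star
identical(labs_star, labs_long)  # TRUE: both formulas describe the same terms
```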
I'm trying to understand a difference I encountered in mixed model analyses using the package lme4. Maybe my statistical background is not sharp enough, but I just can't figure out what the "+0" in the code is and what the resulting difference (compared with a model without +0) implies.
Here is my example with the +0:
lmer(Yield ~ Treatment + 0 + (1|Batch) + (1|Irrigation), data = D)
in contrast to:
lmer(Yield ~ Treatment + (1|Batch) + (1|Irrigation), data = D)
Does anyone have a smart explanation for what the +0 is and what it does to the results?
Models with + 0 usually mean "without an overall intercept" (in the fixed effects). By default, models include an intercept; you can also make that explicit using + 1.
Most discussions of regression modelling will recommend including an intercept, unless there's good reason to believe the outcome will be 0 when the predictors are all zero (maybe true of some physical processes?).
Compare:
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
fm2 <- lmer(Reaction ~ Days + 0 + (Days | Subject), sleepstudy)
summary(fm1)
summary(fm2)
paying attention to the fixed effects: fm2 has no (Intercept) row, so its fitted population line is forced through the origin.
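When the predictor is a factor, as Treatment presumably is in the question, dropping the intercept changes the parameterization rather than forcing anything through zero. The sketch below shows this on the fixed-effects design matrix alone; the data frame d and its levels A, B and C are made up for illustration.

```r
# Hypothetical factor with three treatment levels
d <- data.frame(Treatment = factor(c("A", "B", "C", "A", "B", "C")))

# Default: intercept (baseline level A) plus contrasts for B and C
colnames(model.matrix(~ Treatment, d))
# "(Intercept)" "TreatmentB" "TreatmentC"

# With + 0: one column, and hence one coefficient, per treatment level
colnames(model.matrix(~ Treatment + 0, d))
# "TreatmentA" "TreatmentB" "TreatmentC"
```

With + 0, each fixed-effect coefficient is the estimated mean for its treatment level, instead of a difference from the baseline level.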
I'm trying to extract the random structure from models constructed using lme, but I can't seem to get anything other than the fixed formula. E.g.,
library(nlme)
fm1 <- lme(distance ~ age, Orthodont, random = ~ age | Subject)
deparse(terms(fm1))
# "distance ~ age"
This is possible for lmer using findbars():
library(lmerTest)
fm2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
findbars(formula(fm2))
# [[1]]
# Days | Subject
I want to be able to extract:
# ~ age | Subject
# (Days | Subject)
I could potentially get at this using regexpr but I would also like this to apply to more complex structures (multiple random slopes, nested random variables, etc.), and that might include additive or random slopes. Thanks!
You can access these by
fm1$call$fixed
# distance ~ age
fm1$call$random
# ~age | Subject
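If the random structure is needed as a string, like the deparse(terms(fm1)) output in the question, the stored call component can be deparsed as well. This is a minimal sketch; rand_txt is an illustrative name. It assumes the formula was typed directly in the lme() call, because if random was passed as a variable, the call stores only that variable's name.

```r
library(nlme)

fm1 <- lme(distance ~ age, Orthodont, random = ~ age | Subject)

# The unevaluated random-effects formula, as written in the call
rand_call <- fm1$call$random

# Convert it to a character string
rand_txt <- deparse(rand_call)
rand_txt
# "~age | Subject"
```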
I constructed a mixed effect model with three fixed effects and one random effect.
mdl1 <- lmer(yld.res ~ veg + rep + rip + (1 | state),
             REML = FALSE, data = data2)
I want to get the most parsimonious model from the above model. To do this, I want to drop one independent variable at a time and see whether that improves the fit of the model (by looking at the AICc value). But when I use drop1, it gives me the following error:
drop1(mdl1, test="F")
Error in match.arg(test) : 'arg' should be one of “none”, “Chisq”, “user”
I am not really sure how to go about this and would really appreciate any help.
If you just use drop1() with the default test="none" it will give you the AIC values corresponding to the model with each fixed effect dropped in turn.
Here's a slightly silly example (it probably doesn't make sense to test the model with a quadratic but no linear term):
library('lme4')
fm1 <- lmer(Reaction ~ Days + I(Days^2) + (Days | Subject), sleepstudy)
drop1(fm1)
## Single term deletions
##
## Model:
## Reaction ~ Days + I(Days^2) + (Days | Subject)
##           Df    AIC
## <none>       1764.3
## Days       1 1769.2
## I(Days^2)  1 1763.9
How badly do you need AICc rather than AIC? That could be tricky/require some hacking ...
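If AICc is really needed, one rough approach is to refit each reduced model by hand and apply the usual small-sample correction, AICc = AIC + 2k(k+1)/(n-k-1). The helper aicc() below is a hypothetical function written for this sketch, not part of lme4. It takes k from the df attribute of logLik() and n from nobs(), and both choices are debatable for mixed models, so treat the result as approximate.

```r
library(lme4)

# Hypothetical helper: AIC plus the standard small-sample correction
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")  # number of estimated parameters
  n <- nobs(fit)                # number of observations
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

# Fit by ML so that models with different fixed effects are comparable
fm_full <- lmer(Reaction ~ Days + I(Days^2) + (Days | Subject),
                sleepstudy, REML = FALSE)

# Drop one fixed-effect term and refit
fm_drop <- update(fm_full, . ~ . - I(Days^2))

c(full = aicc(fm_full), dropped = aicc(fm_drop))
```

The same pattern can be repeated for each fixed-effect term in turn, which mimics what drop1() does for plain AIC.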