R: Specifying random effects using glmer command

I am analyzing categorical data from a questionnaire conducted in different schools, to see what factors might have influenced pupils' responses. I am therefore building a mixed model using the glmer command from R's lme4 package. For each survey question response I have six predictor variables, and I want to include School as a random effect in such a way that both the intercept and slope vary by school. I have searched long and hard, both online and offline, and have found conflicting accounts of the correct way to code this and, being an R novice, am not sure which is right! Here is what I've come up with (where Like is the response variable):
LikeM1 <- glmer(Like ~ Treatment + Regularity + Learn + Age + Gender +
Organisation_Membership_Summary + (1 + Like|School),
data = MagpieData, na.action = "na.omit", family = binomial(logit))
Have I specified School as a random effect correctly so that both the intercept and slope vary by School, or not? I should perhaps mention that being categorical data, all my variables are factors in R.

If you want both the slope and the intercept to vary by group, the general form is: y ~ x + (1 + x | group). In the parentheses, the 1 indicates that the intercept should vary by group, and the x indicates that the coefficient of predictor x should vary by group. You've got a lot of predictors in your model. I'd start with one predictor at a time to make interpretation a bit easier.
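For example, a single-predictor version of your model with by-school random intercepts and slopes would look like this (a sketch using the variable names from your call; note that the grouping variable School, not the response Like, goes to the right of the |):

```r
# Random intercept and random Treatment slope by School
library(lme4)
LikeM1 <- glmer(Like ~ Treatment + (1 + Treatment | School),
                data = MagpieData, na.action = "na.omit",
                family = binomial(link = "logit"))
```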

I think you want to do this:
LikeM1 <- glmer(Like ~ Treatment + Regularity + Learn + Age + Gender +
                Organisation_Membership_Summary + (1 | School) +
                (0 + Treatment + Regularity + Learn + Age + Gender +
                Organisation_Membership_Summary | School),
                data = MagpieData, na.action = "na.omit", family = binomial(logit))
The first part of the formula in parentheses is the random intercept and the second is the random slope. This link provides a really good explanation.


What will be the best "formula" for this mixed effects model?

I have the following study, which I want to analyze with a mixed effects model:
"Subjects" are divided into two "Group"s (Treatment A and B).
"Weight" is recorded before treatment and 3 months after ("Time"; repeated measures).
I also need to correct for the subjects' "age" and "gender".
The main question is: do the two groups differ in their effect on weight?
For the mixed effects model, I was considering the following syntax with the lmer function of the lme4 package:
lmer(weight ~ Group*Time + age + (1|subject) + (1|gender), data=mydata)
Is this syntax correct or do I need to use more complex terms such as ones given below:
(time|subject)
(time + 1|subject)
(1|subject) + (1|Group:subject) + (1|Time:subject)
I have looked at various sources on the internet, but the literature seems very confusing.
gender should not be a random effect (intercept). It doesn't meet any of the usual requirements for it to be treated as random.
(time|subject)
and
(time + 1|subject)
are the same. Both mean you are allowing the fixed effect of time to vary across levels of subject.
(1|subject) + (1|Group:subject) + (1|Time:subject)
makes very little sense. This says that Time is nested in subject, because (1|Time:subject) is the same as (1|subject:Time), and (1|subject) + (1|subject:Time) is the definition of how to specify nested random effects. The addition of (1|Group:subject) seems bizarre, and I would be surprised if such a model were identified. Your research question is "whether the two groups differ", so you want the fixed effect of Group, and (1|Group:subject) does not make sense.
The model:
lmer(weight ~ Group*Time + age + gender + (1|subject), data=mydata)
makes sense.
Finally, this question should be on Cross Validated.

Modeling random slopes

I have to write down three models which try to explain frequency of voice in terms of different factors. The first two were no problem, but I do not really know what they are asking for in the third model. I understand the random intercepts, but not the random slopes here, especially since we are asked to use random slopes for 'attitude' twice.
Any help appreciated.
The first one, model_FE, only has fixed effects. It tries to explain frequency in terms of gender, attitude and their interaction.
The second one, model_intercept_only, is like model_FE but also adds random intercepts for both scenario and subject.
Finally, model_max_RE is like model_FE but also specifies the following random effects structure: by-scenario random intercepts, and random slopes for gender, attitude and their interaction, as well as by-subject random intercepts and random slopes for attitude.
Remember to set eval = TRUE.
model_FE <- brm(formula = frequency ~ gender * attitude,
data = politeness_data)
model_intercept_only <- brm(formula = frequency ~ gender * attitude +
(1|subject) + (1|scenario),
data = politeness_data)
The random-effects term described by
by-scenario random intercepts, and random slopes for gender, attitude and their interaction
corresponds to
(1 + gender*attitude | scenario)
the one described by
as well as by-subject random intercepts and random slopes for attitude.
corresponds to
(1 + attitude | subject)
These terms should be combined with the fixed effects:
~ gender*attitude + (1 + gender*attitude | scenario) +
(1 + attitude | subject)
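Putting the pieces together, the full brm call would be something like this (a sketch following the call pattern used for the first two models):

```r
# model_max_RE: fixed effects plus the maximal random-effects structure
model_max_RE <- brm(formula = frequency ~ gender * attitude +
                      (1 + gender * attitude | scenario) +
                      (1 + attitude | subject),
                    data = politeness_data)
```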
In the random-effects term (f|g), g specifies the grouping variable: this is always a categorical variable, and to be sensible should be an exchangeable variable (i.e., changing the labels on the variable shouldn't change their meaning: I would say that sex would not normally be considered exchangeable).
The formula component to the left of the |, f, specifies the terms that vary across levels of the grouping variable: unless explicitly suppressed with -1 or 0, this always includes an intercept. Unless you want to enforce that particular combinations of random effects are independent of each other, you should include all of the varying terms in the same f specification. You need to be especially careful if you want multiple independent f terms that contain categorical variables (this needs a longer explanation/separate question).
You can sensibly have multiple "random intercepts" in the same model if they pertain to different grouping variables: e.g. (1|subject) + (1|scenario) means that there is variation in the intercept across subjects and across scenarios.

How to indicate paired observations with lmer mixed models

I am fairly new to linear mixed models, and I'm trying to generate a model using lmer in which I test the effects of:
Group (fixed): 2 levels
Treatment (fixed): 2 levels (unstimulated and
stimulated)
Group * Treatment
on the dependent variable "Outcome", considering the random effect of "Subject".
In this experiment, each subject in the two groups had one arm stimulated and one unstimulated.
So far, the model I came up with is
lmer(Outcome ~ Group + Treatment + Group*Treatment + (1|Subject), REML=FALSE, data= data)
However, I'm not sure of how to specify that each subject has one arm unstimulated and one stimulated.
Can anybody please help?
If your question is more about an appropriate model specification for your case, I would say that it depends on your study and your goals. What you are describing is in line with your formula, and it makes sense as it is. You are already accounting for the subject effect with (1|Subject), and Treatment distinguishes the stimulated arm from the unstimulated arm. I would suggest checking out this post, which discusses fixed and mixed effects.
Regarding the way of specifying models in lmer using formulas, my first comment is that the following 3 are equivalent:
Outcome ~ Group + Treatment + Group*Treatment
Outcome ~ Group + Treatment + Group:Treatment
Outcome ~ Group*Treatment
The third is a compact form of the second, and the first is redundant. I would then suggest trying the following alternatives, which are also valid, so that you become more familiar with the formula notation:
model2 <- lmer(Outcome ~ Treatment + (1 + Treatment|Group) + (1|Subject), REML = FALSE, data = data)
coef(summary(model2)); ranef(model2)
model3 <- lmer(Outcome ~ Treatment + (0 + Treatment|Group) + (1|Subject), REML = FALSE, data = data)
coef(summary(model3)); ranef(model3)
model4 <- lmer(Outcome ~ Treatment + (1|Group) + (1|Subject), REML = FALSE, data = data)
coef(summary(model4)); ranef(model4)

plot interaction terms in a mixed-model

I would like to visualize the effect of a significant interaction term in the following mixed effects model.
c1 <- lmer(log.weight ~ time + I(time^2) + temp + precip + time:precip + time:temp + (1|indiv), data = noctrl)
This model includes fixed effects of 'time' (linear and quadratic terms), 'temperature', 'precipitation', and two interactions, on the log-transformed response 'weight'. All terms are significant, and the model's assumptions of normality and homogeneity are met.
I’ve been using the 'effects' package to produce interaction plots to show the effect of the interactions. When I try to show the interaction of time and temperature (time:temp) with the following code I’m not sure whether the resulting plot correctly shows this interaction.
ef2 <- effect(term="time:temp", c1, multiline=TRUE)
y <- as.data.frame(ef2)
ggplot(y , aes(time, fit, color=temp)) + geom_point() + geom_errorbar(aes(ymin=fit-se, ymax=fit+se), width=0.4)
I need help understanding the resulting plot, please. Why do the SEs overlap at each value of x even though this interaction term is highly significant?
Am I using the effects package correctly? Or is it because I need to include the quadratic term of time, I(time^2), in the interaction terms as well?
Thank you very much for the clarification.

Mixed Modelling - Different Results between lme and lmer functions

I am currently working through Andy Field's book, Discovering Statistics Using R. Chapter 14 is on Mixed Modelling and he uses the lme function from the nlme package.
The model he creates, using speed dating data, is such:
speedDateModel <- lme(dateRating ~ looks + personality +
gender + looks:gender + personality:gender +
looks:personality,
random = ~1|participant/looks/personality)
I tried to recreate a similar model using the lmer function from the lme4 package; however, my results are different. I thought I had the proper syntax, but maybe not?
speedDateModel.2 <- lmer(dateRating ~ looks + personality + gender +
looks:gender + personality:gender +
(1|participant) + (1|looks) + (1|personality),
data = speedData, REML = FALSE)
Also, when I inspect the coefficients of these models, I notice that they only produce random intercepts for each participant. I was trying to then create a model that produces both random intercepts and slopes, but I can't seem to get the syntax correct for either function. Any help would be greatly appreciated.
The only difference between the lme and the corresponding lmer formula should be that the random and fixed components are aggregated into a single formula:
dateRating ~ looks + personality +
gender + looks:gender + personality:gender +
looks:personality+ (1|participant/looks/personality)
Using (1|participant) + (1|looks) + (1|personality) is only equivalent if looks and personality have unique values at each nested level.
It's not clear what continuous variable you want to define your slopes: if you have a continuous variable x and groups g, then (x|g) or equivalently (1+x|g) will give you a random-slopes model (x should also be included in the fixed-effects part of the model, i.e. the full formula should be y~x+(x|g) ...)
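As a sketch (x, g, y, and dat here are hypothetical placeholders, not variables from your data):

```r
# Random-slopes model: both the intercept and the effect of the
# continuous predictor x vary across levels of the grouping factor g
fit <- lmer(y ~ x + (1 + x | g), data = dat)
```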
Update: I got the data, or rather a script file that allows one to reconstruct the data, from here. Field makes a common mistake in his book, one I have made several times myself: since there is only a single observation in the data set for each participant/looks/personality combination, the three-way interaction has one level per observation. In a linear mixed model, this means the variance at the lowest level of nesting is confounded with the residual variance.
You can see this in two ways:
lme appears to fit the model just fine, but if you try to calculate confidence intervals via intervals(), you get
intervals(speedDateModel)
## Error in intervals.lme(speedDateModel) :
## cannot get confidence intervals on var-cov components:
## Non-positive definite approximate variance-covariance
If you try this with lmer you get:
## Error: number of levels of each grouping factor
## must be < number of observations
In both cases, this is a clue that something's wrong. (You can overcome this in lmer if you really want to: see ?lmerControl.)
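For example, the check that triggers this error can be relaxed through the control argument (a sketch; the exact check name may vary by lme4 version, so consult ?lmerControl):

```r
# Force the fit despite one level per observation (not recommended:
# the lowest variance component is confounded with the residual)
sd_forced <- lmer(dateRating ~ looks + personality + gender +
                    looks:gender + personality:gender + looks:personality +
                    (1|participant/looks/personality),
                  data = speedData,
                  control = lmerControl(check.nobs.vs.nRE = "ignore"))
```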
If we leave out the lowest grouping level, everything works fine:
sd2 <- lmer(dateRating ~ looks + personality +
gender + looks:gender + personality:gender +
looks:personality+
(1|participant/looks),
data=speedData)
Compare lmer and lme fixed effects:
all.equal(fixef(sd2),fixef(speedDateModel)) ## TRUE
The starling example here gives another example and further explanation of this issue.
