Multiple random effects without interaction in nlme::lme

Let's consider this simple model consisting of two independent random effects:
$$y_{ijk} = \mu + \delta_{1,i} + \delta_{2,j} + \epsilon_{ijk}$$
where $\delta_{1,i}$ and $\delta_{2,j}$ are independent random effects (i.e. $\delta_{1,i} \sim N(0,\sigma_1^2)$ iid, $\delta_{2,j} \sim N(0,\sigma_2^2)$ iid, and the two are independent of each other).
Defining such a model and estimating its variance components is straightforward using lme4::lmer (apologies, the example below likely has no physical meaning):
library(lme4)
library(nlme)
data(Orthodont,package="nlme")
lmer(data=Orthodont,distance~1+(1|Sex)+(1|Subject))
#Linear mixed model fit by REML ['lmerMod']
#Formula: distance ~ 1 + (1 | Sex) + (1 | Subject)
# Data: Orthodont
#REML criterion at convergence: 510.3937
#Random effects:
# Groups Name Std.Dev.
# Subject (Intercept) 1.596
# Sex (Intercept) 1.550
# Residual 2.220
#Number of obs: 108, groups: Subject, 27; Sex, 2
#Fixed Effects:
#(Intercept)
# 23.83
How can this be modeled easily using nlme::lme? Specifying the random effects as a list implicitly introduces an interaction (nesting) that depends on the ordering of the random effects, which can be verified here (changing the order of specification changes the results):
VarCorr(lme(data=Orthodont,distance~1,random=list(~1|Subject,~1|Sex)))
# Variance StdDev
#Subject = pdLogChol(1)
#(Intercept) 1.875987 1.369667
#Sex = pdLogChol(1)
#(Intercept) 1.875987 1.369667
#Residual 4.929784 2.220312
VarCorr(lme(data=Orthodont,distance~1,random=list(~1|Sex,~1|Subject)))
# Variance StdDev
#Sex = pdLogChol(1)
#(Intercept) 2.403699 1.550387
#Subject = pdLogChol(1)
#(Intercept) 2.546702 1.595839
#Residual 4.929784 2.220312
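One common workaround (a sketch, not from the original post): put both random effects inside a single constant grouping factor and use pdBlocked with pdIdent blocks, so the two variance components stay independent and the ordering no longer matters. The estimates should then match the lmer output above.
library(nlme)
data(Orthodont, package = "nlme")
Orthodont$dummy <- factor(1)                                     # one grouping level shared by all rows
Orthodont$Subject <- factor(Orthodont$Subject, ordered = FALSE)  # plain factor so -1 gives indicator columns
fit <- lme(distance ~ 1, data = Orthodont,
           random = list(dummy = pdBlocked(list(pdIdent(~ Sex - 1),
                                                pdIdent(~ Subject - 1)))))
VarCorr(fit)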

Related

Equivalence of a mixed model fitted by lme and lmer

I have fitted a mixed-effects model using both functions widely used for this in R, namely the lme function from the nlme package and the lmer function from the lme4 package.
To refit the lme model with lme4 under the same parameterization, I used the information from this topic, which notes that this is only possible in lme4 in a hacky way: Heteroscedastic model of mixed effects via lmer function
I apologize for hosting the data at an external link; I couldn't find a built-in R dataset with variables that match my problem.
Data: https://drive.google.com/file/d/1jKFhs4MGaVxh-OPErvLDfMNmQBouywoM/view?usp=sharing
The fitted models were:
library(nlme)
library(lme4)
ModLME = lme(Var1 ~ I(Var2) + I(Var2^2),
             random = ~ 1 | Var3,
             weights = varIdent(form = ~ 1 | Var4),
             data = Dataone, method = "REML")
ModLMER = lmer(Var1 ~ I(Var2) + I(Var2^2) + (1 | Var3) + (0 + dummy(Var4, "1") | Var5),
               data = Dataone, REML = TRUE,
               control = lmerControl(check.nobs.vs.nlev = "ignore",
                                     check.nobs.vs.nRE = "ignore"))
These turn out to be equivalent, see:
all.equal(REMLcrit(ModLMER), c(-2*logLik(ModLME)))
[1] TRUE
all.equal(fixef(ModLME), fixef(ModLMER), tolerance=1e-7)
[1] TRUE
> summary(ModLME)
Linear mixed-effects model fit by REML
Data: Dataone
AIC BIC logLik
-209.1431 -193.6948 110.5715
Random effects:
Formula: ~1 | Var3
(Intercept) Residual
StdDev: 0.05789852 0.03636468
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | Var4
Parameter estimates:
0 1
1.000000 5.641709
Fixed effects: Var1 ~ I(Var2) + I(Var2^2)
Value Std.Error DF t-value p-value
(Intercept) 0.9538547 0.01699642 97 56.12093 0
I(Var2) -0.5009804 0.09336479 97 -5.36584 0
I(Var2^2) -0.4280151 0.10038257 97 -4.26384 0
summary(ModLMER)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: Var1 ~ I(Var2) + I(Var2^2) + (1 | Var3) + (0 + dummy(Var4, "1") |
Var5)
Data: Dataone
Control: lmerControl(check.nobs.vs.nlev = "ignore", check.nobs.vs.nRE = "ignore")
REML criterion at convergence: -221.1
Scaled residuals:
Min 1Q Median 3Q Max
-4.1151 -0.5891 0.0374 0.5229 2.1880
Random effects:
Groups Name Variance Std.Dev.
Var3 (Intercept) 6.466e-12 2.543e-06
Var5 dummy(Var4, "1") 4.077e-02 2.019e-01
Residual 4.675e-03 6.837e-02
Number of obs: 100, groups: Var3, 100; Var5, 100
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.95385 0.01700 95.02863 56.121 < 2e-16 ***
I(Var2) -0.50098 0.09336 92.94048 -5.366 5.88e-07 ***
I(Var2^2) -0.42802 0.10038 91.64017 -4.264 4.88e-05 ***
However, the residuals of these models are not similar. In the model fitted by lmer, a set of residuals mysteriously shows up as a few points lying close to a straight line. How could this be fixed so that the residual plots are identical? I believe the problem is in the lme4 model.
aa=plot(ModLME, main="LME")
bb=plot(ModLMER, main="LMER")
gridExtra::grid.arrange(aa,bb,ncol=2)
I can tell you what's going on and what should in principle fix it, but at the moment the fix doesn't work ...
The residuals being plotted take all of the random effects into account, which in the case of the lmer fit includes the individual-level random effects (the (0+dummy(Var4,"1")|Var5) term), which leads to weird residuals for the Var4==1 group. To illustrate this:
plot(ModLMER, col = Dataone$Var4+1)
i.e., you can see that the weird residuals are exactly the ones in red == those for which Var4==1.
In theory we should be able to get the same residuals via:
res <- Dataone$Var1 - predict(ModLMER, re.form = ~(1|Var3))
i.e., ignore the group-specific observation-level random effect term. However, it looks like there is a bug at the moment ("contrasts can be applied only to factors with 2 or more levels").
An extremely hacky solution is to construct the random-effect predictions without the observation-level term yourself:
## fixed-effect predictions
p0 <- predict(ModLMER, re.form = NA)
## construct RE prediction, Var3 term only:
Z <- getME(ModLMER, "Z")
b <- drop(getME(ModLMER, "b"))
## zero out observation-level components
b[101:200] <- 0
## add RE predictions to fixed predictions
p1 <- drop(p0 + Z %*% b)
## plot fitted vs residual
plot(p1, Dataone$Var1 - p1)
For what it's worth, this also works:
library(glmmTMB)
ModGLMMTMB <- glmmTMB(Var1~I(Var2)+I(Var2^2)+(1|Var3),
dispformula = ~factor(Var4),
REML = TRUE,
data = Dataone)
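A hedged follow-up (not in the original answer): since this glmmTMB specification mirrors the lme model, its fixed effects should agree with ModLME up to numerical tolerance.
fixef(ModGLMMTMB)$cond
all.equal(fixef(ModLME), fixef(ModGLMMTMB)$cond, tolerance = 1e-5)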

Likelihood Ratio Test for LMM gives P-Value of 1?

I did an experiment in which people had to give answers to moral dilemmas that were either personal or impersonal. I now want to see if there is an interaction between the type of dilemma and the answer participants gave (yes or no) that influences their reaction time.
For this, I computed a Linear Mixed Model using the lmer()-function of the lme4-package.
My Data looks like this:
subject condition gender.b age logRT answer dilemma pers_force
1 105 a_MJ1 1 27 5.572154 1 1 1
2 107 b_MJ3 1 35 5.023881 1 1 1
3 111 a_MJ1 1 21 5.710427 1 1 1
4 113 c_COA 0 31 4.990433 1 1 1
5 115 b_MJ3 1 23 5.926926 1 1 1
6 119 b_MJ3 1 28 5.278115 1 1 1
My function looks like this:
lmm <- lmer(logRT ~ pers_force * answer + (1|subject) + (1|dilemma),
data = dfb.3, REML = FALSE, control = lmerControl(optimizer="Nelder_Mead"))
with subjects and dilemmas as random factors. This is the output:
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: logRT ~ pers_force * answer + (1 | subject) + (1 | dilemma)
Data: dfb.3
Control: lmerControl(optimizer = "Nelder_Mead")
AIC BIC logLik deviance df.resid
-13637.3 -13606.7 6825.6 -13651.3 578
Scaled residuals:
Min 1Q Median 3Q Max
-3.921e-07 -2.091e-07 2.614e-08 2.352e-07 6.273e-07
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 3.804e-02 1.950e-01
dilemma (Intercept) 0.000e+00 0.000e+00
Residual 1.155e-15 3.398e-08
Number of obs: 585, groups: subject, 148; contrasts, 4
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.469e+00 1.440e-02 379.9
pers_force1 -1.124e-14 5.117e-09 0.0
answer -1.095e-15 4.678e-09 0.0
pers_force1:answer -3.931e-15 6.540e-09 0.0
Correlation of Fixed Effects:
(Intr) prs_f1 answer
pers_force1 0.000
answer 0.000 0.447
prs_frc1:aw 0.000 -0.833 -0.595
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see ?isSingular
I then did a Likelihood Ratio Test using a reduced model to obtain p-Values:
lmm_null <- lmer(logRT ~ pers_force + answer + (1|subject) + (1|dilemma),
data = dfb.3, REML = FALSE,
control = lmerControl(optimizer="Nelder_Mead"))
anova(lmm,lmm_null)
For both models, I get the warning "boundary (singular) fit: see ?isSingular", but if I drop one random effect to make the structure simpler, then I get the warning that the models failed to converge (which is a bit strange), so I ignored it.
But then, the LRT output looks like this:
Data: dfb.3
Models:
lmm_null: logRT ~ pers_force + answer + (1 | subject) + (1 | dilemma)
lmm: logRT ~ pers_force * answer + (1 | subject) + (1 | dilemma)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
lmm_null 6 -13639 -13613 6825.6 -13651
lmm 7 -13637 -13607 6825.6 -13651 0 1 1
As you can see, the Chi-Square value is 0 and the p-Value is exactly 1, which seems very strange. I guess something must have gone wrong here, but I can't figure out what.
You say
logRT is the average logarithmized reaction time across all those 4 dilemmas.
If I'm interpreting this correctly — i.e., each subject has the same response for all of the times they are observed — then this is the proximal cause of your problem. (I know I've seen this exact problem before, but I don't know where — here? r-sig-mixed-models@r-project.org?)
simulate data
library(lme4)
set.seed(101)
dd1 <- expand.grid(subject=factor(100:150), contrasts=factor(1:4))
dd1$answer <- rbinom(nrow(dd1),size=1,prob=0.5)
dd1$logRT <- simulate(~answer + contrasts + (1|subject),
family=gaussian,
newparams=list(beta=c(0,1,1,-1,2),theta=1,sigma=1),
newdata=dd1)[[1]]
regular fit
This is fine and gives answers close to the true parameters:
m1 <- lmer(logRT ~ answer + contrasts + (1 | subject), data = dd1)
## Linear mixed model fit by REML ['lmerMod']
## Random effects:
## Groups Name Std.Dev.
## subject (Intercept) 1.0502
## Residual 0.9839
## Number of obs: 204, groups: subject, 51
## Fixed Effects:
## (Intercept) answer contrasts2 contrasts3 contrasts4
## -0.04452 0.85333 1.16785 -1.07847 1.99243
now average the responses by subject
We get a raft of warning messages, and the same pathologies you are seeing (residual variance and all parameter estimates other than the intercept are effectively zero). This is because lmer is trying to estimate residual variance from the within-subject variation, and we have gotten rid of it!
I don't know why you are doing the averaging. If this is unavoidable, and your design is the randomized-block type shown here (each subject sees all four dilemmas/contrasts), then you can't estimate the dilemma effects.
dd2 <- transform(dd1, logRT=ave(logRT,subject))
m2 <- update(m1, data=dd2)
## Random effects:
## Groups Name Std.Dev.
## subject (Intercept) 6.077e-01
## Residual 1.624e-05
## Number of obs: 204, groups: subject, 51
## Fixed Effects:
## (Intercept) answer contrasts2 contrasts3 contrasts4
## 9.235e-01 1.031e-10 -1.213e-11 -1.672e-15 -1.011e-11
Treating the dilemmas as a random effect won't do what you want (allow for individual-to-individual variability in how they were presented). That among-subject variability in the dilemmas is going to get lumped into the residual variability, where it belongs — I would recommend treating it as a fixed effect.
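As a hedged illustration (reusing the simulated dd1 from above, with the contrasts/dilemma term as a fixed effect), the likelihood-ratio test behaves sensibly on the un-averaged data, because the residual variance is estimable there:
m_full <- lmer(logRT ~ answer * contrasts + (1 | subject), data = dd1, REML = FALSE)
m_null <- lmer(logRT ~ answer + contrasts + (1 | subject), data = dd1, REML = FALSE)
anova(m_full, m_null)  # chi-square and p-value are no longer degenerate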

Recurring error using lmer() function for a linear mixed-effects model

I attempted to construct a linear mixed effects model using lmer function from lme4 package and I ran into a recurring error. The model uses two fixed effects:
DBS_Electrode (factor w/3 levels) and
PostOp_ICA (continuous variable).
I use (1 | Subject) as a random effect term in which Subject is a factor of 38 levels (38 total subjects). Below is the line of code I attempted to run:
LMM.DBS <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject), data = DBS)
I received the following error:
Number of levels of each grouping factor must be < number of observations.
I would appreciate any help, I have tried to navigate this issue myself and have been unsuccessful.
A linear mixed-effects model assumes that there are fewer grouping levels (subjects) than observations, so it throws an error if that is not the case.
You can think of this formula as telling your model that it should
expect that there’s going to be multiple responses per subject, and
these responses will depend on each subject’s baseline level.
Please consult A very basic tutorial for performing linear mixed effects analyses by B. Winter, p. 4.
In your case you should increase the number of observations per subject (to more than 1). Please see the simulation below:
library(lme4)
set.seed(123)
n <- 38
DBS_Electrode <- factor(sample(LETTERS[1:3], n, replace = TRUE))
Distal_Lead_Migration <- 10 * abs(rnorm(n)) # Distal_Lead_Migration in cm
PostOp_ICA <- 5 * abs(rnorm(n))
# number of observations equal to the number of subjects
Subject <- paste0("X", 1:n)
DBS <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1|Subject), data = DBS)
# Error: number of levels of each grouping factor must be < number of observations
# number of observations greater than the number of subjects
Subject <- c(paste0("X", 1:36), "X1", "X37")
DBS <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1|Subject), data = DBS)
summary(model)
Output:
Linear mixed model fit by REML ['lmerMod']
Formula: Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject)
Data: DBS
REML criterion at convergence: 224.5
Scaled residuals:
Min 1Q Median 3Q Max
-1.24605 -0.73780 -0.07638 0.64381 2.53914
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 2.484e-14 1.576e-07
Residual 2.953e+01 5.434e+00
Number of obs: 38, groups: Subject, 37
Fixed effects:
Estimate Std. Error t value
(Intercept) 7.82514 2.38387 3.283
DBS_ElectrodeB 0.22884 2.50947 0.091
DBS_ElectrodeC -0.60940 2.21970 -0.275
PostOp_ICA -0.08473 0.36765 -0.230
Correlation of Fixed Effects:
(Intr) DBS_EB DBS_EC
DBS_ElctrdB -0.718
DBS_ElctrdC -0.710 0.601
PostOp_ICA -0.693 0.324 0.219
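A hedged aside (not from the original answer): with exactly one observation per subject, the check can also be silenced via lmerControl, as in the lme/lmer equivalence example earlier on this page, but the subject variance is then confounded with the residual variance and the fit is typically singular. DBS1 and model_forced below are new names introduced for illustration.
Subject <- paste0("X", 1:n)   # back to one observation per subject
DBS1 <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model_forced <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject),
                     data = DBS1,
                     control = lmerControl(check.nobs.vs.nlev = "ignore",
                                           check.nobs.vs.nRE = "ignore"))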

lmer p values used to create the stars listed by screenreg

This may be an obvious question that I haven't been able to find an answer to (R newbie), but when fitting a mixed-effects model using the lmer function and then displaying the results with:
screenreg(list(model4re), single.row = TRUE)
we get a list of the beta estimates, standard errors, and significance levels in the form of stars.
What test is used to determine the p-values behind these stars (important, as I recognize there is some contention around how to appropriately determine a significant influence with these models), and how can we extract the p-values used for these stars?
A detailed description of the methods available in R to calculate p-values for the parameters estimated by lmer can be found by typing ?lme4::pvalues.
Below is code for calculating p-values with the lmerTest package (Satterthwaite's approximation by default; a Kenward-Roger variant is shown after the output):
library(lmerTest)
fm1 <- lmerTest::lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
lmerTest::anova(fm1)
#############
Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom
Sum Sq Mean Sq NumDF DenDF F.value Pr(>F)
Days 30031 30031 1 17 45.853 3.264e-06 ***
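A hedged aside: Kenward-Roger-corrected tests are available through the same anova() method (this requires the pbkrtest package):
anova(fm1, ddf = "Kenward-Roger")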
The stargazer function from the stargazer package prints p-values for the estimated parameters:
library(stargazer)
fm2 <- lme4::lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
stargazer(fm2, type="text", report="vcp")
===============================================
Dependent variable:
---------------------------
Reaction
-----------------------------------------------
Days 10.467
p = 0.000
Constant 251.405
p = 0.000
-----------------------------------------------
Observations 180
Log Likelihood -871.814
Akaike Inf. Crit. 1,755.628
Bayesian Inf. Crit. 1,774.786
===============================================
Note: *p<0.1; **p<0.05; ***p<0.01
In texreg, p-values for lme4 objects are calculated by the extract.lmerMod function. See the following example:
library(lme4)
data(oats, package="MASS")
(fm1 <- lmer(Y ~ V*N + (1| B/V), data = oats))
##############
Linear mixed model fit by REML ['merModLmerTest']
Formula: Y ~ V * N + (1 | B/V)
Data: oats
REML criterion at convergence: 529.0285
Random effects:
Groups Name Std.Dev.
V:B (Intercept) 10.30
B (Intercept) 14.65
Residual 13.31
Number of obs: 72, groups: V:B, 18; B, 6
Fixed Effects:
(Intercept) VMarvellous VVictory N0.2cwt N0.4cwt N0.6cwt
80.0000 6.6667 -8.5000 18.5000 34.6667 44.8333
VMarvellous:N0.2cwt VVictory:N0.2cwt VMarvellous:N0.4cwt VVictory:N0.4cwt VMarvellous:N0.6cwt VVictory:N0.6cwt
3.3333 -0.3333 -4.1667 4.6667 -4.6667 2.1667
###############
Using extract.lmerMod we get:
extract.lmerMod(fm1)
###############
coef. s.e. p
(Intercept) 80.0000000 9.106977 1.570989e-18
VMarvellous 6.6666667 9.715025 4.925730e-01
VVictory -8.5000000 9.715025 3.816101e-01
N0.2cwt 18.5000000 7.682954 1.604334e-02
N0.4cwt 34.6666667 7.682954 6.417271e-06
N0.6cwt 44.8333333 7.682954 5.365224e-09
VMarvellous:N0.2cwt 3.3333333 10.865337 7.590063e-01
VVictory:N0.2cwt -0.3333333 10.865337 9.755259e-01
VMarvellous:N0.4cwt -4.1666667 10.865337 7.013620e-01
VVictory:N0.4cwt 4.6666667 10.865337 6.675591e-01
VMarvellous:N0.6cwt -4.6666667 10.865337 6.675591e-01
VVictory:N0.6cwt 2.1666667 10.865337 8.419413e-01
GOF dec. places
AIC 559.0285 TRUE
BIC 593.1785 TRUE
Log Likelihood -264.5143 TRUE
Num. obs. 72.0000 FALSE
Num. groups: V:B 18.0000 FALSE
Num. groups: B 6.0000 FALSE
Var: V:B (Intercept) 106.0618 TRUE
Var: B (Intercept) 214.4771 TRUE
Var: Residual 177.0833 TRUE
Looking inside the extract.lmerMod function, the p-values are calculated as follows:
betas <- lme4::fixef(fm1)
Vcov <- vcov(fm1)
Vcov <- as.matrix(Vcov)
se <- sqrt(diag(Vcov))
zval <- betas/se
(pval <- 2 * pnorm(abs(zval), lower.tail = FALSE))
##################
(Intercept) VMarvellous VVictory N0.2cwt N0.4cwt N0.6cwt VMarvellous:N0.2cwt VVictory:N0.2cwt
1.570989e-18 4.925730e-01 3.816101e-01 1.604334e-02 6.417271e-06 5.365224e-09 7.590063e-01 9.755259e-01
VMarvellous:N0.4cwt VVictory:N0.4cwt VMarvellous:N0.6cwt VVictory:N0.6cwt
7.013620e-01 6.675591e-01 6.675591e-01 8.419413e-01
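For comparison (a hedged sketch; fm1_sat is a new object): lmerTest's Satterthwaite-based p-values, which use t-tests with estimated degrees of freedom rather than the Wald z approximation above, can be read straight off the coefficient table:
library(lmerTest)
fm1_sat <- lmerTest::lmer(Y ~ V * N + (1 | B/V), data = oats)
coef(summary(fm1_sat))[, "Pr(>|t|)"]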

Interpreting the output of summary(glmer(...)) in R

I'm an R noob, I hope you can help me:
I'm trying to analyse a dataset in R, but I'm not sure how to interpret the output of summary(glmer(...)) and the documentation isn't a big help:
> data_chosen_stim<-glmer(open_chosen_stim~closed_chosen_stim+day+(1|ID),family=binomial,data=chosenMovement)
> summary(data_chosen_stim)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: open_chosen_stim ~ closed_chosen_stim + day + (1 | ID)
Data: chosenMovement
AIC BIC logLik deviance df.resid
96.7 105.5 -44.4 88.7 62
Scaled residuals:
Min 1Q Median 3Q Max
-1.4062 -1.0749 0.7111 0.8787 1.0223
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 0 0
Number of obs: 66, groups: ID, 35
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.4511 0.8715 0.518 0.605
closed_chosen_stim2 0.4783 0.5047 0.948 0.343
day -0.2476 0.5060 -0.489 0.625
Correlation of Fixed Effects:
(Intr) cls__2
clsd_chsn_2 -0.347
day -0.916 0.077
I understand the GLM behind it, but I can't see the weights of the independent variables and their error bounds.
update: weights.merMod already has a type argument ...
I think what you're looking for is weights(object, type="working").
I believe these are the diagonal elements of W in your notation?
Here's a trivial example that matches up the results of glm and glmer (since the random effect is bogus and gets an estimated variance of zero, the fixed effects, weights, etc. converge to the same values).
Note that the weights() accessor returns the prior weights by default (these are all equal to 1 for the example below).
Example (from ?glm):
d.AD <- data.frame(treatment=gl(3,3),
outcome=gl(3,1,9),
counts=c(18,17,15,20,10,20,25,13,12))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(),
data=d.AD)
library(lme4)
d.AD$f <- 1 ## dummy grouping variable
glmer.D93 <- glmer(counts ~ outcome + treatment + (1|f),
family = poisson(),
data=d.AD,
control=glmerControl(check.nlev.gtr.1="ignore"))
Fixed effects and weights are the same:
all.equal(fixef(glmer.D93),coef(glm.D93)) ## TRUE
all.equal(unname(weights(glm.D93,type="working")),
weights(glmer.D93,type="working"),
tol=1e-7) ## TRUE
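A hedged aside: if by "weights of the independent variables and their error bounds" the question means the coefficient estimates and their standard errors, those come from the coefficient table rather than from weights():
coef(summary(glmer.D93))  # Estimate, Std. Error, z value, Pr(>|z|)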
