bootstrapping for lmer with interaction term - r

I am running a mixed model using lme4 in R:
full_mod3=lmer(logcptplus1 ~ logdepth*logcobb + (1|fyear) + (1 |flocation),
data=cpt, REML=TRUE)
summary:
Formula: logcptplus1 ~ logdepth * logcobb + (1 | fyear) + (1 | flocation)
Data: cpt
REML criterion at convergence: 577.5
Scaled residuals:
Min 1Q Median 3Q Max
-2.7797 -0.5431 0.0248 0.6562 2.1733
Random effects:
Groups Name Variance Std.Dev.
fyear (Intercept) 0.2254 0.4748
flocation (Intercept) 0.1557 0.3946
Residual 0.9663 0.9830
Number of obs: 193, groups: fyear, 16; flocation, 16
Fixed effects:
Estimate Std. Error t value
(Intercept) 4.3949 1.2319 3.568
logdepth 0.2681 0.4293 0.625
logcobb -0.7189 0.5955 -1.207
logdepth:logcobb 0.3791 0.2071 1.831
I used the effects package to compute 95% confidence intervals and standard errors for the model predictions, so that I can examine the relationship between the predictor of interest (logcobb) and the response while holding the secondary predictor (logdepth) constant at its median in the data set (2.5):
gm=4.3949 + 0.2681*depth_median + -0.7189*logcobb_range + 0.3791*
(depth_median*logcobb_range)
ef2=effect("logdepth*logcobb",full_mod3,
xlevels=list(logcobb=seq(log(0.03268),log(0.37980),length.out=200)))
I have attempted to bootstrap the 95% CIs using code from here. However, I only need the CIs at the median depth (2.5). Is there a way to tell confint() to compute the CIs at that depth, so that I can visualize the bootstrapped results as in the plot above?
confint(full_mod3,method="boot",nsim=200,boot.type="perc")

You can do this by specifying a custom function:
library(lme4)
?confint.merMod
FUN: bootstrap function; if ‘NULL’, an internal function that returns the fixed-effect parameters as well as the random-effect parameters on the standard deviation/correlation scale will be used. See ‘bootMer’ for details.
So FUN can be a prediction function (?predict.merMod) that uses a newdata argument that varies and fixes appropriate predictor variables.
An example with built-in data (not quite as interesting as yours since there's a single continuous predictor variable, but I think it should illustrate the approach clearly enough):
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
pframe <- data.frame(Days=seq(0,20,by=0.5))
## predicted values at population level (re.form=NA)
pfun <- function(fit) {
predict(fit,newdata=pframe,re.form=NA)
}
set.seed(101)
cc <- confint(fm1,method="boot",FUN=pfun)
Picture:
par(las=1,bty="l")
matplot(pframe$Days,cc,lty=2,col=1,type="l",
xlab="Days",ylab="Reaction")
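The same idea carries over to a model with an interaction: the prediction frame varies the predictor of interest and holds the other one fixed. The following is only a sketch under stated assumptions. The data frame cpt_sim is simulated (the original cpt data are not available), the effect sizes are made up, and the names cpt_sim, mod_sim, pframe2, pfun2 and cc2 are hypothetical. Only the structure of the model and the logdepth value of 2.5 come from the question.

```r
library(lme4)

## simulated stand-in for the asker's data (all values hypothetical)
set.seed(101)
n <- 200
cpt_sim <- data.frame(
  logdepth  = runif(n, 1.5, 3.5),
  logcobb   = runif(n, log(0.03268), log(0.37980)),
  fyear     = factor(sample(1:16, n, replace = TRUE)),
  flocation = factor(sample(1:16, n, replace = TRUE))
)
year_eff <- rnorm(16, sd = 0.5)
loc_eff  <- rnorm(16, sd = 0.4)
cpt_sim$logcptplus1 <- with(cpt_sim,
  4.4 + 0.27 * logdepth - 0.72 * logcobb + 0.38 * logdepth * logcobb +
    year_eff[as.integer(fyear)] + loc_eff[as.integer(flocation)] + rnorm(n))

mod_sim <- lmer(logcptplus1 ~ logdepth * logcobb + (1 | fyear) + (1 | flocation),
                data = cpt_sim, REML = TRUE)

## vary logcobb over its range, hold logdepth at the median value (2.5)
pframe2 <- data.frame(
  logdepth = 2.5,
  logcobb  = seq(log(0.03268), log(0.37980), length.out = 50)
)

## population-level predictions (re.form = NA), recomputed for every bootstrap fit
pfun2 <- function(fit) predict(fit, newdata = pframe2, re.form = NA)

## nsim is kept small here to run quickly; more replicates give smoother bands
cc2 <- confint(mod_sim, method = "boot", FUN = pfun2,
               nsim = 100, boot.type = "perc")
```

Each row of cc2 holds the lower and upper percentile limits for one row of pframe2, so the band can be drawn with matplot(pframe2$logcobb, cc2, ...) in the same way as the picture above.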

Related

Equivalence of a mixed model fitted by lme and lmer

I have fitted the same mixed-effects model with two widely used R functions: lme from the nlme package and lmer from the lme4 package.
To refit the lme model in lme4 with the same parametrization, I used the information from this topic, since this is only possible in lme4 in a hacky way: Heterocesdastic model of mixed effects via lmer function
I apologize for hosting the data in a link; I couldn't find a built-in R dataset with variables that match my problem.
Data: https://drive.google.com/file/d/1jKFhs4MGaVxh-OPErvLDfMNmQBouywoM/view?usp=sharing
The fitted models were:
library(nlme)
library(lme4)
ModLME = lme(Var1~I(Var2)+I(Var2^2),
random = ~1|Var3,
weights = varIdent(form=~1|Var4),
Dataone, method="REML")
ModLMER = lmer(Var1~I(Var2)+I(Var2^2)+(1|Var3)+(0+dummy(Var4,"1")|Var5),
Dataone, REML = TRUE,
control=lmerControl(check.nobs.vs.nlev="ignore",
check.nobs.vs.nRE="ignore"))
Which are equivalent, see:
all.equal(REMLcrit(ModLMER), c(-2*logLik(ModLME)))
[1] TRUE
all.equal(fixef(ModLME), fixef(ModLMER), tolerance=1e-7)
[1] TRUE
> summary(ModLME)
Linear mixed-effects model fit by REML
Data: Dataone
AIC BIC logLik
-209.1431 -193.6948 110.5715
Random effects:
Formula: ~1 | Var3
(Intercept) Residual
StdDev: 0.05789852 0.03636468
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | Var4
Parameter estimates:
0 1
1.000000 5.641709
Fixed effects: Var1 ~ I(Var2) + I(Var2^2)
Value Std.Error DF t-value p-value
(Intercept) 0.9538547 0.01699642 97 56.12093 0
I(Var2) -0.5009804 0.09336479 97 -5.36584 0
I(Var2^2) -0.4280151 0.10038257 97 -4.26384 0
summary(ModLMER)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: Var1 ~ I(Var2) + I(Var2^2) + (1 | Var3) + (0 + dummy(Var4, "1") |
Var5)
Data: Dataone
Control: lmerControl(check.nobs.vs.nlev = "ignore", check.nobs.vs.nRE = "ignore")
REML criterion at convergence: -221.1
Scaled residuals:
Min 1Q Median 3Q Max
-4.1151 -0.5891 0.0374 0.5229 2.1880
Random effects:
Groups Name Variance Std.Dev.
Var3 (Intercept) 6.466e-12 2.543e-06
Var5 dummy(Var4, "1") 4.077e-02 2.019e-01
Residual 4.675e-03 6.837e-02
Number of obs: 100, groups: Var3, 100; Var5, 100
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.95385 0.01700 95.02863 56.121 < 2e-16 ***
I(Var2) -0.50098 0.09336 92.94048 -5.366 5.88e-07 ***
I(Var2^2) -0.42802 0.10038 91.64017 -4.264 4.88e-05 ***
However, the residuals of the two models are not similar. In the model fitted by lmer, a set of residuals mysteriously falls close to a straight line. How can I fix this so that the residuals are identical? I believe the problem is in the lme4 model.
aa=plot(ModLME, main="LME")
bb=plot(ModLMER, main="LMER")
gridExtra::grid.arrange(aa,bb,ncol=2)
I can tell you what's going on and what should in principle fix it, but at the moment the fix doesn't work ...
The residuals being plotted take all of the random effects into account, which in the case of the lmer fit includes the individual-level random effects (the (0+dummy(Var4,"1")|Var5) term), which leads to weird residuals for the Var4==1 group. To illustrate this:
plot(ModLMER, col = Dataone$Var4+1)
i.e., you can see that the weird residuals are exactly the ones in red == those for which Var4==1.
In theory we should be able to get the same residuals via:
res <- Dataone$Var1 - predict(ModLMER, re.form = ~(1|Var3))
i.e., ignore the group-specific observation-level random effect term. However, it looks like there is a bug at the moment ("contrasts can be applied only to factors with 2 or more levels").
An extremely hacky solution is to construct the random-effect predictions without the observation-level term yourself:
## fixed-effect predictions
p0 <- predict(ModLMER, re.form = NA)
## construct RE prediction, Var3 term only:
Z <- getME(ModLMER, "Z")
b <- drop(getME(ModLMER, "b"))
## zero out observation-level components
b[101:200] <- 0
## add RE predictions to fixed predictions
p1 <- drop(p0 + Z %*% b)
## plot fitted vs residual
plot(p1, Dataone$Var1 - p1)
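The hard-coded index range 101:200 depends on how the random-effect terms happen to be ordered in this particular fit. A slightly more general sketch, assuming the built-in Penicillin data as a stand-in (the original Dataone is not reproduced here), finds the block of b that belongs to one term through getME(., "Gp") and getME(., "cnms"). The object names fm, term and idx are hypothetical.

```r
library(lme4)

## built-in data with two crossed random-intercept terms
fm <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample), data = Penicillin)

Z  <- getME(fm, "Z")        # random-effects design matrix
b  <- drop(getME(fm, "b"))  # conditional modes, stacked term by term
Gp <- getME(fm, "Gp")       # offsets marking where each term's block starts in b

## locate the block belonging to the term to be ignored (here: sample)
term <- which(names(getME(fm, "cnms")) == "sample")
idx  <- (Gp[term] + 1):Gp[term + 1]
b[idx] <- 0

p0  <- predict(fm, re.form = NA)  # fixed-effect predictions
p1  <- p0 + drop(Z %*% b)         # add back the plate effects only
res <- Penicillin$diameter - p1   # residuals that ignore the sample term
```

For this simple model the result can be checked against predict(fm, re.form = ~ (1 | plate)), which is the call that fails for the dummy() term in the question.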
For what it's worth, this also works:
library(glmmTMB)
ModGLMMTMB <- glmmTMB(Var1~I(Var2)+I(Var2^2)+(1|Var3),
dispformula = ~factor(Var4),
REML = TRUE,
data = Dataone)

Recurring error using lmer() function for a linear mixed-effects model

I attempted to construct a linear mixed effects model using lmer function from lme4 package and I ran into a recurring error. The model uses two fixed effects:
DBS_Electrode (factor w/3 levels) and
PostOp_ICA (continuous variable).
I use (1 | Subject) as a random effect term in which Subject is a factor of 38 levels (38 total subjects). Below is the line of code I attempted to run:
LMM.DBS <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject), data = DBS)
I received the following error:
Number of levels of each grouping factor must be < number of observations.
I would appreciate any help, I have tried to navigate this issue myself and have been unsuccessful.
A linear mixed-effects model assumes that there are fewer subjects than observations, so lmer throws an error when that is not the case.
You can think of this formula as telling your model that it should
expect that there’s going to be multiple responses per subject, and
these responses will depend on each subject’s baseline level.
Please consult A very basic tutorial for performing linear mixed effects analyses by B. Winter, p. 4.
In your case you should increase the number of observations per subject (> 1). Please see the simulation below:
library(lme4)
set.seed(123)
n <- 38
DBS_Electrode <- factor(sample(LETTERS[1:3], n, replace = TRUE))
Distal_Lead_Migration <- 10 * abs(rnorm(n)) # Distal_Lead_Migration in cm
PostOp_ICA <- 5 * abs(rnorm(n))
# number of observations equals number of subjects
Subject <- paste0("X", 1:n)
DBS <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1|Subject), data = DBS)
# Error: number of levels of each grouping factor must be < number of observations
# number of observations exceeds number of subjects
Subject <- c(paste0("X", 1:36), "X1", "X37")
DBS <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1|Subject), data = DBS)
summary(model)
Output:
Linear mixed model fit by REML ['lmerMod']
Formula: Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject)
Data: DBS
REML criterion at convergence: 224.5
Scaled residuals:
Min 1Q Median 3Q Max
-1.24605 -0.73780 -0.07638 0.64381 2.53914
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 2.484e-14 1.576e-07
Residual 2.953e+01 5.434e+00
Number of obs: 38, groups: Subject, 37
Fixed effects:
Estimate Std. Error t value
(Intercept) 7.82514 2.38387 3.283
DBS_ElectrodeB 0.22884 2.50947 0.091
DBS_ElectrodeC -0.60940 2.21970 -0.275
PostOp_ICA -0.08473 0.36765 -0.230
Correlation of Fixed Effects:
(Intr) DBS_EB DBS_EC
DBS_ElctrdB -0.718
DBS_ElctrdC -0.710 0.601
PostOp_ICA -0.693 0.324 0.219
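Before fitting, it can help to count how many rows each subject contributes. The sketch below uses a hypothetical stand-in data frame, DBS_check, with one row per subject. With the real data, the same table() call on DBS$Subject would show whether any subject has repeated measurements.

```r
## hypothetical stand-in for the DBS data: 38 subjects, one row each
DBS_check <- data.frame(Subject = paste0("X", 1:38),
                        y = seq_len(38))

obs_per_subject <- table(DBS_check$Subject)  # rows contributed by each subject
n_subjects <- length(obs_per_subject)
n_obs <- nrow(DBS_check)

## TRUE means the number of grouping levels is not below the number of rows,
## which is the condition that triggers the lmer error
one_row_each <- n_subjects >= n_obs
one_row_each
range(obs_per_subject)
```

If every subject really has only one row, the subject-level variance cannot be separated from the residual variance, and an ordinary lm() without the (1 | Subject) term is the usual fallback.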

How to obtain the p-values for each coefficient in a nested logit glmer model (using lme4)?

I'm running the following code:
library(lme4)
library(nlme)
nest.reg2 <- glmer(SS ~ (bd|cond), family = "binomial",
data = combined2)
coef(nest.reg2)
summary(nest.reg2)
Which produces the following output:
coefficients
$cond
bd (Intercept)
LL -1.014698 1.286768
no -3.053920 4.486349
SS -5.300883 8.011879
summary
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: SS ~ (bd | cond)
Data: combined2
AIC BIC logLik deviance df.resid
1419.7 1439.7 -705.8 1411.7 1084
Scaled residuals:
Min 1Q Median 3Q Max
-8.0524 -0.8679 -0.4508 1.0735 2.2756
Random effects:
Groups Name Variance Std.Dev. Corr
cond (Intercept) 33.34 5.774
bd 13.54 3.680 -1.00
Number of obs: 1088, groups: cond, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.3053 0.1312 -2.327 0.02 *
My question is: how do I test the significance of each of the coefficients for this model? The summary() function seems to provide a p-value only for the intercept, not for the coefficients.
When I try anova(nest.reg2) I get nothing, just:
Analysis of Variance Table
Df Sum Sq Mean Sq F value
I've tried the solutions proposed here (How to obtain the p-value (check significance) of an effect in a lme4 mixed model?) to no avail.
To clarify, the cond variable is a factor with three levels (SS, no, and LL), and I believe that the coef command produces coefficients for the continuous bd variable at each of those levels, so what I'm trying to do is test the significance of those coefficients.
There are several issues here.
The main one is that you can really only do significance testing on fixed-effect coefficients; you have coded your model with no fixed effects. You might be looking for
glmer(SS ~ bd + (1|cond), ...)
which will model the overall (population-level) effect of bd and include variation in the intercept among levels of cond.
If you have multiple values of bd represented in each cond group, then you can in principle also allow for variation in the effect of bd among cond groups:
glmer(SS ~ bd + (bd|cond), ...)
However, you have another problem. Three groups (i.e., levels of cond) isn't really enough, in practice, to estimate variability among groups. That's why you're seeing a correlation of -1.00 in your output, which indicates you have a singular fit (e.g. see here for more discussion).
Therefore, another possibility would be to just go ahead and treat cond as a fixed effect (adjusting the contrasts on cond so that the main effect of bd is estimated as the average across groups rather than the effect in the baseline level of cond):
glm(SS~bd*cond,contrasts=list(cond=contr.sum),...)
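To make the fixed-effect version concrete, here is a minimal sketch on simulated data. The data frame sim, the slopes and the sample size are all made up for illustration; only the variable names SS, bd and cond and the three levels of cond come from the question. It shows where the p-values appear once cond is treated as a fixed effect.

```r
set.seed(1)
n <- 300
sim <- data.frame(
  bd   = rnorm(n),
  cond = factor(sample(c("LL", "no", "SS"), n, replace = TRUE))
)
## hypothetical slopes of bd within each level of cond
slope <- c(LL = -1, no = -0.5, SS = 0.5)
eta <- 0.2 + slope[as.character(sim$cond)] * sim$bd
sim$SS <- rbinom(n, 1, plogis(eta))

fit <- glm(SS ~ bd * cond, family = binomial, data = sim,
           contrasts = list(cond = contr.sum))

## Wald z-tests; with sum contrasts the bd row is the slope
## averaged over the levels of cond
ctab <- coef(summary(fit))
ctab

## likelihood-ratio test of whether the bd slope differs among levels of cond
anova(update(fit, . ~ bd + cond), fit, test = "Chisq")
```

The bd:cond rows of ctab test deviations of each level's slope from the average slope, while the anova() call tests all of those deviations jointly.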

Interpreting the output of summary(glmer(...)) in R

I'm an R noob, I hope you can help me:
I'm trying to analyse a dataset in R, but I'm not sure how to interpret the output of summary(glmer(...)) and the documentation isn't a big help:
> data_chosen_stim<-glmer(open_chosen_stim~closed_chosen_stim+day+(1|ID),family=binomial,data=chosenMovement)
> summary(data_chosen_stim)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: open_chosen_stim ~ closed_chosen_stim + day + (1 | ID)
Data: chosenMovement
AIC BIC logLik deviance df.resid
96.7 105.5 -44.4 88.7 62
Scaled residuals:
Min 1Q Median 3Q Max
-1.4062 -1.0749 0.7111 0.8787 1.0223
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 0 0
Number of obs: 66, groups: ID, 35
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.4511 0.8715 0.518 0.605
closed_chosen_stim2 0.4783 0.5047 0.948 0.343
day -0.2476 0.5060 -0.489 0.625
Correlation of Fixed Effects:
(Intr) cls__2
clsd_chsn_2 -0.347
day -0.916 0.077
I understand the GLM behind it, but I can't see the weights of the independent variables and their error bounds.
update: weights.merMod already has a type argument ...
I think what you're looking for is weights(object,type="working").
I believe these are the diagonal elements of W in your notation?
Here's a trivial example that matches up the results of glm and glmer (since the random effect is bogus and gets an estimated variance of zero, the fixed effects, weights, etc. converge to the same values).
Note that the weights() accessor returns the prior weights by default (these are all equal to 1 for the example below).
Example (from ?glm):
d.AD <- data.frame(treatment=gl(3,3),
outcome=gl(3,1,9),
counts=c(18,17,15,20,10,20,25,13,12))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(),
data=d.AD)
library(lme4)
d.AD$f <- 1 ## dummy grouping variable
glmer.D93 <- glmer(counts ~ outcome + treatment + (1|f),
family = poisson(),
data=d.AD,
control=glmerControl(check.nlev.gtr.1="ignore"))
Fixed effects and weights are the same:
all.equal(fixef(glmer.D93),coef(glm.D93)) ## TRUE
all.equal(unname(weights(glm.D93,type="working")),
weights(glmer.D93,type="working"),
tol=1e-7) ## TRUE
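If "weights of the independent variables and their error bounds" in the question means the fixed-effect coefficients and their confidence limits rather than the working weights, those can be pulled out of the fitted object as well. This is a sketch under that assumption, using the cbpp data that ships with lme4 instead of the asker's chosenMovement data. The names gm1, est and ci are hypothetical.

```r
library(lme4)

## binomial GLMM on a built-in data set
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

## estimates, standard errors, z values and p-values as a matrix
est <- coef(summary(gm1))

## Wald 95% confidence intervals for the fixed effects only
ci <- confint(gm1, parm = "beta_", method = "Wald")

cbind(est[, c("Estimate", "Std. Error")], ci)
```

Wald intervals are quick but approximate; method = "profile" or method = "boot" are slower alternatives that confint() also offers for merMod fits.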

Cannot run lmer from within a function

I am running into a problem trying to embed lmer within a function. Here is a reproducible example using data from lexdec. If I run lmer on the data frame directly, there is no problem. For example, say that I want to see whether reading times in a lexical decision task differed as a function of Trial. There were 2 types of word stimuli, "animal" (e.g. "dog") and "plant" (e.g. "cherry"). I can compute a mixed-effects model for animal words:
library(languageR) #load lexdec data
library(lme4) #load lmer()
s <- summary(lmer(RT ~ Trial + (1|Subject) + (1|Word), data = lexdec[lexdec$Class== "animal", ]))
s #this works well
However, if I embed the lmer model inside a function (say to not type the same command for each level of class) I get an error message. Do you know why? Any suggestions will be much appreciated!
#lmer() is now embedded in a function
compute.lmer <- function(df,class) {
m <- lmer(RT ~ Trial + (1|Subject) + (1|Word),data = df[df$Class== class, ])
m <- summary(m)
return(m)
}
#Now I can use this function to iterate over the 2 levels of the **Class** factor
for (c in levels(lexdec$Class)){
s <- compute.lmer(lexdec,c)
print(c)
print(s)
}
#But this gives an error message
Error in `colnames<-`(`*tmp*`, value = c("Estimate", "Std. Error", "df", :
length of 'dimnames' [2] not equal to array extent
I don't know what the problem is, your code runs just fine for me. (Are your packages up to date? What R version are you running? Have you cleaned your workspace and tried your code from scratch?)
That said, this is a great use case for plyr::dlply. I would do it like this:
library(languageR)
library(lme4)
library(plyr)
stats <- dlply(lexdec,
.variables = c("Class"),
.fun=function(x) return(summary(lmer(RT ~ Trial + (1 | Subject) +
(1 | Word), data = x))))
names(stats) <- levels(lexdec$Class)
Which then yields
> stats[["plant"]]
Linear mixed model fit by REML ['lmerMod']
Formula: RT ~ Trial + (1 | Subject) + (1 | Word)
Data: x
REML criterion at convergence: -389.5
Scaled residuals:
Min 1Q Median 3Q Max
-2.2647 -0.6082 -0.1155 0.4502 6.0593
Random effects:
Groups Name Variance Std.Dev.
Word (Intercept) 0.003718 0.06097
Subject (Intercept) 0.023293 0.15262
Residual 0.028697 0.16940
Number of obs: 735, groups: Word, 35; Subject, 21
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.3999245 0.0382700 167.23
Trial -0.0001702 0.0001357 -1.25
Correlation of Fixed Effects:
(Intr)
Trial -0.379
When I run your code (copied and pasted without modification), I get similar output. It's identical except for the Data: line.
stats = list()
compute.lmer <- function(df,class) {
m <- lmer(RT ~ Trial + (1|Subject) + (1|Word),data = df[df$Class== class, ])
m <- summary(m)
return(m)
}
for (c in levels(lexdec$Class)){
s <- compute.lmer(lexdec,c)
stats[[c]] <- s
}
> stats[["plant"]]
Linear mixed model fit by REML ['lmerMod']
Formula: RT ~ Trial + (1 | Subject) + (1 | Word)
Data: df[df$Class == class, ]
REML criterion at convergence: -389.5
Scaled residuals:
Min 1Q Median 3Q Max
-2.2647 -0.6082 -0.1155 0.4502 6.0593
Random effects:
Groups Name Variance Std.Dev.
Word (Intercept) 0.003718 0.06097
Subject (Intercept) 0.023293 0.15262
Residual 0.028697 0.16940
Number of obs: 735, groups: Word, 35; Subject, 21
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.3999245 0.0382700 167.23
Trial -0.0001702 0.0001357 -1.25
Correlation of Fixed Effects:
(Intr)
Trial -0.379
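For completeness, the same per-level fitting can be done in base R with split() and lapply(), without plyr. The sketch below assumes the sleepstudy data from lme4 with a made-up two-level Group column, so that it runs without languageR. With lexdec, the split would be on lexdec$Class and the formula would be the one from the question. The names dat, fit_one and stats_base are hypothetical.

```r
library(lme4)

dat <- sleepstudy
## hypothetical two-level grouping, for illustration only
dat$Group <- factor(ifelse(as.integer(dat$Subject) <= 9, "a", "b"))

## fit the model on one subset and return its summary
fit_one <- function(d) {
  summary(lmer(Reaction ~ Days + (1 | Subject), data = d))
}

## one summary per level of Group, named by level
stats_base <- lapply(split(dat, dat$Group), fit_one)
names(stats_base)
```

Because split() names the list by factor level, stats_base[["a"]] can be used in the same way as stats[["plant"]] above.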

Resources