Colnames error after running Summary() in mixed model - r

R version 3.1.0 (2014-04-10)
lme4 package version 1.1-6
lmerTest package version 2.0-6
I am currently working with lmer and lmerTest for my analysis.
Every time I add an effect to the random structure, I get the following error when running summary():
#Fitting a mixed model:
TRT5ToVerb.lmer3 = lmer(TRT5ToVerb ~ Group + Condition + (1+Condition|Participant) + (1|Trial), data=AllData, REML=FALSE, na.action=na.omit)
summary(TRT5ToVerb.lmer3)
Error in `colnames<-`(`*tmp*`, value = c("Estimate", "Std. Error", "df", : length of 'dimnames' [2] not equal to array extent
If I leave the structure like this:
TRT5ToVerb.lmer2 = lmer(TRT5ToVerb ~ Group + Condition + (1|Participant) + (1|Trial), data=AllData, REML=FALSE, na.action=na.omit)
there is no error when I run summary(TRT5ToVerb.lmer2); it returns AIC, BIC, logLik, deviance, estimates of the random effects, estimates of the fixed effects and their corresponding p-values, etc.
So, apparently something happens when I run lmerTest, despite the fact that the object TRT5ToVerb.lmer3 is there. The only difference between both is the random structure: (1+Condition|Participant) vs. (1|Participant)
Some characteristics of my data:
Both Condition and Group are categorical variables: Condition comprises 3 levels, and Group 2.
The dependent variable (TRT5ToVerb) is continuous: it corresponds to reading time in ms.
This is a repeated-measures experiment, with 48 observations per participant (28 participants).
I read this thread, but I cannot see a clear solution. Could it be that I have to transform my data frame to long format?
And if so, then how do I work with that in lmer?
I hope it is not that.
Thanks!
Disclaimer: I am neither an expert in R, nor in statistics, so please, have some patience.

(Should be a comment, but too long/code formatting etc.)
This fake example seems to work fine with lmerTest 2.0-6 and a development version of lme4 (1.1-8; but I wouldn't expect there to be any relevant differences from 1.1-6 for this example ...)
AllData <- expand.grid(Condition=factor(1:3),Group=factor(1:2),
Participant=1:28,Trial=1:8)
form <- TRT5ToVerb ~ Group + Condition + (1+Condition|Participant) + (1|Trial)
library(lme4)
set.seed(101)
AllData$TRT5ToVerb <- simulate(form[-2],
newdata=AllData,
family=gaussian,
newparams=list(theta=rep(1,7),sigma=1,beta=rep(0,4)))[[1]]
library(lmerTest)
lmer3 <- lmer(form,data=AllData,REML=FALSE)
summary(lmer3)
Produces:
Linear mixed model fit by maximum likelihood ['merModLmerTest']
Formula: TRT5ToVerb ~ Group + Condition + (1 + Condition | Participant) +
(1 | Trial)
Data: AllData
AIC BIC logLik deviance df.resid
4073.6 4136.0 -2024.8 4049.6 1332
Scaled residuals:
Min 1Q Median 3Q Max
-2.97773 -0.65923 0.02319 0.66454 2.98854
Random effects:
Groups Name Variance Std.Dev. Corr
Participant (Intercept) 0.8546 0.9245
Condition2 1.3596 1.1660 0.58
Condition3 3.3558 1.8319 0.44 0.82
Trial (Intercept) 0.9978 0.9989
Residual 0.9662 0.9829
Number of obs: 1344, groups: Participant, 28; Trial, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.49867 0.39764 12.40000 1.254 0.233
Group2 0.03002 0.05362 1252.90000 0.560 0.576
Condition2 -0.03777 0.22994 28.00000 -0.164 0.871
Condition3 -0.27796 0.35237 28.00000 -0.789 0.437
Correlation of Fixed Effects:
(Intr) Group2 Cndtn2
Group2 -0.067
Condition2 0.220 0.000
Condition3 0.172 0.000 0.794

Related

Interpreting output from emmeans::contrast

I have data from a longitudinal study and calculated the regression using the lme4::lmer function. After that I calculated the contrasts for these data but I am having difficulty interpreting my results, as they were unexpected. I think I might have made a mistake in the code. Unfortunately I couldn't replicate my results with an example, but I will post both the failed example and my actual results below.
My results:
library(lme4)
library(lmerTest)
library(emmeans)
#regression
regmemory <- lmer(memory ~ as.factor(QuartileConsumption)*Age+
(1 + Age | ID) + sex + education +
HealthScore, CognitionData)
#results
summary(regmemory)
#Fixed effects:
# Estimate Std. Error df t value Pr(>|t|)
#(Intercept) -7.981e-01 9.803e-02 1.785e+04 -8.142 4.15e-16 ***
#as.factor(QuartileConsumption)2 -8.723e-02 1.045e-01 2.217e+04 -0.835 0.40376
#as.factor(QuartileConsumption)3 5.069e-03 1.036e-01 2.226e+04 0.049 0.96097
#as.factor(QuartileConsumption)4 -2.431e-02 1.030e-01 2.213e+04 -0.236 0.81337
#Age -1.709e-02 1.343e-03 1.989e+04 -12.721 < 2e-16 ***
#sex 3.247e-01 1.520e-02 1.023e+04 21.355 < 2e-16 ***
#education 2.979e-01 1.093e-02 1.061e+04 27.266 < 2e-16 ***
#HealthScore -1.098e-06 5.687e-07 1.021e+04 -1.931 0.05352 .
#as.factor(QuartileConsumption)2:Age 1.101e-03 1.842e-03 1.951e+04 0.598 0.55006
#as.factor(QuartileConsumption)3:Age 4.113e-05 1.845e-03 1.935e+04 0.022 0.98221
#as.factor(QuartileConsumption)4:Age 1.519e-03 1.851e-03 1.989e+04 0.821 0.41174
#contrasts
emmeans(regmemory, poly ~ QuartileConsumption * Age)$contrast
#$contrasts
# contrast estimate SE df z.ratio p.value
# linear 0.2165 0.0660 Inf 3.280 0.0010
# quadratic 0.0791 0.0289 Inf 2.733 0.0063
# cubic -0.0364 0.0642 Inf -0.567 0.5709
The interaction terms in the regression results are not significant, but the linear contrast is. Shouldn't the p-value for the contrast be non-significant?
Below is the code I wrote to try to recreate these results, but failed:
library(dplyr)
library(lme4)
library(lmerTest)
library(emmeans)
data("sleepstudy")
#create quartile column
sleepstudy$Quartile <- sample(1:4, size = nrow(sleepstudy), replace = T)
#regression
model1 <- lmer(Reaction ~ Days * as.factor(Quartile) + (1 + Days | Subject), data = sleepstudy)
#results
summary(model1)
#Fixed effects:
# Estimate Std. Error df t value Pr(>|t|)
#(Intercept) 258.1519 9.6513 54.5194 26.748 < 2e-16 ***
#Days 9.8606 2.0019 43.8516 4.926 1.24e-05 ***
#as.factor(Quartile)2 -11.5897 11.3420 154.1400 -1.022 0.308
#as.factor(Quartile)3 -5.0381 11.2064 155.3822 -0.450 0.654
#as.factor(Quartile)4 -10.7821 10.8798 154.0820 -0.991 0.323
#Days:as.factor(Quartile)2 0.5676 2.1010 152.1491 0.270 0.787
#Days:as.factor(Quartile)3 0.2833 2.0660 155.5669 0.137 0.891
#Days:as.factor(Quartile)4 1.8639 2.1293 153.1315 0.875 0.383
#contrast
emmeans(model1, poly ~ Quartile*Days)$contrast
#contrast estimate SE df t.ratio p.value
# linear -1.91 18.78 149 -0.102 0.9191
# quadratic 10.40 8.48 152 1.227 0.2215
# cubic -18.21 18.94 150 -0.961 0.3379
In this example, the p-value for the linear contrast is non-significant, just like the interactions from the regression. Did I do something wrong, or are these results to be expected?
Look at the emmeans() call for the original model:
emmeans(regmemory, poly ~ QuartileConsumption * Age)
This requests that we obtain marginal means for combinations of QuartileConsumption and Age, and obtain polynomial contrasts from those results. It appears that Age is a quantitative variable, so in computing the marginal means, we just use the mean value of Age (see documentation for ref_grid() and vignette("basics", "emmeans")). So the marginal means display, which wasn't shown in the OP, will be in this general form:
QuartileConsumption Age emmean
------------------------------------
1 <mean> <est1>
2 <mean> <est2>
3 <mean> <est3>
4 <mean> <est4>
... and the contrasts shown will be the linear, quadratic, and cubic trends of those four estimates, in the order shown.
Note that these marginal means have nothing to do with the interaction effect; they are just predictions from the model for the four levels of QuartileConsumption at the mean Age (and mean education, mean health score), averaged over the two sexes, if I understand the data structure correctly. So essentially the polynomial contrasts estimate polynomial trends of the 4-level factor at the mean age. And note in particular that age is held constant, so we certainly are not looking at any effects of Age.
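To make the arithmetic concrete, here is a minimal base-R sketch of what polynomial contrasts of four estimates compute. The four values in est are made up for illustration (they are not the OP's marginal means), and the integer coefficients are the standard orthogonal-polynomial coefficients for four equally spaced levels, which I believe is the scaling the "poly" method reports.

```r
# Hypothetical marginal means for the four quartiles at the mean Age
# (illustrative numbers only, not taken from the OP's model)
est <- c(Q1 = 0.10, Q2 = 0.15, Q3 = 0.22, Q4 = 0.18)

# Orthogonal polynomial coefficients for 4 equally spaced levels
coefs <- rbind(
  linear    = c(-3, -1,  1, 3),
  quadratic = c( 1, -1, -1, 1),
  cubic     = c(-1,  3, -3, 1)
)

# Each contrast is a weighted sum of the four estimates; Age never enters
trends <- drop(coefs %*% est)
print(trends)

# The same directions, up to scaling, come from stats::contr.poly():
# the correlation matrix between the two sets of columns is the identity
print(round(cor(t(coefs), contr.poly(4)), 6))
```

Nothing in this calculation involves a slope in Age, which is why the result can be significant even when every interaction term is not.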
I am guessing what you want to be doing to examine the interaction is to assess how the age trend varies over the four levels of that factor. If that is the case, one useful thing to do would be something like
slopes <- emtrends(regmemory, ~ QuartileConsumption, var = "Age")
slopes # display the estimated slope at each level
pairs(slopes) # pairwise comparisons of these slopes
See vignette("interactions", "emmeans") and the section on interactions with covariates.
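As a rough illustration of what "the slope at each level" means, here is a sketch that uses plain lm() on simulated data instead of the OP's mixed model. All names (d, q, age, y) and the true slopes are hypothetical. The point is only that the per-level slope is the Age coefficient plus the matching interaction coefficient, so the interaction terms are differences between slopes.

```r
set.seed(1)

# Simulated data: a 4-level factor and a quantitative covariate
d <- data.frame(q = factor(rep(1:4, each = 50)),
                age = runif(200, 50, 80))
true_slopes <- c(-0.02, -0.02, -0.01, 0.00)   # assumed values for the demo
d$y <- 1 + true_slopes[as.integer(d$q)] * d$age + rnorm(200, sd = 0.1)

fit <- lm(y ~ q * age, data = d)
b <- coef(fit)

# Slope at level 1 is the 'age' coefficient; at other levels, add the interaction
slopes <- c(b["age"], b["age"] + b[c("q2:age", "q3:age", "q4:age")])
names(slopes) <- paste0("q", 1:4)
print(slopes)

# Cross-check: fitting each level separately gives the same slopes
sep <- sapply(split(d, d$q), function(s) coef(lm(y ~ age, data = s))["age"])
print(unname(sep))
```

In a mixed model the same bookkeeping applies to the fixed-effect coefficients, and emtrends() additionally supplies standard errors and tests for the slopes.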

How to extract only the random effects correlation parameters from an lmer model?

I am trying to extract random effect correlation parameters from an lmer output.
This is my model:
m <- lmer(RT ~ Condition + (1 + Condition| Participant), data)
Giving me the following output:
REML criterion at convergence: 6533.6
Scaled residuals:
Min 1Q Median 3Q Max
-3.4666 -0.6318 -0.0232 0.5696 4.1010
Random effects:
Groups Name Variance Std.Dev. Corr
Participant (Intercept) 0.045483 0.21327
Condition2 0.001271 0.03565 -0.43
Condition3 0.005774 0.07599 -0.04 -0.09
Condition4 0.003817 0.06178 -0.57 0.60 0.69
Residual 0.147445 0.38399
Number of obs: 6841, groups: Participant, 39
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.57546 0.03537 44.544
Condition2 0.06677 0.01420 4.703
Condition3 -0.09581 0.01798 -5.328
Condition4 0.02710 0.01639 1.653
Correlation of Fixed Effects:
(Intr) Cndtn2 Cndtn3
Condition2 -0.334
Condition3 -0.157 0.307
Condition4 -0.476 0.508 0.571
However, I only want to extract specific correlation parameters of the random effects, say the correlation between Condition3 and Condition2 (-0.04). Does anyone know how to do that?
I tried using the VarCorr() function which only displays the results for the random effects, but still does not let me extract specific values from it. I would really appreciate any help!
@mfidino's answer is good. Alternatively:
cc <- cov2cor(VarCorr(m1)$Subject)
cc["Days", "(Intercept)"]
or
cc <- attr(VarCorr(m1)$Subject, "corr")
cc["Days", "(Intercept)"]
The $Subject part is required because lmer models can have multiple random-effects terms, so VarCorr always returns a list of covariance matrices (named according to the corresponding grouping variable).
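For readers without the fitted model at hand, this sketch shows what cov2cor() does to one such matrix. The numbers are typed in from the data.frame(VarCorr(m1)) output in @mfidino's answer; the object name V is just a placeholder.

```r
# Covariance matrix of the Subject random effects (intercept and Days slope),
# copied by hand from the printed VarCorr output for the sleepstudy model
V <- matrix(c(612.100158,  9.604409,
                9.604409, 35.071714),
            nrow = 2,
            dimnames = list(c("(Intercept)", "Days"),
                            c("(Intercept)", "Days")))

# cov2cor() rescales a covariance matrix to a correlation matrix
cc <- cov2cor(V)
print(cc["Days", "(Intercept)"])

# The standard deviations are the square roots of the diagonal
print(sqrt(diag(V)))
```

Indexing by row and column names keeps the lookup readable when the random-effects term has more than two components.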
You want to use lme4::VarCorr to extract those values. Here is an example.
library(lme4)
data("sleepstudy")
sl <- sleepstudy
m1 <- lmer(
Reaction ~ Days + (Days | Subject),
data = sl
)
summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + (Days | Subject)
Data: sl
REML criterion at convergence: 1743.6
Scaled residuals:
Min 1Q Median 3Q Max
-3.9536 -0.4634 0.0231 0.4634 5.1793
Random effects:
Groups Name Variance Std.Dev. Corr
Subject (Intercept) 612.10 24.741
Days 35.07 5.922 0.07
Residual 654.94 25.592
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error t value
(Intercept) 251.405 6.825 36.838
Days 10.467 1.546 6.771
Correlation of Fixed Effects:
(Intr)
Days -0.138
Here, we want to extract that correlation between (Intercept) and Days. We do that like so:
(ranef_vals <- data.frame(VarCorr(m1)))
grp var1 var2 vcov sdcor
1 Subject (Intercept) <NA> 612.100158 24.74065799
2 Subject Days <NA> 35.071714 5.92213766
3 Subject (Intercept) Days 9.604409 0.06555124
4 Residual <NA> <NA> 654.940008 25.59179572
The value we'd want here is on the third row in the sdcor column.
ranef_vals$sdcor[3]
[1] 0.06555124
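Picking the row by position works here but breaks as soon as the random-effects structure changes. A slightly more defensive sketch selects the row by the var1 and var2 labels instead. To keep it self-contained, the data frame below is rebuilt by hand from the printout above rather than from a fitted model.

```r
# Hand-copied version of data.frame(VarCorr(m1)) from the output above
ranef_vals <- data.frame(
  grp   = c("Subject", "Subject", "Subject", "Residual"),
  var1  = c("(Intercept)", "Days", "(Intercept)", NA),
  var2  = c(NA, NA, "Days", NA),
  vcov  = c(612.100158, 35.071714, 9.604409, 654.940008),
  sdcor = c(24.74065799, 5.92213766, 0.06555124, 25.59179572)
)

# Correlation rows are the ones where both var1 and var2 are filled in;
# %in% is used instead of == so that NA entries compare as FALSE
is_corr <- with(ranef_vals,
                grp == "Subject" & var1 %in% "(Intercept)" & var2 %in% "Days")
print(ranef_vals$sdcor[is_corr])
```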

R: Plotting Mixed Effect models plot results

I am working on linguistic data and trying to investigate the realisation of the vowel in words such as NURSE. There are more or less three categories that can be realised, which I coded as <Er, Ir, Vr>. I then measured formant values (F1 and F2) and fitted an LME that predicts the F1 and F2 values from several fixed and random effects; the main effect of interest is a crossed random effect of phoneme (i.e. <Er, Ir, Vr>) and individual. An example model can be found below.
Linear mixed model fit by REML ['lmerMod']
Formula:
F2 ~ (phoneme | individual) + (1 | word) + age + frequency +
(1 | zduration)
Data: nurse_female
REML criterion at convergence: 654.4
Scaled residuals:
Min 1Q Median 3Q Max
-2.09203 -0.20332 0.03263 0.25273 1.37056
Random effects:
Groups Name Variance Std.Dev. Corr
zduration (Intercept) 0.27779 0.5271
word (Intercept) 0.04488 0.2118
individual (Intercept) 0.34181 0.5846
phonemeIr 0.54227 0.7364 -0.82
phonemeVr 1.52090 1.2332 -0.93 0.91
Residual 0.06326 0.2515
Number of obs: 334, groups:
zduration, 280; word, 116; individual, 23
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.79167 0.32138 5.575
age -0.01596 0.00508 -3.142
frequencylow -0.37587 0.18560 -2.025
frequencymid -1.18901 0.27738 -4.286
frequencyvery high -0.68365 0.26564 -2.574
Correlation of Fixed Effects:
(Intr) age frqncyl frqncym
age -0.811
frequencylw -0.531 -0.013
frequencymd -0.333 -0.006 0.589
frqncyvryhg -0.356 0.000 0.627 0.389
The question is now, how would I go about plotting the mean F2 values for each individual and for each of 3 variants <Er, Ir, Vr>?
I tried plotting the random effects as a caterpillar plot and got the following, but I am not sure whether this is accurate or does what I want. If what I have done is right, are there better ways of plotting it?
ranefs_nurse_female_F2 <- ranef(nurse_female_F2.lmer8_2)
dotplot(ranefs_nurse_female_F2)

Recurring error using lmer() function for a linear mixed-effects model

I attempted to construct a linear mixed effects model using lmer function from lme4 package and I ran into a recurring error. The model uses two fixed effects:
DBS_Electrode (factor w/3 levels) and
PostOp_ICA (continuous variable).
I use (1 | Subject) as a random effect term in which Subject is a factor of 38 levels (38 total subjects). Below is the line of code I attempted to run:
LMM.DBS <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject), data = DBS)
I received the following error:
Number of levels of each grouping factor must be < number of observations.
I would appreciate any help, I have tried to navigate this issue myself and have been unsuccessful.
A linear mixed-effects model assumes that there are fewer subjects (levels of the grouping factor) than observations, so lmer() throws an error when that is not the case.
You can think of this formula as telling your model that it should
expect that there’s going to be multiple responses per subject, and
these responses will depend on each subject’s baseline level.
Please consult A very basic tutorial for performing linear mixed effects analyses by B. Winter, p. 4.
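A quick way to see whether a data set will hit this error is to count rows per subject before fitting. The sketch below uses a made-up data frame with the column names from the question and one row per subject; the values themselves are placeholders.

```r
# Hypothetical data: 38 rows, exactly one per subject
DBS <- data.frame(Subject = factor(paste0("X", 1:38)),
                  Distal_Lead_Migration = seq(0.5, 19, by = 0.5))

# Number of observations contributed by each subject
obs_per_subject <- table(DBS$Subject)
print(summary(as.vector(obs_per_subject)))

# lmer() needs strictly fewer grouping levels than observations
n_levels <- nlevels(DBS$Subject)
n_obs <- nrow(DBS)
can_fit <- n_levels < n_obs
print(can_fit)
```

If can_fit is FALSE, a subject-level random intercept cannot be separated from the residual, and an ordinary lm() without the (1 | Subject) term may be the more appropriate model.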
In your case you should increase the number of observations per subject (> 1). Please see the simulation below:
library(lme4)
set.seed(123)
n <- 38
DBS_Electrode <- factor(sample(LETTERS[1:3], n, replace = TRUE))
Distal_Lead_Migration <- 10 * abs(rnorm(n)) # Distal_Lead_Migration in cm
PostOp_ICA <- 5 * abs(rnorm(n))
# number of observations equals the number of subjects
Subject <- paste0("X", 1:n)
DBS <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1|Subject), data = DBS)
# Error: number of levels of each grouping factor must be < number of observations
# number of observations exceeds the number of subjects
Subject <- c(paste0("X", 1:36), "X1", "X37")
DBS <- data.frame(DBS_Electrode, PostOp_ICA, Subject, Distal_Lead_Migration)
model <- lmer(Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1|Subject), data = DBS)
summary(model)
Output:
Linear mixed model fit by REML ['lmerMod']
Formula: Distal_Lead_Migration ~ DBS_Electrode + PostOp_ICA + (1 | Subject)
Data: DBS
REML criterion at convergence: 224.5
Scaled residuals:
Min 1Q Median 3Q Max
-1.24605 -0.73780 -0.07638 0.64381 2.53914
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 2.484e-14 1.576e-07
Residual 2.953e+01 5.434e+00
Number of obs: 38, groups: Subject, 37
Fixed effects:
Estimate Std. Error t value
(Intercept) 7.82514 2.38387 3.283
DBS_ElectrodeB 0.22884 2.50947 0.091
DBS_ElectrodeC -0.60940 2.21970 -0.275
PostOp_ICA -0.08473 0.36765 -0.230
Correlation of Fixed Effects:
(Intr) DBS_EB DBS_EC
DBS_ElctrdB -0.718
DBS_ElctrdC -0.710 0.601
PostOp_ICA -0.693 0.324 0.219

Interpreting the output of summary(glmer(...)) in R

I'm an R noob, I hope you can help me:
I'm trying to analyse a dataset in R, but I'm not sure how to interpret the output of summary(glmer(...)) and the documentation isn't a big help:
> data_chosen_stim<-glmer(open_chosen_stim~closed_chosen_stim+day+(1|ID),family=binomial,data=chosenMovement)
> summary(data_chosen_stim)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: open_chosen_stim ~ closed_chosen_stim + day + (1 | ID)
Data: chosenMovement
AIC BIC logLik deviance df.resid
96.7 105.5 -44.4 88.7 62
Scaled residuals:
Min 1Q Median 3Q Max
-1.4062 -1.0749 0.7111 0.8787 1.0223
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 0 0
Number of obs: 66, groups: ID, 35
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.4511 0.8715 0.518 0.605
closed_chosen_stim2 0.4783 0.5047 0.948 0.343
day -0.2476 0.5060 -0.489 0.625
Correlation of Fixed Effects:
(Intr) cls__2
clsd_chsn_2 -0.347
day -0.916 0.077
I understand the GLM behind it, but I can't see the weights of the independent variables and their error bounds.
update: weights.merMod already has a type argument ...
I think what you're looking for is weights(object,type="working").
I believe these are the diagonal elements of W in your notation?
Here's a trivial example that matches up the results of glm and glmer (since the random effect is bogus and gets an estimated variance of zero, the fixed effects, weights, etc. converge to the same values).
Note that the weights() accessor returns the prior weights by default (these are all equal to 1 for the example below).
Example (from ?glm):
d.AD <- data.frame(treatment=gl(3,3),
outcome=gl(3,1,9),
counts=c(18,17,15,20,10,20,25,13,12))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(),
data=d.AD)
library(lme4)
d.AD$f <- 1 ## dummy grouping variable
glmer.D93 <- glmer(counts ~ outcome + treatment + (1|f),
family = poisson(),
data=d.AD,
control=glmerControl(check.nlev.gtr.1="ignore"))
Fixed effects and weights are the same:
all.equal(fixef(glmer.D93),coef(glm.D93)) ## TRUE
all.equal(unname(weights(glm.D93,type="working")),
weights(glmer.D93,type="working"),
tol=1e-7) ## TRUE
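To show the difference between the two kinds of weights without lme4, here is a base-R sketch on the same ?glm data. For a Poisson model with log link the IRLS working weights work out to the fitted means (W = (dmu/deta)^2 / Var(mu) = mu), so the two should agree up to the convergence tolerance; the loose tolerance below is my own choice, not something from the documentation.

```r
# Same data as the ?glm example above
d.AD <- data.frame(treatment = gl(3, 3),
                   outcome   = gl(3, 1, 9),
                   counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12))
fit <- glm(counts ~ outcome + treatment, family = poisson(), data = d.AD)

# Prior weights: what the user supplied (none here, so all equal to 1)
prior_w <- weights(fit, type = "prior")
print(all(prior_w == 1))

# Working weights: the diagonal of W from the final IRLS iteration
working_w <- weights(fit, type = "working")

# For Poisson/log these are (approximately) the fitted means
print(all.equal(unname(working_w), unname(fitted(fit)), tolerance = 1e-3))
```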
