Regression equation produces model outside of all data - r

I'm fairly confused as to why I produce a regression equation that sits so far outside the range of all the data in the dataset. I have a feeling the equation is very sensitive to data with a big spread, but I'm still confused. Any assistance would be greatly appreciated; stats certainly isn't my first language!
For reference, this is a geochemical thermodynamics problem: I'm trying to fit the Maier-Kelley equation to some experimental data. The Maier-Kelley equation describes how the equilibrium constant (K), in this case for dolomite dissolving in water, changes with temperature (T, here in Kelvin):
log K = A + B*T + C/T + D*log(T) + E/T^2
To cut a long story short (see Hyeong and Capuano, 2001 if interested), log K is in this case equivalent to Log_Ca_Mg (the log of the ratio of calcium to magnesium ion activities).
The experimental data are groundwater measurements from different locations and different depths (identified by FIELD and DepthID, which are my grouping variables for the random effects).
I have included 3 datasets
(Problem)Dataset 1:https://pastebin.com/fe2r2ebA
(Working)Dataset 2:https://pastebin.com/gFgaJ2c8
(Working)Dataset 3:https://pastebin.com/X5USaaNA
Using the following code for dataset 1:
> library(lmerTest)  # loads lme4 and provides the Satterthwaite t-tests seen in the output below
> dat1 <- read.csv("PATH_TO_DATASET_1.txt", header = TRUE, sep = "\t")
> fm1 <- lmer(Log_Ca_Mg ~ 1 + kelvin + I(kelvin^-1) + I(log10(kelvin)) + I(kelvin^-2) + (1|FIELD) + (1|DepthID), data = dat1)
Warning messages:
1: Some predictor variables are on very different scales: consider rescaling
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0196619 (tol = 0.002, component 1)
3: Some predictor variables are on very different scales: consider rescaling
> summary(fm1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Log_Ca_Mg ~ 1 + kelvin + I(kelvin^-1) + I(log10(kelvin)) + I(kelvin^-2) + (1 | FIELD) + (1 | DepthID)
Data: dat1
REML criterion at convergence: -774.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.5464 -0.4538 -0.0671 0.3736 6.4217
Random effects:
Groups Name Variance Std.Dev.
DepthID (Intercept) 0.01035 0.1017
FIELD (Intercept) 0.01081 0.1040
Residual 0.01905 0.1380
Number of obs: 1175, groups: DepthID, 675; FIELD, 410
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 3.368e+03 1.706e+03 4.582e-02 1.974 0.876
kelvin 4.615e-01 2.375e-01 4.600e-02 1.943 0.876
I(kelvin^-1) -1.975e+05 9.788e+04 4.591e-02 -2.018 0.875
I(log10(kelvin)) -1.205e+03 6.122e+02 4.582e-02 -1.968 0.876
I(kelvin^-2) 1.230e+07 5.933e+06 4.624e-02 2.073 0.873
Correlation of Fixed Effects:
(Intr) kelvin I(^-1) I(10()
kelvin 0.999
I(kelvn^-1) -1.000 -0.997
I(lg10(kl)) -1.000 -0.999 0.999
I(kelvn^-2) 0.998 0.994 -0.999 -0.997
fit warnings:
Some predictor variables are on very different scales: consider rescaling
convergence code: 0
Model failed to converge with max|grad| = 0.0196619 (tol = 0.002, component 1)
For Dataset 2
> summary(fm2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Log_Ca_Mg ~ 1 + kelvin + I(kelvin^-1) + I(log10(kelvin)) + I(kelvin^-2) + (1 | FIELD) + (1 | DepthID)
Data: dat2
REML criterion at convergence: -1073.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0816 -0.4772 -0.0581 0.3650 5.6209
Random effects:
Groups Name Variance Std.Dev.
DepthID (Intercept) 0.007368 0.08584
FIELD (Intercept) 0.014266 0.11944
Residual 0.023048 0.15182
Number of obs: 1906, groups: DepthID, 966; FIELD, 537
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) -9.366e+01 2.948e+03 1.283e-03 -0.032 0.999
kelvin -2.798e-02 4.371e-01 1.289e-03 -0.064 0.998
I(kelvin^-1) 2.623e+02 1.627e+05 1.285e-03 0.002 1.000
I(log10(kelvin)) 3.965e+01 1.067e+03 1.283e-03 0.037 0.999
I(kelvin^-2) 2.917e+05 9.476e+06 1.294e-03 0.031 0.999
Correlation of Fixed Effects:
(Intr) kelvin I(^-1) I(10()
kelvin 0.999
I(kelvn^-1) -0.999 -0.997
I(lg10(kl)) -1.000 -0.999 0.999
I(kelvn^-2) 0.998 0.994 -0.999 -0.997
fit warnings:
Some predictor variables are on very different scales: consider rescaling
convergence code: 0
Model failed to converge with max|grad| = 0.0196967 (tol = 0.002, component 1)
For Dataset 3
> summary(fm3)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Log_Ca_Mg ~ 1 + kelvin + I(kelvin^-1) + I(log10(kelvin)) + I(kelvin^-2) + (1 | FIELD) + (1 | DepthID)
Data: dat3
REML criterion at convergence: -1590.1
Scaled residuals:
Min 1Q Median 3Q Max
-4.2546 -0.4987 -0.0379 0.4313 4.5490
Random effects:
Groups Name Variance Std.Dev.
DepthID (Intercept) 0.01311 0.1145
FIELD (Intercept) 0.01424 0.1193
Residual 0.03138 0.1771
Number of obs: 6674, groups: DepthID, 3422; FIELD, 1622
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.260e+03 1.835e+03 9.027e-02 0.687 0.871
kelvin 1.824e-01 2.783e-01 9.059e-02 0.655 0.874
I(kelvin^-1) -7.289e+04 9.961e+04 9.044e-02 -0.732 0.866
I(log10(kelvin)) -4.529e+02 6.658e+02 9.028e-02 -0.680 0.872
I(kelvin^-2) 4.499e+06 5.690e+06 9.104e-02 0.791 0.860
Correlation of Fixed Effects:
(Intr) kelvin I(^-1) I(10()
kelvin 0.999
I(kelvn^-1) -1.000 -0.997
I(lg10(kl)) -1.000 -0.999 0.999
I(kelvn^-2) 0.998 0.994 -0.999 -0.998
fit warnings:
Some predictor variables are on very different scales: consider rescaling
convergence code: 0
unable to evaluate scaled gradient
Model failed to converge: degenerate Hessian with 1 negative eigenvalues
I've plotted 'all the data', but for the regression analysis there is no data above the red line or below the green line: only points with a Log_Ca_Mg value between the red and green lines at any given temperature are included in the regression.
So, looking at the regressions on a plot, dataset 1 is just way off, and since there is no data above the red line this confuses me no end: the regression sits in an area where there is no data at all. For the other two datasets this isn't a problem, and even for smaller datasets (n = 200) the fit lands in approximately the same area. The three datasets look relatively similar when plotted individually.
I'm kind of lost. Any help in understanding this would be appreciated.

What follows is an attempt to diagnose what may be going wrong with your model, using Dataset 1 throughout.
As described in your question, running the original model on Dataset 1 produces warnings:
# original model
fm1 <- lme4::lmer(Log_Ca_Mg ~ 1 + kelvin + I(kelvin^-1) + I(log10(kelvin)) + I(kelvin^-2) + (1|FIELD) +(1|DepthID),data=dat1)
Some predictor variables are on very different scales: consider rescaling
convergence code: 0
Model failed to converge with max|grad| = 0.0196619 (tol = 0.002, component 1)
These warnings and the rest of the output suggest the model has problems, perhaps related to the predictors being on very different scales.
Since fm1 has several predictors that are transformations of the variable kelvin, we can also check the model for collinearity with the car package's vif() function:
# examine collinearity with the vif (variance inflation factors)
> car::vif(fm1)
kelvin I(kelvin^-1) I(log10(kelvin)) I(kelvin^-2)
716333 9200929 7688348 1224275
These vif values suggest the fm1 model suffers from high collinearity.
We can try dropping some of those predictors and examine a simpler model:
fm1_b <- lme4::lmer(Log_Ca_Mg ~ 1 + kelvin + I(kelvin^-1) + (1|FIELD) +(1|DepthID),data=dat1)
When we run the code, we still get a warning about the predictors being on different scales:
Warning message: Some predictor variables are on very different
scales: consider rescaling
At the same time the vif values are much smaller:
# examine collinearity with the vif (variance inflation factors)
> car::vif(fm1_b)
kelvin I(kelvin^-1)
46.48406 46.48406
Following gung's suggestion mentioned in the comments, we can see what happens when we center the kelvin variable:
dat1$kelvin_centered <- as.vector(scale(dat1$kelvin, center= TRUE, scale = FALSE ))
# Make a power transformation on the kelvin_centered variable
dat1$kelvin_centered_pwr <- dat1$kelvin_centered^-1
And check to see if they are correlated
# check the correlation of the centered vars
cor(dat1$kelvin_centered, dat1$kelvin_centered_pwr)
> cor(dat1$kelvin_centered, dat1$kelvin_centered_pwr)
[1] 0.08056641
And construct a different model with the centered variables:
# construct a modifed model
fm1_c <- lme4::lmer(Log_Ca_Mg ~ 1 + kelvin_centered + kelvin_centered_pwr + (1|FIELD) +(1|DepthID),data=dat1)
Notably, we don't see any warnings when we run the code for this model, and the VIF values are quite low:
car::vif(fm1_c)
> car::vif(fm1_c)
kelvin_centered kelvin_centered_pwr
1.005899 1.005899
Conclusion
The original model has a high degree of collinearity. Collinearity can make models unstable, which could account for why fm1 failed to converge and why you are seeing weird predictions in the plots. Model fm1_c may or may not be the correct model for your purpose, but it at least provides a lens for understanding the issue with your original model.
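If all four Maier-Kelley terms are to be kept, one further thing worth trying is to standardise each transformed predictor before fitting. The sketch below (the new column names are mine) silences the scale warning, but the terms remain almost perfectly correlated, so it is not a guaranteed fix for the convergence failure:
# Standardise each transformed predictor (mean 0, SD 1) before fitting
dat1$k_lin  <- as.vector(scale(dat1$kelvin))
dat1$k_inv  <- as.vector(scale(dat1$kelvin^-1))
dat1$k_log  <- as.vector(scale(log10(dat1$kelvin)))
dat1$k_inv2 <- as.vector(scale(dat1$kelvin^-2))
# Same model as fm1, but with all predictors on comparable scales
fm1_d <- lme4::lmer(Log_Ca_Mg ~ k_lin + k_inv + k_log + k_inv2 +
                      (1 | FIELD) + (1 | DepthID), data = dat1)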

I think you are going about this the wrong way. It sounds like you are trying to estimate the parameters A, B, C, D and E in the Maier-Kelley equation. You can do this by using non-linear least squares rather than a linear mixed effects model.
Start by defining a function that replicates the formula:
MK_eq <- function(A, B, C, D, E, Temp)
{
A + B * Temp + C / Temp + D * log10(Temp) + E / (Temp^2)
}
Now we use the nls function to get an estimate for A to E:
mod1 <- nls(Log_Ca_Mg ~ MK_eq(A, B, C, D, E, kelvin),
start = list(A = 1, B = 1, C = 1, D = 1, E = 2), data = dat1)
coef(mod1)
#> A B C D E
#> 4.802008e+03 6.538166e-01 -2.818917e+05 -1.717040e+03 1.755566e+07
and we can create a "regression line" by getting a prediction for every value of kelvin between, say, 275 and 400 in increments of 0.1:
new_data <- data.frame(kelvin = seq(275, 400, 0.1))
new_data$Log_Ca_Mg <- predict(mod1, newdata = new_data)
and we can demonstrate that this is a good approximation by plotting our prediction over the sample:
library(ggplot2)
ggplot(dat1, aes(x = kelvin, y = Log_Ca_Mg)) +
  geom_point() +
  geom_line(data = new_data, linetype = 2, colour = "red", size = 2)
Note that for simplicity I have avoided discussion of the random effects. It is possible to do mixed-effects non-linear least squares using the nlme package, but it is more involved, and the discussion here describes how to do it in more detail than I can.
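For a flavour of what that would look like, here is a sketch only, not a recommended specification: it assumes a random shift in the parameter A for each FIELD and reuses the nls estimates as starting values.
library(nlme)
# Nonlinear mixed model: fixed A-E, random shift in A by FIELD
mod2 <- nlme(Log_Ca_Mg ~ MK_eq(A, B, C, D, E, kelvin),
             fixed  = A + B + C + D + E ~ 1,
             random = A ~ 1 | FIELD,
             start  = coef(mod1),  # starting values from the nls fit above
             data   = dat1)
summary(mod2)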

Related

Interpreting output from emmeans::contrast

I have data from a longitudinal study and fitted a regression using the lme4::lmer function. After that I calculated contrasts for these data, but I am having difficulty interpreting my results, as they were unexpected. I think I might have made a mistake in the code. Unfortunately I couldn't replicate my results with an example, but I will post both the failed example and my actual results below.
My results:
library(lme4)
library(lmerTest)
library(emmeans)
#regression
regmemory <- lmer(memory ~ as.factor(QuartileConsumption)*Age+
(1 + Age | ID) + sex + education +
HealthScore, CognitionData)
#results
summary(regmemory)
#Fixed effects:
# Estimate Std. Error df t value Pr(>|t|)
#(Intercept) -7.981e-01 9.803e-02 1.785e+04 -8.142 4.15e-16 ***
#as.factor(QuartileConsumption)2 -8.723e-02 1.045e-01 2.217e+04 -0.835 0.40376
#as.factor(QuartileConsumption)3 5.069e-03 1.036e-01 2.226e+04 0.049 0.96097
#as.factor(QuartileConsumption)4 -2.431e-02 1.030e-01 2.213e+04 -0.236 0.81337
#Age -1.709e-02 1.343e-03 1.989e+04 -12.721 < 2e-16 ***
#sex 3.247e-01 1.520e-02 1.023e+04 21.355 < 2e-16 ***
#education 2.979e-01 1.093e-02 1.061e+04 27.266 < 2e-16 ***
#HealthScore -1.098e-06 5.687e-07 1.021e+04 -1.931 0.05352 .
#as.factor(QuartileConsumption)2:Age 1.101e-03 1.842e-03 1.951e+04 0.598 0.55006
#as.factor(QuartileConsumption)3:Age 4.113e-05 1.845e-03 1.935e+04 0.022 0.98221
#as.factor(QuartileConsumption)4:Age 1.519e-03 1.851e-03 1.989e+04 0.821 0.41174
#contrasts
emmeans(regmemory, poly ~ QuartileConsumption * Age)$contrast
#$contrasts
# contrast estimate SE df z.ratio p.value
# linear 0.2165 0.0660 Inf 3.280 0.0010
# quadratic 0.0791 0.0289 Inf 2.733 0.0063
# cubic -0.0364 0.0642 Inf -0.567 0.5709
The interaction terms in the regression results are not significant, but the linear contrast is. Shouldn't the p-value for the contrast be non-significant?
Below is the code I wrote to try to recreate these results, but failed:
library(dplyr)
library(lme4)
library(lmerTest)
library(emmeans)
data("sleepstudy")
#create quartile column
sleepstudy$Quartile <- sample(1:4, size = nrow(sleepstudy), replace = T)
#regression
model1 <- lmer(Reaction ~ Days * as.factor(Quartile) + (1 + Days | Subject), data = sleepstudy)
#results
summary(model1)
#Fixed effects:
# Estimate Std. Error df t value Pr(>|t|)
#(Intercept) 258.1519 9.6513 54.5194 26.748 < 2e-16 ***
#Days 9.8606 2.0019 43.8516 4.926 1.24e-05 ***
#as.factor(Quartile)2 -11.5897 11.3420 154.1400 -1.022 0.308
#as.factor(Quartile)3 -5.0381 11.2064 155.3822 -0.450 0.654
#as.factor(Quartile)4 -10.7821 10.8798 154.0820 -0.991 0.323
#Days:as.factor(Quartile)2 0.5676 2.1010 152.1491 0.270 0.787
#Days:as.factor(Quartile)3 0.2833 2.0660 155.5669 0.137 0.891
#Days:as.factor(Quartile)4 1.8639 2.1293 153.1315 0.875 0.383
#contrast
emmeans(model1, poly ~ Quartile*Days)$contrast
#contrast estimate SE df t.ratio p.value
# linear -1.91 18.78 149 -0.102 0.9191
# quadratic 10.40 8.48 152 1.227 0.2215
# cubic -18.21 18.94 150 -0.961 0.3379
In this example, the p-value for the linear contrast is non-significant, just like the interactions in the regression. Did I do something wrong, or are these results to be expected?
Look at the emmeans() call for the original model:
emmeans(regmemory, poly ~ QuartileConsumption * Age)
This requests that we obtain marginal means for combinations of QuartileConsumption and Age, and obtain polynomial contrasts from those results. It appears that Age is a quantitative variable, so in computing the marginal means, we just use the mean value of Age (see documentation for ref_grid() and vignette("basics", "emmeans")). So the marginal means display, which wasn't shown in the OP, will be in this general form:
QuartileConsumption Age emmean
------------------------------------
1 <mean> <est1>
2 <mean> <est2>
3 <mean> <est3>
4 <mean> <est4>
... and the contrasts shown will be the linear, quadratic, and cubic trends of those four estimates, in the order shown.
Note that these marginal means have nothing to do with the interaction effect; they are just predictions from the model for the four levels of QuartileConsumption at the mean Age (and mean education, mean health score), averaged over the two sexes, if I understand the data structure correctly. So essentially the polynomial contrasts estimate polynomial trends of the 4-level factor at the mean age. And note in particular that age is held constant, so we certainly are not looking at any effects of Age.
I am guessing that what you want, in order to examine the interaction, is to assess how the age trend varies over the four levels of that factor. If that is the case, one useful thing to do would be something like
slopes <- emtrends(regmemory, ~ QuartileConsumption, var = "Age")
slopes         # display the estimated slope at each level
pairs(slopes)  # pairwise comparisons of these slopes
See vignette("interactions", "emmeans") and the section on interactions with covariates.
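If you instead want the quartile contrasts evaluated at particular ages rather than only at the mean age, emmeans supports this through its at argument. A sketch (the ages 50, 60 and 70 are placeholders, not values taken from the original data):
# Marginal means for each quartile at a few chosen ages
emm_by_age <- emmeans(regmemory, ~ QuartileConsumption | Age,
                      at = list(Age = c(50, 60, 70)))
contrast(emm_by_age, "poly")  # polynomial contrasts at each chosen age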

lme4 1.1-27.1 error: pwrssUpdate did not converge in (maxit) iterations

I know this error has been discussed before, but each answer on Stack Overflow seems specific to the data in question.
I'm attempting to run the following negative binomial model in lme4:
Model5.binomial<-glmer.nb(countvariable ~ waves + var1 + dummycodedvar2 + dummycodedvar3 + (1|record_id), data=datadfomit)
However, I receive the following error when attempting to run the model:
Error in f_refitNB(lastfit, theta = exp(t), control = control) :pwrssUpdate did not converge in (maxit) iterations
I first ran the model with only 3 predictor variables (waves, var1, dummycodedvar2) and got the same error. But centering the predictors fixed this problem and the model ran fine.
Now with 4 variables (all centered) I expected the model to run smoothly, but receive the error again.
Since every answer on this site seems to point towards a problem in the data, data that replicates the problem can be found here:
https://file.io/3vtX9RwMJ6LF
Your response variable has a lot of zeros.
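A quick numeric check makes the same point; this assumes the data frame is named data, matching the model call below:
mean(data$countvariable == 0)  # proportion of zeros in the response
table(data$countvariable)      # frequency table of the counts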
I would suggest fitting a model that takes account of this, such as a zero-inflated model. The GLMMadaptive package can fit zero-inflated negative binomial mixed effects models:
library(GLMMadaptive)
library(magrittr)  # provides the %>% pipe used below

mixed_model(countvariable ~ waves + var1 + dummycodedvar2 + dummycodedvar3,
            random = ~ 1 | record_id, data = data,
            family = zi.negative.binomial(),
            zi_fixed = ~ var1,
            zi_random = ~ 1 | record_id) %>% summary()
Random effects covariance matrix:
StdDev Corr
(Intercept) 0.8029
zi_(Intercept) 1.0607 -0.7287
Fixed effects:
Estimate Std.Err z-value p-value
(Intercept) 1.4923 0.1892 7.8870 < 1e-04
waves -0.0091 0.0366 -0.2492 0.803222
var1 0.2102 0.0950 2.2130 0.026898
dummycodedvar2 -0.6956 0.1702 -4.0870 < 1e-04
dummycodedvar3 -0.1746 0.1523 -1.1468 0.251451
Zero-part coefficients:
Estimate Std.Err z-value p-value
(Intercept) 1.8726 0.1284 14.5856 < 1e-04
var1 -0.3451 0.1041 -3.3139 0.00091993
log(dispersion) parameter:
Estimate Std.Err
0.4942 0.2859
Integration:
method: adaptive Gauss-Hermite quadrature rule
quadrature points: 11
Optimization:
method: hybrid EM and quasi-Newton
converged: TRUE
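To check whether the zero-inflated specification is actually an improvement, one option is to fit the plain negative binomial mixed model from the same package and compare information criteria. This is a sketch; it assumes the zero-inflated fit above was saved as fm_zinb rather than piped straight into summary():
# Standard negative binomial mixed model, no zero-inflation part
fm_nb <- mixed_model(countvariable ~ waves + var1 + dummycodedvar2 + dummycodedvar3,
                     random = ~ 1 | record_id, data = data,
                     family = negative.binomial())
AIC(fm_nb)    # compare with the zero-inflated fit; lower is better
AIC(fm_zinb)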

R: Plotting mixed-effects model results

I am working on linguistic data, trying to investigate the realisation of the vowel in words such as NURSE. There are more or less 3 categories that can be realised, which I coded as <Er, Ir, Vr>. I then measured formant values (F1 and F2). Then I created an LME that predicts the F1 and F2 values with different fixed and random effects, but the main effect is a crossed random effect of phoneme (i.e. <Er, Ir, Vr>) and individual. An example model can be found below.
Linear mixed model fit by REML ['lmerMod']
Formula:
F2 ~ (phoneme | individual) + (1 | word) + age + frequency +
(1 | zduration)
Data: nurse_female
REML criterion at convergence: 654.4
Scaled residuals:
Min 1Q Median 3Q Max
-2.09203 -0.20332 0.03263 0.25273 1.37056
Random effects:
Groups Name Variance Std.Dev. Corr
zduration (Intercept) 0.27779 0.5271
word (Intercept) 0.04488 0.2118
individual (Intercept) 0.34181 0.5846
phonemeIr 0.54227 0.7364 -0.82
phonemeVr 1.52090 1.2332 -0.93 0.91
Residual 0.06326 0.2515
Number of obs: 334, groups:
zduration, 280; word, 116; individual, 23
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.79167 0.32138 5.575
age -0.01596 0.00508 -3.142
frequencylow -0.37587 0.18560 -2.025
frequencymid -1.18901 0.27738 -4.286
frequencyvery high -0.68365 0.26564 -2.574
Correlation of Fixed Effects:
(Intr) age frqncyl frqncym
age -0.811
frequencylw -0.531 -0.013
frequencymd -0.333 -0.006 0.589
frqncyvryhg -0.356 0.000 0.627 0.389
The question is now: how would I go about plotting the mean F2 values for each individual and for each of the 3 variants <Er, Ir, Vr>?
I tried plotting the random effects as a caterpillar plot and get the following, but I am not sure if this is accurate or does what I want. If what I have done is right, are there any better ways of plotting it?
library(lattice)  # dotplot() comes from lattice
ranefs_nurse_female_F2 <- ranef(nurse_female_F2.lmer8_2)
dotplot(ranefs_nurse_female_F2)
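For the stated goal (mean F2 per individual and per phoneme), a plot built from the model's fitted values may be closer to what is wanted. A sketch, assuming nurse_female contains the individual and phoneme columns used in the formula and that no rows were dropped for missing values:
library(ggplot2)
nurse_female$fit_F2 <- fitted(nurse_female_F2.lmer8_2)  # model-based F2 per observation
ggplot(nurse_female, aes(x = individual, y = fit_F2, colour = phoneme)) +
  stat_summary(fun = mean, geom = "point", size = 2) +  # mean per individual and phoneme
  labs(y = "Model-based mean F2") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))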

Colnames error after running summary() in mixed model

R version 3.1.0 (2014-04-10)
lme4 package version 1.1-6
lmerTest package version 2.0-6
I am currently working with lmer and lmerTest for my analysis.
Every time I add an effect to the random structure, I get the following error when running summary():
#Fitting a mixed model:
TRT5ToVerb.lmer3 = lmer(TRT5ToVerb ~ Group + Condition + (1+Condition|Participant) + (1|Trial), data=AllData, REML=FALSE, na.action=na.omit)
summary(TRT5ToVerb.lmer3)
Error in `colnames<-`(`*tmp*`, value = c("Estimate", "Std. Error", "df", : length of 'dimnames' [2] not equal to array extent
If I leave the structure like this:
TRT5ToVerb.lmer2 = lmer(TRT5ToVerb ~ Group + Condition + (1|Participant) + (1|Trial), data=AllData, REML=FALSE, na.action=na.omit)
there is no error when running summary(TRT5ToVerb.lmer2); it returns AIC, BIC, logLik, deviance, estimates of the random effects, estimates of the fixed effects and their corresponding p-values, etc.
So apparently something happens when I run lmerTest, despite the fact that the object TRT5ToVerb.lmer3 exists. The only difference between the two models is the random structure: (1+Condition|Participant) vs. (1|Participant).
Some characteristics of my data:
- Both Condition and Group are categorical variables: Condition comprises 3 levels, and Group 2.
- The dependent variable (TRT5ToVerb) is continuous: it corresponds to reading time in ms.
- This is a repeated-measures experiment, with 48 observations per participant (participants = 28).
I read this thread, but I cannot see a clear solution. Could it be that I have to transform my data frame to long format? And if so, how do I then work with that in lmer? I hope it is not that.
Thanks!
Disclaimer: I am neither an expert in R, nor in statistics, so please, have some patience.
(Should be a comment, but too long/code formatting etc.)
This fake example seems to work fine with lmerTest 2.0-6 and a development version of lme4 (1.1-8; but I wouldn't expect there to be any relevant differences from 1.1-6 for this example ...)
AllData <- expand.grid(Condition = factor(1:3), Group = factor(1:2),
                       Participant = 1:28, Trial = 1:8)
form <- TRT5ToVerb ~ Group + Condition + (1 + Condition | Participant) + (1 | Trial)
library(lme4)
set.seed(101)
AllData$TRT5ToVerb <- simulate(form[-2],
                               newdata = AllData,
                               family = gaussian,
                               newparams = list(theta = rep(1, 7), sigma = 1, beta = rep(0, 4)))[[1]]
library(lmerTest)
lmer3 <- lmer(form, data = AllData, REML = FALSE)
summary(lmer3)
Produces:
Linear mixed model fit by maximum likelihood ['merModLmerTest']
Formula: TRT5ToVerb ~ Group + Condition + (1 + Condition | Participant) +
(1 | Trial)
Data: AllData
AIC BIC logLik deviance df.resid
4073.6 4136.0 -2024.8 4049.6 1332
Scaled residuals:
Min 1Q Median 3Q Max
-2.97773 -0.65923 0.02319 0.66454 2.98854
Random effects:
Groups Name Variance Std.Dev. Corr
Participant (Intercept) 0.8546 0.9245
Condition2 1.3596 1.1660 0.58
Condition3 3.3558 1.8319 0.44 0.82
Trial (Intercept) 0.9978 0.9989
Residual 0.9662 0.9829
Number of obs: 1344, groups: Participant, 28; Trial, 8
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.49867 0.39764 12.40000 1.254 0.233
Group2 0.03002 0.05362 1252.90000 0.560 0.576
Condition2 -0.03777 0.22994 28.00000 -0.164 0.871
Condition3 -0.27796 0.35237 28.00000 -0.789 0.437
Correlation of Fixed Effects:
(Intr) Group2 Cndtn2
Group2 -0.067
Condition2 0.220 0.000
Condition3 0.172 0.000 0.794
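Since the same structure fits cleanly under the newer versions used above, comparing and updating the installed packages is a sensible first step (a sketch):
packageVersion("lme4")      # the example above used 1.1-8; the question reports 1.1-6
packageVersion("lmerTest")  # 2.0-6 in both cases
# update.packages(oldPkgs = c("lme4", "lmerTest"))  # then restart R and refit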

Interpreting the output of summary(glmer(...)) in R

I'm an R noob, I hope you can help me:
I'm trying to analyse a dataset in R, but I'm not sure how to interpret the output of summary(glmer(...)), and the documentation isn't a big help:
> data_chosen_stim<-glmer(open_chosen_stim~closed_chosen_stim+day+(1|ID),family=binomial,data=chosenMovement)
> summary(data_chosen_stim)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: open_chosen_stim ~ closed_chosen_stim + day + (1 | ID)
Data: chosenMovement
AIC BIC logLik deviance df.resid
96.7 105.5 -44.4 88.7 62
Scaled residuals:
Min 1Q Median 3Q Max
-1.4062 -1.0749 0.7111 0.8787 1.0223
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 0 0
Number of obs: 66, groups: ID, 35
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.4511 0.8715 0.518 0.605
closed_chosen_stim2 0.4783 0.5047 0.948 0.343
day -0.2476 0.5060 -0.489 0.625
Correlation of Fixed Effects:
(Intr) cls__2
clsd_chsn_2 -0.347
day -0.916 0.077
I understand the GLM behind it, but I can't see the weights of the independent variables and their error bounds.
update: weights.merMod already has a type argument ...
I think what you're looking for is weights(object, type = "working").
I believe these are the diagonal elements of W in your notation?
Here's a trivial example that matches up the results of glm and glmer (since the random effect is bogus and gets an estimated variance of zero, the fixed effects, weights, etc. converge to the same values).
Note that the weights() accessor returns the prior weights by default (these are all equal to 1 for the example below).
Example (from ?glm):
d.AD <- data.frame(treatment = gl(3, 3),
                   outcome = gl(3, 1, 9),
                   counts = c(18, 17, 15, 20, 10, 20, 25, 13, 12))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(),
               data = d.AD)
library(lme4)
d.AD$f <- 1  ## dummy grouping variable
glmer.D93 <- glmer(counts ~ outcome + treatment + (1 | f),
                   family = poisson(),
                   data = d.AD,
                   control = glmerControl(check.nlev.gtr.1 = "ignore"))
Fixed effects and weights are the same:
all.equal(fixef(glmer.D93),coef(glm.D93)) ## TRUE
all.equal(unname(weights(glm.D93,type="working")),
weights(glmer.D93,type="working"),
tol=1e-7) ## TRUE
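If the "weights of the independent variables and their error bounds" in the question actually refer to the fixed-effect coefficients and their uncertainty, those can be read straight off the fitted model; a minimal sketch using the asker's object name:
coef(summary(data_chosen_stim))                              # estimates, SEs, z and p values
confint(data_chosen_stim, parm = "beta_", method = "Wald")   # Wald CIs for the fixed effects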
