I have this fake dataset that describes the effect of air temperature on the growth of two plant species (a and b).
data1 <- read.csv(text = "
year,block,specie,temperature,growth
2019,1,a,0,7.217496163
2019,1,a,1,2.809792001
2019,1,a,2,16.09505635
2019,1,a,3,24.52673264
2019,1,a,4,49.98455022
2019,1,a,5,35.78568291
2019,2,a,0,8.332533323
2019,2,a,1,16.5997836
2019,2,a,2,11.95833966
2019,2,a,3,34.4
2019,2,a,4,54.19081002
2019,2,a,5,41.1291734
2019,1,b,0,14.07939683
2019,1,b,1,13.73257973
2019,1,b,2,31.33076651
2019,1,b,3,44.81995622
2019,1,b,4,79.27999184
2019,1,b,5,75.0527336
2019,2,b,0,14.18896232
2019,2,b,1,29.00692747
2019,2,b,2,27.83736734
2019,2,b,3,61.46006916
2019,2,b,4,93.91100024
2019,2,b,5,92.47922985
2020,1,a,0,4.117536842
2020,1,a,1,12.70711508
2020,1,a,2,16.09570046
2020,1,a,3,29.49417491
2020,1,a,4,35.94571498
2020,1,a,5,50.74477018
2020,2,a,0,3.490585144
2020,2,a,1,3.817105315
2020,2,a,2,22.43112718
2020,2,a,3,14.4
2020,2,a,4,46.84223604
2020,2,a,5,39.10398717
2020,1,b,0,10.17712428
2020,1,b,1,22.04514586
2020,1,b,2,30.37221799
2020,1,b,3,51.80333619
2020,1,b,4,76.22765452
2020,1,b,5,78.37284714
2020,2,b,0,7.308139613
2020,2,b,1,22.03241605
2020,2,b,2,45.88385871
2020,2,b,3,30.43669633
2020,2,b,4,76.12904988
2020,2,b,5,85.9324324
")
The experiment was conducted over two years in a block design (blocks nested within years). The goal is to estimate how much growth changes per unit change in temperature. There is also a need to provide a measure of uncertainty (standard error) for this estimate. The same needs to be done for the growth recorded at a temperature of zero degrees.
library(lme4)
library(lmerTest)
library(lsmeans)
test.model.1 <- lmer(growth ~
                       specie +
                       temperature +
                       specie*temperature +
                       (1|year) +
                       (1|year:block),
                     data = data1,
                     REML = TRUE,
                     control = lmerControl(check.nobs.vs.nlev = "ignore",
                                           check.nobs.vs.rankZ = "ignore",
                                           check.nobs.vs.nRE = "ignore"))
summary(test.model.1)
The summary gives me this output for the fixed effects:
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: growth ~ specie + temperature + specie * temperature + (1 | year) +
(1 | year:block)
Data: data1
Control: lmerControl(check.nobs.vs.nlev = "ignore", check.nobs.vs.rankZ = "ignore",
check.nobs.vs.nRE = "ignore")
REML criterion at convergence: 331.3
Scaled residuals:
Min 1Q Median 3Q Max
-2.6408 -0.7637 0.1516 0.5248 2.4809
Random effects:
Groups Name Variance Std.Dev.
year:block (Intercept) 6.231 2.496
year (Intercept) 0.000 0.000
Residual 74.117 8.609
Number of obs: 48, groups: year:block, 4; year, 2
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 2.699 3.356 26.256 0.804 0.428
specieb 4.433 4.406 41.000 1.006 0.320
temperature 8.624 1.029 41.000 8.381 2.0e-10 ***
specieb:temperature 7.088 1.455 41.000 4.871 1.7e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) specib tmprtr
specieb -0.656
temperature -0.767 0.584
spcb:tmprtr 0.542 -0.826 -0.707
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
From this I can get the growth at 0 degrees for specie "a" (2.699) and for specie "b" (2.699 + 4.433 = 7.132). Also, the rate of change in growth per unit change in temperature is 8.624 for specie "a" and 8.624 + 7.088 = 15.712 for specie "b". The problem I have is that the standard error reported in summary() is for the marginal estimate, not for the actual value of the parameter. For instance, the standard error for 4.433 (specieb) is 4.406, but that is not the standard error for the actual growth at 0 degrees for specie "b", which is 7.132. What I am looking for is the standard error of, let's say, 7.132. Also, it'd be nice to have all the calculations I did by hand performed automatically.
I tried using emmeans() from the lsmeans package, but I didn't succeed.
emmeans(test.model.1, growth ~ specie*temperature)
Error:
Error in contrast.emmGrid(object = new("emmGrid", model.info = list(call = lmer(formula = growth ~ :
Contrast function 'growth.emmc' not found
I think your main problem is that you don't need the response variable on the left side of the formula you give to emmeans (the package assumes that you're going to use the same response variable as in the original model!). The left-hand side of the formula is reserved for specifying contrasts, e.g. pairwise ~ ...; see help("contrast-methods", package = "emmeans").
I think you might be looking for:
emmeans(test.model.1, ~specie, at = list(temperature=0))
NOTE: Results may be misleading due to involvement in interactions
specie emmean SE df lower.CL upper.CL
a 2.70 3.36 11.3 -4.665 10.1
b 7.13 3.36 11.3 -0.232 14.5
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
If you don't specify the value of temperature, then emmeans uses (I think) the overall average temperature.
For slopes, you want emtrends:
emtrends(test.model.1, ~specie, var = "temperature")
specie temperature.trend SE df lower.CL upper.CL
a 8.62 1.03 41 6.55 10.7
b 15.71 1.03 41 13.63 17.8
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
I highly recommend the extensive and clearly written vignettes for the emmeans package. Since emmeans has so many capabilities it may take a little while to find the answers to your precise questions, but the effort will be repaid in the long term.
As a small picky point, I would say that what summary() gives you are the "actual" parameters that R uses internally, and what emmeans() gives you are the marginal means (as suggested by the name of the package: estimated marginal means).
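To see where the standard errors in the emmeans and emtrends output come from, here is a minimal base-R sketch. It rebuilds the fixed-effect covariance matrix from the standard errors and correlations printed by summary() in the question, then computes each intercept and slope as a linear combination of the coefficients. The object names (beta, se, R, V, L) are only illustrative, and the inputs are the rounded printed values; with the fitted model in hand you would use fixef(test.model.1) and as.matrix(vcov(test.model.1)) instead.

```r
# Fixed-effect estimates and standard errors as printed by summary()
beta <- c(intercept = 2.699, specieb = 4.433,
          temperature = 8.624, specieb_temperature = 7.088)
se <- c(3.356, 4.406, 1.029, 1.455)

# "Correlation of Fixed Effects" as printed by summary()
R <- matrix(c( 1.000, -0.656, -0.767,  0.542,
              -0.656,  1.000,  0.584, -0.826,
              -0.767,  0.584,  1.000, -0.707,
               0.542, -0.826, -0.707,  1.000),
            nrow = 4, byrow = TRUE)

# Covariance matrix of the fixed effects: V = D R D with D = diag(se)
V <- diag(se) %*% R %*% diag(se)

# Each row of L picks out one quantity of interest
L <- rbind(intercept_a = c(1, 0, 0, 0),
           intercept_b = c(1, 1, 0, 0),
           slope_a     = c(0, 0, 1, 0),
           slope_b     = c(0, 0, 1, 1))

# Estimate = L beta; variance = diagonal of L V L'
estimate <- drop(L %*% beta)
std_err  <- sqrt(diag(L %*% V %*% t(L)))

result <- data.frame(estimate = estimate, SE = std_err)
print(round(result, 2))
```

Up to rounding, this gives the same estimates and standard errors as the emmeans and emtrends tables above (about 3.36 for both intercepts and 1.03 for both slopes). The negative correlation between the coefficients is why the standard error of the sum is much smaller than 4.406.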
I want to use quadratic terms to fit my generalized linear mixed model with id as a random effect, using the lme4 package. It's about how the distance to settlements influences the probability of occurrence of an animal. I use the following code (I hope it is correct):
glmer_dissettl <- glmer(case ~ poly(dist_settlements,2) + (1|id), data=rsf.data, family=binomial(link="logit"))
summary(glmer_dissettl)
I get the following output:
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: case ~ poly(dist_settlements, 2) + (1 | id)
Data: rsf.data
AIC BIC logLik deviance df.resid
6179.2 6205.0 -3085.6 6171.2 4654
Scaled residuals:
Min 1Q Median 3Q Max
-3.14647 -0.90518 -0.04614 0.94833 1.66806
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.02319 0.1523
Number of obs: 4658, groups: id, 18
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.02684 0.04905 0.547 0.584
poly(dist_settlements, 2)1 37.94959 2.41440 15.718 <2e-16 ***
poly(dist_settlements, 2)2 -1.28536 2.28040 -0.564 0.573
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) p(_,2)1
ply(ds_,2)1 0.083
ply(ds_,2)2 0.067 0.150
I don't know exactly how to interpret this, especially the two lines for poly(dist_settlements,2). Besides understanding the output, I also want to see whether the quadratic term makes the model better than the basic model without it.
The output of the basic model without a quadratic term:
Generalized linear mixed model fit by maximum likelihood
(Laplace Approximation) [glmerMod]
Family: binomial ( logit )
Formula: case ~ scale(dist_settlements) + (1 | id)
Data: rsf.data
AIC BIC logLik deviance df.resid
6177.5 6196.9 -3085.8 6171.5 4655
Scaled residuals:
Min 1Q Median 3Q Max
-3.6009 -0.8998 -0.0620 0.9539 1.6417
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.02403 0.155
Number of obs: 4658, groups: id, 18
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.02873 0.04945 0.581 0.561
scale(dist_settlements) 0.55936 0.03538 15.810 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
scl(dst_st) 0.077
I appreciate every tip.
A couple of points.
Coefficients of non-linear model terms do not have a straightforward interpretation and you should make effect plots to be able to communicate the results from your analyses. You may use effectPlotData() from the GLMMadaptive package to do this. Refer to this page for more information.
To appraise whether including a quadratic effect of dist_settlements improves the model fit, you should fit a model without the squared term (i.e. only the linear effect of dist_settlements) and a model with the squared term. Then perform a likelihood ratio test to appraise whether inclusion of the more complex term improves the model fit. In the case of LMMs, make sure to fit both models using maximum likelihood, not REML. For GLMMs, you don't have to bother about (RE)ML.
The variance of the random intercepts is rather close to 0, which may require your attention. Refer to this answer and this section of Ben Bolker's github for more information on this topic.
You may want to take a look at this great lecture series by Dimitris Rizopoulos for more information on (G)LMMs.
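As a rough illustration of the likelihood ratio test described above, the sketch below uses only the deviances printed in the two summaries in the question. It assumes both models were fitted to the same 4658 observations and that the linear model is nested in the quadratic one (poly(x, 2) spans the same space as a linear plus a squared term). With the fitted objects available, anova() on the two glmer fits performs the same test; the name glmer_linear in the comment is hypothetical.

```r
# Deviances (-2 * logLik) as printed by summary() for each model
dev_linear    <- 6171.5  # case ~ scale(dist_settlements) + (1 | id)
dev_quadratic <- 6171.2  # case ~ poly(dist_settlements, 2) + (1 | id)

# Likelihood ratio statistic; the quadratic model has one extra
# fixed-effect parameter, so the reference distribution is chi-square(1)
lrt_stat <- dev_linear - dev_quadratic
lrt_df   <- 1
p_value  <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

cat("LRT statistic:", round(lrt_stat, 2),
    " p-value:", round(p_value, 3), "\n")

# With the fitted models in the workspace, the equivalent call would be:
# anova(glmer_linear, glmer_dissettl)   # glmer_linear is a hypothetical name
```

The statistic is only about 0.3 on 1 degree of freedom (the printed deviances are rounded), so the p-value is far above 0.05. That agrees with the Wald test for the quadratic term (p = 0.573) and with the slightly lower AIC of the linear model (6177.5 vs 6179.2).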
This is a long post. I’m having trouble attaining consistent model results between lme4 versions. My old computer was running R 3.0.0 “Masked Marvel”, with the 0.999999-2 version of lme4, and my new work computer is running R 3.0.2 “Frisbee Sailing” with lme4 version 1.1-2. They are not giving me the same model outputs. Omitting the new lme4, I installed every package update on my old laptop one-by-one and ran the models. Then, updated R itself and ran the models. Finally, I made further updates to packages for R 3.0.2, and ran the models once more. They remained consistent with the original analysis. I wasn’t able to run the models with the most recent MASS package with R 3.0.0, and with the most recent nlme package in R 3.0.2. The error message I received in those cases was the same, and is below:
“Error in validObject(.Object) : invalid class “mer” object: Slot L must be a monotonic LL' factorization of size dims['q']”
I’m wondering if someone can help shed light on this before I update to lme4 1.1-2 to complete my analyses. There doesn’t appear to be a changelog at the address linked in R Studio: http://cran.rstudio.com/web/packages/lme4/NEWS
Maybe something else has changed on my end? I’ve been using the same datafile, and setting it up the same within R.
I am not sure how to attach data, or how to make this easiest for you all to run through yourselves. I can do that with some instruction. Below are the outputs for 2 models before the updates, and the same models after. The variation occurs in models that I haven't included here as well, some worse than others. It definitely affects an AIC ranking.
lme4 version 0.999999-2
R version 3.0.0 (2013-04-03) -- "Masked Marvel"
at1500o <- glmer(apred ~
(1 | fFarm) + (1 | Eps) + (1 | fExp),
family = binomial,
data = PU3.atristis)
summary(at1500o)
Number of levels of a grouping factor for the random effects
is *equal* to n, the number of observations
Warning message: In mer_finalize(ans) : false convergence (8)
Generalized linear mixed model fit by the Laplace approximation
Formula: apred ~ (1 | fFarm) + (1 | Eps) + (1 | fExp)
Data: PU3.atristis
AIC BIC logLik deviance
182.2 193 -87.11 174.2
Random effects:
Groups Name Variance Std.Dev.
Eps (Intercept) 1.6630e+01 4.0780e+00
fFarm (Intercept) 2.4393e+00 1.5618e+00
fExp (Intercept) 5.8587e-13 7.6542e-07
Number of obs: 110, groups: Eps, 110; fFarm, 17; fExp, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.1406 0.7126 -8.618 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
False convergence didn’t seem to be a problem
at1500a <- glmer(apred ~
pctfield1500 * factor(SiteTrt) +
(1 | fFarm) + (1 | Eps) + (1 | fExp),
family = binomial,
data = PU3.atristis)
summary(at1500a)
Number of levels of a grouping factor for the random effects
is *equal* to n, the number of observations
Warning message: In mer_finalize(ans) : false convergence (8)
Generalized linear mixed model fit by the Laplace approximation
Formula: apred ~ pctfield1500 + factor(SiteTrt) + pctfield1500:factor(SiteTrt) + (1 | fFarm) + (1 | Eps) + (1 | fExp)
Data: PU3.atristis
AIC BIC logLik deviance
185.2 209.5 -83.59 167.2
Random effects:
Groups Name Variance Std.Dev.
Eps (Intercept) 2.4539e+01 4.95370375
fFarm (Intercept) 8.1397e-01 0.90220336
fExp (Intercept) 1.3452e-08 0.00011598
Number of obs: 110, groups: Eps, 110; fFarm, 17; fExp, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.82245 2.64044 -2.584 0.00977 **
pctfield1500 -0.01041 0.10849 -0.096 0.92353
factor(SiteTrt)2 2.47654 3.97069 0.624 0.53282
factor(SiteTrt)3 4.33391 4.65551 0.931 0.35190
pctfield1500:factor(SiteTrt)2 -0.05073 0.13117 -0.387 0.69895
pctfield1500:factor(SiteTrt)3 -0.05729 0.14524 -0.394 0.69323
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) pc1500 f(ST)2 f(ST)3 p1500:(ST)2
pctfild1500 -0.876
fctr(StTr)2 -0.665 0.582
fctr(StTr)3 -0.571 0.504 0.380
p1500:(ST)2 0.724 -0.827 -0.850 -0.417
p1500:(ST)3 0.654 -0.747 -0.435 -0.887 0.618
lme4 version 1.1-2
R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
at1500o <- glmer(apred ~
(1 | fFarm) + (1 | Eps) + (1 | fExp),
family = binomial,
data = PU3.atristis)
summary(at1500o)
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
Family: binomial ( logit )
Formula: apred ~ (1 | fFarm) + (1 | Eps) + (1 | fExp)
Data: PU3.atristis
AIC BIC logLik deviance
236.8296 247.6316 -114.4148 228.8296
Random effects:
Groups Name Variance Std.Dev.
Eps (Intercept) 4.799e+01 6.927e+00
fFarm (Intercept) 6.542e-01 8.088e-01
fExp (Intercept) 6.056e-10 2.461e-05
Number of obs: 110, groups: Eps, 110; fFarm, 17; fExp, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.975 1.115 -7.154 8.43e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
No warnings anymore, but everything is pretty different. The model fit statement has changed from the Laplace approximation, to maximum likelihood [‘glmerMod’].
at1500a <- glmer(apred ~
pctfield1500 * factor(SiteTrt) +
(1 | fFarm) + (1 | Eps) + (1 | fExp),
family = binomial,
data = PU3.atristis)
summary(at1500a)
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
Family: binomial ( logit )
Formula: apred ~ pctfield1500 + factor(SiteTrt) + pctfield1500:factor(SiteTrt) + (1 | fFarm) + (1 | Eps) + (1 | fExp)
Data: PU3.atristis
AIC BIC logLik deviance
242.7874 267.0917 -112.3937 224.7874
Random effects:
Groups Name Variance Std.Dev.
Eps (Intercept) 3.346e+01 5.784e+00
fFarm (Intercept) 1.809e-07 4.253e-04
fExp (Intercept) 4.380e-11 6.618e-06
Number of obs: 110, groups: Eps, 110; fFarm, 17; fExp, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.46800 3.04197 -2.455 0.0141 *
pctfield1500 0.00032 0.12301 0.003 0.9979
factor(SiteTrt)2 2.05618 4.57444 0.450 0.6531
factor(SiteTrt)3 6.72139 5.20895 1.290 0.1969
pctfield1500:factor(SiteTrt)2 -0.04882 0.14911 -0.327 0.7434
pctfield1500:factor(SiteTrt)3 -0.11324 0.16495 -0.686 0.4924
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) pc1500 f(ST)2 f(ST)3 p1500:(ST)2
pctfild1500 -0.875
fctr(StTr)2 -0.665 0.582
fctr(StTr)3 -0.584 0.511 0.388
p1500:(ST)2 0.722 -0.825 -0.850 -0.422
p1500:(ST)3 0.653 -0.746 -0.434 -0.886 0.615
Cheers,
Nava
Version 1.0 of the lme4 package marked a major change, with a pretty substantial rework of the underlying fitting code. From the changelog:
Because the internal computational machinery has changed, results from
the newest version of lme4 will not be numerically identical to those
from previous versions. For reasonably well-defined fits, they will
be extremely close (within numerical tolerances of 1e-4 or so), but
for unstable or poorly-defined fits the results may change, and very
unstable fits may fail when they (apparently) succeeded with previous
versions. Similarly, some fits may be slower with the new version,
although on average the new version should be faster and more stable.
More numerical tuning options are now available (see below);
non-default settings may restore the speed and/or ability to fit a
particular model without an error. If you notice significant or
disturbing changes when fitting a model with the new version of lme4,
please notify the maintainers.
Considering that your model fits using version 0.999999-2 of lme4 produced false convergence warnings, your models might be in the "unstable fits" category the changelog is talking about.
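One way to probe whether a fit is in that "unstable" category is to refit it with different optimizers and compare the results. The sketch below does this on simulated data, because the original PU3.atristis data are not available here; the data frame sim, its variable names, and the simulation settings are all invented for illustration. glmerControl(optimizer = ...) is one of the numerical tuning options the changelog refers to.

```r
library(lme4)

# Simulated binary data with a farm-level random intercept
# (a stand-in for the real data; all values here are invented)
set.seed(101)
n_farm <- 17
sim <- data.frame(farm = factor(rep(seq_len(n_farm), length.out = 340)))
farm_effect <- rnorm(n_farm, sd = 1)
sim$y <- rbinom(nrow(sim), size = 1,
                prob = plogis(-1 + farm_effect[as.integer(sim$farm)]))

# Same model, two different optimizers
fit_bobyqa <- glmer(y ~ 1 + (1 | farm), family = binomial, data = sim,
                    control = glmerControl(optimizer = "bobyqa"))
fit_nm <- glmer(y ~ 1 + (1 | farm), family = binomial, data = sim,
                control = glmerControl(optimizer = "Nelder_Mead"))

# For a well-defined model these should agree closely
loglik <- c(bobyqa      = as.numeric(logLik(fit_bobyqa)),
            Nelder_Mead = as.numeric(logLik(fit_nm)))
print(loglik)
print(rbind(bobyqa = fixef(fit_bobyqa), Nelder_Mead = fixef(fit_nm)))
```

If the log-likelihoods or estimates differ noticeably between optimizers on your real models, that points to a poorly determined fit rather than to a bug in either version. Recent lme4 releases also provide allFit() to automate this comparison across the available optimizers.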