How to interpret the output of R's glmer() with quadratic terms

I want to use quadratic terms to fit my generalized linear mixed model with id as a random effect, using the lme4 package. The model is about how the distance to settlements influences the probability of occurrence of an animal. I use the following code (I hope it is correct):
glmer_dissettl <- glmer(case ~ poly(dist_settlements,2) + (1|id), data=rsf.data, family=binomial(link="logit"))
summary(glmer_dissettl)
I get the following output:
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: case ~ poly(dist_settlements, 2) + (1 | id)
Data: rsf.data
AIC BIC logLik deviance df.resid
6179.2 6205.0 -3085.6 6171.2 4654
Scaled residuals:
Min 1Q Median 3Q Max
-3.14647 -0.90518 -0.04614 0.94833 1.66806
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.02319 0.1523
Number of obs: 4658, groups: id, 18
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.02684 0.04905 0.547 0.584
poly(dist_settlements, 2)1 37.94959 2.41440 15.718 <2e-16 ***
poly(dist_settlements, 2)2 -1.28536 2.28040 -0.564 0.573
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) p(_,2)1
ply(ds_,2)1 0.083
ply(ds_,2)2 0.067 0.150
I don't know exactly how to interpret this, especially the two lines for poly(dist_settlements,2). Besides understanding the output, I also want to see whether the quadratic term makes the model better than the basic model without it.
The output of the basic model without a quadratic term:
Generalized linear mixed model fit by maximum likelihood
(Laplace Approximation) [glmerMod]
Family: binomial ( logit )
Formula: case ~ scale(dist_settlements) + (1 | id)
Data: rsf.data
AIC BIC logLik deviance df.resid
6177.5 6196.9 -3085.8 6171.5 4655
Scaled residuals:
Min 1Q Median 3Q Max
-3.6009 -0.8998 -0.0620 0.9539 1.6417
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.02403 0.155
Number of obs: 4658, groups: id, 18
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.02873 0.04945 0.581 0.561
scale(dist_settlements) 0.55936 0.03538 15.810 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
scl(dst_st) 0.077
I appreciate any tips.

A couple of points.
Coefficients of non-linear model terms do not have a straightforward interpretation, so you should make effect plots to communicate the results of your analyses. You may use effectPlotData() from the GLMMadaptive package to do this. Refer to this page for more information.
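For illustration, a minimal sketch of such an effect plot for the model in the question. It assumes the rsf.data object from above and refits the model with GLMMadaptive::mixed_model(), since effectPlotData() expects a model fitted with that package; the pred, low and upp columns it returns are on the logit scale, hence the plogis() back-transformation.
library(GLMMadaptive)

# refit the model with mixed_model(), because effectPlotData() works on GLMMadaptive fits
fm <- mixed_model(case ~ poly(dist_settlements, 2), random = ~ 1 | id,
                  data = rsf.data, family = binomial())

# grid of distances at which to compute predictions
nd <- data.frame(dist_settlements = seq(min(rsf.data$dist_settlements),
                                        max(rsf.data$dist_settlements),
                                        length.out = 100))
plot_data <- effectPlotData(fm, nd)

# back-transform from the logit to the probability scale and plot
with(plot_data, {
  plot(dist_settlements, plogis(pred), type = "l",
       xlab = "Distance to settlements", ylab = "Probability of occurrence")
  lines(dist_settlements, plogis(low), lty = 2)
  lines(dist_settlements, plogis(upp), lty = 2)
})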
To appraise whether including a quadratic effect of dist_settlements improves the model fit, fit one model without the squared term (i.e. only the linear effect of dist_settlements) and one model with it, then perform a likelihood ratio test. In the case of LMMs, make sure to fit both models using maximum likelihood, not REML. For GLMMs, you don't have to bother about (RE)ML.
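For example, a minimal sketch of such a comparison, assuming the rsf.data object from the question; the linear-only model is nested in the quadratic one because the degree-2 orthogonal polynomial spans the linear term.
glmer_lin  <- glmer(case ~ dist_settlements + (1 | id),
                    data = rsf.data, family = binomial(link = "logit"))
glmer_quad <- glmer(case ~ poly(dist_settlements, 2) + (1 | id),
                    data = rsf.data, family = binomial(link = "logit"))

# likelihood ratio test: does adding the quadratic term improve the fit?
anova(glmer_lin, glmer_quad)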
The variance of the random intercepts is rather close to 0, which may require your attention. Refer to this answer and this section of Ben Bolker's github for more information on this topic.
You may want to take a look at this great lecture series by Dimitris Rizopoulos for more information on (G)LMMs.

Related

Tukey post hoc test after glmm error R studio

Our data set consists of 3 periods of time in which we measured how often monkeys were at different heights in the tree.
After fitting a generalized linear mixed model to our data set, we want to perform a post hoc test to see whether the monkeys are more often in higher areas in the different periods. We want to use TukeyHSD() to do the Tukey post hoc test, but we get an error:
Error in UseMethod("TukeyHSD") :
no applicable method for 'TukeyHSD' applied to an object of class "c('glmerMod', 'merMod')".
Also, I can't install lsmeans or emmeans because it is not possible with my version of R (even though I just updated R). Does anybody know how to solve this problem?
To do the glmm we used:
output2 <- glmer(StrataNumber ~ ffactor1 + ( 1 | Focal), data = aa, family = "poisson", na.action = "na.fail")
dredge(output2)
dredgeout2 <- dredge(output2)
subset(dredgeout2, delta <6)
summary(output2)
This gave us the following significant results:
> summary(output2)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
glmerMod]
Family: poisson ( log )
Formula: StrataNumber ~ ffactor1 + (1 | Focal)
Data: aa
AIC BIC logLik deviance df.resid
9404.4 9428.0 -4698.2 9396.4 2688
Scaled residuals:
Min 1Q Median 3Q Max
-1.78263 -0.33628 0.06559 0.32481 1.37514
Random effects:
Groups Name Variance Std.Dev.
Focal (Intercept) 0.006274 0.07921
Number of obs: 2692, groups: Focal, 7
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.31659 0.03523 37.368 < 2e-16 ***
ffactor12 0.09982 0.02431 4.107 4.01e-05 ***
ffactor13 0.17184 0.02425 7.087 1.37e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) ffct12
ffactor12 -0.403
ffactor13 -0.403 0.585
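One common workaround, sketched below under the assumption that ffactor1 is a factor in aa, is multcomp::glht(), which accepts glmer fits; TukeyHSD() itself only works on aov objects, which explains the error above.
library(multcomp)

# Tukey-style all-pairwise comparisons of the ffactor1 levels
posthoc <- glht(output2, linfct = mcp(ffactor1 = "Tukey"))
summary(posthoc)   # z tests with multiplicity adjustment
confint(posthoc)   # simultaneous confidence intervals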

simr: powerSim gives 100% for all effect sizes

I have carried out a binomial GLMM to determine how latitude and native status (native/non-native) of a set of plant species affects herbivory damage. I am now trying to determine the statistical power of my model when I change the effect sizes. My model looks like this:
latglmm <- glmer(cbind(chewing,total.cells-chewing) ~ scale(latitude) * native.status + scale(sample.day.of.year) + (1|genus) + (1|species) + (1|catalogue.number), family=binomial, data=mna)
where cbind(chewing, total.cells - chewing) gives me a proportion (of leaves with herbivory damage), native.status is either "native" or "non-native", and catalogue.number acts as an observation-level random effect to deal with overdispersion. There are 10 genera, each with at least 1 native and 1 non-native species, making 26 species in total. The model summary is:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: cbind(chewing, total.cells - chewing) ~ scale(latitude) * native.status +
scale(sample.day.of.year) + (1 | genus) + (1 | species) + (1 | catalogue.number)
Data: mna
AIC BIC logLik deviance df.resid
3986.7 4023.3 -1985.4 3970.7 706
Scaled residuals:
Min 1Q Median 3Q Max
-1.3240 -0.4511 -0.0250 0.1992 1.0765
Random effects:
Groups Name Variance Std.Dev.
catalogue.number (Intercept) 1.26417 1.1244
species (Intercept) 0.08207 0.2865
genus.ID (Intercept) 0.33431 0.5782
Number of obs: 714, groups: catalogue.number, 713; species, 26; genus.ID, 10
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.61310 0.20849 -12.534 < 2e-16 ***
scale(latitude) -0.17283 0.06370 -2.713 0.00666 **
native.statusnon-native 0.11434 0.15554 0.735 0.46226
scale(sample.day.of.year) 0.28521 0.05224 5.460 4.77e-08 ***
scale(latitude):native.statusnon-native -0.02986 0.09916 -0.301 0.76327
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) scallt ntv.s- scaldy
scalelat 0.012
ntv.sttsnn- -0.304 -0.014
scaledoy 0.018 -0.085 -0.027
scllt:ntv.- -0.011 -0.634 0.006 -0.035
I should add that the model I have actually been using is a glmmTMB model, as lme4 still showed some overdispersion even with the observation-level random effect, but glmmTMB is not compatible with simr, so I am using lme4 here (the results are very similar for both). I want to see what happens to the model's power when I increase or decrease the effect sizes of latitude and native status, but when I run fixef(latglmm1)["scale(latitude)"] <- -1 and fixef(latglmm1)["native.statusnon-native"] <- -1 and try this:
powerSim(latglmm, fcompare(~ scale(latitude) + native.status))
I get the following output:
Power for model comparison, (95% confidence interval):
100.0% (69.15, 100.0)
Test: Likelihood ratio
Comparison to ~scale(latitude) + native.status + [re]
Based on 10 simulations, (0 warnings, 0 errors)
alpha = 0.05, nrow = 1428
Time elapsed: 0 h 1 m 5 s
The output is the same (100% power) no matter what I change fixef() to. Based on other similar questions online, I have ensured that my data has no NA values, and according to powerSim there are no warnings or errors to address. I am completely lost as to why this isn't working, so any help would be greatly appreciated!
Alternatively, if anyone has recommendations for other methods to carry out a similar analysis, I would love to hear them. What I really want is a p-value for each effect size I input, but statistical power would be very valuable too.
Thank you!
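For reference, a minimal sketch of the usual simr workflow, assuming that the same object latglmm is both modified and passed to powerSim(); the effect size of -0.1 is purely illustrative, and nsim is increased well beyond the 10 simulations above to narrow the confidence interval around the power estimate.
library(simr)

# set the effect sizes to simulate under, on the object that is passed to powerSim()
fixef(latglmm)["scale(latitude)"] <- -0.1
fixef(latglmm)["native.statusnon-native"] <- -0.1

# power for dropping the latitude x native status interaction, via a likelihood ratio test
powerSim(latglmm,
         test = fcompare(~ scale(latitude) + native.status + scale(sample.day.of.year)),
         nsim = 200)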

R Contrast coding in lmer output interpretation

I want to see if four predictors ("OA_statusclosed", "OA_statusgreen", "OA_statushybrid", "OA_statusbronze") have an effect on "logAlt". I have chosen to fit an lmer in R to account for random intercepts and slopes by the variable "Journal".
I want to interpret the output in terms of OA status ranked from highest to lowest (green, hybrid, bronze, closed). To do this, I have contrast coded the factor as follows (adhering to the order of the levels in my df: hybrid, closed, green, bronze):
contrasts(df$OA_status.c) <- c(0.25, -0.5, 0.5, -0.25)
contrasts(df$OA_status.c)
I ran this analysis:
M3 <- lmer(logAlt ~ OA_status + (1|Journal),
data = df,
control=lmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)))
And got this summary(M3):
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: logAlt ~ OA_status + (1 | Journal)
Data: df
Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
REML criterion at convergence: 20873.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.1272 -0.6719 0.0602 0.6618 4.4344
Random effects:
Groups Name Variance Std.Dev.
Journal (Intercept) 0.08848 0.2975
Residual 1.49272 1.2218
Number of obs: 6435, groups: Journal, 7
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 2.03867 0.15059 18.27727 13.538 5.71e-11 ***
OA_statusclosed -0.97648 0.09915 6428.62227 -9.848 < 2e-16 ***
OA_statusgreen -0.74956 0.10320 6429.65387 -7.263 4.22e-13 ***
OA_statushybrid 0.04621 0.12590 6427.44114 0.367 0.714
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) OA_sttsc OA_sttsg
OA_sttsclsd -0.640
OA_statsgrn -0.613 0.934
OA_sttshybr -0.501 0.763 0.744
I interpret this to mean that, for example, OA_statusclosed results in a logAlt value that is on average about 0.98 lower than the reference level, and that OA_statusclosed is a significant predictor.
I have two questions:
Am I approaching contrast coding correctly? That is, am I making "OA_statusgreen" my reference value (which is what I think I need to do)?
Am I interpreting the output correctly?
Thanks in advance!
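For what it's worth, a minimal sketch (assuming OA_status is a factor in df) of how to check which level R treats as the reference under the default treatment coding, and how to make "green" the reference explicitly:
# with treatment contrasts, the reference level is the row of zeros in this matrix
contrasts(df$OA_status)

# make "green" the reference level, then refit the model
df$OA_status <- relevel(df$OA_status, ref = "green")
M3 <- lmer(logAlt ~ OA_status + (1 | Journal),
           data = df,
           control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5)))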

Extracting GLM model estimates from package effects() in R

I'm using GLM models to explain which effects are the most relevant for explaining the occurrence of certain wildlife behaviours.
I'm using the effects package, which provides very useful plots. Below is a sample of the code for one of the models tested:
> m7 <- glmer(cbind(Feeding,Standing_Foraging) ~ Day_Night+(1|ID) , data=GLM_df , family=binomial)
> summary(m7)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: cbind(Feeding, Standing_Foraging) ~ Day_Night + (1 | ID)
Data: GLM_df
AIC BIC logLik deviance df.resid
665.3 674.2 -329.6 659.3 141
Scaled residuals:
Min 1Q Median 3Q Max
-2.09648 -0.60859 -0.02129 0.56377 3.07963
Random effects:
Groups Name Variance Std.Dev.
ID (Intercept) 0.1461 0.3822
Number of obs: 144, groups: ID, 6
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.13918 0.16406 -6.944 0.00000000000382 ***
Day_NightNight -0.15369 0.07525 -2.042 0.0411 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
Dy_NghtNght -0.206
And here are the plotted estimates with confidence intervals:
plot(allEffects(m7))
My question is simple: how do I access the estimate values and the respective confidence intervals plotted in the figure? I can't find any information on the forum, so any help is appreciated!
Cheers!
values <- effect("x", model)
summary(values)
x is the explanatory value you want information about
model is the name of your glmer() model.
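For the model above that might look like the following sketch; as.data.frame() on the effect object should also give the fitted values and confidence limits in a convenient form.
library(effects)

ef <- effect("Day_Night", m7)
summary(ef)        # prints the estimates and 95% confidence intervals
as.data.frame(ef)  # the same values as a data frame (fit, lower, upper)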

Error bars overlap but effect is significant in glmm. What's going on?

I have run the following glmm:
mod<-glmer(data=newdata, total_flr_vis ~ treatment + flr_num + (1|individual), family=poisson)
and get this output from summary()
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: poisson ( log )
Formula: total_flr_vis ~ treatment + flr_num + (1 | individual)
Data: newdata
AIC BIC logLik deviance df.resid
706.8 718.9 -349.4 698.8 148
Scaled residuals:
Min 1Q Median 3Q Max
-3.1461 -0.8155 -0.3522 -0.2022 14.1669
Random effects:
Groups Name Variance Std.Dev.
individual (Intercept) 4.066 2.016
Number of obs: 152, groups: individual, 42
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.610176 0.432666 -3.722 0.000198 ***
treatmentR -0.457492 0.121054 -3.779 0.000157 ***
flr_num 0.037063 0.005064 7.319 2.5e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) trtmnR
treatmentR -0.117
flr_num -0.416 0.040
But I get the following plot for treatment using plot(allEffects(mod))
I don't understand why the effects plot shows overlapping error bars while the summary() output tells me that the effect of treatment is significant. Is there a problem with the model, or is it the plot? How can I troubleshoot this?
Here is the residual plot (which I got using plot(mod))
I'm not totally sure how to interpret this plot, but it does not look random to me, so I suspect that there is something wrong with the model.
I am happy to post data if someone can tell me where to do that.
Any help would be very welcome.
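For what it's worth, a minimal sketch of how to look at the confidence interval of the treatment difference itself (which is what the z test in summary() refers to), rather than the per-group intervals shown in the effects plot; overlapping group-level error bars do not necessarily contradict a significant difference.
# Wald confidence intervals for the fixed effects; the interval for treatmentR is the
# interval for the difference between treatments on the log scale
confint(mod, parm = "beta_", method = "Wald")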
