Changing reference level of binary predictor causes glmer to explode

I'm running a glmer (lme4 1.1-10) on a model with a binary DV, score (coded as 1 or 0), and a few binary predictors: TrialType, which is within-subject (Aff and Neg), and two between-subject variables, cdiNo (1 or 0) and cdiNot (1 or 0). I'm including a random intercept and a random slope of TrialType by subject; note that here I'm modeling the random slope and intercept as uncorrelated. I'm using bobyqa with maxIter set to 10000. Here's the model:
analysis <- glmer(score ~ TrialType*cdiNot + TrialType*cdiNo + (1|UniqueSubject) + (0+TrialType|UniqueSubject), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"))
By default, glmer output is dummy coded to make Aff the reference level for TrialType. In that case, the output is perfectly sensible (and theoretically predicted).
AIC BIC logLik deviance df.resid
1766.8 1819.5 -873.4 1746.8 1431
Scaled residuals:
Min 1Q Median 3Q Max
-2.3214 -0.8554 0.4693 0.6288 1.5026
Random effects:
Groups Name Variance Std.Dev. Corr
UniqueSubject (Intercept) 0.3007 0.5484
UniqueSubject.1 TrialTypeAff 0.2587 0.5087
TrialTypeNeg 0.4398 0.6631 0.13
Number of obs: 1441, groups: UniqueSubject, 183
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.3728 0.5285 2.598 0.00939 **
TrialTypeNeg -1.1677 0.6652 -1.755 0.07919 .
cdiNot 0.6915 0.2280 3.033 0.00242 **
cdiNo -0.4600 0.5394 -0.853 0.39384
TrialTypeNeg:cdiNot 0.2673 0.2915 0.917 0.35904
TrialTypeNeg:cdiNo 0.1239 0.6811 0.182 0.85560
However, I want to look at the simple effects of cdiNot and cdiNo with the other level of TrialType as the reference. So, I relevel:
data$TrialType<-relevel(data$TrialType, ref="Neg")
And run the exact same model again. Now I get convergence warnings and an insane output.
AIC BIC logLik deviance df.resid
1766.8 1819.5 -873.4 1746.8 1431
Scaled residuals:
Min 1Q Median 3Q Max
-2.3351 -0.8586 0.4714 0.6301 1.4968
Random effects:
Groups Name Variance Std.Dev. Corr
UniqueSubject (Intercept) 0.3377 0.5811
UniqueSubject.1 TrialTypeNeg 0.3917 0.6258
TrialTypeAff 0.2263 0.4757 0.02
Number of obs: 1441, groups: UniqueSubject, 183
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.207190 0.001148 180.5 <2e-16 ***
TrialTypeAff 1.149480 0.001148 1001.6 <2e-16 ***
cdiNot 0.949130 0.001147 827.1 <2e-16 ***
cdiNo -0.330437 0.001148 -287.9 <2e-16 ***
TrialTypeAff:cdiNot -0.247906 0.001147 -216.1 <2e-16 ***
TrialTypeAff:cdiNo -0.107224 0.001148 -93.4 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) TrlTyA cdiNot cdiNo TrlTypAff:cdNt
TrialTypAff 0.001
cdiNot 0.000 0.000
cdiNo 0.000 0.001 0.000
TrlTypAff:cdNt 0.000 0.000 0.000 0.000
TrlTypAff:cdN 0.001 0.001 0.000 0.001 0.000
convergence code: 0
Model failed to converge with max|grad| = 0.0624071 (tol = 0.001, component 1)
Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?
But the mystery deepens. If I just add the correlation of random slope and intercept to the model, the convergence and other warnings disappear, and the output becomes sensible again. So, using Neg as the reference level and running:
analysis <- glmer(score ~ TrialType*cdiNot + TrialType*cdiNo + (TrialType|UniqueSubject), data = data, family = binomial, control = glmerControl(optimizer = "bobyqa"))
The output is:
AIC BIC logLik deviance df.resid
1764.8 1812.2 -873.4 1746.8 1432
Scaled residuals:
Min 1Q Median 3Q Max
-2.3215 -0.8554 0.4693 0.6288 1.5026
Random effects:
Groups Name Variance Std.Dev. Corr
UniqueSubject (Intercept) 0.7405 0.8605
TrialTypeAff 0.6125 0.7826 -0.59
Number of obs: 1441, groups: UniqueSubject, 183
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.2051 0.5009 0.410 0.6821
TrialTypeAff 1.1676 0.6654 1.755 0.0793 .
cdiNot 0.9588 0.2222 4.314 1.6e-05 ***
cdiNo -0.3360 0.5154 -0.652 0.5144
TrialTypeAff:cdiNot -0.2673 0.2915 -0.917 0.3591
TrialTypeAff:cdiNo -0.1239 0.6813 -0.182 0.8557
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) TrlTyA cdiNot cdiNo TrlTypAff:cdNt
TrialTypAff -0.621
cdiNot -0.036 0.021
cdiNo -0.966 0.600 -0.121
TrlTypAff:cdNt 0.021 -0.021 -0.636 0.078
TrlTypAff:cdN 0.603 -0.966 0.077 -0.625 -0.120
I'm very confused about what might be going on. I can imagine that if one simple effect is fine and the other is unidentifiable, changing the reference level could cause a convergence error (though the astronomical numbers are still unusual). But if I understand correctly, the correlation between random slope and intercept is just one extra parameter for the model to estimate. How could including this parameter make a model that wasn't converging start to converge?
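For context on the relevel step: relevel() only changes which level is dummy-coded as the baseline, so the two fits are reparameterizations of the same model (consistent with the identical AIC and deviance), even though the optimizer can behave very differently on the two parameterizations. A minimal base-R sketch (toy data, not the original dataset) of what releveling does to the design matrix:

```r
# relevel() changes which factor level maps to the dummy-coded baseline;
# the model is the same up to reparameterization.
TrialType <- factor(c("Aff", "Neg", "Aff", "Neg"))
colnames(model.matrix(~ TrialType))
# with Aff (alphabetically first) as baseline, the dummy column is "TrialTypeNeg"

TrialType2 <- relevel(TrialType, ref = "Neg")
colnames(model.matrix(~ TrialType2))
# with Neg as baseline, the dummy column is "TrialType2Aff"
```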

Related

Is there any way to split up interaction effects in a linear model?

I have a 2x2 factorial design: control vs enriched, and strain1 vs strain2. I wanted to make a linear model, which I did as follows:
anova(lmer(length ~ Strain + Insect + Strain:Insect + BW_final + (1|Pen), data = mydata))
Where length is one of the dependent variables I want to analyse, Strain and Insect as treatments, Strain:Insect as interaction effect, BW_final as covariate, and Pen as random effect.
As output I get this:
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
Strain 3.274 3.274 1 65 0.1215 0.7285
Insect 14.452 14.452 1 65 0.5365 0.4665
BW_final 45.143 45.143 1 65 1.6757 0.2001
Strain:Insect 52.813 52.813 1 65 1.9604 0.1662
As you can see, I only get 1 interaction term: Strain:Insect. However, I'd like to see 4 interaction terms: Strain1:Control, Strain1:Enriched, Strain2:Control, Strain2:Enriched.
Is there any way to do this in R?
Using summary instead of anova I get:
> summary(linearmer)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: length ~ Strain + Insect + Strain:Insect + BW_final + (1 | Pen)
Data: mydata_young
REML criterion at convergence: 424.2
Scaled residuals:
Min 1Q Median 3Q Max
-1.95735 -0.52107 0.07014 0.43928 2.13383
Random effects:
Groups Name Variance Std.Dev.
Pen (Intercept) 0.00 0.00
Residual 26.94 5.19
Number of obs: 70, groups: Pen, 27
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 101.646129 7.530496 65.000000 13.498 <2e-16 ***
StrainRoss 0.648688 1.860745 65.000000 0.349 0.729
Insect 0.822454 2.062696 65.000000 0.399 0.691
BW_final -0.005188 0.004008 65.000000 -1.294 0.200
StrainRoss:Insect -3.608430 2.577182 65.000000 -1.400 0.166
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) StrnRs Insect BW_fnl
StrainRoss 0.253
Insect -0.275 0.375
BW_final -0.985 -0.378 0.169
StrnRss:Ins 0.071 -0.625 -0.775 0.016
convergence code: 0
boundary (singular) fit: see ?isSingular
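One common way to get the four per-cell estimates (rather than a single interaction contrast) is cell-means coding: drop the intercept and main effects and estimate the combined factor directly. A sketch with plain lm() and made-up data, not the poster's dataset:

```r
# Hypothetical 2x2 data (10 observations per cell), not the original dataset.
set.seed(1)
d <- expand.grid(Strain = factor(c("S1", "S2")),
                 Insect = factor(c("Control", "Enriched")))
d <- d[rep(seq_len(nrow(d)), each = 10), ]
d$y <- rnorm(nrow(d), mean = 100)

# Cell-means coding: no intercept, one coefficient per Strain x Insect cell.
cellfit <- lm(y ~ 0 + Strain:Insect, data = d)
names(coef(cellfit))   # four coefficients, one per cell of the design
```

The same reparameterization works inside lmer (length ~ 0 + Strain:Insect + BW_final + (1|Pen)); alternatively, packages such as emmeans can compute the cell means from the original fit without refitting.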

How to calculate overdispersion in a non-Poisson lmer model

Hello everyone, I'm looking to calculate the overdispersion in the following model:
lmer(R_ger_b~espece*traitement+(1|pop), data=d)
Here "espece" means species, "traitement" means treatment and "pop" means population. My variable called R_ger_b come from a binary variable on which I took the residu out (R_ger_b) to correct this variable from an other one (that was a 0 (non sprouted), 1 (sprouted) variable).
The summary of this model gives the following output:
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: R_ger_b ~ espece * traitement + (1 | pop)
Data: d
REML criterion at convergence: 2381.2
Scaled residuals:
Min 1Q Median 3Q Max
-2.0555 -0.8182 0.2951 0.6854 1.6788
Random effects:
Groups Name Variance Std.Dev.
pop (Intercept) 0.05762 0.24
Residual 1.12301 1.06
Number of obs: 800, groups: pop, 4
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.40383 0.20010 3.20699 2.018 0.13097
especemac -0.88576 0.28298 3.20699 -3.130 0.04756 *
traitementmi 0.16897 0.14987 790.00000 1.127 0.25989
traitementpeu -0.06180 0.14987 790.00000 -0.412 0.68018
traitementtemoin -0.06635 0.14987 790.00000 -0.443 0.65811
especemac:traitementmi -0.13861 0.21194 790.00000 -0.654 0.51330
especemac:traitementpeu 0.18556 0.21194 790.00000 0.876 0.38155
especemac:traitementtemoin 0.74192 0.21194 790.00000 3.501 0.00049 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) espcmc trtmntm trtmntp trtmntt espcmc:trtmntm espcmc:trtmntp
especemac -0.707
traitementm -0.374 0.265
traitementp -0.374 0.265 0.500
traitmnttmn -0.374 0.265 0.500 0.500
espcmc:trtmntm 0.265 -0.374 -0.707 -0.354 -0.354
espcmc:trtmntp 0.265 -0.374 -0.354 -0.707 -0.354 0.500
espcmc:trtmntt 0.265 -0.374 -0.354 -0.354 -0.707 0.500 0.500
But I don't really know how to calculate overdispersion here; I've seen a solution to this problem for a Poisson lmer, but not for the case I'm working on.
Thank you for your help, I hope I asked my question well
Germain VITAL
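For reference, a sketch of the standard GLMM overdispersion check (along the lines of Ben Bolker's GLMM FAQ, not code from the original post): the ratio of the Pearson chi-square to the residual degrees of freedom. Note that it is only meaningful for families with a fixed dispersion (Poisson, binomial); for a Gaussian lmer fitted to residuals, the residual variance is estimated freely, so overdispersion is not defined in the usual sense. Illustrated here on a plain glm():

```r
# Pearson chi-square over residual df; values much larger than 1 suggest
# overdispersion. Meaningful for fixed-dispersion families only.
overdisp_ratio <- function(model) {
  rp <- residuals(model, type = "pearson")
  sum(rp^2) / df.residual(model)
}

# Toy check on simulated Poisson data, where the ratio should be near 1:
set.seed(42)
x <- rnorm(500)
y <- rpois(500, lambda = exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson)
overdisp_ratio(fit)
```

residuals(..., type = "pearson") and df.residual() are also defined for merMod objects, so the same function can be pointed at a glmer fit.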

Different model output in Mac and PC

I have been working on my PC to analyse my multilevel data. I am now working on a Mac and have run the same model. Some of the output is the same but some is quite different. I can't seem to work out why. Here is the model:
> loss.2 <- glmer.nb(Loss_across.Chain ~ Posn.c*Valence.c + (Valence.c|mood.c/Chain), data = FinalData_forpoisson, control = glmerControl(optimizer = "bobyqa", check.conv.grad = .makeCC("warning", 0.05)))
On the PC I got this output:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: Negative Binomial(4.9852) ( log )
Formula: Loss_across.Chain ~ Posn.c * Valence.c + (Valence.c | mood.c/Chain)
Data: FinalData_forpoisson
Control: ..3
AIC BIC logLik deviance df.resid
1894.7 1945.3 -936.4 1872.7 725
Scaled residuals:
Min 1Q Median 3Q Max
-1.3882 -0.7225 -0.5190 0.4375 7.1873
Random effects:
Groups Name Variance Std.Dev. Corr
Chain:mood.c (Intercept) 8.782e-15 9.371e-08
Valence.c 9.608e-15 9.802e-08 0.48
mood.c (Intercept) 0.000e+00 0.000e+00
Valence.c 1.654e-14 1.286e-07 NaN
Number of obs: 736, groups: Chain:mood.c, 92; mood.c, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.19255 0.04794 -4.016 5.92e-05 ***
Posn.c -0.61011 0.04122 -14.800 < 2e-16 ***
Valence.c -0.27372 0.09589 -2.855 0.00431 **
Posn.c:Valence.c 0.38043 0.08245 4.614 3.95e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) Posn.c Vlnc.c
Posn.c 0.491
Valence.c 0.029 -0.090
Psn.c:Vlnc. -0.090 0.062 0.491
On the Mac I got this output:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: Negative Binomial(4.9852) ( log )
Formula: Loss_across.Chain ~ Posn.c * Valence.c + (Valence.c | mood.c/Chain)
Data: FinalData_forpoisson
Control: ..3
AIC BIC logLik deviance df.resid
1894.7 1945.3 -936.4 1872.7 725
Scaled residuals:
Min 1Q Median 3Q Max
-1.3882 -0.7225 -0.5190 0.4375 7.1873
Random effects:
Groups Name Variance Std.Dev. Corr
Chain:mood.c (Intercept) 1.242e-13 3.524e-07
Valence.c 4.724e-13 6.873e-07 0.98
mood.c (Intercept) 7.998e-16 2.828e-08
Valence.c 3.217e-14 1.793e-07 1.00
Number of obs: 736, groups: Chain:mood.c, 92; mood.c, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.947e-05 4.794e-02 0.001 1.000
Posn.c 7.441e-05 4.122e-02 0.002 0.999
Valence.c -4.011e-05 9.589e-02 0.000 1.000
Posn.c:Valence.c -6.672e-05 8.245e-02 -0.001 0.999
Correlation of Fixed Effects:
(Intr) Posn.c Vlnc.c
Posn.c 0.491
Valence.c 0.029 -0.090
Psn.c:Vlnc. -0.090 0.062 0.491
Does anyone know why the output might be different across the two platforms and how I might be able to get them to align?
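A plausible reading (not stated in the original post): on both machines the random-effect variances are essentially zero (around 1e-13), i.e. the fit is singular, and on such a flat region of the likelihood different BLAS libraries or floating-point details can leave the optimizer at visibly different parameter values with essentially the same deviance (note the identical AIC and logLik). A toy base-R illustration of that phenomenon, using a hypothetical objective rather than the actual model:

```r
# The objective (p1 + p2 - 1)^2 is minimized along an entire line, so the
# "solution" depends on the optimizer and the starting point, even though
# the objective value is essentially identical in both cases.
f <- function(p) (p[1] + p[2] - 1)^2
a <- optim(c(0, 0),  f, method = "Nelder-Mead")$par
b <- optim(c(5, -3), f, method = "BFGS")$par
rbind(a, b)   # different parameter values, same (near-zero) objective
```

Comparing sessionInfo() between the two machines and simplifying the random-effects structure (mood.c has only 2 levels, generally too few to estimate a variance) would be natural first checks.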

Interaction not significant when using a logistic model but is significant using a Poisson/negative binomial model

I ran an experiment in which participants were asked to pass a story along a 4-person 'transmission chain', a bit like the game Chinese Whispers. Person 1 reads the story and re-writes it for person 2, who does the same, and it continues until all four people in the chain have read and reproduced the story. I'm interested in whether positive or negative information 'survives' better in the reproductions. I have modeled this in two ways: one approach was to code each item in the original story as being either present (1) or absent (0) in the reproductions and model this using a logistic model:
survival.logit <- glmer(Present ~ Posn.c*mood.c*Valence.c + (1+Valence.c|mood.c/Chain.) + (1|Item), data = Survival.Analysis_restructureddata, family = binomial, glmerControl(optimizer="bobyqa", check.conv.grad=.makeCC("warning", 2e-3)))
The other approach is to count the number of each type of statement that is lost across the chains and model this data using a poisson or negative binomial model.
survival.count <- glmer.nb(Loss_across.Chain ~ Posn.c*mood.c*Valence.c + (1 + Valence.c|mood.c/Chain), data = FinalData_forpoisson, control = glmerControl(optimizer = "bobyqa", check.conv.grad = .makeCC("warning", 0.05)))
The fixed factors in each model are:
Posn.c - position in the chain (centered)
mood.c - mood condition (a between-groups factor, centered)
Valence.c - valence of the item (positive or negative, centered)
Both models return similar results, with one key exception: the interaction between position in the chain and valence is not significant in the logistic model but is highly significant in the negative binomial model. Why might this be the case? Graphing the data suggests that there is indeed an interaction, such that positive information is lost at a faster rate than negative information across the chain.
Any help would be greatly appreciated!
Edit: please see below the model output for both models:
Logistic:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Present ~ Posn.c * mood.c * Valence.c + (1 + Valence.c | mood.c/Chain.) + (1 | Item)
Data: Survival.Analysis_restructureddata
Control: glmerControl(optimizer = "bobyqa", check.conv.grad = .makeCC("warning", 0.002))
AIC BIC logLik deviance df.resid
5795.2 5895.4 -2882.6 5765.2 5873
Scaled residuals:
Min 1Q Median 3Q Max
-7.7595 -0.5744 0.1876 0.5450 5.5047
Random effects:
Groups Name Variance Std.Dev. Corr
Chain.:mood.c (Intercept) 7.550e-01 8.689e-01
Valence.c 1.366e+00 1.169e+00 0.47
Item (Intercept) 1.624e+00 1.274e+00
mood.c (Intercept) 3.708e-18 1.926e-09
Valence.c 7.777e-14 2.789e-07 1.00
Number of obs: 5888, groups: Chain.:mood.c, 92; Item, 16; mood.c, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.43895 0.33331 1.317 0.1879
Posn.c -0.54789 0.03153 -17.378 <2e-16 ***
mood.c -0.23004 0.19436 -1.184 0.2366
Valence.c 1.64397 0.65245 2.520 0.0117 *
Posn.c:mood.c -0.07000 0.06141 -1.140 0.2543
Posn.c:Valence.c 0.06144 0.06301 0.975 0.3295
mood.c:Valence.c -0.05999 0.28123 -0.213 0.8311
Posn.c:mood.c:Valence.c 0.01498 0.12276 0.122 0.9029
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) Posn.c mood.c Vlnc.c Psn.:. Ps.:V. md.:V.
Posn.c -0.009
mood.c -0.001 0.009
Valence.c 0.025 -0.019 -0.002
Posn.c:md.c 0.001 0.007 -0.014 -0.001
Psn.c:Vlnc. -0.018 0.054 -0.002 -0.009 -0.024
md.c:Vlnc.c -0.002 -0.002 0.399 -0.001 -0.065 0.012
Psn.c:m.:V. -0.001 -0.024 -0.046 0.001 0.060 0.007 -0.019
Negative Binomial:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: Negative Binomial(5.0188) ( log )
Formula: Loss_across.Chain ~ Posn.c * mood.c * Valence.c + (1 + Valence.c | mood.c/Chain)
Data: FinalData_forpoisson
Control: ..3
AIC BIC logLik deviance df.resid
1901.3 1970.4 -935.7 1871.3 721
Scaled residuals:
Min 1Q Median 3Q Max
-1.3727 -0.7404 -0.5037 0.4609 7.3896
Random effects:
Groups Name Variance Std.Dev. Corr
Chain:mood.c (Intercept) 1.989e-13 4.46e-07
Valence.c 3.589e-13 5.99e-07 1.00
mood.c (Intercept) 0.000e+00 0.00e+00
Valence.c 1.690e-14 1.30e-07 NaN
Number of obs: 736, groups: Chain:mood.c, 92; mood.c, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.19375 0.04797 -4.039 5.37e-05 ***
Posn.c -0.61020 0.04124 -14.798 < 2e-16 ***
mood.c 0.04862 0.09597 0.507 0.61242
Valence.c -0.27487 0.09594 -2.865 0.00417 **
Posn.c:mood.c -0.04232 0.08252 -0.513 0.60803
Posn.c:Valence.c 0.38080 0.08247 4.617 3.89e-06 ***
mood.c:Valence.c 0.13272 0.19194 0.691 0.48929
Posn.c:mood.c:Valence.c 0.05143 0.16504 0.312 0.75534
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) Posn.c mood.c Vlnc.c Psn.:. Ps.:V. md.:V.
Posn.c 0.491
mood.c -0.014 0.007
Valence.c 0.030 -0.090 -0.036
Posn.c:md.c 0.007 -0.008 0.492 -0.021
Psn.c:Vlnc. -0.090 0.063 -0.021 0.491 -0.030
md.c:Vlnc.c -0.036 -0.021 0.027 -0.014 -0.091 0.007
Psn.c:m.:V. -0.021 -0.030 -0.091 0.007 0.060 -0.008 0.492

(Quasi)-Complete separation according to a random effect in logistic GLMM

I am getting convergence warnings and very large group variance while fitting a binary logistic GLMM using lme4. I am wondering whether this could be related to (quasi-)complete separation according to the random effect, i.e., the fact that many individuals (the random-effect grouping variable) have only 0s in the dependent variable, resulting in low within-individual variation. If this could be a problem, are there alternative modelling strategies for such cases?
More precisely, I am studying the chance that an individual is observed in a given status (having children while living with their parents) at a given age. In other words, I have several observations per individual (typically 50) specifying whether the individual was observed in this state at a given age. Here is an example:
id age status
1 21 0
1 22 0
1 23 0
1 24 1
1 25 0
1 26 1
1 27 0
...
The chance of observing a status of 1 is quite low (between 1 and 5% depending on the case), and I have a lot of observations (150,000 observations and 3,000 individuals).
The model was fitted using glmer, specifying ID (individual) as a random effect and including some explanatory factors (age categories, parental education, and the period in which the status was observed). I get the following convergence warnings (except when using nAGQ=0) and a very large group variance (here more than 25):
"Model failed to converge with max|grad| = 2.21808 (tol = 0.001, component 2)"
"Model is nearly unidentifiable: very large eigenvalue\n - Rescale variables?"
Here is the resulting model output.
AIC BIC logLik deviance df.resid
9625.0 9724.3 -4802.5 9605.0 151215
Scaled residuals:
Min 1Q Median 3Q Max
-2.529 -0.003 -0.002 -0.001 47.081
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 28.94 5.38
Number of obs: 151225, groups: id, 3067
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.603822 0.496392 -21.362 < 2e-16 ***
agecat[18,21) -0.413018 0.075119 -5.498 3.84e-08 ***
agecat[21,24) -1.460205 0.095315 -15.320 < 2e-16 ***
agecat[24,27) -2.844713 0.137484 -20.691 < 2e-16 ***
agecat[27,30) -3.837227 0.199644 -19.220 < 2e-16 ***
parent_educ -0.007390 0.003609 -2.048 0.0406 *
period_cat80 s 0.126521 0.113044 1.119 0.2630
period_cat90 s -0.105139 0.176732 -0.595 0.5519
period_cat00 s -0.507052 0.263580 -1.924 0.0544 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) a[18,2 a[21,2 a[24,2 a[27,3 prnt_d pr_80' pr_90'
agct[18,21) -0.038
agct[21,24) -0.006 0.521
agct[24,27) 0.006 0.412 0.475
agct[27,30) 0.011 0.325 0.393 0.378
parent_educ -0.557 0.059 0.087 0.084 0.078
perd_ct80 s -0.075 -0.258 -0.372 -0.380 -0.352 -0.104
perd_ct90 s -0.048 -0.302 -0.463 -0.471 -0.448 -0.151 0.732
perd_ct00 s -0.019 -0.293 -0.459 -0.434 -0.404 -0.138 0.559 0.739
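As background (a toy sketch, not from the original post): the fixed-effect analogue of this problem is easy to reproduce with plain glm(). When a predictor perfectly separates the 0s and 1s, the likelihood keeps increasing as the coefficient grows without bound, and the same mechanism can push a random-intercept variance toward huge values when many individuals are all-zero:

```r
# Complete separation: x >= 4 perfectly predicts y = 1, so the ML estimate
# of the slope is +Inf. glm() stops at a huge finite value and warns that
# fitted probabilities numerically 0 or 1 occurred.
x <- 1:10
y <- as.numeric(x >= 4)
fit <- suppressWarnings(glm(y ~ x, family = binomial))
coef(fit)["x"]   # very large slope, with an even larger standard error
```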
You could try a few different optimizers, available through the nloptr and optimx packages. There's even an allFit function, available through the afex package, that tries them all for you (see the allFit help file), e.g.:
all_mod <- allFit(exist_model)
That will let you check how stable your estimates are across optimizers.
If you're worried about complete separation, see Ben Bolker's answer recommending the bglmer function from the blme package. It operates much like glmer, but allows you to add priors to the model specification.