Post-hoc test for glmer - r

I'm analysing my binomial dataset in R using a generalized linear mixed model (glmer, lme4 package). I wanted to make pairwise comparisons for a certain fixed effect ("Sound") using Tukey's post-hoc test (glht, multcomp package).
Most of it works fine, but one level of my fixed effect ("SoundC") has no variance at all (96 times a "1" and zero times a "0"), and it seems the Tukey test cannot handle that: all pairwise comparisons involving "SoundC" give a p-value of 1.000, even though some are clearly significant.
As a validation I changed one of the 96 "1"s to a "0"; after that I got normal p-values again and significant differences where I expected them, even though the difference had actually become smaller after my manual change.
Does anybody have a solution? If not, is it fine to use the results of my modified dataset and report my manual change?
Reproducible example:
Response <- c(1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,1,0,
0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,0,
1,1,0,1,1,0,1,1,1,1,0,0,1,1,0,1,1,0,1,1,0,1,1,0,1)
Data <- data.frame(Sound=rep(paste0('Sound',c('A','B','C')),22),
Response,
Individual=rep(rep(c('A','B'),2),rep(c(18,15),2)))
# Visual
boxplot(Response ~ Sound,Data)
# Mixed model
library(lme4)
model10 <- glmer(Response~Sound + (1|Individual), Data, family=binomial)
# Post-hoc analysis
library(multcomp)
summary(glht(model10, mcp(Sound="Tukey")))

This is verging on a CrossValidated question; you are definitely seeing complete separation, where there is a perfect division of your response into 0 vs 1 results. This leads to (1) infinite values of the parameters (they're only listed as non-infinite due to computational imperfections) and (2) crazy/useless values of the Wald standard errors and corresponding $p$ values (which is what you're seeing here). Discussion and solutions are given here, here, and here, but I'll illustrate a little more below.
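You can see the separation directly in the example data: cross-tabulating Sound against Response shows that SoundC has no 0 responses at all.
with(Data, table(Sound, Response))
##          Response
## Sound      0  1
##   SoundA   2 20
##   SoundB  20  2
##   SoundC   0 22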
To be a statistical grouch for a moment: you really shouldn't be trying to fit a random effect with only 3 levels anyway (see e.g. http://glmm.wikidot.com/faq) ...
Firth-corrected logistic regression:
library(logistf)
L1 <- logistf(Response~Sound*Individual,data=Data,
contrasts.arg=list(Sound="contr.treatment",
Individual="contr.sum"))
coef se(coef) p
(Intercept) 3.218876e+00 1.501111 2.051613e-04
SoundSoundB -4.653960e+00 1.670282 1.736123e-05
SoundSoundC -1.753527e-15 2.122891 1.000000e+00
IndividualB -1.995100e+00 1.680103 1.516838e-01
SoundSoundB:IndividualB 3.856625e-01 2.379919 8.657348e-01
SoundSoundC:IndividualB 1.820747e+00 2.716770 4.824847e-01
Standard errors and p-values are now reasonable (p-value for the A vs C comparison is 1 because there is literally no difference ...)
Mixed Bayesian model with weak priors:
library(blme)
model20 <- bglmer(Response~Sound + (1|Individual), Data, family=binomial,
fixef.prior = normal(cov = diag(9,3)))
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.711485 2.233667 0.7662221 4.435441e-01
## SoundSoundB -5.088002 1.248969 -4.0737620 4.625976e-05
## SoundSoundC 2.453988 1.701674 1.4421024 1.492735e-01
The specification diag(9,3) of the fixed-effect variance-covariance matrix produces
$$
\left(
\begin{array}{ccc}
9 & 0 & 0 \\
0 & 9 & 0 \\
0 & 0 & 9
\end{array}
\right)
$$
In other words, the 3 specifies the dimension of the matrix (equal to the number of fixed-effect parameters), and the 9 specifies the variance -- this corresponds to a standard deviation of 3, or a 95% range of about $\pm 6$, which is quite large/weak/uninformative for logit-scaled responses.
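You can check what diag(9,3) builds directly at the R prompt:
diag(9, 3)
##      [,1] [,2] [,3]
## [1,]    9    0    0
## [2,]    0    9    0
## [3,]    0    0    9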
These estimates are roughly consistent with the Firth-corrected results (even though the model is very different):
library(multcomp)
summary(glht(model20, mcp(Sound="Tukey")))
## Estimate Std. Error z value Pr(>|z|)
## SoundB - SoundA == 0 -5.088 1.249 -4.074 0.000124 ***
## SoundC - SoundA == 0 2.454 1.702 1.442 0.309216
## SoundC - SoundB == 0 7.542 1.997 3.776 0.000397 ***
As I said above, I would not recommend a mixed model in this case anyway ...

Related

How to check p-value in REML (or mixed model) using lmer() in R?

This is my data, and I'd like to do a REML analysis to see the variance components of the random factors.
Plant<- rep(c("P1","P2","P3","P4"), each=9)
Leaves<- rep(rep(c("L1","L2","L3"), each=3),4)
Rep<-rep(c(1,2,3),12)
Ca<- c(3.280, 3.090, 3.185, 3.520, 3.600, 3.560, 2.880, 2.800, 2.840, 2.460, 2.440,
2.450, 1.870, 1.800, 1.835, 2.190, 2.100, 2.145, 2.770, 2.660, 2.715, 3.740,
3.440, 3.590, 2.550, 2.700, 2.625, 3.780, 3.870, 3.825, 4.070, 4.200, 4.135,
3.310, 3.400, 3.355)
tomato<- data.frame(Plant,Leaves,Rep,Ca)
and this is my code
library(lme4)
lmer <- lmer(Ca ~ (1|Plant)+ (1|Plant:Leaves), REML=TRUE, data=tomato)
summary(lmer)
I assumed that leaves are nested within plants, so I coded this as (1|Plant:Leaves).
This result indicates that Plant and Leaves account for most of the variability in the data, while the replicates contribute little, doesn't it?
Then I want to know whether Plant and Leaves (nested within Plant) are significant or not. Where can I find the p-values, or can I add more code to check them?
Or, with REML, can we only inspect the variance components?
If so, and if I instead choose Plant as a fixed factor like below,
lmer <- lmer(Ca ~ Plant+ (1|Plant:Leaves), REML=TRUE, data=tomato)
summary(lmer)
At least in this mixed model, shouldn't a p-value for Plant (the fixed factor) be presented? I still can't find it.
Could you guide me on how to interpret the results of a REML analysis?
This is an older question, but I wanted to clarify in case others come across it. It is usually not standard to test the significance of random effects directly, and it is worth asking why you are interested in such tests in the first place. An example from Ben Bolker's useful FAQ on mixed models shows one way of assessing them, using likelihood ratio tests to compare models with differing random effects:
library(lme4)
m2 <- lmer(Reaction~Days+(1|Subject)+(0+Days|Subject),sleepstudy,REML=FALSE)
m1 <- update(m2,.~Days+(1|Subject))
m0 <- lm(Reaction~Days,sleepstudy)
anova(m2,m1,m0) ## two sequential tests
This provides a summary of whether the models with different random-effects structures differ significantly, based on chi-square tests:
## Data: sleepstudy
## Models:
## m0: Reaction ~ Days
## m1: Reaction ~ Days + (1 | Subject)
## m2: Reaction ~ Days + (1 | Subject) + (0 + Days | Subject)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m0 3 1906.3 1915.9 -950.15 1900.3
## m1 4 1802.1 1814.8 -897.04 1794.1 106.214 1 < 2.2e-16 ***
## m2 5 1762.0 1778.0 -876.00 1752.0 42.075 1 8.782e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
You can see here that the test compares the null model (m0), a model with random intercepts for Subject only (m1), and a model that adds an uncorrelated random slope for Days within Subject (m2). From this we can, in part, conclude that m2 may provide the best balance of parsimony and complexity. However, it is very important to have both theory-driven and data-driven reasons for this decision.
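If what you are ultimately after are p-values for the fixed effect (Plant in your second model), one common option beyond the comparison above is the lmerTest package, which adds Satterthwaite-approximation tests to lmer output and also offers ranova() for dropping random terms. A minimal sketch, assuming the tomato data from the question:
library(lmerTest)  # loads lme4 and replaces lmer() with a version that reports p-values
fit <- lmer(Ca ~ Plant + (1 | Plant:Leaves), REML = TRUE, data = tomato)
summary(fit)   # t-tests with Satterthwaite degrees of freedom for the Plant contrasts
anova(fit)     # overall F-test for the Plant fixed effect
ranova(fit)    # likelihood-ratio test for dropping the (1 | Plant:Leaves) term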

R - Repeated measure analysis - Different results for LME and Tukey post hoc test

I'm currently running a repeated measure analysis in R on 4 sub-factors: SF1, SF2, SF3, SF4
First, it should be noted that the assumption of sphericity is violated; the sample size is reasonably large (N = 188), but the group sizes are not equal.
Contrasts are set up to test that SF1 and SF2 (combined) are significantly higher than SF3 and SF4 (combined), whereas SF1 vs SF2 and SF3 vs SF4 do not differ significantly.
I.e.
Contr1<-c(1, 1, -1, -1)
Contr2<-c(1, -1, 0, 0)
Contr3<-c(0, 0, 1, -1)
contrasts(rep_table_long$Subfactor)<-cbind(Contr1, Contr2, Contr3)
The general model code is the following
rep_model <- lme(Value ~ Subfactor, random = ~1|Subject/Subfactor, data = rep_table_long, method ="ML")
By executing summary(rep_model) I received the following (truncated) output
Fixed effects: Value ~ Subfactor
Value Std.Error DF t-value p-value
(Intercept) 5.498910 0.07229032 561 76.06703 0.0000
SubfactorContr1 0.459601 0.03066438 561 14.98811 0.0000
SubfactorContr2 0.085266 0.04336598 561 1.96619 0.0498
SubfactorContr3 0.093617 0.04336598 561 2.15877 0.0313
Thus, SF1 & SF2 are significantly larger than SF3 & SF4, but SF1 is also significantly larger than SF2, and SF3 than SF4.
However, and here comes the reason for my question, the post hoc Tukey test showed different results:
> postHocs <- glht (rep_model, linfct = mcp(Subfactor = "Tukey"))
> summary(postHocs)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lme.formula(fixed = Value ~ Subfactor, data = rep_table_long, random = ~1 | Subject/Subfactor, method = "ML")
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
SF2- SF1 == 0 -0.1872 0.0865 -2.165 0.133
SF3- SF1 == 0 -0.9275 0.0865 -10.723 <0.001
SF4- SF1 == 0 -1.0981 0.0865 -12.694 <0.001
SF3- SF2 == 0 -0.7403 0.0865 -8.559 <0.001
SF4- SF2 == 0 -0.9109 0.0865 -10.530 <0.001
SF4- SF3 == 0 -0.1705 0.0865 -1.971 0.199
The result of the post hoc Tukey tests shows that the difference between SF2 and SF1 as well as between SF4 and SF3 are not significantly different.
Why do I get different results in both tests? Is it because sphericity is violated? Or am I doing something wrong here?
Any help is very appreciated.
I read about that in Andy Field's book. The reason you get different results from planned contrasts vs. a post hoc test is that post hoc tests are two-tailed and hence suited to exploratory analyses (i.e., no a priori hypotheses), whereas contrasts are one-tailed. If you think about t-tests, it makes a difference whether you conduct a two-tailed test instead of a one-tailed test. Also, post hoc tests like Tukey's are conservative (they lack statistical power). These might be the reasons why you did not find any significance with Tukey.
Also, only use Tukey if your sample sizes are equal and you are confident that your population variances are similar. (Tukey also performs better with a larger number of means, which is the case here.)
Hope this helped.
Reference: Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage publications.
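As a side note, if you want the planned contrasts and the Tukey comparisons on the same simultaneous (two-sided, multiplicity-adjusted) footing, glht also accepts a user-supplied contrast matrix. A sketch, assuming rep_model from above and that the Subfactor levels are ordered SF1, SF2, SF3, SF4:
library(multcomp)
K <- rbind("SF1+SF2 vs SF3+SF4" = c( 1,  1, -1, -1) / 2,
           "SF1 vs SF2"         = c( 1, -1,  0,  0),
           "SF3 vs SF4"         = c( 0,  0,  1, -1))
summary(glht(rep_model, linfct = mcp(Subfactor = K)))  # simultaneous tests of the planned contrasts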

How to properly set contrasts in R

I have been asked to see if there is a linear trend in 3 groups of data (5 points each) by using ANOVA and linear contrasts. The 3 groups represent data collected in 2010, 2011 and 2012. I want to use R for this procedure and I have tried both of the following:
contrasts(data$groups, how.many=1) <- contr.poly(3)
contrasts(data$groups) <- contr.poly(3)
Both ways seem to work fine but give slightly different answers in terms of their p-values. I have no idea which is correct, and it is really tricky to find help for this on the web. I would like help figuring out the reasoning behind the different answers; I'm not sure if it has something to do with how the sums of squares are partitioned.
Both approaches differ with respect to whether a quadratic polynomial is used.
For illustration purposes, have a look at this example, where both x and y are factors with three levels.
x <- y <- gl(3, 2)
# [1] 1 1 2 2 3 3
# Levels: 1 2 3
The approach without how.many (your second call) creates a contrast matrix for a full quadratic polynomial, i.e., with a linear (.L) and a quadratic (.Q) trend. The 3 means: create the polynomial contrasts for 3 levels, i.e., up to degree 3 - 1 = 2.
contrasts(x) <- contr.poly(3)
# [1] 1 1 2 2 3 3
# attr(,"contrasts")
# .L .Q
# 1 -7.071068e-01 0.4082483
# 2 -7.850462e-17 -0.8164966
# 3 7.071068e-01 0.4082483
# Levels: 1 2 3
In contrast, specifying how.many = 1 (your first call) results in a polynomial of first order, i.e., a linear trend only, because only 1 contrast is created.
contrasts(y, how.many = 1) <- contr.poly(3)
# [1] 1 1 2 2 3 3
# attr(,"contrasts")
# .L
# 1 -7.071068e-01
# 2 -7.850462e-17
# 3 7.071068e-01
# Levels: 1 2 3
If you're interested in the linear trend only, the second option seems more appropriate for you.
Changing the contrasts you ask for changes the degrees of freedom of the model. If one model requests linear and quadratic contrasts, and a second specifies only, say, the linear contrast, then the second model has an extra residual degree of freedom: this increases the power to test the linear hypothesis, at the cost of preventing the model from fitting the quadratic trend.
Using the full ("nlevels - 1") set of contrasts creates an orthogonal set that explores the full set of (independent) response configurations. Cutting back to just one prevents the model from fitting one configuration (in this case the quadratic component, which our data in fact possess).
To see how this works, use the built-in dataset mtcars, and test the (confounded) relationship of gears to gallons. We'll hypothesize that the more gears the better (at least up to some point).
df = mtcars # copy the dataset
df$gear = as.ordered(df$gear) # make an ordered factor
Ordered factors default to polynomial contrasts, but we'll set them here to be explicit:
contrasts(df$gear) <- contr.poly(nlevels(df$gear))
Then we can model the relationship.
m1 = lm(mpg ~ gear, data = df);
summary.lm(m1)
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 20.6733 0.9284 22.267 < 2e-16 ***
# gear.L 3.7288 1.7191 2.169 0.03842 *
# gear.Q -4.7275 1.4888 -3.175 0.00353 **
#
# Multiple R-squared: 0.4292, Adjusted R-squared: 0.3898
# F-statistic: 10.9 on 2 and 29 DF, p-value: 0.0002948
Note we have F(2,29) = 10.9 for the overall model and p=.038 for our linear effect with an estimated extra 3.7 mpg/gear.
Now let's only request the linear contrast, and run the "same" analysis.
contrasts(df$gear, how.many = 1) <- contr.poly(nlevels(df$gear))
m1 = lm(mpg ~ gear, data = df)
summary.lm(m1)
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 21.317 1.034 20.612 <2e-16 ***
# gear.L 5.548 1.850 2.999 0.0054 **
# Multiple R-squared: 0.2307, Adjusted R-squared: 0.205
# F-statistic: 8.995 on 1 and 30 DF, p-value: 0.005401
The linear effect of gear is now bigger (5.5 mpg) and p << .05. A win? Except the overall model fit is now significantly worse: the variance accounted for is now just 23% (it was 43%)! Why becomes clear if we plot the relationship:
plot(mpg ~ gear, data = df) # view the relationship
So, if you're interested in the linear trend, but also expect (or are unclear about) additional levels of complexity, you should also test these higher polynomials: the quadratic or, in general, trends up to nlevels - 1.
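One way to formalise that check, reusing df from above: fit the factor once with the full polynomial contrasts and once with the linear contrast only, then compare the nested models with anova().
contrasts(df$gear) <- contr.poly(nlevels(df$gear))               # linear + quadratic
m_full <- lm(mpg ~ gear, data = df)
contrasts(df$gear, how.many = 1) <- contr.poly(nlevels(df$gear)) # linear only
m_lin <- lm(mpg ~ gear, data = df)
anova(m_lin, m_full)   # F-test for the quadratic component dropped by the reduced model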
Note too that in this example the physical mechanism is confounded: We've forgotten that number of gears is confounded with automatic vs manual transmission, and also with weight, and sedan vs sports car.
If someone wants to test the hypothesis that 4 gears is better than 3, they could answer this question :-)

Simple effect tests on glmm significant interactions in R

I have significant interactions in my glmm output and I know what my reference category (another interaction) is, but I need to know which variable is fixed when contrasting the significant interaction with the reference category.
I am looking for a post-hoc test called "simple effects test" (not Tukey). The same test is called "test slice" in JMP for any of you who use both R and JMP.
I have looked everywhere but cannot find the code for the simple effects test. Does anyone know how to use this test in R?
Here is an example of my glmm (using neg. binomial distribution) output:
Call:
glm.nb(formula = N ~ FoodCategory * Season + FoodCategory + Season +
(1 | Group/Animal), data = SPwg, init.theta = 0.8744631431,
link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.4796 -0.9720 -0.3713 -0.0350 4.7595
Coefficients: (2 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.2763 0.2940 0.939 0.34748
FoodCategoryFruit 0.8849 0.3316 2.669 0.00762 **
FoodCategoryInvertebrate -0.1962 0.5086 -0.386 0.69966
FoodCategoryPlantMatter 0.4169 1.3153 0.317 0.75128
SeasonHFLC -0.2250 0.4435 -0.507 0.61195
SeasonLFLC -0.2763 0.4610 -0.599 0.54904
1 | Group/AnimalTRUE NA NA NA NA
FoodCategoryFruit:SeasonHFLC 1.1511 0.4811 2.393 0.01673 *
FoodCategoryInvertebrate:SeasonHFLC 1.6265 0.6784 2.398 0.01651 *
FoodCategoryPlantMatter:SeasonHFLC NA NA NA NA
FoodCategoryFruit:SeasonLFLC 1.5565 0.4997 3.115 0.00184 **
FoodCategoryInvertebrate:SeasonLFLC 0.3016 0.7822 0.386 0.69984
FoodCategoryPlantMatter:SeasonLFLC 0.8640 1.4630 0.591 0.55479
---
My reference category is "FoodCategoryOther:SeasonHFHL". I know from this output, for example, that "FoodCategoryFruit:SeasonLFLC" is significantly more positive than my reference category.
However, I do not know if this is because "FoodCategoryFruit" is significantly more positive than "FoodCategoryPlantMatter" during the "SeasonLFLC" (for example) or if "FoodCategoryFruit" is significantly more positive during the "SeasonLFLC" than "FoodCategoryFruit" is during the "SeasonHFHL".
A simple effects test will fix one of the variables while testing for the effects of the other. This is what I need to work out the problem, unless someone can inform me of a similar/better/more appropriate test. However, please don't tell me Tukey, because this post-hoc test does not fix one variable while testing for the effects of the other.
This... is not a GLMM (generalized linear mixed model). You're fitting a regular old GLM with fixed effects only, albeit with the wrinkle of a negative binomial error distribution. Because glm.nb doesn't understand random-effects notation, your (1 | Group/Animal) term has been interpreted as an arithmetic/logical expression, i.e., 1 ORed with the result of Group divided by Animal. 1 ORed with anything is identically TRUE, hence the NA coefficient for this term.
For an actual GLMM, you'll need to use something like glmer in the lme4 package, or the arm package (and possibly others I don't know about).
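A minimal sketch of what that could look like, assuming your SPwg data frame and that Season is stored as a factor: glmer.nb() in lme4 fits a negative-binomial GLMM with a genuine random effect, and re-levelling Season makes the FoodCategory main-effect coefficients the simple effects within the chosen season.
library(lme4)
fit <- glmer.nb(N ~ FoodCategory * Season + (1 | Group/Animal), data = SPwg)
summary(fit)   # FoodCategory coefficients are the simple effects within the reference Season

# make another season the reference ("LFLC" here, assuming that is a level name),
# then refit to read off the simple effects of FoodCategory within that season
SPwg$Season <- relevel(SPwg$Season, ref = "LFLC")
fit_LFLC <- glmer.nb(N ~ FoodCategory * Season + (1 | Group/Animal), data = SPwg)
summary(fit_LFLC)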

Getting Generalized Least Squares Means for fixed effects in nlme or lme4

Least-squares means and their standard errors for an aov object can be obtained with the model.tables function:
npk.aov <- aov(yield ~ block + N*P*K, npk)
model.tables(npk.aov, "means", se = TRUE)
I wonder how to get the generalized least squares means with their standard errors from nlme or lme4 objects:
library(nlme)
data(Machines)
fm1Machine <- lme(score ~ Machine, data = Machines, random = ~ 1 | Worker )
Any comment and hint will be highly appreciated. Thanks
lme and nlme fit models by maximum likelihood or restricted maximum likelihood (the latter is the default), so your results will be based on one of those methods.
summary(fm1Machine) will provide you with the output that includes the means and standard errors:
....irrelevant output deleted
Fixed effects: score ~ Machine
Value Std.Error DF t-value p-value
(Intercept) 52.35556 2.229312 46 23.48507 0
MachineB 7.96667 1.053883 46 7.55935 0
MachineC 13.91667 1.053883 46 13.20514 0
Correlation:
....irrelevant output deleted
Because you have fitted the fixed effects with an intercept, you get an intercept term in the fixed effects result instead of a result for MachineA. The results for MachineB and MachineC are contrasts with the intercept, so to get the means for MachineB and MachineC, add the value of each to the intercept mean. But the standard errors are not the ones you would like.
To get the information you are after, fit the model so that it doesn't have an intercept term in the fixed effects (see the -1 at the end of the fixed-effects formula):
fm1Machine <- lme(score ~ Machine-1, data = Machines, random = ~ 1 | Worker )
This will then give you the means and standard error output you want:
....irrelevant output deleted
Fixed effects: score ~ Machine - 1
Value Std.Error DF t-value p-value
MachineA 52.35556 2.229312 46 23.48507 0
MachineB 60.32222 2.229312 46 27.05867 0
MachineC 66.27222 2.229312 46 29.72765 0
....irrelevant output deleted
To quote Douglas Bates from
http://markmail.org/message/dqpk6ftztpbzgekm
"I have a strong suspicion that, for most users, the definition of lsmeans is "the numbers that I get from SAS when I use an lsmeans statement". My suggestion for obtaining such numbers is to buy a SAS license and use SAS to fit your models."
