I have seen several examples of how to select specific pairwise comparisons, but unfortunately I do not know how to apply them to my data.
Here is my abbreviated data set: https://www.dropbox.com/s/x9xjc9o0222rg0w/df.csv?dl=0
# FIXED effects: age and brain_area
df$age <- factor(df$age)
df$brain_area <- factor(df$brain_area)
# RANDOM effects: subject_ID and section
df$subject_ID <- factor(df$subject_ID)
df$section <- factor(df$section)
# dependent variable: DV
# ___________________ mixed TWO-way ANOVA
require(lme4)
require(lmerTest)
require(emmeans)
model = lmer(DV ~ age * brain_area + (1 | subject_ID), data = df)
anova(model) # significant interaction and both main effects
# ____________________ ALL pairwise comparisons
emmeans(model, pairwise~brain_area|age, adj='fdr')
# ____________________ I marked below comparisons that I would like to exclude (but keep all others)
$contrasts
age = old:
contrast estimate SE df t.ratio p.value
a - b 0.0412 0.0158 174 2.603 0.0125
a - c -0.0566 0.0158 174 -3.572 0.0007
a - control 0.3758 0.0158 174 23.736 <.0001 # exclude
a - d -0.0187 0.0158 174 -1.182 0.2387
b - c -0.0978 0.0158 174 -6.175 <.0001
b - control 0.3346 0.0158 174 21.132 <.0001 # exclude
b - d -0.0599 0.0158 174 -3.786 0.0004
c - control 0.4324 0.0158 174 27.308 <.0001
c - d 0.0378 0.0158 174 2.389 0.0199
control - d -0.3946 0.0158 174 -24.918 <.0001 # exclude
age = young:
contrast estimate SE df t.ratio p.value
a - b 0.0449 0.0147 174 3.063 0.0032
a - c -0.0455 0.0147 174 -3.105 0.0032
a - control 0.2594 0.0147 174 17.694 <.0001 # exclude
a - d 0.0202 0.0147 174 1.377 0.1702
b - c -0.0904 0.0147 174 -6.169 <.0001
b - control 0.2145 0.0147 174 14.631 <.0001 # exclude
b - d -0.0247 0.0147 174 -1.686 0.1040
c - control 0.3049 0.0147 174 20.799 <.0001
c - d 0.0657 0.0147 174 4.483 <.0001
control - d -0.2392 0.0147 174 -16.317 <.0001 # exclude
# ____________________ The line below seems to work BUT completely excludes 'control' level from factor 'brain_area'. I do not wish to completely exclude it...
emmeans(model, specs=pairwise~brain_area| age,
at = list(brain_area = c("a", "b", "c", "d")), adj='fdr' )
You need to provide the contrast coefficients manually. In this case, it's fairly simple to obtain all of them, then remove the ones you don't want; something like this:
EMM <- emmeans(model, ~ brain_area | age)
EMM # show the means
coef <- emmeans:::pairwise.emmc(levels(EMM)[["brain_area"]])
coef <- coef[-c(3, 6, 10)]
contrast(EMM, coef, adjust = "fdr")
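Dropping columns by position depends on the order of the factor levels, so a variant that drops them by name may be easier to check. The following is only a sketch: the data frame toy, its random DV values, and the plain lm() fit are stand-ins invented for illustration, not the asker's data or model.

```r
library(emmeans)

# Hypothetical stand-in data with the same factor levels as in the question
set.seed(1)
toy <- expand.grid(brain_area = c("a", "b", "c", "control", "d"),
                   age = c("old", "young"),
                   rep = 1:6)
toy$DV <- rnorm(nrow(toy))
fit <- lm(DV ~ age * brain_area, data = toy)

EMM <- emmeans(fit, ~ brain_area | age)

# All pairwise coefficients; one named column per comparison
coef <- emmeans:::pairwise.emmc(levels(EMM)[["brain_area"]])

# Remove the unwanted comparisons by name instead of by position
drop <- c("a - control", "b - control", "control - d")
stopifnot(all(drop %in% names(coef)))
keep <- coef[setdiff(names(coef), drop)]

res <- contrast(EMM, keep, adjust = "fdr")
res
```

The stopifnot() call guards against a typo in the names, which would otherwise silently leave all ten comparisons in place.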
I have a mixed-effects model (fit with lme4) with a two-way interaction, each factor having four levels, and I would like to investigate their effects relative to the grand mean. I present an example here using the Cars93 data set and omit the error term since it is not necessary for this example:
## shorten data frame for simplicity
df=Cars93[c(1:15),]
df=Cars93[is.element(Cars93$Make,c('Acura Integra', 'Audi 90','BMW 535i','Subaru Legacy')),]
df$Make=drop.levels(df$Make)
df$Model=drop.levels(df$Model)
## define contrasts (every factor has 4 levels)
contrasts(df$Make) = contr.treatment(4)
contrasts(df$Model) = contr.treatment(4)
## model
m1 <- lm(Price ~ Model*Make,data=df)
summary(m1)
As you can see, the first levels are omitted in the interaction term. I would like to have all 4 levels in the output, referenced to the grand mean (often referred to as deviation coding). These are the sources I looked at: https://marissabarlaz.github.io/portfolio/contrastcoding/#coding-schemes and How to change contrasts to compare with mean of all levels rather than reference level (R, lmer)?. The last reference does not cover interactions, though.
The simple answer is that what you want is not possible directly. You have to use a slightly different approach.
In a model with interactions, you want to use sum-to-zero contrasts, in which zero corresponds to the grand mean rather than to a specific reference level. Otherwise, the lower-order effects (i.e., main effects) are not main effects but simple effects (evaluated when the other factor is at its reference level). This is explained in more detail in my chapter on mixed models:
http://singmann.org/download/publications/singmann_kellen-introduction-mixed-models.pdf
To get what you want, you have to fit the model with such contrasts and then pass it to emmeans to compare against the intercept (i.e., the unweighted grand mean). This also works for interactions, as shown below (as your code did not work, I use warpbreaks).
afex::set_sum_contrasts() ## uses contr.sum globally
library("emmeans")
## model
m1 <- lm(breaks ~ wool * tension,data=warpbreaks)
car::Anova(m1, type = 3)
coef(m1)[1]
# (Intercept)
# 28.14815
## both CIs include grand mean:
emmeans(m1, "wool")
# wool emmean SE df lower.CL upper.CL
# A 31.0 2.11 48 26.8 35.3
# B 25.3 2.11 48 21.0 29.5
#
# Results are averaged over the levels of: tension
# Confidence level used: 0.95
## same using test
emmeans(m1, "wool", null = coef(m1)[1], infer = TRUE)
# wool emmean SE df lower.CL upper.CL null t.ratio p.value
# A 31.0 2.11 48 26.8 35.3 28.1 1.372 0.1764
# B 25.3 2.11 48 21.0 29.5 28.1 -1.372 0.1764
#
# Results are averaged over the levels of: tension
# Confidence level used: 0.95
emmeans(m1, "tension", null = coef(m1)[1], infer = TRUE)
# tension emmean SE df lower.CL upper.CL null t.ratio p.value
# L 36.4 2.58 48 31.2 41.6 28.1 3.196 0.0025
# M 26.4 2.58 48 21.2 31.6 28.1 -0.682 0.4984
# H 21.7 2.58 48 16.5 26.9 28.1 -2.514 0.0154
#
# Results are averaged over the levels of: wool
# Confidence level used: 0.95
emmeans(m1, c("tension", "wool"), null = coef(m1)[1], infer = TRUE)
# tension wool emmean SE df lower.CL upper.CL null t.ratio p.value
# L A 44.6 3.65 48 37.2 51.9 28.1 4.499 <.0001
# M A 24.0 3.65 48 16.7 31.3 28.1 -1.137 0.2610
# H A 24.6 3.65 48 17.2 31.9 28.1 -0.985 0.3295
# L B 28.2 3.65 48 20.9 35.6 28.1 0.020 0.9839
# M B 28.8 3.65 48 21.4 36.1 28.1 0.173 0.8636
# H B 18.8 3.65 48 11.4 26.1 28.1 -2.570 0.0133
#
# Confidence level used: 0.95
Note that for lme4 models you probably want to use fixef() instead of coef().
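For an lme4 model, the same idea might look roughly like the sketch below. It is an illustration only: the cake data set shipped with lme4 and the random-effects structure are my choices, not taken from the question, and the contrasts are set per model here rather than globally.

```r
library(lme4)
library(emmeans)

# Sum-to-zero contrasts for both factors, set for this model only
m2 <- lmer(angle ~ recipe * temperature + (1 | recipe:replicate),
           data = cake,
           contrasts = list(recipe = "contr.sum", temperature = "contr.sum"))

# With sum-to-zero contrasts the fixed intercept is the unweighted grand mean
grand_mean <- fixef(m2)[["(Intercept)"]]
grand_mean

# Test each recipe's marginal mean against the grand mean
emmeans(m2, "recipe", null = grand_mean, infer = TRUE)
```

Passing the contrasts argument to lmer() avoids changing the global options, which may matter if other models in the same session rely on treatment coding.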
I'm following this tutorial as well as ?eff_size from package emmeans to compute eff_size() for my regression model below.
But I get the error: need an object with call component from the eff_size() call. Am I missing something?
library(lme4)
library(emmeans)
h <- read.csv('https://raw.githubusercontent.com/hkil/m/master/h.csv')
h$year <- as.factor(h$year)
m <- lmer(scale~year*group + (1|stid), data = h)
ems <- emmeans(m, pairwise ~ group*year, infer = c(T, T))
eff_size(ems, sigma = sigma(m), edf = df.residual(m))
# `Error: need an object with call component`
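Are you trying to do this? The specs argument should be a character vector rather than a pairwise ~ formula, so that emmeans() returns a single grid that eff_size() can use.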
m <- lmer(scale~year*group + (1|stid), data = h)
ems <- emmeans(m, c("group","year"), infer = c(T, T))
eff_size(ems, sigma = sigma(m), edf = df.residual(m))
Output:
contrast effect.size SE df lower.CL upper.CL
C,0 - T,0 0.7289 0.224 507 0.289 1.169
C,0 - C,1 -2.0011 0.134 507 -2.263 -1.739
C,0 - T,1 -1.2370 0.229 507 -1.687 -0.787
C,0 - C,2 -3.1640 0.161 507 -3.481 -2.847
C,0 - T,2 -3.1173 0.261 507 -3.630 -2.605
T,0 - C,1 -2.7300 0.239 536 -3.199 -2.261
T,0 - T,1 -1.9659 0.159 536 -2.279 -1.653
T,0 - C,2 -3.8929 0.257 536 -4.397 -3.388
T,0 - T,2 -3.8462 0.218 536 -4.275 -3.417
C,1 - T,1 0.7642 0.232 558 0.308 1.220
C,1 - C,2 -1.1628 0.139 558 -1.436 -0.889
C,1 - T,2 -1.1162 0.253 558 -1.614 -0.619
T,1 - C,2 -1.9270 0.244 572 -2.406 -1.448
T,1 - T,2 -1.8803 0.193 572 -2.259 -1.502
C,2 - T,2 0.0467 0.258 634 -0.460 0.554
sigma used for effect sizes: 63.67
Degrees-of-freedom method: inherited from kenward-roger when re-gridding
Confidence level used: 0.95
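If the pairwise ~ form is wanted for other reasons, it should also be possible to pull the means out of the returned list and hand only that part to eff_size(). A minimal sketch, using lm() on the built-in warpbreaks data as a stand-in for the asker's model:

```r
library(emmeans)

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)

# A two-sided formula gives a list with two parts: $emmeans and $contrasts
both <- emmeans(warp.lm, pairwise ~ wool * tension)
class(both)

# eff_size() wants the single grid of means, not the whole list
es <- eff_size(both$emmeans, sigma = sigma(warp.lm),
               edf = df.residual(warp.lm))
es
```

For a mixed model, which sigma and edf to use is a separate judgment call; the values above are only appropriate for this simple lm() stand-in.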
To create the data frame:
num <- sample(1:25, 20)
x <- data.frame("Day_eclosion" = num, "Developmental" = c("AP", "MA",
"JU", "L"), "Replicate" = 1:5)
model <- glmer(Day_eclosion ~ Developmental + (1 | Replicate), family =
"poisson", data= x)
I get this return from:
a <- lsmeans(model, pairwise~Developmental, adjust = "tukey")
a$contrasts
contrast estimate SE df z.ratio p.value
AP - JU 0.2051 0.0168 Inf 12.172 <.0001
AP - L 0.3009 0.0212 Inf 14.164 <.0001
AP - MA 0.3889 0.0209 Inf 18.631 <.0001
JU - L 0.0958 0.0182 Inf 5.265 <.0001
JU - MA 0.1839 0.0177 Inf 10.387 <.0001
L - MA 0.0881 0.0222 Inf 3.964 0.0004
I am looking for a simple way to turn this output (just p values) into:
AP MA JU L
AP - <.0001 <.0001 <.0001
MA - - <.0001 0.0004
JU - - - <.0001
L - - -
I have about 20 sets of these that I need to turn into tables, so the simpler and more general the better.
Bonus points if the output is tab-delimited, etc., so that I can easily paste it into Word/Excel.
Thanks!
Here's a function that works...
pvmat = function(emm, ...) {
emm = update(emm, by = NULL) # need to work harder otherwise
pv = test(pairs(emm, reverse = TRUE, ...)) $ p.value
fmtpv = sprintf("%6.4f", pv)
fmtpv[pv < 0.0001] = "<.0001"
lbls = do.call(paste, emm@grid[emm@misc$pri.vars])
n = length(lbls)
mat = matrix("", nrow = n, ncol = n, dimnames = list(lbls, lbls))
mat[upper.tri(mat)] = fmtpv
idx = seq_len(n - 1)
mat[idx, 1 + idx] # trim off last row and 1st col
}
Illustration:
require(emmeans)
> warp.lm = lm(breaks ~ wool * tension, data = warpbreaks)
> warp.emm = emmeans(warp.lm, ~ wool * tension)
> warp.emm
wool tension emmean SE df lower.CL upper.CL
A L 44.6 3.65 48 37.2 51.9
B L 28.2 3.65 48 20.9 35.6
A M 24.0 3.65 48 16.7 31.3
B M 28.8 3.65 48 21.4 36.1
A H 24.6 3.65 48 17.2 31.9
B H 18.8 3.65 48 11.4 26.1
Confidence level used: 0.95
> pm = pvmat(warp.emm, adjust = "none")
> print(pm, quote=FALSE)
    B L    A M    B M    A H    B H
A L 0.0027 0.0002 0.0036 0.0003 <.0001
B L        0.4170 0.9147 0.4805 0.0733
A M               0.3589 0.9147 0.3163
B M                      0.4170 0.0584
A H                             0.2682
Notes
As provided, this does not support by variables. Accordingly, the first line of the function disables them.
Using pairs(..., reverse = TRUE) generates the P values in the order needed later for upper.tri().
You can pass arguments to test() via ...
To create a tab-delimited version, use the clipr package:
clipr::write_clip(pm)
What you need is now in the clipboard and ready to paste into a spreadsheet.
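With about 20 tables to export, writing files may be handier than the clipboard. The sketch below uses base R's write.table() instead of clipr; the small matrix pm is a hand-typed stand-in for the result of pvmat(), and the temporary file path is arbitrary.

```r
# Stand-in for a pvmat() result: a character matrix with row/column labels
pm <- matrix(c("0.0027", "", "0.0002", "0.4170"), nrow = 2,
             dimnames = list(c("A L", "B L"), c("B L", "A M")))

# Tab-separated file; col.names = NA leaves an empty header cell
# above the row labels so the columns line up in a spreadsheet
out <- tempfile(fileext = ".tsv")
write.table(pm, file = out, sep = "\t", quote = FALSE, col.names = NA)

readLines(out)
```

Looping this over a list of matrices with a different file name for each would cover the many-tables case.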
Addendum
Answering this question inspired me to add a new function pwpm() to the emmeans package. It will appear in the next CRAN release, and is available now from the github site. It displays means and differences as well as P values; but the user may select which to include.
> pwpm(warp.emm)
wool = A
L M H
L [44.6] 0.0007 0.0009
M 20.556 [24.0] 0.9936
H 20.000 -0.556 [24.6]
wool = B
L M H
L [28.2] 0.9936 0.1704
M -0.556 [28.8] 0.1389
H 9.444 10.000 [18.8]
Row and column labels: tension
Upper triangle: P values adjust = "tukey"
Diagonal: [Estimates] (emmean)
Lower triangle: Comparisons (estimate) earlier vs. later
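To show only the P values, the means and differences can be switched off. This is a sketch that assumes the means, diffs and adjust arguments behave as described in ?pwpm for your installed version of emmeans:

```r
library(emmeans)

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
warp.emm <- emmeans(warp.lm, ~ tension | wool)

# P values only: suppress the diagonal means and the lower-triangle differences
pv_only <- pwpm(warp.emm, means = FALSE, diffs = FALSE, adjust = "none")
pv_only
```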
I am using an LME model defined as follows:
mod4.lme <- lme(pRNFL ~ Init.Age + Status + I(Time^2), random= ~1|Patient/EyeID,data = long1, na.action = na.omit)
The output is:
> summary(mod4.lme)
Linear mixed-effects model fit by REML
Data: long1
AIC BIC logLik
2055.295 2089.432 -1018.647
Random effects:
Formula: ~1 | Patient
(Intercept)
StdDev: 7.949465
Formula: ~1 | EyeID %in% Patient
(Intercept) Residual
StdDev: 12.10405 2.279917
Fixed effects: pRNFL ~ Init.Age + Status + I(Time^2)
Value Std.Error DF t-value p-value
(Intercept) 97.27827 6.156093 212 15.801950 0.0000
Init.Age 0.02114 0.131122 57 0.161261 0.8725
StatusA -27.32643 3.762155 212 -7.263504 0.0000
StatusF -23.31652 3.984353 212 -5.852023 0.0000
StatusN -0.28814 3.744980 57 -0.076940 0.9389
I(Time^2) -0.06498 0.030223 212 -2.149921 0.0327
Correlation:
(Intr) Int.Ag StatsA StatsF StatsN
Init.Age -0.921
StatusA -0.317 0.076
StatusF -0.314 0.088 0.834
StatusN -0.049 -0.216 0.390 0.365
I(Time^2) -0.006 -0.004 0.001 -0.038 -0.007
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.3565641 -0.4765840 0.0100608 0.4670792 2.7775392
Number of Observations: 334
Number of Groups:
Patient EyeID %in% Patient
60 119
I wanted to get comparisons between the levels of my 'Status' factor (named A, N, F and H). So I ran emmeans using this code:
emmeans(mod4.lme, pairwise ~ Status, adjust="bonferroni")
The output is:
> emmeans(mod4.lme, pairwise ~ Status, adjust="bonferroni")
$emmeans
Status emmean SE df lower.CL upper.CL
H 98.13515 2.402248 57 93.32473 102.94557
A 70.80872 2.930072 57 64.94135 76.67609
F 74.81863 3.215350 57 68.38000 81.25726
N 97.84701 2.829706 57 92.18062 103.51340
Degrees-of-freedom method: containment
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio p.value
H - A 27.3264289 3.762155 212 7.264 <.0001
H - F 23.3165220 3.984353 212 5.852 <.0001
H - N 0.2881375 3.744980 57 0.077 1.0000
A - F -4.0099069 2.242793 212 -1.788 0.4513
A - N -27.0382913 4.145370 57 -6.523 <.0001
F - N -23.0283844 4.359019 57 -5.283 <.0001
The answer is yes; emmeans does the calculation based on the fitted model.
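You can see the link to the model in the output above: the H - A estimate (27.326) is the StatusA coefficient (-27.326) with the sign flipped, and it has the same SE and df. A small sketch of the same check, using the Orthodont data from nlme as a stand-in since long1 is not available:

```r
library(nlme)
library(emmeans)

# Stand-in model: random intercept per subject, one covariate, one factor
dat <- as.data.frame(Orthodont)
fit <- lme(distance ~ age + Sex, random = ~ 1 | Subject, data = dat)

res <- emmeans(fit, pairwise ~ Sex)

# The Male - Female contrast is the negative of the SexFemale coefficient
est <- summary(res$contrasts)$estimate
c(contrast = est, coefficient = unname(fixef(fit)["SexFemale"]))
```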