Is the emmeans (R) intercept-only function broken?

I've noticed that emmeans (in R) isn't working for an intercept-only estimate after the latest update.
Reproducible example:
test=lm(mpg~1,mtcars)
library(emmeans)
emmeans::emmeans(test,~1)
The output on two of my machines (Windows and Linux) is:
> emmeans::emmeans(test,~1)
Error in `[[<-.data.frame`(`*tmp*`, ".wgt.", value = 2) :
replacement has 1 row, data has 0
Is this a known issue, or have I messed up my system somehow?
I believe this used to work.
It does work if you include a variable:
test2=lm(mpg~as.factor(cyl),mtcars)
emmeans(test2,~cyl)
Thanks very much for the help in advance.

It turns out that the fix for issue #197, which was incorporated in CRAN version 1.4.7, created the issue (#206) that we see here. I think I have them both fixed now:
require(emmeans)
## Loading required package: emmeans
#206...
warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
emmeans(warp.lm, "1")
## 1 emmean SE df lower.CL upper.CL
## overall 28.1 1.49 48 25.2 31.1
##
## Results are averaged over the levels of: wool, tension
## Confidence level used: 0.95
emmeans(warp.lm, "1", by = "wool")
## wool = A:
## 1 emmean SE df lower.CL upper.CL
## overall 31.0 2.11 48 26.8 35.3
##
## wool = B:
## 1 emmean SE df lower.CL upper.CL
## overall 25.3 2.11 48 21.0 29.5
##
## Results are averaged over the levels of: tension
## Confidence level used: 0.95
#197...
model <- lm(Sepal.Length ~ poly(Petal.Length,2), data = iris)
emtrends(model, ~ 1, "Petal.Length", max.degree = 2)
## degree = linear:
## 1 Petal.Length.trend SE df lower.CL upper.CL
## overall 0.4474 0.0180 147 0.4119 0.483
##
## degree = quadratic:
## 1 Petal.Length.trend SE df lower.CL upper.CL
## overall 0.0815 0.0132 147 0.0554 0.108
##
## Confidence level used: 0.95
Created on 2020-06-01 by the reprex package (v0.3.0)
Users who need this now can install from GitHub via
remotes::install_github("rvlenth/emmeans")

It works fine with emmeans 1.4.6 on macOS Catalina 10.15.4 and R 4.0:
emmeans::emmeans(test,~1)
# 1 emmean SE df lower.CL upper.CL
# overall 20.1 1.07 31 17.9 22.3
#Confidence level used: 0.95
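For an intercept-only model, the expected result is easy to check without emmeans, because the single estimated marginal mean is just the sample mean of the response. The following is a minimal base-R sketch of that check (the names n, se, est, and ci are illustrative only); its rounded values can be compared with the emmeans output above.

```r
# Intercept-only model from the question
test <- lm(mpg ~ 1, data = mtcars)

n  <- nrow(mtcars)
se <- sd(mtcars$mpg) / sqrt(n)  # standard error of the sample mean

# The intercept equals the sample mean of the response
est <- unname(coef(test)[1])
all.equal(est, mean(mtcars$mpg))

# 95% confidence interval on n - 1 = 31 degrees of freedom
ci <- est + c(-1, 1) * qt(0.975, df = n - 1) * se
round(c(emmean = est, SE = se, lower.CL = ci[1], upper.CL = ci[2]), 2)
```

If emmeans(test, ~1) returns something other than these values, or errors as in the question, the installed emmeans version is the likely culprit rather than the model.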

Related

Effect size (Cohen's d) for pairwise comparisons

I'm trying to calculate the effect size among different factor levels. To compare the two means within each factor level, the code below works fine:
cohens_d_list <- by(mydata, mydata$factor, function(sub)
cohens_d(sub$score1, sub$score2)
)
cohens_d_list
However, I couldn't figure out how to compare the factor levels with each other for a single score (e.g. for score1: factor level 1 vs. factor level 2, factor level 1 vs. factor level 3, factor level 1 vs. factor level 4, and so on). I tried the psych, effectsize, and effsize packages, but they don't seem to handle more than two levels in a single factor variable. Any suggestions for code or a package?
After trying dozens of packages, the esvis package did the trick.
library(dplyr) # provides %>% and ungroup()
library(esvis) # provides coh_d()
df %>%
ungroup(Group) %>% # Include this line if you get a grouping error
coh_d(score1 ~ Group)
You get a nice table with all possible comparisons.
You can fit a model and use the eff_size() function from emmeans (which will have the benefit of using the pooled SD from all groups, not just the 2 being compared):
m <- lm(mpg ~ factor(cyl), data = mtcars)
library(emmeans)
(em <- emmeans(m, ~ cyl))
#> cyl emmean SE df lower.CL upper.CL
#> 4 26.7 0.972 29 24.7 28.7
#> 6 19.7 1.218 29 17.3 22.2
#> 8 15.1 0.861 29 13.3 16.9
#>
#> Confidence level used: 0.95
eff_size(em, sigma = sigma(m), edf = df.residual(m))
#> contrast effect.size SE df lower.CL upper.CL
#> 4 - 6 2.15 0.56 29 1.003 3.29
#> 4 - 8 3.59 0.62 29 2.320 4.86
#> 6 - 8 1.44 0.50 29 0.418 2.46
#>
#> sigma used for effect sizes: 3.223
#> Confidence level used: 0.95
Created on 2021-06-07 by the reprex package (v2.0.0)
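To see what the effect.size column represents, the point estimates can be reproduced by hand: each is a difference of group means divided by the model's residual SD. This is a base-R sketch under that reading (grp_means, pairs_idx, and d are illustrative names); it does not reproduce the SEs or confidence intervals, which eff_size() derives while accounting for the uncertainty in sigma.

```r
# Same model as above; sigma(m) is the residual SD pooled over all groups
m <- lm(mpg ~ factor(cyl), data = mtcars)

# Observed group means (equal to the emmeans for a one-factor model)
grp_means <- tapply(mtcars$mpg, mtcars$cyl, mean)

# All pairs of levels, as a 2 x 3 character matrix
pairs_idx <- combn(names(grp_means), 2)

# Standardized difference for each pair
d <- apply(pairs_idx, 2, function(p) {
  (grp_means[[p[1]]] - grp_means[[p[2]]]) / sigma(m)
})
names(d) <- apply(pairs_idx, 2, paste, collapse = " - ")
round(d, 2)
```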

lme4 deviation/treatment contrast coding with interactions in R - levels are missing

I have a mixed effects model (with lme4) with a 2-way interaction term, each factor having 4 levels, and I would like to investigate their effects in reference to their grand mean. I present this example here from the Cars93 data set and omit the error term since it is not necessary for this example:
## shorten data frame for simplicity
df=Cars93[c(1:15),]
df=Cars93[is.element(Cars93$Make,c('Acura Integra', 'Audi 90','BMW 535i','Subaru Legacy')),]
df$Make=drop.levels(df$Make)
df$Model=drop.levels(df$Model)
## define contrasts (every factor has 4 levels)
contrasts(df$Make) = contr.treatment(4)
contrasts(df$Model) = contr.treatment(4)
## model
m1 <- lm(Price ~ Model*Make,data=df)
summary(m1)
As you can see, the first levels are omitted in the interaction term. I would like to have all 4 levels in the output, referenced to the grand mean (often referred to as deviation coding). These are the sources I looked at: https://marissabarlaz.github.io/portfolio/contrastcoding/#coding-schemes and How to change contrasts to compare with mean of all levels rather than reference level (R, lmer)?. The last reference does not cover interactions, though.
The simple answer is that what you want is not possible directly. You have to use a slightly different approach.
In a model with interactions, you want to use contrasts in which the mean is zero and not a specific level. Otherwise, the lower-order effects (i.e., main effects) are not main effects but simple effects (evaluated when the other factor level is at its reference level). This is explained in more detail in my chapter on mixed models:
http://singmann.org/download/publications/singmann_kellen-introduction-mixed-models.pdf
To get what you want, you have to fit the model in a reasonable manner and then pass it to emmeans to compare against the intercept (i.e., the unweighted grand mean). This works also for interactions as shown below (as your code did not work, I use warpbreaks).
afex::set_sum_contrasts() ## uses contr.sum globally
library("emmeans")
## model
m1 <- lm(breaks ~ wool * tension,data=warpbreaks)
car::Anova(m1, type = 3)
coef(m1)[1]
# (Intercept)
# 28.14815
## both CIs include grand mean:
emmeans(m1, "wool")
# wool emmean SE df lower.CL upper.CL
# A 31.0 2.11 48 26.8 35.3
# B 25.3 2.11 48 21.0 29.5
#
# Results are averaged over the levels of: tension
# Confidence level used: 0.95
## same using test
emmeans(m1, "wool", null = coef(m1)[1], infer = TRUE)
# wool emmean SE df lower.CL upper.CL null t.ratio p.value
# A 31.0 2.11 48 26.8 35.3 28.1 1.372 0.1764
# B 25.3 2.11 48 21.0 29.5 28.1 -1.372 0.1764
#
# Results are averaged over the levels of: tension
# Confidence level used: 0.95
emmeans(m1, "tension", null = coef(m1)[1], infer = TRUE)
# tension emmean SE df lower.CL upper.CL null t.ratio p.value
# L 36.4 2.58 48 31.2 41.6 28.1 3.196 0.0025
# M 26.4 2.58 48 21.2 31.6 28.1 -0.682 0.4984
# H 21.7 2.58 48 16.5 26.9 28.1 -2.514 0.0154
#
# Results are averaged over the levels of: wool
# Confidence level used: 0.95
emmeans(m1, c("tension", "wool"), null = coef(m1)[1], infer = TRUE)
# tension wool emmean SE df lower.CL upper.CL null t.ratio p.value
# L A 44.6 3.65 48 37.2 51.9 28.1 4.499 <.0001
# M A 24.0 3.65 48 16.7 31.3 28.1 -1.137 0.2610
# H A 24.6 3.65 48 17.2 31.9 28.1 -0.985 0.3295
# L B 28.2 3.65 48 20.9 35.6 28.1 0.020 0.9839
# M B 28.8 3.65 48 21.4 36.1 28.1 0.173 0.8636
# H B 18.8 3.65 48 11.4 26.1 28.1 -2.570 0.0133
#
# Confidence level used: 0.95
Note that for lme4 models you probably want to use fixef() in place of coef().
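The role of the contrast coding can be checked directly in base R. The sketch below (object names are illustrative) sets sum-to-zero contrasts per model through the contrasts argument of lm() instead of globally, and compares the intercept with the unweighted mean of the six cell means; with treatment contrasts the intercept is the mean of the reference cell.

```r
# Sum-to-zero contrasts, set for this model only
m_sum <- lm(breaks ~ wool * tension, data = warpbreaks,
            contrasts = list(wool = "contr.sum", tension = "contr.sum"))

# Observed cell means (2 x 3 matrix: wool by tension)
cell_means <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))

# Intercept is the unweighted grand mean of the cell means
intercept_sum <- unname(coef(m_sum)[1])
all.equal(intercept_sum, mean(cell_means))

# Treatment contrasts: intercept is the reference cell (wool A, tension L)
m_trt <- lm(breaks ~ wool * tension, data = warpbreaks,
            contrasts = list(wool = "contr.treatment",
                             tension = "contr.treatment"))
intercept_trt <- unname(coef(m_trt)[1])
all.equal(intercept_trt, cell_means["A", "L"])
```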

Estimating effect size with emmeans for post hoc

Is there a way to get an effect size (such as Cohen's d, or whichever measure is most appropriate) directly from emmeans()?
I cannot find anything about obtaining effect sizes with emmeans().
post <- emmeans(fit, pairwise~ favorite.pirate | sex)
emmip(fit, ~ favorite.pirate | sex)
There is not a built-in provision for effect-size calculations, but you can cobble one together by defining a custom contrast function that divides each pairwise comparison by a value of sigma:
mypw.emmc = function(..., sigma = 1) {
result = emmeans:::pairwise.emmc (...)
for (i in seq_along(result[1, ]))
result[[i]] = result[[i]] / sigma
result
}
Here's a test run:
> mypw.emmc(1:3, sigma = 4)
1 - 2 1 - 3 2 - 3
1 0.25 0.25 0.00
2 -0.25 0.00 0.25
3 0.00 -0.25 -0.25
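The same coefficients can be built without the unexported emmeans helper, which makes the scaling explicit. This is a base-R sketch only: scaled_pairwise is a hypothetical name, and it returns a plain matrix instead of the data frame that an .emmc function is expected to return.

```r
# Pairwise contrast coefficients (+1 / -1 per pair), divided by sigma
scaled_pairwise <- function(levs, sigma = 1) {
  idx <- combn(length(levs), 2)  # each column is one pair of level indices
  coefs <- apply(idx, 2, function(p) {
    v <- numeric(length(levs))
    v[p[1]] <- 1
    v[p[2]] <- -1
    v
  })
  dimnames(coefs) <- list(
    as.character(levs),
    apply(idx, 2, function(p) paste(levs[p[1]], "-", levs[p[2]]))
  )
  coefs / sigma
}

res <- scaled_pairwise(1:3, sigma = 4)
res
```

Each column still sums to zero, so it remains a contrast; only its scale changes, which is why the resulting estimates are in units of sigma.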
With your model, the error SD is 9.246 (look at summary(fit)); so, ...
> emmeans(fit, mypw ~ sex, sigma = 9.246, name = "effect.size")
NOTE: Results may be misleading due to involvement in interactions
$emmeans
sex emmean SE df lower.CL upper.CL
female 63.8 0.434 3.03 62.4 65.2
male 74.5 0.809 15.82 72.8 76.2
other 68.8 1.439 187.08 65.9 71.6
Results are averaged over the levels of: favorite.pirate
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
$contrasts
effect.size estimate SE df t.ratio p.value
female - male -1.158 0.0996 399 -11.624 <.0001
female - other -0.537 0.1627 888 -3.299 0.0029
male - other 0.621 0.1717 981 3.617 0.0009
Results are averaged over the levels of: favorite.pirate
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 3 estimates
Some words of caution though:
1. The SEs of the effect sizes are misleading because they don't account for the variation in sigma.
2. This is not a very good example because
a. The factors interact (Edward Low is different in his profile). Also, see the warning message.
b. The model is singular (as warned when the model was fitted), yielding an estimated variance of zero for college.
library(yarrr)
View(pirates)
library(lme4)
library(lmerTest)
fit <- lmer(weight~ favorite.pirate * sex +(1|college), data = pirates)
anova(fit, ddf = "Kenward-Roger")
post <- emmeans(fit, pairwise~ sex)
post

How to get absolute difference estimate and confidence intervals from log(x+1) variable with emmeans

I have a mixed effect model with a log(x+1) transformed response variable. The output from emmeans with the type as "response" provides the mean and confidence intervals for both groups that I am comparing. However what I want is the mean and CI of the difference between the groups (i.e. the estimate). emmeans only provides the ratio (with type="response") or the log ratio (with type="link") and I am unsure how to change this into absolute values. If you run the model without the log(x+1) transformation then emmeans provides the estimated difference and CI around this difference, not the ratios. How can I also do this when my response variable is log(x+1) transformed?
bmnameF.lme2 = lme(log(bm+1)~TorC*name, random=~TorC|site,
data=matched.cases3F, method='REML')
emmeans(bmnameF.lme2, pairwise~TorC,
type='response')%>%confint()%>%as.data.frame
emmeans.TorC emmeans.emmean emmeans.SE emmeans.df emmeans.lower.CL emmeans.upper.CL contrasts.contrast contrasts.estimate contrasts.SE contrasts.df contrasts.lower.CL contrasts.upper.CL
Managed 376.5484 98.66305 25 219.5120 645.9267 Managed - Open 3.390123 1.068689 217 1.821298 6.310297
Open 111.0722 43.15374 25 49.8994 247.2381 Managed - Open 3.390123 1.068689 217 1.821298 6.310297
Let me show a different example so the results are reproducible to all viewers:
mod = lm(log(breaks+1) ~ wool*tension, data = warpbreaks)
As you see, with a log transformation, comparisons/contrasts are expressed as ratios by default. But this can be changed by specifying transform instead of type in the emmeans() call:
> emmeans(mod, pairwise ~ tension|wool, transform = "response")
$emmeans
wool = A:
tension response SE df lower.CL upper.CL
L 42.3 5.06 48 32.1 52.4
M 23.6 2.83 48 17.9 29.3
H 23.7 2.83 48 18.0 29.4
wool = B:
tension response SE df lower.CL upper.CL
L 27.7 3.32 48 21.0 34.4
M 28.4 3.40 48 21.6 35.3
H 19.3 2.31 48 14.6 23.9
Confidence level used: 0.95
$contrasts
wool = A:
contrast estimate SE df t.ratio p.value
L - M 18.6253 5.80 48 3.213 0.0065
L - H 18.5775 5.80 48 3.204 0.0067
M - H -0.0479 4.01 48 -0.012 0.9999
wool = B:
contrast estimate SE df t.ratio p.value
L - M -0.7180 4.75 48 -0.151 0.9875
L - H 8.4247 4.04 48 2.086 0.1035
M - H 9.1426 4.11 48 2.224 0.0772
P value adjustment: tukey method for comparing a family of 3 estimates
Or, you can do this later via the regrid() function:
emm1 = emmeans(mod, ~ tension | wool)
emm2 = regrid(emm1)
emm2 # estimates
pairs(emm2) # comparisons
regrid() creates a new emmGrid object where everything is already back-transformed, thus side-stepping the behavior that happens with contrasts of log-transformed results. (In the previous illustration, the transform argument just calls regrid after it constructs the reference grid.)
But there is another subtle thing going on: The transformation is auto-detected as log; the +1 part is ignored. Thus, the back-transformed estimates are all too large by 1. To get this right, you need to use the make.tran() function to create this generalization of the log transformation:
> emm3 = update(emmeans(mod, ~ tension | wool), tran = make.tran("genlog", 1))
> str(emm3)
'emmGrid' object with variables:
tension = L, M, H
wool = A, B
Transformation: “log(mu + 1)”
> regrid(emm3)
wool = A:
tension response SE df lower.CL upper.CL
L 41.3 5.06 48 31.1 51.4
M 22.6 2.83 48 16.9 28.3
H 22.7 2.83 48 17.0 28.4
wool = B:
tension response SE df lower.CL upper.CL
L 26.7 3.32 48 20.0 33.4
M 27.4 3.40 48 20.6 34.3
H 18.3 2.31 48 13.6 22.9
Confidence level used: 0.95
The comparisons will come out the same as shown earlier, because offsetting all the means by 1 doesn't affect the pairwise differences.
See vignette("transformations", "emmeans") or https://cran.r-project.org/web/packages/emmeans/vignettes/transformations.html for more details.
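The off-by-one effect described above can be seen without emmeans by back-transforming the fitted cell values by hand. This is a base-R sketch (grid, eta, plain_log, and genlog are illustrative names): exp() corresponds to treating the transformation as a plain log, while expm1() inverts log(x + 1) correctly.

```r
mod <- lm(log(breaks + 1) ~ wool * tension, data = warpbreaks)

# One row per wool x tension cell
grid <- expand.grid(wool = levels(warpbreaks$wool),
                    tension = levels(warpbreaks$tension))

# Fitted values on the log(breaks + 1) scale
eta <- predict(mod, newdata = grid)

grid$plain_log <- exp(eta)    # inverse of log(x): too large by 1
grid$genlog    <- expm1(eta)  # inverse of log(x + 1): exp(eta) - 1
grid

# Differences between cells are the same under either back-transformation
diff_plain  <- grid$plain_log[1] - grid$plain_log[3]
diff_genlog <- grid$genlog[1] - grid$genlog[3]
all.equal(diff_plain, diff_genlog)
```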

How to set confidence intervals in "emmeans"

I have been working with the emmeans package to create estimated marginal means for my data at the 95% confidence level, but I cannot seem to change it to the 99% confidence level. Any help would be much appreciated. Usually I would use the "levels=" argument, but it does not seem to exist for emmeans.
library(emmeans)
emmeans(AcidLevels1, specs=~MAScore)
Best,
-Nathan
emmeans provides the method confint.emmGrid to recalculate confidence intervals and (probably more importantly) to adjust them for multiple hypothesis testing.
As you don't provide sample data, here is an example using the warpbreaks data.
library(emmeans)
fit <- lm(breaks ~ wool * tension, data = warpbreaks)
emm <- emmeans(fit, ~ wool | tension)
To recalculate confidence intervals at the 99% level (without correcting for multiple testing) do
confint(emm, adjust = "none", level = 0.99)
#tension = L:
# wool emmean SE df lower.CL upper.CL
# A 44.55556 3.646761 48 34.77420 54.33691
# B 28.22222 3.646761 48 18.44086 38.00358
#
#tension = M:
# wool emmean SE df lower.CL upper.CL
# A 24.00000 3.646761 48 14.21864 33.78136
# B 28.77778 3.646761 48 18.99642 38.55914
#
#tension = H:
# wool emmean SE df lower.CL upper.CL
# A 24.55556 3.646761 48 14.77420 34.33691
# B 18.77778 3.646761 48 8.99642 28.55914
#
#Confidence level used: 0.99
To recalculate CIs at the 99% level and correct for multiple hypothesis testing using the Bonferroni correction you can do
confint(emm, adjust = "bonferroni", level = 0.99)
#tension = L:
# wool emmean SE df lower.CL upper.CL
# A 44.55556 3.646761 48 33.82454 55.28657
# B 28.22222 3.646761 48 17.49120 38.95324
#
#tension = M:
# wool emmean SE df lower.CL upper.CL
# A 24.00000 3.646761 48 13.26898 34.73102
# B 28.77778 3.646761 48 18.04676 39.50880
#
#tension = H:
# wool emmean SE df lower.CL upper.CL
# A 24.55556 3.646761 48 13.82454 35.28657
# B 18.77778 3.646761 48 8.04676 29.50880
#
#Confidence level used: 0.99
#Conf-level adjustment: bonferroni method for 2 estimates
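The intervals above follow the usual estimate ± t-quantile × SE form, so one cell can be checked by hand. The base-R sketch below (all object names are illustrative) does this for wool A at tension L, assuming the Bonferroni adjustment divides alpha by the 2 estimates in each tension group, as the last line of the output indicates.

```r
fit <- lm(breaks ~ wool * tension, data = warpbreaks)

# Cell for wool A, tension L
sel    <- warpbreaks$wool == "A" & warpbreaks$tension == "L"
est    <- mean(warpbreaks$breaks[sel])
se     <- sigma(fit) / sqrt(sum(sel))  # residual SD over sqrt(cell size)
dfree  <- df.residual(fit)             # 48

level <- 0.99
alpha <- 1 - level

# Unadjusted 99% interval
ci_none <- est + c(-1, 1) * qt(1 - alpha / 2, dfree) * se

# Bonferroni-adjusted interval for k = 2 estimates per group
k <- 2
ci_bonf <- est + c(-1, 1) * qt(1 - alpha / (2 * k), dfree) * se

round(rbind(none = ci_none, bonferroni = ci_bonf), 5)
```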
