Direction of Estimate coefficient in a Log Regression - r

I'm analysing ordinal logistic regression and I'm wondering, how to know which direction the estimate coefficient has? My Variables are just 0, 1 for Women, Men and 0,1,2,4 for different postures. So my question is, how do I know, if the estimate describes the change from 0 to 1 or the change from 1 to 0, talking about gender?
The output added a 1 to PicSex, is it a sign, that this one has a 1->0 direction? See the code for that.
Thank you for any help
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: Int ~ PicSex + Posture + (1 | PicID)
data: x
Random effects:
Groups Name Variance Std.Dev.
PicID (Intercept) 0.0541 0.2326
Number of groups: PicID 16
Coefficients:
Estimate Std. Error z value Pr(>|z|)
PicSex1 0.3743 0.1833 2.042 0.0411 *
Posture -1.1232 0.1866 -6.018 1.77e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
P.S.
Thank you here are my head results (I relabeled PicSex to Sex)
> head(Posture)
[1] 1 1 1 1 1 1
Levels: 0 1
> head(Sex)
[1] 0 0 0 0 0 0
Levels: 0 1
So the level order is the same, but on Sex it still added a 1 but on posture not. Still very confused about the directions.

Your sex has two levels,0 or 1. So PicSex1 means the effect of PicSex being 1 compared to PicSex being 0. I show an example below using the wine dataset:
library(ordinal)
DATA = wine
> head(DATA$temp)
[1] cold cold cold cold warm warm
Levels: cold warm
Here cold comes first in Levels, so it is set as the reference in any linear models.First we verify the effect of cold vs warm
do.call(cbind,tapply(DATA$rating,DATA$temp,table))
#warm has a higher average rating
Fit the model
# we fit the a model, temp is fixed effect
summary(clmm(rating ~ temp + contact+(1|judge), data = DATA))
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: rating ~ temp + contact + (1 | judge)
data: DATA
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 72 -81.57 177.13 332(999) 1.03e-05 2.8e+01
Random effects:
Groups Name Variance Std.Dev.
judge (Intercept) 1.279 1.131
Number of groups: judge 9
Coefficients:
Estimate Std. Error z value Pr(>|z|)
tempwarm 3.0630 0.5954 5.145 2.68e-07 ***
contactyes 1.8349 0.5125 3.580 0.000344 ***
Here we see warm being attached to "temp" and as we know, it has a positive coefficient because the rating is better in warm, compared to cold (the reference).
So if you set another group as the reference, you will see another name attached, and the coefficient is reversed (-3.. compared to +3.. in previous example)
# we set warm as reference now
DATA$temp = relevel(DATA$temp,ref="warm")
summary(clmm(rating ~ temp + contact+(1|judge), data = DATA))
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: rating ~ temp + contact + (1 | judge)
data: DATA
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 72 -81.57 177.13 269(810) 1.14e-04 1.8e+01
Random effects:
Groups Name Variance Std.Dev.
judge (Intercept) 1.28 1.131
Number of groups: judge 9
Coefficients:
Estimate Std. Error z value Pr(>|z|)
tempcold -3.0630 0.5954 -5.145 2.68e-07 ***
contactyes 1.8349 0.5125 3.580 0.000344 ***
So always check what is the reference before you fit the model

Related

simr: powerSim gives 100% for all effect sizes

I have carried out a binomial GLMM to determine how latitude and native status (native/non-native) of a set of plant species affects herbivory damage. I am now trying to determine the statistical power of my model when I change the effect sizes. My model looks like this:
latglmm <- glmer(cbind(chewing,total.cells-chewing) ~ scale(latitude) * native.status + scale(sample.day.of.year) + (1|genus) + (1|species) + (1|catalogue.number), family=binomial, data=mna)
where cbind(chewing,total.cells-chewing) gives me a proportion (of leaves with herbivory damage), native.status is either "native" or "non-native" and catalogue.number acts as an observation-level random effect to deal with overdispersion. There are 10 genus, each with at least 1 native and 1 non-native species to make 26 species in total. The model summary is:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: cbind(chewing, total.cells - chewing) ~ scale(latitude) * native.status +
scale(sample.day.of.year) + (1 | genus) + (1 | species) + (1 | catalogue.number)
Data: mna
AIC BIC logLik deviance df.resid
3986.7 4023.3 -1985.4 3970.7 706
Scaled residuals:
Min 1Q Median 3Q Max
-1.3240 -0.4511 -0.0250 0.1992 1.0765
Random effects:
Groups Name Variance Std.Dev.
catalogue.number (Intercept) 1.26417 1.1244
species (Intercept) 0.08207 0.2865
genus.ID (Intercept) 0.33431 0.5782
Number of obs: 714, groups: catalogue.number, 713; species, 26; genus.ID, 10
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.61310 0.20849 -12.534 < 2e-16 ***
scale(latitude) -0.17283 0.06370 -2.713 0.00666 **
native.statusnon-native 0.11434 0.15554 0.735 0.46226
scale(sample.day.of.year) 0.28521 0.05224 5.460 4.77e-08 ***
scale(latitude):native.statusnon-native -0.02986 0.09916 -0.301 0.76327
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) scallt ntv.s- scaldy
scalelat 0.012
ntv.sttsnn- -0.304 -0.014
scaledoy 0.018 -0.085 -0.027
scllt:ntv.- -0.011 -0.634 0.006 -0.035
I should add that the actual model I have been using is a glmmTMB model as lme4 still had some overdispersion even with the observation-level random effect, but this is not compatible with simr so I am using lme4 (the results are very similar for both). I want to see what happens to the model power when I increase or decrease the effect sizes of latitude and native status but when I run fixef(latglmm1)["scale(latitude)"]<--1 and fixef(latglmm1)["native.statusnon-native"]<--1 and try this:
powerSim(latglmm, fcompare(~ scale(latitude) + native.status))
I get the following output:
Power for model comparison, (95% confidence interval):====================================================================|
100.0% (69.15, 100.0)
Test: Likelihood ratio
Comparison to ~scale(latitude) + native.status + [re]
Based on 10 simulations, (0 warnings, 0 errors)
alpha = 0.05, nrow = 1428
Time elapsed: 0 h 1 m 5 s
The output is the same (100% power) no matter what I change fixef() to. Based on other similar questions online I have ensured that my data has no NA values and according to my powerSim there are no warnings or errors to address. I am completely lost as to why this isn't working so any help would be greatly appreciated!
Alternatively, if anyone has any recommendations for other methods to carry out similar analysis I would love to hear them. What I really want is to get a p-value for each effect size I input but statistical power would be very valuable too.
Thank you!

Fitting a ordinal logistic mixed effect model

How do I fit a ordinal (3 levels), logistic mixed effect model, in R? I guess it would be like a glmer except with three outcome levels.
data structure
patientid Viral_load Adherence audit_score visit
1520 0 optimal nonhazardous 1
1520 0 optimal nonhazardous 2
1520 0 optimal hazardous 3
1526 1 suboptimal hazardous 1
1526 0 optimal hazardous 2
1526 0 optimal hazardous 3
1568 2 suboptimal hazardous 1
1568 2 suboptimal nonhazardous 2
1568 2 suboptimal nonhazardous 3
Where viral load (outcome of interest) consists of three levels (0,1,2), adherence - optimal/suboptimal , audit score nonhazardous/hazardous, and 3 visits.
So an example of how the model should look using a generalized mixed effect model code.
library (lme4)
test <- glmer(viral_load ~ audit_score + adherence + (1|patientid) + (1|visit), data = df,family = binomial)
summary (test)
The results from this code is incorrect because it takes viral_load a binomial outcome.
I hope my question is clear.
You might try the ordinal packages clmm function:
fmm1 <- clmm(rating ~ temp + contact + (1|judge), data = wine)
summary(fmm1)
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: rating ~ temp + contact + (1 | judge)
data: wine
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 72 -81.57 177.13 332(999) 1.02e-05 2.8e+01
Random effects:
Groups Name Variance Std.Dev.
judge (Intercept) 1.279 1.131
Number of groups: judge 9
Coefficients:
Estimate Std. Error z value Pr(>|z|)
tempwarm 3.0630 0.5954 5.145 2.68e-07 ***
contactyes 1.8349 0.5125 3.580 0.000344 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Threshold coefficients:
Estimate Std. Error z value
1|2 -1.6237 0.6824 -2.379
2|3 1.5134 0.6038 2.507
3|4 4.2285 0.8090 5.227
4|5 6.0888 0.9725 6.261
I'm pretty sure that the link is logistic, since running the same model with the more flexible clmm2 function, where the default link is documented to be logistic, I get the same results.

Why don't I see all categories of explanatory variables in the output of my glmmTMB?

I have a dataset with both numeric and categorical variables, which I would like to include in a generalized mixed model. When I do so, the ouptut of the conditional model always "forgets" one category.
For example, in this model I include the proportion of vigilance on the total time of detected per video as response variable, and as explanatory variables: urine intensity (numeric), treatment (0 for no urine, 1 for urine), diel_period (dawn, dusk, night, day), sex (Male, Female, Undefined), height (of trees, numeric). And my 50 cameras as a random grouping effect (1 to 50).
bBI_mod8 <- glmmTMB(cbind(vigilance, total_time_behaviour - vigilance) ~
urine_intensity_heatmap + treatment + diel_period + sex + height + (1|camera),
ziformula = ~1, data = df_behaviour, family = "betabinomial")
The vigilance proportion follows a zero-inflated beta binomial regression.
summary(bBI_mod8)
When I show the output, I observe:
Family: betabinomial ( logit )
Formula: cbind(vigilance, total_time_behaviour - vigilance) ~ urine_intensity_heatmap +
treatment + diel_period + sex + height + (1 | camera)
Zero inflation: ~1
Data: df_behaviour
AIC BIC logLik deviance df.resid
2973.8 3037.1 -1474.9 2949.8 1439
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
camera (Intercept) 0.1583 0.3979
Number of obs: 1451, groups: camera, 50
Overdispersion parameter for betabinomial family (): 1.85
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.907429 0.471376 -1.925 0.054222 .
urine_intensity_heatmap -0.009844 0.004721 -2.085 0.037034 *
treatment1 -0.219403 0.154396 -1.421 0.155304
diel_periodDay -0.337329 0.235033 -1.435 0.151218
diel_periodDusk -0.543771 0.285322 -1.906 0.056675 .
diel_periodNight -0.553826 0.274879 -2.015 0.043925 *
sexMale -0.772731 0.168350 -4.590 4.43e-06 ***
sexUndefined -1.010425 0.271876 -3.716 0.000202 ***
height 0.001713 0.012352 0.139 0.889681
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Zero-inflation model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.6685 0.4298 -1.556 0.12
My problem is, as you can see, for my categorical variables, there is always one category that is omitted:
treatment1 but not treatment0
diel_periodDay, diel_periodDusk, diel_periodNight but not diel_periodDawn
sexMale, sexUndefined but not sexFemale
How can I solve this problem? Or how can I show a completer output?
In the output of generalized linear models, the estimates shown are what the effect is compared to the reference level. Unless specified, reference levels will be automatically selected based on alphabetical order.
In the above summary, using sex as an example, the estimate you see for example for sexMale, is the effect of being Male compared to being female. For treatment, that is what the effect is compared to treatment0. For diel, the same logic applies.
You can override this by manually setting the reference level to what you prefer. As is, your reference levels is Female, treatment0, diel_periodDawn based solely on alphabetical order.

Likelihood Ratio Test for LMM gives P-Value of 1?

I did an experiment in which people had to give answers to moral dilemmas that were either personal or impersonal. I now want to see if there is an interaction between the type of dilemma and the answer participants gave (yes or no) that influences their reaction time.
For this, I computed a Linear Mixed Model using the lmer()-function of the lme4-package.
My Data looks like this:
subject condition gender.b age logRT answer dilemma pers_force
1 105 a_MJ1 1 27 5.572154 1 1 1
2 107 b_MJ3 1 35 5.023881 1 1 1
3 111 a_MJ1 1 21 5.710427 1 1 1
4 113 c_COA 0 31 4.990433 1 1 1
5 115 b_MJ3 1 23 5.926926 1 1 1
6 119 b_MJ3 1 28 5.278115 1 1 1
My function looks like this:
lmm <- lmer(logRT ~ pers_force * answer + (1|subject) + (1|dilemma),
data = dfb.3, REML = FALSE, control = lmerControl(optimizer="Nelder_Mead"))
with subjects and dilemmas as random factors. This is the output:
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: logRT ~ pers_force * answer + (1 | subject) + (1 | dilemma)
Data: dfb.3
Control: lmerControl(optimizer = "Nelder_Mead")
AIC BIC logLik deviance df.resid
-13637.3 -13606.7 6825.6 -13651.3 578
Scaled residuals:
Min 1Q Median 3Q Max
-3.921e-07 -2.091e-07 2.614e-08 2.352e-07 6.273e-07
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 3.804e-02 1.950e-01
dilemma (Intercept) 0.000e+00 0.000e+00
Residual 1.155e-15 3.398e-08
Number of obs: 585, groups: subject, 148; contrasts, 4
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.469e+00 1.440e-02 379.9
pers_force1 -1.124e-14 5.117e-09 0.0
answer -1.095e-15 4.678e-09 0.0
pers_force1:answer -3.931e-15 6.540e-09 0.0
Correlation of Fixed Effects:
(Intr) prs_f1 answer
pers_force1 0.000
answer 0.000 0.447
prs_frc1:aw 0.000 -0.833 -0.595
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see ?isSingular
I then did a Likelihood Ratio Test using a reduced model to obtain p-Values:
lmm_null <- lmer(logRT ~ pers_force + answer + (1|subject) + (1|dilemma),
data = dfb.3, REML = FALSE,
control = lmerControl(optimizer="Nelder_Mead"))
anova(lmm,lmm_null)
For both models, I get the warning "boundary (singular) fit: see ?isSingular", but if I drop one random effect to make the structure simpler, then I get the warning that the models failed to converge (which is a bit strange), so I ignored it.
But then, the LRT output looks like this:
Data: dfb.3
Models:
lmm_null: logRT ~ pers_force + answer + (1 | subject) + (1 | dilemma)
lmm: logRT ~ pers_force * answer + (1 | subject) + (1 | dilemma)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
lmm_null 6 -13639 -13613 6825.6 -13651
lmm 7 -13637 -13607 6825.6 -13651 0 1 1
As you can see, the Chi-Square value is 0 and the p-Value is exactly 1, which seems very strange. I guess something must have gone wrong here, but I can't figure out what.
You say
logRT is the average logarithmized reaction time across all those 4 dilemmas.
If I'm interpreting this correctly — i.e., each subject has the same response for all of the times they are observed — then this is the proximal cause of your problem. (I know I've seen this exact problem before, but I don't know where — here? r-sig-mixed-models#r-project.org?)
simulate data
library(lme4)
set.seed(101)
dd1 <- expand.grid(subject=factor(100:150), contrasts=factor(1:4))
dd1$answer <- rbinom(nrow(dd1),size=1,prob=0.5)
dd1$logRT <- simulate(~answer + contrasts + (1|subject),
family=gaussian,
newparams=list(beta=c(0,1,1,-1,2),theta=1,sigma=1),
newdata=dd1)[[1]]
regular fit
This is fine and gives answers close to the true parameters:
m1 <- lmer(logRT~answer + contrasts + (1|subject), data=dd1)Linear mixed model fit by REML ['lmerMod']
## Random effects:
## Groups Name Std.Dev.
## subject (Intercept) 1.0502
## Residual 0.9839
## Number of obs: 204, groups: subject, 51
## Fixed Effects:
## (Intercept) answer contrasts2 contrasts3 contrasts4
## -0.04452 0.85333 1.16785 -1.07847 1.99243
now average the responses by subject
We get a raft of warning messages, and the same pathologies you are seeing (residual variance and all parameter estimates other than the intercept are effectively zero). This is because lmer is trying to estimate residual variance from the within-subject variation, and we have gotten rid of it!
I don't know why you are doing the averaging. If this is unavoidable, and your design is the randomized-block type shown here (each subject sees all four dilemmas/contrasts), then you can't estimate the dilemma effects.
dd2 <- transform(dd1, logRT=ave(logRT,subject))
m2 <- update(m1, data=dd2)
## Random effects:
## Groups Name Std.Dev.
## subject (Intercept) 6.077e-01
## Residual 1.624e-05
## Number of obs: 204, groups: subject, 51
## Fixed Effects:
## (Intercept) answer contrasts2 contrasts3 contrasts4
## 9.235e-01 1.031e-10 -1.213e-11 -1.672e-15 -1.011e-11
Treating the dilemmas as a random effect won't do what you want (allow for individual-to-individual variability in how they were presented). That among-subject variability in the dilemmas is going to get lumped into the residual variability, where it belongs — I would recommend treating it as a fixed effect.

Significant interaction in linear mixed effect model but plot shows overlapping confidence intervals?

I try to show you as much as possible of the structure of the data and the results I produced.
The structure of the data is the following:
GroupID Person Factor2 Factor1 Rating
<int> <int> <fctr> <fctr> <int>
1 2 109 2 0 1
2 2 109 2 1 -2
3 2 104 1 0 4
4 2 236 1 1 1
5 2 279 1 1 2
6 2 179 2 1 0
Person is the participant ID, GroupID is the kind of stimulus rated, Factor 1 (levels 0 and 1) and Factor 2 (levels 1 and 2) are fixed factors and the Ratings are the outcome variables.
I am trying to print a plot for a significant interaction in a linear mixed effect model. I used the packages lme4 and lmerTest to analyze the data.
This is the model we ran:
> model_interaction <- lmer(Rating ~ Factor1 * Factor2 + ( 1 | Person) +
(1 | GroupID), data)
> model_interaction
Linear mixed model fit by REML ['merModLmerTest']
Formula: Rating ~ Factor1 * Factor2 + (1 | Person) + (1 | GroupID)
Data: data
REML criterion at convergence: 207223.9
Random effects:
Groups Name Std.Dev.
Person (Intercept) 1.036
GroupID (Intercept) 1.786
Residual 1.880
Number of obs: 50240, groups: Person, 157; GroupID, 80
Fixed Effects:
(Intercept) Factor11 Factor22 Factor11:Factor22
-0.43823 0.01313 0.08568 0.12440
When I use the summary() function R returns the following output
> summary(model_interaction)
Linear mixed model fit by REML
t-tests use Satterthwaite approximations to degrees of freedom
['lmerMod']
Formula: Rating ~ Factor1 * Factor2 + (1 | Person) + (1 | GroupID)
Data: data
REML criterion at convergence: 207223.9
Scaled residuals:
Min 1Q Median 3Q Max
-4.8476 -0.6546 -0.0213 0.6516 4.2284
Random effects:
Groups Name Variance Std.Dev.
Person (Intercept) 1.074 1.036
GroupID (Intercept) 3.191 1.786
Residual 3.533 1.880
Number of obs: 50240, groups: Person, 157; GroupID, 80
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) -4.382e-01 2.185e-01 1.110e+02 -2.006 0.047336 *
Factor11 1.313e-02 2.332e-02 5.004e+04 0.563 0.573419
Factor22 8.568e-02 6.275e-02 9.793e+03 1.365 0.172138
Factor11:Factor22 1.244e-01 3.385e-02 5.002e+04 3.675 0.000238 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) Fctr11 Fctr22
Factor11 -0.047
Factor22 -0.135 0.141
Fctr11:Fc22 0.034 -0.694 -0.249
I know it is not possible to interpret p-Values for linear mixed effects model. So I ran an additional anova, comparing the interaction model to a model with just the main effects of Factor1 and Factor2
> model_Factor1_Factor2 = lmer(Rating ~ Factor1 + Factor2 +
( 1 | Person) + (1 | GroupID), data)
> anova(model_Factor1_Factor2, model_interaction)
Data: data
Models:
object: Rating ~ Factor1 + Factor2 + (1 | Person) + (1 | GroupID)
..1: Rating ~ Factor1 * Factor2 + (1 | Person) + (1 | GroupID)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
object 6 207233 207286 -103611 207221
..1 7 207222 207283 -103604 207208 13.502 1 0.0002384 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I interpreted this Output as: the Interaction of Factor1 and Factor2 explains additional variance in my outcome measurement compared to the model with only the main effects of Factor1 and Factor2.
Since interpreting output for linear mixed effects models is hard I would like to print a graph showing the interaction of Factor1 and Factor2. I did so using lsmeans package (first I used the plot(allEffects) but after reading this How to get coefficients and their confidence intervals in mixed effects models? question I realized that it was not the right way to print graphs for linear mixed effect models).
So this is what I did (following this Website http://rcompanion.org/handbook/G_06.html)
> leastsquare = lsmeans(model_interaction, pairwise ~ Factor2:Factor1,
adjust="bon")
> CLD = cld(leastsquare, alpha=0.05, Letters=letters, adjust="bon")
> CLD$.group=gsub(" ", "", CLD$.group)
> CLD
Factor2 Factor1 lsmean SE df lower.CL upper.CL .group
1 0 -0.4382331 0.2185106 111.05 -0.9930408 0.1165746 a
1 1 -0.4251015 0.2186664 111.36 -0.9803048 0.1301018 a
2 0 -0.3525561 0.2190264 112.09 -0.9086735 0.2035612 a
2 1 -0.2150234 0.2189592 111.95 -0.7709700 0.3409233 b
Degrees-of-freedom method: satterthwaite
Confidence level used: 0.95
Conf-level adjustment: bonferroni method for 4 estimates
P value adjustment: bonferroni method for 6 tests
significance level used: alpha = 0.05
This is the plotting funtion I used
> ggplot(CLD, aes(`Factor1`, y = lsmean, ymax = upper.CL,
ymin = lower.CL, colour = `Factor2`, group = `Factor2`)) +
geom_pointrange(stat = "identity",
position = position_dodge(width = 0.1)) +
geom_line(position = position_dodge(width = 0.1))
The plot can be found using this link (I am not allowed to post images yet, please excuse the workaround)
Interaction of Factor1 and Factor2
Now my question is the following: Why do I have a significant interaction and a significant amount of explained variance by this interaction but my confidence intervals in the plot overlap? I guess I did something wrong with the confidence intervals? Or is it because it is just not possible to interpret the significance indices for linear mixed effects models?
Because it’s apples and oranges.
Apples: confidence intervals for means.
Oranges: tests of differences of means.
Means, and differences of means, are different statistics, and they have different standard errors and other distributional properties. In mixed models especially, they can be radically different because some sources of variation may cancel out when you take differences.
Don’t try to use confidence intervals to do comparisons. It’s like trying to make chicken soup out of hamburger.

Resources