Can I use emmeans with an LME model?

I am using an LME model defined like this:
mod4.lme <- lme(pRNFL ~ Init.Age + Status + I(Time^2), random = ~1 | Patient/EyeID, data = long1, na.action = na.omit)
The output is:
> summary(mod4.lme)
Linear mixed-effects model fit by REML
Data: long1
AIC BIC logLik
2055.295 2089.432 -1018.647
Random effects:
Formula: ~1 | Patient
(Intercept)
StdDev: 7.949465
Formula: ~1 | EyeID %in% Patient
(Intercept) Residual
StdDev: 12.10405 2.279917
Fixed effects: pRNFL ~ Init.Age + Status + I(Time^2)
Value Std.Error DF t-value p-value
(Intercept) 97.27827 6.156093 212 15.801950 0.0000
Init.Age 0.02114 0.131122 57 0.161261 0.8725
StatusA -27.32643 3.762155 212 -7.263504 0.0000
StatusF -23.31652 3.984353 212 -5.852023 0.0000
StatusN -0.28814 3.744980 57 -0.076940 0.9389
I(Time^2) -0.06498 0.030223 212 -2.149921 0.0327
Correlation:
(Intr) Int.Ag StatsA StatsF StatsN
Init.Age -0.921
StatusA -0.317 0.076
StatusF -0.314 0.088 0.834
StatusN -0.049 -0.216 0.390 0.365
I(Time^2) -0.006 -0.004 0.001 -0.038 -0.007
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.3565641 -0.4765840 0.0100608 0.4670792 2.7775392
Number of Observations: 334
Number of Groups:
Patient EyeID %in% Patient
60 119
I wanted to get comparisons between the levels of my 'Status' factor (named A, N, F and H), so I ran emmeans with this code:
emmeans(mod4.lme, pairwise ~ Status, adjust="bonferroni")
The output is:
> emmeans(mod4.lme, pairwise ~ Status, adjust="bonferroni")
$emmeans
Status emmean SE df lower.CL upper.CL
H 98.13515 2.402248 57 93.32473 102.94557
A 70.80872 2.930072 57 64.94135 76.67609
F 74.81863 3.215350 57 68.38000 81.25726
N 97.84701 2.829706 57 92.18062 103.51340
Degrees-of-freedom method: containment
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio p.value
H - A 27.3264289 3.762155 212 7.264 <.0001
H - F 23.3165220 3.984353 212 5.852 <.0001
H - N 0.2881375 3.744980 57 0.077 1.0000
A - F -4.0099069 2.242793 212 -1.788 0.4513
A - N -27.0382913 4.145370 57 -6.523 <.0001
F - N -23.0283844 4.359019 57 -5.283 <.0001

The answer is yes: emmeans computes the estimated marginal means and the pairwise contrasts directly from the fitted lme model, as the output above shows.
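For readers who want the pieces separately, a minimal two-step sketch (assuming the emmeans package is attached and mod4.lme is the fit above) is:

    library(emmeans)
    emm <- emmeans(mod4.lme, ~ Status)   # estimated marginal means for each Status level
    pairs(emm, adjust = "bonferroni")    # Bonferroni-adjusted pairwise contrasts
    ref_grid(mod4.lme)                   # inspect the reference grid emmeans averages over

This gives the same numbers as the combined pairwise ~ Status call, but keeps the emmGrid object around for later use (plots, custom contrasts, and so on).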

Related

Converting PROC MIXED to R

I have been trying to convert some PROC MIXED SAS code into R, but without success. The code is:
proc mixed data=rmanova4;
class randomization_arm cancer_type site wk;
model chgpf=randomization_arm cancer_type site wk;
repeated / subject=study_id;
contrast '12 vs 4' randomization_arm 1 -1;
lsmeans randomization_arm / cl pdiff alpha=0.05;
run;quit;
I have tried something like
mod4 <- lme(chgpf ~ Randomization_Arm + Cancer_Type + site + wk, data = rmanova.data, random = ~1 | Study_ID, na.action = na.exclude)
but I am getting different estimates.
Perhaps I am misunderstanding something basic. Any comment/suggestion would be greatly appreciated.
(Additional edit)
I am adding the output here. Part of the output from the SAS code is below:
Least Squares Means
Effect Randomization_Arm Estimate Standard Error DF t Value Pr > |t| Alpha Lower Upper
Randomization_Arm 12 weekly BTA -4.5441 1.3163 222 -3.45 0.0007 0.05 -7.1382 -1.9501
Randomization_Arm 4 weekly BTA -6.4224 1.3143 222 -4.89 <.0001 0.05 -9.0126 -3.8322
Differences of Least Squares Means
Effect Randomization_Arm _Randomization_Arm Estimate Standard Error DF t Value Pr > |t| Alpha Lower Upper
Randomization_Arm 12 weekly BTA 4 weekly BTA 1.8783 1.4774 222 1.27 0.2049 0.05 -1.0332 4.7898
The output from the R code is below:
Linear mixed-effects model fit by REML
Data: rmanova.data
AIC BIC logLik
6522.977 6578.592 -3249.488
Random effects:
Formula: ~1 | Study_ID
(Intercept) Residual
StdDev: 16.59143 12.81334
Fixed effects: chgpf ~ Randomization_Arm + Cancer_Type + site + wk
Value Std.Error DF t-value p-value
(Intercept) 2.332268 2.314150 539 1.0078294 0.3140
Randomization_Arm4 weekly BTA -1.708401 2.409444 222 -0.7090435 0.4790
Cancer_TypeProsta -4.793787 2.560133 222 -1.8724761 0.0625
site2 -1.492911 3.665674 222 -0.4072678 0.6842
site3 -4.002252 3.510111 222 -1.1402066 0.2554
site4 -12.013758 5.746988 222 -2.0904442 0.0377
site5 -3.823504 4.938590 222 -0.7742097 0.4396
wk2 0.313863 1.281047 539 0.2450052 0.8065
wk3 -3.606267 1.329357 539 -2.7127905 0.0069
wk4 -4.246526 1.345526 539 -3.1560334 0.0017
Correlation:
(Intr) R_A4wB Cnc_TP site2 site3 site4 site5 wk2 wk3
Randomization_Arm4 weekly BTA -0.558
Cancer_TypeProsta -0.404 0.046
site2 -0.257 0.001 -0.087
site3 -0.238 0.004 -0.163 0.201
site4 -0.255 0.031 0.151 0.101 0.095
site5 -0.172 -0.016 -0.077 0.139 0.151 0.073
wk2 -0.254 -0.008 0.010 0.011 -0.003 0.005 -0.001
wk3 -0.257 0.005 0.020 0.014 0.006 -0.001 -0.002 0.464
wk4 -0.251 -0.007 0.022 0.020 0.002 0.006 -0.002 0.461 0.461
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-5.6784364 -0.3796392 0.1050812 0.4588555 3.1055046
Number of Observations: 771
Number of Groups: 229
Adding some comments and observations
Since my original posting, I have tried various pieces of R code, but I am getting different estimates from those given by SAS.
More importantly, the standard errors are almost double those given by SAS.
Any suggestions would be greatly appreciated.
I got the solution to the problem from someone after posting the question on the R-sig-ME mailing list. It turns out that the SAS code above actually fits a simple linear regression model, assuming independence across observations, which is equivalent to
proc glm data=rmanova4;
class randomization_arm cancer_type site wk;
model chgpf = randomization_arm cancer_type site wk;
run;
which of course in R is equivalent to
lm(chgpf ~ Randomization_Arm + Cancer_Type + site + wk, data=rmanova.data)
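To also reproduce the LSMEANS / pdiff part of the PROC MIXED call from that lm fit, one option (a sketch, assuming the emmeans package is installed and the same data) is:

    fit <- lm(chgpf ~ Randomization_Arm + Cancer_Type + site + wk, data = rmanova.data)
    library(emmeans)
    emmeans(fit, pairwise ~ Randomization_Arm)   # least-squares means and their pairwise difference

The emmeans component corresponds to the SAS Least Squares Means table and the contrasts component to the Differences of Least Squares Means table.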

Is there any way to split up interaction effects in a linear model?

I have a 2x2 factorial design: control vs enriched, and strain1 vs strain2. I wanted to make a linear model, which I did as follows:
anova(lmer(length ~ Strain + Insect + Strain:Insect + BW_final + (1|Pen), data = mydata))
where length is one of the dependent variables I want to analyse, Strain and Insect are the treatments, Strain:Insect is the interaction effect, BW_final is a covariate, and Pen is a random effect.
As output I get this:
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
Strain 3.274 3.274 1 65 0.1215 0.7285
Insect 14.452 14.452 1 65 0.5365 0.4665
BW_final 45.143 45.143 1 65 1.6757 0.2001
Strain:Insect 52.813 52.813 1 65 1.9604 0.1662
As you can see, I only get 1 interaction term: Strain:Insect. However, I'd like to see 4 interaction terms: Strain1:Control, Strain1:Enriched, Strain2:Control, Strain2:Enriched.
Is there any way to do this in R?
Using summary instead of anova I get:
> summary(linearmer)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: length ~ Strain + Insect + Strain:Insect + BW_final + (1 | Pen)
Data: mydata_young
REML criterion at convergence: 424.2
Scaled residuals:
Min 1Q Median 3Q Max
-1.95735 -0.52107 0.07014 0.43928 2.13383
Random effects:
Groups Name Variance Std.Dev.
Pen (Intercept) 0.00 0.00
Residual 26.94 5.19
Number of obs: 70, groups: Pen, 27
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 101.646129 7.530496 65.000000 13.498 <2e-16 ***
StrainRoss 0.648688 1.860745 65.000000 0.349 0.729
Insect 0.822454 2.062696 65.000000 0.399 0.691
BW_final -0.005188 0.004008 65.000000 -1.294 0.200
StrainRoss:Insect -3.608430 2.577182 65.000000 -1.400 0.166
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) StrnRs Insect BW_fnl
StrainRoss 0.253
Insect -0.275 0.375
BW_final -0.985 -0.378 0.169
StrnRss:Ins 0.071 -0.625 -0.775 0.016
convergence code: 0
boundary (singular) fit: see ?isSingular
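One common way to get estimates for the four Strain-by-Insect combinations (rather than the single interaction coefficient) is to ask emmeans for the cell means and the simple effects. A minimal sketch, assuming the emmeans package is attached, that linearmer is the fitted lmer model shown above, and that Strain and Insect are both factors:

    library(emmeans)
    cellmeans <- emmeans(linearmer, ~ Strain * Insect)        # one estimate per Strain-by-Insect cell
    cellmeans
    contrast(cellmeans, method = "pairwise", by = "Strain")   # effect of Insect within each Strain

Note that the summary above shows a single Insect slope, which suggests Insect may be coded numerically; it would need to be converted with factor() for the cell means to come out as the four treatment combinations.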

What does NA for AIC mean in the glmmPQL (MASS) summary output?

I'm new to glmmPQL.
library(MASS)
pql <- glmmPQL(fixed = sleeve ~ pain + stiff + diff, random = ~1 | time, family = "binomial", data = knee)
When I run this model, R prints the message "iteration 1", and I don't know what it means.
summary(pql)
Data: knee
AIC BIC logLik
NA NA NA
What does NA mean for AIC, BIC and logLik?
The full result looks like this:
Linear mixed-effects model fit by maximum likelihood
Data: knee
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | time
(Intercept) Residual
StdDev: 4.93377e-05 0.9880118
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: sleeve ~ pain + stiff + diff
Value Std.Error DF t-value p-value
(Intercept) 0.5775110 0.6854296 21 0.8425534 0.4090
pain -0.0844256 0.1856925 21 -0.4546528 0.6540
stiff -0.0152056 0.1430767 21 -0.1062757 0.9164
diff 0.0022752 0.0415744 21 0.0547254 0.9569
Correlation:
(Intr) pain stiff
pain -0.004
stiff -0.275 -0.369
diff -0.134 -0.901 0.058
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.3181996 -0.8774770 -0.4111929 0.8923366 1.7722900
Number of Observations: 27
Number of Groups: 3
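In short, glmmPQL fits the model by penalized quasi-likelihood: each "iteration k" message simply reports one PQL iteration, and because quasi-likelihood is not a true likelihood, logLik (and therefore AIC and BIC) are undefined and shown as NA. If a likelihood-based AIC is needed, one option, sketched below under the assumption that the knee data has the same columns, is to refit the model with lme4::glmer, which uses a Laplace approximation and returns a real log-likelihood:

    library(lme4)
    glmer_fit <- glmer(sleeve ~ pain + stiff + diff + (1 | time),
                       family = binomial, data = knee)
    summary(glmer_fit)   # AIC, BIC and logLik are now reported
    AIC(glmer_fit)

With only 3 groups in time, the random intercept is weakly identified either way, so the estimates should be interpreted with caution.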

Different outputs using ggpredict for glmer and glmmTMB models

I am trying to predict and graph models with species presence as the response, but I've run into the following problem: the ggpredict outputs are wildly different for the same data in glmer and glmmTMB, even though the estimates and AIC are very similar. These are simplified models including only date (centered and scaled), which seems to be the most problematic term to predict.
yntest <- glmer(MYOSOD.P ~ jdate.z + I(jdate.z^2) + I(jdate.z^3) +
                  (1 | area/SiteID), family = binomial, data = sodpYN)
> summary(yntest)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: MYOSOD.P ~ jdate.z + I(jdate.z^2) + I(jdate.z^3) + (1 | area/SiteID)
Data: sodpYN
AIC BIC logLik deviance df.resid
1260.8 1295.1 -624.4 1248.8 2246
Scaled residuals:
Min 1Q Median 3Q Max
-2.0997 -0.3218 -0.2013 -0.1238 9.4445
Random effects:
Groups Name Variance Std.Dev.
SiteID:area (Intercept) 1.6452 1.2827
area (Intercept) 0.6242 0.7901
Number of obs: 2252, groups: SiteID:area, 27; area, 9
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.96778 0.39190 -7.573 3.65e-14 ***
jdate.z -0.72258 0.17915 -4.033 5.50e-05 ***
I(jdate.z^2) 0.10091 0.08068 1.251 0.21102
I(jdate.z^3) 0.25025 0.08506 2.942 0.00326 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) jdat.z I(.^2)
jdate.z 0.078
I(jdat.z^2) -0.222 -0.154
I(jdat.z^3) -0.071 -0.910 0.199
The glmmTMB model + summary:
Tyntest <- glmmTMB(MYOSOD.P ~ jdate.z + I(jdate.z^2) + I(jdate.z^3) +
                     (1 | area/SiteID), family = binomial("logit"), data = sodpYN)
> summary(Tyntest)
Family: binomial ( logit )
Formula: MYOSOD.P ~ jdate.z + I(jdate.z^2) + I(jdate.z^3) + (1 | area/SiteID)
Data: sodpYN
AIC BIC logLik deviance df.resid
1260.8 1295.1 -624.4 1248.8 2246
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
SiteID:area (Intercept) 1.6490 1.2841
area (Intercept) 0.6253 0.7908
Number of obs: 2252, groups: SiteID:area, 27; area, 9
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.96965 0.39638 -7.492 6.78e-14 ***
jdate.z -0.72285 0.18250 -3.961 7.47e-05 ***
I(jdate.z^2) 0.10096 0.08221 1.228 0.21941
I(jdate.z^3) 0.25034 0.08662 2.890 0.00385 **
---
ggpredict outputs
testg <- ggpredict(yntest, terms = "jdate.z[all]")
> testg
# Predicted probabilities of MYOSOD.P
# x = jdate.z
x predicted std.error conf.low conf.high
-1.95 0.046 0.532 0.017 0.120
-1.51 0.075 0.405 0.036 0.153
-1.03 0.084 0.391 0.041 0.165
-0.58 0.072 0.391 0.035 0.142
-0.14 0.054 0.390 0.026 0.109
0.35 0.039 0.399 0.018 0.082
0.79 0.034 0.404 0.016 0.072
1.72 0.067 0.471 0.028 0.152
Adjusted for:
* SiteID = 0 (population-level)
* area = 0 (population-level)
Standard errors are on link-scale (untransformed).
testgTMB <- ggpredict(Tyntest, "jdate.z[all]")
> testgTMB
# Predicted probabilities of MYOSOD.P
# x = jdate.z
x predicted std.error conf.low conf.high
-1.95 0.444 0.826 0.137 0.801
-1.51 0.254 0.612 0.093 0.531
-1.03 0.136 0.464 0.059 0.280
-0.58 0.081 0.404 0.038 0.163
-0.14 0.054 0.395 0.026 0.110
0.35 0.040 0.402 0.019 0.084
0.79 0.035 0.406 0.016 0.074
1.72 0.040 0.444 0.017 0.091
Adjusted for:
* SiteID = NA (population-level)
* area = NA (population-level)
Standard errors are on link-scale (untransformed).
The estimates are completely different and I have no idea why.
I tried both the CRAN version of the ggeffects package and the development version, in case that changed anything. It did not. I am using the most up-to-date version of glmmTMB.
This is my first time asking a question here, so please let me know if I should provide more information to help explain the problem.
I checked, and the issue is the same when using predict instead of ggpredict, which would imply that it is a glmmTMB issue?
GLMER:
dayplotg <- expand.grid(jdate.z = seq(min(sodp$jdate.z), max(sodp$jdate.z), length = 92))
Dfitg <- predict(yntest, re.form = NA, newdata = dayplotg, type = 'response')
dayplotg <- data.frame(dayplotg, Dfitg)
head(dayplotg)
> head(dayplotg)
jdate.z Dfitg
1 -1.953206 0.04581691
2 -1.912873 0.04889584
3 -1.872540 0.05195598
4 -1.832207 0.05497553
5 -1.791875 0.05793307
6 -1.751542 0.06080781
glmmTMB:
dayplot <- expand.grid(jdate.z = seq(min(sodp$jdate.z), max(sodp$jdate.z), length = 92),
                       SiteID = NA,
                       area = NA)
Dfit <- predict(Tyntest, newdata = dayplot, type = 'response')
head(Dfit)
dayplot <- data.frame(dayplot, Dfit)
head(dayplot)
> head(dayplot)
jdate.z SiteID area Dfit
1 -1.953206 NA NA 0.4458236
2 -1.912873 NA NA 0.4251926
3 -1.872540 NA NA 0.4050944
4 -1.832207 NA NA 0.3855801
5 -1.791875 NA NA 0.3666922
6 -1.751542 NA NA 0.3484646
I contacted the ggpredict developer and figured out that if I used poly(jdate.z, 3) rather than jdate.z + I(jdate.z^2) + I(jdate.z^3) in the glmmTMB model, the glmer and glmmTMB predictions were the same.
I'll leave this post up, even though I was able to answer my own question, in case someone else runs into the same issue later.
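For reference, a sketch of the reformulation described above (assuming the same data and packages; Tyntest_poly is just an illustrative name), using an orthogonal polynomial instead of the raw I() terms:

    Tyntest_poly <- glmmTMB(MYOSOD.P ~ poly(jdate.z, 3) + (1 | area/SiteID),
                            family = binomial("logit"), data = sodpYN)
    ggpredict(Tyntest_poly, "jdate.z[all]")   # population-level predictions now agree with the glmer fit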

