gls() vs. lme() in the nlme package

In the nlme package there are two functions for fitting linear models (lme and gls). What are the differences between them in terms of the types of models that can be fit, and the fitting process? What is the design rationale for having two functions to fit linear mixed models, where most other systems (e.g. SAS, SPSS) only have one?
Update: Added bounty. Interested to know the differences in the fitting process, and the rationale.

From Pinheiro & Bates 2000, Section 5.4, p. 250:
The gls function is used to fit the extended linear model, using either maximum likelihood or restricted maximum likelihood. It can be viewed as an lme function without the argument random.
For further details, it would be instructive to compare the lme analysis of the Orthodont dataset (starting on p. 147 of the same book) with the gls analysis (starting on p. 250). To begin, compare
orth.lme <- lme(distance ~ Sex * I(age-11), data=Orthodont)
summary(orth.lme)
Linear mixed-effects model fit by REML
 Data: Orthodont
       AIC     BIC    logLik
  458.9891 498.655 -214.4945

Random effects:
 Formula: ~Sex * I(age - 11) | Subject
 Structure: General positive-definite
                      StdDev    Corr
(Intercept)           1.7178454 (Intr) SexFml I(-11)
SexFemale             1.6956351 -0.307
I(age - 11)           0.2937695 -0.009 -0.146
SexFemale:I(age - 11) 0.3160597  0.168  0.290 -0.964
Residual              1.2551778

Fixed effects: distance ~ Sex * I(age - 11)
                          Value Std.Error DF  t-value p-value
(Intercept)           24.968750 0.4572240 79 54.60945  0.0000
SexFemale             -2.321023 0.7823126 25 -2.96687  0.0065
I(age - 11)            0.784375 0.1015733 79  7.72226  0.0000
SexFemale:I(age - 11) -0.304830 0.1346293 79 -2.26421  0.0263
 Correlation:
                      (Intr) SexFml I(-11)
SexFemale             -0.584
I(age - 11)           -0.006  0.004
SexFemale:I(age - 11)  0.005  0.144 -0.754

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.96534486 -0.38609670  0.03647795  0.43142668  3.99155835

Number of Observations: 108
Number of Groups: 27
orth.gls <- gls(distance ~ Sex * I(age-11), data=Orthodont)
summary(orth.gls)
Generalized least squares fit by REML
  Model: distance ~ Sex * I(age - 11)
  Data: Orthodont
       AIC      BIC    logLik
  493.5591 506.7811 -241.7796

Coefficients:
                          Value Std.Error  t-value p-value
(Intercept)           24.968750 0.2821186 88.50444  0.0000
SexFemale             -2.321023 0.4419949 -5.25124  0.0000
I(age - 11)            0.784375 0.1261673  6.21694  0.0000
SexFemale:I(age - 11) -0.304830 0.1976661 -1.54214  0.1261

 Correlation:
                      (Intr) SexFml I(-11)
SexFemale             -0.638
I(age - 11)            0.000  0.000
SexFemale:I(age - 11)  0.000  0.000 -0.638

Standardized residuals:
        Min          Q1         Med          Q3         Max
-2.48814895 -0.58569115 -0.07451734  0.58924709  2.32476465

Residual standard error: 2.256949
Degrees of freedom: 108 total; 104 residual
Notice that the estimates of the fixed effects are the same (to 6 decimal places), but the standard errors are different, as is the correlation matrix. The lme fit attributes part of the variability to between-subject random effects, while the gls fit pools everything into the residual error, so the precision of the fixed-effect estimates is assessed differently.

Interesting question.
In principle the only difference is that gls can't fit models with random effects, whereas lme can. So the commands
fm1 <- gls(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), Ovary,
           correlation = corAR1(form = ~1 | Mare))
and
lm1 <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), Ovary,
           correlation = corAR1(form = ~1 | Mare))
ought to give the same result but they don't. The fitted parameters differ slightly.
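One way to see the discrepancy is to extract and compare the estimates directly. A minimal sketch, assuming nlme and its built-in Ovary data are available; note that gls stores its estimates in coef(), while for lme the fixed effects are extracted with fixef():

```r
library(nlme)

fm1 <- gls(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), Ovary,
           correlation = corAR1(form = ~1 | Mare))
lm1 <- lme(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), Ovary,
           correlation = corAR1(form = ~1 | Mare))

# fixed-effect estimates side by side
round(rbind(gls = coef(fm1), lme = fixef(lm1)), 4)

# the estimated AR(1) parameter also differs slightly between the fits
coef(fm1$modelStruct$corStruct, unconstrained = FALSE)
coef(lm1$modelStruct$corStruct, unconstrained = FALSE)
```

Since lme was given no random argument and Ovary is a groupedData object, lme supplies a default random-effects structure, which is one reason the two fits need not coincide exactly.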

Related

Getting p-values for mixed model run using lmer function

I've run some mixed models using lmer and they don't give p-values. I would like to know if there is a way to get p-values for these models. Someone suggested the afex package; I've looked into it and am confused and overwhelmed. At https://rdrr.io/rforge/afex/man/mixed.html, for example, the code looks very complicated and involved, which makes me wonder if that is really what I need to do. Below is an example of a mixed model I ran; I would like p-values for the fixed effects and the correlations of fixed effects. Any help would be appreciated!
Linear mixed model fit by REML ['lmerMod']
Formula: score ~ group + condition + (1 | subject) + (1 | token_set) + (1 | list)
   Data: EN_JT_1

REML criterion at convergence: 744.9

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.5860 -0.0364  0.2183  0.5424  1.6575

Random effects:
 Groups    Name        Variance Std.Dev.
 subject   (Intercept) 0.006401 0.08000
 token_set (Intercept) 0.001667 0.04083
 list      (Intercept) 0.000000 0.00000
 Residual              0.084352 0.29043
Number of obs: 1704, groups:  subject, 71; token_set, 24; list, 2

Fixed effects:
                      Estimate Std. Error t value
(Intercept)            0.99796    0.02425  41.156
groupHS               -0.08453    0.02741  -3.084
groupSB               -0.03103    0.03034  -1.023
conditionEN-GJT-D-ENG -0.10329    0.01990  -5.190
conditionEN-GJT-D-NNS -0.01288    0.02617  -0.492
conditionEN-GJT-D-NTR -0.19250    0.02596  -7.415

Correlation of Fixed Effects:
             (Intr) gropHS gropSB cEN-GJT-D-E cEN-GJT-D-NN
groupHS      -0.452
groupSB      -0.409  0.361
cEN-GJT-D-E  -0.410  0.000  0.000
cEN-GJT-D-NN -0.531  0.000  0.000  0.380
cEN-GJT-D-NT -0.535  0.000  0.000  0.383       0.700
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see ?isSingular
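A common, lighter-weight alternative to afex is to refit the model with the lmerTest package loaded, which replaces lme4's lmer with a version whose summary() reports Satterthwaite-approximated degrees of freedom and p-values. A sketch, assuming your data frame EN_JT_1 and the formula from the output above:

```r
library(lmerTest)  # masks lme4::lmer with a p-value-reporting version

m <- lmer(score ~ group + condition +
            (1 | subject) + (1 | token_set) + (1 | list),
          data = EN_JT_1)

# summary() now includes df and Pr(>|t|) columns for each fixed effect,
# and still prints the "Correlation of Fixed Effects" block
summary(m)

# F-tests per term, if you prefer an ANOVA-style table
anova(m)
```

Note that the p-values are approximations; lme4 omits them deliberately because the appropriate degrees of freedom for mixed models are not well defined.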

R: Plotting Mixed Effect models plot results

I am working on linguistic data and am trying to investigate the realisation of the vowel in words such as NURSE. There are more or less 3 categories that can be realised, which I coded as <Er, Ir, Vr>. I then measured formant values (F1 and F2), and created an LME that predicts the F1 and F2 values from different fixed and random effects; the main effect is a crossed random effect of phoneme (i.e. <Er, Ir, Vr>) and individual. An example model can be found below.
Linear mixed model fit by REML ['lmerMod']
Formula: F2 ~ (phoneme | individual) + (1 | word) + age + frequency + (1 | zduration)
   Data: nurse_female

REML criterion at convergence: 654.4

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.09203 -0.20332  0.03263  0.25273  1.37056

Random effects:
 Groups     Name        Variance Std.Dev. Corr
 zduration  (Intercept) 0.27779  0.5271
 word       (Intercept) 0.04488  0.2118
 individual (Intercept) 0.34181  0.5846
            phonemeIr   0.54227  0.7364   -0.82
            phonemeVr   1.52090  1.2332   -0.93  0.91
 Residual               0.06326  0.2515
Number of obs: 334, groups:  zduration, 280; word, 116; individual, 23

Fixed effects:
                   Estimate Std. Error t value
(Intercept)         1.79167    0.32138   5.575
age                -0.01596    0.00508  -3.142
frequencylow       -0.37587    0.18560  -2.025
frequencymid       -1.18901    0.27738  -4.286
frequencyvery high -0.68365    0.26564  -2.574

Correlation of Fixed Effects:
            (Intr) age    frqncyl frqncym
age         -0.811
frequencylw -0.531 -0.013
frequencymd -0.333 -0.006  0.589
frqncyvryhg -0.356  0.000  0.627   0.389
The question is now: how would I go about plotting the mean F2 values for each individual and for each of the 3 variants <Er, Ir, Vr>?
I tried plotting the random effects as a caterpillar plot and get the following, but I am not sure if this is accurate or does what I want. If what I have done is right, are there any other, better ways of plotting it?
ranefs_nurse_female_F2 <- ranef(nurse_female_F2.lmer8_2)
dotplot(ranefs_nurse_female_F2)
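Rather than plotting the raw random effects, you can use coef(), which combines the fixed effects with each individual's random effects, and plot the resulting per-individual predictions for each variant. A sketch, assuming the model object nurse_female_F2.lmer8_2 from the question (predictions are at the reference level of frequency and with age at zero, so they show relative differences rather than raw means):

```r
library(lme4)
library(lattice)

# coef() merges fixef() and ranef(), giving one row per individual
ind <- coef(nurse_female_F2.lmer8_2)$individual

# Er is the baseline phoneme; Ir and Vr add their per-individual offsets
pred <- data.frame(individual = rownames(ind),
                   Er = ind[["(Intercept)"]],
                   Ir = ind[["(Intercept)"]] + ind[["phonemeIr"]],
                   Vr = ind[["(Intercept)"]] + ind[["phonemeVr"]])

# reshape to long format: one row per individual x variant
long <- reshape(pred, direction = "long",
                varying = c("Er", "Ir", "Vr"), v.names = "F2",
                times = c("Er", "Ir", "Vr"), timevar = "phoneme")

# one point per individual and variant, individuals ordered by F2
dotplot(reorder(individual, F2) ~ F2, data = long, groups = phoneme,
        auto.key = TRUE, xlab = "Model-predicted F2")
```

If you want means on the original scale that also account for each token's word and duration, predict() on the original data followed by aggregating by individual and phoneme is an alternative.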

Generalized least squares results interpretation

I checked my linear regression model (WMAN = species, WDNE = sea surface temperature) and found autocorrelation, so instead I am trying generalized least squares with the following script:
library(nlme)
modelwa <- gls(WMAN ~ WDNE, data = dat,
               correlation = corAR1(form = ~MONTH),
               na.action = na.omit)
summary(modelwa)
I compared both models:
> library(MuMIn)
> model.sel(modelw,modelwa)
Model selection table
        (Intrc)   WDNE class na.action  correlation df   logLik   AICc delta weight
modelwa   31.50 0.1874   gls   na.omit crAR1(MONTH)  4 -610.461 1229.2  0.00      1
modelw    11.31 0.7974    lm   na.excl               3 -658.741 1323.7 94.44      0
Abbreviations:
na.action: na.excl = ‘na.exclude’
correlation: crAR1(MONTH) = ‘corAR1(~MONTH)’
Models ranked by AICc(x)
I believe the results suggest I should use gls, as the AICc is lower.
My problem is that I have been reporting F-value/R²/p-value, but the output from gls does not include these.
I would be very grateful if someone could assist me in interpreting these results.
> summary(modelwa)
Generalized least squares fit by REML
  Model: WMAN ~ WDNE
  Data: mp2017.dat
       AIC      BIC    logLik
  1228.923 1240.661 -610.4614

Correlation Structure: ARMA(1,0)
 Formula: ~MONTH
 Parameter estimate(s):
     Phi1
0.4809973

Coefficients:
                Value Std.Error  t-value p-value
(Intercept) 31.496911  8.052339 3.911524  0.0001
WDNE         0.187419  0.091495 2.048401  0.0424

 Correlation:
     (Intr)
WDNE -0.339

Standardized residuals:
      Min        Q1       Med        Q3       Max
-2.023362 -1.606329 -1.210127  1.427247  3.567186

Residual standard error: 18.85341
Degrees of freedom: 141 total; 139 residual
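For the significance tests, a gls fit can produce an ANOVA-style table directly; there is no standard R² for GLS, but a likelihood-based pseudo-R² can be computed by comparison with an intercept-only model. A sketch, assuming the modelwa fit above (the pseudo-R² is an assumption on my part, not a standard gls output):

```r
library(nlme)

# sequential F-tests for each term in the gls fit
anova(modelwa)

# the coefficient table (t-values and p-values) is also available directly;
# with a single predictor, F for WDNE is simply its t-value squared
summary(modelwa)$tTable

# rough Cox-Snell-style pseudo-R^2: refit with ML so the likelihoods of
# the full and intercept-only models are comparable
m1 <- update(modelwa, method = "ML")
m0 <- update(m1, . ~ 1)
n  <- length(resid(m1))
1 - exp(-2 * (as.numeric(logLik(m1)) - as.numeric(logLik(m0))) / n)
```

When reporting, the t-value and p-value per coefficient from summary() plus the AICc comparison you already have are usually sufficient for a GLS model.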
I have now overcome the problem of autocorrelation, so I can use lm(): add lag 1 of the residual as an X variable to the original model. This can be done using the slide function in the DataCombine package.
library(DataCombine)
econ_data   <- data.frame(economics, resid_mod1 = lmMod$residuals)
econ_data_1 <- slide(econ_data, Var = "resid_mod1",
                     NewVar = "lag1", slideBy = -1)
econ_data_2 <- na.omit(econ_data_1)
lmMod2 <- lm(pce ~ pop + lag1, data = econ_data_2)

Interpreting the output of summary(glmer(...)) in R

I'm an R noob; I hope you can help me. I'm trying to analyse a dataset in R, but I'm not sure how to interpret the output of summary(glmer(...)), and the documentation isn't a big help:
> data_chosen_stim <- glmer(open_chosen_stim ~ closed_chosen_stim + day + (1 | ID),
+                           family = binomial, data = chosenMovement)
> summary(data_chosen_stim)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: open_chosen_stim ~ closed_chosen_stim + day + (1 | ID)
   Data: chosenMovement

     AIC      BIC   logLik deviance df.resid
    96.7    105.5    -44.4     88.7       62

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.4062 -1.0749  0.7111  0.8787  1.0223

Random effects:
 Groups Name        Variance Std.Dev.
 ID     (Intercept) 0        0
Number of obs: 66, groups:  ID, 35

Fixed effects:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)           0.4511     0.8715   0.518    0.605
closed_chosen_stim2   0.4783     0.5047   0.948    0.343
day                  -0.2476     0.5060  -0.489    0.625

Correlation of Fixed Effects:
            (Intr) cls__2
clsd_chsn_2 -0.347
day         -0.916  0.077
I understand the GLM behind it, but I can't see the weights of the independent variables and their error bounds.
update: weights.merMod already has a type argument ...
I think what you're looking for is weights(object, type = "working"). I believe these are the diagonal elements of W in your notation?
Here's a trivial example that matches up the results of glm and glmer (since the random effect is bogus and gets an estimated variance of zero, the fixed effects, weights, etc etc converges to the same value).
Note that the weights() accessor returns the prior weights by default (these are all equal to 1 for the example below).
Example (from ?glm):
d.AD <- data.frame(treatment = gl(3, 3),
                   outcome   = gl(3, 1, 9),
                   counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(),
               data = d.AD)

library(lme4)
d.AD$f <- 1  ## dummy grouping variable
glmer.D93 <- glmer(counts ~ outcome + treatment + (1 | f),
                   family = poisson(),
                   data = d.AD,
                   control = glmerControl(check.nlev.gtr.1 = "ignore"))
Fixed effects and weights are the same:
all.equal(fixef(glmer.D93), coef(glm.D93))  ## TRUE
all.equal(unname(weights(glm.D93, type = "working")),
          weights(glmer.D93, type = "working"),
          tol = 1e-7)  ## TRUE

Root mean square error in R - mixed effect model

Could you please tell me how to get/compute the RMSE (root mean square error) value in R when you fit a mixed-effects model?
 Data: na.omit(binh)
       AIC      BIC    logLik
  888.6144 915.1201 -436.3072

Random effects:
 Formula: ~1 | Study
        (Intercept) Residual
StdDev:    3.304345 1.361858

Fixed effects: Eeff ~ ADF + CP + DE + ADF2 + DE2
                Value Std.Error  DF   t-value p-value
(Intercept)  -0.66390 18.870908 158 -0.035181  0.9720
ADF           1.16693  0.424561 158  2.748556  0.0067
CP            0.25723  0.097524 158  2.637575  0.0092
DE          -36.09593 12.031791 158 -3.000046  0.0031
ADF2         -0.03708  0.011014 158 -3.366625  0.0010
DE2           4.77918  1.932924 158  2.472513  0.0145
 Correlation:
     (Intr) ADF    CP     DE     ADF2
ADF  -0.107
CP   -0.032  0.070
DE    0.978 -0.291 -0.043
ADF2  0.058 -0.982 -0.045  0.250
DE2  -0.978  0.308  0.039 -0.997 -0.265

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.28168116 -0.45260885  0.06528363  0.57071734  2.54144168

Number of Observations: 209
Number of Groups: 46
You don't give details of what function you used to fit your model, but model objects tend to store their residuals under the same name, which you can check with str(), and RMSE is easily calculated from the residuals:
# make a model
library(nlme)
r <- lme(conc ~ age, data = IGF)
# get the RMSE
r.rmse <- sqrt(mean(r$residuals^2))
And in comments below, Ben Bolker points out that objects made by model fitting functions should have a residuals method, making it possible to do this (although some types of models may return residuals that have been transformed):
r.rmse <- sqrt(mean(residuals(r)^2))
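The residuals() approach works identically for plain lm fits, which makes it easy to sanity-check. A minimal base-R sketch using the built-in mtcars data:

```r
# fit an ordinary linear model on built-in data
m <- lm(mpg ~ wt, data = mtcars)

# RMSE via the residuals() method
rmse <- sqrt(mean(residuals(m)^2))

# equivalent hand computation from observed and fitted values
rmse2 <- sqrt(mean((mtcars$mpg - fitted(m))^2))

stopifnot(isTRUE(all.equal(rmse, rmse2)))
rmse
```

For a mixed model, the same one-liner applies, but keep in mind that residuals() for lme/lmer objects are conditional on the estimated random effects by default.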
The same result can be obtained from (note that lmer comes from lme4, not nlme):
library(lme4)
library(sjstats)
fit <- lmer(Yield ~ Species + (1 | Population/variety), data = df1, REML = TRUE)
rmse(fit)
