The question may look naive, but I am puzzled by how to configure the nlme function in R to get results equivalent to a given lme model.
This appears to work. Note that the defaults for method are different for lme ("REML") and nlme ("ML") ...
library(nlme)
m1 <- lme(distance ~ age,
          random = ~ age | Subject, data = Orthodont,
          method = "ML")
nlme requires starting values - cheat here and use those from lme:
m2 <- nlme(distance ~ mu,
           fixed = mu ~ age,
           random = mu ~ age | Subject,
           data = Orthodont,
           start = list(fixed = fixef(m1)))
The variance-covariance matrices are almost identical.
> VarCorr(m1)
Subject = pdLogChol(age)
Variance StdDev Corr
(Intercept) 4.81407327 2.1940996 (Intr)
age 0.04619252 0.2149244 -0.581
Residual 1.71620466 1.3100399
> VarCorr(m2)
Subject = pdLogChol(list(mu ~ age))
Variance StdDev Corr
mu.(Intercept) 4.81408901 2.1941032 m.(In)
mu.age 0.04619255 0.2149245 -0.581
Residual 1.71620373 1.3100396
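As an additional check (a quick sketch, not part of the original answer), the fixed-effect estimates of the two fits can be compared directly; the names differ because of the mu parameterization in nlme, so drop them before comparing:
## Sketch: compare fixed-effect estimates from the lme and nlme fits
cbind(lme = fixef(m1), nlme = unname(fixef(m2)))
all.equal(unname(fixef(m1)), unname(fixef(m2)), tolerance = 1e-6)  # tolerance chosen loosely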
I have fitted a mixed-effects model using both functions widely used in R for this purpose: the lme function from the nlme package and the lmer function from the lme4 package.
To refit the lme model in lme4 with the same parameterization, I used the information from this topic, given that this is only possible in lme4 in a hacky way: Heteroscedastic model of mixed effects via lmer function
I apologize for hosting the data at an external link; I couldn't find a built-in R dataset with variables that match my problem.
Data: https://drive.google.com/file/d/1jKFhs4MGaVxh-OPErvLDfMNmQBouywoM/view?usp=sharing
The fitted models were:
library(nlme)
library(lme4)
ModLME = lme(Var1 ~ I(Var2) + I(Var2^2),
             random = ~ 1 | Var3,
             weights = varIdent(form = ~ 1 | Var4),
             data = Dataone, method = "REML")
ModLMER = lmer(Var1 ~ I(Var2) + I(Var2^2) + (1 | Var3) + (0 + dummy(Var4, "1") | Var5),
               data = Dataone, REML = TRUE,
               control = lmerControl(check.nobs.vs.nlev = "ignore",
                                     check.nobs.vs.nRE = "ignore"))
These fits are equivalent; see:
all.equal(REMLcrit(ModLMER), c(-2*logLik(ModLME)))
[1] TRUE
all.equal(fixef(ModLME), fixef(ModLMER), tolerance=1e-7)
[1] TRUE
> summary(ModLME)
Linear mixed-effects model fit by REML
Data: Dataone
AIC BIC logLik
-209.1431 -193.6948 110.5715
Random effects:
Formula: ~1 | Var3
(Intercept) Residual
StdDev: 0.05789852 0.03636468
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | Var4
Parameter estimates:
0 1
1.000000 5.641709
Fixed effects: Var1 ~ I(Var2) + I(Var2^2)
Value Std.Error DF t-value p-value
(Intercept) 0.9538547 0.01699642 97 56.12093 0
I(Var2) -0.5009804 0.09336479 97 -5.36584 0
I(Var2^2) -0.4280151 0.10038257 97 -4.26384 0
summary(ModLMER)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: Var1 ~ I(Var2) + I(Var2^2) + (1 | Var3) + (0 + dummy(Var4, "1") |
Var5)
Data: Dataone
Control: lmerControl(check.nobs.vs.nlev = "ignore", check.nobs.vs.nRE = "ignore")
REML criterion at convergence: -221.1
Scaled residuals:
Min 1Q Median 3Q Max
-4.1151 -0.5891 0.0374 0.5229 2.1880
Random effects:
Groups Name Variance Std.Dev.
Var3 (Intercept) 6.466e-12 2.543e-06
Var5 dummy(Var4, "1") 4.077e-02 2.019e-01
Residual 4.675e-03 6.837e-02
Number of obs: 100, groups: Var3, 100; Var5, 100
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.95385 0.01700 95.02863 56.121 < 2e-16 ***
I(Var2) -0.50098 0.09336 92.94048 -5.366 5.88e-07 ***
I(Var2^2) -0.42802 0.10038 91.64017 -4.264 4.88e-05 ***
However, the residuals of these models are not similar. In the model fitted by lmer, a set of residuals mysteriously appears as a few points lying close to a straight line. How could I fix this so that the residuals are identical? I believe the problem is in the lme4 model.
aa=plot(ModLME, main="LME")
bb=plot(ModLMER, main="LMER")
gridExtra::grid.arrange(aa,bb,ncol=2)
I can tell you what's going on and what should in principle fix it, but at the moment the fix doesn't work ...
The residuals being plotted take all of the random effects into account, which in the case of the lmer fit includes the observation-level random effects (the (0+dummy(Var4,"1")|Var5) term); this leads to weird residuals for the Var4==1 group. To illustrate this:
plot(ModLMER, col = Dataone$Var4+1)
i.e., you can see that the weird residuals are exactly the ones in red == those for which Var4==1.
In theory we should be able to get the same residuals via:
res <- Dataone$Var1 - predict(ModLMER, re.form = ~(1|Var3))
i.e., ignore the group-specific observation-level random effect term. However, it looks like there is a bug at the moment ("contrasts can be applied only to factors with 2 or more levels").
An extremely hacky solution is to construct the random-effect predictions without the observation-level term yourself:
## fixed-effect predictions
p0 <- predict(ModLMER, re.form = NA)
## construct RE prediction, Var3 term only:
Z <- getME(ModLMER, "Z")
b <- drop(getME(ModLMER, "b"))
## zero out observation-level components
b[101:200] <- 0
## add RE predictions to fixed predictions
p1 <- drop(p0 + Z %*% b)
## plot fitted vs residual
plot(p1, Dataone$Var1 - p1)
For what it's worth, this also works:
library(glmmTMB)
ModGLMMTMB <- glmmTMB(Var1 ~ I(Var2) + I(Var2^2) + (1 | Var3),
                      dispformula = ~ factor(Var4),
                      REML = TRUE,
                      data = Dataone)
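As a quick check (a sketch, assuming the fit above converges as expected), the fixed effects of the glmmTMB fit can be compared with those of the lme fit; glmmTMB stores the conditional-model coefficients in the $cond component:
## Sketch: fixed effects from the glmmTMB fit vs the lme fit
cbind(lme = fixef(ModLME), glmmTMB = unname(fixef(ModGLMMTMB)$cond))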
I am running the following line of code in R:
model = lme(divedepth ~ oarea, random=~1|deployid, data=GDataTimes, method="REML")
summary(model)
and I am seeing this result:
Linear mixed-effects model fit by REML
Data: GDataTimes
AIC BIC logLik
2512718 2512791 -1256352
Random effects:
Formula: ~1 | deployid
(Intercept) Residual
StdDev: 9.426598 63.50004
Fixed effects: divedepth ~ oarea
Value Std.Error DF t-value p-value
(Intercept) 25.549003 3.171766 225541 8.055135 0.0000
oarea2 12.619669 0.828729 225541 15.227734 0.0000
oarea3 1.095290 0.979873 225541 1.117787 0.2637
oarea4 0.852045 0.492100 225541 1.731447 0.0834
oarea5 2.441955 0.587300 225541 4.157933 0.0000
[snip]
Number of Observations: 225554
Number of Groups: 9
However, I cannot find the p-value for the random effect deployid. How can I see this value?
As stated in the comments, there is stuff about significance tests of random effects in the GLMM FAQ. You should definitely consider:
why you are really interested in the p-value (it's not that it is never of interest, but it is an unusual case)
the fact that the likelihood ratio test is extremely conservative for testing variance parameters (in this case it gives a p-value that's 2x too large)
Here's an example that shows that the lme() fit and the corresponding lm() model without the random effect have commensurate log-likelihoods (i.e., they're computed in a comparable way) and can be compared with anova():
Load packages and simulate data (with zero random effect variance)
library(lme4)
library(nlme)
set.seed(101)
dd <- data.frame(x = rnorm(120), f = factor(rep(1:3, 40)))
dd$y <- simulate(~ x + (1 | f),
                 newdata = dd,
                 newparams = list(beta = rep(1, 2),
                                  theta = 0,
                                  sigma = 1))[[1]]
Fit models (note that you cannot compare a model fitted with REML to a model without random effects).
m1 <- lme(y ~ x , random = ~ 1 | f, data = dd, method = "ML")
m0 <- lm(y ~ x, data = dd)
Test:
anova(m1, m0)
## Model df AIC BIC logLik Test L.Ratio p-value
## m1 1 4 328.4261 339.5761 -160.2131
## m0 2 3 326.4261 334.7886 -160.2131 1 vs 2 6.622332e-08 0.9998
Here the test correctly identifies that the two models are identical and gives a p-value of essentially 1.
If you use lme4::lmer instead of lme you have some other, more accurate (but slower) options (RLRsim and PBmodcomp packages for simulation-based tests): see the GLMM FAQ.
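For example, here is a minimal sketch of the RLRsim approach, applied to an lmer refit of the simulated data above (it assumes the RLRsim package is installed):
## Sketch: simulation-based restricted LRT for the single variance component
library(RLRsim)
m1_lmer <- lmer(y ~ x + (1 | f), data = dd)  # REML fit (lmer default)
exactRLRT(m1_lmer)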
In R, I am searching for a way to estimate confidence intervals for linear contrasts in lmer models that use either Kenward-Roger or Satterthwaite degrees of freedom and standard errors.
For example, I can compute a CI for a fixed-effect parameter in a mixed model in R (as SAS does), using the t value (with KR df) and the SE.
mod <- lmerTest::lmer(y ~ time1 + treatment + time1:treatment + (1 | PersonID), data = data)
summary(mod, ddf = "Kenward-Roger")
This gives the output:
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 49.0768 1.0435 56.4700 47.029 < 2e-16 ***
time1 5.8224 0.5963 48.0000 9.764 5.51e-13 ***
treatment 1.6819 1.4758 56.4700 1.140 0.2592
time1:treatment 2.0425 0.8433 48.0000 2.422 0.0193 *
This allows a CI for time1 to be computed as:
5.8224+abs(qt(0.05/2, 48))*0.5963 #7.021342
5.8224-abs(qt(0.05/2, 48))*0.5963 #4.623458
I would like to do the same thing for a linear contrast of the fixed coefficients. The following gives the p-value, but there is no SE in the output.
pbkrtest::KRmodcomp(mod,matrix(c(0,0,1,0),nrow = 1))
stat ndf ddf F.scaling p.value
Ftest 1.2989 1.0000 56.4670 1 0.2592
Is there any way to get an SE or a CI for linear contrasts of lmer fits that uses this type of df?
For this, you have at least two options: using the lsmeans package, or doing it manually (using the functions pbkrtest::vcovAdj.lmerMod and pbkrtest::get_Lb_ddf). Personally, I go with the latter if the contrast to be tested is not very "simple", because I find the syntax in lsmeans a bit complicated.
To exemplify, take the following model:
library(pbkrtest)
library(lme4)
library(nlme) # for the 'Orthodont' data
# 'age' is a numeric variable, while 'Sex' and 'Subject' are factors
model <- lmer(distance ~ age : Sex + (1 | Subject), data = Orthodont)
Linear mixed model fit by REML ['lmerMod']
Formula: distance ~ age:Sex + (1 | Subject)
…
Fixed Effects:
(Intercept) age:SexMale age:SexFemale
16.7611 0.7555 0.5215
from which we would like to obtain stats on the difference between the coefficients for age in males and females (i.e., age:SexMale - age:SexFemale).
Using lsmeans:
library(lsmeans)
# Evaluate the contrast at a value of 'age' set to 1,
# so that the resulting value is equal to the regression coefficient
lsm = lsmeans(model, pairwise ~ age : Sex, at = list(age = 1))$contrast
produces:
contrast estimate SE df t.ratio p.value
1,Male - 1,Female 0.2340135 0.06113276 42.64 3.828 0.0004
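A confidence interval for the same contrast can be obtained directly from that object (a small sketch, not in the original answer):
## Sketch: CI for the lsmeans contrast, using the same Kenward-Roger df
confint(lsm)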
Alternatively, doing the calculation manually:
# Specify the contrasts: age:SexMale - age:SexFemale
# Must have the same order as the fixed effects in the model
K = c("(Intercept)" = 0, "age:SexMale" = 1, "age:SexFemale" = -1)
# Retrieve the adjusted variance-covariance matrix, to calculate the SE
V = pbkrtest::vcovAdj.lmerMod(model, 0)
# Point estimate, SE and df
point_est = sum(K * fixef(model))
SE = sqrt(sum(K * (V %*% K)))
df = pbkrtest::get_Lb_ddf(model, K)
alpha = 0.05 # significance level
# Calculate confidence interval for the difference between the 'age' coefficients for males and females
Delta_age_CI = point_est + SE * qt(c(0.5 * alpha, 1 - 0.5 * alpha), df)
will result in a point estimate equal to 0.2340135, SE 0.06113276, df 42.63844, and a confidence interval of [0.1106973, 0.3573297].
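If you also want a p-value from the manual route (a sketch), it can be computed from the same quantities and should agree with the lsmeans output above:
## Sketch: t statistic and two-sided p-value for the contrast
t_stat = point_est / SE
2 * pt(-abs(t_stat), df)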
I am trying to specify both a random intercept and random slope term in a GAMM model with one fixed effect.
I have successfully fitted a model with a random intercept using the code below from the mgcv library, but I now cannot determine the syntax for a random slope within the gamm() function:
M1 = gamm(dur ~ s(dep, bs="ts", k = 4), random= list(fInd = ~1), data= df)
If I was using both a random intercept and slope within a linear mixed-effects model I would write it in the following way:
M2 = lme(dur ~ dep, random=~1 + dep|fInd, data=df)
The gamm() supporting documentation states that the random terms need to be given in the list form as in lme() but I cannot find any interpretable examples that include both slope and intercept terms. Any advice / solutions would be much appreciated.
The gamm4 function in the gamm4 package provides a way to do this. You specify the random intercept and slope in the lme4 (lmer) style. In your case:
library(gamm4)
M1 = gamm4(dur ~ s(dep, bs = "ts", k = 4), random = ~ (1 + dep | fInd), data = df)
Here is the gamm4 documentation:
https://cran.r-project.org/web/packages/gamm4/gamm4.pdf
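As a follow-up sketch (not part of the original answer): gamm4() returns a list with a $mer component (the underlying lme4 fit, where the random intercept and slope live) and a $gam component (the smooth terms), so the estimates can be inspected with:
## Sketch: inspect the two components of the gamm4 fit
summary(M1$mer)  # lme4 side: random intercept and slope for fInd
summary(M1$gam)  # mgcv side: the smooth term s(dep)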
Here is the gamm() syntax to enter correlated random intercept and slope effects, using the sleepstudy dataset.
library(nlme)
library(mgcv)
data(sleepstudy,package='lme4')
# Model via lme()
fm1 <- lme(Reaction ~ Days, random= ~1+Days|Subject, data=sleepstudy, method='REML')
# Model via gamm()
fm1.gamm <- gamm(Reaction ~ Days, random= list(Subject=~1+Days), data=sleepstudy, method='REML')
VarCorr(fm1)
VarCorr(fm1.gamm$lme)
# Both are identical
# Subject = pdLogChol(1 + Days)
# Variance StdDev Corr
# (Intercept) 612.0795 24.740241 (Intr)
# Days 35.0713 5.922103 0.066
# Residual 654.9424 25.591843
The syntax to enter uncorrelated random intercept and slope effects is the same for lme() and gamm().
# Model via lme()
fm2 <- lme(Reaction ~ Days, random= list(Subject=~1, Subject=~0+Days), data=sleepstudy, method='REML')
# Model via gamm()
fm2.gamm <- gamm(Reaction ~ Days, random= list(Subject=~1, Subject=~0+Days), data=sleepstudy, method='REML')
VarCorr(fm2)
VarCorr(fm2.gamm$lme)
# Both are identical
# Variance StdDev
# Subject = pdLogChol(1)
# (Intercept) 627.5690 25.051328
# Subject = pdLogChol(0 + Days)
# Days 35.8582 5.988172
# Residual 653.5838 25.565285
This answer also shows how to enter multiple random effects into lme().
I have the following two models:
library(nlme)
fm2 <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)
fm2.lm <- lm(distance ~ age + Sex,data = Orthodont)
How can I obtain the standard errors of the coefficients for age and Sex in each model?
For fm2 (linear mixed model), you can do
sqrt(diag(summary(fm2)$varFix))
#(Intercept) age SexFemale
# 0.83392247 0.06160592 0.76141685
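Relatedly (a small sketch), nlme can also return confidence intervals built from these standard errors via intervals():
## Sketch: approximate confidence intervals for the fixed effects of the lme fit
intervals(fm2, which = "fixed")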
For fm2.lm (linear model), you can do
summary(fm2.lm)$coefficients[, "Std. Error"]
#(Intercept) age SexFemale
# 1.11220946 0.09775895 0.44488623
See attributes(summary(your.model)). What you're after is summary(your.model)$coefficients (or did I get your question wrong?). Just use subsetting with [] to get what you want.
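One caveat (a sketch added here): that works for the lm fit, but for the lme fit the summary stores the coefficient table in tTable rather than coefficients, so the standard errors can also be pulled from there:
## Sketch: standard errors from the coefficient table of the lme summary
summary(fm2)$tTable[, "Std.Error"]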