I would like to fit a nonlinear mixed model and then test differences between parameters in treatment and control groups.
I am using nlmer from the lme4 package.
I am using the Orange dataset as test data for this problem.
The circumferences of 5 trees are measured over time. Each tree exhibits logistic growth. In the basic example, we include Tree as a random effect.
I have extended the data so that there is a treatment and a control group (the treatment group is just a copy of the control group with the circumference values doubled).
My problem is, I'd like to have 'treat' as a fixed effect and then test the differences between the non-linear model parameter Asym in the treatment and control groups.
library(lme4)
# Toy data based on Orange (from the datasets package)
# Create a copy of Orange data, double the circumference values, make new labels for trees (no. 6-10) and label all as treatment (1)
Orange.with.treatment<-Orange
Orange.with.treatment$circumference<-Orange.with.treatment$circumference*2
Orange.with.treatment$Tree <- as.factor(as.numeric(Orange.with.treatment$Tree) + 5)
Orange.with.treatment$treat<- as.factor(rep(1,length(Orange$Tree)))
# Create a copy of Orange data and label all as control (0)
Orange.control<-Orange
Orange.control$treat<- as.factor(rep(0,length(Orange$Tree)))
# combine into one dataframe
Orange.full<-(rbind(Orange.control,Orange.with.treatment))
# a nlmer fit not considering treatment as a factor
startvec <- c(Asym = 200, xmid = 725, scal = 350)
(nm1 <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym|Tree,
Orange.full, start = startvec))
# a nlmer fit considering treatment as a fixed factor?
startvec <- c(Asym = 200, xmid = 725, scal = 350)
(nm2 <- nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym+treat|Tree,
Orange.full, start = startvec))
# test differences in parameters between treat and control?
I have tried adding treat alongside Asym in the formula, but I don't think that is correct.
What I would like is a summary of Asym in treat and control, and a way to statistically test the difference between them.
Since you seem to be open to using other tools, here is an nlme solution:
library(nlme)
mod <- nlme(circumference ~ SSlogis(age, Asym, xmid, scal), data = Orange.full,
fixed = Asym + xmid + scal ~ treat, random = Asym + xmid + scal ~ 1 | Tree,
start = c(200, 200, 725, 0, 350, 0), control = nlmeControl(msMaxIter = 1000))
summary(mod)
#Nonlinear mixed-effects model fit by maximum likelihood
# Model: circumference ~ SSlogis(age, Asym, xmid, scal)
# Data: Orange.full
# AIC BIC logLik
# 608.9452 638.1756 -291.4726
#
#Random effects:
# Formula: list(Asym ~ 1, xmid ~ 1, scal ~ 1)
# Level: Tree
# Structure: General positive-definite, Log-Cholesky parametrization
# StdDev Corr
#Asym.(Intercept) 43.23426 As.(I) xm.(I)
#xmid.(Intercept) 38.35359 -0.031
#scal.(Intercept) 32.49873 -0.968 0.279
#Residual 11.27260
#
#Fixed effects: Asym + xmid + scal ~ treat
# Value Std.Error DF t-value p-value
#Asym.(Intercept) 191.2135 22.30629 55 8.572177 0.0000
#Asym.treat1 193.0409 31.56922 55 6.114847 0.0000
#xmid.(Intercept) 722.4272 53.37976 55 13.533729 0.0000
#xmid.treat1 5.0466 62.02158 55 0.081368 0.9354
#scal.(Intercept) 349.4497 41.68009 55 8.384092 0.0000
#scal.treat1 7.3181 48.41709 55 0.151146 0.8804
#
#<snip>
As you see, this shows a significant treatment effect on the asymptote but not on the other parameters, as expected.
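To get a summary of the asymptote in each group rather than only the difference, the fixed-effect estimates can be combined by hand. The following is a minimal sketch, assuming the data are rebuilt as in the question and the model is refitted as above; the names Orange.treated, asym_control, asym_treat, se_control and se_treat are made up for illustration, and convergence of the fit may depend on the nlme version.

```r
library(nlme)

# Rebuild the toy data from the question
Orange.control <- Orange
Orange.control$treat <- as.factor(rep(0, nrow(Orange)))
Orange.treated <- Orange
Orange.treated$circumference <- Orange.treated$circumference * 2
Orange.treated$Tree <- as.factor(as.numeric(Orange.treated$Tree) + 5)
Orange.treated$treat <- as.factor(rep(1, nrow(Orange)))
Orange.full <- rbind(Orange.control, Orange.treated)

# Same model as in the answer above
mod <- nlme(circumference ~ SSlogis(age, Asym, xmid, scal), data = Orange.full,
            fixed = Asym + xmid + scal ~ treat,
            random = Asym + xmid + scal ~ 1 | Tree,
            start = c(200, 200, 725, 0, 350, 0),
            control = nlmeControl(msMaxIter = 1000))

b <- fixef(mod)   # fixed-effect estimates
V <- vcov(mod)    # their covariance matrix

# Control asymptote is the intercept; treatment asymptote is intercept + treat1
idx <- c("Asym.(Intercept)", "Asym.treat1")
asym_control <- b[["Asym.(Intercept)"]]
asym_treat <- sum(b[idx])

# SE of a sum of two estimates: sqrt(V11 + V22 + 2 * V12)
se_control <- sqrt(V["Asym.(Intercept)", "Asym.(Intercept)"])
se_treat <- sqrt(sum(V[idx, idx]))

data.frame(group = c("control", "treat"),
           Asym = c(asym_control, asym_treat),
           SE = c(se_control, se_treat))
```

The test of the difference itself is the Asym.treat1 row of summary(mod), since that coefficient is the treatment-minus-control difference in Asym.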
I have a linear mixed effect model with three-way interaction fitted by the following code:
m <- lmer(cog ~ PRS*poly(Age, 2, raw=TRUE)*Score
+ gender + Edu + fam + factor(Time)
+ (1|family/DBID),
data = test_all, REML = FALSE)
In this model, there is a three-way interaction between PRS, Score, and a second-degree polynomial in age (linear + quadratic). For this three-way interaction, how can I obtain the marginal effect (slope) of one variable, conditional on the other variables? For example, what is the slope of PRS when age = 50 and score = 1?
Second, I tried to use the following code to plot this three-way interaction:
library(ggeffects)
p <- ggpredict(m, terms = c("PRS [all]", "Age [60, 65, 70, 75, 80]", "Score [0, 0.321, 0.695, 1.492, 1.914, 3.252]"), ci.lvl = 0.95)
plot(p)
The interaction plot is drawn, but without confidence intervals. The error message is Error: Confidence intervals could not be computed.
How can I plot this three-way interaction with confidence interval?
You can use the marginaleffects package to do that (disclaimer: I am the maintainer):
library(marginaleffects)
library(lme4)
mod <- lmer(mpg ~ hp * am * vs + (1 | cyl), data = mtcars)
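# Note: in recent marginaleffects releases, marginaleffects() is named slopes() and typical() is named datagrid()
mfx <- marginaleffects(mod, newdata = typical(vs = 0, am = 1, cyl = 4))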
summary(mfx)
## Average marginal effects
## type Term Effect Std. Error z value Pr(>|z|) 2.5 % 97.5 %
## 1 response am 4.10167 2.13391 1.9221 0.0545884 -0.08072 8.28406
## 2 response hp -0.03724 0.01378 -2.7016 0.0069001 -0.06426 -0.01022
## 3 response vs -0.61237 3.52755 -0.1736 0.8621833 -7.52625 6.30151
##
## Model type: lmerMod
## Prediction type: response
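For a linear mixed model, the conditional slope can also be computed by hand from the fixed effects, which makes explicit what is being reported. The following is a minimal sketch on the same mtcars model, assuming the slope of hp is wanted at am = 1 and vs = 0; the names am0, vs0 and w are illustrative only. The slope is the hp coefficient plus every interaction coefficient involving hp, each multiplied by the chosen values of the other variables, and its standard error follows from vcov().

```r
library(lme4)

mod <- lmer(mpg ~ hp * am * vs + (1 | cyl), data = mtcars)

b <- fixef(mod)              # fixed-effect estimates
V <- as.matrix(vcov(mod))    # their covariance matrix

# Values of the other variables at which the slope of hp is evaluated
am0 <- 1
vs0 <- 0

# Weight vector: derivative of the linear predictor with respect to hp
w <- setNames(numeric(length(b)), names(b))
w[c("hp", "hp:am", "hp:vs", "hp:am:vs")] <- c(1, am0, vs0, am0 * vs0)

slope <- sum(w * b)
se <- sqrt(drop(t(w) %*% V %*% w))
c(slope = slope, se = se, lower = slope - 1.96 * se, upper = slope + 1.96 * se)
```

For the model in the question, the same idea applies to PRS, with weights built from Age, Age^2 and Score at the values of interest (for example Age = 50 and Score = 1).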
I am running the following line of code in R:
model = lme(divedepth ~ oarea, random=~1|deployid, data=GDataTimes, method="REML")
summary(model)
and I am seeing this result:
Linear mixed-effects model fit by REML
Data: GDataTimes
AIC BIC logLik
2512718 2512791 -1256352
Random effects:
Formula: ~1 | deployid
(Intercept) Residual
StdDev: 9.426598 63.50004
Fixed effects: divedepth ~ oarea
Value Std.Error DF t-value p-value
(Intercept) 25.549003 3.171766 225541 8.055135 0.0000
oarea2 12.619669 0.828729 225541 15.227734 0.0000
oarea3 1.095290 0.979873 225541 1.117787 0.2637
oarea4 0.852045 0.492100 225541 1.731447 0.0834
oarea5 2.441955 0.587300 225541 4.157933 0.0000
[snip]
Number of Observations: 225554
Number of Groups: 9
However, I cannot find the p-value for the random effect deployid. How can I see this value?
As stated in the comments, the GLMM FAQ discusses significance tests of random effects. You should definitely consider:
- why you are really interested in the p-value (it is occasionally of interest, but that is an unusual case)
- the fact that the likelihood ratio test is extremely conservative for testing variance parameters (in this case it gives a p-value that's 2x too large)
Here's an example that shows that the lme() fit and the corresponding lm() model without the random effect have commensurate log-likelihoods (i.e., they're computed in a comparable way) and can be compared with anova():
Load packages and simulate data (with zero random effect variance)
library(lme4)
library(nlme)
set.seed(101)
dd <- data.frame(x = rnorm(120), f = factor(rep(1:3, 40)))
dd$y <- simulate(~ x + (1|f),
newdata = dd,
newparams = list(beta = rep(1, 2),
theta = 0,
sigma = 1))[[1]]
Fit models (note that you cannot compare a model fitted with REML to a model without random effects).
m1 <- lme(y ~ x , random = ~ 1 | f, data = dd, method = "ML")
m0 <- lm(y ~ x, data = dd)
Test:
anova(m1, m0)
## Model df AIC BIC logLik Test L.Ratio p-value
## m1 1 4 328.4261 339.5761 -160.2131
## m0 2 3 326.4261 334.7886 -160.2131 1 vs 2 6.622332e-08 0.9998
Here the test correctly identifies that the two models are identical (the likelihood ratio statistic is essentially zero) and gives a p-value of essentially 1 (0.9998).
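The factor-of-two correction mentioned above can be applied by hand. The following is a minimal sketch, assuming a single variance parameter is being tested and using freshly simulated data with a modest group effect; the data and the names u, LR, p_naive and p_halved are illustrative, not taken from the example above.

```r
library(nlme)

set.seed(101)
# Simulated data: 10 groups of 12 observations with a modest group effect
dd <- data.frame(x = rnorm(120), f = factor(rep(1:10, each = 12)))
u <- rnorm(10, sd = 0.5)
dd$y <- 1 + dd$x + u[as.integer(dd$f)] + rnorm(120)

m1 <- lme(y ~ x, random = ~ 1 | f, data = dd, method = "ML")
m0 <- lm(y ~ x, data = dd)

# Likelihood ratio statistic (truncated at zero to absorb numerical noise)
LR <- max(0, 2 * (as.numeric(logLik(m1)) - as.numeric(logLik(m0))))

# Naive p-value from a chi-squared distribution with 1 df, as anova() reports
p_naive <- pchisq(LR, df = 1, lower.tail = FALSE)

# Halved p-value, reflecting that the variance is tested on its boundary;
# only meaningful when LR > 0 (for LR = 0 the p-value is simply 1)
p_halved <- p_naive / 2

c(LR = LR, p_naive = p_naive, p_halved = p_halved)
```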
If you use lme4::lmer instead of lme you have some other, more accurate (but slower) options for simulation-based tests (the RLRsim package, and PBmodcomp from the pbkrtest package): see the GLMM FAQ.
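To illustrate what a simulation-based test does, here is a hand-rolled parametric bootstrap of the likelihood ratio statistic on the sleepstudy data. It is only a sketch of the general idea, not the RLRsim or pbkrtest implementation; the helper lr_stat and the choice of nsim = 100 are assumptions made for this example.

```r
library(lme4)

fm1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy, REML = FALSE)
fm0 <- lm(Reaction ~ Days, sleepstudy)

# Likelihood ratio statistic between a full and a null fit (hypothetical helper)
lr_stat <- function(full, null) {
  2 * (as.numeric(logLik(full)) - as.numeric(logLik(null)))
}
obs <- lr_stat(fm1, fm0)

# Simulate responses from the null model (no random effect) and refit both models
set.seed(1)
nsim <- 100
sims <- simulate(fm0, nsim = nsim)
null_lr <- vapply(seq_len(nsim), function(i) {
  d <- transform(sleepstudy, Reaction = sims[[i]])
  f1 <- lmer(Reaction ~ Days + (1 | Subject), d, REML = FALSE)
  f0 <- lm(Reaction ~ Days, d)
  lr_stat(f1, f0)
}, numeric(1))

# Bootstrap p-value: share of simulated statistics at least as large as observed
p_boot <- (1 + sum(null_lr >= obs)) / (nsim + 1)
p_boot
```

Many of the refits will be singular (estimated variance of zero), which is expected when simulating under the null.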
I am trying to specify both a random intercept and random slope term in a GAMM model with one fixed effect.
I have successfully fitted a model with a random intercept using the code below with the mgcv library, but cannot determine the syntax for a random slope within the gamm() function:
M1 = gamm(dur ~ s(dep, bs="ts", k = 4), random= list(fInd = ~1), data= df)
If I was using both a random intercept and slope within a linear mixed-effects model I would write it in the following way:
M2 = lme(dur ~ dep, random=~1 + dep|fInd, data=df)
The gamm() supporting documentation states that the random terms need to be given in the list form as in lme() but I cannot find any interpretable examples that include both slope and intercept terms. Any advice / solutions would be much appreciated.
The gamm4() function in the gamm4 package provides a way to do this. You specify the random intercept and slope using lmer-style syntax. In your case:
M1 = gamm4(dur~s(dep,bs="ts",k=4), random = ~(1+dep|fInd), data=df)
Here is the gamm4 documentation:
https://cran.r-project.org/web/packages/gamm4/gamm4.pdf
Here is the gamm() syntax to enter correlated random intercept and slope effects, using the sleepstudy dataset.
library(nlme)
library(mgcv)
data(sleepstudy,package='lme4')
# Model via lme()
fm1 <- lme(Reaction ~ Days, random= ~1+Days|Subject, data=sleepstudy, method='REML')
# Model via gamm()
fm1.gamm <- gamm(Reaction ~ Days, random= list(Subject=~1+Days), data=sleepstudy, method='REML')
VarCorr(fm1)
VarCorr(fm1.gamm$lme)
# Both are identical
# Subject = pdLogChol(1 + Days)
# Variance StdDev Corr
# (Intercept) 612.0795 24.740241 (Intr)
# Days 35.0713 5.922103 0.066
# Residual 654.9424 25.591843
The syntax to enter uncorrelated random intercept and slope effects is the same for lme() and gamm().
# Model via lme()
fm2 <- lme(Reaction ~ Days, random= list(Subject=~1, Subject=~0+Days), data=sleepstudy, method='REML')
# Model via gamm()
fm2.gamm <- gamm(Reaction ~ Days, random= list(Subject=~1, Subject=~0+Days), data=sleepstudy, method='REML')
VarCorr(fm2)
VarCorr(fm2.gamm$lme)
# Both are identical
# Variance StdDev
# Subject = pdLogChol(1)
# (Intercept) 627.5690 25.051328
# Subject = pdLogChol(0 + Days)
# Days 35.8582 5.988172
# Residual 653.5838 25.565285
This answer also shows how to enter multiple random effects into lme().
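A quick way to check whether the intercept-slope correlation is needed is to compare the correlated and uncorrelated fits. The sketch below is shown for lme() only and writes the uncorrelated structure with pdDiag(), which is another nlme way of requesting a diagonal covariance matrix; the object names fm_corr and fm_diag are illustrative. Both models are fitted by REML with identical fixed effects, so the comparison is legitimate.

```r
library(nlme)
data(sleepstudy, package = "lme4")

# Correlated random intercept and slope
fm_corr <- lme(Reaction ~ Days, random = ~ 1 + Days | Subject,
               data = sleepstudy, method = "REML")

# Uncorrelated random intercept and slope via a diagonal covariance structure
fm_diag <- lme(Reaction ~ Days, random = list(Subject = pdDiag(~ 1 + Days)),
               data = sleepstudy, method = "REML")

# Likelihood ratio comparison of the two random-effects structures
anova(fm_diag, fm_corr)
```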
I can use gls() from the nlme package to build mod1 with no random effects.
I can then compare mod1 using AIC to mod2 built using lme() which does include a random effect.
mod1 = gls(response ~ fixed1 + fixed2, method="REML", data)
mod2 = lme(response ~ fixed1 + fixed2, random = ~1 | random1, method="REML",data)
AIC(mod1,mod2)
Is there something similar to gls() for the lme4 package which would allow me to build mod3 with no random effects and compare it to mod4 built using lmer() which does include a random effect?
mod3 = ???(response ~ fixed1 + fixed2, REML=T, data)
mod4 = lmer(response ~ fixed1 + fixed2 + (1|random1), REML=T, data)
AIC(mod3,mod4)
With modern (>1.0) versions of lme4 you can make a direct comparison between lmer fits and the corresponding lm model, but you have to use ML. It's hard to come up with a sensible analogue of the "REML criterion" for a model without random effects (because it would involve a linear transformation of the data that set all of the fixed effects to zero ...)
You should be aware that there are theoretical issues with information-theoretic comparisons between models with and without variance components: see the GLMM FAQ for more information.
library(lme4)
fm1 <- lmer(Reaction~Days+(1|Subject),sleepstudy, REML=FALSE)
fm0 <- lm(Reaction~Days,sleepstudy)
AIC(fm1,fm0)
## df AIC
## fm1 4 1802.079
## fm0 3 1906.293
I prefer output in this format (delta-AIC rather than raw AIC values):
bbmle::AICtab(fm1,fm0)
## dAIC df
## fm1 0.0 4
## fm0 104.2 3
To test, let's simulate data with no random effect (I had to try a couple of random-number seeds to get an example where the among-subject std dev was actually estimated as zero):
rr <- simulate(~Days+(1|Subject),
newparams=list(theta=0,beta=fixef(fm1),
sigma=sigma(fm1)),
newdata=sleepstudy,
family="gaussian",
seed=103)[[1]]
ss <- transform(sleepstudy,Reaction=rr)
fm1Z <- update(fm1,data=ss)
VarCorr(fm1Z)
## Groups Name Std.Dev.
## Subject (Intercept) 0.000
## Residual 29.241
fm0Z <- update(fm0,data=ss)
all.equal(c(logLik(fm0Z)),c(logLik(fm1Z))) ## TRUE
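While I agree with Ben that the simplest solution is to set REML=FALSE, the maximum restricted (REML) log-likelihood for a model without random effects is well defined and fairly straightforward to compute via the well-known relation between the ordinary profile likelihood function and the restricted likelihood. For a model without random effects, with $p$ fixed-effect coefficients, design matrix $X$ and residual variance $\sigma^2$, the relation reads
$$\ell_R(\sigma^2) = \ell(\hat\beta, \sigma^2) + \frac{p}{2}\log(2\pi) - \frac{1}{2}\left(-p\log\sigma^2 + \log\lvert X^\top X\rvert\right)$$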
The following code simulates data for which the estimated variance of the random intercept of an LMM ends up at 0, so that the maximum restricted log likelihood of the LMM should equal the restricted log likelihood of the model without any random effects.
The restricted likelihood of the LM is computed via the above formula and evaluates to the same value as that of the LMM.
An even simpler alternative is to use glmmTMB (see the last lines of the code):
library(lme4)
#> Loading required package: Matrix
# simulate some toy data for which the LMM ends up at the boundary
set.seed(5)
n <- 100 # the sample size
x <- rnorm(n)
y <- rnorm(n)
group <- factor(rep(1:10,10))
# fit the LMM via REML
mod1 <- lmer(y ~ x + (1|group), REML=TRUE, control=lmerControl(boundary.tol=1e-8))
#> boundary (singular) fit: see ?isSingular
logLik(mod1)
#> 'log Lik.' -147.8086 (df=4)
# fit a model without random effects and compute its maximum REML log likelihood
mod0 <- lm(y ~ x)
p <- length(coef(mod0)) # number of fixed effect parameters
X <- model.matrix(mod0) # the fixed effect design matrix
sigma.REML <- summary(mod0)$sigma # REML estimate of sigma
# the ordinary log likelihood evaluated at the REML estimates
logLik.lm.at.REML <- sum(dnorm(residuals(mod0), 0, sigma.REML, log=TRUE))
# the restricted log likelihood of the model without random effects (via above formula)
logLik.lm.at.REML + p/2*log(2*pi) - 1/2*(- p*log(sigma.REML^2) + determinant(crossprod(X))$modulus)
#> [1] -147.8086
#> attr(,"logarithm")
#> [1] TRUE
library(glmmTMB)
data <- data.frame(y,x,group)
logLik(glmmTMB(y~x, family = gaussian(), data=data, REML=TRUE))
#> 'log Lik.' -147.8086 (df=3)
logLik(glmmTMB(y~x+(1|group), family = gaussian(), data=data, REML=TRUE))
#> 'log Lik.' -147.8086 (df=4)
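The same quantity can be wrapped in a small helper. The following is a sketch under the assumption of an ordinary lm fit with full-rank design matrix; reml_loglik_lm is a hypothetical name, and the closed form simply substitutes the REML variance estimate RSS/(n - p) into the restricted log likelihood.

```r
# Maximum restricted (REML) log likelihood of a linear model without random effects
reml_loglik_lm <- function(fit) {
  X <- model.matrix(fit)
  n <- nrow(X)
  p <- ncol(X)
  rss <- sum(residuals(fit)^2)
  sigma2 <- rss / (n - p)   # REML estimate of the residual variance
  logdet <- as.numeric(determinant(crossprod(X), logarithm = TRUE)$modulus)
  # -(n-p)/2 * log(2*pi*sigma2) - RSS/(2*sigma2) - 1/2 * log|X'X|,
  # where RSS/(2*sigma2) reduces to (n-p)/2 at the REML estimate
  -(n - p) / 2 * log(2 * pi * sigma2) - (n - p) / 2 - logdet / 2
}

# Same toy data as above
set.seed(5)
n <- 100
x <- rnorm(n)
y <- rnorm(n)
mod0 <- lm(y ~ x)
reml_loglik_lm(mod0)
```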
I have the following linear models
library(nlme)
fm2 <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)
fm2.lm <- lm(distance ~ age + Sex,data = Orthodont)
How can I obtain the standard errors of the coefficients for age and Sex?
For fm2 (linear mixed model), you can do
sqrt(diag(summary(fm2)$varFix))
#(Intercept) age SexFemale
# 0.83392247 0.06160592 0.76141685
For fm2.lm (linear model), you can do
summary(fm2.lm)$coefficients[, "Std. Error"]
#(Intercept) age SexFemale
# 1.11220946 0.09775895 0.44488623
See attributes(summary(your.model)). What you're after is summary(your.model)$coefficients (or did I get your question wrong?). Just use subsetting with [] to get what you want.
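A single idiom that works for both model classes is to take the square root of the diagonal of vcov(). A minimal sketch with the two models from the question (the names se_lme and se_lm are illustrative):

```r
library(nlme)

fm2 <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)
fm2.lm <- lm(distance ~ age + Sex, data = Orthodont)

# vcov() returns the covariance matrix of the fixed-effect / coefficient estimates
se_lme <- sqrt(diag(vcov(fm2)))
se_lm <- sqrt(diag(vcov(fm2.lm)))

se_lme
se_lm
```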