"gls function fits a linear model using generalized least squares. The errors are allowed to be correlated and/or have unequal variances."
Example
# NOT RUN {
# AR(1) errors within each Mare
fm1 <- gls(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), Ovary,
           correlation = corAR1(form = ~ 1 | Mare))
# variance increases as a power of the absolute fitted values
fm2 <- update(fm1, weights = varPower())
# }
I got all the above information from https://www.rdocumentation.org/packages/nlme/versions/3.1-137/topics/gls
In the example, they used a nonlinear model "follicles ~ sin(2*pi*Time) + cos(2*pi*Time)". My question is: why did they use the gls function to fit this nonlinear model? Any ideas, please!
Thank you in advance
Related
Aside from the R function nlme::lme(), how else can I model the Level-1 residual variance-covariance structure?
P.S. My search suggested I could possibly use the glmmTMB package, but it seems to be about the random effects themselves rather than the Level-1 residuals (see the code below).
glmmTMB::glmmTMB(y ~ times + ar1(times | subjects), data = data)  ## DON'T RUN
nlme::lme(y ~ times, random = ~ times | subjects,
          correlation = corAR1(), data = data)  ## DON'T RUN
glmmTMB can effectively be used to model level-1 residuals by adding an observation-level random effect to the model (and, if necessary, suppressing the level-1 variance via dispformula = ~0). For example, comparing the same fit in lme and glmmTMB:
library(glmmTMB)
library(nlme)
data("sleepstudy" ,package="lme4")
ss <- sleepstudy
ss$times <- factor(ss$Days) ## needed for glmmTMB
I initially tried with random = ~Days|Subject but neither lme nor glmmTMB were happy (overfitted):
lme1 <- lme(Reaction ~ Days, random = ~1|Subject,
            correlation = corAR1(form = ~Days|Subject), data = ss)
m1 <- glmmTMB(Reaction ~ Days + (1|Subject) +
                ar1(times + 0 | Subject),
              dispformula = ~0,
              data = ss,
              REML = TRUE,
              start = list(theta = c(4, 4, 1)))
Unfortunately, in order to get a good answer with glmmTMB I did have to tweak the starting values ...
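A quick sanity check that the two parameterizations land in the same place (a sketch; the exact numbers depend on the fits above having converged):
## compare variance components and log-likelihoods of the two fits
VarCorr(lme1)  # nlme: AR(1) residual correlation within Subject
VarCorr(m1)    # glmmTMB: AR(1) observation-level random effect
logLik(lme1)
logLik(m1)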
The random effects and the variance-covariance matrix of the random effects are extracted with the lme4 package as follows:
library(lme4)
fm1 <- lmer(Reaction ~ Days + (1|Subject), sleepstudy)
fm1.rr <- ranef(fm1,condVar=TRUE)
fm1.pv <- attr(fm1.rr[[1]], "postVar")
I wonder how I can do this with mgcv?
The gam.vcomp() function does extract the estimated variance components, but not for each level of the random effect.
library(mgcv)
fm2 <- gam(Reaction ~ Days + s(Subject, bs = "re"), data = sleepstudy, method = "REML")
gam.vcomp(fm2)
library(lme4)
data(sleepstudy)
fm1 <- lmer(Reaction ~ Days + (1|Subject), sleepstudy)
fm1.rr <- ranef(fm1, condVar = TRUE)$Subject[,1]
fm1.pv <- sqrt(attr(ranef(fm1, condVar = TRUE)[['Subject']], "postVar")[1,1,])
library(mgcv)
fm2 <- gam(Reaction ~ Days + s(Subject, bs = "re"),
           data = sleepstudy, method = "REML")
To extract the random effect for each Subject:
idx <- grep("Subject", names(coef(fm2)))
fm2.rr <- coef(fm2)[idx]
attributes(fm2.rr) <- NULL
We can see that random effects in both models are identical as expected.
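A quick numerical check of that claim (small differences at machine precision are possible):
all.equal(fm1.rr, fm2.rr)
head(cbind(lme4 = fm1.rr, mgcv = fm2.rr))  # side-by-side comparison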
To extract the variance-covariance matrix of the random effects and calculate their standard errors, we use the Vp component, which is a Bayesian posterior covariance matrix:
fm2.pv <- sqrt(diag(fm2$Vp))[idx]
Or the frequentist estimated covariance matrix Ve:
fm2.pv <- sqrt(diag(fm2$Ve))[idx]
We can see that the random-effect standard errors estimated with mgcv differ slightly from those estimated with the lme4 model: errors based on the Bayesian posterior covariance matrix are larger, whereas those based on the frequentist matrix are smaller.
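For a side-by-side view of the three sets of standard errors (a small addition using the objects created above; the column names are just labels):
cbind(lme4    = fm1.pv,
      mgcv_Vp = sqrt(diag(fm2$Vp))[idx],
      mgcv_Ve = sqrt(diag(fm2$Ve))[idx])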
You can also use the gamm4 package, which is based on mgcv's gamm() but uses lme4 underneath. The model would be fitted as:
library(gamm4)
fm3 <- gamm4(Reaction ~ Days, random = ~ (1|Subject), data = sleepstudy)
Random effects and variance-covariance matrix of random effects can be obtained following the normal lme4 procedure.
fm3.rr <- ranef(fm3$mer,condVar=TRUE)
fm3.pv <- attr(fm3.rr[[1]],"postVar")[1,1,]
However, gamm4 can be much slower than gam, so read the help file to see when it best suits your needs.
I am trying to specify both a random intercept and random slope term in a GAMM model with one fixed effect.
I have successfully fitted a model with a random intercept using the code below within the mgcv library, but cannot work out the syntax for a random slope within the gamm() function:
M1 = gamm(dur ~ s(dep, bs="ts", k = 4), random= list(fInd = ~1), data= df)
If I was using both a random intercept and slope within a linear mixed-effects model I would write it in the following way:
M2 = lme(dur ~ dep, random=~1 + dep|fInd, data=df)
The gamm() supporting documentation states that the random terms need to be given in the list form as in lme() but I cannot find any interpretable examples that include both slope and intercept terms. Any advice / solutions would be much appreciated.
The gamm4 function in the gamm4 package provides a way to do this. You specify the random intercept and slope in the same way as in lmer. In your case:
library(gamm4)
M1 = gamm4(dur ~ s(dep, bs = "ts", k = 4), random = ~ (1 + dep | fInd), data = df)
Here is the gamm4 documentation:
https://cran.r-project.org/web/packages/gamm4/gamm4.pdf
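To inspect the estimated variance components of the random intercept and slope from that fit, you can work with the lme4 object stored inside the gamm4 result (a sketch, assuming the model object M1 from above):
VarCorr(M1$mer)  # variance components from the underlying lme4 fit
summary(M1$gam)  # summary of the smooth terms on the mgcv side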
Here is the gamm() syntax to enter correlated random intercept and slope effects, using the sleepstudy dataset.
library(nlme)
library(mgcv)
data(sleepstudy,package='lme4')
# Model via lme()
fm1 <- lme(Reaction ~ Days, random= ~1+Days|Subject, data=sleepstudy, method='REML')
# Model via gamm()
fm1.gamm <- gamm(Reaction ~ Days, random= list(Subject=~1+Days), data=sleepstudy, method='REML')
VarCorr(fm1)
VarCorr(fm1.gamm$lme)
# Both are identical
# Subject = pdLogChol(1 + Days)
# Variance StdDev Corr
# (Intercept) 612.0795 24.740241 (Intr)
# Days 35.0713 5.922103 0.066
# Residual 654.9424 25.591843
The syntax to enter uncorrelated random intercept and slope effects is the same for lme() and gamm().
# Model via lme()
fm2 <- lme(Reaction ~ Days, random= list(Subject=~1, Subject=~0+Days), data=sleepstudy, method='REML')
# Model via gamm()
fm2.gamm <- gamm(Reaction ~ Days, random= list(Subject=~1, Subject=~0+Days), data=sleepstudy, method='REML')
VarCorr(fm2)
VarCorr(fm2.gamm$lme)
# Both are identical
# Variance StdDev
# Subject = pdLogChol(1)
# (Intercept) 627.5690 25.051328
# Subject = pdLogChol(0 + Days)
# Days 35.8582 5.988172
# Residual 653.5838 25.565285
This answer also shows how to enter multiple random effects into lme().
I can use gls() from the nlme package to build mod1 with no random effects.
I can then compare mod1 using AIC to mod2 built using lme() which does include a random effect.
mod1 = gls(response ~ fixed1 + fixed2, method="REML", data)
mod2 = lme(response ~ fixed1 + fixed2, random = ~1 | random1, method="REML",data)
AIC(mod1,mod2)
Is there something similar to gls() for the lme4 package which would allow me to build mod3 with no random effects and compare it to mod4 built using lmer() which does include a random effect?
mod3 = ???(response ~ fixed1 + fixed2, REML=T, data)
mod4 = lmer(response ~ fixed1 + fixed2 + (1|random1), REML=T, data)
AIC(mod3,mod4)
With modern (>1.0) versions of lme4 you can make a direct comparison between lmer fits and the corresponding lm model, but you have to use ML: it's hard to come up with a sensible analogue of the "REML criterion" for a model without random effects (because it would involve a linear transformation of the data that sets all of the fixed effects to zero ...)
You should be aware that there are theoretical issues with information-theoretic comparisons between models with and without variance components: see the GLMM FAQ for more information.
library(lme4)
fm1 <- lmer(Reaction~Days+(1|Subject),sleepstudy, REML=FALSE)
fm0 <- lm(Reaction~Days,sleepstudy)
AIC(fm1,fm0)
## df AIC
## fm1 4 1802.079
## fm0 3 1906.293
I prefer output in this format (delta-AIC rather than raw AIC values):
bbmle::AICtab(fm1,fm0)
## dAIC df
## fm1 0.0 4
## fm0 104.2 3
To test, let's simulate data with no random effect (I had to try a couple of random-number seeds to get an example where the among-subject std dev was actually estimated as zero):
rr <- simulate(~Days + (1|Subject),
               newparams = list(theta = 0, beta = fixef(fm1),
                                sigma = sigma(fm1)),
               newdata = sleepstudy,
               family = "gaussian",
               seed = 103)[[1]]
ss <- transform(sleepstudy,Reaction=rr)
fm1Z <- update(fm1,data=ss)
VarCorr(fm1Z)
## Groups Name Std.Dev.
## Subject (Intercept) 0.000
## Residual 29.241
fm0Z <- update(fm0,data=ss)
all.equal(c(logLik(fm0Z)),c(logLik(fm1Z))) ## TRUE
While I agree with Ben that the simplest solution is to set REML=FALSE, the maximum REML likelihood for a model without random effects is well defined and fairly straightforward to compute via the well-known relation between the ordinary profile likelihood function and the restricted likelihood.
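In formula form, this is the relation the code below uses: for a linear model with $p$ fixed-effect parameters, fixed-effect design matrix $X$, and REML estimate $\hat\sigma^2$,
$$\ell_R(\hat\sigma^2) = \ell(\hat\beta,\hat\sigma^2) + \frac{p}{2}\log(2\pi) - \frac{1}{2}\left(\log\lvert X^\top X\rvert - p\log\hat\sigma^2\right),$$
where $\ell(\hat\beta,\hat\sigma^2)$ is the ordinary log likelihood evaluated at the REML estimates.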
The following code simulates data for which the estimated variance of the random intercept of an LMM ends up at 0, so that the maximum restricted log likelihood of the LMM should equal the restricted likelihood of the model without any random effects.
The restricted likelihood of the LM is computed via the above formula and evaluates to the same value as that of the LMM.
An even simpler alternative is to use glmmTMB, as shown at the end of the code below:
library(lme4)
#> Loading required package: Matrix
# simulate some toy data for which the LMM ends up at the boundary
set.seed(5)
n <- 100 # the sample size
x <- rnorm(n)
y <- rnorm(n)
group <- factor(rep(1:10,10))
# fit the LMM via REML
mod1 <- lmer(y ~ x + (1|group), REML=TRUE, control=lmerControl(boundary.tol=1e-8))
#> boundary (singular) fit: see ?isSingular
logLik(mod1)
#> 'log Lik.' -147.8086 (df=4)
# fit a model without random effects and compute its maximum REML log likelihood
mod0 <- lm(y ~ x)
p <- length(coef(mod0)) # number of fixed effect parameters
X <- model.matrix(mod0) # the fixed effect design matrix
sigma.REML <- summary(mod0)$sigma # REMLE of sigma
# the maximum ordinary log likelihood evaluated at the REML estimates
logLik.lm.at.REML <- sum(dnorm(residuals(mod0), 0, sigma.REML, log=TRUE))
# the restricted log likelihood of the model without random effects (via above formula)
logLik.lm.at.REML + p/2*log(2*pi) - 1/2*(- p*log(sigma.REML^2) + determinant(crossprod(X))$modulus)
#> [1] -147.8086
#> attr(,"logarithm")
#> [1] TRUE
library(glmmTMB)
data <- data.frame(y,x,group)
logLik(glmmTMB(y~x, family = gaussian(), data=data, REML=TRUE))
#> 'log Lik.' -147.8086 (df=3)
logLik(glmmTMB(y~x+(1|group), family = gaussian(), data=data, REML=TRUE))
#> 'log Lik.' -147.8086 (df=4)
Still quite new to R (and statistics, to be honest), I have so far only used it for simple linear regression models. But now one of my data sets clearly shows an inverted-U pattern. I think I have to do a quadratic regression analysis on these data, but I'm not sure how. What I have tried so far is:
independentvar2 <- independentvar^2
regression <- lm(dependentvar ~ independentvar + independentvar2)
summary(regression)
plot(independentvar, dependentvar)
abline(regression)
While this would work for a normal linear regression, it doesn't work for non-linear regressions. Can I even use the lm function, since I thought lm meant linear model?
Thanks
Bert
This example is from this SO post by @Tom Liptrot.
plot(speed ~ dist, data = cars)
fit1 = lm(speed ~ dist, cars) #fits a linear model
plot(speed ~ dist, data = cars)
abline(fit1) #puts line on plot
fit2 = lm(speed ~ I(dist^2) + dist, cars) #fits a model with a quadratic term
fit2line = predict(fit2, data.frame(dist = -10:130))
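To actually draw the fitted quadratic curve on the scatterplot (abline() can only draw straight lines, so the predictions computed above are added with lines()):
lines(-10:130, fit2line)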