I would like to estimate a random effect for Subject within each Day. In the data set, not every Subject is observed on every Day, so I should not get an estimated effect for each Subject on each Day. I want to estimate a separate variance parameter (10 in total, one for each Day) and a separate distribution of Subject effects within each Day (independent between time steps), in order to evaluate how the among-Subject variability changes over time. Is this possible with lmer?
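library(lme4)
library(dplyr)  # provides sample_n()
data(sleepstudy)
set.seed(1)
# resample rows with replacement so that Subject-by-Day cells are unbalanced
sleep = sample_n(sleepstudy, size=500, replace=TRUE)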
sleep$Days = as.factor(sleep$Days)
table(sleep$Days, sleep$Subject)
fm1 <- lmer(Reaction ~ Days + (1|Days/Subject), sleep)
summary(fm1)
ranef(fm1)
This sounds like you would benefit more from using a Latent Growth Curve model as opposed to a Mixed Model. I believe the growth function in lavaan would best suit you.
I am trying to run a mixed effects model that uses time as a fixed effect and some covariate variables.
The repeatedly measured mean depression score (depr) is the dependent variable. We measured the outcome at 6 time points.
library(nlme)
lme(depr ~ time, random = ~ 1 | ID, data=longirisk, method = "ML")
Now, I am interested in whether the values of a third variable (here: stress) contribute to the mean depression scores. I could add an interaction term (time * stress) or a fixed effect (stress). However, I am not sure how to handle the fact that stress was also measured repeatedly.
Any ideas on how to solve this problem?
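For illustration only, the two specifications mentioned in the question could be written as below. This is a rough sketch on simulated data, not a recommendation: the data frame contents, the effect sizes, and the object names m_main and m_int are all made up, and stress is assumed to be recorded at every time point alongside depr.

```r
library(nlme)

set.seed(1)
n_id   <- 50   # hypothetical number of participants
n_time <- 6    # six measurement occasions, as in the question

# Simulated stand-in for longirisk (long format, one row per ID and time point)
longirisk <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_time)),
  time = rep(0:(n_time - 1), times = n_id)
)
longirisk$stress <- rnorm(nrow(longirisk))   # time-varying covariate
u <- rnorm(n_id, sd = 1)                     # person-specific intercepts
longirisk$depr <- 10 + 0.3 * longirisk$time + 0.5 * longirisk$stress +
  u[as.integer(longirisk$ID)] + rnorm(nrow(longirisk))

# Option 1: stress as an additional fixed effect
m_main <- lme(depr ~ time + stress, random = ~ 1 | ID,
              data = longirisk, method = "ML")

# Option 2: stress interacting with time
m_int <- lme(depr ~ time * stress, random = ~ 1 | ID,
             data = longirisk, method = "ML")

# Likelihood-ratio comparison of the two ML fits
anova(m_main, m_int)
```

Because both models are fitted with method = "ML", the likelihood-ratio comparison of their fixed-effects structures is valid.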
We are currently working with plant phenology.
We built a linear mixed model for each species present in the study area.
We set days from snowmelt (the number of days from snowmelt to the visit day during the summer) as the response variable, with mean phenology as the predictor. Mean phenology is the mean phenological state of each plot (there are 3 per locality), calculated from the phenological states of the 12 subplots into which each plot is divided. States range from 1 to 6; the higher the number, the more advanced the cycle. Year and plot nested within locality are set as random factors.
Once the model is built and revised, we want to predict, for each species, the days from snowmelt needed to reach the phenological phases of interest, which correspond to mean values of 2, 3, 4, and 5 (vegetative, flowering, fruit development, and dispersion, respectively).
I have tried the function predict(), but I get no heterogeneity between phases for each species; the progression seems to be linear (as shown in the image file).
Could this be just because it is a linear model, so it will only give linear responses? Are there any other ways to get predictions from these kinds of models and show their CI?
How can I get predictions with CI from lmerTest models?
I think you probably mean prediction intervals. You can use the predictInterval function in the merTools package. For example:
library(lmerTest); library(merTools)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), data = sleepstudy)
head(predictInterval(fm1, level = 0.95, seed = 123, n.sims = 100))
Could this be just because it is a linear model, so it will only give linear responses?
Yes! If you fit a linear model, then the predictions will be linear. Of course, you can model nonlinearity with a linear model in several ways, including transformations, nonlinear terms (the model is still linear in the parameters), and splines.
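As a minimal sketch of the "nonlinear terms" option, here is a quadratic term added to the sleepstudy model used above. The phenology data are not available here, so this only illustrates the idea; the quadratic term and the prediction grid newdat are illustrative choices.

```r
library(lme4)

# Quadratic in Days: nonlinear in the predictor, still linear in the parameters
fm_quad <- lmer(Reaction ~ Days + I(Days^2) + (Days | Subject),
                data = sleepstudy)

# Population-level predictions (random effects set to zero) on a grid of Days
newdat <- data.frame(Days = 0:9)
newdat$pred <- predict(fm_quad, newdata = newdat, re.form = NA)

# Successive differences are no longer constant, so the fitted curve bends
diff(newdat$pred)
```

The same pattern applies to spline terms: only the fixed-effects formula changes, and predict() is still called on a grid of predictor values.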
I am using zero-inflated Poisson (zip) and zero-inflated negative binomial (zinb) regressions to detect temporal trends in count data (deaths per year for 30 years, reported at 6 hospitals) that have many zeros and overdispersion.
I have written some code using the pscl package, and my goal is to compare trends among hospitals.
Counts<- read.csv("data.csv", header = T)
Years= Counts$X
Ho1= Counts$Ho1
Ho2= Counts$Ho2
Ho3= Counts$Ho3
... .........
... ..........
require(pscl)
zip1 <- zeroinfl(Ho1 ~ Years, dist = "poisson")
zinb4 <- zeroinfl(Ho4 ~ Years, dist = "negbin")
But when I plot some of the data, it shows slightly increasing trends, whereas the zip and zinb models show negative trends.
Here is an example:
zip result:
zip1
Call:
zeroinfl(formula = Ho1 ~ Years, dist = "poisson")
Count model coefficients (poisson with log link):
(Intercept) Years
-4.836815 0.002837
Zero-inflation model coefficients (binomial with logit link):
(Intercept) Years
467.2323 -0.2353
For this model the trend (slope) is -0.235, and when I used ordinary least squares (OLS) the trend is 0.043.
My understanding is that zip and OLS should differ only slightly.
So I was thinking that maybe my code is not correct or I am missing something.
I would appreciate any thoughts and suggestions.
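Note that -0.2353 is the slope of the zero-inflation component (the log-odds of an excess zero), not the slope of the count component, which is 0.002837. With increasing Years you get increasing counts (= higher responses and fewer zeros) and you get decreasing zero inflation (= higher responses and fewer zeros). Thus, the effects in both components of the model appear to be in sync and conform with your OLS results.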
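To see how the two components combine, one can compute the overall expected count, (1 - p0) * mu, from the printed coefficients. This is only a sketch: the coefficients are the rounded values from the output above, and the year range 1980:2009 is a made-up assumption because the actual values of Years are not shown in the question.

```r
# Coefficients copied (rounded) from the printed zip1 output
count_coef <- c(intercept = -4.836815, slope = 0.002837)
zero_coef  <- c(intercept = 467.2323,  slope = -0.2353)

# Hypothetical 30-year span; the real Years are not given in the question
years <- 1980:2009

# Mean of the Poisson count component (log link)
mu <- exp(count_coef[["intercept"]] + count_coef[["slope"]] * years)

# Probability of an excess zero (logit link)
p0 <- plogis(zero_coef[["intercept"]] + zero_coef[["slope"]] * years)

# Overall expected count of the zero-inflated model
e_y <- (1 - p0) * mu

round(cbind(years, mu, p0, e_y)[c(1, 15, 30), ], 3)
```

Under these assumptions the excess-zero probability falls and the count mean rises with Years, so the overall expected count increases, which has the same sign as the OLS trend.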
I am analysing some whale tourism data and am trying to construct linear mixed effect models in the nlme package to see if any of my explanatory variables affect encounter time between whales and tourists. (I am also open to running this model in lme4.)
My variables are:
mins: encounter time (response variable)
Id: individual whale ID (random effect)
Vessel: vessel Id (random effect)
Sex: sex of the animal
Length: length of the animal
Year
Month (nested within Year).
So my random variables are Id and Vessel and I also have Year and Month as nested random effects.
I have come up with the following:
form1 <- formula(Min ~ length + Sex+ Encounter)
model1 <- lme(form1,
random = list(Id = ~1,
Vessel = ~1,
Year=~1,
Month = ~1), data=wsdata, method="ML")
But all my random effects become nested within Id.
Is there any way I can define Id and Vessel as separate random effects and Year and Month as nested random effects?
In general it's much easier to specify crossed (what you mean by "separate", I think) random effects in lme4, so unless you need models for temporal or spatial autocorrelation or heteroscedasticity (which are still easier to achieve with nlme), I would go ahead with
library(lme4)
fit <- lmer(mins ~ Length + Sex+ (1|Id) + (1|Vessel) +
(1|Year/Month), data=wsdata, REML=FALSE)
A few other comments:
What is Encounter? It was in your formula but not in your description of the data set.
It seems quite likely that encounter times (the durations of encounters?) are skewed, in which case you might want to log-transform them.
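A rough sketch of the log-transformed version of the model, checked on simulated data since wsdata itself is not available. Every value in the simulated data frame (numbers of whales, vessels, years, months, and all effect sizes) is made up for illustration; only the model formula follows the suggestion above.

```r
library(lme4)

set.seed(1)
n <- 600
# Simulated stand-in for wsdata; all values and effect sizes are hypothetical
wsdata <- data.frame(
  Id     = factor(sample(1:40, n, replace = TRUE)),
  Vessel = factor(sample(1:6, n, replace = TRUE)),
  Year   = factor(sample(2010:2014, n, replace = TRUE)),
  Month  = factor(sample(1:6, n, replace = TRUE)),
  Sex    = factor(sample(c("F", "M"), n, replace = TRUE)),
  Length = runif(n, 4, 9)
)
id_eff     <- rnorm(40, sd = 0.4)
vessel_eff <- rnorm(6, sd = 0.3)
# Right-skewed (log-normal) encounter times
wsdata$mins <- exp(3 + 0.05 * wsdata$Length +
                     id_eff[as.integer(wsdata$Id)] +
                     vessel_eff[as.integer(wsdata$Vessel)] +
                     rnorm(n, sd = 0.5))

# Crossed Id and Vessel, Month nested within Year, log-transformed response
# (no Year or Month effects were simulated, so a singular-fit message is possible)
fit_log <- lmer(log(mins) ~ Length + Sex + (1 | Id) + (1 | Vessel) +
                  (1 | Year/Month), data = wsdata, REML = FALSE)

# Number of levels per grouping factor: Id and Vessel appear as separate
# (crossed) factors, and Year/Month expands to Year plus Month:Year
ngrps(fit_log)
```

Checking ngrps() is a quick way to confirm that the grouping structure is the one intended, rather than everything being nested within Id.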
My aim is to forecast the daily number of registrations in two different channels.
Weekly seasonality is quite strong, especially at the weekends, and I also observe annual effects. Moreover, I have a few special event days, which differ significantly from the other days.
First, I applied a TBATS model on these two channels.
library(forecast)
# daily series with weekly and annual seasonal periods
x.msts <- msts(Channel1_reg,seasonal.periods=c(7,365.25))
# fit model
fit <- tbats(x.msts)
fit
plot(fit)
forecast_channel1 <- forecast(fit,h=30)
First channel:
TBATS(0, {2,3}, -, {<7,3>, <365.25,2>})
Call: tbats(y = x.msts)
Parameters
Lambda: 0
Alpha: 0.0001804516
Gamma-1 Values: -1.517954e-05 1.004701e-05
Gamma-2 Values: -3.059654e-06 -2.796211e-05
AR coefficients: 0.249944 0.544593
MA coefficients: 0.215696 -0.361379 -0.21082
Second channel:
BATS(0, {2,2}, 0.929, -)
Call: tbats(y = y.msts)
Parameters
Lambda: 0
Alpha: 0.1652762
Beta: -0.008057904
Damping Parameter: 0.928972
AR coefficients: -0.586163 -0.676921
MA coefficients: 0.924758 0.743675
If I forecast the second channel, I only get blank values instead of any forecasts.
Could you please explain why that is?
Do you have any suggestions on how to build the specific event days into this model?
Thank you all!
tbats and bats are occasionally unstable, and your second model is showing infinite forecasts. There are already some bug reports about similar issues.
In any case, as you want to use event information, you would be better building a harmonic regression model with ARMA errors.
For example, suppose your event information is recorded as a dummy variable event1. Then the model can be fitted as follows:
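harmonics <- fourier(x.msts, K=c(2,2))
fit1 <- auto.arima(x.msts, lambda=0,
xreg=cbind(harmonics,event1), seasonal=FALSE)
# fourier(..., h=200) supersedes the deprecated fourierf();
# the event column is named so that it matches the training xreg
f1 <- forecast(fit1,
xreg=cbind(fourier(x.msts, K=c(2,2), h=200), event1=rep(0,200)))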
This assumes that the event will not occur in the next 200 days (hence the 200 0s). I have used harmonics of order 2 for both weeks and years. Adjust these to minimize the AICc of the model.
This model is actually very similar to the TBATS model you are fitting except that the lambda value has been specified rather than estimated, and the seasonality is fixed over time rather than being allowed to evolve. The advantage is that the harmonic regression model tends to be more stable, and allows covariates to be included.
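One way to adjust the harmonic orders is a small grid search over K that keeps the fit with the lowest AICc. The sketch below runs on simulated data, not the registration series: the series y, the dummy event1, the grid 1:2 for each period, and the list best are all illustrative assumptions.

```r
library(forecast)

set.seed(42)
n <- 2 * 365
t <- seq_len(n)

# Simulated daily counts with weekly and annual patterns (made-up values)
event1 <- rbinom(n, 1, 0.02)          # hypothetical event dummy
y <- exp(3 + 0.3 * sin(2 * pi * t / 7) +
           0.2 * cos(2 * pi * t / 365.25) +
           0.5 * event1 + rnorm(n, sd = 0.1))
x.msts <- msts(y, seasonal.periods = c(7, 365.25))

# Grid search over the number of harmonics for each seasonal period
best <- list(aicc = Inf)
for (k1 in 1:2) {
  for (k2 in 1:2) {
    xreg <- cbind(fourier(x.msts, K = c(k1, k2)), event1 = event1)
    fit <- auto.arima(x.msts, lambda = 0, xreg = xreg, seasonal = FALSE)
    if (fit$aicc < best$aicc) {
      best <- list(aicc = fit$aicc, K = c(k1, k2), fit = fit)
    }
  }
}

best$K     # harmonic orders with the lowest AICc on this grid
best$aicc
```

For weekly data K cannot exceed 3 (half the period), and a wider grid for the annual period costs one auto.arima call per combination.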