Fitting random factors for a linear model using lme4 - r

I have 4 random factors and I want to fit a linear mixed model for them using lme4, but I am struggling to fit the model.
Assume A is nested within B (2 levels), which in turn is nested within each of xx preceptors (P). All of them responded to xx questions (M).
I want to fit the model so that I get variance estimates for each factor and their interactions.
I used the following code to fit the model, but I was unsuccessful:
lme4::lmer(value ~ A +
             (1 + A | B) +
             (1 + P | A),
           (1 + P | M),
           data = myData, na.action = na.exclude)
I have also read some interesting material here, but I still struggle to fit the model. Any help?

At a guess, if the nesting structure is ( P (teachers) / B (occasions) / A (participants) ), meaning that the occasions for one teacher are assumed to be completely independent of the occasions for any other teacher, and that participants in turn are never shared across occasions or teachers, but questions (M) are shared across all teachers and occasions and participants:
value ~ 1 + (1| P / B / A) + (1|M)
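A minimal sketch of the full call, reusing the data-frame name and na.action from the question's own code:
m <- lme4::lmer(value ~ 1 + (1 | P / B / A) + (1 | M),
                data = myData, na.action = na.exclude)
summary(m)  # variance components for P, B:P, A:B:P, M, and the residual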
Some potential issues:
as you hint in the comments, it may not be practical to fit random effects for factors with small numbers of levels (say, < 5); this is likely to lead to the dreaded "singular model" message (see the GLMM FAQ for more detail).
if all of the questions (M) are answered by every participant, then in principle it's possible to fit a model that takes account of the among-question correlation within participants: the maximal model would be ~ 1 + (M | P / B / A) (which would look for among-question correlations at the level of teacher, occasion within teacher, and participant within occasion within teacher). However, this is very unlikely to work in practice (especially if each participant answers each question only once, in which case the teacher:occasion:participant:question variance will be confounded with the residual variance in a linear model). In this case, you will get an error about "probably unidentifiable": see e.g. this question for more explanation/detail.
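To make the maximal formula concrete, here is a hedged sketch (again with the question's data frame); as noted above, this fit is likely to be singular or unidentifiable in practice:
m_max <- lme4::lmer(value ~ 1 + (M | P / B / A),
                    data = myData, na.action = na.exclude)
lme4::isSingular(m_max)  # TRUE flags an over-parameterized random-effects structure
lme4::VarCorr(m_max)     # inspect the estimated variance components and correlations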

Related

Min timepoints to model longitudinal data with natural quadratic splines?

I'm new to applying splines to longitudinal data, so here comes my question:
I have some longitudinal data on growing mice at 3 timepoints: x, y and z months. It is known from the existing literature that growth trajectories in this type of data are usually better modeled with non-linear terms.
However, since I have only 3 timepoints, I wonder whether this allows me to apply a natural quadratic spline to the age variable in my lmer model.
Edit: I mean, is
mod <- mincLmer(File ~ ns(Age, 2) * Genotype + Sex + (1 | Subj_ID), data, mask = mask)
a legitimate way to go?
I'm sorry if this is a stupid question - I'm just a lonely PhD student without supervision, and I would be super-grateful for any advice!!!
Marina
With the nls() function you can fit your data to whatever non-linear function you want. From a biological point of view, your data are probably described by a Gompertz-like (sigmoidal) function, but since you have only three time points, you can probably simplify that kind of function to an exponential one. Try the following:
fit_formula <- dependent_variable ~ a * exp(b * independent_variable)
result <- nls(formula = fit_formula, data = your_Dataset)
It will probably give you an error the first time, something like singular gradient matrix at initial parameter estimates; if this happens, add the start argument, providing starting values for a and b that are closer to their true values. Remember that the column names in your dataset must match the variable names in the formula.
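For example, a minimal sketch with placeholder starting values (the column names size and age and the start values are assumptions you would adapt to your data):
fit_formula <- size ~ a * exp(b * age)  # hypothetical response and predictor columns
result <- nls(formula = fit_formula, data = your_Dataset,
              start = list(a = 1, b = 0.1))  # tune these starting values
summary(result)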

Linear mixed effects models in R - mixed advice on random effects factors with less than 5 levels

I ran an experiment where I moved subjects' arms about the elbow to a reference position, returned them to a home position, and then asked the subjects to try to replicate the position, all without vision of their arm. I then measured the error in their position matching as an estimate of their upper-limb position sense acuity. This experiment aimed to compare position sense acuity between a group of older and a group of young adults.
The experiment was designed such that subjects performed 4 repeats at each of 3 reference positions for both extension and flexion movements of the elbow. To make best use of the data (avoid averaging and potential loss of data across repeated measure levels with a mixed ANOVA), I would like to analyse the effect of age group (2 levels) on matching error whilst controlling for reference position (3 levels) and movement direction (2 levels) in a linear mixed effects model, but I’m having some issues working out how to model the random effects.
On the one hand, I have fairly consistently read that random effects factors typically need a minimum of 5-6 levels to achieve a robust estimate of variance (e.g. pg. 33, https://lme4.r-forge.r-project.org/book/Ch2.pdf and https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5970551/), which makes me think that it would not be possible to model reference position (3 levels) and movement direction (2 levels) with random intercepts in this way…
Err_model_1 <- lmer(error ~ age_group + (1|subjects) + (1|move_direct) + (1|ref_pos))
In that case I was considering including them as fixed factors instead. However, I have also read that designs using within-subjects measures as fixed factors should also model them with random slopes in a maximal model to minimize Type I error rates (Barr 2013; https://www.frontiersin.org/articles/10.3389/fpsyg.2013.00328/full), which somewhat contradicts the minimum 5-6 levels rule, but I think it would look as follows…
Err_model_2 <- lmer(error ~ age_group * move_direct * ref_pos + (1 + move_direct + ref_pos | subjects))
Is there something specific to using the within-subjects measures as fixed factors in Err_model_2 that allows you to model them as a random effect validly?
Is there another way I would be able to model movement direction and reference position as random effects in this kind of model?
Any other help or comments would be appreciated, thanks!

Incorporating time series into a mixed effects model in R (using lme4)

I've had a search for similar questions and come up short so apologies if there are related questions that I've missed.
I'm looking at the amount of time spent on feeders (dependent variable) across various conditions with each subject visiting feeders 30 times.
Subjects are exposed to feeders of one type which will have a different combination of being scented/unscented, having visual patterns/being blank, and having these visual or scented patterns presented in one of two spatial arrangements.
So far my model is:
mod <- lmer(timeonfeeder ~ scent_yes_no + visual_yes_no +
              pattern_one_or_two + (1 | subject), data = data)
How can I incorporate the visit numbers into the model to see if these factors have an effect on the time spent on the feeders over time?
You have a variety of choices (this question might be marginally better for CrossValidated).
as @Dominix suggests, you can allow for a linear increase or decrease in time on feeder over time. It probably makes sense to allow this change to vary across birds:
timeonfeeder ~ time + ... + (time|subject)
you could allow for an arbitrary pattern of change over time (i.e. not just linear):
timeonfeeder ~ factor(time) + ... + (1|subject)
this probably doesn't make sense in your case, because you have a large number of time points (30 visits per subject), so it would require many parameters (it would be more sensible if you had, say, 3 time points per individual)
you could allow for a more complex pattern of change over time via an additive model, i.e. modeling change over time with a cubic spline. For example:
library(mgcv)
gamm(timeonfeeder ~ s(time) + ..., random = list(subject = ~1))
(1) this assumes the temporal pattern is the same across subjects; (2) because gamm() uses lme rather than lmer under the hood, you have to specify the random effect as a separate argument, in lme's list form. (You could also use the gamm4 package, which uses lmer under the hood; a short sketch is given at the end of this answer.)
You might want to allow for temporal autocorrelation. For example,
library(nlme)  # lme() and corAR1() come from nlme
lme(timeonfeeder ~ time + ...,
    random = ~ time | subject,
    correlation = corAR1(form = ~ time | subject), ...)
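Regarding the gamm4 option mentioned above, a minimal sketch assuming the predictor and subject columns named in the question (gamm4 takes an lmer-style random-effects formula and returns both a gam and a mer component):
library(gamm4)
fit <- gamm4(timeonfeeder ~ s(time) + scent_yes_no + visual_yes_no +
               pattern_one_or_two,
             random = ~ (1 | subject), data = data)
summary(fit$gam)  # smooth-term (change over time) summary
summary(fit$mer)  # underlying lmer fit with the random intercept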

Nested model in R

I'm having a huge problem with a nested model I am trying to fit in R.
I have a response time experiment with 2 conditions, with 46 people each and 32 measures each. I would like measures to be nested within people and people nested within conditions, but I can't get it to work.
The code I thought should make sense was:
nestedmodel <- lmer(responsetime ~ 1 + condition +
                      (1 | condition:person) + (1 | person:measure), data = dat)
However, all I get is an error:
Error in checkNlevels(reTrms$flist, n = n, control) :
number of levels of each grouping factor must be < number of observations
Unfortunately, I do not even know where to start looking what the problem is here.
Any ideas? Please, please, please? =)
Cheers!
This might be more appropriate on CrossValidated, but: lme4 is trying to tell you that one or more of your random effects is confounded with the residual variance. As you've described your data, I don't quite see why: you should have 2*46*32=2944 total observations, 2*46=92 combinations of condition and person, and 46*32=1472 combinations of measure and person.
If you do
lf <- lFormula(responsetime ~ 1 + condition +
                 (1 | condition:person) + (1 | person:measure), data = dat)
and then
lapply(lf$reTrms$Ztlist, dim)
to look at the transposed random-effect design matrices for each term, what do you get? You should (based on your description of your data) see that these matrices are 1472 by 2944 and 92 by 2944, respectively.
As @MrFlick says, a reproducible example would be nice. Other things you could show us are:
fit the model anyway, using lmerControl(check.nobs.vs.nRE="ignore") to ignore the test, and show us the results (especially the random effects variances and the statement of the numbers of groups)
show us the results of with(dat, table(table(interaction(condition, person)))) to give information on the number of replicates per combination (and similarly for measure)
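For example, a hedged sketch of both suggestions, reusing the model and data-frame names from the question:
# Fit despite the check, then inspect the variance components and the numbers of groups
m <- lmer(responsetime ~ 1 + condition +
            (1 | condition:person) + (1 | person:measure),
          data = dat,
          control = lmerControl(check.nobs.vs.nRE = "ignore"))
summary(m)

# Number of replicates per condition-by-person combination (similarly for person:measure)
with(dat, table(table(interaction(condition, person))))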

Repeated-measures ANOVA in R: +Error(subject) or + Error(subject/ VI1 * VI2)?

I have spent a lot of time on multiple posts and tutorials, but I still do not understand which "rule" I have to apply to my current data, and why.
My experiment follows a within-subject design, as every subject (n=17) performed a task in 2 conditions, across 5 blocks of trials. The dependent variable is mean RT; the fixed effects are condition and block, and the random effect is subject.
I would like to analyse the interaction between condition and block.
Using aov
I first aggregated my data:
ag <- aggregate(RT ~ condition + block + subject, data = d, FUN = mean)
But then I don't know if I have to include my within-subject factors into my Error term:
(1) aov1 <- aov(RT ~ condition * block + Error(subject/(condition * block)), data = ag)
OR
(2) aov2 <- aov(RT ~ condition * block + Error(subject), data = ag)
I have seen on several posts that the within factors have to be included in the error term, as in (1), but I do not understand how the dfs are calculated.
Using lmer
Additionally, I would like to attempt using lmer instead of aov.
I suspect that the equivalent of (2) would be:
lmer(RT ~ 1 + (1 | subject) + condition * block, data = ag)
But if (1) is the correct one, I cannot figure out how it would have to be specified using lmer.
