Repeated measures ANOVA - R

What is the generic code to perform a repeated measures ANOVA?
I am currently using the code:
summary(aov(Chlo~Site,data=alldata))
There are three different sites (Site) and four experimental variables that I am testing individually (Chlo, SST, DAC and PAR). I am also assessing differences in these variables between years (2003 to 2012):
summary(aov(Chlo~Year,data=year))
Any help would be appreciated!

In general you should avoid making multiple separate calls to aov and instead use a mixed-effects linear model.
You can find several examples in this post by Paul Gribble.
I often use the nlme package, for example:
require(nlme)
model <- lme(dv ~ myfactor, random = ~1|subject/myfactor, data=mydata)
Depending on your data you may run into more complex situations; I would suggest having a look at the very clear book by Julian Faraway, "Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models".
Also, you may want to ask on CrossValidated if you have more specific statistical questions.

The trick with the aov function is that you just need to add an Error term. As one of the guides says: the Error term must reflect that we have "treatments nested within subjects".
So, in your case, if Site is a repeated measure you should use:
summary(aov(Chlo ~ Site + Error(subject/Site), data=alldata))
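For a self-contained illustration, here is a minimal sketch with simulated data; the subject column and all values are invented for the example, since the question's real data is not shown:

```r
# Minimal repeated-measures aov sketch: each subject is measured at
# every Site, so Site is nested within subject in the Error term.
set.seed(1)
alldata <- data.frame(
  subject = factor(rep(1:10, each = 3)),        # 10 subjects
  Site    = factor(rep(c("A", "B", "C"), 10)),  # 3 sites per subject
  Chlo    = rnorm(30)                           # simulated response
)
fit <- aov(Chlo ~ Site + Error(subject/Site), data = alldata)
summary(fit)
```

The same pattern applies to the other response variables (SST, DAC, PAR) by swapping the left-hand side of the formula.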

Related

How to train a multiple linear regression model to find the best combination of variables?

I want to run a linear regression model with a large number of variables and I want an R function to iterate on good combinations of these variables and give me the best combination.
The glmulti package will do this fairly effectively:
Automated model selection and model-averaging. Provides a wrapper for glm and other functions, automatically generating all possible models (under constraints set by the user) with the specified response and explanatory variables, and finding the best models in terms of some Information Criterion (AIC, AICc or BIC). Can handle very large numbers of candidate models. Features a Genetic Algorithm to find the best models when an exhaustive screening of the candidates is not feasible.
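A sketch of a basic call, assuming the glmulti package is installed; the built-in swiss dataset is used purely for illustration, and the argument values shown are one reasonable choice, not the only one:

```r
# Sketch: exhaustive screening of main-effects models, ranked by AICc.
library(glmulti)
res <- glmulti(Fertility ~ ., data = swiss,
               level = 1,      # main effects only, no interactions
               crit = "aicc",  # small-sample-corrected AIC
               method = "h")   # exhaustive ("g" = genetic algorithm)
summary(res)$bestmodel         # formula of the best-ranked model
```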
Unsolicited advice follows: please be aware that while this approach can find the model that minimizes within-sample error (the goodness of fit on your actual data), it has two major problems that should make you think twice about using it:
1) this type of data-driven model selection will almost always destroy your ability to make reliable inferences (compute p-values, confidence intervals, etc.) - see this CrossValidated question;
2) it may overfit your data (although using the information criteria listed in the package description helps with this).
There are a number of different ways to characterize a "best" model, but AIC is a common one, and base R offers step(), and package MASS offers stepAIC().
summary(lm1 <- lm(Fertility ~ ., data = swiss))
slm1 <- step(lm1)
summary(slm1)
slm1$anova

Most straightforward R package for setting subject as random effect in mixed logit model

I have a dataset in which individuals, each belonging to a particular group, repeatedly chose between multiple discrete outcomes.
subID group choice
1 Big A
1 Big B
2 Small B
2 Small B
2 Small C
3 Big A
3 Big B
. . .
. . .
I want to test how group membership influences choice, and want to account for non-independence of observations due to repeated choices being made by the same individuals. In turn, I planned to implement a mixed multinomial regression treating group as a fixed effect and subID as a random effect. It seems that there are a few options for multinomial logits in R, and I'm hoping for some guidance on which may be most easily implemented for this mixed model:
1) multinom - GLM, via nnet, allows the usage of the multinom function. This appears to be a nice, clear, straightforward option... for fixed-effect models. However, is there a way to implement random effects with multinom? A previous CV post suggests that multinom is able to handle a mixed-effects GLM with a Poisson distribution and a log link. However, I don't understand (a) why this is the case or (b) the required syntax. Can anyone clarify?
2) mlogit - A fantastic package, with incredibly helpful vignettes. However, the "mixed logit" documentation refers to models that have random effects related to alternative specific covariates (implemented via the rpar argument). My model has no alternative specific variables; I simply want to account for the random intercepts of the participants. Is this possible with mlogit? Is that variance automatically accounted for by setting subID as the id.var when shaping the data to long form with mlogit.data? EDIT: I just found an example of "tricking" mlogit to provide random coefficients for variables that vary across individuals (very bottom here), but I don't quite understand the syntax involved.
3) MCMCglmm is evidently another option. However, as a relative novice with R and someone completely unfamiliar with Bayesian stats, I'm not personally comfortable parsing example syntax of mixed logits with this package, or, even following the syntax, making guesses at priors or other needed arguments.
Any guidance toward the most straightforward approach and its syntax implementation would be thoroughly appreciated. I'm also wondering if the random effect of subID needs to be nested within group (as individuals are members of groups), but that may be a question for CV instead. In any case, many thanks for any insights.
I would recommend the Apollo package by Hess & Palma. It comes with a great documentation and a quite helpful user group.
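A further option is the mclogit package, whose mblogit function fits baseline-category multinomial logit models and accepts random effects. A sketch, assuming that package and a data frame (here called mydata, a hypothetical name) shaped like the question's example:

```r
# Sketch: random-intercept multinomial logit with mclogit::mblogit.
# Column names (choice, group, subID) match the question's data;
# "mydata" is a placeholder for the actual data frame.
library(mclogit)
fit <- mblogit(choice ~ group,        # group membership as fixed effect
               random = ~ 1 | subID,  # random intercept per individual
               data = mydata)
summary(fit)
```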

R: mixed model with heteroscedastic data -> only lm function works?

This question asks the same question, but hasn't been answered. My question relates to how to specify the model with the lm() function and is therefore a programming (not statistical) question.
I have a mixed design (2 repeated and 1 independent predictors). Participants were first primed into group A or B (this is the independent predictor) and then they rated how much they liked 4 different statements (these are the two repeated predictors).
There are many great online resources on how to model this data. However, my data is heteroscedastic, so I would like to use heteroscedasticity-consistent covariance matrices. This paper explains it well. The sandwich and lmtest packages are great. Here is a good explanation of how to do it for an independent design in R with lm(y ~ x).
It seems that I have to use lm, else it won't work?
Here is the code for a model assuming that all variances are equal (which they are not, as Levene's test comes back significant):
fit3 <- nlme::lme(DV ~ repeatedIV1*repeatedIV2*independentIV1, random = ~1|participants, data = df) ## works fine
Here is the code for an independent model correcting for heteroscedasticity, which works:
fit3 <- lm(DV ~ independentIV1)
library(sandwich)
vcovHC(fit3, type = 'HC4', sandwich = F)
library(lmtest)
coeftest(fit3, vcov. = vcovHC, type = 'HC4')
So my question really is, how to specify my model with lm?
Alternative approaches in R how to fit my model accounting for heteroscedasticity are welcome too!
Thanks a lot!!!
My impression is that your problems come from mixing various approaches for various aspects (repeated measurements/correlation vs. heteroscedasticity) that cannot be mixed so easily. Instead of using random effects you might also consider fixed effects, or instead of only adjusting the inference for heteroscedasticity you might consider a Gaussian model and model both mean and variance, etc. For me, it's hard to say what is the best route forward here. Hence, I only comment on some aspects regarding the sandwich package:
The sandwich package is not limited to lm/glm only but is in principle object-oriented, see vignette("sandwich-OOP", package = "sandwich") (also published as doi:10.18637/jss.v016.i09).
There are suitable methods for a wide variety of packages/models, but not for nlme or lme4. The reason is that it's not so obvious for which mixed-effects models the usual sandwich trick actually works. (Disclaimer: I'm no expert in mixed-effects modeling.)
However, for lme4 there is a relatively new package called merDeriv (https://CRAN.R-project.org/package=merDeriv) that supplies estfun and bread methods so that sandwich covariances can be computed for lmer output etc. There is also a working paper associated with that package: https://arxiv.org/abs/1612.04911
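Following the suggestion above to model the variance directly rather than only adjusting the inference, nlme can fit the mixed model with a separate residual variance per group. A sketch using the question's own variable names (DV, the IVs, participants, and the data frame df):

```r
# Sketch: relax the equal-variance assumption by letting the residual
# variance differ across levels of the independent predictor.
# All variable names come from the question; df is the data frame.
library(nlme)
fit <- lme(DV ~ repeatedIV1 * repeatedIV2 * independentIV1,
           random  = ~ 1 | participants,
           weights = varIdent(form = ~ 1 | independentIV1),
           data    = df)
summary(fit)
```

varIdent estimates one variance ratio per group level, which addresses the heteroscedasticity across groups without leaving the mixed-model framework.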

Linear mixed model with repeated Anova issue

I have a question related to repeated-measures ANOVA with a linear mixed model. I have found a nice example posted a very long time ago on another website (https://stats.stackexchange.com/questions/237512/how-to-perform-post-hoc-test-on-lmer-model) but I am not sure if this method is correct for my case.
library(lme4)
library(lmerTest)
Result <- lmer(Value ~ Type + (1|Material), data = data)
summary(Result)
library(multcomp)
summary(glht(Result, linfct = mcp(Material = "Tukey")), test = adjusted("holm"))
The person who gave this example mentioned: "After you've fit your lmer model you can do ANOVA, MANOVA, and multiple comparison procedures on the model object". Briefly, I have repeated measurements of 6 groups and I want to see which material differs from which material. When I run this code I see the results I am looking for, but I would like to ask whether this method can be used for repeated-measures ANOVA or whether I should use a different library. In that case I would be happy if you could edit the code. Thanks in advance.
Best Regards.
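One caveat worth illustrating: glht/mcp can only compare levels of a fixed effect, so with the model above only Type can be compared pairwise, not Material (a random effect there). A sketch using the emmeans package as an alternative to multcomp, with the question's own model:

```r
# Sketch (assuming the emmeans package): post-hoc pairwise comparisons
# on the fixed factor of the lmer fit. Note that Material, being a
# random effect in this model, cannot be compared this way; to compare
# materials it would have to enter the model as a fixed effect.
library(lme4)
library(emmeans)
Result <- lmer(Value ~ Type + (1 | Material), data = data)
emmeans(Result, pairwise ~ Type, adjust = "holm")
```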

Linear mixed model with crossed repeated effects and AR1 covariance structure, in R

I have within-subject physiological data from participants (part), who have all looked at stimuli (reading newspapers) over three rounds (round), each of which has five papers (paper), and within each there is a variable number of visits (visit) to the newspaper. I have two fixed factors (CONDhier and CONDabund) plus their interaction to predict the physiological state (e.g., EDA), which is usually autoregressive. I try to take into account individual differences in physiology with random effects (let's settle for intercept only for now), and perhaps fatigue over rounds with another random effect.
Thus, my model that I would like to run in R would be, in SPSS:
MIXED EDA BY CONDhier CONDabund
/FIXED=CONDhier CONDabund CONDhier*CONDabund | SSTYPE(3)
/RANDOM=INTERCEPT | SUBJECT(part) COVTYPE(VC)
/RANDOM=INTERCEPT | SUBJECT(part*round) COVTYPE(VC)
/PRINT=SOLUTION
/METHOD=REML
/REPEATED=visit | SUBJECT(part*round*paper) COVTYPE(AR1).
Now, I have understood that while lme does not handle crossed terms well, lmer (which handles crossed terms with no problem) cannot use different covariance structures. I can run simple lme models such as
lme(EDA ~ factor(CONDhier) * factor(CONDabund), random = ~1|part,
    na.action = na.exclude, data = phys2)
but a more complicated model is beyond me. I have read that crossed terms in lme can be done with random definitions like
random=pdBlocked(list(pdCompSymm(~part), pdCompSymm(~round-1), pdCompSymm(~paper-1),
pdCompSymm(~visit-1)))
but that seems to block the AR1 structure, and the second random intercept for part*round, from me. And I'm not so sure it's the same as my SPSS syntax anyway.
So, any advice? Although there are lots of different writings on lme and lmer, I couldn't find one that would have both crossed terms and AR1.
(Also, the syntax of lme seems quite obscure: from several different sources I have understood that | groups what is on its left under what is on its right, that / makes nested terms, that ~1 is a random intercept, ~x is a random slope, and ~1+x is both, but there seem to be at least the : and -1 constructs as well, which I couldn't find explained anywhere. Is there a tutorial that explains all the different constructs?)
Consider the R package MCMCglmm which allows for complex mixed effects models.
https://cran.r-project.org/web/packages/MCMCglmm/vignettes/CourseNotes.pdf
Although it can be challenging to implement, it may solve the problems you've been having. It allows the fixed and random effects formulas to be given separately, e.g.:
fixed <- formula(EDA ~ CONDhier * CONDabund)
rand <- formula( ~(us(1+ CONDhier):part + us(1+ CONDhier):round + us(1+ CONDhier):paper + us(1+ CONDhier):visit))
The covariance structure between the random effects are given as coefficients which can be examined using summary() on the MCMCglmm object once the model has been run.
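For reference, an AR(1) residual structure can also be attempted directly in nlme; whether lme accepts this particular combination of grouping and correlation structure is exactly the difficulty the question raises, so treat this as a sketch using the question's variable names, not a guaranteed fit:

```r
# Sketch: nested random intercepts for participant and
# participant-by-round, with an AR(1) correlation across visits
# within participant/round/paper. All names (EDA, CONDhier,
# CONDabund, part, round, paper, visit, phys2) come from the question.
library(nlme)
fit <- lme(EDA ~ factor(CONDhier) * factor(CONDabund),
           random = ~ 1 | part/round,
           correlation = corAR1(form = ~ visit | part/round/paper),
           na.action = na.exclude, data = phys2)
summary(fit)
```

Note that lme requires the correlation grouping to be nested within the random-effects grouping, which is why the crossed structure from the SPSS syntax cannot be reproduced exactly here.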
If you have a cross-covariance matrix, use canonical correlation analysis (CCA). There is a documented R package for CCA.