Homoscedasticity test for Two-Way ANOVA - r

I've been using var.test and bartlett.test to check basic ANOVA assumptions, among them homoscedasticity (homogeneity, equality of variances). The procedure is quite simple for One-Way ANOVA:
bartlett.test(x ~ g) # where x is numeric, and g is a factor
var.test(x ~ g)
But for 2x2 tables, i.e. Two-Way ANOVAs, I want to do something like this:
bartlett.test(x ~ c(g1, g2)) # or with a list; see the next line:
var.test(x ~ list(g1, g2))
Of course, ANOVA assumptions can be checked with graphical procedures, but what about "an arithmetic option"? Is that manageable at all? How do you test homoscedasticity in Two-Way ANOVA?

Hypothesis testing is the wrong tool to assess the validity of model assumptions. If the sample size is small, you have little power to detect variance differences, even when they are large. If the sample size is large, you have power to detect even the most trivial deviations from equal variance, so you will almost always reject the null. Simulation studies have shown that preliminary testing of model assumptions leads to unreliable type I error rates.
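The sample-size argument above can be illustrated with a small simulation. This is only a sketch under assumed settings (two normal groups, Bartlett's test at the 5% level, a hypothetical helper named rejection_rate); the exact rates depend on the seed and on the chosen standard deviations.

```r
# Sketch: how often Bartlett's test rejects equal variances for two groups.
set.seed(42)

# rejection_rate() is a hypothetical helper written for this illustration.
rejection_rate <- function(n, sd2, reps = 500, alpha = 0.05) {
  g <- factor(rep(c("a", "b"), each = n))
  p <- replicate(reps, {
    x <- c(rnorm(n, sd = 1), rnorm(n, sd = sd2))
    bartlett.test(x, g)$p.value
  })
  mean(p < alpha)
}

# Small samples, large difference (variance ratio 4)
small_n_large_diff <- rejection_rate(n = 5, sd2 = 2)
# Large samples, small difference (variance ratio 1.21)
large_n_small_diff <- rejection_rate(n = 2000, sd2 = 1.1)

c(small_n_large_diff = small_n_large_diff,
  large_n_small_diff = large_n_small_diff)
```

With these settings the test usually misses the large difference in the small samples and almost always flags the small difference in the large samples.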
Looking at the residuals across all cells is a good indicator, or if your data are normal, you can use the AIC or BIC with/without equal variances as a selection procedure.
If you think there are unequal variances, drop the assumption with something like:
library(car)
model.lm <- lm(formula=x ~ g1 + g2 + g1*g2,data=dat,na.action=na.omit)
Anova(model.lm,type='II',white.adjust='hc3')
You don't lose much power with the robust method (heteroscedasticity-consistent covariance matrices), so if in doubt, go robust.
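The AIC/BIC comparison with and without equal variances is not spelled out in this answer. One possible way to do it is sketched here with nlme::gls and a varIdent variance structure. This is an assumption, not necessarily what the answerer had in mind. The data frame, the cell column and the simulated standard deviations are made up for illustration.

```r
library(nlme)  # recommended package that ships with standard R installations

set.seed(1)
# Hypothetical 2x2 design in which one cell has a much larger spread
dat <- expand.grid(rep = 1:30, g1 = c("a", "b"), g2 = c("u", "v"))
dat$cell <- interaction(dat$g1, dat$g2)
cell_sd <- ifelse(dat$cell == "b.v", 3, 1)
dat$x <- rnorm(nrow(dat), mean = 10, sd = cell_sd)

# Same mean structure, one common residual variance
fit_equal <- gls(x ~ g1 * g2, data = dat)
# Same mean structure, a separate residual variance per cell
fit_unequal <- gls(x ~ g1 * g2, data = dat,
                   weights = varIdent(form = ~ 1 | cell))

AIC(fit_equal, fit_unequal)
BIC(fit_equal, fit_unequal)
```

Both fits share the same fixed effects, so their default REML criteria are comparable; the lower AIC or BIC indicates the preferred variance structure.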

You can test for heteroscedasticity using the Fligner–Killeen test of homogeneity of variances. Supposing your model is something like
model<-aov(gain~diet*supplement)
fligner.test(gain~diet*supplement)
Fligner-Killeen test of homogeneity of variances
data: gain by diet by supplement
Fligner-Killeen:med chi-squared = 2.0236, df = 2, p-value = 0.3636
You could also have used bartlett.test (but it is so sensitive to non-normality that it is more a test of that than of equality of variances)
bartlett.test(gain~diet*supplement)
Bartlett test of homogeneity of variances
data: gain by diet by supplement
Bartlett's K-squared = 2.2513, df = 2, p-value = 0.3244
Moreover, you could perform Levene's test for equal group variances in both one-way and two-way ANOVA. Implementations of Levene's test can be found in the packages car, s20x and lawstat.
leveneTest(gain~diet*supplement) # car package version; named levene.test in older car releases
Levene's Test for Homogeneity of Variance
      Df F value Pr(>F)
group 11  1.1034 0.3866
      36
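One thing worth checking in the output above: the Fligner-Killeen and Bartlett results report df = 2, while the Levene table has 11 group degrees of freedom. That suggests the first two calls compared only three groups rather than all cells. A way to be explicit about the grouping is to build the cell factor yourself with interaction(). The sketch below uses simulated data with placeholder level names, so the numbers will not match the output above.

```r
set.seed(7)
# Hypothetical 3 x 4 layout with 4 observations per cell
dat <- expand.grid(rep = 1:4,
                   diet = c("d1", "d2", "d3"),
                   supplement = c("s1", "s2", "s3", "s4"))
dat$gain <- rnorm(nrow(dat), mean = 20, sd = 2)

# One factor level per diet-by-supplement cell (12 cells)
cell <- interaction(dat$diet, dat$supplement)

fk <- fligner.test(dat$gain, cell)
bt <- bartlett.test(dat$gain, cell)
# Equivalent grouping via split(), which also forms all cells
bt_split <- bartlett.test(split(dat$gain, list(dat$diet, dat$supplement)))

fk$parameter  # degrees of freedom = number of cells - 1
bt$parameter
```

If the reported degrees of freedom equal the number of cells minus one, the test has been run across all cells of the two-way layout.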

For bartlett.test, split the response by the combination of both factors:
bartlett.test(split(x,list(g1,g2)))
var.test is not applicable, as it only works with two groups.

Related

User specified variance-covariance matrix in car::Anova not working

I am trying to use the car::Anova function to carry out joint Wald chi-squared tests for interaction terms involving categorical variables.
I would like to compare results when using bootstrapped variance-covariance matrix for the model coefficients. I have some concerns about the normality of residuals and am doing this as a first step before considering permutation tests as an alternative to joint Wald chi-squared tests.
I have estimated the variance-covariance matrix from the model fitted on 1000 bootstrap resamples of the data. The problem is that the car::Anova.merMod function does not seem to use the user-specified variance-covariance matrix. I get the same results whether I specify vcov. or not.
I have made a very simple example below where I try to use the identity matrix in Anova(). I have tried this with the more realistic bootstrapped var-cov as well.
I looked at the code on github and it looks like there is a line where vcov. is overwritten using vcov(mod), so that might be an error. However I thought I'd see if anyone here had come across this issue or could see if I had made a mistake.
Any help would be great!
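library(lme4)
library(car)
# 30 groups of 6 observations each (letters[1:30] would give NA beyond "z")
df1 = data.frame( y = rbeta(180,2,5), x = rnorm(180), group = rep(paste0("g", 1:30), times = 6) )
mod1 = lmer(y ~ x + (1|group), data = df1)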
# Default, uses variance-covariance from the model
Anova(mod1)
# Should use user-specified varcov matrix but does not - same results as above
Anova(mod1, vcov. = diag(2))
# I'm not bootstrapping the var-cov matrix here to save space/time
p.s. Using car::linearHypothesis works for user-specified vcov, but this does not give results using type 3 sums of squares. It is also more laborious to use for more than one interaction term. Therefore I'd prefer to use car::Anova if possible.
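For reference, a joint Wald chi-squared statistic with a user-supplied covariance matrix can also be computed by hand as b' V^-1 b for the coefficients of interest. The sketch below is not car's implementation. wald_chisq is a hypothetical helper, and a plain lm fit with simulated data stands in for the mixed model (for an lmer fit, lme4::fixef(mod) would take the place of coef(fit)).

```r
# Hypothetical helper: joint Wald chi-squared test that the coefficients
# selected by idx are all zero, using a caller-supplied covariance matrix V.
wald_chisq <- function(beta, V, idx) {
  b <- beta[idx]
  Vb <- V[idx, idx, drop = FALSE]
  stat <- as.numeric(t(b) %*% solve(Vb, b))
  df <- length(b)
  c(chisq = stat, df = df, p = pchisq(stat, df, lower.tail = FALSE))
}

set.seed(3)
d <- data.frame(x = rnorm(60), f = gl(3, 20))
d$y <- 1 + 0.5 * d$x + rnorm(60)
fit <- lm(y ~ x * f, data = d)

# Positions of the interaction coefficients (x:f2 and x:f3)
idx <- grep(":", names(coef(fit)))

wald_chisq(coef(fit), vcov(fit), idx)                # model-based covariance
wald_chisq(coef(fit), diag(length(coef(fit))), idx)  # identity matrix instead
```

Swapping in a bootstrapped covariance matrix only changes the V argument, which makes it easy to see whether the choice of matrix affects the result.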

Linear mixed model comparison with ANOVA R

I have two models:
model1 = y ~ a + b*c + (1|d)
model2 = y ~ a*e + c + (1|d)
I wanted to compare how they do.
anova(model1, model2)
This is the result:
Why is the p value 0?
Thank you!
Desperate grad student
Hi Desperate Grad student! Typically, the ANOVA test is used to test the necessity of a complex model with respect to a simpler, more parsimonious model. Since, in your case, you're comparing two models with the same number of parameters, you have 0 degrees of freedom (where df = # of parameters in the complex model - # of parameters in the simpler model). This is why you have an absent p-value associated with this comparison.
However, since you have the information criteria for both of these models (AIC/BIC), you can use that to compare the two. Here, model 1 is favorable since its AIC and BIC are lower than the IC for model 2.
If you're set on using the ANOVA approach to compare models, consider creating an "intercept only" model, e.g. model0 <- y ~ 1 + (1|d), as your basis for comparison.
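The degrees-of-freedom point can be checked on simulated data. The sketch below assumes lme4::lmer (the 1|d syntax suggests lme4-style models) and uses made-up variables. It counts the parameters of each fit, runs a likelihood-ratio test for a nested pair, and falls back on AIC for a non-nested pair of equal size.

```r
library(lme4)  # assumed fitting package; not stated in the question

set.seed(11)
dat <- data.frame(d = factor(rep(1:20, each = 10)),
                  a = rnorm(200), e = rnorm(200))
# Random intercept per level of d plus an effect of a
dat$y <- 1 + 0.5 * dat$a + rnorm(20)[dat$d] + rnorm(200)

# ML fits (REML = FALSE) so likelihoods are comparable across fixed effects
m0 <- lmer(y ~ 1 + (1 | d), data = dat, REML = FALSE)
m1 <- lmer(y ~ a + (1 | d), data = dat, REML = FALSE)
m2 <- lmer(y ~ e + (1 | d), data = dat, REML = FALSE)

# Number of estimated parameters per model
n_par <- sapply(list(m0 = m0, m1 = m1, m2 = m2),
                function(m) attr(logLik(m), "df"))
n_par

anova(m0, m1)  # nested models: likelihood-ratio test on 1 df
AIC(m1, m2)    # same parameter count, not nested: compare information criteria
```

Here m1 and m2 have the same number of parameters, so a likelihood-ratio test between them has zero degrees of freedom, which is the situation described in the answer.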

ANOVA on ranks VS kruskal wallis, how different is it

I'm not sure this is the perfect place for such a question, but maybe you can help me.
I want to check for differences in a quantitative variable between 3 treatments, i.e. perform an ANOVA.
Unfortunately the residuals of my model aren't normally distributed.
I usually have two solutions here: transform my data or use a non-parametric equivalent of my test (here a Kruskal-Wallis rank test).
None of the transformations that I tried managed to satisfy normality (log, 1/x, square root, Tukey and Box-Cox power), so I wanted to use a Kruskal-Wallis test and move on.
However, my project manager insisted on having only ANOVAs and talked about ANOVA on ranks as a magic solution.
Working in R, I looked for some examples and found the function art from the ARTool package, which performs an ANOVA on ranks.
library(ARTool)
model <- art(variable~treatment,data)
anova(model)
Basically it takes your variable and replaces it with its rank (dealing with ties by averaging the ranks), so that:
model2 <- lm(rank(variable, ties.method = "average")~treatment,data)
anova(model2)
gives exactly the same output.
I'm not an expert statistician and I wonder how valid this solution/transformation is.
It seems quite brutal to me and not that far from the logic of the Kruskal-Wallis test,
even though the statistic is not computed directly on ranks.
I find it very confusing to have an 'ANOVA on ranks' test that is different from the Kruskal-Wallis test (also known as One-way ANOVA on ranks), and I don't know how to choose between those two tests.
I don't know if I've been very clear and if someone can help me but, anyway,
Thanks for your attention and comments!
PS: here is an example on dummy data
library(ARTool)
# note that dummy data are random so we shouldn't have the same results
treatment <- as.factor(c(rep("A",100),rep("B",100),rep("C",100)))
variable <- as.numeric(c(sample(0:30,100,replace=TRUE),sample(10:40,100,replace=TRUE),sample(5:35,100,replace=TRUE)))
dummy <- data.frame(treatment,variable)
model <- art(variable~treatment, data=dummy)
anova(model) #f.value = 30.746 and p = 7.312e-13
model2 <- lm(rank(variable, ties.method = "average")~treatment,dummy)
anova(model2) #f.value = 30.746 and p = 7.312e-13
kruskal.test(variable~treatment,dummy)
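To see how close the two procedures are, note that both statistics are functions of the same between-group and total sums of squares of the ranks. With N observations and k groups, the rank-ANOVA F equals (H / (k - 1)) / ((N - 1 - H) / (N - k)), where H is the tie-corrected Kruskal-Wallis statistic. The sketch below checks this on dummy data in base R. The seed and variable names are arbitrary.

```r
set.seed(123)
treatment <- factor(rep(c("A", "B", "C"), each = 100))
variable <- c(sample(0:30, 100, replace = TRUE),
              sample(10:40, 100, replace = TRUE),
              sample(5:35, 100, replace = TRUE))

r <- rank(variable, ties.method = "average")
N <- length(r)
k <- nlevels(treatment)

# F statistic of a one-way ANOVA on the ranks
F_rank <- anova(lm(r ~ treatment))[1, "F value"]
# Kruskal-Wallis H (tie-corrected), as returned by kruskal.test
H <- unname(kruskal.test(variable ~ treatment)$statistic)

# The rank-ANOVA F rebuilt from H alone
F_from_H <- (H / (k - 1)) / ((N - 1 - H) / (N - k))

c(F_rank = F_rank, F_from_H = F_from_H, H = H)
```

The two tests therefore order data sets identically; they differ only in the reference distribution (F with k - 1 and N - k degrees of freedom versus chi-squared with k - 1), which matters mostly in small samples.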

How to test significance of polynomial (linear) trends among groups with unequal variances?

I am testing for a linear trend among several groups; however, my data violate the assumption of equal variance among groups (tested with Levene's test of homogeneity of variance).
In SPSS, along with the significance of linear trend assuming equal variance, there is automatic output for significance where equal variances are not assumed. What 'test' or 'adjustment' is being done? Can I do this in R, and how?
Image of SPSS output: (https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSZRs3EM3wJz5raHhav-LLBTmTyfLJO0z4xHDEzI-3uI15BoBQ5)
I'm struggling to find out what exactly SPSS is doing, but could it be some sort of Welch correction?
# TEST homogeneity of variance
leveneTest(ICECAP_A ~ SFMental_f, data = SCI)
p < 0.001 so we reject null of homogeneity of variance.
# Use built-in contr.poly() function: Tell R to get a polynomial contrast matrix for 5 levels/groups
contrasts(SCI$SFMental_f) <- contr.poly(n=5)
# call an ANOVA
anova.SFMental <- aov(ICECAP_A ~ SFMental_f, data = SCI)
# print output, show linear trend result
summary.aov(anova.SFMental, split=list (SFMental_f=list ("Linear"=1)))
Now I have the significance for linear trend. How do I get the significance if we do NOT assume equal variances?
It seems that SPSS does a correction using the Welch-Satterthwaite equation. Thanks to Andy Field for the tip. But I found no direct R alternative, so I constructed the contrasts in the usual way and ran a robust model with lmRob() instead.
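If the correction is indeed Welch-Satterthwaite, a contrast test without the equal-variance assumption can be written out by hand in base R. The contrast estimate is the weighted sum of group means, its standard error uses each group's own variance, and the degrees of freedom come from the Welch-Satterthwaite formula. The sketch below is an assumption about what such a test looks like, not a verified reproduction of SPSS output. welch_contrast is a hypothetical helper and the data are simulated.

```r
# Hypothetical helper: t test for a contrast with group-specific variances
welch_contrast <- function(y, g, w) {
  g <- factor(g)
  n <- tapply(y, g, length)
  m <- tapply(y, g, mean)
  v <- tapply(y, g, var)
  est <- sum(w * m)                    # contrast estimate
  parts <- w^2 * v / n                 # per-group variance contributions
  se <- sqrt(sum(parts))
  df <- sum(parts)^2 / sum(parts^2 / (n - 1))  # Welch-Satterthwaite df
  t_stat <- est / se
  c(estimate = est, se = se, t = t_stat, df = df,
    p = 2 * pt(-abs(t_stat), df))
}

set.seed(5)
g <- gl(5, 20)
# Simulated outcome whose mean and spread both grow with the group level
y <- rnorm(100, mean = as.integer(g), sd = as.integer(g))

# Linear trend weights for 5 ordered groups
welch_contrast(y, g, contr.poly(5)[, 1])
```

For two groups with weights c(-1, 1) this reduces to Welch's t test as computed by t.test(var.equal = FALSE), which is a convenient sanity check.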

Levene post hoc test in R

I have a problem regarding my data analysis in R. One of my hypotheses is basically that my groups will differ in the spread of their scores, indicating a difference in extremity between the groups.
I decided to check my hypothesis with Levene's test, which turned out significant and should thus indicate that the standard deviations differ significantly between the groups.
But I do not know of any post hoc tests for Levene's test, and after reading up on possible post hoc analyses I decided to conduct an ANOVA on the residuals, and then do a post hoc test on the ANOVA.
This is the code I've tried so far:
leveneTest(SS_mean~RA01, DF)
DF$residuals <- abs(DF$SS_mean - DF$SS_mean_big) #SS_mean = Participants score,
#SS_mean_big = mean for each group.
My test and post hoc test looks like this:
levene.anova<-aov(residuals~RA01, DF) #RA01 is the groups. Four in total
summary(levene.anova)
TukeyHSD(levene.anova)
The ANOVA on the residuals turned out significant as well, but the p-value changed from 0.04 (Levene's test) to 0.01 (ANOVA on residuals).
From what I read, Levene's test is just an ANOVA on the residuals, and thus it should give me the same results. I am also unsure which post hoc test I should use. I thought about Dunnett's as well, as it includes a baseline, which corresponds to one of my groups.
Lastly, I ran a leveneTest on the residuals as well, "leveneTest(residuals~RA01)", which turned out significant. Is it better for me to use a non-parametric test, e.g. the Kruskal-Wallis H test, and conduct a post hoc test on that instead? And if so, what would be the appropriate test? Should I use a pairwise Mann-Whitney U test or Dunn's test?
As this is the first time I'm doing something like this, I'm unsure whether this is a legitimate analysis. I would really appreciate your help or input!
Levene's test should indeed give the same p-value as an ANOVA on the absolute residuals (the absolute deviations from the group means).
See for example this code:
data("mtcars")
mtcars$cyl <- as.factor(mtcars$cyl)
# Calculate means and add them to data
cyl_means <- aggregate(disp ~ cyl, data = mtcars, FUN = mean)
colnames(cyl_means)[2] <- "disp_mean"
mtcars2 <- merge(mtcars, cyl_means, by = "cyl")
# Residuals and anova
mtcars2$residuals <- abs(mtcars2$disp - mtcars2$disp_mean)
res.aov <- aov(residuals ~ cyl, data = mtcars2)
summary(res.aov)
# Levene's test
lawstat::levene.test(mtcars$disp, mtcars$cyl, location = "mean")
Maybe you accidentally ran the Brown–Forsythe test instead, which is the default in lawstat::levene.test, and which uses the median instead of the mean to calculate residuals.
Use Dunnett's if you are only interested in comparing the groups to one baseline group.
Use TukeyHSD if you want all pairwise comparisons among groups.
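The mean-versus-median difference can be reproduced in base R without any extra package. Note that car::leveneTest, which the question uses, also centres on the median by default, so that alone may explain the differing p-values. The sketch below uses a hypothetical helper, levene_by_hand, on the same mtcars example.

```r
# Hypothetical helper: Levene-type test as a one-way ANOVA on absolute
# deviations from the group centre (mean or median).
levene_by_hand <- function(y, g, center = mean) {
  g <- factor(g)
  dev <- abs(y - ave(y, g, FUN = center))
  tab <- anova(lm(dev ~ g))
  c(F = tab[1, "F value"], p = tab[1, "Pr(>F)"])
}

cyl <- factor(mtcars$cyl)

lev_mean <- levene_by_hand(mtcars$disp, cyl, center = mean)      # classic Levene
lev_median <- levene_by_hand(mtcars$disp, cyl, center = median)  # Brown-Forsythe

rbind(mean = lev_mean, median = lev_median)
```

Post hoc comparisons such as TukeyHSD can then be run on an aov fit of the same absolute deviations, as long as the centre matches the one used in the omnibus test.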
