manova() in R with between- and within-subject factors

I am using the stats package in R 3.5.2 to run a MANOVA with:
participant 1:20
gender as between subject factor
group as within subject factor
anxiety as dependent measure
BAC as dependent measure
The dataset follows:
treat4 = data.frame (
participant = rep(1:20,3),
gender = factor (rep(c(rep("male", 10), rep ("female", 10)),3)),
group = factor (c(rep("control",20), rep("run",20), rep("party",20))),
anxiety = round(c(rnorm(20, mean=55, sd=5),rnorm(20, mean=20, sd=5),rnorm(20, mean=75, sd=5))),
BAC = round(c(rep(0.01,20), rep(0.01,20), rnorm(20, mean= 0.09, sd=0.01)),2))
I apply the manova () function and summarize as follows:
mod = manova(cbind(anxiety,BAC) ~ gender + Error(group),data=treat4)
summary (mod)
This is what I get:
Error: group
Df Pillai approx F num Df den Df Pr(>F)
Residuals 2
Error: Within
Df Pillai approx F num Df den Df Pr(>F)
gender 1 0.013447 0.37482 2 55 0.6892
Residuals 56
There are a couple of issues:
1) Gender seems to be treated as a within-subjects factor
2) I don't get any statistics for the group factor
Any help?

If anxiety and BAC are your dependent variables, place them on the left side of the tilde (~) inside cbind() to indicate a multivariate response, and use Error() to specify the within-group (random) effect. The remaining terms on the right side of the tilde are your between-group (fixed) effects:
manova(cbind(anxiety,BAC) ~ gender + Error(group),data=treat4)
Call:
manova(cbind(anxiety, BAC) ~ gender + Error(group), data = treat4)
Grand Means:
anxiety BAC
49.96666667 0.03766667
Stratum 1: group
Terms:
Residuals
anxiety 33156.63
BAC 0.09185333
Deg. of Freedom 2
Residual standard errors: 128.7568 0.2143051
Stratum 2: Within
Terms:
gender Residuals
anxiety 8.0667 1527.2333
BAC 0.0000 0.0034
Deg. of Freedom 1 56
Residual standard errors: 5.222262 0.007807201
Estimated effects are balanced

Thanks @StupidWolf for your answer.
However, when I apply summary () to the model:
summary(manova(cbind(anxiety,BAC) ~ gender + Error(group),data=treat4))
I get the following:
Error: group
Df Pillai approx F num Df den Df Pr(>F)
Residuals 2
Error: Within
Df Pillai approx F num Df den Df Pr(>F)
gender 1 0.039097 1.1189 2 55 0.334
Residuals 56
There are a couple of issues:
1) Gender seems to be treated as a within-subjects factor
2) I don't get any statistics for the group factor

I know this comes a bit late but I faced the same issue and I think you can simply solve it this way:
summary(manova(cbind(anxiety,BAC) ~ gender + group + Error(factor(participant)),data=treat4))
Basically, you need to add group as an independent variable (by adding + group).
Then use Error() to tell R how to identify unique subjects: by participant number rather than by group.
Don't forget to convert participant to a factor; otherwise Error() treats it as a numeric variable rather than as a subject identifier.
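As a minimal sketch of the suggestion above, here is the question's data rebuilt with a fixed seed (the seed is my addition) and with participant converted to a factor up front, so the error stratum is simply named participant. It is an illustration under those assumptions, not a verified analysis of the real data:

```r
# Rebuild the question's data; set.seed() is an addition for reproducibility.
set.seed(123)
treat4 <- data.frame(
  participant = factor(rep(1:20, 3)),  # subject ID stored as a factor
  gender = factor(rep(c(rep("male", 10), rep("female", 10)), 3)),
  group = factor(c(rep("control", 20), rep("run", 20), rep("party", 20))),
  anxiety = round(c(rnorm(20, mean = 55, sd = 5),
                    rnorm(20, mean = 20, sd = 5),
                    rnorm(20, mean = 75, sd = 5))),
  BAC = round(c(rep(0.01, 20), rep(0.01, 20),
                rnorm(20, mean = 0.09, sd = 0.01)), 2)
)

# group enters as a predictor; Error() names the subject identifier.
fit <- manova(cbind(anxiety, BAC) ~ gender + group + Error(participant),
              data = treat4)

# Strata of the fit: gender is tested between participants,
# group is tested in the Within stratum.
names(fit)
summary(fit)
```

With this specification gender should appear under the participant stratum and group under Within, which addresses both issues raised in the question.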

Related

the R function summary puts a "1" in front of variables when its input is the output of an aovp function

I am running a permutation test for a factorial ANOVA design. The output I get seems fine, but the summary function places a "1" in front of each variable. Here's my code:
summary(aovp(`Mean time in roi(s)` ~ Roi + Age + Sex, data = df))
Here is my output
[1] "Settings: unique SS : numeric variables centered"
Component 1 :
Df R Sum Sq R Mean Sq Iter Pr(Prob)
Roi1 2 22.870 11.4349 5000 0.0206 *
Age1 2 37.128 18.5641 5000 0.0004 ***
Sex1 1 0.004 0.0037 51 0.8824
Residuals 72 211.118 2.9322
How can I get rid of those "1"s? Do they indicate a problem?
Additional Info:
The Roi and Age factors each have three levels. Roi is a fixed factor. Age is a random factor. Sex has two levels. It is a random factor.
Thank you!

In R can you manually set degrees of freedom for lm() or Anova()?

I am replicating SPSS code in R that runs several Type 3 ANOVAs. In SPSS you can specify specific contrasts in an ANOVA (e.g., compare level 2 v level 4 in this 5-level variable). The resulting ANOVA tables return a test where the degrees of freedom are equal to the full sample, rather than the sample that is just concentrated in those two levels.
In R, I use the command below to run an ANOVA comparing those two levels but the resulting Residuals DF is based on the subsample of only those two levels rather than the full sample. Is there a way I can manually set the DF in either the lm() or Anova() function to avoid this issue? Or is there a way to specify contrasts that uses the full sample DF?
Anova(lm(DV ~ FiveLevelFactor, data = data, subset = FiveLevelFactor == "2" | FiveLevelFactor == "4"), type = 3)
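
How about using the linearHypothesis() function from the car package? It tests the contrast within the full model, so the residual degrees of freedom come from the whole sample: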
library(car)
data(Ornstein)
mod <- lm(interlocks ~ log(assets) + sector + nation, data=Ornstein)
linearHypothesis(mod, "nationUK = nationUS")
# Linear hypothesis test
#
# Hypothesis:
# nationUK - nationUS = 0
#
# Model 1: restricted model
# Model 2: interlocks ~ log(assets) + sector + nation
#
# Res.Df RSS Df Sum of Sq F Pr(>F)
# 1 235 29829
# 2 234 29690 1 138.36 1.0904 0.2975
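A related base-R sketch, which is not from the original answer: fitting the model on all five levels and making one of the two levels of interest the reference level gives a coefficient test for that pair that uses the full-sample residual degrees of freedom. The data and the names d, f, and y are hypothetical.

```r
# Hypothetical data: a five-level factor with 10 observations per level.
set.seed(1)
d <- data.frame(f = factor(rep(1:5, each = 10)), y = rnorm(50))

# Subsetting to two levels leaves only 20 - 2 = 18 residual df.
fit_sub <- lm(y ~ f, data = droplevels(subset(d, f %in% c("2", "4"))))
df.residual(fit_sub)

# Fitting all five levels with level 2 as the reference keeps 50 - 5 = 45
# residual df; the f4 coefficient is then the level 4 vs level 2 contrast.
d$f <- relevel(d$f, ref = "2")
fit_full <- lm(y ~ f, data = d)
df.residual(fit_full)
coef(summary(fit_full))["f4", ]
```

The squared t statistic for f4 is the F statistic of the same one-degree-of-freedom contrast, so this should agree with a linearHypothesis() test of the two levels on the full model.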

How do I run Kruskal and post HOC on multiple variables in R?

Please excuse me if I have not formatted my code correctly, as I am new to the site. I also do not know how to provide sample data properly.
I have a data set of 42 observations and 37 variables (the first column being the group, with 3 groups) of non-normally distributed data. I want to compare all 36 parameters between the 3 groups and then run a post hoc test (pairwise.wilcox.test?).
The data are flow cell counts for three different patient groups. I have been able to perform the initial comparison by creating a formula and running an aov (though I would prefer Kruskal-Wallis), but I have not found a way to run the post hoc test on all variables in the same way.
#Data
Type Neutrophils Monocytes NKC .....
------------------------------------------
IN 546 2663 545
IN 0797 7979 008
OUT 0899 3899 345
OUT 6868 44533 689
HC 9898 43443 563
#Cbind all variable together to run model on all
formula <- as.formula(paste0("cbind(", paste(names(LessCount)[-1],
collapse = ","), ") ~ Type"))
print(formula)
#Run test on model
fit <- aov(formula, data=LessCount)
#Print results
summary(fit)
Response Neutrophils :
Df Sum Sq Mean Sq F value Pr(>F)
Type 2 18173966 9086983 1.8099 0.1771
Residuals 39 195806220 5020672
Response Monocytes :
Df Sum Sq Mean Sq F value Pr(>F)
Type 2 694945 347472 0.7131 0.4964
Residuals 39 19004809 487303
Response Mono.Classic :
Df Sum Sq Mean Sq F value Pr(>F)
Type 2 1561778 780889 2.5842 0.08833 .
Residuals 39 11785116 302182
###export anova####
capture.output(summary(fit),file="test1.csv")
#If significant, check which groups differ (currently done by hand, one variable at a time)
pairwise.wilcox.test(LessCount$pDCs, LessCount$Type,
p.adjust.method = "BH")
I get the aov results for every variable in my console, but I would like the same for the post hoc test, since I need every p-value.
Thank you in advance.

Maybe you can use kruskal.test() directly and extract the p-values.
Here is an example with the iris dataset. I use the function apply() in order to apply the kruskal.test function to each variable (except Species, which is the variable with group information).
data(iris)
apply(iris[-5], 2, function(x) kruskal.test(x = x, g = iris$Species)$p.value)
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 8.918734e-22 1.569282e-14 4.803974e-29 3.261796e-29
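The same pattern can be extended to the post hoc step. The following is a sketch on iris rather than the asker's LessCount data, which is not available here; the names vars, kw_p, and posthoc are my own.

```r
data(iris)
vars <- setdiff(names(iris), "Species")

# Kruskal-Wallis p-value for each variable.
kw_p <- sapply(iris[vars], function(x) kruskal.test(x, iris$Species)$p.value)

# Pairwise Wilcoxon post hoc tests for each variable, BH-adjusted.
# exact = FALSE is passed on to wilcox.test() because the data contain ties.
posthoc <- lapply(iris[vars], function(x) {
  pairwise.wilcox.test(x, iris$Species,
                       p.adjust.method = "BH", exact = FALSE)$p.value
})

kw_p
posthoc$Sepal.Length
```

Each element of posthoc is a matrix of pairwise p-values for one variable. Note that the BH adjustment here is applied within each variable, not across all variables at once.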

wilcoxon test using data stratification

I have a really basic problem. I have the concentrations of one chemical stored in one column and the gender of the study participant in a second column.
What is the code to do the wilcoxon test to see if there is a difference between the concentrations found in boys and the concentrations found in girls? Some explanation of the code would also be useful for me to understand how it works. Thanks!
I got this code for the ANOVA test to work which is also fine. Can anyone tell me if it does the thing that I need?
av <- aov(UC_MEHP ~ BQF05C1, data=data)
av
summary(av)
The output looks like this:
> av <- aov(UC_MEHP ~ BQF05C1, data=data)
> av
Call:
aov(formula = UC_MEHP ~ BQF05C1, data = data)
Terms:
BQF05C1 Residuals
Sum of Squares 0.3445 2917.4564
Deg. of Freedom 1 151
Residual standard error: 4.395555
Estimated effects may be unbalanced
21 observations deleted due to missingness
> summary(av)
Df Sum Sq Mean Sq F value Pr(>F)
BQF05C1 1 0.3 0.344 0.018 0.894
Residuals 151 2917.5 19.321
21 observations deleted due to missingness
I'm sorry, I know it's not a very advanced question...

From ?wilcox.test:
## S3 method for class 'formula'
wilcox.test(formula, data, subset, na.action, ...)
...
formula: a formula of the form ‘lhs ~ rhs’ where ‘lhs’ is a numeric
variable giving the data values and ‘rhs’ a factor with two
levels giving the corresponding groups.
So wilcox.test(UC_MEHP ~ BQF05C1, data=data) should work (assuming that BQF05C1 is the column specifying gender and UC_MEHP is the concentration).
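A small self-contained sketch of the formula interface, using simulated data in place of the real columns (dat, conc, and sex are hypothetical names, and the missing values are added only to mimic the question's data):

```r
# Simulated stand-in for the real data; column names are hypothetical.
set.seed(42)
dat <- data.frame(
  conc = c(rexp(30, rate = 1), rexp(30, rate = 0.5)),
  sex  = factor(rep(c("boy", "girl"), each = 30))
)
dat$conc[c(3, 40)] <- NA  # mimic missing measurements

# lhs is the numeric outcome, rhs the two-level grouping factor.
res <- wilcox.test(conc ~ sex, data = dat)
res$p.value
```

Rows with missing values are dropped before the test, much as aov() reported deleting observations due to missingness. The aov() call in the question compares group means, whereas the Wilcoxon rank-sum test compares the two groups using ranks.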

Why does Error() in aov() give three levels?

I'm trying to understand how to properly run a repeated-measures or nested ANOVA in R without using mixed models. From consulting tutorials, the formula for a one-variable repeated-measures ANOVA is:
aov(Y ~ IV+ Error(SUBJECT/IV) )
where IV is the within-subjects factor and SUBJECT is the identity of the subjects. However, most examples show outputs with two strata: Error: subject and Error: subject:WS. Meanwhile, I am getting three strata (Error: subject, Error: subject:WS, and Error: Within). Why do I have three strata when I'm trying to specify only two (within and between)?
Here is a reproducible example:
data(beavers)
id = rep(c("beaver1","beaver2"),times=c(nrow(beaver1),nrow(beaver2)))
data = data.frame(id=id,rbind(beaver1,beaver2))
data$activ=factor(data$activ)
aov(temp~activ+Error(id/activ),data=data)
temp is a continuous measure of temperature, id is the identity of the beaver, and activ is a binary factor for activity. The output of the model is:
Error: id
Df Sum Sq Mean Sq
activ 1 28.74 28.74
Error: id:activ
Df Sum Sq Mean Sq F value Pr(>F)
activ 1 15.313 15.313 18.51 0.145
Residuals 1 0.827 0.827
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 210 7.85 0.03738
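One possible reading, offered as a sketch rather than a confirmed answer: each id:activ cell in this data set holds many rows, and aov() places the variation among those replicate rows in an extra Within stratum. The comparison below uses cell means (my own assumption about how to remove the replication) to show the stratum disappearing:

```r
data(beavers)
id <- rep(c("beaver1", "beaver2"), times = c(nrow(beaver1), nrow(beaver2)))
dat <- data.frame(id = factor(id), rbind(beaver1, beaver2))
dat$activ <- factor(dat$activ)

# Many rows per id x activ cell: this replication feeds the Within stratum.
table(dat$id, dat$activ)
fit_raw <- aov(temp ~ activ + Error(id/activ), data = dat)
names(fit_raw)

# Averaging to one value per cell leaves no replication below id:activ.
cell_means <- aggregate(temp ~ id + activ, data = dat, FUN = mean)
fit_cells <- aov(temp ~ activ + Error(id/activ), data = cell_means)
names(fit_cells)
```

Tutorial examples with two strata typically have exactly one observation per subject and condition, which would explain the difference. Averaging discards the replicate-level information, so whether it is appropriate depends on the design.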
