How to interpret a pre-post test ANCOVA result? - r

I have a basic pre-post trial design: two randomized groups and two tests for each participant in each group, one prior to the intervention (V1) and one post-intervention (V2).
I am completely new to this and have been reading up a lot on it; based on a few sources, an ANCOVA with the pre-test as a covariate seemed to be the most appropriate approach.
So, I modeled as follows:
y <- aov(V2 ~ Group + V1, data = x)
I checked the residuals for normality and used Levene's test to check the homogeneity of variance of V2 across the groups.
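A minimal sketch of that model and those checks, assuming a data frame x with columns V1, V2 and Group (leveneTest() is from the car package):
library(car)
y <- aov(V2 ~ Group + V1, data = x)   # ANCOVA: post-test on Group, adjusting for pre-test
shapiro.test(residuals(y))            # normality of the residuals
leveneTest(V2 ~ Group, data = x)      # homogeneity of variance of V2 across groups
summary(y)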
I got the following result for a certain variable of interest -
summary(y)
Df Sum Sq Mean Sq F value Pr(>F)
Group 1 29996 29996 4.315 0.0386 *
V1 1 3710598 3710598 533.844 <2e-16 ***
Residuals 325 2258983 6951
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I followed it up with a post-hoc Tukey test and found that it was significant there as well.
I have a couple of questions:
Is this the correct way to go?
Why does my V1 (pre-test) covariate have such a high level of significance, and what does it mean? (I assumed that randomization essentially means there is no difference between the groups at baseline.)
Can I conclude that there truly is a difference between the two groups for this particular aspect based on this?

Related

R CCA - Can species scores be related to CCA axes & how does the biplot arrow length relate to significance of variables?

Hello, this is my first question on Stack Overflow or any similar forum, so please excuse me and be kind if I missed something out ;)
I am using the vegan package in R to run a CCA. Because my study is about intraspecific variation of species traits, I do not have a "plot x species" matrix but an "individual x trait" matrix representing a "physio-chemical niche" (so my species scores look different from the usual ones).
So my questions are:
Is it appropriate to do the analysis in this way?
Is it possible to interpret the CCA axes based on the "species scores" (which are not species scores in my case)? I would like to get information like: CCA1 is most related to trait X.
How can I interpret the length of the biplot arrows in comparison with the permutation test (anova.cca)? I get many "long" arrows, but in the permutation test only a few of the variables are significant.
Here is my summary(cca)-Output:
Call:
cca(formula = mniche_g ~ cover_total * Richness + altitude + Eastness + lan_TEMP + lan_REACT + lan_NUTRI + lan_MOIST + Condition(glacier/transect/plot/individuum), data = mres_g_sc)
Partitioning of scaled Chi-square:
Inertia Proportion
Total 0.031551 1.00000
Conditioned 0.001716 0.05439
Constrained 0.006907 0.21890
Unconstrained 0.022928 0.72670
Eigenvalues, and their contribution to the scaled Chi-square
after removing the contribution of conditioning variables
Importance of components:
CCA1 CCA2 CCA3 CA1 CA2 CA3
Eigenvalue 0.00605 0.0005713 0.0002848 0.0167 0.00382 0.002413
Proportion Explained 0.20280 0.0191480 0.0095474 0.5596 0.12805 0.080863
Cumulative Proportion 0.20280 0.2219458 0.2314932 0.7911 0.91914 1.000000
Accumulated constrained eigenvalues
Importance of components:
CCA1 CCA2 CCA3
Eigenvalue 0.00605 0.0005713 0.0002848
Proportion Explained 0.87604 0.0827150 0.0412425
Cumulative Proportion 0.87604 0.9587575 1.0000000
Scaling 2 for species and site scores
* Species are scaled proportional to eigenvalues
* Sites are unscaled: weighted dispersion equal on all dimensions
Species scores
CCA1 CCA2 CCA3 CA1 CA2 CA3
SLA_range_ind 0.43964 -0.002623 -0.0286814 -0.75599 -0.04823 0.003317
SLA_mean_ind 0.01771 -0.042969 0.0246679 -0.01180 0.12732 0.053094
LNC -0.10613 -0.064207 -0.0637272 0.07261 -0.15962 0.198612
LCC -0.01375 0.012131 -0.0005462 0.02573 -0.01539 -0.021314
...
Here is my anova.cca(cca)-Output:
Permutation test for cca under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
Model: cca(formula = mniche_g ~ cover_total * Richness + altitude + Eastness + lan_TEMP + lan_REACT + lan_NUTRI + lan_MOIST + Condition(glacier/transect/plot/individuum), data = mres_g_sc)
Df ChiSquare F Pr(>F)
cover_total 1 0.0023710 10.4442 0.002 **
Richness 1 0.0006053 2.6663 0.080 .
altitude 1 0.0022628 9.9676 0.001 ***
Eastness 1 0.0005370 2.3657 0.083 .
lan_TEMP 1 0.0001702 0.7497 0.450
lan_REACT 1 0.0005519 2.4313 0.094 .
lan_NUTRI 1 0.0000883 0.3889 0.683
lan_MOIST 1 0.0001017 0.4479 0.633
cover_total:Richness 1 0.0002184 0.9620 0.351
Residual 101 0.0229283
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
and here the biplot:
[biplot image]
Thank you all!
I don't have sufficient information to say if it is sensible to use CCA for your data. I'm suspicious and I think it may not be sensible. The critical question is "does the sum of traits make sense?". If it does not, *CA makes no sense, because you need both row and column sums there, and if you have measured your variables in different units, their sum makes no sense. For instance, if you change the units of one "trait", say, from inches to centimetres, the results will change. It is probably wiser to use RDA/PCA with equalizing scaling of variables.
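A minimal sketch of that alternative, reusing the formula from the question (scale = TRUE in vegan's rda() standardizes each trait to unit variance, so the measurement units no longer matter):
library(vegan)
mod_rda <- rda(mniche_g ~ cover_total * Richness + altitude + Eastness +
                 lan_TEMP + lan_REACT + lan_NUTRI + lan_MOIST +
                 Condition(glacier/transect/plot/individuum),
               data = mres_g_sc, scale = TRUE)   # scale = TRUE equalizes trait variances
anova(mod_rda, by = "terms", permutations = 999) # sequential permutation tests, as before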
You can get the relationship of a single variable to an axis from the analysis. It is just the ordination score of that variable. Visually you see it by projecting the point onto the axis, numerically with summary() or scores(). However, I think you should not want to have that, but I won't stop you if you do something that I think you should not do. (Interpretation of rotated dimensions may be more meaningful – axes are just a frame of reference for drawing plots.)
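For example, a sketch assuming your fitted object is called mod:
scores(mod, display = "species", choices = 1:3)  # trait scores on CCA1-CCA3
# the trait with the largest absolute score on CCA1 is the one most related to that axis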
Brief answer: the arrow lengths have no relation to the so-called significances. Longer answer: the scaling of biplot arrow lengths depends on the scaling of your solution and the number of constrained axes in your solution. The biplot scores are based on the relationship with the so-called linear combination scores, which are completely defined by these very same variables, and the multiple correlation of a constraining variable with all constrained axes is 1. In the default scaling ("species"), all your biplot arrows have unit length in the full constrained solution, but if you show only two of several axes, the arrows appear shorter if they are long in the dimensions that you do not show, and they appear long if the dimensions you show are the only ones that are important for those variables. With other scalings, you also add scaling by axis eigenvalues. However, these lengths have nothing to do with the so-called significances of these variables. (BTW, you used a sequential method in your significance tests, which means that the testing order will influence the results. This is completely OK, but it is different from interpreting arrows, which are not order-dependent.)

Wrong degree of freedom for between-subject factor in lmer

I'm testing how visual perspective (1 = completely first person, to 11 = completely third person) varies as a function of Culture (AA, EA), Valence (Positive, Negative) and Event Type (Memory, Imagination), while controlling for age (continuous), sex (M, F) and SES (continuous) and allowing for individual differences.
This is an unbalanced design: we give each participant 10 prompts, but for each prompt participants can choose to either recall or imagine a relevant event. Therefore, each participant may have as many memories (no greater than 10) and as many imaginations (no greater than 10) as they want. In total we have 363 participants.
The model I fit looks like
VP.full.lm <- lmer(Visual.Perspective ~ Culture * Event.Type * Valence +
Sex + Age + SES +
(1|Participant.Number),
data=VP_Long)
When I run anova() function to see the effects of all variables, here is the output:
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
Culture 30.73 30.73 1 859.1 4.9732 0.0260008 *
Valence 6.38 6.38 1 3360.3 1.0322 0.3097185
Event.Type 1088.61 1088.61 1 3385.9 176.1759 < 2.2e-16 ***
Sex 45.12 45.12 1 358.1 7.3014 0.0072181 **
Age 7.95 7.95 1 358.1 1.2869 0.2573719
SES 6.06 6.06 1 358.7 0.9807 0.3226824
Culture:Valence 6.39 6.39 1 3364.6 1.0348 0.3091004
Culture:Event.Type 71.53 71.53 1 3389.7 11.5766 0.0006756 ***
Valence:Event.Type 2.89 2.89 1 3385.4 0.4682 0.4938573
Culture:Valence:Event.Type 3.47 3.47 1 3390.6 0.5617 0.4536399
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
As you can see, the DF for the effect of Culture is off -- since Culture is a between-subject factor, its denominator DF cannot be larger than our sample size. I've tried ddf = "Kenward-Roger" and tested the effect of Culture using emmeans::test(contrast(emmeans(VP.full.lm, c("Culture")), "trt.vs.ctrl"), joint = T), yet none of these methods solved the degrees-of-freedom issue.
I also thought that maybe the participants who did not provide both memories and imaginations were confusing the lmer model, so I subsetted my data to include only participants who provided both types of events. However, the degrees-of-freedom problem persists. It's also worth mentioning that once I removed the interaction between Culture and Event.Type, the degrees of freedom became plausible.
I wonder if anyone knows what is going on here and how we can fix this issue? Or is there a way we can explain away this weird behaviour...?
Thanks so much in advance!
This question might be more appropriate for CrossValidated ...
Not a complete solution, but some ideas:
from a practical point of view, the difference between 363 (or even 350) denominator df and 859 ddf is very small: a manual p-value calculation based on an F-statistic of 4.9732 gives pf(4.9732, 1, 350, lower.tail=FALSE) = 0.0264, hardly different from your value of 0.0260.
since you are fitting a simple model (LMM not GLMM, only a single simple random effect, etc.), you might be able to refit your model in lme (from the nlme package): it uses a simpler df computation that might give you the 'right' answer in this case. Alternatively, you can get code from here that implements a (slightly extended) version of the algorithm from lme.
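A minimal sketch of that refit, assuming the same VP_Long data frame as above:
library(nlme)
VP.full.lme <- lme(Visual.Perspective ~ Culture * Event.Type * Valence +
                     Sex + Age + SES,
                   random = ~ 1 | Participant.Number,
                   data = VP_Long)
anova(VP.full.lme)  # lme's "inner-outer" rule should give between-subject terms a denominator df tied to the number of participants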
since you're doing type-III ANOVA, you should be very careful with the parameterization/contrasts in your model: if you're not using centered (sum-to-zero) contrasts, your results may not mean what you think (the afex::mixed() function does some checks to make sure that this is true). It's conceivable (although I doubt it) that the contrasts are throwing off your ddf calculations as well.
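For example, a sketch of refitting with sum-to-zero contrasts (needs lmerTest loaded, and pbkrtest installed for the Kenward-Roger option):
library(lmerTest)
options(contrasts = c("contr.sum", "contr.poly"))  # sum-to-zero coding for all factors
VP.full.lm <- lmer(Visual.Perspective ~ Culture * Event.Type * Valence +
                     Sex + Age + SES + (1 | Participant.Number),
                   data = VP_Long)
anova(VP.full.lm, type = 3, ddf = "Kenward-Roger")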
it's not clear how you're measuring "visual perspective", but if it's a ratings scale you might be better off with an ordinal response model ...
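A sketch of that option, assuming the 1-11 rating can be treated as ordered (clmm() is from the ordinal package):
library(ordinal)
VP_Long$VP.ord <- factor(VP_Long$Visual.Perspective, ordered = TRUE)
VP.clmm <- clmm(VP.ord ~ Culture * Event.Type * Valence + Sex + Age + SES +
                  (1 | Participant.Number), data = VP_Long)
summary(VP.clmm)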

Power analysis for multiple regression using pwr and R

I want to determine the sample size necessary to detect an effect of an interaction term of two continuous variables (scaled) in a multiple regression with other covariates.
We have found an effect where previous smaller studies have failed. These effects are small, but a reviewer is asking us to say that previous studies were probably underpowered, and to provide some measure to support that.
I am using the pwr.f2.test() function in the pwr package, as follows:
pwr.f2.test(u = numerator df, v = denominator df, f2 = effect size, sig.level = 0.05, power = 0.8), where I set the denominator v to NULL so I can solve for the sample size.
Here is my model output from summary():
Estimate Std. Error t value Pr(>|t|)
(Intercept) -21.2333 20.8127 -1.02 0.30800
age 0.0740 0.0776 0.95 0.34094
wkdemand 1.6333 0.5903 2.77 0.00582 **
hoops 0.8662 0.6014 1.44 0.15028
wtlift 5.2417 1.3912 3.77 0.00018 ***
height05 0.2205 0.0467 4.72 2.9e-06 ***
amtRS 0.1041 0.2776 0.37 0.70779
allele1_numS -0.0731 0.2779 -0.26 0.79262
amtRS:allele1_numS 0.6267 0.2612 2.40 0.01670 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.17 on 666 degrees of freedom
Multiple R-squared: 0.0769, Adjusted R-squared: 0.0658
F-statistic: 6.94 on 8 and 666 DF, p-value: 8.44e-09
And the model effect size estimates from the modelEffectSizes() function in the lmSupport package:
Coefficients
SSR df pEta-sqr dR-sqr
(Intercept) 53.5593 1 0.0016 NA
age 46.7344 1 0.0014 0.0013
wkdemand 393.9119 1 0.0114 0.0106
hoops 106.7318 1 0.0031 0.0029
wtlift 730.5385 1 0.0209 0.0197
height05 1145.0394 1 0.0323 0.0308
amtRS 7.2358 1 0.0002 0.0002
allele1_numS 3.5599 1 0.0001 0.0001
amtRS:allele1_numS 296.2219 1 0.0086 0.0080
Sum of squared errors (SSE): 34271.3
Sum of squared total (SST): 37127.3
The question:
What value do I put in the f2 slot of pwr.f2.test()? I take it the numerator is going to be 1, and I should use the pEta-sqr from modelEffectSizes(), so in this case 0.0086?
Also, the estimated sample sizes I get are often much larger than our sample size of 675 - does this mean we were 'lucky' to have picked up a significant effect (we'll only detect it 50% of the time, given the effect size)? Note that we have multiple measures of different things all pointing to the same finding, so I'm relatively satisfied there.
What value do I put in the f2 slot of pwr.f2.test()?
For each of the pwr functions, you enter three of the four quantities (effect size, sample size, significance level, power) and the fourth is calculated (1). In pwr.f2.test, u and v are the numerator and denominator degrees of freedom, and f2 is the effect size measure, i.e. that is where you put an effect size estimate.
Is pEta-sqr the correct 'effect size' to use?
Now, there are many different effect size measures. pwr specifically uses Cohen's f², which is different from pEta-sqr, so I wouldn't recommend using pEta-sqr there.
Which effect size measure could I use then?
As #42- mentioned, you could try to use the delta-R² effect size, which in your output is labeled "dR-sqr". You can do this with the variation of Cohen's f² measuring local effect size that was described by Selya et al. (2012). It uses the following equation:
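f² = (R²AB - R²A) / (1 - R²AB)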
In the equation, B is the variable of interest, A is the set of all other variables, R²AB is the proportion of variance accounted for by A and B together (relative to a model with no regressors), and R²A is the proportion of variance accounted for by A alone (relative to a model with no regressors). I would do as #42- suggested, i.e. build two models, one with the interaction and one without, and use their delta-R² as the effect size.
Importantly, as #42- correctly pointed out, if the reviewers ask whether prior studies were underpowered, you need to use the sample sizes of those studies to make any power calculation. If you use the parameters of your own study, then first of all you already know the answer (you did have sufficient power to detect a difference), and second, you are doing a post hoc power analysis, which also doesn't sound correct.
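A minimal sketch of the calculation, using the numbers reported above (full-model R² = 0.0769; dR-sqr of the interaction = 0.0080, so the model without it has R²A of about 0.0689); the n = 200 for a "previous study" is a made-up placeholder:
library(pwr)
f2 <- (0.0769 - 0.0689) / (1 - 0.0769)  # local Cohen's f2 for the interaction, ~0.0087
# power a hypothetical previous study of n = 200 with the same 8 predictors would have had:
pwr.f2.test(u = 1, v = 200 - 8 - 1, f2 = f2, sig.level = 0.05)
# sample size needed for 80% power: solve for v, then n = v + 8 + 1
pwr.f2.test(u = 1, f2 = f2, sig.level = 0.05, power = 0.8)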
https://www.statmethods.net/stats/power.html
Selya et al., 2012: A Practical Guide to Calculating Cohen’s f2, a Measure of Local Effect Size, from PROC MIXED. Front Psychol. 2012;3:111.

Nested Anova in R

I want to show that seeds of different species display different lengths due to the factor Species.
For each species, I have several trees and for each tree, I have several seeds measured.
Using R, I did an ANOVA:
summary(aov(Length ~ Species))
However, the reviewer noticed a problem of independence because seeds may come from the same tree (and this is indeed a real problem!).
To address this issue, I think I should do a nested ANOVA. Is that right?
However, there are plenty of ways to write the code:
summary(aov(Length ~ Species*Tree))
summary(aov(Length ~ Tree*Species))
summary(aov(Length ~ Species/Tree))
summary(aov(Length ~ Species+Error(Tree)))
I believe it is the last possibility listed that will allow me to show that the length of seeds differs between species while taking into account that the seeds may come from the same tree.
Can you confirm ?
When I run the command, I obtain this:
Error: Tree
Df Sum Sq Mean Sq F value Pr(>F)
Species 12 320.6 26.715 14.98 4.96e-15 ***
Residuals 71 126.6 1.784
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 1541 11.92 0.007733
Which indeed means that Species has a significant impact on seed length, is that right?
Thanks so much for your help !!
Muriel
See here for some examples of nested ANOVA in R as well as some insight into mixed models.
I'd install the package lme4, do ?lmer in R, and look into the section "Mixed and Multilevel Models" on the page provided. Perhaps this is a better approach for your data.
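A minimal sketch of that mixed-model approach, assuming a data frame seeds with columns Length, Species and Tree (load lmerTest as well if you want p-values in the anova table):
library(lme4)
m <- lmer(Length ~ Species + (1 | Species:Tree), data = seeds)  # random intercept for each tree nested in species
summary(m)
anova(m)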

partition of anova and comparisons (orthogonal single df) in r

I want to do single-df orthogonal contrasts in an ANOVA (fixed or mixed model). Here is just an example:
require(nlme)
data(Alfalfa)
Variety: a factor with levels Cossack, Ladak, and Ranger
Date : a factor with levels None S1 S20 O7
Block: a factor with levels 1 2 3 4 5 6
Yield : a numeric vector
These data are described in Snedecor and Cochran (1980) as an example
of a split-plot design. The treatment structure used in the experiment
was a 3x4 full factorial, with three varieties of alfalfa and four
dates of third cutting in 1943. The experimental units were arranged
into six blocks, each subdivided into four plots. The varieties of alfalfa
(Cossack, Ladak, and Ranger) were assigned randomly to the blocks and
the dates of third cutting (None, S1—September 1, S20—September 20,
and O7—October 7) were randomly assigned to the plots.
All four dates were used on each block.
model <- with(Alfalfa, aov(Yield ~ Variety * Date + Error(Block/Date/Variety)))
summary(model)
Error: Block
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 5 4.15 0.83
Error: Block:Date
Df Sum Sq Mean Sq F value Pr(>F)
Date 3 1.9625 0.6542 17.84 3.29e-05 ***
Residuals 15 0.5501 0.0367
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Block:Date:Variety
Df Sum Sq Mean Sq F value Pr(>F)
Variety 2 0.1780 0.08901 1.719 0.192
Variety:Date 6 0.2106 0.03509 0.678 0.668
Residuals 40 2.0708 0.05177
I want to perform some comparisons (orthogonal contrasts within a factor), for example for Date, two contrasts:
(a) S1 vs others (S20 O7)
(b) S20 vs O7,
For the Variety factor, two contrasts:
(c) Cossack vs others (Ladak and Ranger)
(d) Ladak vs Ranger
Thus the anova output would look like:
Error: Block
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 5 4.15 0.83
Error: Block:Date
Df Sum Sq Mean Sq F value Pr(>F)
Date 3 1.9625 0.6542 17.84 3.29e-05 ***
(a) S1 vs others ? ?
(b) S20 vs O7 ? ?
Residuals 15 0.5501 0.0367
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Block:Date:Variety
Df Sum Sq Mean Sq F value Pr(>F)
Variety 2 0.1780 0.08901 1.719 0.192
(c) Cossack vs others ? ? ?
(d) Ladak vs Ranger ? ? ?
Variety:Date 6 0.2106 0.03509 0.678 0.668
Residuals 40 2.0708 0.05177
How can I perform this?
First of all, why use ANOVA? You can use lme from the nlme package and in addition to the hypothesis tests aov gives you, you also get interpretable estimates of the effect sizes and the directions of the effects. At any rate, two approaches come to mind:
Specify contrasts on the variables manually, as explained here.
Install the multcomp package and use glht.
glht is a little opinionated about models that are multivariate in their predictors. Long story short, though, if you were to create a diagonal matrix cm0 with the same dimensions and dimnames as the vcov of your model (let's assume it's an lme fit called model0), then summary(glht(model0,linfct=cm0)) should give the same estimates, SEs, and test statistics as summary(model0)$tTable (but incorrect p-values). Now, if you mess around with linear combinations of rows from cm0 and create new matrices with the same number of columns as cm0 but these linear combinations as rows, you'll eventually figure out the pattern to creating a matrix that will give you the intercept estimate for each cell (check it against predict(model0,level=0)). Now, another matrix with differences between various rows of this matrix will give you corresponding between-group differences. The same approach but with numeric values set to 1 instead of 0 can be used to get the slope estimates for each cell. Then the differences between these slope estimates can be used to get between-group slope differences.
Three things to keep in mind:
As I said, the p-values are going to be wrong for models other than lm, (possibly, I haven't tried) aov, and certain survival models. This is because glht assumes a z distribution instead of a t distribution by default (except for lm). To get correct p-values, take the test statistic glht calculates and manually do 2*pt(abs(STAT), df=DF, lower.tail=FALSE) to get the two-tailed p-value, where STAT is the test statistic returned by glht and DF is the df from the corresponding type of default contrast in summary(model0)$tTable.
Your contrasts probably no longer test independent hypotheses, and multiple testing correction is necessary, if it wasn't already. Run the p-values through p.adjust.
This is my own distillation of a lot of handwaving from professors and colleagues, and a lot of reading of Cross Validated and Stack Overflow on related topics. I could be wrong in multiple ways, and if I am, hopefully someone more knowledgeable will correct us both.
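A minimal sketch of the first approach (manual orthogonal contrasts plus splitting the aov terms); the contrast weights and labels below are my own, chosen to match the comparisons listed in the question:
# Date levels: None, S1, S20, O7; column 1 = S1 vs (S20, O7), column 2 = S20 vs O7
contrasts(Alfalfa$Date)    <- cbind(c(0,  2, -1, -1),
                                    c(0,  0,  1, -1))
# Variety levels: Cossack, Ladak, Ranger; column 1 = Cossack vs others, column 2 = Ladak vs Ranger
contrasts(Alfalfa$Variety) <- cbind(c(2, -1, -1),
                                    c(0,  1, -1))
model <- with(Alfalfa, aov(Yield ~ Variety * Date + Error(Block/Date/Variety)))
summary(model,
        split = list(Date    = list("S1 vs others" = 1, "S20 vs O7" = 2),
                     Variety = list("Cossack vs others" = 1, "Ladak vs Ranger" = 2)))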

Resources