Multiple testing methods - R

I want to simulate the effect of different kinds of multiple testing correction, such as Bonferroni, Fisher's LSD, Duncan, Dunn-Sidak, Newman-Keuls, Tukey, etc., on ANOVA.
I guess I should simply run a regular ANOVA and then accept as significant the p-values I calculate using p.adjust. But I'm not getting how this p.adjust function works. Could you give me some insight into p.adjust()? For example, when running:
> p.adjust(c(0.05,0.05,0.1),"bonferroni")
# [1] 0.15 0.15 0.30
Could someone explain what this output means?
Thank you for your answer. I know a bit about all of that, but I still don't understand the output of p.adjust. I'd expect that...
p.adjust(0.08, 'bonferroni', n=10)
... would return 0.008 and not 0.8. Doesn't n=10 mean that I'm doing 10 comparisons? And isn't 0.08 the "original alpha" (I mean the threshold I'd use to reject the null hypothesis if I had one simple comparison)?

You'll have to read about each multiple testing correction technique and whether it controls the False Discovery Rate (FDR) or the Family-Wise Error Rate (FWER). (Thanks to @thelatemail for pointing out that the abbreviations should be expanded.)
Bonferroni correction controls the FWER by setting the significance level alpha to alpha/n, where n is the number of hypotheses tested in a typical multiple comparison (here n=3).
Let's say you are testing at a 5% alpha level, meaning that if your p-value is < 0.05, you reject your NULL. For n=3, the Bonferroni correction divides alpha by 3: 0.05/3 ≈ 0.0167, and you then check whether your p-values are < 0.0167.
Equivalently, instead of checking pval < alpha/n, you can move the n to the other side and check pval * n < alpha, so that alpha keeps its original value. That is, your p-values get multiplied by 3 and are then checked against alpha = 0.05, for example.
Therefore, the output you obtain is the FWER-controlled (adjusted) p-value: if it is < alpha (5%, say), you reject the NULL; otherwise you fail to reject it.
There are different procedures for controlling false positives due to multiple testing. Wikipedia is a good starting point for learning how the other methods correct for this.
In general, the output of p.adjust is a multiple-testing-corrected p-value. For the Bonferroni method it is an FWER-controlled p-value; for the BH method it is an FDR-corrected p-value (also called a q-value).
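To make the arithmetic concrete, here is a small check (a sketch; the p-values are the ones from the question):
p <- c(0.05, 0.05, 0.1)
# Bonferroni by hand: multiply each p-value by the number of tests,
# capping at 1 since a p-value cannot exceed 1
pmin(p * length(p), 1)
# [1] 0.15 0.15 0.30
# the same thing via p.adjust
p.adjust(p, method = "bonferroni")
# [1] 0.15 0.15 0.30
# this also answers the follow-up: 0.08 is treated as a p-value to be
# adjusted, not as an alpha level, so it is multiplied by n = 10
p.adjust(0.08, method = "bonferroni", n = 10)
# [1] 0.8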
Hope this helps a bit.

Related

Is there a way to change the significance level (alpha) in R?

I'm trying to perform a simple hypothesis test, but now I need t-values for alpha = 0.01 instead of 0.05 (the default). Is there a way to do this in R?
This is what I am trying to get for alpha = 0.01 (the screenshot of the desired output is not reproduced here).
If you used the t.test function in R, you can pass the argument conf.level = 0.99, since the confidence level is equivalent to 1 minus the alpha level. You can also read the t.test page on RDocumentation for more information on which arguments can be used.
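For example, a minimal sketch with made-up data:
x <- rnorm(30)
# a 99% confidence level corresponds to alpha = 0.01
t.test(x, conf.level = 0.99)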
This seems to be a statistical question rather than a programming question, and as such probably belongs on CrossValidated ...
Results table:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 93.10386 44.482243 2.093057 3.647203e-02
educ 39.82828 3.314467 12.016497 3.783286e-32
When you change alpha (the cutoff value for significance testing), nothing in the table above — neither the t-statistic (t value) nor the p-value (Pr(>|t|)) — changes. The only thing that changes is the judgment of whether you rejected or failed to reject the null hypothesis. In this case, since the p-value for the intercept (0.036) is between 0.01 and 0.05, the conclusion would change from "reject H0" (alpha=0.05) to "fail to reject H0" (alpha=0.01). The p-value for educ is way less than 0.01, so the conclusion would be "reject" either way.
In most cases, base-R functions don't specify an alpha value; they let you make the decision yourself. If you do have a vector of p-values, you could implement an alpha threshold by saying
result <- ifelse(pval<alpha, "reject H0", "fail to reject H0")
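A quick illustration, using the two p-values from the table above:
pval <- c(3.647203e-02, 3.783286e-32)  # intercept and educ p-values from the table
alpha <- 0.01
ifelse(pval < alpha, "reject H0", "fail to reject H0")
# [1] "fail to reject H0" "reject H0"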

How to calculate p-values for each feature in R using two sample t-test

I have two data frames, cases and controls, and I performed a two sample t-test as shown below. But I am doing feature extraction from a feature set of 1299 features/columns, so I want to calculate a p-value for each feature. Based on the p-value generated for each feature I want to reject or accept the null hypothesis.
Can anyone explain how the output below should be interpreted, and how to calculate the p-values for each feature?
t.test(New_data_zero,New_data_one)
Welch Two Sample t-test
data: New_data_zero_pca and New_data_one_pca
t = -29.086, df = 182840000, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.02499162 -0.02183612
sample estimates:
mean of x mean of y
0.04553462 0.06894849
Look at ?t.test. x and y are supposed to be vectors, not matrices, so the function is automatically converting them to vectors. What you want to do, assuming that the columns are features and that the two matrices have the same features, is:
pvals <- numeric(ncol(New_data_zero))
for (i in seq_len(ncol(New_data_zero))) {
  pvals[i] <- t.test(New_data_zero[, i], New_data_one[, i])$p.value
}
Then you can look at pvals (probably on a log scale), after applying a multiple hypothesis testing correction (see ?p.adjust).
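For example, a Benjamini-Hochberg (FDR) correction on that vector might look like this (the 0.05 threshold is just an example):
pvals_adj <- p.adjust(pvals, method = "BH")
# indices of features whose FDR-adjusted p-value falls below 0.05
which(pvals_adj < 0.05)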
Let's also address the enormously bad idea of using this approach to find differences among your features. Even if all of the effects across these 1299 features are literally zero, you will find significant results in roughly 5% of the comparisons by chance alone, which makes this strategy effectively meaningless. I would strongly suggest taking a look at an introductory statistics text, especially the section on family-wise Type I error rates, before proceeding.

Student's t alpha change

I need to run several one sample t-tests. Is there a way of changing the alpha level for this test? Or is it necessary to do some kind of correction for a one sample t-test (like a Bonferroni correction for a paired t-test)?
Many thanks!
The t.test function returns a p-value
t.test(rnorm(10))$p.value
You can set the cut-off. The function does have an argument conf.level for the confidence interval.
To correct for multiple comparisons, see p.adjust.
p_values = c(0.1, 0.01, 0.05)
p.adjust(p_values, method="bonferroni")
[1] 0.30 0.03 0.15
If you look at the output of t.test you may have noticed that it is independent of alpha. The test gives you the information you need to make a decision, but the decision criteria are not part of it. For that reason it's difficult for people to help you, because your question doesn't really have anything to do with the t.test R command. I do caution against adjusting the p-values post hoc using p.adjust. This is especially problematic because many adjustments actually vary alpha across comparisons (although, as you indicate, Bonferroni uses a single one). It is much more honest to report your modified alpha value, which for Bonferroni is just 0.05 divided by the number of tests.
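As a sketch, comparing the unadjusted p-values against the Bonferroni-adjusted alpha gives the same decisions as comparing the adjusted p-values against the original alpha:
p_values <- c(0.1, 0.01, 0.05)
alpha_adj <- 0.05 / length(p_values)  # Bonferroni-adjusted alpha
p_values < alpha_adj
# [1] FALSE  TRUE FALSE
p.adjust(p_values, method = "bonferroni") < 0.05
# [1] FALSE  TRUE FALSE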

Perform a Shapiro-Wilk Normality Test

I want to perform a Shapiro-Wilk normality test. My data is in CSV format. It looks like this:
heisenberg
HWWIchg
1 -15.60
2 -21.60
3 -19.50
4 -19.10
5 -20.90
6 -20.70
7 -19.30
8 -18.30
9 -15.10
However, when I perform the test, I get:
shapiro.test(heisenberg)
Error in `[.data.frame`(x, complete.cases(x)) :
  undefined columns selected
Why isn't R selecting the right column, and how do I do that?
What does shapiro.test do?
shapiro.test tests the Null hypothesis that "the samples come from a Normal distribution" against the alternative hypothesis "the samples do not come from a Normal distribution".
How to perform shapiro.test in R?
The R help page for ?shapiro.test gives,
x - a numeric vector of data values. Missing values are allowed,
but the number of non-missing values must be between 3 and 5000.
That is, shapiro.test expects a numeric vector as input, corresponding to the sample you would like to test, and it is the only required input. Since you have a data.frame, you'll have to pass the desired column as input to the function, as follows:
> shapiro.test(heisenberg$HWWIchg)
# Shapiro-Wilk normality test
# data: heisenberg$HWWIchg
# W = 0.9001, p-value = 0.2528
Interpreting results from shapiro.test:
First, I strongly suggest you read this excellent answer from Ian Fellows on testing for normality.
As shown above, the shapiro.test tests the NULL hypothesis that the samples came from a Normal distribution. This means that if your p-value <= 0.05, you would reject the NULL hypothesis that the samples came from a Normal distribution. As Ian Fellows nicely put it, you are "testing against the assumption of Normality". In other words (correct me if I am wrong), it would be much better if one could test the NULL hypothesis that the samples do not come from a Normal distribution. Why? Because rejecting a NULL hypothesis is not the same as accepting the alternative hypothesis.
In the case of shapiro.test's NULL hypothesis, a p-value <= 0.05 rejects the NULL hypothesis that the samples come from a normal distribution. To put it loosely, there is only a rare chance that the samples came from a normal distribution. The side effect of this hypothesis testing is that this rare chance happens very rarely. To illustrate, take for example:
set.seed(450)
x <- runif(50, min=2, max=4)
shapiro.test(x)
# Shapiro-Wilk normality test
# data: x
# W = 0.9601, p-value = 0.08995
So, according to this test, this (particular) sample runif(50, min=2, max=4) comes from a normal distribution, even though it was drawn from a uniform distribution. What I am trying to say is that there are many, many cases in which the "extreme" requirement (p < 0.05) is not satisfied, which leads to acceptance of the NULL hypothesis most of the time, and that might be misleading.
Another issue I'd like to quote here, from @PaulHiemstra's comments, about the effect of large sample sizes:
An additional issue with the Shapiro-Wilk test is that when you feed it more data, the chances of the null hypothesis being rejected become larger. So what happens is that for large amounts of data even very small deviations from normality can be detected, leading to rejection of the null hypothesis even though for practical purposes the data is more than normal enough.
Although he also points out that R's data size limit mitigates this a bit:
Luckily shapiro.test protects the user from the above described effect by limiting the data size to 5000.
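A small simulation (a sketch, not from the original answers) illustrates the sample size effect: a t distribution with 10 degrees of freedom deviates only mildly from a normal distribution, yet the rejection rate of shapiro.test grows with the sample size.
# estimated rejection rate of shapiro.test at alpha = 0.05
# for mildly non-normal data (t distribution, df = 10)
set.seed(1)
reject_rate <- function(n, reps = 500) {
  mean(replicate(reps, shapiro.test(rt(n, df = 10))$p.value < 0.05))
}
reject_rate(25)    # small sample: the null is rarely rejected
reject_rate(2500)  # large sample: the null is rejected far more often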
If the NULL hypothesis were the opposite, meaning that the samples do not come from a normal distribution, and you got a p-value < 0.05, then you would conclude that it is very rare that these samples do not come from a normal distribution (i.e., you reject that NULL hypothesis). That loosely translates to: it is highly likely that the samples are normally distributed (although some statisticians may not like this way of interpreting it). I believe this is what Ian Fellows was also trying to explain in his post. Please correct me if I've gotten something wrong!
@PaulHiemstra also comments on practical situations (regression, for example) in which one comes across this problem of testing for normality:
In practice, if an analysis assumes normality, e.g. lm, I would not do this Shapiro-Wilk test, but do the analysis and look at diagnostic plots of the outcome of the analysis to judge whether any assumptions of the analysis were violated too much. For linear regression using lm this is done by looking at some of the diagnostic plots you get using plot(lm()). Statistics is not a series of steps that cough up a few numbers (hey, p < 0.05!) but requires a lot of experience and skill in judging how to analyse your data correctly.
Here, I find the reply from Ian Fellows to Ben Bolker's comment under the same question already linked above equally (if not more) informative:
For linear regression,
Don't worry much about normality. The CLT takes over quickly and if you have all but the smallest sample sizes and an even remotely reasonable looking histogram you are fine.
Worry about unequal variances (heteroskedasticity). I worry about this to the point of (almost) using HCCM tests by default. A scale location plot will give some idea of whether this is broken, but not always. Also, there is no a priori reason to assume equal variances in most cases.
Outliers. A Cook's distance of > 1 is reasonable cause for concern.
Those are my thoughts (FWIW).
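To make the plot(lm()) diagnostic-plots advice quoted above concrete, here is a minimal sketch (the model and data are made up):
# fit a simple linear regression on made-up data
set.seed(42)
d <- data.frame(x = rnorm(100))
d$y <- 2 + 3 * d$x + rnorm(100)
fit <- lm(y ~ x, data = d)
# the four standard diagnostic plots: residuals vs fitted, Q-Q plot,
# scale-location, and residuals vs leverage (with Cook's distance)
par(mfrow = c(2, 2))
plot(fit)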
Hope this clears things up a bit.
You are applying shapiro.test() to a data.frame instead of the column. Try the following:
shapiro.test(heisenberg$HWWIchg)
You failed to specify the exact columns (data) to test for normality.
Use this instead:
shapiro.test(heisenberg$HWWIchg)
That is, extract the column as a vector and pass it to the function.

Simulating p-values for Chi-Squared Test using Monte-Carlo Method

I'm trying to write a script in R that allows me to approximate by simulation the critical values (p-values) for a Pearson chi-squared test, for different alpha values.
I know that an option in "chisq.test" exists, but I want to know how to do this simulation by hand.
For example:
Please check the code at http://www.biostat.wisc.edu/~kbroman/teaching/stat371/comp21.R (I don't know how to put the code properly)
If you check the last part ("p-value by simulation"), you'll see the way p-values are obtained in the script. I want to do this, but for different alpha values.
Thank you very much!
The calculation of the p-value of any statistical test (whatever the method: classical, bootstrap) has nothing to do with the alpha value, if by that you mean the significance level. You need the alpha value only when making the decision to accept or reject the null hypothesis (if the p-value is less than the chosen alpha, reject the null).
If you have done a simulation as shown in the script, and have derived a vector of simulation values xsqsim, then the critical value for an alpha level of alpha is approximately
quantile(xsqsim,1-alpha)
You have to be a little bit careful if you have a small sample, because the critical value should be the value of the test statistic q such that the probability of the observed value being greater than or equal to q is equal to alpha ...
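A rough end-to-end sketch of the simulation approach (not the code from the linked script; the contingency table and number of replicates are made up) might look like this:
# observed 2x3 contingency table (made-up data)
obs <- matrix(c(20, 30, 25, 15, 10, 30), nrow = 2)
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
obs_stat <- sum((obs - expected)^2 / expected)
# simulate the statistic under the null of independence, holding the row
# and column totals fixed (so the expected counts stay the same for every
# simulated table)
nsim <- 10000
xsqsim <- sapply(r2dtable(nsim, rowSums(obs), colSums(obs)), function(tab) {
  sum((tab - expected)^2 / expected)
})
# simulated p-value, and critical values for several alpha levels
mean(xsqsim >= obs_stat)
quantile(xsqsim, 1 - c(0.10, 0.05, 0.01))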

Resources