test for difference of means returns wrong result - r

I'm running the example from the R-intro manual:
A = c(79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04, 79.97, 80.05, 80.03, 80.02, 80.00, 80.02)
B = c(80.02, 79.94, 79.98, 79.97, 79.97, 80.03, 79.95, 79.97)
t.test(A, B)
Which produces the following result:
Welch Two Sample t-test
data: A and B
t = 3.2499, df = 12.027, p-value = 0.006939
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.01385526 0.07018320
sample estimates:
mean of x mean of y
80.02077 79.97875
The question is: if the difference of means is contained within the confidence interval (80.02077 - 79.97875 = 0.04202, and 0.01385526 < 0.04202 < 0.07018320), why does it conclude that the alternative hypothesis is true and not that the null hypothesis is true?

I think this is a language/interpretation problem. You are interpreting
alternative hypothesis: true difference in means is not equal to 0
as
The alternative hypothesis is true. The difference in means is not equal to 0
rather than (as intended)
The alternative hypothesis is: "the true difference in means is not equal to 0"
(According to strict frequentist logic we would never conclude "the alternative hypothesis is true", only that we can reject the null hypothesis.)
In order to evaluate the conclusions of the test, you should look at the 95% confidence interval (0.01385526, 0.07018320) and/or the p-value (0.0069). The procedure implemented in R does not follow a "Neyman-Pearson" style where you pre-specify an alpha level and dichotomize the result into "reject null hypothesis" or "fail to reject null hypothesis". If you want to do this, you can either just look at the p-value or, if you want R to do it for you,
alpha <- 0.05 ## or whatever your preferred cutoff is
t_result <- t.test(A,B)
t_result$p.value < alpha ## TRUE (reject null hypothesis)
Furthermore, your interpretation of the confidence interval is wrong. You should look to see whether the confidence interval includes zero; it will always be centred on the observed difference (so the observed difference will always be included in the 95% CI).
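If you would rather have R check the interval for you, a minimal sketch (reusing the t_result object from above) is to test whether zero lies inside the reported confidence interval:
ci <- t_result$conf.int                  # 95% CI stored in the test object
zero_in_ci <- ci[1] <= 0 && 0 <= ci[2]   # is 0 inside the interval?
zero_in_ci                               # FALSE here, consistent with p < alpha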

Is using 'adjust = "tukey"' in emmeans equivalent to a Tukey HSD test?

I have been looking around and I am quite confused about Tukey adjustment in emmeans.
The documentation of emmeans doesn't mention Tukey HSD at all, but here it is said that "For most contrast() results, adjust is often something else, depending on what type of contrasts are created. For example, pairwise comparisons default to adjust = "tukey", i.e., the Tukey HSD method.".
As I understand it, Tukey HSD is essentially a series of pairwise t-tests with an adjustment for type I error.
But the emmeans() function calculates estimated marginal means (EMMs), which I assume are not pairwise t-tests; so applying the Tukey adjustment to emmeans output would not be equivalent to a Tukey HSD post hoc test.
A second, related question: what does the function tukey.emmc(), also from emmeans, do?
[Update]
I guess my second question is, what is the difference between tukey.emmc and contrast() with 'adjust = "tukey"'?
Using adjust = "tukey" means that critical values and adjusted P values are obtained from the Studentized range distribution qtukey() and ptukey() respectively. Those are the same critical values that are used in the Tukey HSD test. But to put a very fine edge on it, the Tukey HSD method is really defined only for independent samples of equal size, which may or may not be the case for emmeans() results. For more details, see ? summary.emmGrid and refer to the section on P-value adjustments.
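To make that connection concrete, here is a hedged base-R sketch (warpbreaks is just a convenient balanced example, not from your analysis) that reproduces one Tukey HSD adjusted p-value from ptukey():
# balanced one-way design: the Tukey HSD p-value is the upper tail of the
# Studentized range distribution at q = |t| * sqrt(2)
fit <- aov(breaks ~ tension, data = warpbreaks)   # 3 groups of 18
TukeyHSD(fit)                                     # reference adjusted p-values
# reproduce the M - L comparison by hand (object names are illustrative)
mse    <- sigma(fit)^2
n      <- 18
diffML <- diff(tapply(warpbreaks$breaks, warpbreaks$tension, mean)[c("L", "M")])
tstat  <- diffML / sqrt(2 * mse / n)
ptukey(abs(tstat) * sqrt(2), nmeans = 3, df = df.residual(fit), lower.tail = FALSE)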
Regarding the second question: pairwise.emmc() generates contrast coefficients for pairwise comparisons, as does revpairwise.emmc(). Here is a third possibility:
> emmeans:::tukey.emmc
function(levs, reverse = FALSE, ...) {
    if (reverse)
        revpairwise.emmc(levs, ...)
    else
        pairwise.emmc(levs, ...)
}
That is, tukey.emmc() invokes one of those pairwise-comparison methods depending on reverse. Thus, contrast(..., method = "tukey", reverse = TRUE) is equivalent to contrast(..., method = "revpairwise").
Every .emmc function passes a default adjustment method to contrast(), and in the case of pairwise.emmc() and tukey.emmc(), that default is adjust = "tukey". Thus, calling contrast(..., method = "pairwise") is the same as contrast(..., method = "pairwise", adjust = "tukey"). Other contrast functions may pass different defaults; for example, consec.emmc() passes the "mvt" adjustment by default:
> emmeans:::consec.emmc(1:4)
  2 - 1 3 - 2 4 - 3
1    -1     0     0
2     1    -1     0
3     0     1    -1
4     0     0     1
> attributes(.Last.value)
$names
[1] "2 - 1" "3 - 2" "4 - 3"
$row.names
[1] 1 2 3 4
$class
[1] "data.frame"
$desc
[1] "changes between consecutive levels"
$adjust
[1] "mvt"
An additional comment about the Tukey adjustment: That adjustment is only appropriate for a single set of pairwise comparisons. If you specify adjust = "tukey" for non-pairwise comparisons or arbitrary contrasts, it will overrule you and use the "sidak" adjustment instead.
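For concreteness, a minimal sketch of these defaults using the built-in warpbreaks data (the model and object names are illustrative, not from your setup):
library(emmeans)
mod <- lm(breaks ~ tension, data = warpbreaks)
emm <- emmeans(mod, "tension")
contrast(emm, method = "pairwise")                  # Tukey adjustment by default
contrast(emm, method = "consec")                    # "mvt" adjustment by default
contrast(emm, method = "consec", adjust = "tukey")  # overruled: "sidak" is used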

I'm solving a statistics problem using R and I need help regarding a confidence interval for the population mean difference [closed]

> be = c(81.6, 88.5, 80.3, 100.2, 94.3, 90.3)
> af = c(83.0, 84.8, 73.0, 92.5, 89.4, 85.7)
> t.test(be, af, alternative = "two.sided", mu = 0, conf.level = 0.99)
Welch Two Sample t-test
data: be and af
t = 1.084, df = 9.8542, p-value = 0.3042
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval:
-8.636224 17.569557
sample estimates:
mean of x mean of y
89.20000 84.73333
So I've obtained the p-value = 0.3042, which is greater than α = 0.01, hence I'll reject H1 and the answer for (b) will be "the 12-week special exercise program does not reduce weight".
But how should I answer part (a)?
In answering part (b), you should first have performed a Shapiro-Wilk test to determine whether the differences in weights are normally distributed.
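For example, a quick sketch of that normality check on the paired differences, using shapiro.test() from base R:
dif <- be - af        # paired differences
shapiro.test(dif)     # a large p-value gives no evidence against normality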
Secondly, your use of the t-test function is incorrect. You should have applied a paired t-test.
> t.test(be, af, paired = TRUE, conf.level = 0.99)
Paired t-test
data: be and af
t = 3.3388, df = 5, p-value = 0.02058
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval:
-0.9276381 9.8609714
sample estimates:
mean of the differences
4.466667
This output shows a p-value of 0.02, which means you reject the null hypothesis of no effect at the conventional 0.05 level (though not at the 0.01 level implied by conf.level = 0.99). (Intuitively, the results do show a material weight drop in 5 out of 6 cases.)
For part (a), the 99% confidence interval is given in the output above: (-0.93, 9.86).
You can also confirm this using the CI() function from the Rmisc package, or calculate it on your own; see the code below.
dif <- be - af
library(Rmisc)
CI(dif, 0.99)
# Calculate limits by hand
mean(dif)
s <- sd(dif)
n <- length(dif)
error <- qt(0.995, df = n - 1) * s / sqrt(n)
mean(dif) - error  # Lower limit
mean(dif) + error  # Upper limit
Ref: https://www.cyclismo.org/tutorial/R/confidence.html#calculating-a-confidence-interval-from-a-normal-distribution
You include the answer to A as part of your question!
99 percent confidence interval:
-8.636224 17.569557
To be clear, though: you won't reject H1, you'll fail to reject H0. The reason behind this phrasing is that, with another set of data, you might find a different result. Hope this helps.

Is it possible to change Type I error threshold using t.test() function?

I am asked to compute a test statistic using the t.test() function, but I need to reduce the type I error. My prof showed us how to change the confidence level for this function, but not the acceptable type I error for null hypothesis testing. The goal is for the function to automatically compute a p-value based on a .01 error rate rather than the usual .05.
The R code below involves a data set that I have downloaded.
t.test(mid$log_radius_area, mu=8.456)
I feel like I've answered this somewhere, but can't seem to find it on SO or CrossValidated.
Similarly to this question, the answer is that t.test() doesn't specify any threshold for rejecting/failing to reject the null hypothesis; it reports a p-value, and you get to decide whether to reject or not. (The conf.level argument is for adjusting which confidence interval the output reports.)
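As a small illustration with toy data (not from your data set), conf.level changes only the reported interval; the p-value is unchanged:
set.seed(1)
x <- rnorm(20, mean = 0.5)                     # toy data
t.test(x, mu = 0)$p.value                      # same p-value ...
t.test(x, mu = 0, conf.level = 0.99)$p.value   # ... whatever conf.level is
t.test(x, mu = 0)$conf.int                     # 95% interval
t.test(x, mu = 0, conf.level = 0.99)$conf.int  # wider 99% interval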
From ?t.test:
t.test(1:10, y = c(7:20))
Welch Two Sample t-test
data: 1:10 and c(7:20)
t = -5.4349, df = 21.982, p-value = 1.855e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.052802 -4.947198
sample estimates:
mean of x mean of y
5.5 13.5
Here the p-value is reported as 1.855e-05, so the null hypothesis would be rejected for any (pre-specified) alpha level >1.855e-05. Note that the output doesn't say anywhere "the null hypothesis is rejected at alpha=0.05" or anything like that. You could write your own function to do that, using the $p.value element that is saved as part of the test results:
report_test <- function(tt, alpha = 0.05) {
    cat("the null hypothesis is ")
    if (tt$p.value > alpha) {
        cat("**NOT** ")
    }
    cat("rejected at alpha=", alpha, "\n")
}
tt <- t.test(1:10, y = c(7:20))
report_test(tt)
## the null hypothesis is rejected at alpha= 0.05
Most R package/function writers don't bother to do this, because they figure that it should be simple enough for users to do for themselves.

R - Error T-test For loop command between variables

I'm currently writing a for loop that will calculate and print t-test results. I'm testing for a difference in means of each of the variables (faminc, fatheduc, motheduc, white, cigtax, cigprice) between smokers and non-smokers ("smoke"; 0 = non-smoker, 1 = smoker).
Current code:
type <- c("faminc", "fatheduc", "motheduc", "white", "cigtax", "cigprice")
count <- 1
for (name in type) {
    temp <- subset(data, data[name] == 1)
    cat("For", name, "between smokers and non, the difference in means is: \n")
    print(t.test(temp$smoke))
    count <- count + 1
}
However, I feel that 'temp' doesn't belong here and when running the code I get:
For faminc between smokers and non, the difference in means is:
Error in t.test.default(temp$smoke) : not enough 'x' observations
The simple code of
t.test(faminc~smoke,data=data)
does what I need, but I'd like to get some practice/better understanding of for loops.
Here is a solution that generates the output requested in the OP, using lapply() with the mtcars data set.
data(mtcars)
varList <- c("wt", "disp", "mpg")
results <- lapply(varList, function(x) {
    t.test(mtcars[[x]] ~ mtcars$am)
})
names(results) <- varList
for (i in 1:length(results)) {
    message(paste("for variable:", names(results[i]), "difference between manual and automatic transmissions is:"))
    print(results[[i]])
}
...and the output:
> for(i in 1:length(results)){
+ message(paste("for variable:",names(results[i]),"difference between manual and automatic transmissions is:"))
+ print(results[[i]])
+ }
for variable: wt difference between manual and automatic transmissions is:
Welch Two Sample t-test
data: mtcars[[x]] by mtcars$am
t = 5.4939, df = 29.234, p-value = 6.272e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.8525632 1.8632262
sample estimates:
mean in group 0 mean in group 1
3.768895 2.411000
for variable: disp difference between manual and automatic transmissions is:
Welch Two Sample t-test
data: mtcars[[x]] by mtcars$am
t = 4.1977, df = 29.258, p-value = 0.00023
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
75.32779 218.36857
sample estimates:
mean in group 0 mean in group 1
290.3789 143.5308
for variable: mpg difference between manual and automatic transmissions is:
Welch Two Sample t-test
data: mtcars[[x]] by mtcars$am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.280194 -3.209684
sample estimates:
mean in group 0 mean in group 1
17.14737 24.39231
>
Compare your code that works...
t.test(faminc~smoke,data=data)
You are specifying a relationship between variables (faminc ~ smoke), which says that you think the mean of faminc differs between the values of smoke, and that you wish to use the data dataset.
The equivalent line in your loop...
print(t.test(temp$smoke))
...only gives the single column temp$smoke after having selected those rows that have the value 1 for each of faminc, fatheduc, etc. So even if you wrote...
print(t.test(faminc~smoke, data=data))
...inside the loop, it would simply repeat the same test on faminc at every iteration rather than cycling through your variables.
Furthermore, your count variable is doing nothing.
If you want to perform a range of tests in this manner, you could do:
type <- c("faminc", "fatheduc", "motheduc", "white", "cigtax", "cigprice")
for (name in type) {
    cat("For", name, "between smokers and non, the difference in means is: \n")
    # build the formula (e.g. faminc ~ smoke) from the character variable name
    print(t.test(reformulate("smoke", response = name), data = data))
}
Whether this is what you want to do, though, isn't clear to me: your variables suggest family income (faminc), father's education (fatheduc), mother's education (motheduc), ethnicity (white), tax (cigtax) and price (cigprice).
I can't think why you would want to compare the mean cigarette price or tax between smokers and non-smokers, because the latter are not going to have any value for these since they don't smoke!
Your code suggests these are perhaps binary variables, though (since you are filtering on each value being 1), which to me suggests this isn't even what you want to do.
If you wish to run tests over subsets of the data, a tidier approach than explicit loops is to use purrr; see the sketch below.
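Here is a hedged sketch of that purrr approach, using mtcars as stand-in data since no sample of your data was provided (variable names are illustrative):
library(purrr)
vars  <- c("wt", "disp", "mpg")
tests <- map(set_names(vars),
             ~ t.test(reformulate("am", response = .x), data = mtcars))
iwalk(tests, ~ { message("t-test for ", .y); print(.x) })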
In future, when asking, consider providing a sample of data along with the full copied-and-pasted output, as advised in How to create a Minimal, Complete, and Verifiable example - Help Center - Stack Overflow, because this allows people to see in greater detail what you are doing (e.g. I have only guessed about your variables). With statistics it's also useful to state what your hypothesis is, to help people understand what you are trying to achieve overall.

Calculating standard error after a log-transform

Consider a random set of numbers that are normally distributed:
x <- rnorm(n=1000, mean=10)
We'd like to know the mean and the standard error on the mean so we do the following:
se <- function(x) { sd(x)/length(x) }
mean(x) # something near 10.0 units
se(x) # something near 0.001 units
Great!
However, let's assume we don't necessarily know that our original distribution follows a normal distribution. We log-transform the data and perform the same standard error calculation.
z <- log(x, base=10)
mean(z) # something near 1 log units
se(z) # something near 0.000043 log units
Cool, but now we need to back-transform to get our answer in units NOT log units.
10^mean(z) # something near 10.0 units
10^se(z) # something near 1.00 units
My question: Why, for a normal distribution, does the standard error differ depending on whether it was calculated from the distribution itself or if it was transformed, calculated, and back-transformed? In this example, it is interesting that the difference is almost exactly 3 orders of magnitude. Note: the means came out the same regardless of the transformation.
EDIT #1: Ultimately, I am interested in calculating a mean and confidence intervals for non-normally distributed data, so if you can give some guidance on how to calculate 95% CI's on transformed data including how to back-transform to their native units, I would appreciate it!
END EDIT #1
EDIT #2: I tried using the quantile function to get the 95% confidence intervals:
quantile(x, probs = c(0.05, 0.95)) # around [8.3, 11.6]
10^quantile(z, probs = c(0.05, 0.95)) # around [8.3, 11.6]
So, that converged on the same answer, which is good. However, using this method doesn't provide the exact same interval using non-normal data with "small" sample sizes:
t <- rlnorm(10)
mean(t) # around 1.46 units
10^mean(log(t, base=10)) # around 0.92 units
quantile(t, probs = c(0.05, 0.95)) # around [0.211, 4.79]
10^(quantile(log(t, base=10), probs = c(0.05, 0.95))) # around [0.209, 4.28]
Which method would be considered "more correct"? I assume one would pick the most conservative estimate?
As an example, would you report this result for the non-normal data (t) as having a mean of 0.92 units with a 95% confidence interval of [0.211, 4.79]?
END EDIT #2
Thanks for your time!
