This may be a trivial question.
In my data, I have two groups, grp1 and grp2. In each group, some observations are assigned to the treatment condition and some to the control condition.
My question is whether the effect of the treatment on dv differs significantly between grp1 and grp2. In a sense, this is a difference in differences.
I want to estimate if the following difference is significant:
dd = mean(dv_grp1_treat-dv_grp1_control)-mean(dv_grp2_treat-dv_grp2_control)
# create data
install.packages("librarian")
librarian::shelf(librarian,tidyverse,truncnorm)
aud_tr     <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.1, sd = 1)) %>% mutate(group = "grp1_treat")
aud_notr   <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.0, sd = 1)) %>% mutate(group = "grp1_control")
noaud_tr   <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.4, sd = 1)) %>% mutate(group = "grp2_treat")
noaud_notr <- data.frame(dv = rtruncnorm(625, a = 0, b = 4, mean = 2.1, sd = 1)) %>% mutate(group = "grp2_control")
df <- bind_rows(aud_tr, aud_notr, noaud_tr, noaud_notr)
unique(df$group)
[1] "grp1_treat" "grp1_control" "grp2_treat" "grp2_control"
I know how to run t.test for the difference in means between treatment and control within each group, but how do I test whether that difference itself differs across the two groups?
t.test(df$dv[df$group=="grp1_treat"],df$dv[df$group=="grp1_control"])
t.test(df$dv[df$group=="grp2_treat"],df$dv[df$group=="grp2_control"])
It sounds like you need a two-way analysis of variance (ANOVA). Firstly, you should ensure that you separate out "group membership" and "treatment versus control" into two columns, since these are really two distinct variables:
df$treatment <- ifelse(grepl('treat', df$group), 'treat', 'control')
df$group <- ifelse(grepl('1', df$group), 'grp1', 'grp2')
Then you can carry out a two-way ANOVA using aov:
summary(aov(dv ~ group + treatment, data = df))
#> Df Sum Sq Mean Sq F value Pr(>F)
#> group 1 1.18 1.175 1.362 0.245
#> treatment 1 26.14 26.145 30.307 1.14e-07 ***
#> Residuals 197 169.95 0.863
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This tells you that, in this sample, the effect of treatment was significant, but the effect of group membership was not.
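Since the question asks specifically about the difference in differences, note that this corresponds to the group-by-treatment interaction rather than to the two main effects; a minimal sketch using the same data frame (output not shown):
# the group:treatment term tests whether the treatment effect differs between grp1 and grp2
summary(aov(dv ~ group * treatment, data = df))
# equivalently, the interaction coefficient in lm() estimates the difference in differences
# (up to the sign determined by the factor reference levels)
summary(lm(dv ~ group * treatment, data = df))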
Data
Obviously, we don't have your data since it wasn't supplied in the question, but the following sample data frame has the same names and structure as your own:
set.seed(1)
df <- data.frame(dv = c(rnorm(50, 3.2), rnorm(50, 3.8),
rnorm(50, 3.5), rnorm(50, 4.1)),
group = rep(c('grp1_control', 'grp1_treat',
'grp2_control', 'grp2_treat'), each = 50))
Related
I have a question about how to compare coefficients in a multivariate regression in R.
I conducted a survey in which I measured three different attitudes (scale variables). My goal is to estimate whether some characteristics of the respondents (age, gender, education and ideological position) can explain their (positive/negative) attitudes.
I was advised to conduct a multivariate multiple regression instead of three univariate multiple regressions. The code for my multivariate model is:
MMR <- lm(cbind(Attitude_1, Attitude_2, Attitude_3) ~
Age + Gender + Education + Ideological_position,
data = survey)
summary(MMR)
What I am trying to do next is to test whether the coefficients of, say, 'Gender' differ significantly across the three individual models.
I found a very clear instruction on how to do this in Stata (https://stats.idre.ucla.edu/stata/dae/multivariate-regression-analysis/), but I don't have a license, so I have to find an alternative in R. I know a similar question has been asked here before (R - Testing equivalence of coefficients in multivariate multiple regression), but the answer was that there is no package (or function) in R that can be used for this purpose. Because that answer was provided a few years back, I was wondering whether some new packages or functions have been implemented in the meantime.
More precisely, I was wondering whether I can use the linearHypothesis() function (https://www.rdocumentation.org/packages/car/versions/3.0-11/topics/linearHypothesis)? I already know that this function allows me to test, for instance, whether the coefficient of Gender equals the coefficient of Education:
linearHypothesis(MMR, "GenderFemale = EducationHigh-educated")
Can I also use this function to test whether the coefficient of Gender in the equation modelling Attitude_1 equals the coefficient of Gender in the equation modelling Attitude_2 or Attitude_3?
Any help would be greatly appreciated!
Since the model presented in the question is not reproducible (the input is missing), let us use this model instead:
fm0 <- lm(cbind(cyl, mpg) ~ wt + hp, mtcars)
We will discuss two approaches, using as our linear hypothesis that the intercepts of the cyl and mpg equations are the same, that the wt slopes are the same, and that the hp slopes are the same.
1) Mean/Variance
In this approach we base the entire comparison only on the coefficients and their variance covariance matrix.
library(car)
v <- vcov(fm0)
co <- setNames(c(coef(fm0)), rownames(v))
h1 <- c("cyl:(Intercept) = mpg:(Intercept)", "cyl:wt = mpg:wt", "cyl:hp = mpg:hp")
linearHypothesis(NULL, h1, coef. = co, vcov. = v)
giving:
Linear hypothesis test
Hypothesis:
cyl:(Intercept) - mpg:(Intercept) = 0
cyl:wt - mpg:wt = 0
cyl:hp - mpg:hp = 0
Model 1: restricted model
Model 2: structure(list(), class = "formula", .Environment = <environment>)
Note: Coefficient covariance matrix supplied.
Df Chisq Pr(>Chisq)
1
2 3 878.53 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
To explain what linearHypothesis is doing, note that in this case the hypothesis matrix is L <- t(c(1, -1)) %x% diag(3). Given v, as a large-sample approximation, L %*% co is distributed as N(0, L %*% v %*% t(L)) under the null hypothesis; hence t(L %*% co) %*% solve(L %*% v %*% t(L)) %*% L %*% co is distributed as chi-squared with nrow(L) degrees of freedom.
L <- t(c(1, -1)) %x% diag(3)  # Kronecker product, as above
nrow(L) # degrees of freedom
SSH <- t(L %*% co) %*% solve(L %*% v %*% t(L)) %*% L %*% co # chisq
p <- pchisq(SSH, nrow(L), lower.tail = FALSE) # p value
2) Long form model
With this approach (which is not equivalent to the first one shown above) we convert mtcars from wide form to long form, mt2. We show how to do that using reshape or pivot_longer at the end, but for now we will just form it explicitly. Define lhs as the 32x2 matrix on the left-hand side of the fm0 formula, i.e. cbind(cyl, mpg); note that its column names are c("cyl", "mpg"). Stringing out lhs column by column into a vector of length 64 (the cyl column followed by the mpg column) gives us our new dependent variable y. We also form a grouping variable g., the same length as y, which indicates which column of lhs the corresponding element of y came from.
With mt2 defined we can form fm1. In forming fm1 we will use a weight vector w based on the fm0 sigma values, to reflect the fact that the two groups, cyl and mpg, have different residual standard deviations, given by the vector sigma(fm0).
We show below that the fm0 and fm1 models have the same coefficients and then run linearHypothesis.
library(car)
lhs <- fm0$model[[1]]
g. <- colnames(lhs)[col(lhs)]
y <- c(lhs)
mt2 <- with(mtcars, data.frame(wt, hp, g., y))
w <- 1 / sigma(fm0)[g.]^2
fm1 <- lm(y ~ g./(wt + hp) + 0, mt2, weights = w)
# note coefficient names
variable.names(fm1)
## [1] "g.cyl" "g.mpg" "g.cyl:wt" "g.mpg:wt" "g.cyl:hp" "g.mpg:hp"
# check that fm0 and fm1 have same coefs
all.equal(c(t(coef(fm0))), coef(fm1), check.attributes = FALSE)
## [1] TRUE
h2 <- c("g.mpg = g.cyl", "g.mpg:wt = g.cyl:wt", "g.mpg:hp = g.cyl:hp")
linearHypothesis(fm1, h2)
giving:
Linear hypothesis test
Hypothesis:
- g.cyl + g.mpg = 0
- g.cyl:wt + g.mpg:wt = 0
- g.cyl:hp + g.mpg:hp = 0
Model 1: restricted model
Model 2: y ~ g./(wt + hp) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 61 1095.8
2 58 58.0 3 1037.8 345.95 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
If L is the hypothesis matrix (the same as L in (1) except that the columns are reordered), q is its number of rows, and n is the number of rows of mt2, then SSH/q is distributed F(q, n-q-1), so we have:
n <- nrow(mt2)
L <- diag(3) %x% t(c(1, -1)) # note difference from (1)
q <- nrow(L)
SSH <- t(L %*% coef(fm1)) %*% solve(L %*% vcov(fm1) %*% t(L)) %*% L %*% coef(fm1)
SSH/q # F value
pf(SSH/q, q, n-q-1, lower.tail = FALSE) # p value
anova
An alternative to linearHypothesis is to define the reduced model and then compare the two models using anova. mt2 and w are from above. No packages are used.
fm2 <- lm(y ~ hp + wt, mt2, weights = w)
anova(fm2, fm1)
giving:
Analysis of Variance Table
Model 1: y ~ hp + wt
Model 2: y ~ g./(wt + hp) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 61 1095.8
2 58 58.0 3 1037.8 345.95 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Alternate wide to long calculation
An alternate way to form mt2 is by reshaping mtcars from wide form to long form using reshape.
mt2a <- mtcars |>
reshape(dir = "long", varying = list(colnames(lhs)), v.names = "y",
timevar = "g.", times = colnames(lhs)) |>
subset(select = c("wt", "hp", "g.", "y"))
or using the tidyverse (which gives the rows in a different order, but that should not matter as long as mt2b is used consistently in forming fm1 and w):
library(dplyr)
library(tidyr)
mt2b <- mtcars %>%
select(mpg, cyl, wt, hp) %>%
pivot_longer(all_of(colnames(lhs)), names_to = "g.", values_to = "y")
df <- data.frame (rating1 = c(1,5,2,4,5),
rating2 = c(2,1,2,4,2),
rating3 = c(0,2,1,2,0),
race = c("black", "asian", "white","black","white"),
gender = c("male","female","female","male","female")
)
I'd like to conduct a t-test comparing each group mean (e.g. the mean of Asians in rating1) with the overall mean of that rating (e.g. rating1). Below is my code for Asians in rating1.
asian_df <- df %>%
filter(race == "asian")
t.test(asian_df$rating1, mu = mean(df$rating1))
Then for Blacks in rating 2, I'd run
black_df <- df %>%
filter(race == "black")
t.test(black_df$rating2, mu = mean(df$rating2))
How can I write a function that automates the t-test for each group? So far I have to manually change the variable name to essentially run for each race, each gender and on each rating (rating 1 to rating 3). Thanks!
Performing multiple t-tests increases your risk of Type I error and you will need to adjust for multiple comparisons in order for your results to be valid/meaningful. You can run the t-tests by looping through your variables, e.g.
library(tidyverse)
df <- data.frame (rating1 = c(5,8,7,8,9,6,9,7,8,5,8,5),
rating2 = c(2,7,8,4,9,3,6,1,7,3,9,1),
rating3 = c(0,6,1,2,7,2,9,1,6,2,3,1),
race = c("asian", "asian", "asian","black","asian","black","white","black","white","black","white","black"),
gender = c("male","female","female","male","female","male","female","male","female","male","female","male")
)
for (rac in unique(df$race)){
tmp_df <- df %>%
filter(race == rac)
print(rac)
print(t.test(tmp_df$rating1,
rep(mean(df$rating1),
length(tmp_df$rating1))))
}
[1] "asian"
Welch Two Sample t-test
data: tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = 0.19518, df = 3, p-value = 0.8577
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.550864 2.884198
sample estimates:
mean of x mean of y
7.250000 7.083333
[1] "black"
Welch Two Sample t-test
data: tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = -1.5149, df = 4, p-value = 0.2044
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.5022651 0.7355985
sample estimates:
mean of x mean of y
6.200000 7.083333
[1] "white"
Welch Two Sample t-test
data: tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = 3.75, df = 2, p-value = 0.06433
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.1842176 2.6842176
sample estimates:
mean of x mean of y
8.333333 7.083333
for (gend in unique(df$gender)){
tmp_df <- df %>%
filter(gender == gend)
print(gend)
print(t.test(tmp_df$rating1,
rep(mean(df$rating1),
length(tmp_df$rating1))))
}
[1] "male"
Welch Two Sample t-test
data: tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = -2.0979, df = 5, p-value = 0.09
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.4107761 0.2441094
sample estimates:
mean of x mean of y
6.000000 7.083333
[1] "female"
Welch Two Sample t-test
data: tmp_df$rating1 and rep(mean(df$rating1), length(tmp_df$rating1))
t = 3.5251, df = 5, p-value = 0.01683
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.2933469 1.8733198
sample estimates:
mean of x mean of y
8.166667 7.083333
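To automate this fully (each rating crossed with each grouping variable, as asked in the question), the same t.test call can be wrapped in a small helper function; a sketch along the lines of the loops above, using the column names from the example data. Note that this multiplies the number of tests, which matters for the multiple-comparison point made next.
# run the group-vs-overall t-test for every rating column and every level of each grouping variable
run_tests <- function(data, ratings = c("rating1", "rating2", "rating3"),
                      groups = c("race", "gender")) {
  results <- list()
  for (grp in groups) {
    for (rat in ratings) {
      for (lev in unique(data[[grp]])) {
        sub <- data[data[[grp]] == lev, ]
        results[[paste(grp, lev, rat, sep = "_")]] <-
          t.test(sub[[rat]], rep(mean(data[[rat]]), nrow(sub)))
      }
    }
  }
  results
}
all_tests <- run_tests(df)
sapply(all_tests, function(tt) tt$p.value)  # extract the p-values for inspection/adjustment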
Due to multiple testing (in this example, 5 t-tests) your chance of a false positive is 1 - (1 - 0.05)^5 = 22.62% <- very high. To account for this, you can apply the Bonferroni correction, which basically takes your required p-value (in this case, p < 0.05) and divides it by the number of tests (i.e. the new p-value required to reject the null is p < 0.01). When you apply this correction, even the 'best' t-test result (gender; p-value = 0.01683) is not statistically significant.
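For example, base R's p.adjust() can apply the Bonferroni correction (or other methods) to the p-values collected above; a small sketch using the five p-values printed in this answer:
# Bonferroni-adjusted p-values: each raw p-value is multiplied by the number of tests (capped at 1)
pvals <- c(asian = 0.8577, black = 0.2044, white = 0.06433, male = 0.09, female = 0.01683)
p.adjust(pvals, method = "bonferroni")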
An alternative approach would be to compare means across all conditions using ANOVA, then use Tukey's HSD to determine which groups differ. Tukey's HSD is a single post-hoc test, so you don't need to account for multiple testing, and your results are valid. Adapting this approach to your problem might be a better way to go, e.g.:
anova_one_way <- aov(rating1 + rating2 + rating3 ~ race + gender, data = df)
summary(anova_one_way)
Df Sum Sq Mean Sq F value Pr(>F)
race 2 266.70 133.35 14.01 0.00243 **
gender 1 140.08 140.08 14.72 0.00497 **
Residuals 8 76.13 9.52
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(anova_one_way)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = rating1 + rating2 + rating3 ~ race + gender, data = df)
$race
diff lwr upr p adj
black-asian -7.050000 -12.963253 -1.136747 0.0224905
white-asian 4.416667 -2.315868 11.149201 0.2076254
white-black 11.466667 5.029132 17.904201 0.0023910
$gender
diff lwr upr p adj
male-female -3.416667 -7.523829 0.6904958 0.0913521
I have some data about trends over time in drug use across the state. I want to know whether there have been changes in the gender difference in intravenous drug use versus gender differences in all recreational drug use over time.
My data is below. I think I might need to use time-series analysis, but I'm not sure. Any help would be much appreciated.
[Screenshot of the data table omitted; the iv and illicit values for 2011-2016 are reproduced in the answer below.]
Since the description in the question does not match the data (there is no information on gender), we will assume from the subject line that we want to determine whether the trends of illicit and iv are the same.
Comparing Trends
Note that there is no autocorrelation in the detrended values of iv or illicit so we will use ordinary linear models.
iv <- c(0.4, 0.3, 0.4, 0.3, 0.2, 0.2)
illicit <- c(5.5, 5.7, 4.8, 4.7, 6.1, 5.3)
time <- 2011:2016
ar(resid(lm(iv ~ time)))
## Call:
## ar(x = resid(lm(iv ~ time)))
##
## Order selected 0 sigma^2 estimated as 0.0024
ar(resid(lm(illicit ~ time)))
## Call:
## ar(x = resid(lm(illicit ~ time)))
##
## Order selected 0 sigma^2 estimated as 0.287
Create a 12x3 data frame, long, with columns time, values and ind (iv or illicit). Then run a linear model with two slopes and another with one slope; both have two intercepts. Then compare them using anova. Evidently they are not significantly different, so we cannot reject the hypothesis that the slopes are the same.
wide <- data.frame(iv, illicit)
long <- cbind(time, stack(wide))
fm2 <- lm(values ~ ind/(time + 1) + 0, long)
fm1 <- lm(values ~ ind + time + 0, long)
anova(fm1, fm2)
giving:
Analysis of Variance Table
Model 1: values ~ ind + time + 0
Model 2: values ~ ind/(time + 1) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 9 1.4629
2 8 1.4469 1 0.016071 0.0889 0.7732
Comparing model with slopes to one without slopes
Actually the slopes are not significant in the first place and we cannot reject the hypothesis that both the slopes are zero. Compare to a two intercept model with no slopes.
fm0 <- lm(values ~ ind + 0, long)
anova(fm0, fm2)
giving:
Analysis of Variance Table
Model 1: values ~ ind + 0
Model 2: values ~ ind/(time + 1) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 10 1.4750
2 8 1.4469 2 0.028143 0.0778 0.9258
Or, running a stepwise regression, we find that the favored model is one with two intercepts and no slopes:
step(fm2)
giving:
Start: AIC=-17.39
values ~ ind/(time + 1) + 0
Df Sum of Sq RSS AIC
- ind:time 2 0.028143 1.4750 -21.155
<none> 1.4469 -17.386
Step: AIC=-21.15
values ~ ind - 1
Df Sum of Sq RSS AIC
<none> 1.475 -21.155
- ind 2 172.28 173.750 32.073
Call:
lm(formula = values ~ ind - 1, data = long)
Coefficients:
indiv indillicit
0.30 5.35
Log-transformed values
If we use log(values) then we similarly find no autocorrelation (not shown), but we do find that the slopes of the log-transformed values are significantly different.
fm2log <- lm(log(values) ~ ind/(time + 1) + 0, long)
fm1log <- lm(log(values) ~ ind + time + 0, long)
anova(fm1log, fm2log)
giving:
Analysis of Variance Table
Model 1: log(values) ~ ind + time + 0
Model 2: log(values) ~ ind/(time + 1) + 0
Res.Df RSS Df Sum of Sq F Pr(>F)
1 9 0.35898
2 8 0.18275 1 0.17622 7.7141 0.02402 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Normally, from aov() you can get the residuals after using the summary() function on it.
But how can I get the residuals when I use a repeated-measures ANOVA and the formula is different?
## as a test, not particularly sensible statistically
npk.aovE <- aov(yield ~ N*P*K + Error(block), npk)
npk.aovE
summary(npk.aovE)
Error: block
Df Sum Sq Mean Sq F value Pr(>F)
N:P:K 1 37.0 37.00 0.483 0.525
Residuals 4 306.3 76.57
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
N 1 189.28 189.28 12.259 0.00437 **
P 1 8.40 8.40 0.544 0.47490
K 1 95.20 95.20 6.166 0.02880 *
N:P 1 21.28 21.28 1.378 0.26317
N:K 1 33.14 33.14 2.146 0.16865
P:K 1 0.48 0.48 0.031 0.86275
Residuals 12 185.29 15.44
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Intuitively, summary(npk.aovE)$residuals returns NULL.
Can anyone help me with this?
Look at the output of
> names(npk.aovE)
and try
> npk.aovE$residuals
EDIT: I apologize I read your example way too quickly. What I suggested is not possible with multilevel models with aov(). Try the following:
> npk.pr <- proj(npk.aovE)
> npk.pr[[3]][, "Residuals"]
Here's a simpler reproducible example anyone can mess around with if they run into the same issue:
x1 <- gl(8, 4)
block <- gl(2, 16)
y <- as.numeric(x1) + rnorm(length(x1))
d <- data.frame(block, x1, y)
m <- aov(y ~ x1 + Error(block), d)
m.pr <- proj(m)
m.pr[[3]][, "Residuals"]
The other option is with lme:
require(MASS) ## for oats data set
require(nlme) ## for lme()
require(multcomp) ## for multiple comparison stuff
Aov.mod <- aov(Y ~ N * V + Error(B/V), data = oats)
aov.out.pr <- proj(Aov.mod)   # project the fit to get residuals by error stratum
the_residuals <- aov.out.pr[[3]][, "Residuals"]
Lme.mod <- lme(Y ~ N * V, random = ~1 | B/V, data = oats)
the_residuals <- residuals(Lme.mod)
The original example came without the interaction (Lme.mod <- lme(Y ~ N + V, random = ~1 | B/V, data = oats)), but it seems to be working with the interaction included (and producing different results, so it is doing something).
And that's it...
...but for completeness:
1 - The summaries of the model
summary(Aov.mod)
anova(Lme.mod)
2 - The Tukey test with repeated-measures ANOVA (3 hours looking for this!!). It does raise a warning when there is an interaction (* instead of +), but it seems to be safe to ignore it. Notice that V and N are factors inside the formula.
summary(Lme.mod)
summary(glht(Lme.mod, linfct=mcp(V="Tukey")))
summary(glht(Lme.mod, linfct=mcp(N="Tukey")))
3 - The normality and homoscedasticity plots
par(mfrow=c(1,2)) #add room for the rotated labels
aov.out.pr <- proj(Aov.mod)
#oats$resi <- aov.out.pr[[3]][, "Residuals"]
oats$resi <- residuals(Lme.mod)
qqnorm(oats$resi, main="Normal Q-Q") # A quantile normal plot - good for checking normality
qqline(oats$resi)
boxplot(resi ~ interaction(N,V), main="Homoscedasticity",
xlab = "Code Categories", ylab = "Residuals", border = "white",
data=oats)
points(resi ~ interaction(N,V), pch = 1,
main="Homoscedasticity", data=oats)
By default, the lm summary tests whether the slope coefficient equals zero. My question is very basic: I want to know how to test whether the slope coefficient equals a non-zero value. One approach could be to use confint, but this does not provide a p-value. I also wonder how to do a one-sided test with lm.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
summary(lm.D9)
Call:
lm(formula = weight ~ group)
Residuals:
Min 1Q Median 3Q Max
-1.0710 -0.4938 0.0685 0.2462 1.3690
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.0320 0.2202 22.850 9.55e-15 ***
groupTrt -0.3710 0.3114 -1.191 0.249
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6964 on 18 degrees of freedom
Multiple R-squared: 0.07308, Adjusted R-squared: 0.02158
F-statistic: 1.419 on 1 and 18 DF, p-value: 0.249
confint(lm.D9)
2.5 % 97.5 %
(Intercept) 4.56934 5.4946602
groupTrt -1.02530 0.2833003
Thanks for your time and effort.
As @power says, you can do it by hand.
Here is an example:
> est <- summary.lm(lm.D9)$coef[2, 1]
> se <- summary.lm(lm.D9)$coef[2, 2]
> df <- summary.lm(lm.D9)$df[2]
>
> m <- 0
> 2 * pt(abs((est - m)/se), df, lower.tail = FALSE)
[1] 0.2490232
>
> m <- 0.2
> 2 * pt(abs((est - m)/se), df, lower.tail = FALSE)
[1] 0.08332659
and you can do a one-sided test by dropping the 2 * and choosing the appropriate tail via lower.tail, as shown in the update below.
UPDATES
Here is an example of the two-sided and one-sided probabilities:
> m <- 0.2
>
> # two-sided probability
> 2 * pt(abs((est - m)/se), df, lower.tail = FALSE)
[1] 0.08332659
>
> # one-sided, upper (i.e., greater than 0.2)
> pt((est-m)/se, df, lower.tail = FALSE)
[1] 0.9583367
>
> # one-sided, lower (i.e., less than 0.2)
> pt((est-m)/se, df, lower.tail = TRUE)
[1] 0.0416633
Note that the sum of the upper and lower probabilities is exactly 1.
Use the linearHypothesis function from the car package. For instance, you can check whether the coefficient of groupTrt equals -1 using:
linearHypothesis(lm.D9, "groupTrt = -1")
Linear hypothesis test
Hypothesis:
groupTrt = - 1
Model 1: restricted model
Model 2: weight ~ group
Res.Df RSS Df Sum of Sq F Pr(>F)
1 19 10.7075
2 18 8.7292 1 1.9782 4.0791 0.05856 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The smatr package has a slope.test() function with which you can use OLS.
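For completeness, a heavily hedged sketch of what that might look like here, assuming slope.test() accepts test.value and method = "OLS" arguments as described in the smatr documentation, and recoding group as a 0/1 numeric predictor so that the "slope" is the group difference:
library(smatr)
# test whether the OLS slope (i.e. the Ctl vs Trt difference) equals -1
slope.test(weight, as.numeric(group == "Trt"), test.value = -1, method = "OLS")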
In addition to all the other good answers, you could use an offset. It's a little trickier with categorical predictors, because you need to know the coding.
lm(weight~group+offset(1*(group=="Trt")))
The 1* here is unnecessary but is put in to emphasize that you are testing against the hypothesis that the difference is 1 (if you want to test against a hypothesis of a difference of d, then use d*(group=="Trt")).
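For instance, to mirror the linearHypothesis example above that tests groupTrt = -1, a sketch of the offset version would be the following; the t-test of groupTrt reported by summary() is then a test of "difference = -1":
summary(lm(weight ~ group + offset(-1 * (group == "Trt"))))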
You can use t.test to do this for your data. The mu parameter sets the hypothesized difference of group means, and the alternative parameter lets you choose between one- and two-sided tests.
t.test(weight~group,var.equal=TRUE)
Two Sample t-test
data: weight by group
t = 1.1913, df = 18, p-value = 0.249
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.2833003 1.0253003
sample estimates:
mean in group Ctl mean in group Trt
5.032 4.661
t.test(weight~group,var.equal=TRUE,mu=-1)
Two Sample t-test
data: weight by group
t = 4.4022, df = 18, p-value = 0.0003438
alternative hypothesis: true difference in means is not equal to -1
95 percent confidence interval:
-0.2833003 1.0253003
sample estimates:
mean in group Ctl mean in group Trt
5.032 4.661
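Since the alternative argument is mentioned above but not demonstrated, a one-sided version of the same call might look like this (a sketch; output omitted):
# one-sided test of H0: difference = -1 against the alternative that the difference is greater than -1
t.test(weight ~ group, var.equal = TRUE, mu = -1, alternative = "greater")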
Code up your own test. You know the estimated coefficient and you know the standard error, so you can construct your own test statistic.
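A minimal sketch of such a hand-rolled test, wrapping the same arithmetic used earlier in this thread (the function name and arguments are illustrative, not from any package):
# hypothetical helper: t-test of a single lm coefficient against an arbitrary value
test_coef <- function(model, coef_name, value = 0,
                      alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  est <- coef(summary(model))[coef_name, "Estimate"]
  se  <- coef(summary(model))[coef_name, "Std. Error"]
  df  <- df.residual(model)
  tstat <- (est - value) / se
  p <- switch(alternative,
              two.sided = 2 * pt(abs(tstat), df, lower.tail = FALSE),
              less      = pt(tstat, df),
              greater   = pt(tstat, df, lower.tail = FALSE))
  c(estimate = est, t = tstat, df = df, p.value = p)
}
test_coef(lm.D9, "groupTrt", value = -1)  # should agree with the linearHypothesis result above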