I have a data set like this:
iu       sample  obs
1.5625   s       0.312
1.5625   s       0.302
3.125    s       0.335
3.125    s       0.333
6.25     s       0.423
6.25     s       0.391
12.5     s       0.562
12.5     s       0.56
25       s       0.84
25       s       0.843
50       s       1.202
50       s       1.185
100      s       1.408
100      s       1.338
200      s       1.42
200      s       1.37
1.5625   t       0.317
1.5625   t       0.313
3.125    t       0.345
3.125    t       0.343
6.25     t       0.413
6.25     t       0.404
12.5     t       0.577
12.5     t       0.557
25       t       0.863
25       t       0.862
50       t       1.22
50       t       1.197
100      t       1.395
100      t       1.364
200      t       1.425
200      t       1.415
I want to use R to recreate the SAS code below. I believe this SAS code performs a single nonlinear fit in which three parameters are shared by the two subsets and one parameter (the C in the denominator) differs by sample.
proc nlin data=assay;
   model obs=D+(A-D)/(1+(iu/((Cs*(sample="S")
         +Ct*(sample="T"))))**(B));
   parms D=1 B=1 Cs=1 Ct=1 A=1;
run;
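(For reference, the SAS indicator trick Cs*(sample="S")+Ct*(sample="T") also translates literally into an nls formula. A sketch, assuming the data above is in a data frame csf_1 with columns iu, sample, and obs; note the data uses lowercase "s"/"t":)
nls(obs ~ d + (a - d) / (1 + (iu / (cs * (sample == "s") + ct * (sample == "t")))^b),
    data = csf_1,
    start = list(a = 0.3, b = 1.8, cs = 25, ct = 25, d = 1.4))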
So I wrote something like this and got an error:
nlm_1 <- nls(obs ~ (a - d) / (1 + (iu / c[sample]) ^ b) + d, data = csf_1, start = list(a = 0.3, b = 1.8, c = c(25, 25), d = 1.4))
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
But without the [sample] indexing, the model can be fitted:
nlm_1 <- nls(obs ~ (a - d) / (1 + (iu / c) ^ b) + d, data = csf_1, start = list(a = 0.3, b = 1.8, c = c(25), d = 1.4))
summary(nlm_1)
Formula: obs ~ (a - d)/(1 + (iu/c)^b) + d
Parameters:
Estimate Std. Error t value Pr(>|t|)
a 0.31590 0.00824 38.34 <2e-16 ***
b 1.83368 0.06962 26.34 <2e-16 ***
c 25.58422 0.55494 46.10 <2e-16 ***
d 1.44777 0.01171 123.63 <2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.02049 on 28 degrees of freedom
Number of iterations to convergence: 4
Achieved convergence tolerance: 6.721e-06
I don't get it. Could someone tell me what's wrong with my code, and how I can achieve my goal in R? Thanks!
Thanks to @akrun. After converting csf_1$sample to a factor, I finally got what I wanted.
csf_1[, 2] <- as.factor(c(rep("s", 16), rep("t", 16)))   # column 2 is "sample"
nlm_1 <- nls(obs ~ (a - d) / (1 + (iu / c[sample]) ^ b) + d, data = csf_1, start = list(a = 0.3, b = 1.8, c = c(25, 25), d = 1.4))
summary(nlm_1)
Formula: obs ~ (a - d)/(1 + (iu/c[sample])^b) + d
Parameters:
Estimate Std. Error t value Pr(>|t|)
a 0.315874 0.008102 38.99 <2e-16 ***
b 1.833303 0.068432 26.79 <2e-16 ***
c1 26.075317 0.656779 39.70 <2e-16 ***
c2 25.114050 0.632787 39.69 <2e-16 ***
d 1.447901 0.011518 125.71 <2e-16 ***
---
Signif. codes:
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.02015 on 27 degrees of freedom
Number of iterations to convergence: 4
Achieved convergence tolerance: 6.225e-06
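Why the factor conversion matters: when sample is a character vector, c[sample] indexes the length-2 parameter vector by the names "s" and "t", which it does not have, so the model evaluates to NA and nls throws the error above; when sample is a factor, c[sample] indexes by the underlying integer level (1 or 2), giving one c per group. For completeness, a self-contained sketch that rebuilds the data from the table at the top and reproduces the fit:
# sample must be a factor so that c[sample] picks c1 for "s" and c2 for "t"
iu <- rep(c(1.5625, 3.125, 6.25, 12.5, 25, 50, 100, 200), each = 2, times = 2)
obs <- c(0.312, 0.302, 0.335, 0.333, 0.423, 0.391, 0.562, 0.560,
         0.840, 0.843, 1.202, 1.185, 1.408, 1.338, 1.420, 1.370,
         0.317, 0.313, 0.345, 0.343, 0.413, 0.404, 0.577, 0.557,
         0.863, 0.862, 1.220, 1.197, 1.395, 1.364, 1.425, 1.415)
csf_1 <- data.frame(iu = iu,
                    sample = factor(rep(c("s", "t"), each = 16)),
                    obs = obs)
nlm_1 <- nls(obs ~ (a - d) / (1 + (iu / c[sample])^b) + d,
             data = csf_1,
             start = list(a = 0.3, b = 1.8, c = c(25, 25), d = 1.4))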
Related
In the case when there's a categorical variable (unordered factor) in a linear model or lmer formula, the function uses the first factor level as the 'control' group for contrasts. In my case I have a categorical variable with several levels, and I would like each level in turn to be the 'control' base group. Is there a function that automates this process and creates a nice matrix with p-values for all combinations? Here's sample code using the diamonds dataset.
library(lme4)      # lmer() is in the lme4 package (there is no "lmer" package)
library(lmerTest)
library(ggplot2)   # for the diamonds dataset
#creating unordered factor
diamonds$color=factor(sample(c('red','white','blue','green','black'),nrow(diamonds),replace=T))
#lmer formula with factor in fixed effects
mod=lmer(data=diamonds,carat~color+(1|clarity))
summary(mod,corr=F)
As shown in the summary, 'black' is used as the control, so I would like each of the other colors to be used as the control as well.
Linear mixed model fit by REML. t-tests use Satterthwaite's method [lmerModLmerTest]
Formula: carat ~ color + (1 | clarity)
   Data: diamonds

REML criterion at convergence: 64684

Scaled residuals:
    Min      1Q  Median      3Q     Max
 -2.228  -0.740  -0.224   0.540   8.471

Random effects:
 Groups   Name        Variance Std.Dev.
 clarity  (Intercept) 0.0763   0.276
 Residual             0.1939   0.440
Number of obs: 53940, groups:  clarity, 8

Fixed effects:
             Estimate Std. Error           df t value Pr(>|t|)
(Intercept)  0.786709   0.097774     7.005805    8.05 0.000087 ***
colorblue   -0.000479   0.005989 53927.996020   -0.08     0.94
colorgreen   0.007455   0.005998 53927.990722    1.24     0.21
colorred     0.000746   0.005986 53927.988909    0.12     0.90
colorwhite   0.000449   0.005971 53927.993708    0.08     0.94
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
I could imagine wanting to do this for one of two reasons. First, would be to get the predicted value of the outcome at each level of the unordered factor (controlling for everything else in the model). The other would be to calculate all of the pairwise differences across the levels of the factor. If either of these is your goal, there are better ways to do it. Let's take the first one - generating predicted outcomes for each value of the factor holding everything else constant. Let's start by using the diamonds data and using the existing color variable, but making it an unordered factor.
library(lme4)
library(lmerTest)
library(multcomp)
library(ggeffects)
#creating unordered factor
data(diamonds, package="ggplot2")
diamonds$color <- as.factor(as.character(diamonds$color))
Now, we can run the model:
#lmer formula with factor in fixed effects
mod=lmer(data=diamonds,carat~color+(1|clarity))
The function glht in the multcomp package tests pairwise differences among factor levels. Here is the output.
summary(glht(mod, linfct = mcp(color="Tukey")))
#>
#> Simultaneous Tests for General Linear Hypotheses
#>
#> Multiple Comparisons of Means: Tukey Contrasts
#>
#>
#> Fit: lmer(formula = carat ~ color + (1 | clarity), data = diamonds)
#>
#> Linear Hypotheses:
#> Estimate Std. Error z value Pr(>|z|)
#> E - D == 0 0.025497 0.006592 3.868 0.00216 **
#> F - D == 0 0.116241 0.006643 17.497 < 0.001 ***
#> G - D == 0 0.181010 0.006476 27.953 < 0.001 ***
#> H - D == 0 0.271558 0.006837 39.721 < 0.001 ***
#> I - D == 0 0.392373 0.007607 51.577 < 0.001 ***
#> J - D == 0 0.511159 0.009363 54.592 < 0.001 ***
#> F - E == 0 0.090744 0.005997 15.130 < 0.001 ***
#> G - E == 0 0.155513 0.005789 26.863 < 0.001 ***
#> H - E == 0 0.246061 0.006224 39.536 < 0.001 ***
#> I - E == 0 0.366876 0.007059 51.975 < 0.001 ***
#> J - E == 0 0.485662 0.008931 54.380 < 0.001 ***
#> G - F == 0 0.064768 0.005807 11.154 < 0.001 ***
#> H - F == 0 0.155317 0.006258 24.819 < 0.001 ***
#> I - F == 0 0.276132 0.007091 38.939 < 0.001 ***
#> J - F == 0 0.394918 0.008962 44.065 < 0.001 ***
#> H - G == 0 0.090548 0.006056 14.952 < 0.001 ***
#> I - G == 0 0.211363 0.006910 30.587 < 0.001 ***
#> J - G == 0 0.330150 0.008827 37.404 < 0.001 ***
#> I - H == 0 0.120815 0.007276 16.606 < 0.001 ***
#> J - H == 0 0.239602 0.009107 26.311 < 0.001 ***
#> J - I == 0 0.118787 0.009690 12.259 < 0.001 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> (Adjusted p values reported -- single-step method)
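If you specifically want the adjusted p-values collected in one object (the "nice matrix" from the question), they can be pulled out of the summary. A sketch using the components of multcomp's summary object:
ph <- summary(glht(mod, linfct = mcp(color = "Tukey")))
pvals <- ph$test$pvalues                      # adjusted p-values, one per contrast
names(pvals) <- names(ph$test$coefficients)   # contrast labels, e.g. "E - D"
pvals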
If you wanted all the predicted values of carat for the different values of color, you could use ggpredict() from the ggeffects package:
g <- ggpredict(mod, terms = "color")
plot(g)
Plotting the g object produces the plot, and printing it will show the values and confidence intervals.
Created on 2023-02-01 by the reprex package (v2.0.1)
I am trying to create a summary table for my Cox model. However, when I use the modelsummary function, it gives me a table that shows coef, but I want to display exp(coef) in my summary table. How can I change coef to exp(coef)?
I use this script to create a summary table:
modelsummary(model.1,
statistic='({conf.low}, {conf.high})',
stars=TRUE,
vcov = 'classical',
coef_omit = "Intercept",
coef_rename=c('ln_reb_capacity'='Relative rebel strength',
'terrcont'='Rebel territorial control', 'gdp'='Economic strength',
'bdbest'='Conflict intensity', 'roughterrain'='Rough terrain',
'loot'='Lootable resources', 'in_tpop'='Population size',
'powersharing'='Sharing Leadership'),
title = 'Table I.',
output='gt'
)
This is the summary table:
Table I.
─────────────────────────────────────────────────────────
Model 1
─────────────────────────────────────────────────────────
Relative rebel strength 0.125*
(0.016, 0.235)
Rebel territorial control -0.295+
(-0.638, 0.048)
Economic strength 0.000
(0.000, 0.000)
Conflict intensity 0.000
(0.000, 0.000)
Rough terrain 0.098
(-0.210, 0.405)
Lootable resources 0.105
(-0.298, 0.507)
Population size -0.119+
(-0.249, 0.011)
Sharing Leadership 0.046
(-0.393, 0.486)
─────────────────────────────────────────────────────────
Num.Obs. 260
AIC 1678.5
BIC 1707.0
RMSE 0.83
Std.Errors Classical
─────────────────────────────────────────────────────────
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
─────────────────────────────────────────────────────────
Column names: , Model 1
Here is my result for Cox model:
Call:
coxph(formula = Surv(month_duration, EndConflict) ~ ln_reb_capacity +
terrcont + gdp + bdbest + roughterrain + loot + in_tpop +
powersharing, data = df)
n= 260, number of events= 183
(108 observations deleted due to missingness)
coef exp(coef) se(coef) z Pr(>|z|)
ln_reb_capacity 0.125154562 1.133323609 0.055831926 2.242 0.0250 *
terrcont -0.295113621 0.744446997 0.174927860 -1.687 0.0916 .
gdp -0.000004416 0.999995584 0.000017623 -0.251 0.8021
bdbest -0.000010721 0.999989279 0.000016057 -0.668 0.5043
roughterrain 0.097602616 1.102524573 0.156809154 0.622 0.5337
loot 0.104686159 1.110362079 0.205406301 0.510 0.6103
in_tpop -0.119020975 0.887789179 0.066355450 -1.794 0.0729 .
powersharing 0.046026931 1.047102610 0.224229347 0.205 0.8374
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
exp(coef) exp(-coef) lower .95 upper .95
ln_reb_capacity 1.1333 0.8824 1.0159 1.264
terrcont 0.7444 1.3433 0.5284 1.049
gdp 1.0000 1.0000 1.0000 1.000
bdbest 1.0000 1.0000 1.0000 1.000
roughterrain 1.1025 0.9070 0.8108 1.499
loot 1.1104 0.9006 0.7424 1.661
in_tpop 0.8878 1.1264 0.7795 1.011
powersharing 1.0471 0.9550 0.6747 1.625
Concordance= 0.617 (se = 0.023 )
Likelihood ratio test= 18.96 on 8 df, p=0.02
Wald test = 18.2 on 8 df, p=0.02
Score (logrank) test = 18.36 on 8 df, p=0.02
Thanks.
You could adjust the argument exponentiate. If it's TRUE, the estimate, conf.low, and conf.high statistics are exponentiated, and the std.error is transformed to exp(estimate)*std.error (by the delta method).
modelsummary(model.1,
...,
exponentiate = TRUE
)
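As a quick sanity check, the exponentiated estimate and its delta-method standard error can be reproduced by hand from the coxph printout in the question:
exp(0.125154562)                 # 1.1333, matches exp(coef) for ln_reb_capacity
exp(0.125154562) * 0.055831926   # ~0.0633, the delta-method SE on the hazard-ratio scale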
I have done an incomplete factorial design (fractional factorial design) experiment on different fertilizer applications.
Here is the format of the data (an excerpt was shown as an image in the original post).
I want to do an ANOVA in R using the function aov. I have 450 data points in total; 'Location' has 5 levels, N has 3, and F1, F2, F3, and F4 have two each.
Here is the code that I am using:
ANOVA1<-aov(PlantWeight~Location*N*F1*F2*F3*F4, data = data)
summary(ANOVA1)
Independent variables F1, F2, F3, and F4 are not applied in a factorial manner: each sample has either one of F1, F2, F3, or F4 applied, or nothing. In the cases where no F1, F2, F3, or F4 fertiliser was applied, the value 0 has been put in every column; this is the control to which each of F1, F2, F3, and F4 will be compared. If F1 has been applied, then column F1 will read 1 and the F2, F3, and F4 columns will read NA.
When I try to run this ANOVA I get this error message:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
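(A quick way to locate the offending term: this error usually means that, after rows containing NA are dropped, at least one factor in the formula is left with fewer than 2 levels. A diagnostic sketch, assuming the columns are named as in the formula:)
cc <- complete.cases(data)   # the rows aov() actually uses
sapply(data[cc, c("Location", "N", "F1", "F2", "F3", "F4")],
       function(x) length(unique(x)))   # any count below 2 triggers the error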
Another approach I had was to put an 'x' instead of 'NA'. This has issues because it assumes 'x' is a factor level when it is not. It seemed to work fine except that it would always ignore F4.
ANOVA2<-aov(PlantWeight~((F1*Location*N)+(F2*Location*N)+(F3*Location*N)+
(F4*Location*N)), data = data)
summary(ANOVA2)
Results:
Df Sum Sq Mean Sq F value Pr(>F)
F1 2 10.3 5.13 5.742 0.00351 **
Location 6 798.6 133.11 149.027 < 2e-16 ***
N 2 579.6 289.82 324.485 < 2e-16 ***
F2 1 0.3 0.33 0.364 0.54667
F3 1 0.4 0.44 0.489 0.48466
F1:Location 10 26.5 2.65 2.962 0.00135 **
F1:N 4 6.6 1.66 1.857 0.11737
Location:N 10 113.5 11.35 12.707 < 2e-16 ***
Location:F2 5 6.5 1.30 1.461 0.20188
N:F2 2 2.7 1.37 1.537 0.21641
Location:F3 5 33.6 6.72 7.529 9.73e-07 ***
N:F3 2 2.5 1.23 1.375 0.25409
F1:Location:N 20 12.4 0.62 0.696 0.83029
F2:Location:N 10 18.9 1.89 2.113 0.02284 *
F3:Location:N 10 26.8 2.68 3.001 0.00118 **
Residuals 359 320.6 0.89
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Any help on how to approach this would be wonderful!
This question already has an answer here: How to write linearly dependent column in a matrix in terms of linearly independent columns? (1 answer). Closed 7 years ago.
I have a data frame in which I know certain columns are an exact linear function of some of the other columns, but I don't know which columns they are.
A B C D E G
1 -8453 319 3363 -16382 8290 2683
2 2269 -5687 5810 6626 5857 1283
3 8381 5725 1099 -6145 8507 1393
4 -2248 3936 5394 -10503 1803 7910
5 9579 4210 4027 4049 5235 112
6 7351 3717 2357 -1357 5458 1890
7 -8323 -9181 7914 -2417 2252 8937
8 731 -5936 5948 -4190 7621 9184
9 -7419 5345 218 -20339 7139 654
10 -9353 4583 444 -22751 6108 3151
DT <- structure(list(A = c(-6381L, 6029L, 171L, 6451L, -8843L, -4651L,
-4142L, -9292L, -5857L, 3378L), B = c(-9170L, 6601L, -4307L,
8391L, -5360L, 3783L, 4481L, 3990L, 5308L, -8744L), C = c(7899L,
1031L, 8288L, 2034L, 2146L, 2862L, 4911L, 1808L, 4351L, 287L),
D = c(4772L, -12577L, 7358L, -10506L, -15314L, -17401L, -7939L,
-29133L, -17846L, 5631L), E = c(15L, 5708L, 5272L, 5651L,
8126L, 8805L, 20L, 9129L, 3786L, 5498L), G = c(5901L, 7328L,
136L, 4949L, 5851L, 3024L, 4207L, 8530L, 7246L, 1280L)), class = "data.frame", row.names = c(NA,
-10L), .Names = c("A", "B", "C", "D", "E", "G"))
My initial reaction was to loop through the columns of DT and run an lm of each one on the remaining columns, searching for r.squared == 1, but I was wondering whether there are functions for this specific task.
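For reference, that brute-force loop is only a few lines (a sketch; it flags any column whose R^2 from regressing on all the others is numerically 1):
r2 <- sapply(names(DT), function(col) {
  fit <- lm(reformulate(setdiff(names(DT), col), response = col), data = DT)
  summary(fit)$r.squared
})
names(r2)[r2 > 1 - 1e-10]   # columns that are exact linear functions of the others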
My first guess ended up working pretty well
> output <- lm(A ~ C + D + E + G + B, data = DT)
> summary(output)
Call:
lm(formula = A ~ C + D + E + G + B, data = DT)
Residuals:
1 2 3 4 5 6 7 8
-4.80e-12 1.59e-12 3.61e-12 -2.82e-12 2.79e-12 -5.58e-12 1.49e-12 -8.34e-14
9 10
3.40e-12 4.10e-13
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.75e-13 8.62e-12 7.00e-02 0.95
C -1.00e+00 7.90e-16 -1.27e+15 <2e-16 ***
D 1.00e+00 3.94e-16 2.54e+15 <2e-16 ***
E 1.00e+00 9.46e-16 1.06e+15 <2e-16 ***
G 1.00e+00 1.17e-15 8.51e+14 <2e-16 ***
B 1.00e+00 3.85e-16 2.60e+15 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.99e-12 on 4 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 2.53e+30 on 5 and 4 DF, p-value: <2e-16
Warning message:
In summary.lm(output) : essentially perfect fit: summary may be unreliable
I would challenge your claim (or at least what I initially thought was your claim). My first tool for investigating it was Hmisc::rcorr, which calculates all of the pairwise correlation coefficients. If either column of a pair were a linear function of the other, their correlation coefficient would be ±1.0:
> rcorr(data.matrix(DT))
A B C D E G
A 1.00 0.22 -0.28 0.40 -0.05 -0.35
B 0.22 1.00 -0.32 -0.67 0.18 0.44
C -0.28 -0.32 1.00 0.49 -0.58 -0.27
D 0.40 -0.67 0.49 1.00 -0.55 -0.72
E -0.05 0.18 -0.58 -0.55 1.00 0.07
G -0.35 0.44 -0.27 -0.72 0.07 1.00
As it turns out, it requires all 6 columns to get linear dependence, since removing any one column leaves the sub-matrix at full rank:
library(Matrix)   # for rankMatrix()
sapply(1:6, function(i) rankMatrix(as.matrix(DT[-i])) )
[1] 5 5 5 5 5 5
Playing around with Roland's comment to see what the coefficients would be to get complete linear dependence:
sapply(LETTERS[1:5], function(col) round( lm(as.formula(paste0(col, " ~ .")), data = DT)$coef,4) )
A B C D E
(Intercept) 0 0 0 0 0
B 1 1 -1 1 1
C -1 1 1 -1 -1
D 1 -1 1 1 1
E 1 -1 1 -1 -1
G 1 -1 1 -1 -1
@Hugh: Be sure to cite StackOverflow in your homework assignment writeup ;-)
Here's a way of making similar matrices:
res <- replicate(5, sample((-10000):10000, 10) )
res2 <- res %*% sample(c(-1,1) , 5, repl=TRUE)
res3 <- cbind(res2, res)
And then checking a couple of them with Dason's linfinder:
> linfinder(data.matrix(res3))
[1] "Column_6 = -1*Column_1 + -1*Column_2 + -1*Column_3 + -1*Column_4 + -1*Column_5"
> res2 <- res %*% sample(c(-1,1) , 5, repl=TRUE)
> res3 <- cbind(res2, res)
> linfinder(data.matrix(res3))
[1] "Column_6 = -1*Column_1 + -0.999999999999999*Column_2 + 0.999999999999999*Column_3 + 0.999999999999999*Column_4 + 0.999999999999999*Column_5"
I'm having trouble with setting a priori contrasts and would like to ask for some help. The following code should give two orthogonal contrasts against the factor level "d".
Response <- c(1,3,2,2,2,2,2,2,4,6,5,5,5,5,5,5,4,6,5,5,5,5,5,5)
A <- factor(c(rep("c",8),rep("d",8),rep("h",8)))
contrasts(A) <- cbind("d vs h"=c(0,1,-1),"d vs c"=c(-1,1,0))
summary.lm(aov(Response~A))
What I get is:
Call:
aov(formula = Response ~ A)
Residuals:
Min 1Q Median 3Q Max
-1.000e+00 -3.136e-16 -8.281e-18 -8.281e-18 1.000e+00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.0000 0.1091 36.661 < 2e-16 ***
Ad vs h -1.0000 0.1543 -6.481 2.02e-06 ***
Ad vs c 2.0000 0.1543 12.961 1.74e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5345 on 21 degrees of freedom
Multiple R-squared: 0.8889, Adjusted R-squared: 0.8783
F-statistic: 84 on 2 and 21 DF, p-value: 9.56e-11
But I expected the Estimate of (Intercept) to be 5.00, since it should equal the mean of level d, right? The other estimates also look strange...
I know I can get the correct values more easily using relevel(A, ref="d") (where they are displayed correctly), but I am interested in learning the correct formulation to test my own hypotheses. If I run a similar example with the following code (from a website), it works as expected:
irrigation<-factor(c(rep("Control",10),rep("Irrigated 10mm",10),rep("Irrigated20mm",10)))
biomass<-1:30
contrastmatrix<-cbind("10 vs 20"=c(0,1,-1),"c vs 10"=c(-1,1,0))
contrasts(irrigation)<-contrastmatrix
summary.lm(aov(biomass~irrigation))
Call:
aov(formula = biomass ~ irrigation)
Residuals:
Min 1Q Median 3Q Max
-4.500e+00 -2.500e+00 3.608e-16 2.500e+00 4.500e+00
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.5000 0.5528 28.04 < 2e-16 ***
irrigation10 vs 20 -10.0000 0.7817 -12.79 5.67e-13 ***
irrigationc vs 10 10.0000 0.7817 12.79 5.67e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.028 on 27 degrees of freedom
Multiple R-squared: 0.8899, Adjusted R-squared: 0.8817
F-statistic: 109.1 on 2 and 27 DF, p-value: 1.162e-13
I would really appreciate some explanation for this.
Thanks, Jeremias
I think the problem is with the understanding of contrasts (see ?contrasts for details). Let me explain in detail:
If you use the default way for factor A,
A <- factor(c(rep("c",8),rep("d",8),rep("h",8)))
> contrasts(A)
d h
c 0 0
d 1 0
h 0 1
thus the model lm fits is
Mean(Response) = Intercept + beta_1 * I(d = 1) + beta_2 * I(h = 1)
summary.lm(aov(Response~A))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.000 0.189 10.6 7.1e-10 ***
Ad 3.000 0.267 11.2 2.5e-10 ***
Ah 3.000 0.267 11.2 2.5e-10 ***
So for group c, the mean is just the intercept, 2; for group d, the mean is 2 + 3 = 5; and the same for group h.
What if you use your own contrast:
contrasts(A) <- cbind("d vs h"=c(0,1,-1),"d vs c"=c(-1,1,0))
A
[1] c c c c c c c c d d d d d d d d h h h h h h h h
attr(,"contrasts")
d vs h d vs c
c 0 -1
d 1 1
h -1 0
The model you fit turns out to be
Mean(Response) = Intercept + beta_1 * (I(d = 1) - I(h = 1)) + beta_2 * (I(d = 1) - I(c = 1))
= Intercept + (beta_1 + beta_2) * I(d = 1) - beta_2 * I(c = 1) - beta_1 * I(h = 1)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.000 0.109 36.66 < 2e-16 ***
Ad vs h -1.000 0.154 -6.48 2.0e-06 ***
Ad vs c 2.000 0.154 12.96 1.7e-11 ***
So for group c, the mean is 4 - 2 = 2, for group d, the mean is 4 - 1 + 2 = 5, for group h, the mean is 4 - (-1) = 5.
==========================
Update:
The easiest way to do your contrast is to set the base (reference) level to be d.
contrasts(A) <- contr.treatment(3, base = 2)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.00e+00 1.89e-01 26.5 < 2e-16 ***
A1 -3.00e+00 2.67e-01 -11.2 2.5e-10 ***
A3 -4.86e-17 2.67e-01 0.0 1
If you want to use your contrast:
Response <- c(1,3,2,2,2,2,2,2,4,6,5,5,5,5,5,5,4,6,5,5,5,5,5,5)
A <- factor(c(rep("c",8),rep("d",8),rep("h",8)))
mat <- cbind(rep(1/3, 3), "d vs h" = c(0, 1, -1), "d vs c" = c(-1, 1, 0))   # hypothesis matrix: grand mean plus the two comparisons
mymat <- solve(t(mat))        # the coding matrix is the inverse of the hypothesis matrix
my.contrast <- mymat[, 2:3]   # drop the grand-mean column
contrasts(A) <- my.contrast
summary.lm(aov(Response~A))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.00e+00 1.09e-01 36.7 < 2e-16 ***
Ad vs h 7.69e-16 2.67e-01 0.0 1
Ad vs c 3.00e+00 2.67e-01 11.2 2.5e-10 ***
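A quick check that these estimates really are the intended comparisons of the group means:
means <- tapply(Response, A, mean)   # c = 2, d = 5, h = 5
means["d"] - means["h"]              # 0, matches "Ad vs h"
means["d"] - means["c"]              # 3, matches "Ad vs c"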
Reference: http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm