When should I use aov() and when anova()?

I have read a lot of the online literature, but it only adds to my confusion. Much of the discussion is too technical, with terms such as unbalanced designs and type I, II or III ANOVA.
All I know is that aov() uses lm() internally and is useful for data with factors, whereas anova() can be used to compare different models fitted to the same dataset.
Is my understanding correct?

anova is substantially different from aov. Why not read R's documentation ?aov and ?anova? In short:
aov fits a model (as you are already aware, it calls lm internally), so it produces regression coefficients, fitted values, residuals, etc. It returns an object of primary class "aov" with secondary class "lm", so it is an augmented "lm" object.
anova is a generic function. In your scenario you are referring to anova.lm or anova.lmlist (read ?anova.lm for more info). The former analyses a single fitted model (produced by lm or aov), while the latter analyses several nested (increasingly large) fitted models (produced by lm or aov). Both produce a type I (sequential) ANOVA table.
In practice, you first use lm / aov to fit a model, then use anova to analyse the result. There is nothing better than trying a small example:
fit <- aov(sr ~ ., data = LifeCycleSavings) ## can also use `lm`
z <- anova(fit)
Now, have a look at their structure. aov returns a large object:
str(fit)
#List of 12
# $ coefficients : Named num [1:5] 28.566087 -0.461193 -1.691498 -0.000337 0.409695
# ..- attr(*, "names")= chr [1:5] "(Intercept)" "pop15" "pop75" "dpi" ...
# $ residuals : Named num [1:50] 0.864 0.616 2.219 -0.698 3.553 ...
# ..- attr(*, "names")= chr [1:50] "Australia" "Austria" "Belgium" "Bolivia" ...
# $ effects : Named num [1:50] -68.38 -14.29 7.3 -3.52 -7.94 ...
# ..- attr(*, "names")= chr [1:50] "(Intercept)" "pop15" "pop75" "dpi" ...
# $ rank : int 5
# $ fitted.values: Named num [1:50] 10.57 11.45 10.95 6.45 9.33 ...
# ..- attr(*, "names")= chr [1:50] "Australia" "Austria" "Belgium" "Bolivia" ...
# $ assign : int [1:5] 0 1 2 3 4
# $ qr :List of 5
# ..$ qr : num [1:50, 1:5] -7.071 0.141 0.141 0.141 0.141 ...
# .. ..- attr(*, "dimnames")=List of 2
# .. .. ..$ : chr [1:50] "Australia" "Austria" "Belgium" "Bolivia" ...
# .. .. ..$ : chr [1:5] "(Intercept)" "pop15" "pop75" "dpi" ...
# .. ..- attr(*, "assign")= int [1:5] 0 1 2 3 4
# ..$ qraux: num [1:5] 1.14 1.17 1.16 1.15 1.05
# ..$ pivot: int [1:5] 1 2 3 4 5
# ..$ tol : num 1e-07
# ..$ rank : int 5
# ..- attr(*, "class")= chr "qr"
# $ df.residual : int 45
# $ xlevels : Named list()
# $ call : language aov(formula = sr ~ ., data = LifeCycleSavings)
# $ terms :Classes 'terms', 'formula' language sr ~ pop15 + pop75 + dpi + ddpi
# .. ..- attr(*, "variables")= language list(sr, pop15, pop75, dpi, ddpi)
# .. ..- attr(*, "factors")= int [1:5, 1:4] 0 1 0 0 0 0 0 1 0 0 ...
# .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. ..$ : chr [1:5] "sr" "pop15" "pop75" "dpi" ...
# .. .. .. ..$ : chr [1:4] "pop15" "pop75" "dpi" "ddpi"
# .. ..- attr(*, "term.labels")= chr [1:4] "pop15" "pop75" "dpi" "ddpi"
# .. ..- attr(*, "order")= int [1:4] 1 1 1 1
# .. ..- attr(*, "intercept")= int 1
# .. ..- attr(*, "response")= int 1
# .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
# .. ..- attr(*, "predvars")= language list(sr, pop15, pop75, dpi, ddpi)
# .. ..- attr(*, "dataClasses")= Named chr [1:5] "numeric" "numeric" "numeric" "numeric" ...
# .. .. ..- attr(*, "names")= chr [1:5] "sr" "pop15" "pop75" "dpi" ...
# $ model :'data.frame': 50 obs. of 5 variables:
# ..$ sr : num [1:50] 11.43 12.07 13.17 5.75 12.88 ...
# ..$ pop15: num [1:50] 29.4 23.3 23.8 41.9 42.2 ...
# ..$ pop75: num [1:50] 2.87 4.41 4.43 1.67 0.83 2.85 1.34 0.67 1.06 1.14 ...
# ..$ dpi : num [1:50] 2330 1508 2108 189 728 ...
# ..$ ddpi : num [1:50] 2.87 3.93 3.82 0.22 4.56 2.43 2.67 6.51 3.08 2.8 ...
# ..- attr(*, "terms")=Classes 'terms', 'formula' language sr ~ pop15 + pop75 + dpi + ddpi
# .. .. ..- attr(*, "variables")= language list(sr, pop15, pop75, dpi, ddpi)
# .. .. ..- attr(*, "factors")= int [1:5, 1:4] 0 1 0 0 0 0 0 1 0 0 ...
# .. .. .. ..- attr(*, "dimnames")=List of 2
# .. .. .. .. ..$ : chr [1:5] "sr" "pop15" "pop75" "dpi" ...
# .. .. .. .. ..$ : chr [1:4] "pop15" "pop75" "dpi" "ddpi"
# .. .. ..- attr(*, "term.labels")= chr [1:4] "pop15" "pop75" "dpi" "ddpi"
# .. .. ..- attr(*, "order")= int [1:4] 1 1 1 1
# .. .. ..- attr(*, "intercept")= int 1
# .. .. ..- attr(*, "response")= int 1
# .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
# .. .. ..- attr(*, "predvars")= language list(sr, pop15, pop75, dpi, ddpi)
# .. .. ..- attr(*, "dataClasses")= Named chr [1:5] "numeric" "numeric" "numeric" "numeric" ...
# .. .. .. ..- attr(*, "names")= chr [1:5] "sr" "pop15" "pop75" "dpi" ...
# - attr(*, "class")= chr [1:2] "aov" "lm"
While anova returns:
str(z)
#Classes ‘anova’ and 'data.frame': 5 obs. of 5 variables:
# $ Df : int 1 1 1 1 45
# $ Sum Sq : num 204.1 53.3 12.4 63.1 650.7
# $ Mean Sq: num 204.1 53.3 12.4 63.1 14.5
# $ F value: num 14.116 3.689 0.858 4.36 NA
# $ Pr(>F) : num 0.000492 0.061125 0.359355 0.042471 NA
# - attr(*, "heading")= chr "Analysis of Variance Table\n" "Response: sr"
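As a compact check of the points above, here is a minimal sketch on the same built-in LifeCycleSavings data. The object names fit_lm, fit_aov and small are made up for illustration. It shows that an "aov" object inherits from "lm", that anova() gives the same sequential table for either fit, and that passing two nested fits compares them.

```r
# Fit the same model with lm() and with aov()
fit_lm  <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
fit_aov <- aov(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

class(fit_aov)            # "aov" "lm"
inherits(fit_aov, "lm")   # TRUE

# anova() on a single fit: sequential (type I) table, same numbers either way
tab_lm  <- anova(fit_lm)
tab_aov <- anova(fit_aov)
all.equal(tab_lm[["Sum Sq"]], tab_aov[["Sum Sq"]])

# anova() on two nested fits: F test for the terms added by the larger model
small <- lm(sr ~ pop15 + pop75, data = LifeCycleSavings)
cmp <- anova(small, fit_lm)
print(cmp)
```

So the choice is less "aov or anova" than "fit with lm or aov, then summarise with anova".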

Related

Error message in model comparison using anova() in R

I am trying to compare the two models below
H1 <- lm(y ~ x1 + x2, data = df)
H2 <- lm(y ~ x1 + x2 + x3, data = df)
anova(H1, H2)
However, I get an error message:
Error: Argument 'data' must be a data frame
And when I define the data, then I get another error message:
anova(H1, H2, data = df)
Error in .subset2(x, i) : recursive indexing failed at level 2
I tried to look at the models and they show (not sure if I am looking at the correct one, but):
H1
model list[89 x 3] (S3: data.frame) A data.frame with 89 rows and 3 columns
y double[89] 3.00 3.50 4.25 5.11 1.00 ...
x1 double[89] 19 24 31 35 20 21 ...
x2 double[89] 1 1 1 1 2 1 1 ...
str(H1)
List of 12
$ coefficients : Named num [1:3] 5.42739 0.000294 -0.950346
..- attr(*, "names")= chr [1:3] "(Intercept)" "x1" "x2"
$ residuals : Named num [1:89] -1.4844 -0.9835 -0.2326 -2.5338 0.0177 ...
..- attr(*, "names")= chr [1:89] "1" "2" "3" "4" ...
$ effects : Named num [1:89] -40.783 0.796 -3.258 -2.349 0.068 ...
..- attr(*, "names")= chr [1:89] "(Intercept)" "x1" "x2" "" ...
$ rank : int 3
$ fitted.values: Named num [1:89] 4.48 4.48 4.48 3.53 4.48 ...
..- attr(*, "names")= chr [1:89] "1" "2" "3" "4" ...
$ assign : int [1:3] 0 1 2
$ qr :List of 5
..$ qr : num [1:89, 1:3] -9.434 0.106 0.106 0.106 0.106 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:89] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:3] "(Intercept)" "x1" "x2"
.. ..- attr(*, "assign")= int [1:3] 0 1 2
..$ qraux: num [1:3] 1.11 1.02 1.03
..$ pivot: int [1:3] 1 2 3
..$ tol : num 1e-07
..$ rank : int 3
..- attr(*, "class")= chr "qr"
$ df.residual : int 86
$ xlevels : Named list()
$ call : language lm(formula = y ~ x1 + x2, data = df)
$ terms :Classes 'terms', 'formula' language y ~ x1 + x2
.. ..- attr(*, "variables")= language list(y, x1, x2)
.. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:3] "y" "x1" "x2"
.. .. .. ..$ : chr [1:2] "x1" "x2"
.. ..- attr(*, "term.labels")= chr [1:2] "x1" "x2"
.. ..- attr(*, "order")= int [1:2] 1 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(y, x1, x2)
.. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:3] "y" "x1" "x2"
$ model :'data.frame': 89 obs. of 3 variables:
..$ y : num [1:89] 3 3.5 4.25 1 4.5 5.25 4.75 3.75 3.5 5 ...
..$ x1 : num [1:89] 25 22 19 24 18 24 18 18 21 19 ...
..$ x2 : num [1:89] 1 1 1 2 1 1 1 1 1 1 ...
..- attr(*, "terms")=Classes 'terms', 'formula' language y ~ x1 + x2
.. .. ..- attr(*, "variables")= language list(y, x1, x2)
.. .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:3] "y" "x1" "x2"
.. .. .. .. ..$ : chr [1:2] "x1" "x2"
.. .. ..- attr(*, "term.labels")= chr [1:2] "x1" "x2"
.. .. ..- attr(*, "order")= int [1:2] 1 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(y, x1, x2)
.. .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:3] "y" "x1" "x2"
- attr(*, "class")= chr "lm"
H2
model list[89 x 3] (S3: data.frame) A data.frame with 89 rows and 4 columns
y double[89] 3.00 3.50 4.25 5.11 1.00 ...
x1 double[89] 19 24 31 35 20 21 ...
x2 double[89] 1 1 1 1 2 1 1 ...
x3 double[89] 0 0 0 0 1 0 0
both have xlevels list[0]
Let me know if you need more information.
I would really appreciate it if you could help me out with this!
anova() relies on the data stored in the lm objects. Are you sure they are right?
The data argument of each fit should be a data frame with columns x1, x2 and y.
If x1, x2 and y are stand-alone vectors, leave out the data argument.
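For reference, this is a minimal sketch of a nested-model comparison that runs cleanly. The data frame df is simulated here (an assumption, since the original data is not shown). The explicit stats:: prefix is only a precaution in case another attached package supplies its own anova(); nothing in the question confirms that this is the cause of the error.

```r
set.seed(1)
# Simulated stand-in for the asker's data: 89 rows of numeric columns
df <- data.frame(x1 = rnorm(89), x2 = rnorm(89), x3 = rnorm(89))
df$y <- 1 + 0.5 * df$x1 - 0.3 * df$x2 + rnorm(89)

# The object passed as `data` to lm() must really be a data frame
stopifnot(is.data.frame(df))

H1 <- lm(y ~ x1 + x2, data = df)
H2 <- lm(y ~ x1 + x2 + x3, data = df)

# Pass only the fitted models; the data already lives inside each fit
cmp <- stats::anova(H1, H2)
print(cmp)
```

If this runs but the original call does not, checking class(df) and which anova() is being found would be the next things to look at.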

R: Coeftest causes error

I am performing a Newey-West test to assess an estimator of a regression with heteroskedastic and autocorrelated residuals.
I am using the "sandwich" and "lmtest" packages.
While I can easily reproduce examples found on other sites, my own script causes the error:
Error in dimnames(cd) <- list(as.character(index(x)), colnames(x)) : 'dimnames' applied to non-array
My code:
ffregression <- lm(ex.return ~ fff$Mkt.RF + fff$SMB + fff$HML)
coeftest(ffregression,vcov=NeweyWest)
str(ffregression):
List of 12
$ coefficients : Named num [1:4] 0.00604 0.72976 0.90351 0.13548
..- attr(*, "names")= chr [1:4] "(Intercept)" "fff$Mkt.RF" "fff$SMB" "fff$HML"
$ residuals :An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34] -0.03637 0.0408 -0.00672 0.04648 -0.02275 ...
Indexed by objects of class: [yearmon] TZ:
Original class: 'double'
xts Attributes:
NULL
$ effects :An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34] -0.0606 0.1785 0.1379 0.0204 -0.0262 ...
Indexed by objects of class: [yearmon] TZ:
Original class: 'double'
xts Attributes:
NULL
$ rank : int 4
$ fitted.values:An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34] -0.000725 -0.031716 0.003868 0.051386 -0.047005 ...
Indexed by objects of class: [yearmon] TZ:
Original class: 'double'
xts Attributes:
NULL
$ assign : int [1:4] 0 1 2 3
$ qr :List of 5
..$ qr : num [1:34, 1:4] -5.831 0.171 0.171 0.171 0.171 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:34] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:4] "(Intercept)" "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. ..- attr(*, "assign")= int [1:4] 0 1 2 3
..$ qraux: num [1:4] 1.17 1.05 1.14 1.14
..$ pivot: int [1:4] 1 2 3 4
..$ tol : num 1e-07
..$ rank : int 4
..- attr(*, "class")= chr "qr"
$ df.residual : int 30
$ xlevels : Named list()
$ call : language lm(formula = ex.return ~ fff$Mkt.RF + fff$SMB + fff$HML)
$ terms :Classes 'terms', 'formula' language ex.return ~ fff$Mkt.RF + fff$SMB + fff$HML
.. ..- attr(*, "variables")= language list(ex.return, fff$Mkt.RF, fff$SMB, fff$HML)
.. ..- attr(*, "factors")= int [1:4, 1:3] 0 1 0 0 0 0 1 0 0 0 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:4] "ex.return" "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. .. .. ..$ : chr [1:3] "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. ..- attr(*, "term.labels")= chr [1:3] "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. ..- attr(*, "order")= int [1:3] 1 1 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(ex.return, fff$Mkt.RF, fff$SMB, fff$HML)
.. ..- attr(*, "dataClasses")= Named chr [1:4] "nmatrix.1" "nmatrix.1" "nmatrix.1" "nmatrix.1"
.. .. ..- attr(*, "names")= chr [1:4] "ex.return" "fff$Mkt.RF" "fff$SMB" "fff$HML"
$ model :'data.frame': 34 obs. of 4 variables:
..$ ex.return :An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34, 1] -0.03709 0.00909 -0.00285 0.09786 -0.06975 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "RF"
Indexed by objects of class: [yearmon] TZ:
Original class: 'double'
xts Attributes:
NULL
..$ fff$Mkt.RF:An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34, 1] 0.0043 -0.0019 0.0206 0.0261 -0.0204 0.0424 -0.0197 0.0252 0.0255 -0.0006 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "Mkt.RF"
Indexed by objects of class: [yearmon] TZ:
xts Attributes:
NULL
..$ fff$SMB :An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34, 1] -0.0185 -0.0419 -0.0185 0.0301 -0.0422 0.004 -0.038 0.0428 -0.0205 0.0259 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "SMB"
Indexed by objects of class: [yearmon] TZ:
xts Attributes:
NULL
..$ fff$HML :An ‘xts’ object on Mar 2014/Dec 2016 containing:
Data: num [1:34, 1] 0.0503 0.011 -0.0036 -0.0066 -0.0002 -0.0055 -0.0119 -0.0168 -0.0298 0.0212 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "HML"
Indexed by objects of class: [yearmon] TZ:
xts Attributes:
NULL
..- attr(*, "terms")=Classes 'terms', 'formula' language ex.return ~ fff$Mkt.RF + fff$SMB + fff$HML
.. .. ..- attr(*, "variables")= language list(ex.return, fff$Mkt.RF, fff$SMB, fff$HML)
.. .. ..- attr(*, "factors")= int [1:4, 1:3] 0 1 0 0 0 0 1 0 0 0 ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:4] "ex.return" "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. .. .. .. ..$ : chr [1:3] "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. .. ..- attr(*, "term.labels")= chr [1:3] "fff$Mkt.RF" "fff$SMB" "fff$HML"
.. .. ..- attr(*, "order")= int [1:3] 1 1 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(ex.return, fff$Mkt.RF, fff$SMB, fff$HML)
.. .. ..- attr(*, "dataClasses")= Named chr [1:4] "nmatrix.1" "nmatrix.1" "nmatrix.1" "nmatrix.1"
.. .. .. ..- attr(*, "names")= chr [1:4] "ex.return" "fff$Mkt.RF" "fff$SMB" "fff$HML"
- attr(*, "class")= chr "lm"
Example:
set.seed(04012017)
n<-34
correlated_residuals<-arima.sim(list(ar = .9), n)
y<-correlated_residuals
x<-1:n
plot(x,correlated_residuals)
fit<-lm(y~x)
abline(fit)
summary(fit) # standard estimates
coeftest(fit,vcov=NeweyWest(fit,verbose=T))
str(fit):
List of 12
$ coefficients : Named num [1:2] -0.179 0.148
..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
$ residuals : Named num [1:34] -0.9529 0.976 0.3025 -0.0486 -1.1214 ...
..- attr(*, "names")= chr [1:34] "1" "2" "3" "4" ...
$ effects : Named num [1:34] -14.026 8.449 0.25 -0.085 -1.142 ...
..- attr(*, "names")= chr [1:34] "(Intercept)" "x" "" "" ...
$ rank : int 2
$ fitted.values: Named num [1:34] -0.0314 0.1163 0.264 0.4116 0.5593 ...
..- attr(*, "names")= chr [1:34] "1" "2" "3" "4" ...
$ assign : int [1:2] 0 1
$ qr :List of 5
..$ qr : num [1:34, 1:2] -5.831 0.171 0.171 0.171 0.171 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:34] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:2] "(Intercept)" "x"
.. ..- attr(*, "assign")= int [1:2] 0 1
..$ qraux: num [1:2] 1.17 1.23
..$ pivot: int [1:2] 1 2
..$ tol : num 1e-07
..$ rank : int 2
..- attr(*, "class")= chr "qr"
$ df.residual : int 32
$ xlevels : Named list()
$ call : language lm(formula = y ~ x)
$ terms :Classes 'terms', 'formula' language y ~ x
.. ..- attr(*, "variables")= language list(y, x)
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "y" "x"
.. .. .. ..$ : chr "x"
.. ..- attr(*, "term.labels")= chr "x"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(y, x)
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:2] "y" "x"
$ model :'data.frame': 34 obs. of 2 variables:
..$ y: num [1:34] -0.984 1.092 0.566 0.363 -0.562 ...
..$ x: int [1:34] 1 2 3 4 5 6 7 8 9 10 ...
..- attr(*, "terms")=Classes 'terms', 'formula' language y ~ x
.. .. ..- attr(*, "variables")= language list(y, x)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "y" "x"
.. .. .. .. ..$ : chr "x"
.. .. ..- attr(*, "term.labels")= chr "x"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(y, x)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:2] "y" "x"
- attr(*, "class")= chr "lm"
R seems to be complaining that it cannot assign dimnames because the object it is working on is not an array. Try keeping all the variables in a single xts object and passing that through the data argument instead:
fff$ex.return <- ex.return
ffregression <- lm(ex.return ~ Mkt.RF + SMB + HML, data = fff)
coeftest(ffregression,vcov=NeweyWest)
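The same pattern (columns referred to by name, container passed through data) can be sketched with a plain data frame. Everything below is an assumption for illustration: the data is simulated, ff_df is a made-up name, and the Newey-West step only runs if the lmtest and sandwich packages happen to be installed.

```r
set.seed(42)
n <- 34
# Simulated stand-ins for the factor series; plain numeric columns, not xts
ff_df <- data.frame(Mkt.RF = rnorm(n, sd = 0.03),
                    SMB    = rnorm(n, sd = 0.02),
                    HML    = rnorm(n, sd = 0.02))
ff_df$ex.return <- 0.005 + 0.7 * ff_df$Mkt.RF + 0.9 * ff_df$SMB +
  rnorm(n, sd = 0.03)

# Refer to columns by name and pass the container through `data`
ffregression <- lm(ex.return ~ Mkt.RF + SMB + HML, data = ff_df)

# The residuals are a plain named numeric vector in this setup
is.numeric(residuals(ffregression))

# Newey-West standard errors, only if both packages are available
if (requireNamespace("lmtest", quietly = TRUE) &&
    requireNamespace("sandwich", quietly = TRUE)) {
  print(lmtest::coeftest(ffregression, vcov. = sandwich::NeweyWest))
}
```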

How do I access individual objects of an anova summary?

I have an object,
> str(summary(A$aov[[2]]))
List of 1
$ :Classes ‘anova’ and 'data.frame': 4 obs. of 5 variables:
..$ Df : num [1:4] 1 1 1 26
..$ Sum Sq : num [1:4] 0.00966 0.01137 0.00458 1.13068
..$ Mean Sq: num [1:4] 0.00966 0.01137 0.00458 0.04349
..$ F value: num [1:4] 0.222 0.261 0.105 NA
..$ Pr(>F) : num [1:4] 0.641 0.614 0.748 NA
- attr(*, "class")= chr [1:2] "summary.aov" "listof"
However, I can't figure out how to access "Sum Sq".
I have already tried the following:
summary(A$aov[[2]])$'Sum Sq'
summary(A$aov[[2]])[2]
summary(A$aov[[2]])[2,]
For learning purposes it would be even better if I could extract the Sum Sq values without the summary function. For example, typing
> A$aov[[2]]
Terms:
Adult_Group Sex Adult_Group:Sex Residuals
Sum of Squares 0.0096577 0.0113658 0.0045836 1.1306777
Deg. of Freedom 1 1 1 26
Residual standard error: 0.2085368
Estimated effects may be unbalanced
> str(A$aov[[2]])
List of 9
$ coefficients : Named num [1:3] -0.00757 -0.01057 -0.00746
..- attr(*, "names")= chr [1:3] "Adult_Group1" "Sex1" "Adult_Group1:Sex1"
$ residuals : Named num [1:29] 0.04596 -0.00541 0.41315 -0.24305 -0.06205 ...
..- attr(*, "names")= chr [1:29] "2" "3" "4" "5" ...
$ effects : Named num [1:29] 0.0983 -0.1066 -0.0677 -0.305 -0.0284 ...
..- attr(*, "names")= chr [1:29] "Adult_Group1" "Sex1" "Adult_Group1:Sex1" "" ...
$ rank : int 3
$ fitted.values: Named num [1:29] 4.16e-17 6.51e-17 4.51e-02 3.49e-02 -1.90e-02 ...
..- attr(*, "names")= chr [1:29] "2" "3" "4" "5" ...
$ assign : int [1:3] 1 2 4
$ qr :List of 5
..$ qr : num [1:29, 1:3] -9.30 -5.97e-17 -3.23e-01 -2.50e-01 1.36e-01 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:29] "2" "3" "4" "5" ...
.. .. ..$ : chr [1:3] "Adult_Group1" "Sex1" "Adult_Group1:Sex1"
.. ..- attr(*, "assign")= int [1:3] 1 2 4
..$ qraux: num [1:3] 1 1 1.28
..$ pivot: int [1:3] 1 2 3
..$ tol : num 1e-07
..$ rank : int 3
..- attr(*, "class")= chr "qr"
$ df.residual : int 26
$ terms :Classes 'terms', 'formula' language value ~ Adult_Group + Sex + Region + Adult_Group:Sex + Adult_Group:Region + Sex:Region + Adult_Group:Sex:Region
.. ..- attr(*, "variables")= language list(value, Adult_Group, Sex, Region)
.. ..- attr(*, "factors")= int [1:4, 1:7] 0 1 0 0 0 0 1 0 0 0 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:4] "value" "Adult_Group" "Sex" "Region"
.. .. .. ..$ : chr [1:7] "Adult_Group" "Sex" "Region" "Adult_Group:Sex" ...
.. ..- attr(*, "term.labels")= chr [1:7] "Adult_Group" "Sex" "Region" "Adult_Group:Sex" ...
.. ..- attr(*, "order")= int [1:7] 1 1 1 2 2 2 3
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: 0x000000005c411db8>
- attr(*, "class")= chr [1:2] "aov" "lm"
So, stepping back: how do I extract Sum Sq from either
summary(A$aov[[2]]) or A$aov[[2]]
It would be great to learn how to do this, since str confuses me.
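One way to get at these values, sketched on the built-in warpbreaks data (an assumption, since the object A is not available): summary() of an aov fit is a list of length one, so the table has to be pulled out with [[1]] before a column can be selected by name. The last part reproduces the same numbers from the effects component without calling summary().

```r
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# summary.aov is a list holding one data frame: select it, then the column
tab <- summary(fit)[[1]]
ss_summary <- tab[["Sum Sq"]]

# anova() returns the data frame directly
ss_anova <- anova(fit)[["Sum Sq"]]

# Without summary(): squared effects, grouped by the term each column belongs to
asgn <- fit$assign[fit$qr$pivot[seq_len(fit$rank)]]
eff  <- fit$effects[seq_along(asgn)]
ss_terms <- tapply(eff^2, asgn, sum)
ss_terms <- ss_terms[names(ss_terms) != "0"]      # drop the intercept, if any
ss_manual <- c(ss_terms, sum(residuals(fit)^2))   # append the residual SS

all.equal(unname(ss_manual), ss_summary)
```

Judging by the str() output in the question, the corresponding call there would presumably be summary(A$aov[[2]])[[1]][["Sum Sq"]].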

Create a new lm object using R?

Assuming I have an X matrix and y vector such as the following:
X=
[,1] [,2] [,3]
[1,] 83.0 234.289 235.6 ...
[2,] 88.5 259.426 232.5 ...
[3,] 88.2 258.054 368.2 ...
y=
[1] 60.323 61.122 60.171...
After carrying out a decomposition, I obtain the coefficients B and the residuals E. How can I create a new lm-like object (i.e. lmQ) that stores my results for B and E, in order to get something like this:
> lmQ(X)
$coefficients
X1 X2
B1 B2
$residuals
[1] R1 R2 R3...
Please help thanks!
Objects are just lists with a class attribute.
Create a list, assign the components and set the class, until it looks like what str() gives.
Now, lm() returns somewhat large objects so you have some work to do:
> df <- data.frame(y=runif(10), x=rnorm(10))
> fit <- lm(y ~ x, df)
> str(fit)
List of 12
$ coefficients : Named num [1:2] 0.5368 0.0314
..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
$ residuals : Named num [1:10] 0.2899 -0.3592 -0.1753 0.3187 -0.0235 ...
..- attr(*, "names")= chr [1:10] "1" "2" "3" "4" ...
$ effects : Named num [1:10] -1.734 0.104 -0.325 0.485 -0.158 ...
..- attr(*, "names")= chr [1:10] "(Intercept)" "x" "" "" ...
$ rank : int 2
$ fitted.values: Named num [1:10] 0.553 0.526 0.573 0.479 0.569 ...
..- attr(*, "names")= chr [1:10] "1" "2" "3" "4" ...
$ assign : int [1:2] 0 1
$ qr :List of 5
..$ qr : num [1:10, 1:2] -3.162 0.316 0.316 0.316 0.316 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:10] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:2] "(Intercept)" "x"
.. ..- attr(*, "assign")= int [1:2] 0 1
..$ qraux: num [1:2] 1.32 1.22
..$ pivot: int [1:2] 1 2
..$ tol : num 1e-07
..$ rank : int 2
..- attr(*, "class")= chr "qr"
$ df.residual : int 8
$ xlevels : Named list()
$ call : language lm(formula = y ~ x, data = df)
$ terms :Classes 'terms', 'formula' length 3 y ~ x
.. ..- attr(*, "variables")= language list(y, x)
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "y" "x"
.. .. .. ..$ : chr "x"
.. ..- attr(*, "term.labels")= chr "x"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(y, x)
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:2] "y" "x"
$ model :'data.frame': 10 obs. of 2 variables:
..$ y: num [1:10] 0.843 0.167 0.398 0.798 0.545 ...
..$ x: num [1:10] 0.504 -0.337 1.151 -1.835 1.011 ...
..- attr(*, "terms")=Classes 'terms', 'formula' length 3 y ~ x
.. .. ..- attr(*, "variables")= language list(y, x)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "y" "x"
.. .. .. .. ..$ : chr "x"
.. .. ..- attr(*, "term.labels")= chr "x"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(y, x)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:2] "y" "x"
- attr(*, "class")= chr "lm"
>
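A minimal sketch of that recipe follows. The data is taken from the built-in longley set as a stand-in (an assumption), the QR decomposition stands in for whatever decomposition was actually used, and lmQ with its print method is a hypothetical constructor, not an existing function. Only the two requested components are stored, under a class of its own, so that methods written for full "lm" objects are not triggered on an incomplete list.

```r
# Assumed example data: three predictors and a response from `longley`
X <- cbind("(Intercept)" = 1,
           as.matrix(longley[, c("GNP.deflator", "GNP", "Unemployed")]))
y <- longley$Employed

# The decomposition step: least squares via QR gives B and E
qrX <- qr(X)
B <- qr.coef(qrX, y)
E <- qr.resid(qrX, y)

# Hypothetical constructor: a list with a class attribute
lmQ <- function(coefficients, residuals) {
  structure(list(coefficients = coefficients, residuals = residuals),
            class = "lmQ")
}

# Print method so the object displays as $coefficients and $residuals
print.lmQ <- function(x, ...) {
  print(unclass(x))
  invisible(x)
}

fitQ <- lmQ(B, E)
fitQ
coef(fitQ)   # the default coef() method reads the $coefficients component
```

Anything beyond coef() and residuals(), such as summary() or predict(), would need the remaining components that str(fit) lists.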

Extract p-value from aov

I am looking to extract the p-value generated from an anova in R.
Here is what I am running:
test <- aov(asq[,9] ~ asq[,187])
summary(test)
Yields:
Df Sum Sq Mean Sq F value Pr(>F)
asq[, 187] 1 3.02 3.01951 12.333 0.0004599 ***
Residuals 1335 326.85 0.24483
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
12 observations deleted due to missingness
When I look at the structure, this is what I see. I can usually work through lists to get what I need, but I am having a hard time with this one. A Google search also turned up much simpler structures than the one I am getting.
NOTE: ASQ is my data frame.
str(test)
List of 13
$ coefficients : Named num [1:2] 0.2862 0.0973
..- attr(*, "names")= chr [1:2] "(Intercept)" "asq[, 187]"
$ residuals : Named num [1:1337] 0.519 0.519 -0.481 -0.481 -0.481 ...
..- attr(*, "names")= chr [1:1337] "1" "2" "3" "4" ...
$ effects : Named num [1:1337] -16.19 -1.738 -0.505 -0.505 -0.505 ...
..- attr(*, "names")= chr [1:1337] "(Intercept)" "asq[, 187]" "" "" ...
$ rank : int 2
$ fitted.values: Named num [1:1337] 0.481 0.481 0.481 0.481 0.481 ...
..- attr(*, "names")= chr [1:1337] "1" "2" "3" "4" ...
$ assign : int [1:2] 0 1
$ qr :List of 5
..$ qr : num [1:1337, 1:2] -36.565 0.0273 0.0273 0.0273 0.0273 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:1337] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:2] "(Intercept)" "asq[, 187]"
.. ..- attr(*, "assign")= int [1:2] 0 1
..$ qraux: num [1:2] 1.03 1.02
..$ pivot: int [1:2] 1 2
..$ tol : num 1e-07
..$ rank : int 2
..- attr(*, "class")= chr "qr"
$ df.residual : int 1335
$ na.action :Class 'omit' Named int [1:12] 26 257 352 458 508 624 820 874 1046 1082 ...
.. ..- attr(*, "names")= chr [1:12] "26" "257" "352" "458" ...
$ xlevels : list()
$ call : language aov(formula = asq[, 9] ~ asq[, 187])
$ terms :Classes 'terms', 'formula' length 3 asq[, 9] ~ asq[, 187]
.. ..- attr(*, "variables")= language list(asq[, 9], asq[, 187])
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "asq[, 9]" "asq[, 187]"
.. .. .. ..$ : chr "asq[, 187]"
.. ..- attr(*, "term.labels")= chr "asq[, 187]"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(asq[, 9], asq[, 187])
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:2] "asq[, 9]" "asq[, 187]"
$ model :'data.frame': 1337 obs. of 2 variables:
..$ asq[, 9] : int [1:1337] 1 1 0 0 0 1 1 1 0 0 ...
..$ asq[, 187]: int [1:1337] 2 2 2 2 2 2 2 2 2 2 ...
..- attr(*, "terms")=Classes 'terms', 'formula' length 3 asq[, 9] ~ asq[, 187]
.. .. ..- attr(*, "variables")= language list(asq[, 9], asq[, 187])
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "asq[, 9]" "asq[, 187]"
.. .. .. .. ..$ : chr "asq[, 187]"
.. .. ..- attr(*, "term.labels")= chr "asq[, 187]"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(asq[, 9], asq[, 187])
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. .. ..- attr(*, "names")= chr [1:2] "asq[, 9]" "asq[, 187]"
..- attr(*, "na.action")=Class 'omit' Named int [1:12] 26 257 352 458 508 624 820 874 1046 1082 ...
.. .. ..- attr(*, "names")= chr [1:12] "26" "257" "352" "458" ...
- attr(*, "class")= chr [1:2] "aov" "lm"
Here:
summary(test)[[1]][["Pr(>F)"]][1]
Since the suggestion above didn't work for me, this is how I managed to solve it:
sum_test = unlist(summary(test))
Then, looking at the names with
names(sum_test)
I have "Pr(>F)1" and "Pr(>F)2", of which the first is the requested value, so
sum_test["Pr(>F)1"]
will give the requested value.
I know this is old but I looked around online and didn't find an explanation or general solution and this thread is one of the first things that comes up in a Google search.
Aniko is right, the easiest way is looking in summary(test).
tests <- summary(test)
str(tests)
That gives you a list of 1 for an independent-measures aov object, but it can have multiple items with repeated measures. With repeated measures, each item in the list is named after its error term. Where a lot of new people get confused is that, in the between-subjects case, the one lone list item isn't named. They don't notice that and then don't understand why a typical selector doesn't work.
In the independent measures case something like the following works.
tests[[1]]$'Pr(>F)'
In repeated measures it's similar but you could also use named items like...
myModelSummary$'Error: subject:A'[[1]]$'Pr(>F)'
Note there I still had to do that list selection because each one of the list items in the repeated measures model is again a list of 1.
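Both cases can be sketched on built-in data sets (PlantGrowth and npk are stand-ins chosen for illustration, and the object names are made up). The first part also recomputes the p-value from the F statistic as a sanity check.

```r
# Independent measures: summary() is an unnamed list of length one
test  <- aov(weight ~ group, data = PlantGrowth)
tests <- summary(test)
p_between <- tests[[1]][["Pr(>F)"]][1]

# Same value recomputed from the F statistic and its degrees of freedom
tab <- tests[[1]]
p_check <- pf(tab[["F value"]][1], tab[["Df"]][1], tab[["Df"]][2],
              lower.tail = FALSE)
all.equal(p_between, p_check)

# Model with an Error() term: one named entry per error stratum
npk_fit <- aov(yield ~ N * P * K + Error(block), data = npk)
npk_sum <- summary(npk_fit)
names(npk_sum)

# Each stratum is again a list of one, hence the extra [[1]]
p_within <- npk_sum[["Error: Within"]][[1]][["Pr(>F)"]]
p_within
```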
Check out str(summary(test)) - that's where you see the p-value.
Somewhat shorter than BurningLeo's advice:
summary(test)[[1]][[1,"Pr(>F)"]]
summary(aov(y~factor(x)))[[1]][[5]][1]
unlist(summary(myAOV)[[2]])[[9]]
Here 2 and 9 are the positions of the p-value in the myAOV model.
