I'm using this code to run an ANOVA with type II sums of squares, but the last line throws "Error: $ operator is invalid for atomic vectors":
library(tidyverse)
programmers <- read_table("http://tofu.byu.edu/stat230/programmers.txt")
programmers$LargeSystemExp <- as_factor(programmers$LargeSystemExp)
programmers$YearsOfExp <- as_factor(programmers$YearsOfExp)
prog.lm <- lm(TimePredictionError ~ LargeSystemExp + YearsOfExp + LargeSystemExp:YearsOfExp, data=programmers)
anova(prog.lm)
anova(prog.lm,type=2)
How can I run the last line of code without error?
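stats::anova() has no type argument, so type=2 is treated as a second model to compare, and that is what triggers the error. For a type II ANOVA, use car::Anova() instead: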
car::Anova(prog.lm, type = 2)
Anova Table (Type II tests)
Response: TimePredictionError
                          Sum Sq Df F value    Pr(>F)
LargeSystemExp             34504  1  358.59 2.469e-13 ***
YearsOfExp                 41720  2  216.79 2.540e-13 ***
LargeSystemExp:YearsOfExp  24234  2  125.93 2.614e-11 ***
Residuals                   1732 18
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
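In case the data URL above is unreachable, here is a minimal sketch of the same fix on the built-in warpbreaks data (a stand-in chosen for illustration, not the asker's data). It assumes the car package may or may not be installed, so that call is guarded.

```r
# Built-in stand-in data: two factors plus their interaction, as in the question.
fit <- lm(breaks ~ wool * tension, data = warpbreaks)

# stats::anova() has no `type` argument, so this call fails;
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(anova(fit, type = 2), error = function(e) conditionMessage(e))
print(res)

# Type II tests via car::Anova(), only if car is available.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit, type = 2))
}
```

Calling car::Anova() with the namespace prefix avoids attaching the whole package just for one table.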
Related
I am using the lmPerm package in R and running aovp(). I have specified "Exact" permutations in my code, but the output still reads Pr(Prob). Why is this, and how do I fix it so that the output reads Pr(Exact)?
It works fine for my smaller data set, but not for my larger one.
My code:
anova_model2 <- aovp(Eigen ~ as.factor(Weight), data = CentralityDataA, perm = "Exact")
summary(anova_model2)
Output:
> summary(anova_model2)
Component 1 :
                  Df R Sum Sq R Mean Sq Iter Pr(Prob)
as.factor(Weight)  2  0.17866  0.089329 5000   0.0058 **
Residuals         12  0.11938  0.009948
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I'm using gamlss() from the package 'gamlss' (version 5.4-1) in R to fit a generalized additive model for location, scale and shape.
My model looks like this:
propvoc3 = gamlss(proporcion.voc ~ familiaridad * proporcion)
When I ask for the ANOVA table, I get this output:
Mu link function: identity
Mu Coefficients:
                                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                              5.625e-01  9.476e-02   5.936  1.9e-06 ***
familiaridaddesconocido                 -1.094e-01  1.059e-01  -1.032  0.31042
proporcionmayor                          4.375e-01  1.340e-01   3.265  0.00281 **
proporcionmenor                          1.822e-17  1.340e-01   0.000  1.00000
familiaridaddesconocido:proporcionmayor -3.281e-01  1.708e-01  -1.921  0.06464 .
familiaridaddesconocido:proporcionmenor  5.469e-01  1.708e-01   3.201  0.00331 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
------------------------------------------------------------------
Is there a way to get the values per variable rather than for every term?
I'm extremely new to R and need your help!
I performed a factorial ANOVA and wanted to follow it with a Tukey test, but I got this error:
Error in `[.data.frame`(mf, mf.cols[[i]]) : undefined columns selected
Here is what I did for the ANOVA (I removed the section that tests for normality):
> data.aov<- aov(`FREQUENCY OF INGESTION` ~ `HYDROLOGY REGIME`*`DEPTH ZONE`*`ST. LOCATION`)
> anova(data.aov)
Analysis of Variance Table
Response: FREQUENCY OF INGESTION
                                 Df Sum Sq   Mean Sq F value  Pr(>F)
`HYDROLOGY REGIME`                1 0.0002 0.0001530  0.0218 0.88274
`DEPTH ZONE`                      3 0.0147 0.0049134  0.6990 0.55288
`ST. LOCATION`                    1 0.0202 0.0201579  2.8677 0.09085 .
`HYDROLOGY REGIME`:`DEPTH ZONE`   2 0.0229 0.0114514  1.6291 0.19691
`DEPTH ZONE`:`ST. LOCATION`       1 0.0018 0.0017877  0.2543 0.61422
Residuals                       651 4.5761 0.0070293
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> TukeyHSD(data.aov)
Error in `[.data.frame`(mf, mf.cols[[i]]) : undefined columns selected
> library(multcompView)
> multcompLetters(extract_p(TukeyHSD(aov(`FREQUENCY OF INGESTION` ~ `HYDROLOGY REGIME`*`DEPTH ZONE`*`ST. LOCATION`))))
Try using the TukeyC package. Compared to other packages, it offers several conveniences for factorial, split-plot and similar experiments. See the manual: https://cran.r-project.org/web/packages/TukeyC/TukeyC.pdf
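Before switching packages, it may also be worth checking whether base TukeyHSD() works once the columns have syntactic names, are stored as factors, and are passed through the data argument. The sketch below uses a made-up data frame (dat and its values are hypothetical stand-ins, not the asker's data) and assumes the non-syntactic column names are what trips up TukeyHSD(); the question does not confirm that cause.

```r
set.seed(1)

# Hypothetical stand-in data with the same kind of column names as in the question.
dat <- data.frame(
  `FREQUENCY OF INGESTION` = runif(60),
  `HYDROLOGY REGIME` = rep(c("wet", "dry"), each = 30),
  `DEPTH ZONE` = rep(c("shallow", "mid", "deep"), times = 20),
  check.names = FALSE
)

# Replace spaces with dots so the names no longer need backticks.
names(dat) <- make.names(names(dat))

# TukeyHSD() only compares levels of factors.
dat$HYDROLOGY.REGIME <- factor(dat$HYDROLOGY.REGIME)
dat$DEPTH.ZONE <- factor(dat$DEPTH.ZONE)

# Pass the data frame explicitly instead of relying on free-standing vectors.
fit <- aov(FREQUENCY.OF.INGESTION ~ HYDROLOGY.REGIME * DEPTH.ZONE, data = dat)
tuk <- TukeyHSD(fit)
print(tuk)
```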
I ran a quasipoisson GLM with the following code:
Output3 <- glm(GCN ~ DHSI + N + P, PondsTask2, family = quasipoisson(link = "log"))
and received this output:
Coefficients:
            Estimate Std. Error t value   Pr(>|t|)
(Intercept) -1.69713    0.56293  -3.015    0.00272 **
DHSI         3.44795    0.74749   4.613 0.00000519 ***
N           -0.59648    0.36357  -1.641    0.10157
P           -0.01964    0.37419  -0.052    0.95816
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
DHSI is statistically significant, but the other two variables are not. How do I go about dropping variables until I reach the minimal model?
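One common approach is backward elimination with drop1(), sketched here on the built-in warpbreaks data rather than PondsTask2 (an assumption, since that data is not shown). Quasi families have no AIC, so the sketch uses F tests, drops the least significant term with update(), and checks the reduced model again. In practice you would repeat this until every remaining term is worth keeping.

```r
# Built-in stand-in data; the model mirrors the structure of the one in the question.
fit <- glm(breaks ~ wool + tension, data = warpbreaks,
           family = quasipoisson(link = "log"))

# F tests for removing each term in turn (AIC is undefined for quasi families).
d1 <- drop1(fit, test = "F")
print(d1)

# The p-values sit in the last column; the first row is the full model ("<none>").
pvals <- d1[[ncol(d1)]][-1]
worst <- rownames(d1)[-1][which.max(pvals)]

# Refit without the least significant term and test the remaining ones.
fit2 <- update(fit, as.formula(paste(". ~ . -", worst)))
print(drop1(fit2, test = "F"))
```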
I've got a function that runs an ANOVA for a specific column. (This code is simplified: my real code does some other things with that column too, and I repeat these calculations for different columns, so it deserves a function.) alz is my data frame.
analysis <- function(column) {
print(anova(lm(alz[[column]] ~ alz$Category)))
}
I call it e.g.:
analysis("VariableX")
And then in the output I get:
Analysis of Variance Table
Response: alz[[column]]
              Df Sum Sq Mean Sq F value    Pr(>F)
alz$Category   2  4.894 2.44684  9.3029 0.0001634 ***
Residuals    136 35.771 0.26302
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How can I make the output show the column name instead of alz[[column]]?
Here is an example that builds the formula from the column name and passes the data frame via the data argument:
> f <- function(n) {
+ fml <- as.formula(paste(n, "~cyl"))
+ print(anova(lm(fml, data = mtcars)))
+ }
>
> f("mpg")
Analysis of Variance Table
Response: mpg
          Df Sum Sq Mean Sq F value    Pr(>F)
cyl        1 817.71  817.71  79.561 6.113e-10 ***
Residuals 30 308.33   10.28
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
analysis <- function(column) {
afit <- anova(lm( alz[[column]] ~ alz$Category))
# paste0() avoids the extra space that paste() would insert after the colon
attr(afit, "heading") <- sub("\\: .+$", paste0(": ", column), attr(afit, "heading"))
print(afit)
}
The anova object carries its "Response:" value in an attribute named "heading", which the code above overwrites. You would be better advised to use the data argument to lm() in the manner @kohske illustrated.
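Putting the two answers together, a version of analysis() that uses the data argument could look like the sketch below. Since alz is not shown, iris stands in for it here, with Species copied into a Category column (both are assumptions for illustration only).

```r
# Stand-in for `alz` (assumption): iris with Species playing the role of Category.
alz <- transform(iris, Category = Species)

analysis <- function(column, data = alz) {
  # Build e.g. Sepal.Length ~ Category from the column name.
  fml <- reformulate("Category", response = column)
  fit <- anova(lm(fml, data = data))
  print(fit)
  invisible(fit)  # return the table so callers can reuse it
}

tab <- analysis("Sepal.Length")
```

Because the formula now contains the real column name, the "Response:" line and the term label come out clean without editing the heading attribute.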