I'm extremely new to R and need your help!
I ran a factorial ANOVA and wanted to follow it with a Tukey test, but I got this error:
Error in `[.data.frame`(mf, mf.cols[[i]]) : undefined columns selected
Here is what I ran for the ANOVA (I removed the section testing for normality):
> data.aov<- aov(`FREQUENCY OF INGESTION` ~ `HYDROLOGY REGIME`*`DEPTH ZONE`*`ST. LOCATION`)
> anova(data.aov)
Analysis of Variance Table
Response: FREQUENCY OF INGESTION
                                 Df Sum Sq   Mean Sq F value  Pr(>F)
`HYDROLOGY REGIME`                1 0.0002 0.0001530  0.0218 0.88274
`DEPTH ZONE`                      3 0.0147 0.0049134  0.6990 0.55288
`ST. LOCATION`                    1 0.0202 0.0201579  2.8677 0.09085 .
`HYDROLOGY REGIME`:`DEPTH ZONE`   2 0.0229 0.0114514  1.6291 0.19691
`DEPTH ZONE`:`ST. LOCATION`       1 0.0018 0.0017877  0.2543 0.61422
Residuals                       651 4.5761 0.0070293
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> TukeyHSD(data.aov)
Error in `[.data.frame`(mf, mf.cols[[i]]) : undefined columns selected
> library(multcompView)
> multcompLetters(extract_p(TukeyHSD(aov(`FREQUENCY OF INGESTION` ~ `HYDROLOGY REGIME`*`DEPTH ZONE`*`ST. LOCATION`))))
Try the TukeyC package. Compared with other packages, it offers several conveniences for factorial, split-plot, and similar experiments. See the manual: https://cran.r-project.org/web/packages/TukeyC/TukeyC.pdf
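A possible cause of the error, not confirmed anywhere in this thread, is the spaces in the column names: the formula needs backticks around such names, and the backticked term labels may then fail to match the columns of the model frame that TukeyHSD looks up. A minimal sketch of a workaround, assuming that is the problem, is to rename the columns to syntactic names and pass the data frame through `data =`. The data frame `dat` and its values are made up here as a stand-in for the original data, and only two of the three factors are included.

```r
# Hypothetical stand-in for the original data (values are simulated)
set.seed(1)
dat <- data.frame(
  `FREQUENCY OF INGESTION` = runif(60),
  `HYDROLOGY REGIME` = rep(c("wet", "dry"), each = 30),
  `DEPTH ZONE` = rep(c("shallow", "mid", "deep"), times = 20),
  check.names = FALSE
)

# Replace spaces with dots so the names are valid R identifiers,
# e.g. "DEPTH ZONE" becomes "DEPTH.ZONE"
names(dat) <- make.names(names(dat))

# TukeyHSD compares factor levels, so make the grouping columns factors
dat$HYDROLOGY.REGIME <- factor(dat$HYDROLOGY.REGIME)
dat$DEPTH.ZONE <- factor(dat$DEPTH.ZONE)

# Fit with the data argument instead of free-standing vectors
fit <- aov(FREQUENCY.OF.INGESTION ~ HYDROLOGY.REGIME * DEPTH.ZONE, data = dat)
tuk <- TukeyHSD(fit)
print(tuk)
```

With syntactic names no backticks are needed in the formula, so the term labels and the model-frame columns agree.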
Related
I am using the lmPerm package in R and running aovp(). I have specified "Exact" permutations in my code, but the output still reads Pr(Prob). Why is this, and how do I make the output read Pr(Exact)?
It works fine for my smaller data set, but not for my larger one.
My code:
anova_model2 <- aovp(Eigen ~ as.factor(Weight), data = CentralityDataA, perm = "Exact")
summary(anova_model2)
Output:
> summary(anova_model2)
Component 1 :
                  Df R Sum Sq R Mean Sq Iter Pr(Prob)
as.factor(Weight)  2  0.17866  0.089329 5000   0.0058 **
Residuals         12  0.11938  0.009948
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I'm using 'gamlss' from the package 'gamlss' (version 5.4-1) in R for a generalized additive model for location scale and shape.
My model looks like this
propvoc3 = gamlss(proporcion.voc ~ familiaridad * proporcion)
When I want to see the Anova table I get this output
Mu link function: identity
Mu Coefficients:
                                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                              5.625e-01  9.476e-02   5.936  1.9e-06 ***
familiaridaddesconocido                 -1.094e-01  1.059e-01  -1.032  0.31042
proporcionmayor                          4.375e-01  1.340e-01   3.265  0.00281 **
proporcionmenor                          1.822e-17  1.340e-01   0.000  1.00000
familiaridaddesconocido:proporcionmayor -3.281e-01  1.708e-01  -1.921  0.06464 .
familiaridaddesconocido:proporcionmenor  5.469e-01  1.708e-01   3.201  0.00331 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
------------------------------------------------------------------
Is there a way to get the values per variable rather than for every term?
I'm using this code to run an ANOVA with Type II SS, but it throws the error: Error: $ operator is invalid for atomic vectors
library(tidyverse)
programmers <- read_table("http://tofu.byu.edu/stat230/programmers.txt")
programmers$LargeSystemExp <-
as_factor(programmers$LargeSystemExp)
programmers$YearsOfExp <-
as_factor(programmers$YearsOfExp)
prog.lm <- lm(TimePredictionError ~ LargeSystemExp + YearsOfExp + LargeSystemExp:YearsOfExp, data=programmers)
anova(prog.lm)
anova(prog.lm,type=2)
How can I run the last line of code without error?
For a Type II ANOVA, use car::Anova:
car::Anova(prog.lm, type = 2)
Anova Table (Type II tests)
Response: TimePredictionError
                          Sum Sq Df F value    Pr(>F)
LargeSystemExp             34504  1  358.59 2.469e-13 ***
YearsOfExp                 41720  2  216.79 2.540e-13 ***
LargeSystemExp:YearsOfExp  24234  2  125.93 2.614e-11 ***
Residuals                   1732 18
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
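As background on why the original call fails: stats::anova() has no type argument, and for lm fits any extra argument is treated as a further model to compare, so `type=2` is handed on as if it were a fitted model. If installing car is not an option, the Type II sum of squares for a main effect can also be obtained in base R by comparing two nested models. The sketch below uses the built-in ToothGrowth data as a stand-in, since the programmers data set may not be reachable.

```r
# Stand-in data: ToothGrowth ships with R; dose is converted to a factor
tg <- transform(ToothGrowth, dose = factor(dose))

# Type II sum of squares for supp: the drop in residual SS when supp is
# added to a model that already contains the other main effect (dose)
without_supp <- lm(len ~ dose, data = tg)
with_supp <- lm(len ~ supp + dose, data = tg)

cmp <- anova(without_supp, with_supp)
ss_supp <- cmp[2, "Sum of Sq"]
print(cmp)
```

The F statistic printed by this comparison uses the residual variance of the larger of the two models, whereas car::Anova uses the residual of the full model including the interaction, so the sums of squares agree but the F values can differ.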
I have run a quasipoisson GLM with the following code:
Output3 <- glm(GCN ~ DHSI + N + P, PondsTask2, family = quasipoisson(link = "log"))
and received this output:
Coefficients:
            Estimate Std. Error t value   Pr(>|t|)
(Intercept) -1.69713    0.56293  -3.015    0.00272 **
DHSI         3.44795    0.74749   4.613 0.00000519 ***
N           -0.59648    0.36357  -1.641    0.10157
P           -0.01964    0.37419  -0.052    0.95816
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
DHSI is statistically significant, but the other two variables are not. How do I go about dropping variables until I reach the minimal model?
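One common approach, sketched here under the assumption that backward elimination is what is wanted, is to test each term with drop1() and an F test, remove one term with update(), and repeat. Quasi families have no likelihood, so AIC-based selection with step() is not available for them. The built-in warpbreaks data stands in for PondsTask2, and the choice to drop wool is purely illustrative; whether a term should be dropped at all is a modelling decision.

```r
# Stand-in data: warpbreaks ships with R (counts of warp breaks per loom)
fit <- glm(breaks ~ wool + tension, data = warpbreaks,
           family = quasipoisson(link = "log"))

# AIC is NA for quasi families, which is why step() cannot rank these models
print(AIC(fit))

# F tests for dropping each term, one at a time, from the current model
print(drop1(fit, test = "F"))

# Remove one term (wool here, only as an illustration) and refit
reduced <- update(fit, . ~ . - wool)

# Re-test the remaining terms in the reduced model
print(drop1(reduced, test = "F"))
```

Applied to the model above, the same pattern would read `drop1(Output3, test = "F")` followed by `update(Output3, . ~ . - P)`, and so on.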
I've got a function to do ANOVA for a specific column (this code is simplified, my code does some other related things to that column too, and I do this set of calculations for different columns, so it deserves a function). alz is my dataframe.
analysis <- function(column) {
print(anova(lm(alz[[column]] ~ alz$Category)))
}
I call it e.g.:
analysis("VariableX")
And then in the output I get:
Analysis of Variance Table
Response: alz[[column]]
              Df Sum Sq Mean Sq F value    Pr(>F)
alz$Category   2  4.894 2.44684  9.3029 0.0001634 ***
Residuals    136 35.771 0.26302
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How to make the output show the column name instead of alz[[column]]?
Here is an example:
> f <- function(n) {
+ fml <- as.formula(paste(n, "~cyl"))
+ print(anova(lm(fml, data = mtcars)))
+ }
>
> f("mpg")
Analysis of Variance Table
Response: mpg
          Df Sum Sq Mean Sq F value    Pr(>F)
cyl        1 817.71  817.71  79.561 6.113e-10 ***
Residuals 30 308.33   10.28
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
analysis <- function(column) {
  afit <- anova(lm(alz[[column]] ~ alz$Category))
  # Overwrite the text after "Response:" in the heading with the column name
  attr(afit, "heading") <- sub("\\: .+$", paste0(": ", column), attr(afit, "heading"))
  print(afit)
}
The anova object carries its "Response:" value in an attribute named "heading". You would be better off using the 'data' argument to lm in the manner @kohske illustrated.
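As a variation on the formula-based approach, reformulate() builds the formula from character strings without paste(). This is only a sketch: the function name `analysis_by_name` and its `group` argument are made up for illustration, and mtcars stands in for the alz data frame.

```r
# Hypothetical helper: column and group are column names given as strings
analysis_by_name <- function(column, data, group = "cyl") {
  # Builds e.g. mpg ~ cyl, so the response is labelled by its column name
  fml <- reformulate(group, response = column)
  anova(lm(fml, data = data))
}

res <- analysis_by_name("mpg", mtcars)
print(res)
```

For the original data the call would look like `analysis_by_name("VariableX", alz, group = "Category")`.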