I have the following object M, from which I need to extract the fstatistic. It is the result of calling summaryC on a model fitted with aovp; both functions are from the package lmPerm. I have tried the usual approaches for extracting values from ordinary linear models, as well as attr, extract and getElement, but without success.
Could anybody give me a hint?
> str(M)
List of 2
$ Error: vegetation: NULL
$ Error: Within :List of 11
..$ NA : NULL
..$ terms :Classes 'terms', 'formula' length 3 Temp ~ depth
.. .. ..- attr(*, "variables")= language list(Temp, depth)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "Temp" "depth"
.. .. .. .. ..$ : chr "depth"
.. .. ..- attr(*, "term.labels")= chr "depth"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ residuals : Named num [1:498] -46.9 -43.9 -46.9 -38.9 -41.9 ...
.. ..- attr(*, "names")= chr [1:498] "3" "4" "5" "6" ...
..$ coefficients : num [1:4, 1:4] -2.00 -1.00 -1.35e-14 1.00 2.59 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4] "depth1" "depth2" "depth3" "depth4"
.. .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
..$ aliased : Named logi [1:4] FALSE FALSE FALSE FALSE
.. ..- attr(*, "names")= chr [1:4] "depth1" "depth2" "depth3" "depth4"
..$ sigma : num 29
..$ df : int [1:3] 4 494 4
..$ r.squared : num 0.00239
..$ adj.r.squared: num -0.00367
..$ **fstatistic** : Named num [1:3] 0.395 3 494
.. ..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
..$ cov.unscaled : num [1:4, 1:4] 0.008 -0.002 -0.002 -0.002 -0.002 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4] "depth1" "depth2" "depth3" "depth4"
.. .. ..$ : chr [1:4] "depth1" "depth2" "depth3" "depth4"
..- attr(*, "class")= chr "summary.lmp"
- attr(*, "class")= chr "listof"
Here is a reproducible example to play with:
library(lmPerm)  # provides aovp() and summaryC()
Temp <- 1:100
depth <- rep(c("1", "2", "3", "4", "5"), 100)
vegetation <- rep(c("1", "2"), 50)
df <- data.frame(Temp, depth, vegetation)
M <- summaryC(aovp(Temp ~ depth + Error(vegetation), df, perm = ""))
As the str output from your example shows, M is a list of two elements: the first is NULL and the second is a list that contains what you want. Hence list extraction via [[ does the trick:
> M[[2]][["fstatistic"]]
value numdf dendf
0.3946 3.0000 494.0000
If this is not what you want, please comment.
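The same element can also be reached by name rather than by position. The following is a minimal sketch, assuming the two elements are named "Error: vegetation" and "Error: Within" as the str output suggests; M_mock is a hypothetical stand-in built by hand so the snippet runs without lmPerm, with values copied from the output above.

```r
# Hand-built stand-in for M (hypothetical); the real object would come from
# summaryC(aovp(...)). Only two of the eleven inner elements are mimicked.
M_mock <- list(
  "Error: vegetation" = NULL,
  "Error: Within" = list(
    r.squared  = 0.00239,
    fstatistic = c(value = 0.3946, numdf = 3, dendf = 494)
  )
)

# Extraction by position, as in the answer
f_by_pos <- M_mock[[2]][["fstatistic"]]

# Extraction by stratum name (assumed names, taken from the str output)
f_by_name <- M_mock[["Error: Within"]][["fstatistic"]]

# Just the F value as a plain, unnamed number
f_value <- unname(f_by_name["value"])
print(f_value)
```

Indexing by name may be less fragile than [[2]] if the number of error strata changes, but that depends on the names lmPerm actually assigns.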
Related
I am a biochemist working with R as a non-professional and have run into a problem. I have a data frame and I want to compare my different treatment groups and the positive control with a medium control. The statistical test I want to use is an ANOVA followed by Dunnett's test. I used the multcomp and DescTools packages for this, and I get there with this code:
Particle <- factor(c("Medium", "PosCon", "Trt1", "Trt2", "Trt3", "Medium", "PosCon", "Trt1", "Trt2", "Trt3", "Medium", "PosCon", "Trt1", "Trt2", "Trt3"))
Values <- c(1.0, 263.0, 3.1, 1.2, 0.9, 1.0, 244.0, 2.4, 1.6, 1.1, 1.0, 255.0, 3.8, 2.0, 0.8)
myDataframe <- data.frame(Particle, Values)
str(myDataframe)
a1 <- aov(Values ~ Particle, data= myDataframe)
summary(a1)
#Output
# Df Sum Sq Mean Sq F value Pr(>F)
#Particle 4 152832 38208 2084 1.48e-14 ***
#Residuals 10 183 18
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
library(multcomp)  # provides glht() and mcp()
myDataframe.dunnett <- glht(a1, linfct = mcp(Particle= "Dunnett"))
myDataframe.dunnett
summary(myDataframe.dunnett)
# Output:
# Simultaneous Tests for General Linear Hypotheses
#
#Multiple Comparisons of Means: Dunnett Contrasts
#
#
#Fit: aov(formula = Values ~ Particle, data = myDataframe)
#
#Linear Hypotheses:
# Estimate Std. Error t value Pr(>|t|)
#PosCon - Medium == 0 253.00000 3.49616 72.365 <0.001 ***
#Trt1 - Medium == 0 2.10000 3.49616 0.601 0.930
#Trt2 - Medium == 0 0.60000 3.49616 0.172 0.999
#Trt3 - Medium == 0 -0.06667 3.49616 -0.019 1.000
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#(Adjusted p values reported -- single-step method)
Now I want to extract the p-values (the Pr(>|t|) column) as numbers with four digits (three would also work). I used str(summary(myDataframe.dunnett)) and names(summary(myDataframe.dunnett)) to find out what to extract, but when I extract it, the values are printed without any decimal digits, like this:
str(summary(myDataframe.dunnett))
names(summary(myDataframe.dunnett)) #just to get to know the names
x <- summary(myDataframe.dunnett)$test$pvalues
x
[1] 0 1 1 1
attr(,"error")
[1] 0.0001612462
Does anyone know what is going on here, or know a better way to extract the significance levels after an ANOVA and Dunnett's test into a vector? I need them in order to convert them into the significance stars above a plot.
I have the feeling that this might help, but I could not figure out how to modify it for my data:
Thanks for your help!
If you look at the structure of the summary object, you can see that the values you are hoping to extract are in a list element named test:
str( summary(myDataframe.dunnett) )
List of 10
$ model :List of 13
..$ coefficients : Named num [1:5] 1 253 2.1 0.6 -0.0667
.. ..- attr(*, "names")= chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
..$ residuals : Named num [1:15] -1.48e-15 9.00 1.92e-15 -4.00e-01 -3.33e-02 ...
.. ..- attr(*, "names")= chr [1:15] "1" "2" "3" "4" ...
..$ effects : Named num [1:15] -201.8857 -390.926 -2.8833 -0.8957 0.0816 ...
.. ..- attr(*, "names")= chr [1:15] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
..$ rank : int 5
..$ fitted.values: Named num [1:15] 1 254 3.1 1.6 0.933 ...
.. ..- attr(*, "names")= chr [1:15] "1" "2" "3" "4" ...
..$ assign : int [1:5] 0 1 1 1 1
..$ qr :List of 5
.. ..$ qr : num [1:15, 1:5] -3.873 0.258 0.258 0.258 0.258 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:15] "1" "2" "3" "4" ...
.. .. .. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
.. .. ..- attr(*, "assign")= int [1:5] 0 1 1 1 1
.. .. ..- attr(*, "contrasts")=List of 1
.. .. .. ..$ Particle: chr "contr.treatment"
.. ..$ qraux: num [1:5] 1.26 1.54 1.54 1.53 1.52
.. ..$ pivot: int [1:5] 1 2 3 4 5
.. ..$ tol : num 1e-07
.. ..$ rank : int 5
.. ..- attr(*, "class")= chr "qr"
..$ df.residual : int 10
..$ contrasts :List of 1
.. ..$ Particle: chr "contr.treatment"
..$ xlevels :List of 1
.. ..$ Particle: chr [1:5] "Medium" "PosCon" "Trt1" "Trt2" ...
..$ call : language aov(formula = Values ~ Particle, data = myDataframe)
..$ terms :Classes 'terms', 'formula' language Values ~ Particle
.. .. ..- attr(*, "variables")= language list(Values, Particle)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "Values" "Particle"
.. .. .. .. ..$ : chr "Particle"
.. .. ..- attr(*, "term.labels")= chr "Particle"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. ..- attr(*, "predvars")= language list(Values, Particle)
.. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
.. .. .. ..- attr(*, "names")= chr [1:2] "Values" "Particle"
..$ model :'data.frame': 15 obs. of 2 variables:
.. ..$ Values : num [1:15] 1 263 3.1 1.2 0.9 1 244 2.4 1.6 1.1 ...
.. ..$ Particle: Factor w/ 5 levels "Medium","PosCon",..: 1 2 3 4 5 1 2 3 4 5 ...
.. ..- attr(*, "terms")=Classes 'terms', 'formula' language Values ~ Particle
.. .. .. ..- attr(*, "variables")= language list(Values, Particle)
.. .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. .. ..$ : chr [1:2] "Values" "Particle"
.. .. .. .. .. ..$ : chr "Particle"
.. .. .. ..- attr(*, "term.labels")= chr "Particle"
.. .. .. ..- attr(*, "order")= int 1
.. .. .. ..- attr(*, "intercept")= int 1
.. .. .. ..- attr(*, "response")= int 1
.. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. .. .. ..- attr(*, "predvars")= language list(Values, Particle)
.. .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "factor"
.. .. .. .. ..- attr(*, "names")= chr [1:2] "Values" "Particle"
..- attr(*, "class")= chr [1:2] "aov" "lm"
$ linfct : num [1:4, 1:5] 0 0 0 0 1 0 0 0 0 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : Named chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
.. .. ..- attr(*, "names")= chr [1:4] "Particle1" "Particle2" "Particle3" "Particle4"
.. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
..- attr(*, "type")= chr "Dunnett"
$ rhs : num [1:4] 0 0 0 0
$ coef : Named num [1:5] 1 253 2.1 0.6 -0.0667
..- attr(*, "names")= chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
$ vcov : num [1:5, 1:5] 6.11 -6.11 -6.11 -6.11 -6.11 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
.. ..$ : chr [1:5] "(Intercept)" "ParticlePosCon" "ParticleTrt1" "ParticleTrt2" ...
$ df : int 10
$ alternative: chr "two.sided"
$ type : chr "Dunnett"
$ focus : chr "Particle"
$ test :List of 7
..$ pfunction :function (type = c("univariate", "adjusted", p.adjust.methods), ...)
..$ qfunction :function (conf.level, adjusted = TRUE, ...)
..$ coefficients: Named num [1:4] 253 2.1 0.6 -0.0667
.. ..- attr(*, "names")= chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
..$ sigma : Named num [1:4] 3.5 3.5 3.5 3.5
.. ..- attr(*, "names")= chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
..$ tstat : Named num [1:4] 72.3652 0.6007 0.1716 -0.0191
.. ..- attr(*, "names")= chr [1:4] "PosCon - Medium" "Trt1 - Medium" "Trt2 - Medium" "Trt3 - Medium"
..$ pvalues : num [1:4] 0 0.93 0.999 1
.. ..- attr(*, "error")= num 0.000305
..$ type : chr "single-step"
..- attr(*, "class")= chr "mtest"
- attr(*, "class")= chr [1:2] "summary.glht" "glht"
... and that the test list is a rather complicated list in its own right, ...
> ( summary(myDataframe.dunnett)$test )
$pfunction
function (type = c("univariate", "adjusted", p.adjust.methods),
...)
{
type <- match.arg(type)
pfct <- function(q) {
switch(object$alternative, two.sided = {
low <- rep(-abs(q), dim)
upp <- rep(abs(q), dim)
}, less = {
low <- rep(q, dim)
upp <- rep(Inf, dim)
}, greater = {
low <- rep(-Inf, dim)
upp <- rep(q, dim)
})
pmvt(lower = low, upper = upp, df = df, corr = cr, ...)
}
switch(object$alternative, two.sided = {
if (df > 0) pvals <- 2 * (1 - pt(abs(tstat), df)) else pvals <- 2 *
(1 - pnorm(abs(tstat)))
}, less = {
if (df > 0) pvals <- pt(tstat, df) else pvals <- pnorm(tstat)
}, greater = {
if (df > 0) pvals <- 1 - pt(tstat, df) else pvals <- 1 -
pnorm(tstat)
})
if (type == "univariate")
return(pvals)
if (type == "adjusted") {
ret <- numeric(length(tstat))
error <- 0
for (i in 1:length(tstat)) {
tmp <- pfct(tstat[i])
if (attr(tmp, "msg") != "Normal Completion" && length(grep("^univariate",
attr(tmp, "msg"))) == 0)
warning(attr(tmp, "msg"))
if (error < attr(tmp, "error"))
error <- attr(tmp, "error")
ret[i] <- tmp
}
ret <- 1 - ret
attr(ret, "error") <- error
return(ret)
}
return(p.adjust(pvals, method = type))
}
<bytecode: 0x55c5419b3f18>
<environment: 0x55c540a7c100>
$qfunction
function (conf.level, adjusted = TRUE, ...)
{
tail <- switch(object$alternative, two.sided = "both.tails",
less = "lower.tail", greater = "upper.tail")
if (adjusted) {
calpha <- qmvt(conf.level, df = df, corr = cr, tail = tail,
...)
}
else {
calpha <- qmvt(conf.level, df = df, corr = matrix(1),
tail = tail, ...)
}
ret <- calpha$quantile
attr(ret, "error") <- calpha$estim.prec
return(ret)
}
<bytecode: 0x55c5419b9d20>
<environment: 0x55c540a7c100>
$coefficients
PosCon - Medium Trt1 - Medium Trt2 - Medium Trt3 - Medium
253.00000000 2.10000000 0.60000000 -0.06666667
$sigma
PosCon - Medium Trt1 - Medium Trt2 - Medium Trt3 - Medium
3.496157 3.496157 3.496157 3.496157
$tstat
PosCon - Medium Trt1 - Medium Trt2 - Medium Trt3 - Medium
72.36517911 0.60065959 0.17161703 -0.01906856
$pvalues
[1] 0.0000000 0.9298994 0.9992779 0.9999999
attr(,"error")
[1] 0.0001864546
$type
[1] "single-step"
attr(,"class")
[1] "mtest"
... and that the p-values are in a sublist of that list in an element named pvalues:
> ( summary(myDataframe.dunnett)$test$pvalues )
[1] 0.0000000 0.9299389 0.9992772 0.9999999
attr(,"error")
[1] 0.0001414288
... or with the DescTools function DunnettTest:
(z <- DunnettTest(formula(a1$call)))
## Dunnett's test for comparing several treatments with a control :
## 95% family-wise confidence level
##
## $Medium
## diff lwr.ci upr.ci pval
## PosCon-Medium 253.00000000 242.883587 263.11641 <2e-16 ***
## Trt1-Medium 2.10000000 -8.016413 12.21641 0.9299
## Trt2-Medium 0.60000000 -9.516413 10.71641 0.9993
## Trt3-Medium -0.06666667 -10.183079 10.04975 1.0000
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
z$Medium[,"pval"]
## PosCon-Medium Trt1-Medium Trt2-Medium Trt3-Medium
## 0.0000000 0.9298875 0.9992763 0.9999999
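Once the p-values are in a plain numeric vector, fixed-digit labels and significance stars can be derived in base R. This is a sketch only: the vector p below is typed in from the output above rather than taken from a live glht object, and the cut points follow the "Signif. codes" legend printed by summary().

```r
# Adjusted p-values copied from the printed output above. With a live object,
# as.numeric(summary(myDataframe.dunnett)$test$pvalues) would also drop the
# "error" attribute.
p <- c(0.0000000, 0.9299389, 0.9992772, 0.9999999)
names(p) <- c("PosCon - Medium", "Trt1 - Medium",
              "Trt2 - Medium", "Trt3 - Medium")

# Fixed four-decimal strings, e.g. for plot annotations
p_txt <- formatC(p, format = "f", digits = 4)

# Map each p-value to the usual significance code
stars <- as.character(cut(
  p,
  breaks = c(-Inf, 0.001, 0.01, 0.05, 0.1, Inf),
  labels = c("***", "**", "*", ".", " ")
))
names(stars) <- names(p)

print(p_txt)
print(stars)
```

Note that the adjusted p-values come from a numerical integration, so the last digits can differ slightly between runs, as the three printouts above already show.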
I'm relatively new to R and I'm running network and behavior coevolution models using the R Package RSiena.
My data set consists of around 100 networks and for each of these networks, I run one RSiena model.
ans.1 <- siena07(myalgorithm, data=mydata.1, effects=myeff.1, batch=TRUE)
...
ans.100 <- siena07(myalgorithm, data=mydata.100, effects=myeff.100, batch=TRUE)
Now I want to test the goodness of fit for each of the multiple network models. I actually know how to check the goodness of fit for a single model.
gof <- sienaGOF(ans.1, verbose=TRUE, varName="Friend", IndegreeDistribution)
plot(gof)
But I don't know how to combine the GOF results of all 100 models to get an overall impression. How can I get a table with the model number and the p-values? Or can I plot the results for all models within one plot? Or is there a better way?
So far I tried to put the GOF results in a list:
goftest <-list()
goftest[[1]] <- sienaGOF(ans.1, verbose=TRUE, varName="Friend", IndegreeDistribution)
...
goftest[[100]] <- sienaGOF(ans.100, verbose=TRUE, varName="Friend", IndegreeDistribution)
plot(goftest)
goftest[[1]] #Output:
"Siena Goodness of Fit ( IndegreeDistribution ), all periods
=====
Monte Carlo Mahalanobis distance test p-value: 0.941
-----
One tailed test used (i.e. estimated probability of greater distance than observation).
-----
Calculated joint MHD = ( 14.4 ) for current model."
str(goftest[[1]])#Output:
"List of 1
$ Joint:List of 8
..$ p : num 0.941
..$ SimulatedTestStat: Named num [1:2000] 9.97 16.02 6.83 10.14 8.65 ...
.. ..- attr(*, "names")= chr [1:2000] "1" "2" "3" "4" ...
..$ ObservedTestStat : num 2.09
..$ TwoTailed : logi FALSE
..$ Simulations : int [1:2000, 1:9] 21 22 22 21 19 26 30 23 25 26 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2000] "1" "2" "3" "4" ...
.. .. ..$ : NULL
..$ Observations : int [1, 1:9] 26 48 63 73 76 78 78 78 78
..$ InvCovSimStats : num [1:9, 1:9] 13.2509 4.9587 1.2948 0.231 0.0895 ...
..$ Rank : int 9
.. ..- attr(*, "method")= chr "tolNorm2"
.. ..- attr(*, "useGrad")= logi FALSE
.. ..- attr(*, "tol")= num 2e-15
..- attr(*, "class")= chr "sienaGofTest"
..- attr(*, "sienaFitName")= chr "sienaFitObject"
..- attr(*, "auxiliaryStatisticName")= chr "IndegreeDistribution"
..- attr(*, "key")= chr [1:9] "0" "1" "2" "3" ...
- attr(*, "class")= chr "sienaGOF"
- attr(*, "scoreTest")= logi FALSE
- attr(*, "originalMahalanobisDistances")= num [1:3] 2.15 3.51 8.74
- attr(*, "oneStepMahalanobisDistances")=List of 3
..$ : Named num(0)
.. ..- attr(*, "names")= chr(0)
..$ : Named num(0)
.. ..- attr(*, "names")= chr(0)
..$ : Named num(0)
.. ..- attr(*, "names")= chr(0)
- attr(*, "joinedOneStepMahalanobisDistances")= Named num(0)
..- attr(*, "names")= chr(0)
- attr(*, "oneStepMahalanobisDistances_old")=List of 3
..$ : Named num(0)
.. ..- attr(*, "names")= chr(0)
..$ : Named num(0)
.. ..- attr(*, "names")= chr(0)
..$ : Named num(0)
.. ..- attr(*, "names")= chr(0)
- attr(*, "joinedOneStepMahalanobisDistances_old")= Named num(0)
..- attr(*, "names")= chr(0)
- attr(*, "oneStepSpecs")= num[1:20, 0 ]
- attr(*, "auxiliaryStatisticName")= chr "IndegreeDistribution"
- attr(*, "simTime")= 'proc_time' Named num [1:5] 39.61 0.28 40.21 NA NA
..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
- attr(*, "twoTailed")= logi FALSE
- attr(*, "joined")= logi TRUE"
But I don't know how to extract the p-values into a table that contains just the network number and the associated p-value.
Furthermore, the plot command only produces error messages and no output so far.
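Going by the str output above, each result stores its p-value at $Joint$p, so a table can be assembled by looping over the list. The sketch below uses hand-built stand-ins instead of real sienaGOF objects so that it runs without RSiena; the second and third p-values are made-up placeholders, not results.

```r
# Hypothetical stand-ins: each mimics only the $Joint$p slot of a sienaGOF
# result. 0.941 is from the output above; the other two values are invented.
goftest <- list(
  list(Joint = list(p = 0.941)),
  list(Joint = list(p = 0.120)),
  list(Joint = list(p = 0.550))
)

# One p-value per model; vapply() fails loudly if an element is malformed
p_values <- vapply(goftest, function(g) g$Joint$p, numeric(1))

# Table of model number and associated p-value
gof_table <- data.frame(model = seq_along(goftest), p = p_values)
print(gof_table)
```

plot(goftest) most likely fails because goftest is a plain list and not a sienaGOF object; plotting each element in turn, for example for (g in goftest) plot(g), may be closer to what is intended. If the fits exist as ans.1 to ans.100, mget(paste0("ans.", 1:100)) collects them into a list that can be looped over in the same way.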
It seems that, in R, I can refer to a variable with part of a variable name. But I am confused about why I can do that.
Use the following code as an example:
library(car)
scatterplot(housing ~ total)
house.lm <- lm(housing ~ total)
summary(house.lm)
str(summary(house.lm))
summary(house.lm)$coefficients[2,2]
summary(house.lm)$coe[2,2]
When I print the structure of summary(house.lm), I got the following output:
> str(summary(house.lm))
List of 11
$ call : language lm(formula = housing ~ total)
$ terms :Classes 'terms', 'formula' language housing ~ total
.. ..- attr(*, "variables")= language list(housing, total)
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "housing" "total"
.. .. .. ..$ : chr "total"
.. ..- attr(*, "term.labels")= chr "total"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(housing, total)
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
.. .. ..- attr(*, "names")= chr [1:2] "housing" "total"
$ residuals : Named num [1:162] -8.96 -11.43 3.08 8.45 2.2 ...
..- attr(*, "names")= chr [1:162] "1" "2" "3" "4" ...
$ coefficients : num [1:2, 1:4] 28.4523 0.0488 10.2117 0.0103 2.7862 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:2] "(Intercept)" "total"
.. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
$ aliased : Named logi [1:2] FALSE FALSE
..- attr(*, "names")= chr [1:2] "(Intercept)" "total"
$ sigma : num 53.8
$ df : int [1:3] 2 160 2
$ r.squared : num 0.123
$ adj.r.squared: num 0.118
$ fstatistic : Named num [1:3] 22.5 1 160
..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
$ cov.unscaled : num [1:2, 1:2] 3.61e-02 -3.31e-05 -3.31e-05 3.67e-08
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:2] "(Intercept)" "total"
.. ..$ : chr [1:2] "(Intercept)" "total"
- attr(*, "class")= chr "summary.lm"
However, it seems that I can refer to the variable coefficients with all of the following commands:
summary(house.lm)$coe[2,2]
summary(house.lm)$coef[2,2]
summary(house.lm)$coeff[2,2]
summary(house.lm)$coeffi[2,2]
summary(house.lm)$coeffic[2,2]
summary(house.lm)$coeffici[2,2]
summary(house.lm)$coefficie[2,2]
summary(house.lm)$coefficien[2,2]
summary(house.lm)$coefficient[2,2]
summary(house.lm)$coefficients[2,2]
They all give the same results: 0.01029709
Therefore, I was wondering when I can refer to a variable with only part of its name in R?
You can do it when the rest of the name is unambiguous. For example:
df <- data.frame(abcd = c(1,2,3), xyz = c(4,5,6), abc = c(5,6,7))
> df$xy
[1] 4 5 6
> df$ab
NULL
> df$x
[1] 4 5 6
df$xy and even df$x return the right column, but df$ab results in NULL because it could refer to either df$abc or df$abcd. It is similar to typing df$xy in RStudio and pressing Ctrl + Space: you get the right variable name, so you can refer to a variable by part of its name.
http://adv-r.had.co.nz/Functions.html#lexical-scoping
When calling a function you can specify arguments by position, by
complete name, or by partial name. Arguments are matched first by
exact name (perfect matching), then by prefix matching, and finally by
position.
When you are coding quickly to analyse some data, using partial names is not a problem, but I tend to agree that it is not good practice when writing code. In a package you can't do that: R CMD check will find every occurrence.
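The difference between $ and [[ is worth seeing side by side, since only $ matches prefixes by default. A small sketch with a toy list (fit_summary is a hypothetical stand-in for a summary.lm object):

```r
# Toy list with two names that share the prefix "c"
fit_summary <- list(coefficients = c(1.5, 2.5), call = "toy call")

by_prefix  <- fit_summary$coef                   # unique prefix: partial match
exact_only <- fit_summary[["coef"]]              # NULL: [[ is exact by default
opt_in     <- fit_summary[["coef", exact = FALSE]]  # partial match on request
ambiguous  <- fit_summary$c                      # NULL: prefix of both names

print(by_prefix)
print(is.null(exact_only))
print(is.null(ambiguous))
```

Setting options(warnPartialMatchDollar = TRUE) makes R warn whenever $ resolves a name by prefix, which can help when cleaning up code that relies on it.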
I am trying to use the R baseline package on a sample dataset of mine, to test and evaluate the baseline algorithm that I currently have.
I wanted to apply the fillPeaks algorithm as a trend line for comparison.
library(baseline)
data(milk)
bc.fillPeaks <- baseline(milk$spectra[1, , drop=FALSE], lambda=6,
hwi=50, it=10, int=2000, method="fillPeaks")
plot(bc.fillPeaks)
But my problem is that the sample data that I have does not fit the matrix structure which is used in the example. When I look at the data.frame used for the example, I don't understand it:
'data.frame': 45 obs. of 2 variables
$ cow : num 0 0.25 0.375 0.875 0.5 0.75 0.5 0.125 0 0.125 ...
$ spectra: num [1:45, 1:21451] 1029 371 606 368 554 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr "4999.94078628963" "5001.55954267662" "5003.17856106153" "5004.79784144435" ...
- attr(*, "terms")=Classes 'terms', 'formula' length 3 cow ~ spectra
.. ..- attr(*, "variables")= language list(cow, spectra)
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "cow" "spectra"
.. .. .. ..$ : chr "spectra"
.. ..- attr(*, "term.labels")= chr "spectra"
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(cow, spectra)
.. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "nmatrix.21451"
.. .. ..- attr(*, "names")= chr [1:2] "cow" "spectra"
My question is therefore whether any of you have experience with the baseline package and the dataset (milk) used in it, and ideas on how I can convert my data set, which is structured as Date, Visits, Old_baseline_visits,
to fit and test the baseline algorithm from the R package.
I have used baseline, and found it slightly confusing at first, particularly the example data. As it says in the help file, baseline expects a matrix with the spectra in rows. Even if you only have one "spectrum", it needs to be in the form of a single row matrix. Try this:
foo <- data.frame(Date=seq.Date(as.Date("1957-01-01"), by = "day",
length.out = ncol(milk$spectra)),
Visits=milk$spectra[1,],
Old_baseline_visits=milk$spectra[1,], row.names = NULL)
foo.t <- t(foo$Visits) # Visits in a single row matrix
bc.fillPeaks <- baseline(foo.t, lambda=6,
hwi=50, it=10, int=2000, method='fillPeaks')
plot(bc.fillPeaks)
If you want the baseline and corrected spectra back in your original data frame, try this:
foo$New_baseline <- c(getBaseline(bc.fillPeaks))
foo$New_corrected <- c(getCorrected(bc.fillPeaks))
plot(foo$Date, foo$New_corrected, "l")
Alternatively, if you don't need the baseline object, you can use baseline.fillPeaks(), which returns a list.
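The key step above is getting a plain vector into a single-row matrix. A base-R sketch of the two ways this shape arises, using made-up numbers rather than the milk data:

```r
# Hypothetical visit counts, one per date
visits <- c(12, 15, 11, 30, 14)

# t() on a plain vector yields a 1 x n matrix, the shape baseline() expects
row_matrix <- t(visits)
print(dim(row_matrix))

# Two toy "spectra" of three points each
spectra <- matrix(1:6, nrow = 2)

kept    <- spectra[1, , drop = FALSE]  # stays a 1 x 3 matrix
dropped <- spectra[1, ]                # collapses to a vector, dim is NULL

print(dim(kept))
print(is.null(dim(dropped)))
```

This is why the package example indexes with drop = FALSE: without it, selecting one row silently turns the matrix into a vector.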
I get the following error when trying to use stargazer::stargazer with a coxph model (survival package):
> summary(firm.survcox)
> stargazer(firm.survcox)
> firm.survcox #only partial results are included for brevity
Call: coxph(formula = Surv(Duration, Event, type = "right") ~ Innovation + Avg_Wage)
coef exp(coef) se(coef) z p
Innovation -0.87680 0.4161 0.008040 -109.051 0.0e+00
Prior_Experience:Age -0.01297 0.9871 0.004174 -3.107 1.9e-03
Likelihood ratio test=131885 on 23 df, p=0 n= 535416,
number of events= 203037 (1060020 observations deleted due to missingness)
Error in .get.standard.errors.1(object.name, user.given) :
subscript out of bounds
The variables in the model are firm-, industry-, and region-level, and there is one interaction term. When I run the model on only one variable (e.g. Innovation), I get the same error message. Rownames are NULL.
Updated Question and Response:
Sorry about the confusion. I did not realize I could edit the original question. Below is the model run on one variable, Innovation, which produces the same error.
Call: coxph(formula = Surv(Duration, Event, type = "right") ~ Innovation)
coef exp(coef) se(coef) z p
Innovation -0.87680 0.4161 0.008040 -109.051 0.0e+00
Likelihood ratio test=131885 on 23 df, p=0 n= 535416,
number of events= 203037 (1060020 observations deleted due to missingness)
stargazer(firm.survcox)
Error in .get.standard.errors.1(object.name, user.given) :
subscript out of bounds
Structure of the Cox Model
str(firm.survcox)
List of 18
$ coefficients : Named num -0.772
..- attr(*, "names")= chr "Innovation"
$ var : num [1, 1] 3.91e-05
$ loglik : num [1:2] -5583174 -5573640
$ score : num 16015
$ iter : int 4
$ linear.predictors: num [1:1595436] 0.0807 0.0807 0.0807 -0.6915 0.0807 ...
$ residuals : Named num [1:1595436] 0.925 0.976 -0.516 0.888 0.976 ...
..- attr(*, "names")= chr [1:1595436] "1" "2" "3" "4" ...
$ means : Named num 0.104
..- attr(*, "names")= chr "Innovation"
$ concordance : Named num [1:5] 5.20e+10 1.97e+10 3.67e+11 1.14e+10 2.49e+08
..- attr(*, "names")= chr [1:5] "concordant" "discordant" "tied.risk" "tied.time"..
$ method : chr "efron"
$ n : int 1595436
$ nevent : num 404033
$ terms :Classes 'terms', 'formula' length 3 Surv(Duration, Event, type = "right") ~ Innovation
.. ..- attr(*, "variables")= language list(Surv(Duration, Event, type = "right"), Innovation)
.. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:2] "Surv(Duration, Event, type = \"right\")" "Innovation"
.. .. .. ..$ : chr "Innovation"
.. ..- attr(*, "term.labels")= chr "Innovation"
.. ..- attr(*, "specials")=Dotted pair list of 3
.. .. ..$ strata : NULL
.. .. ..$ cluster: NULL
.. .. ..$ tt : NULL
.. ..- attr(*, "order")= int 1
.. ..- attr(*, "intercept")= int 1
.. ..- attr(*, "response")= int 1
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
.. ..- attr(*, "predvars")= language list(Surv(Duration, Event, type = "right"), Innovation)
.. ..- attr(*, "dataClasses")= Named chr [1:2] "nmatrix.2" "numeric"
.. .. ..- attr(*, "names")= chr [1:2] "Surv(Duration, Event, type = \"right\")" "Innovation"
$ assign :List of 1
..$ Innovation: num 1
$ wald.test : Named num 15249
..- attr(*, "names")= chr "Innovation"
$ y : Surv [1:1595436, 1:2] 2 1 10+ 5 1 10+ 10+ 6 8+ 8+ ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:1595436] "1" "2" "3" "4"
.. ..$ : chr [1:2] "time" "status"
..- attr(*, "type")= chr "right"
$ formula :Class 'formula' length 3 Surv(Duration, Event, type = "right") ~ Innovation
.. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
$ call : language coxph(formula = Surv(Duration, Event, type = "right") ~ Innovation)
- attr(*, "class")= chr "coxph"
The material posted in your comment (which I added to the question and applied the code block formatting function as you should have done) doesn't really make sense. The coefficients listed "Innovation" and "Prior_Experience:Age" do not match the formula in the call: "formula = Surv(Duration, Event, type = "right") ~ Innovation + Avg_Wage)". There seems to have been some mangling of the object along the way.
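One way to check for this kind of mismatch is to compare the term labels stored in the fitted object with the names of its coefficients. The sketch below uses lm on the built-in mtcars data purely as a stand-in, since terms() and coef() work in the same way on many model classes; whether it pinpoints the problem with the coxph object in question is an assumption.

```r
# Stand-in model (not the coxph fit from the question)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Terms the formula asks for
formula_terms <- attr(terms(fit), "term.labels")

# Coefficients the object actually holds (a Cox model has no intercept)
coef_names <- setdiff(names(coef(fit)), "(Intercept)")

# Both should be empty when the object is consistent with its call
missing_from_fit  <- setdiff(formula_terms, coef_names)
unexpected_in_fit <- setdiff(coef_names, formula_terms)

print(missing_from_fit)
print(unexpected_in_fit)
```

With factor predictors the coefficient names carry level suffixes, so an exact comparison like this would need adjusting; for purely numeric predictors the two sets should agree.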