I am using the eRm package to estimate a Rasch model. The RM() function returns a Rasch model that I can summarize using the summary() function. However, when I try to store the results, R creates an empty object.
library(eRm)
my_data <- matrix(sample(0:1, 100, replace = TRUE), nrow = 10)
my_model <- RM(X = my_data)
summary(my_model)
my_summary <- summary(my_model)
Why does this operation not work in this case but does work when storing the summary of a linear model? Is there another way to store the summary of the eRm model?
As @Imo surmised, it looks like summary.eRm just prints to the console rather than returning an object. You can inspect the code for summary.eRm by running getAnywhere(summary.eRm). summary is a "generic" function, meaning that what it does depends on which "method" is dispatched when the function is invoked.
For an lm model object, when you type summary(my_model), the summary.lm function is dispatched. But when you type summary(my_model) and my_model is an eRm object, the summary.eRm method is dispatched. summary.lm returns an object, but summary.eRm just prints to the console. Run methods(summary) to see the various summary functions that get dispatched for different types of objects.
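To see the dispatch for yourself, here is a quick sketch using the objects from the question:
class(my_model)            # the class of the fitted object, which determines the method
methods(summary)           # lists the available summary methods
getAnywhere(summary.eRm)   # shows the eRm method's code, which prints rather than returns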
A workaround would be to create your own summary object (or a function to create such an object), using the model object itself. You can inspect the components of the model object with str(my_model). You can also look at the code for summary.eRm to see where it is getting each of the components that it prints to the console.
Here's a simple example, lifting code from summary.eRm to create a summary function:
RMsmry <- function(obj) {
  cols <- c("Estimate", "Std. Error", "lower CI", "upper CI")
  # Create difficulty summary (eta parameters)
  ci <- confint(obj, "eta")
  tbl1 <- as.data.frame(cbind(round(obj$etapar, 3),
                              round(obj$se.eta, 3), round(ci, 3)))
  names(tbl1) <- cols
  # Create easiness summary (beta parameters)
  ci <- confint(obj, "beta")
  tbl2 <- as.data.frame(cbind(round(obj$betapar, 3),
                              round(obj$se.beta, 3), round(ci, 3)))
  names(tbl2) <- cols
  return(list(Difficulty = tbl1, Easiness = tbl2))
}
my_summary <- RMsmry(my_model)
my_summary
$Difficulty
Estimate Std. Error lower CI upper CI
I2 -1.191 0.658 -2.480 0.098
I3 -1.191 0.658 -2.480 0.098
I4 0.078 0.627 -1.150 1.306
I5 -0.750 0.623 -1.971 0.471
I6 0.078 0.627 -1.150 1.306
I7 1.079 0.748 -0.386 2.544
I8 -0.339 0.614 -1.543 0.865
I9 0.078 0.627 -1.150 1.306
I10 1.079 0.748 -0.386 2.544
$Easiness
Estimate Std. Error lower CI upper CI
beta I1 -1.079 0.748 -2.544 0.386
beta I2 1.191 0.658 -0.098 2.480
beta I3 1.191 0.658 -0.098 2.480
beta I4 -0.078 0.627 -1.306 1.150
beta I5 0.750 0.623 -0.471 1.971
beta I6 -0.078 0.627 -1.306 1.150
beta I7 -1.079 0.748 -2.544 0.386
beta I8 0.339 0.614 -0.865 1.543
beta I9 -0.078 0.627 -1.306 1.150
beta I10 -1.079 0.748 -2.544 0.386
Say I have a series of GAMs that I would like to average together using MuMIn. How do I go about interpreting the results of the averaged smoothers? Why are there numbers after each smoother term?
library(glmmTMB)
library(mgcv)
library(MuMIn)
data("Salamanders") # glmmTMB data
# mgcv gams
gam1 <- gam(count ~ spp + s(cover) + s(DOP), data = Salamanders, family = tw, method = "ML")
gam2 <- gam(count ~ mined + s(cover) + s(DOP), data = Salamanders, family = tw, method = "ML")
gam3 <- gam(count ~ s(Wtemp), data = Salamanders, family = tw, method = "ML")
gam4 <- gam(count ~ mined + s(DOY), data = Salamanders, family = tw, method = "ML")
# MuMIn model average
summary(model.avg(gam1, gam2, gam3, gam4))
And an excerpt from the results...
Model-averaged coefficients:
(full average)
Estimate Std. Error
(Intercept) -1.32278368618846586812765053764451295137405 0.16027398202204409805027296442858641967177
minedno 2.22006553885311141982583649223670363426208 0.19680444996609294805445244946895400062203
s(cover).1 0.00096638939252485735100645092288118576107 0.05129736767981037115493592182247084565461
s(cover).2 0.00360413985630353601863351542533564497717 0.18864911049300209233692271482141222804785
s(cover).3 0.00034381902619062468381624930735540601745 0.01890820689958183642431777116144075989723
s(cover).4 -0.00248365164684107844403349041328965540743 0.12950622739175629560826052966149291023612
s(cover).5 -0.00089826079366626997504963192398008686723 0.04660149540411069601919535898559843190014
s(cover).6 0.00242197856572917875894734862640689243563 0.12855093144749979439112053114513400942087
s(cover).7 -0.00032596616013735266745646179664674946252 0.02076865732570042782922925539423886220902
s(cover).8 0.00700001172809289889942263584998727310449 0.36609857217759655956257347497739829123020
s(cover).9 -0.17150069832114492318630993850092636421323 0.17672571419517621449379873865836998447776
s(DOP).1 0.00018839994220792031023870016781529557193 0.01119134546418791391342306695833030971698
s(DOP).2 -0.00081869157242861999301819508900734945200 0.04333670935815417402103832955617690458894
s(DOP).3 -0.00021538789478326670289408395486674407948 0.01164171952980479901595955993798270355910
s(DOP).4 0.00043433676942596419591827161532648915454 0.02463278659589070856972270462392771150917
This is a little easier to read if you don't print so many digits (see below):
Each smooth term is parameterized using multiple coefficients (9 by default), which is why we have multiple s(whatever).xxx coefficients.
It's not clear to me what you want to do with the model-averaged results. It's usually best to make model-averaged predictions rather than trying to interpret model-averaged coefficients, which has some pitfalls ... There is a predict() method for objects of class "averaging" (which is what model.avg() returns); a small sketch follows the output below.
For further questions about interpretation you might want to ask on CrossValidated ...
Model-averaged coefficients:
(full average)
Estimate Std. Error Adjusted SE z value Pr(>|z|)
(Intercept) -1.323e+00 1.603e-01 1.606e-01 8.239 <2e-16 ***
minedno 2.220e+00 1.968e-01 1.971e-01 11.263 <2e-16 ***
s(cover).1 9.664e-04 5.130e-02 5.130e-02 0.019 0.985
s(cover).2 3.604e-03 1.886e-01 1.887e-01 0.019 0.985
s(cover).3 3.438e-04 1.891e-02 1.891e-02 0.018 0.985
s(cover).4 -2.484e-03 1.295e-01 1.295e-01 0.019 0.985
s(cover).5 -8.983e-04 4.660e-02 4.660e-02 0.019 0.985
s(cover).6 2.422e-03 1.286e-01 1.286e-01 0.019 0.985
s(cover).7 -3.260e-04 2.077e-02 2.078e-02 0.016 0.987
s(cover).8 7.000e-03 3.661e-01 3.661e-01 0.019 0.985
s(cover).9 -1.715e-01 1.767e-01 1.768e-01 0.970 0.332
s(DOP).1 1.884e-04 1.119e-02 1.120e-02 0.017 0.987
s(DOP).2 -8.187e-04 4.334e-02 4.334e-02 0.019 0.985
s(DOP).3 -2.154e-04 1.164e-02 1.164e-02 0.018 0.985
s(DOP).4 4.343e-04 2.463e-02 2.464e-02 0.018 0.986
s(DOP).5 -1.737e-04 1.019e-02 1.020e-02 0.017 0.986
s(DOP).6 -3.224e-04 1.790e-02 1.790e-02 0.018 0.986
s(DOP).7 2.991e-07 5.739e-04 5.750e-04 0.001 1.000
s(DOP).8 -1.756e-03 9.557e-02 9.559e-02 0.018 0.985
s(DOP).9 1.930e-02 5.630e-02 5.639e-02 0.342 0.732
s(DOY).1 5.189e-08 3.378e-04 3.384e-04 0.000 1.000
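As a sketch of that predict() suggestion (not from the original answer; avg and newdat are illustrative names, and the link-scale predictions are back-transformed by hand because all four models use tw's default log link):
avg <- model.avg(gam1, gam2, gam3, gam4)
newdat <- Salamanders[1:5, ]          # a few rows to predict for
p <- predict(avg, newdata = newdat)   # model-averaged predictions on the link (log) scale
exp(p)                                # back-transform to the expected-count scale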
I'm calculating Wald tests (with the R package eRm) and have tried without success to get more than 3 decimal places for the p-values (I need them for an alpha correction).
Does someone have an idea how I can get more decimals in this specific output?
Changing digits = ... in print() didn't work.
library("eRm")
res <- RM(ds_matrix)
wald <- Waldtest(res, splitcr = splitage)
print(wald)
## Wald test on item level (z-values):
## z-statistic p-value
## beta I01 1.489 0.136
## beta I02 0.908 0.364
## beta I03 0.402 0.688
w <- Waldtest(res, splitcr = splitage)
pvals <- w$coef.table[,"p-value"]
print(pvals,digits=22)
Results:
beta I1 beta I2 beta I5
0.5019397827755713858977 0.6252106345771608619799 0.6384882841798422692392
beta I6
0.7424853136244984330716
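And if you need to write the full-precision p-values to a file, here is a small sketch (not part of the original answer; the file name is made up):
out <- data.frame(item = names(pvals), p.value = format(pvals, digits = 15))
write.table(out, file = "wald_pvalues.txt", sep = "\t", quote = FALSE, row.names = FALSE)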
I'm wondering if it's possible to access (in some form) the information that is presented in the -forest- command in the -metafor- package.
I am checking/verifying results, and I'd like to have those values produced as output. Thus far the calculations all check out, but I'd like to have them available for printing, saving, etc., instead of having to type them out by hand.
Sample code is below :
es <- read.table(header=TRUE, text = "
b se_b
0.083 0.011
0.114 0.011
0.081 0.013
0.527 0.017
" )
library(metafor)
es.est <- rma(yi=b, sei=se_b, dat=es, method="DL")
studies <- as.vector( c("Larry (2011)" , "Curly (2011)", "Moe (2015)" , "Shemp (2010)" ) )
forest(es.est , transf=exp , slab = studies , refline = 1 , xlim=c(0,3), at = c(1, 1.5, 2, 2.5, 3, 3.5, 4) , showweights=TRUE)
I'd like to access the values (effect size and c.i. for each study, as well as the overall estimate, and c.i.) that are presented on the right of the graphic.
Thanks so much,
-Jon
How about:
> summary(escalc(measure="GEN", yi=b, sei=se_b, data=es), transf=exp)
b se_b yi vi sei zi ci.lb ci.ub
1 0.083 0.011 1.0865 0.0001 0.0110 7.5455 1.0634 1.1102
2 0.114 0.011 1.1208 0.0001 0.0110 10.3636 1.0968 1.1452
3 0.081 0.013 1.0844 0.0002 0.0130 6.2308 1.0571 1.1124
4 0.527 0.017 1.6938 0.0003 0.0170 31.0000 1.6383 1.7512
Then yi, ci.lb, and ci.ub provide the same info.
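To also get the overall (pooled) estimate and its confidence interval on the same exponentiated scale, a small addition (a sketch, not part of the original answer):
predict(es.est, transf = exp)   # pooled estimate with back-transformed CI bounds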
I'm trying to replicate in R a Cox proportional hazards model estimated in Stata, using the following data: http://iojournal.org/wp-content/uploads/2015/05/FortnaReplicationData.dta
The commands in Stata are the following:
stset enddate2009, id(VPFid) fail(warends) origin(time startdate)
stcox HCTrebels o_rebstrength demdum independenceC transformC lnpop lngdppc africa diffreligion warage if keepobs==1, cluster(js_country)
Cox regression -- Breslow method for ties
No. of subjects = 104 Number of obs = 566
No. of failures = 86
Time at risk = 194190
Wald chi2(10) = 56.29
Log pseudolikelihood = -261.94776 Prob > chi2 = 0.0000
(Std. Err. adjusted for 49 clusters in js_countryid)
-------------------------------------------------------------------------------
| Robust
_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
--------------+----------------------------------------------------------------
HCTrebels | .4089758 .1299916 -2.81 0.005 .2193542 .7625165
o_rebstrength | 1.157554 .2267867 0.75 0.455 .7884508 1.699447
demdum | .5893352 .2353317 -1.32 0.185 .2694405 1.289027
independenceC | .5348951 .1882826 -1.78 0.075 .268316 1.066328
transformC | .5277051 .1509665 -2.23 0.025 .3012164 .9244938
lnpop | .9374204 .0902072 -0.67 0.502 .7762899 1.131996
lngdppc | .9158258 .1727694 -0.47 0.641 .6327538 1.325534
africa | .5707749 .1671118 -1.92 0.055 .3215508 1.013165
diffreligion | 1.537959 .4472004 1.48 0.139 .869834 2.719275
warage | .9632408 .0290124 -1.24 0.214 .9080233 1.021816
-------------------------------------------------------------------------------
With R, I'm using the following:
library(foreign)    # for read.dta()
library(survival)   # for coxph()
data <- read.dta("FortnaReplicationData.dta")
data4 <- subset(data, keepobs == 1)
data4$end_date <- data4$`_t`
data4$start_date <- data4$`_t0`
levels(data4$o_rebstrength) <- c(0:4)
data4$o_rebstrength <- as.numeric(levels(data4$o_rebstrength))[data4$o_rebstrength]
data4 <- data4[, c("start_date", "end_date", "HCTrebels", "o_rebstrength", "demdum", "independenceC", "transformC", "lnpop", "lngdppc", "africa", "diffreligion", "warage", "js_countryid", "warends")]
data4 <- na.omit(data4)
surv <- coxph(Surv(start_date, end_date, warends) ~ HCTrebels + o_rebstrength + demdum + independenceC + transformC + lnpop + lngdppc + africa + diffreligion + warage + cluster(js_countryid), data = data4, robust = TRUE, method = "breslow")
coef exp(coef) se(coef) robust se z p
HCTrebels -0.8941 0.4090 0.3694 0.3146 -2.84 0.0045
o_rebstrength 0.1463 1.1576 0.2214 0.1939 0.75 0.4505
demdum -0.5288 0.5893 0.4123 0.3952 -1.34 0.1809
independenceC -0.6257 0.5349 0.3328 0.3484 -1.80 0.0725
transformC -0.6392 0.5277 0.3384 0.2831 -2.26 0.0240
lnpop -0.0646 0.9374 0.1185 0.0952 -0.68 0.4974
lngdppc -0.0879 0.9158 0.2060 0.1867 -0.47 0.6377
africa -0.5608 0.5708 0.3024 0.2898 -1.94 0.0530
diffreligion 0.4305 1.5380 0.3345 0.2878 1.50 0.1347
warage -0.0375 0.9632 0.0405 0.0298 -1.26 0.2090
Likelihood ratio test=30.1 on 10 df, p=0.000827
n= 566, number of events= 86
I get the same hazard ratio coefficients, but the standard errors do not look the same. The z and p values are close but not exactly the same. What might explain the difference between the results in R and Stata?
As user20650 noticed, when you include the "nohr" option in Stata you get exactly the same standard errors as in R. Still, there was a small difference in the standard errors when using clusters. user20650 again noticed that this difference arises because Stata's clustered standard errors are, by default, multiplied by g/(g − 1), where g is the number of clusters, while R does not apply this adjustment. So a solution is either to include noadjust in Stata or to adjust the standard errors in R by doing:
sqrt(diag(vcov(surv)) * (49/48))
If we still want R to reproduce the standard errors Stata reports when nohr is not specified, we need to know that when nohr is left off we obtain $\exp(\beta)$ with standard errors on that scale, obtained by applying the delta method to the original standard-error estimates. "The delta method obtains the standard error of a transformed variable by calculating the variance of the corresponding first-order Taylor expansion, which for the transform $\exp(\beta)$ amounts to multiplying the original standard error by $\exp(\hat{\beta})$. This trick of calculation yields identical results as does transforming the parameters prior to estimation and then reestimating" (Cleves et al. 2010). In R we can do it using:
library(msm)
# cluster-adjusted variances of the coefficients (deltamethod() expects variances, not SEs)
se <- diag(vcov(surv) * (49/48))
sapply(se, function(x) deltamethod(~ exp(x1), coef(surv)[which(se == x)], x))
HCTrebels o_rebstrength demdum independenceC transformC lnpop lngdppc africa diffreligion warage
0.1299916 0.2267867 0.2353317 0.1882826 0.1509665 0.0902072 0.1727694 0.1671118 0.4472004 0.02901243
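Equivalently, without msm, a small sketch that just applies the multiplication described in the quoted passage:
# SE(exp(beta)) is approximately exp(beta) * SE(beta), using the g/(g-1)-adjusted SEs
se.adj <- sqrt(diag(vcov(surv)) * (49/48))
exp(coef(surv)) * se.adj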
I ran coxph on my data and got a result like this:
> z
Call:
coxph(formula = Surv(Years, Event) ~ y, data = x)
coef exp(coef) se(coef) z p
y 0.0714 1.07 0.288 0.248 0.8
Likelihood ratio test=0.06 on 1 df, p=0.804 n= 65, number of events= 49
I just want to save
y 0.0714 1.07 0.288 0.248 0.8
into a file, because I do permutations and generate 1000 such z objects.
I want to save them into a text file like this:
fin -0.3794 0.684 0.1914 -1.983 0.0470
age -0.0574 0.944 0.0220 -2.611 0.0090
race 0.3139 1.369 0.3080 1.019 0.3100
wexp -0.1498 0.861 0.2122 -0.706 0.4800
mar -0.4337 0.648 0.3819 -1.136 0.2600
paro -0.0849 0.919 0.1958 -0.434 0.6600
Can anyone help?
Thanks!
The coefficients are easily accessed by
summary(z)[['coefficients']]
and the confidence interval information by
summary(z)[['conf.int']]
To find out what the components of a summary.coxph object are, run
str(summary(z))
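For the single line the question asks about, a small sketch (coef_row is just an illustrative name):
# Pull just the row for y from the coefficient matrix of summary(z)
coef_row <- summary(z)[["coefficients"]]["y", , drop = FALSE]
coef_row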
My advice would be to create a list of your permuted datasets:
data_list <- list(data_1, ...., data_1000)
Then call
lots_models <- lapply(data_list, coxph, formula = Surv(Years, Event) ~ y)
Which creates a list of models
You can create the summaries by
lots_summaries <- lapply(lots_models, summary)
Extract the coefficients
all_coefficients <- lapply(lots_summaries, '[[', 'coefficients')
all_conf.int <- lapply(lots_summaries, '[[', 'conf.int')
Add a permutation id column (if you want)
all_coefs_id <- lapply(seq_along(data_list),
function(i) cbind(all_coefficients[[i]],i))
all_ci_id <- lapply(seq_along(data_list),
function(i) cbind(all_conf.int[[i]],i))
Then combine into a data.frame
all_coefs_df <- do.call(rbind, all_coefs_id)
all_ci_df <- do.call(rbind, all_ci_id)
Which you can then save as a text file.
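For example, a minimal sketch (the file name is made up):
# Write the combined coefficient table to a tab-separated text file
write.table(all_coefs_df, file = "permutation_coefficients.txt",
            sep = "\t", quote = FALSE)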