Using {gtsummary} to display confidence intervals for survey.design object?

{gtsummary} has the tbl_svysummary() function for producing summary statistics tables from survey.design objects created by the {survey} package. The {gtsummary} website provides an example of how to add confidence intervals for tbl_summary(): you define custom functions that calculate the CIs and pass them to the statistic = argument of tbl_summary().
However, the documentation for tbl_svysummary() notes that "Unlike tbl_summary(), it is not possible to pass a custom function." I'm using a survey.design object because I'm applying weights to my data, but I really like the output of {gtsummary}, so it would be great to find a way to add confidence intervals, which I need to show for reporting.
Any suggestions on how to achieve this, or is it not possible?

I am sorry to report that it is currently not possible. The way one would go about it is with the add_stat() function (see the example in How to generate effect size [90%CI] in the summary table using R package “gtsummary”?), but that function has not yet been generalized to work with tbl_svysummary() objects.
I had never considered generalizing it until now, so thank you very much for your question. I opened a GitHub Issue to track implementation progress. You can subscribe to the issue to be notified when it is complete.
https://github.com/ddsjoberg/gtsummary/issues/688
Happy Programming!
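Until that is implemented, the intervals can be computed directly with {survey} and reported alongside the table. The following is only a sketch: it uses the api example data that ships with {survey} as a stand-in for the asker's data, and the design specification is an assumption that must be replaced with the real weights and strata.

```r
library(survey)

# Example data shipped with {survey}; stands in for the real weighted data.
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Weighted mean of a continuous variable and its 95% confidence interval.
m <- svymean(~api00, dstrat)
ci_mean <- confint(m, level = 0.95)
print(m)
print(ci_mean)

# Weighted proportion with a confidence interval on the logit scale.
p <- svyciprop(~I(sch.wide == "Yes"), dstrat, method = "logit")
print(p)
```

These numbers are not inserted into the {gtsummary} table automatically; they would have to be reported separately or merged in by hand.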

Related

what tests to use to validate a dynamic factor model obtained using nowcasting package in r

I used the nowcast function from the R package nowcasting to fit a dynamic factor model and nowcast GDP using the extracted factors. I have tried multiple combinations of the initial variables and finally obtained this model, in which all variables seem significant and the values obtained for my variable of interest are acceptable.
[image: model output]
But I can't find any reference about what tests on residuals that I need to do in order to validate this model.
I am really struggling and have been stuck on this for a month. I need to submit my graduation project this weekend and I really need this model to work, so any help will be very much appreciated. Thank you.
Update 1:
This is the ACF plot of the residuals suggested by the same package, nowcasting. I think my model passes that test and therefore I can use it. Is that right?
[image: ACF plot of the residuals]
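Residual checks that are commonly run alongside an ACF plot include the Ljung-Box test for remaining autocorrelation and a normality test. The sketch below uses only base R; res is a hypothetical stand-in (simulated white noise) for the residuals of the fitted model, and the choice of 12 lags is an assumption, not a recommendation from the nowcasting package.

```r
set.seed(1)

# Hypothetical stand-in for the model residuals; replace with the real ones.
res <- rnorm(120)

# Ljung-Box test: the null hypothesis is no autocorrelation up to the given lag.
lb <- Box.test(res, lag = 12, type = "Ljung-Box")

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
sw <- shapiro.test(res)

print(lb)
print(sw)
```

A large p-value in the Ljung-Box test is consistent with what a flat ACF plot shows, namely no evidence of leftover serial correlation.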

Holt-Winters in r with function hw. Question on value of beta parameter and forecasting phase

I have used the function hw to analyze a time series
fcast<-hw(my_series,h=12,level=95,seasonal="multiplicative",damped=TRUE,lambda=NULL)
By looking at the fcast$model$par I observe the values of alpha, beta, gamma, phi, and the initial states.
I've also looked at the contents of fcast$model$states to see the evolution of all the values. I've tried to reproduce the results in Excel in order to understand the whole procedure.
To achieve the same values of b (trend) as in fcast$model$states I observe that I have to use a formula like the one in the bibliography about the Holt-Winters method:
b(t)=beta2*(l(t)-l(t-1))+(1-beta2)*phi*b(t-1)
But, if in fcast$model$par beta=0.08128968, I find that in order to achieve the same results I have to use beta2=0.50593541.
What's the reason for that? I don't see any relationship between beta and beta2.
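One possible explanation, offered as an assumption rather than something confirmed in this thread: hw() is a wrapper around the ETS state space models, and those are usually parametrized with beta = alpha * beta*, where beta* is the trend parameter of the textbook component form used above. If that is what is happening, then beta2 = beta / alpha, and the alpha reported in fcast$model$par should be close to the ratio computed below.

```r
# Values quoted in the question.
beta_ets <- 0.08128968   # beta reported in fcast$model$par
beta_hw  <- 0.50593541   # beta2 needed to reproduce the states by hand

# Under the assumed relationship beta = alpha * beta2, the implied alpha is:
alpha_implied <- beta_ets / beta_hw
print(alpha_implied)

# Compare this value with the alpha element of fcast$model$par.
```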
I have also found that in order to get the same forecast as the one obtained with the hw function I have to use the following formulas once the data are finished:
l(t)=l(t-1)+b(t-1)
b(t)=phi*b(t-1)
^y(t)=(l(t-1)+b(t-1))*s(t-m)
I haven't found any bibliography on this forecasting phase, explaining that some parameters are no longer used. For instance, in this case phi is still used for b(t), but not used anymore for l(t).
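The recursion described above can be written as a short function, which makes it easier to compare against fcast$mean. This is a sketch of the formulas exactly as stated in this question, not of the internals of hw(); forecast_recursion and its arguments are hypothetical names, and s_last is assumed to hold the last m seasonal indices in chronological order.

```r
# Forecasting-phase recursion as stated in the question:
#   yhat(t) = (l(t-1) + b(t-1)) * s(t-m)
#   l(t)    = l(t-1) + b(t-1)
#   b(t)    = phi * b(t-1)
forecast_recursion <- function(l_last, b_last, s_last, phi, h) {
  m <- length(s_last)
  l <- l_last
  b <- b_last
  yhat <- numeric(h)
  for (k in seq_len(h)) {
    s <- s_last[(k - 1) %% m + 1]  # seasonal index from one season earlier
    yhat[k] <- (l + b) * s
    l <- l + b                     # level update, no smoothing
    b <- phi * b                   # the trend only decays by phi
  }
  yhat
}

# Toy numbers (hypothetical), with a season length of 2.
fc <- forecast_recursion(l_last = 100, b_last = 2, s_last = c(0.5, 2),
                         phi = 0.5, h = 3)
print(fc)
```

The smoothing parameters alpha, beta and gamma do not appear because, once the data end, there are no new observations and therefore no errors left to smooth.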
Can anyone refer to any bibliography where I can find this part explained?
So in the end I've been able to reproduce the whole set of data in Excel, but there are a couple of steps I would like to understand better.
Thank you!!

Equivalent to fitcdiscr in R (regarding Coeffs.linear and Coeffs.Const)

I am currently translating some MATLAB scripts to R for Multivariate Data Analysis. Currently I am trying to generate the same data as the Coeffs.Linear and Coeffs.Const part of the fitdiscr function in MATLAB.
The code being used is:
fitcdiscr(data, groups, 'DiscrimType', 'linear');
The data consists of 3 groups.
Unfortunately the R function seems to do the LDA only for two LDs and MATLAB seems to always compare all groups in all constellations. Does anybody have an idea how I could obtain that data?
I suspect you mean information on the implementation of various MATLAB functions. Use doc <functionname> to get the documentation (doc fitcdiscr yields this documentation page on fitcdiscr) and edit <functionname> to get the implementation, if it is not obscured by The MathWorks. If those two do not give you enough information, I'm afraid you're out of luck, since not all TMW code is available non-obscured.
fitcdiscr is non-obscured, although very brief; it's just a wrapper for some other functions. Keep doing edit <functionname> and doc <functionname> and see how deep the rabbit hole takes you.
NB: there's no built-in function called fitdiscr, but the syntax you describe is that of fitcdiscr (note the c), so I used that in the examples. If the actual function being called is named fitdiscr, it's custom-made and you'll have to sift through its file with edit fitdiscr and hope for the best.
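For the original goal of getting one set of coefficients per pair of groups in R, the textbook LDA boundary between classes i and j can be computed directly in base R. This is only a sketch: pairwise_lda_coeffs is a hypothetical helper, it assumes empirical priors and a pooled covariance with denominator n - K, and whether the numbers match MATLAB's Coeffs(i,j).Linear and Coeffs(i,j).Const exactly has not been verified here. The built-in iris data stands in for the real data.

```r
# Boundary between classes i and j: const + sum(linear * x) = 0, with
#   linear = Sigma^-1 (mu_i - mu_j)
#   const  = -0.5 * (mu_i + mu_j)' Sigma^-1 (mu_i - mu_j) + log(p_i / p_j)
pairwise_lda_coeffs <- function(X, groups) {
  X <- as.matrix(X)
  groups <- factor(groups)
  lev <- levels(groups)
  n <- nrow(X)
  K <- length(lev)
  means <- do.call(rbind, lapply(lev, function(g)
    colMeans(X[groups == g, , drop = FALSE])))
  priors <- as.numeric(table(groups)) / n    # empirical priors (assumption)
  W <- matrix(0, ncol(X), ncol(X))
  for (g in lev) {
    Xg <- X[groups == g, , drop = FALSE]
    W <- W + crossprod(sweep(Xg, 2, colMeans(Xg)))
  }
  Sigma <- W / (n - K)                       # pooled within-class covariance
  out <- list()
  for (i in seq_len(K)) for (j in seq_len(K)) {
    if (i == j) next
    lin <- solve(Sigma, means[i, ] - means[j, ])
    const <- -0.5 * sum((means[i, ] + means[j, ]) * lin) +
      log(priors[i] / priors[j])
    out[[paste(lev[i], lev[j], sep = " vs ")]] <- list(linear = lin, const = const)
  }
  out
}

co <- pairwise_lda_coeffs(iris[, 1:4], iris$Species)
print(names(co))
print(co[["setosa vs versicolor"]])
```

With 3 groups this returns 6 ordered pairs, and each pair (i, j) is the negative of (j, i).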

R: [Indicspecies package] multipatt function: extract values from summary.multipatt

I am working with the multipatt function of the 'indicspecies' package and am unable to extract the summary values. I can't print the whole summary, so I am left with only partial information for my model. The reason is the huge amount of data that needs to be printed from the summary (300.000 different species, 3 groups, 6 comparable combinations).
This is what happens with summary being saved (pre-code incl.):
x <- multipatt(data, ...)
sumx <-summary(x)
sumx
NULL
str(sumx)
NULL
So, the summary does not work exactly like a generic summary. It seems that the function is based around the older indval function from the 'labdsv' package (which is mentioned in the documentation). I found an archived thread where a similar problem is discussed: http://r.789695.n4.nabble.com/extract-values-from-summary-of-function-indval-of-the-package-labdsv-td4637466.html
but it seems not resolved (and is not exactly about the same function, rather the base function indval).
I was wondering if anyone has experience with the indicspecies package and knows a way to extract the info from the summary.
It is possible to extract significance and other information from the other saved data from the model, but it might be nice to just get a quick complete overview from the data.
ps. I tried
options(max.print=1000000)
but this didn't solve it for me.
I used to capture the summary output for a multipatt object, but I don't any more because the p-values reported are not corrected for multiple testing. To answer the OP's question, you can capture the summary output using capture.output
ex.
dat.multipatt.summary<-capture.output(summary(dat.multipatt, indvalcomp=TRUE))
Again, I do not recommend this. It is very important to correct the p-values for multiple testing, so the summary output actually isn't helpful. To be clear ?multipatt states:
"sign Data table with results of the best matching pattern, the association value and the degree of statistical significance of the association (i.e. p-values from permutation test). Note that p-values are not corrected for multiple testing."
I just posted an answer for how to correct the p-values here https://stats.stackexchange.com/questions/370724/indiscpecies-multipatt-and-overcoming-multi-comparrisons/401277#401277
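Both ideas can be tried without the package installed. The sketch below uses a mock object: print_only_summary is a hypothetical stand-in for a summary method that prints and returns NULL, and mock$sign imitates the sign table quoted above (the column name p.value is an assumption). The Benjamini-Hochberg method is used purely as an illustration and is not necessarily the correction recommended in the linked answer.

```r
# Stand-in for a summary method that only prints and returns NULL invisibly.
print_only_summary <- function(x) {
  cat("Rows in sign table:", nrow(x$sign), "\n")
  print(x$sign)
  invisible(NULL)
}

# Mock object imitating the sign table of a multipatt result (hypothetical).
mock <- list(sign = data.frame(
  stat = c(0.91, 0.75, 0.40, 0.62),
  p.value = c(0.001, 0.020, 0.300, 0.045),
  row.names = paste0("sp", 1:4)
))

# capture.output() stores the printed lines as a character vector.
captured <- capture.output(print_only_summary(mock))
print(length(captured))

# Working on the stored table directly allows a multiple-testing correction.
mock$sign$p.adj <- p.adjust(mock$sign$p.value, method = "BH")
significant <- mock$sign[mock$sign$p.adj <= 0.05, ]
print(significant)
```

Working from the sign table rather than the printed summary also avoids any print limit.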
I don't have any experience with this package and since you haven't provided the data, it's difficult to reproduce. But since summary is returning NULL, are you sure your x is computed properly? Check the object.size or class or something else of x to see if it indeed has any content.
Also, instead of accessing all the contents of summary(x) together, you can use @ to access its slots (similar to $ for a data frame).
If you need further assistance, it would be better to provide at least a small subset or some other sample data so that the community can work with it.

Frailty estimates in coxph object

If one uses obj=coxph(... + frailty(id) ), then the object also returns (log)frailty estimates for each individual, which can be extracted with obj$frail.
Does anybody know how these estimates are obtained? Are they Empirical Bayes estimates?
Thanks!
Theodor
The default distribution for frailty can be seen in the ?frailty page to be "gamma". If you look at the frailty function (which is not hidden) you see that it simply pastes the name of the distribution onto "frailty." and uses get() to retrieve the proper function. So look at frailty.gamma (also not hidden) to find the answers to your question. Looking back at the help page again, you can see that I should have been able to figure all that out without looking at the code, since it's right up at the top of the page. But there are many routes to knowledge with R. (They are ML, not "empirical Bayes", estimates.)
The help page suggests to me that the author (Therneau) expects you to consult Therneau and Grambsch for further details not obvious from reading the code. If you are doing serious work with survival models in R that is a very useful book to have. It's very clear and helpful in understanding the underpinnings of the 'survival'-package.
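The steps described above can be followed in a session. This sketch assumes the kidney example data that ships with the survival package; the model formula is only an illustration.

```r
library(survival)

# Cox model with a (default gamma) frailty term per subject.
fit <- coxph(Surv(time, status) ~ age + sex + frailty(id), data = kidney)

# One (log) frailty estimate per level of id.
print(head(fit$frail))
print(length(fit$frail))

# frailty() is not hidden, so its source can be printed directly;
# it dispatches to frailty.gamma, frailty.gaussian, and so on.
print(frailty)
print(exists("frailty.gamma"))
```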
