Problem using 'anova_fun' argument in 'add_global_p()' - r

I'm trying to display the results of an analysis of variance applied to univariate regressions in a table created with gtsummary::tbl_uvregression(), using a function other than car::Anova(), which cannot handle every type of model. In my case I would like to use Anova.clm() from the RVAideMemoire package, which is directly based on car::Anova() and specifically built for clm(m) objects.
Unfortunately, whenever I try to use the anova_fun argument of add_global_p(), I get an error telling me that mod (the model) must be supplied to the function and has no default value. I read the documentation of add_global_p(), which states that the anova_fun function is used "in place of car::Anova()" and must accept a model as one of its arguments. As far as I know, Anova.clm() does.
Moreover, even when I try to use car::Anova() itself through the anova_fun argument, I still get the same error. I also tried to dig through the source code of the relevant functions, but couldn't find the solution and am beginning to get lost...
Here is a reprex:
## Load packages
library(survival)
library(gtsummary)
## Create the table using tbl_uvregression
tab <- trial %>%
  select(response, age, trt) %>%
  tbl_uvregression(method = glm,
                   method.args = list(family = binomial),
                   y = response,
                   exponentiate = TRUE)
## Add global p-values from analysis-of-variance through 'anova_fun' argument
tab %>%
  add_global_p(anova_fun = car::Anova)
## And the same for tbl_regression
trial %>%
  glm(response ~ age + trt, data = ., family = binomial("logit")) %>%
  tbl_regression(exponentiate = TRUE) %>%
  add_global_p(anova_fun = car::Anova)
Maybe I just misunderstand how the anova_fun argument of add_global_p() is meant to be used... In that case, could somebody help me sort it out? Or, at the very least, is there another (simple) way to add ANOVA global p-values to a tbl_uvregression object applied to clmm models?

Related

Is it possible to use lqmm with a mira object?

I am using the package lqmm to run a linear quantile mixed model on an imputed object of class mira from the package mice. I tried to make a reproducible example:
library(lqmm)
library(mice)
summary(airquality)
imputed <- mice(airquality, m = 5)
summary(imputed)
fit1 <- lqmm(Ozone ~ Solar.R + Wind + Temp + Day, random = ~1,
             tau = 0.5, group = Month, data = airquality, na.action = na.omit)
fit1
summary(fit1)
fit2 <- with(imputed, lqmm(Ozone ~ Solar.R + Wind + Temp + Day, random = ~1,
                           tau = 0.5, group = Month, na.action = na.omit))
"Error in lqmm(Ozone ~ Solar.R + Wind + Temp + Day, random = ~1, tau = 0.5, :
`data' must be a data frame"
Yes, it is possible to get lqmm() to work in mice. Viewing the code for lqmm(), it turns out that it's a picky function. It requires that the data argument is supplied, and although it appears to check if the data exists in another environment, it doesn't seem to work in this context. Fortunately, all we have to do to get this to work is capture the data supplied from mice and give it to lqmm().
fit2 <- with(imputed,
             lqmm(Ozone ~ Solar.R + Wind + Temp + Day,
                  data = data.frame(mget(ls())),
                  random = ~1, tau = 0.5, group = Month, na.action = na.omit))
The explanation is that ls() gets the names of the variables available, mget() gets those variables as a list, and data.frame() converts them into a data frame.
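You can see the same trick in isolation. Below is a minimal sketch, independent of mice, using a made-up environment e that stands in for what with() provides: evaluated inside that environment, ls() lists the names, mget() fetches the objects, and data.frame() binds them back into a data frame.
e <- new.env()
e$x <- rnorm(5)   # hypothetical columns standing in for the imputed variables
e$y <- rnorm(5)
## ls() finds "x" and "y" in e, mget() retrieves them, data.frame() binds them
evalq(data.frame(mget(ls())), envir = e)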
The next problem you're going to find is that mice::pool() requires there to be tidy() and glance() methods to properly pool the multiple imputations. It looks like neither broom nor broom.mixed have those defined for lqmm. I threw together a very quick and dirty implementation, which you could use if you can't find anything else.
To get pool(fit2) to run you'll need to create the function tidy.lqmm() as below. Then pool() will assume the sample size is infinite and perform the calculations accordingly. You can also create the glance.lqmm() function before running pool(fit2), which will tell pool() the residual degrees of freedom. Afterwards you can use summary(pooled) to find the p-values.
tidy.lqmm <- function(x, conf.int = FALSE, conf.level = 0.95, ...) {
  broom:::as_tidy_tibble(data.frame(
    estimate = coef(x),
    std.error = sqrt(diag(
      summary(x, covariance = TRUE, R = 50)$Cov[names(coef(x)), names(coef(x))]
    ))
  ))
}
glance.lqmm <- function(x, ...) {
  broom:::as_glance_tibble(
    logLik = as.numeric(stats::logLik(x)),
    df.residual = summary(x, R = 2)$rdf,
    nobs = stats::nobs(x),
    na_types = "rii"
  )
}
Note: lqmm uses bootstrapping to estimate the standard error. By default it uses R = 50 bootstrapping replicates, which I've copied in the tidy.lqmm() function. You can change that line to increase the number of replicates if you like.
WARNING: Use these functions and the results with caution. I know just enough to be dangerous. To me it looks like these functions work to give sensible results, but there are probably intricacies that I'm not aware of. If you can find a more authoritative source for similar functions that work, or someone who is familiar with lqmm or pooling mixed models, I'd trust them more than me.

Behaviour of scatterplot() function in car package in R with regards to argument names

I'm learning how to use the car package in R and I've run into a minor formatting/syntax issue with the scatterplot() function. My personal preference when using functions in R is to name the arguments when calling the function. So in this case I would write:
scatterplot(formula = prestige ~ income, data = Prestige, id = list(n = 4))
When I do this I get the error:
> Error in scatterplot(formula = prestige ~ income, data = Prestige, id = list(n = 4)):
argument "x" is missing, with no default
But when I call the function without naming the formula argument:
scatterplot(prestige ~ income, data = Prestige, id = list(n = 4))
Everything goes through just fine.
I'm trying to understand why this is the case. I have read a previous post on the difference between using = and <- as assignment operators (What are the differences between "=" and "<-" assignment operators in R?), but I don't think that is what is happening here. Is there a way to reconcile this? I do like to keep things in line with best practices so I don't fall into lazy habits.
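Not from the original thread, but one quick way to compare which argument names the generic and its registered formula method actually expose is to inspect their formals directly; the exact output depends on your installed version of car:
library(car)
## Compare the formal arguments of the generic with those of the formula method
args(scatterplot)
args(getS3method("scatterplot", "formula"))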

Plotting logistic model with package visreg doesn't work: data not found

My second question of the day: I want to use the visreg package to plot my logistic regression models. As long as I don't use the "by" argument it works like a charm, but when I want to use it I get an error. The code I used to create my model is the following:
m3 <- glm(alive ~ seatbelt*dvcat + sex + ageOFocc + airbag, family = binomial, data = nassCDS)
summary(m3)
If I then use:
visreg(m3, "seatbelt", scale = "response")
I get the following plot, which is just fine. But if I now add the "by" argument I get an error:
visreg(m3, "seatbelt", by="dvcat", scale ="response")
I googled, and as far as I understand it, the function can't find the data needed to plot the model. But where can I supply the data? I already tried the data= argument, but it didn't work for me (or I did it wrong). There is no console output that I can provide, only the message on the graph itself. Can somebody help me? Kind regards, Jan :)
EDIT: I used the nassCDS dataset from Vincent Arel-Bundock's GitHub, which you can find here: https://vincentarelbundock.github.io/Rdatasets/datasets.html. I derived the column alive from the column dead so that I can run the logistic regression. For that I used the dplyr package with the following code:
nassCDS <- nassCDS %>%
  mutate(dead1 = as.integer(dead)) %>%
  mutate(alive = sjmisc::rec(dead1, rec = "2=0; 1=1")) %>%
  select(seatbelt, dead, alive, dvcat, sex, ageOFocc, everything()) %>%
  select(-dead1)
Furthermore, I changed the columns airbag and seatbelt to numeric, as was suggested by another Stack Overflow user.

Error in eval(parse()) - r unable to find argument input

I am very new to R, and this is my first time encountering the eval() function. I am trying to use the med and boot.med functions from the mma package to conduct mediation analysis. med and boot.med take in models such as linear models, plus data frames that specify mediators and predictors, and then estimate the mediation effect of each mediator.
The author of the package gives the flexible option of specifying one's own custom.function. From the source code of med, it can be seen that custom.function is passed to eval(). So I tried to insert the gbmt function as the custom function. However, R kept giving me the error message: Error during wrapup: Number of trees to be used in prediction must be provided. I have been searching online for days and tried many ways of specifying the number-of-trees parameter n.trees, but nothing works (I believe others have raised similar issues: post 1, post 2).
The following codes are part of the source code of the med function:
cf1 = gsub("responseY", "y[,j]", custom.function[j])
cf1 = gsub("dataset123", "x2", cf1)
cf1 = gsub("weights123", "w", cf1)
full.model[[j]] <- eval(parse(text = cf1))
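To make that substitution step concrete, here is a small sketch (not part of mma, purely illustrative) of what those gsub() calls turn a glm-style custom.function string into before it is parsed and evaluated:
cf <- 'glm(responseY ~ ., data = dataset123, weights = weights123)'
cf <- gsub("responseY", "y[,j]", cf)
cf <- gsub("dataset123", "x2", cf)
cf <- gsub("weights123", "w", cf)
cat(cf)  # glm(y[,j] ~ ., data = x2, weights = w)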
One custom function example the author gives in the package documentation is as follows:
temp1 <- med(data = data.bin, n = 2,
             custom.function = 'glm(responseY ~ ., data = dataset123, family = "quasibinomial", weights = weights123)')
Here glm is the custom function. This example code works and you can replicate it easily (if you have mma installed and loaded). However, when I try to use the gbmt function on a survival object, I get errors. Here is what my code looks like:
temp1 <- med(data = data.surv, n = 2, type = "link",
             custom.function = 'gbmt(responseY ~ .,
                                     data = dataset123,
                                     distribution = dist,
                                     train_params = start_stop,
                                     cv_folds = 10,
                                     keep_gbm_data = TRUE)')
Anyone has any idea how the argument about number of trees n.trees can be added somewhere in the above code?
Many thanks in advance!
Update: in order to replicate the example code, please install mma and try the following:
library("mma")
data("weight_behavior") ##binary x #binary y
x=weight_behavior[,c(2,4:14)]
pred=weight_behavior[,3]
y=weight_behavior[,15]
data.bin<-data.org(x,y,pred=pred,contmed=c(7:9,11:12),binmed=c(6,10), binref=c(1,1),catmed=5,catref=1,predref="M",alpha=0.4,alpha2=0.4)
temp1<-med(data=data.bin,n=2) #or use self-defined final function
temp1<-med(data=data.bin,n=2, custom.function = 'glm(responseY~.,data=dataset123,family="quasibinomial",
weights=weights123)')
When I change custom.function to gbmt and use a survival object as responseY, the error occurs. When I use the gbmt function on my data outside the med function, there is no error.

Can map() take functions with multiple inputs?

I want to loop glm/lm over multiple outcomes and predictors while stratifying by groups. The nest() and map() functions from the purrr package seem to provide an elegant solution for stratified analysis. However, when I use a customized function that takes multiple inputs, map() doesn't seem to work.
In almost all the tutorials on map() from purrr that I have seen, the regression model examples are static -- the dependent and independent variables are explicitly defined in the function. Because I want to loop over dozens of outcomes and predictors, I am trying to write a glm()/lm() wrapper that can iterate over different combinations.
library(dplyr)
library(broom)
library(tidyr)
library(purrr)
# example data set
set.seed(20)
df <- data.frame(
  out = rep(c(0, 1), 5, replace = TRUE),
  pre = sample(c(1:4), 10, replace = TRUE),
  var1 = sample(c(1:2), 10, replace = TRUE),
  var2 = sample(c(1:50), 10, replace = TRUE),
  group = sample(c(1:2), 10, replace = TRUE)
)
explicit_fun <- function(data){
  glm(out ~ pre + var1 + var2, data = data, family = binomial())
}
input_fun <- function(data, outcome, predictor, covariate){
  glm(as.formula(paste(outcome, "~", predictor, "+", paste(covariate, collapse = "+"))),
      data = data, family = binomial())
}
# nesting the data set
df_by_group <- df %>%
  group_by(group) %>%
  nest()
It works fine with the explicit function:
models <- df_by_group %>%
  mutate(mod = purrr::map(data, explicit_fun))
models <- models %>%
  mutate(
    glance_glm = purrr::map(mod, broom::glance),
    tidy_glm = purrr::map(mod, broom::tidy),
    augment_glm = purrr::map(mod, broom::augment)
  )
unnest(models, data)
unnest(models, glance_glm, .drop = TRUE) %>% View()
unnest(models, tidy_glm) %>% View()
It stops working when I use the function that takes multiple inputs:
models <- df_by_group %>%
  mutate(mod = purrr::map(data, input_fun(data = ., outcome = "out", predictor = "pre",
                                          covariate = c("var1", "var2"))))
I expected input_fun to work the same way as explicit_fun, but I received the following error message:
Error in mutate_impl(.data, dots) :
Evaluation error: Can't convert a `glm/lm` object to function
Call `rlang::last_error()` to see a backtrace.
You need to pass a function to map(). Right now, you are calling the function in the second argument rather than passing it. The quickest way to fix this is to use the formula syntax to create a function. Try
models <- df_by_group %>%
  mutate(mod = purrr::map(data, ~input_fun(data = ., outcome = "out", predictor = "pre",
                                           covariate = c("var1", "var2"))))
This delays the evaluation of input_fun until the map actually happens and properly fills in the . value.
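For the same reason, an explicit anonymous function also works; this is just a sketch of the equivalent call spelled out without the ~ shorthand:
models <- df_by_group %>%
  mutate(mod = purrr::map(
    data,
    function(d) input_fun(data = d, outcome = "out", predictor = "pre",
                          covariate = c("var1", "var2"))
  ))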
