I have a multiply imputed dataset of class mids. I use the with() function to estimate the m different datasets with the coxph() function.
However, I'm having trouble using the with() function within my own function.
The code below is a simplified example that reproduces the error: Error in Surv(enter,exit,event) object 'enter' not found
list<-"X1+X2"
var.used<-formula(paste("Surv(enter,exit,event)~",list,sep=""))
with.coxph<-function(form,dataset){
with(dataset,coxph(form))
}
with.coxph(var.used,data)
When I simply run the function on its own:
with(dataset, coxph(Surv(enter,exit,event)~X1+X2))
It works fine.
I think the problem is related to the environment where with() is called. I found different posts in here, but I can't seem to make it work. I tried assigning the dataset and the formula to the global environment:
with.coxph2<-function(form,dataset){
assign(".dataset",dataset,envir=.GlobalEnv)
assign(".form",dataset,envir=.GlobalEnv)
with(dataset,coxph(form))
remove(".dataset",dataset,envir=.GlobalEnv)
remove(".form",dataset,envir=.GlobalEnv)
}
with.coxph2(var.used,data)
but this produced the same error.
EDIT
I have attempted to fix the problem as described below. When i simply run the function with out the with() statement it works perfectly.
makeModel<-function(resp, explan, mData) {
mF <- formula(paste(resp, paste(explan, collapse = "+"), sep = "~"))
mod <- coxph(mF, data = mData)
mod$call$formula <- mF
mod$call$data <- as.symbol(deparse(substitute(mData)))
mod
}
cp <- makeModel("Surv(start, stop, event)", "X1", complete(data))
# This works fine
However, I still get the same error when I include the with() statement in the equation:
with.coxph<-function(resp, explan, mData) {
mF <- formula(paste(resp, paste(explan, collapse = "+"), sep = "~"))
mod <- with(mData,coxph(mF))
mod$call$formula <- mF
mod$call$data <- as.symbol(deparse(substitute(mData)))
mod
}
cp <- with.coxph("Surv(start, stop, event)", "X1", data)
# Error in Surv(enter,exit,event): object 'enter' not found
I had similiar issues when using the lm function and I wanted to pass the formula and/or data argument to it. Tha's what I am doing now to circumvent that:
makeModel <- function(resp, explan, mData) {
mF <- formula(paste(resp, paste(explan, collapse = "+"), sep = "~"))
mod <- coxph(mF, data = mData)
mod$call$formula <- mF
mod$call$data <- as.symbol(deparse(substitute(mData)))
mod
}
makeModelBad <- function(resp, explan, mData) {
mF <- formula(paste(resp, paste(explan, collapse = "+"), sep = "~"))
coxph(mF, data = mData)
}
library(survival)
example(coxph) # to load the data
(cp <- makeModel("Surv(start, stop, event)", "x", test2))
# Call:
# coxph(formula = Surv(start, stop, event) ~ x, data = test2)
#
#
# coef exp(coef) se(coef) z p
# x -0.0211 0.979 0.795 -0.0265 0.98
#
# Likelihood ratio test=0 on 1 df, p=0.979 n= 10, number of events= 7
cp.bad <- makeModelBad("Surv(start, stop, event)", "x", test2)
Explanation
In order to use the models created inside a function, I had to explicitely change the respective slots, because mData is not known outside the function and a call to update for example will fail:
update(cp, . ~ 1) # works
update(cp.bad, . ~ 1) # does not work
# Error in terms.formula(formula, special, data = data) :
# object 'mData' not found
The change to the formula slot, is more eye candy to show the formula in the print of the object.
Related
I have been stymied by an error that traces back to predict.lme, running inside a function, failing to interpret a formula based on a variable that has been passed from outside the function. I know the issue has to do with variable scope and different environments, but I've been unable to fully understand it or find a workaround. Your help would be much appreciated.
Here's a reproducible example:
# This will be the nested function.
train_test_perf <- function(train_data, test_data, model, termLabels) {
fixForm <- reformulate(termlabels=termLabels, response="Y")
fit <- nlme::lme(fixForm, data=train_data, random=~ 1|ID)
train_pred <- predict(fit, newdata=train_data, level=0, na.action=na.exclude)
rtrain <- cor.test(train_data$Y, train_pred)
test_pred <- predict(fit, newdata=test_data, level=0, na.action=na.exclude)
rtest <- cor.test(test_data$Y, test_pred)
tmp <- data.frame(Model=model,
R_train=rtrain$estimate,
R_test=rtest$estimate)
return(tmp)
}
# And here is the function that calls it.
myfunc <- function(df, newdf, varList) {
for (v in varList) {
perf <- train_test_perf(train_data=df, test_data=newdf, model=v, termLabels=v)
print(perf)
}
}
# The outer function call.
myfunc(df=dat, newdf=newdat, varList=list("W", "X"))
Running this gives the following error and traceback:
Error in eval(mCall$fixed) : object 'fixForm' not found
7.
eval(mCall$fixed)
6.
eval(mCall$fixed)
5.
eval(eval(mCall$fixed)[-2])
4.
predict.lme(fit, newdata = train_data, level = 0, na.action = na.exclude)
3.
predict(fit, newdata = train_data, level = 0, na.action = na.exclude)
2.
train_test_perf(train_data = df, test_data = newdf, model = v,
termLabels = v)
1.
myfunc(df = dat, newdf = newdat, varList = list("W", "X"))
It seems clear that predict.lme does not have access to the fixForm variable, but I haven't been able to work out a way to both define a formula based on a variable and have the value accessible to predict.lme. I'm not sure whether the nested function structure is part of the problem here--if it is, I would prefer to find a workaround that would maintain this structure, as my real-life code includes some other things inside myfunc that occur before and after the call to train_test_perf.
Thanks,
Jeff Phillips
Using a variable as formula doesn't stores the variable not the formula which might be the issue. We can use a do.call.
train_test_perf <- function(train_data, test_data, model, termLabels) {
fixForm <- reformulate(termlabels=termLabels, response="Y")
fit <- do.call(nlme::lme, list(fixForm, data=quote(train_data), random=~ 1|ID))
train_pred <- predict(fit, newdata=train_data, level=0, na.action=na.exclude)
rtrain <- cor.test(train_data$Y, train_pred)
test_pred <- predict(fit, newdata=test_data, level=0, na.action=na.exclude)
rtest <- cor.test(test_data$Y, test_pred)
tmp <- data.frame(Model=model, R_train=rtrain$estimate,
R_test=rtest$estimate)
return(tmp)
}
Finally put it in an sapply to avoid tedious for loops.
t(sapply(c("W", "X"), \(x) train_test_perf(train_data=dat, test_data=newdat, model=x, termLabels=x)))
# Model R_train R_test
# [1,] "W" 0.1686495 -0.001738604
# [2,] "X" 0.4138526 0.2992374
I want to run logistic regression for multiple parameters and store the different metrics i.e AUC.
I wrote the function below but I get an error when I call it: Error in eval(predvars, data, env) : object 'X0' not found even if the variable exists in both my training and testing dataset. Any idea?
new.function <- function(a) {
model = glm(extry~a,family=binomial("logit"),data = train_df)
pred.prob <- predict(model,test_df, type='response')
predictFull <- prediction(pred.prob, test_df$extry)
auc_ROCR <- performance(predictFull, measure = "auc")
my_list <- list("AUC" = auc_ROCR)
return(my_list)
}
# Call the function new.function supplying 6 as an argument.
les <- new.function(X0)
The main reason why your function didn't work is that you are trying to call an object into a formula. You can fix it with paste formula function, but that is ultimately quite limiting.
I suggest instead that you consider using update. This allow you more flexibility to change with multiple variable combination, or change a training dataset, without breaking the function.
model = glm(extry~a,family=binomial("logit"),data = train_df)
new.model = update(model, .~X0)
new.function <- function(model){
pred.prob <- predict(model, test_df, type='response')
predictFull <- prediction(pred.prob, test_df$extry)
auc_ROCR <- performance(predictFull, measure = "auc")
my_list <- list("AUC" = auc_ROCR)
return(my_list)
}
les <- new.function(new.model)
The function can be further improved by calling the test_df as a separate argument, so that you can fit it with an alternative testing data.
To run the function in the way you intended, you would need to use non-standard evaluation to capture the symbol and insert it in a formula. This can be done using match.call and as.formula. Here's a fully reproducible example using dummy data:
new.function <- function(a) {
# Convert symbol to character
a <- as.character(match.call()$a)
# Build formula from character strings
form <- as.formula(paste("extry", a, sep = "~"))
model <- glm(form, family = binomial("logit"), data = train_df)
pred.prob <- predict(model, test_df, type = 'response')
predictFull <- ROCR::prediction(pred.prob, test_df$extry)
auc_ROCR <- ROCR::performance(predictFull, "auc")
list("AUC" = auc_ROCR)
}
Now we can call the function in the way you intended:
new.function(X0)
#> $AUC
#> A performance instance
#> 'Area under the ROC curve'
new.function(X1)
#> $AUC
#> A performance instance
#> 'Area under the ROC curve'
If you want to see the actual area under the curve you would need to do:
new.function(X0)$AUC#y.values[[1]]
#> [1] 0.6599759
So you may wish to modify your function so that the list contains auc_ROCR#y.values[[1]] rather than auc_ROCR
Data used
set.seed(1)
train_df <- data.frame(X0 = sample(100), X1 = sample(100))
train_df$extry <- rbinom(100, 1, (train_df$X0 + train_df$X1)/200)
test_df <- data.frame(X0 = sample(100), X1 = sample(100))
test_df$extry <- rbinom(100, 1, (test_df$X0 + test_df$X1)/200)
Created on 2022-06-29 by the reprex package (v2.0.1)
I am trying to write my own modeling function in R, one which takes a formula, some data, and maybe some extra context, like weights; after calling model.frame to extract the necessary numeric data, it will perform a fit. My first pass looked like:
my_modfunc <- function(formula,data,weights=NULL) {
mf <- model.frame(formula,data=data,weights=weights)
wt <- model.weights(mf)
# do some fitting here...
}
# make fake data to test it
set.seed(1234)
data <- data.frame(x1=rnorm(50),x2=rnorm(50),y=rnorm(50),w=runif(50))
# call it:
my_modfunc(y ~ x1 + x2,data=data,weights=w)
This fails, I get the error:
Error in model.frame.default(formula, data = data, weights = weights) :
invalid type (closure) for variable '(weights)'
Similarly, if I call
my_modfunc(y ~ x1 + x2,data=data,weights='w')
I get the same error. I suspect there is some problem with environment, quoting, and so on.
Cutting and pasting the source for lm, I could rewrite my function as
# based on lm
weird_modfunc <- function(formula,data,weights=NULL ) {
cl <- match.call() # what?
mf <- match.call(expand.dots = FALSE) # what??
m <- match(c("formula", "data", "weights"), names(mf), 0L)
mf <- mf[c(1L, m)] # ??
mf$drop.unused.levels <- TRUE # ??
mf[[1L]] <- quote(stats::model.frame) ## ???
mf <- eval(mf, parent.frame())
wt <- as.vector(model.weights(mf))
# do some fitting here...
}
# this runs without error:
weird_modfunc(y ~ x1 + x2,data=data,weights=w)
# this fails with the same error as above about variable lengths.
weird_modfunc(y ~ x1 + x2,data=data,weights='w')
The problem is that this contains multiple somewhat mystical incantations that I do not know how to interpret, modify or maintain.
What is the right way to call model.frame? Bonus points for making my function accept both weights=w and weights='w'
Welcome to the joys of non-standard evaluation. I suggest you base your function on the lm approach. It constructs a call to model.frame and evaluates it. That's necessary, because model.frame does non-standard evaluation, i.e., it accepts/expects a symbol for the weights parameter. Furthermore, it also ensures correct scoping regarding the formula's environment.
weird_modfunc <- function(formula,data,weights=NULL ) {
#cl not needed, lm only adds this call to the return object
mf <- match.call(expand.dots = FALSE)
message("Call with ellipses not expanded: ")
#note that there are no ellipses in the function arguments for now,
#but you might want to change that later
print(mf)
#turn weights into symbol if character is passed
if (is.character(mf$weights)) mf$weights <- as.symbol(mf$weights)
m <- match(c("formula", "data", "weights"), names(mf), 0L)
message("Position of formula, data and weights in the call:")
print(m)
mf <- mf[c(1L, m)]
message("New call that only contains what is needed:")
print(mf)
mf$drop.unused.levels <- TRUE
message("Call with argument added:")
print(mf)
mf[[1L]] <- quote(stats::model.frame)
message("Change call to a call to model.frame:")
print(mf)
mf <- eval(mf, parent.frame()) #evaluate call
wt <- as.vector(model.weights(mf))
# do some fitting here...
message("Return value:")
wt
}
# this runs without error:
weird_modfunc(y ~ x1 + x2,data=data,weights=w)
#Call with ellipses not expanded:
#weird_modfunc(formula = y ~ x1 + x2, data = data, weights = w)
#Position of formula, data and weights in the call
#[1] 2 3 4
#New call that only contains what is needed:
#weird_modfunc(formula = y ~ x1 + x2, data = data, weights = w)
#Call with argument added:
#weird_modfunc(formula = y ~ x1 + x2, data = data, weights = w,
# drop.unused.levels = TRUE)
#Change call to a call to model.frame:
#stats::model.frame(formula = y ~ x1 + x2, data = data, weights = w,
# drop.unused.levels = TRUE)
#Return value:
# [1] 0.35299850 0.98095832 0.53888276 0.44403386 0.94936678 0.45248337 0.19062580 0.99160915 0.54845545 0.76881577 0.91342167 0.68211200 0.40725142
#[14] 0.40759230 0.14608279 0.19666771 0.19220934 0.40841440 0.34822131 0.83454285 0.19840001 0.86180531 0.39718531 0.15325377 0.33928338 0.36718044
#[27] 0.42737908 0.18633690 0.65801660 0.92041138 0.73389406 0.88231927 0.95334653 0.19490154 0.47261674 0.38605066 0.37416586 0.02785566 0.92935521
#[40] 0.41052928 0.95584022 0.27215284 0.51724649 0.97830984 0.36969649 0.31043044 0.03420963 0.66756585 0.92091638 0.04498960
#this runs without error too:
weird_modfunc(y ~ x1 + x2,data=data,weights='w')
Here is a simpler version but there might be problems (well, more than usual with non-standard evaluation):
my_modfunc <- function(formula,data,weights=NULL) {
weights <- substitute(weights)
if (!is.symbol(weights)) weights <- as.symbol(weights)
#substitute the symbol into the call:
mf <- eval(substitute(model.frame(formula,data=data,weights=weights)))
wt <- model.weights(mf)
# do some fitting here...
wt
}
my_modfunc(y ~ x1 + x2,data=data,weights=w)
#works
my_modfunc(y ~ x1 + x2,data=data,weights="w")
#works
I built a model using ucm function. However when i try to forecast for future, it's not letting me pass the independent variables.
library("rucm")
library("lubridate")
#Create Simulated Data
Date<- as.Date(seq(from=as.Date('2012-01-01'),to=as.Date('2014-03-31'),by=1),"%Y-%m-%d")
Actual <- sample(27:65,length(Date),replace = TRUE)
DOW <- wday(Date)
Month <- month(Date)
DOM <- mday(Date)
Week <- week(Date)
Ya <- year(Date)
Ya <- ifelse(Ya=="2014",2,1)
a <- data.frame(Date,Actual,DOW,Month,DOM,Week,Ya)
a$Date<-as.Date(a$Date,"%Y-%m-%d")
abc <- cbind(Weekday=model.matrix(~as.factor(a$DOW)),
Mont=model.matrix(~as.factor(a$Month)),
Day=model.matrix(~as.factor(a$DOM)),a[,7,drop=FALSE],
Weekofyear=model.matrix(~as.factor(a$Week)))
abc<-data.frame(abc)
abc<-data.frame(abc[,c(-1,-8,-20,-52)])
abc2 <- subset(abc,abc$Ya==1)
abc2 <- abc2[,-48]
abc3 <- subset(abc,abc$Ya==2)
abc3 <- abc3[,-48]
#train and insample MAPE
a1<-subset(a,a$Ya==1)
a2<-subset(a,a$Ya==2)
#build model
dat <- as.data.frame(cbind(a1[,2,drop=FALSE], abc2))
fo <- as.formula(paste("Actual ~ ", paste(names(dat)[2:42], collapse= "+")))
fit_train_ucm <- ucm(fo, data = dat, cycle = TRUE, cycle.period = 365)
#predict for future
predict(fit_train_ucm,n.ahead = 90,newdata = abc3)
i am getting this error
Error in is.SSModel(newdata, na.check = TRUE, return.logical = FALSE) :
Object is not of class 'SSModel'
Update:
Based on suggestion by package author from GitHub, i used below code(sorry to say this is not very self explanatory, i did however try to edit code)
SSModel(rep(NA,nrow(abc3)) ~ x + SSMtrend(2, Q = list(fit_train_ucm$est.var.level, fit_train_ucm$est.var.slope)) + SSMseasonal(12, Q = fit_train_ucm$est.var.season), H = fit_train_ucm$irr.var, data=abc3)
Now the error message is
Error in eval(expr, envir, enclos) : object 'x' not found
Any help on this?
This bug of prediction with newdata in ucm is yet to corrected.
This is how you can get predictions for an out-of-sample period.
indep <- paste(names(dat)[2:42], collapse= "+")
newdata1 <- SSModel(as.formula(paste0("rep(NA,nrow(abc3)) ~ ", indep, "+ SSMtrend(1, Q = list(fit_train_ucm$est.var.level))",
"+ SSMcycle(365, Q = fit_train_ucm$est.var.cycle)")), H = fit_train_ucm$irr.var, data=abc3)
pred<-predict(fit_train_ucm$model, newdata=newdata1)
So basically here I am using the predict function from KFAS package. To use the predict function, I have to define the data to be an object of class SSModel.
You will take all the parameters that you used in the model as independent variables and run a State space model with NA's as independent variable.
P.S.: I will keep the comments in mind to write a better answer next time.
Just so someone lands at this page again in future - there has been some update on the git repo maintained by the author to approach this problem in simplistic way. Please check this commit
Alternatively, pasting the function which one may use from that commit:
predict.ucm <- function(object, n.ahead, newdata,...){
#### Predict in sample ####
if (missing(newdata)) {
return(predict(object = object$model, n.ahead = n.ahead))
}
#### Predict out of sample ####
# Regression variables.
model_variables <- paste0(names(object$est), collapse = " + ")
# Trend
# Case 1 no trend
if (is.null(object$est.var.level) & is.null(object$est.var.slope)) {
model_trend <- ""
}
# Case 2 level and trend
if (!is.null(object$est.var.level) & !is.null(object$est.var.slope)) {
model_trend <- "+ SSMtrend(degree = 2, Q = list(object$est.var.level, object$est.var.slope))"
}
# Case 3 level only / trend only is not allowed in R
if (!is.null(object$est.var.level) & is.null(object$est.var.slope)) {
model_trend <- "+ SSMtrend(degree = 1, Q = list(object$est.var.level))"
}
# Seasonality
if (!is.null(object$est.var.season)) {
model_season <- sprintf("+ SSMseasonal(period = %s, Q = object$est.var.season)",
object$call['season.length'] %>% as.character())
} else {
model_season <- ""
}
# Cycle
if (!is.null(object$est.var.cycle)) {
model_cycle <- sprintf("+ SSMcycle(period = %s, Q = object$est.var.cycle)",
object$call['cycle.period'] %>% as.character())
} else {
model_cycle <- ""
}
# Combine all components into a formula
model_formula <- as.formula(sprintf("rep(NA,nrow(newdata)) ~ %s %s %s %s",
model_variables,
model_trend,
model_season,
model_cycle
))
# Build a SSM object for the prediction
oos_data <- KFAS::SSModel(formula = model_formula, H = object$irr.var, data = newdata)
# Return the predictions
predict(object$model, newdata = oos_data)
}
EDIT
In case you are interested in predicting on a new data which is considered as for the same time period as the original model was built using, then following edit can be made in the above function. That new data should ideally include the same dependent variable values which was used to build the first model.
# Using same dependent which was used initially to fit the estimates, instead of NA
model_formula <- as.formula(paste0(
object$model$terms[[2]],
sprintf(
" ~ %s %s %s %s",
model_variables,
model_trend,
model_season,
model_cycle
)
))
# Build a SSM object for the prediction
oos_data <- KFAS::SSModel(formula = model_formula, H = object$irr.var, data = newdata)
# Return the predictions for in sample only - hence removing newdata arg
# predict(object$model, newdata = oos_data)
predict(oos_data)
I want use survfit() and basehaz() inside a function, but they do not work. Could you take a look at this problem. Thanks for your help. The following code leads to the error:
library(survival)
n <- 50 # total sample size
nclust <- 5 # number of clusters
clusters <- rep(1:nclust,each=n/nclust)
beta0 <- c(1,2)
set.seed(13)
#generate phmm data set
Z <- cbind(Z1=sample(0:1,n,replace=TRUE),
Z2=sample(0:1,n,replace=TRUE),
Z3=sample(0:1,n,replace=TRUE))
b <- cbind(rep(rnorm(nclust),each=n/nclust),rep(rnorm(nclust),each=n/nclust))
Wb <- matrix(0,n,2)
for( j in 1:2) Wb[,j] <- Z[,j]*b[,j]
Wb <- apply(Wb,1,sum)
T <- -log(runif(n,0,1))*exp(-Z[,c('Z1','Z2')]%*%beta0-Wb)
C <- runif(n,0,1)
time <- ifelse(T<C,T,C)
event <- ifelse(T<=C,1,0)
mean(event)
phmmd <- data.frame(Z)
phmmd$cluster <- clusters
phmmd$time <- time
phmmd$event <- event
fmla <- as.formula("Surv(time, event) ~ Z1 + Z2")
BaseFun <- function(x){
start.coxph <- coxph(x, phmmd)
print(start.coxph)
betahat <- start.coxph$coefficient
print(betahat)
print(333)
print(survfit(start.coxph))
m <- basehaz(start.coxph)
print(m)
}
BaseFun(fmla)
Error in formula.default(object, env = baseenv()) : invalid formula
But the following function works:
fit <- coxph(fmla, phmmd)
basehaz(fit)
It is a problem of scoping.
Notice that the environment of basehaz is:
environment(basehaz)
<environment: namespace:survival>
meanwhile:
environment(BaseFun)
<environment: R_GlobalEnv>
Therefore that is why the function basehaz cannot find the local variable inside the function.
A possible solution is to send x to the top using assign:
BaseFun <- function(x){
assign('x',x,pos=.GlobalEnv)
start.coxph <- coxph(x, phmmd)
print(start.coxph)
betahat <- start.coxph$coefficient
print(betahat)
print(333)
print(survfit(start.coxph))
m <- basehaz(start.coxph)
print(m)
rm(x)
}
BaseFun(fmla)
Other solutions may involved dealing with the environments more directly.
I'm following up on #moli's comment to #aatrujillob's answer. They were helpful so I thought I would explain how it solved things for me and a similar problem with the rpart and partykit packages.
Some toy data:
N <- 200
data <- data.frame(X = rnorm(N),W = rbinom(N,1,0.5))
data <- within( data, expr = {
trtprob <- 0.4 + 0.08*X + 0.2*W -0.05*X*W
Trt <- rbinom(N, 1, trtprob)
outprob <- 0.55 + 0.03*X -0.1*W - 0.3*Trt
Outcome <- rbinom(N,1,outprob)
rm(outprob, trtprob)
})
I want to split the data to training (train_data) and testing sets, and train the classification tree on train_data.
Here's the formula I want to use, and the issue with the following example. When I define this formula, the train_data object does not yet exist.
my_formula <- Trt~W+X
exists("train_data")
# [1] FALSE
exists("train_data", envir = environment(my_formula))
# [1] FALSE
Here's my function, which is similar to the original function. Again,
badFunc <- function(data, my_formula){
train_data <- data[1:100,]
ct_train <- rpart::rpart(
data= train_data,
formula = my_formula,
method = "class")
ct_party <- partykit::as.party(ct_train)
}
Trying to run this function throws an error similar to OP's.
library(rpart)
library(partykit)
bad_out <- badFunc(data=data, my_formula = my_formula)
# Error in is.data.frame(data) : object 'train_data' not found
# 10. is.data.frame(data)
# 9. model.frame.default(formula = Trt ~ W + X, data = train_data,
# na.action = function (x) {Terms <- attr(x, "terms") ...
# 8. stats::model.frame(formula = Trt ~ W + X, data = train_data,
# na.action = function (x) {Terms <- attr(x, "terms") ...
# 7. eval(expr, envir, enclos)
# 6. eval(mf, env)
# 5. model.frame.rpart(obj)
# 4. model.frame(obj)
# 3. as.party.rpart(ct_train)
# 2. partykit::as.party(ct_train)
# 1. badFunc(data = data, my_formula = my_formula)
print(bad_out)
# Error in print(bad_out) : object 'bad_out' not found
Luckily, rpart() is like coxph() in that you can specify the argument model=TRUE to solve these issues. Here it is again, with that extra argument.
goodFunc <- function(data, my_formula){
train_data <- data[1:100,]
ct_train <- rpart::rpart(
data= train_data,
## This solved it for me
model=TRUE,
##
formula = my_formula,
method = "class")
ct_party <- partykit::as.party(ct_train)
}
good_out <- goodFunc(data=data, my_formula = my_formula)
print(good_out)
# Model formula:
# Trt ~ W + X
#
# Fitted party:
# [1] root
# | [2] X >= 1.59791: 0.143 (n = 7, err = 0.9)
##### etc
documentation for model argument in rpart():
model:
if logical: keep a copy of the model frame in the result? If
the input value for model is a model frame (likely from an earlier
call to the rpart function), then this frame is used rather than
constructing new data.
Formulas can be tricky as they use lexical scoping and environments in a way that is not always natural (to me). Thank goodness Terry Therneau has made our lives easier with model=TRUE in these two packages!