How can I account for "failed" learners in benchmark() - r

I have a large list of task/learner/resampling combinations. I execute the resampling via
design = data.table(
task = list_of_tasks,
learner = list_of_learners,
resampling = list_of_resamplings
)
bmr = benchmark(design)
tab = bmr$aggregate(c(msr("classif.acc")))
The last command fails and I get the following error message:
Error in assert_classif(truth, response = response) : Assertion on 'response' failed: Contains missing values (element 1).
How can I check what went wrong? The learners have worked for slightly different tasks, and they are all combinations of "standard" learners (svm, naive Bayes) with a preceding po("scale"). There are no missing data in the predictors or the targets of the tasks.

At least one learner predicted NAs. Search for NAs in the predictions to identify the failing learner.
library(mlr3)
library(mlr3misc)
# experiment
learner_rpart = lrn("classif.rpart")
learner_debug = lrn("classif.debug", predict_missing = 0.5)
task = tsk("pima")
resampling = rsmp("cv", folds = 3)
design = benchmark_grid(task, list(learner_rpart, learner_debug), resampling)
bmr = benchmark(design)
# search for predictions with NAs
tab = as.data.table(bmr)
tab[map_lgl(tab$prediction, function(pred) any(is.na(pred$response)))]
You should post a new question with a reprex including the failing learner, task and resampling.

Related

extract_inner_fselect_results is NULL with mlr3 Nested Resampling

This question is an extension of the following question: No Model Stored with Mlr3.
I have been performing nested resampling to get an unbiased metric of model performance. If I don't specify store_models=TRUE then I get Error: No model stored at the end of the run. However, if I specify store_models=TRUE in both the at and resample calls then RStudio crashes due to RAM consumption.
I have now tried the following code in which I specified store_models=TRUE for just the at call:
MSvCon<-read.csv("MS v Control Proteomics Final.csv", row.names=1)
MSvCon$Status<-as.factor(MSvCon$Status)
MSvCon[,2:4399]<-scale(MSvCon[,2:4399], center=TRUE, scale=TRUE)
set.seed(123, "L'Ecuyer")
task = as_task_classif(MSvCon, target = "Status")
learner = lrn("classif.ranger", importance = "impurity", num.trees=10000)
set_threads(learner, n = 8)
measure = msr("classif.fbeta", beta=1, average="micro")
terminator = trm("none")
resampling_inner = rsmp("repeated_cv", folds = 10, repeats = 10)
at = AutoFSelector$new(
learner = learner,
resampling = resampling_inner,
measure = measure,
terminator = terminator,
fselect = fs("rfe", n_features = 1, feature_fraction = 0.5, recursive = FALSE),
store_models=TRUE)
resampling_outer = rsmp("repeated_cv", folds = 10, repeats = 10)
rr = resample(task, at, resampling_outer)
After finishing, I am able to extract performance measures successfully. However, I tried to use extract_inner_fselect_results and extract_inner_fselect_archives to check what features were selected and importance measures but received a NULL result.
Do you have any suggestions on what I would need to adjust in my code to see this information? I anticipate that adding store_models=TRUE to the resample call would work, but the RAM consumption issue (even with 128GB on RStudio Workbench) prevents that. Is there a way around this?
The archives of the inner resampling are stored in the model slot of the AutoFSelector, i.e. without store_models = TRUE in resample() you cannot access the inner results and archives. I will write a workaround for you and answer it in the other question.
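For illustration, a minimal sketch on a toy task (classif.rpart on pima with random-search feature selection, mirroring the constructor call from the question) of where the inner results come from once store_models = TRUE is also passed to resample():
library(mlr3)
library(mlr3fselect)
task = tsk("pima")
at = AutoFSelector$new(
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 10),
  fselect = fs("random_search", batch_size = 5),
  store_models = TRUE)
# the second store_models = TRUE keeps the fitted AutoFSelector objects,
# whose model slots hold the inner archives
rr = resample(task, at, rsmp("cv", folds = 3), store_models = TRUE)
extract_inner_fselect_results(rr)   # inner feature selection results per outer fold
extract_inner_fselect_archives(rr)  # full inner archives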

How can I use custom resampling respecting temporal order for non identical tasks with different sizes?

I have tasks where the rows have temporal order (e.g. monthly data).
I want to perform a "loo" type resampling, but the training data must always be earlier than the test data. So what I do is to generate a custom resampling in the following manner:
# Instantiate Resampling
resampling_backtest = rsmp("custom")
train_sets = list(1:30) # n.b. we deliberately call the lists of splits "train_sets" and "test_sets";
test_sets = list(31) # in the instantiated resampling they will automatically be named "train_set" and "test_set" and be lists
for (testmonth in (32:task$nrow)) {
train_sets <- append(train_sets, list(c(1:(testmonth-1))))
test_sets <- append(test_sets, list(c(testmonth)))
}
resampling_backtest$instantiate(task, train_sets, test_sets)
My tasks are different subsets of a large sample that has one "Date" column. All of the subsamples are "ordered", as I first use task_n <- TaskClassif$new(...) and then task_n$set_col_roles("Date", roles = "order") for each of my n tasks.
Now, I have 2 problems:
I have defined the resampling scheme, but a row id value of e.g. "2" will refer to different months in different tasks. This may not be a real problem, if it were not for the point below.
When I make a list of the n tasks (list_of_tasks = list(task_1, ..., task_n)) and define a benchmark design as below, I get an error message:
design = benchmark_grid(
tasks = list_of_tasks,
learners = list_of_learners,
resamplings = resampling_backtest
)
The error message is Error: All tasks must be uninstantiated, or must have the same number of rows.
So, what can I do here? Is there a way to hand over the resampling "uninstantiated"? Or do I need to manually define a resampling scheme for each of the n tasks separately? If yes, how can I hand that over to benchmark_grid()?
Or do I need to manually define a resampling scheme for each of the n tasks separately?
Yes. Just create the benchmark design manually with data.table(). An example with instantiated resamplings:
library(mlr3)
library(data.table)
task_pima = tsk("pima")
task_spam = tsk("spam")
resampling_pima = rsmp("cv", folds = 3)
resampling_pima$instantiate(task_pima)
resampling_spam = rsmp("cv", folds = 3)
resampling_spam$instantiate(task_spam)
design = data.table(
task = list(task_pima, task_spam),
learner = list(lrn("classif.rpart"), lrn("classif.rpart")),
resampling = list(resampling_pima, resampling_spam)
)
bmr = benchmark(design)
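The same pattern carries over to the temporal backtest from the question; a hedged sketch (toy tasks and a small helper, not from the original answer) that instantiates one custom resampling per task so the row ids always refer to that task's rows:
library(mlr3)
library(data.table)
# helper: expanding-window backtest, training on all rows before each test row
make_backtest = function(task, first_test_row) {
  train_sets = lapply(first_test_row:task$nrow, function(i) 1:(i - 1))
  test_sets = lapply(first_test_row:task$nrow, function(i) i)
  resampling = rsmp("custom")
  resampling$instantiate(task, train_sets, test_sets)
  resampling
}
list_of_tasks = list(tsk("pima"), tsk("sonar"))  # stand-ins for the ordered tasks
design = data.table(
  task = list_of_tasks,
  learner = list(lrn("classif.rpart"), lrn("classif.rpart")),
  resampling = lapply(list_of_tasks, function(t) make_backtest(t, first_test_row = t$nrow - 5))
)
bmr = benchmark(design)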

MLR3 using data transforms in bootstrapping hit an error

I'm trying to use bootstrap resampling as my cross-validation in mlr3, and have been tracking down the cause of an error:
Error in as_data_backend.data.frame(backend, primary_key = row_ids) :
Assertion on 'primary_key' failed: Contains duplicated values, position 2.
The position changes (likely the first repeated row). Based on the error message, I first thought it was an issue with rownames being included, so I set those as the col_type$name, and I also tried removing rownames from the data before creating the task (no luck!).
While trying to create a reprex, I narrowed it down to transformation pipe operators like 'scale' and 'pca' as the cause:
library("mlr3verse")
task <- tsk('sonar')
pipe = po('scale') %>>%
po(lrn('classif.rpart'))
ps <- ParamSet$new(list(
ParamDbl$new("classif.rpart.cp", lower = 0, upper = 0.05)
))
glrn <- GraphLearner$new(pipe)
glrn$predict_type <- "prob"
bootstrap <- rsmp("bootstrap", ratio = 1, repeats = 5)
instance <- TuningInstanceSingleCrit$new(
task = task,
learner = glrn,
resampling = bootstrap,
measure = msr("classif.auc"),
search_space = ps,
terminator = trm("evals", n_evals = 100)
)
tuner <- tnr("random_search")
tuner$optimize(instance)
I've also tried grid search instead of random search, different learners, and setting the flag "duplicated_ids = TRUE" in rsmp(), with no luck. Switching to CV resampling, however, does fix the problem.
For reference, in the full pipe/graph I am trying different feature filters and learners to identify candidate pipelines.

mlr: creating plotBMRBoxplots for only one of the learner

Does anyone know whether it is possible to create the plots integrated in the mlr package for only one of the learners?
For example:
BMR_Boxplot <- plotBMRBoxplots(bmr, measure = mse)
BMR_Boxplot
Looking at the arguments, I don't see the possibility to choose one specific learner - is there any known workaround?
Many thanks!
If you subset your bmr object to the results of only one learner, it is easily possible.
Maybe it would be nice to have this as a feature.
Example code for subsetting to the first learner:
library(mlr)
lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))
tasks = list(iris.task, sonar.task)
rdesc = makeResampleDesc("CV", iters = 5L)
meas = list(acc, ber)
bmr = benchmark(lrns, tasks, rdesc, measures = meas)
bmr$results[[2]] = NULL
bmr$learners[[2]] = NULL
plotBMRBoxplots(bmr, ber, style = "violin")

mlr: Tune model parameters with validation set

I just switched to mlr for my machine learning workflow. I am wondering if it is possible to tune hyperparameters using a separate validation set. From my limited understanding, makeResampleDesc and makeResampleInstance accept only resampling from the training data.
My goal is to tune parameters with a validation set and test the final model with the test set. This is to prevent overfitting and knowledge leakage.
Here is what I did code-wise:
## Create training, validation and test tasks
train_task <- makeClassifTask(data = train_data, target = "y", positive = 1)
validation_task <- makeClassifTask(data = validation_data, target = "y")
test_task <- makeClassifTask(data = test_data, target = "y")
## Attempt to tune parameters with separate validation data
tuned_params <- tuneParams(
task = train_task,
resampling = makeResampleInstance("Holdout", task = validation_task),
...
)
From the error message, it looks like evaluation is still trying to resample from the training set:
00001: Error in resample.fun(learner2, task, resampling, measures = measures, :
  Size of data set: 19454 and resampling instance: 1666333 differ!
Does anyone know what I should do? Am I setting up everything the right way?
[Update as of 2019/03/27]
Following #jakob-r's comment, and finally understanding #LarsKotthoff's suggestion, here is what I did:
## Create combined training data
train_task_data <- rbind(train_data, validation_data)
## Create learner, training task, etc.
xgb_learner <- makeLearner("classif.xgboost", predict.type = "prob")
train_task <- makeClassifTask(data = train_task_data, target = "y", positive = 1)
## Tune hyperparameters
tune_wrapper <- makeTuneWrapper(
learner = xgb_learner,
resampling = makeResampleDesc("Holdout"),
measures = ...,
par.set = ...,
control = ...
)
model_xgb <- train(tune_wrapper, train_task)
Here is what I did following #LarsKotthoff's comment. Assume you have two separate datasets for training (train_data) and validation (validation_data):
## Create combined training data
train_task_data <- rbind(train_data, validation_data)
size <- nrow(train_task_data)
train_ind <- seq_len(nrow(train_data))
validation_ind <- seq.int(max(train_ind) + 1, size)
## Create training task
train_task <- makeClassifTask(data = train_task_data, target = "y", positive = 1)
## Tune hyperparameters
tuned_params <- tuneParams(
task = train_task,
resampling = makeFixedHoldoutInstance(train_ind, validation_ind, size),
...
)
After optimizing the hyperparameter set, you can build a final model and test against your test dataset.
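For that last step, a hedged sketch (it assumes the tuned_params result from above, a held-out test_data frame, and the xgboost learner with acc/auc as placeholder measures):
library(mlr)
# fix the tuned hyperparameters on the learner
tuned_learner <- setHyperPars(
  makeLearner("classif.xgboost", predict.type = "prob"),
  par.vals = tuned_params$x
)
# refit on the combined train + validation task, then evaluate once on the test set
final_model <- train(tuned_learner, train_task)
test_task <- makeClassifTask(data = test_data, target = "y", positive = 1)
pred <- predict(final_model, task = test_task)
performance(pred, measures = list(acc, auc))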
Note: I had to install the latest development version (as of 2018/08/06) from GitHub. The current CRAN version (2.12.1) throws an error when I call makeFixedHoldoutInstance(), i.e.,
Assertion on 'discrete.names' failed: Must be of type 'logical flag',
not 'NULL'.

Resources