mlr: creating plotBMRBoxplots for only one of the learners

Does anyone know whether it is possible to create the plots integrated in the mlr package for only one of the learners?
For example:
BMR_Boxplot <- plotBMRBoxplots(bmr, measure = mse)
BMR_Boxplot
Looking at the arguments, I don't see a way to choose one specific learner. Is there any known workaround?
Many thanks!

If you subset your bmr object to the results of only one learner, it is easily possible.
Maybe it would be nice to have this as a feature.
Example code for subsetting to the first learner:
library(mlr)

lrns = list(makeLearner("classif.lda"), makeLearner("classif.rpart"))
tasks = list(iris.task, sonar.task)
rdesc = makeResampleDesc("CV", iters = 5L)
meas = list(acc, ber)
bmr = benchmark(lrns, tasks, rdesc, measures = meas)
# keep only the first learner: drop the second learner's entry from the
# per-task results and from the learner list
bmr$results = lapply(bmr$results, function(x) x[1])
bmr$learners[[2]] = NULL
plotBMRBoxplots(bmr, ber, style = "violin")

Related

extract_inner_fselect_results is NULL with mlr3 Nested Resampling

This question is an extension of the following question: No Model Stored with Mlr3.
I have been performing nested resampling to get an unbiased metric of model performance. If I don't specify store_models=TRUE then I get Error: No model stored at the end of the run. However, if I specify store_models=TRUE in both the at and resample calls then RStudio crashes due to RAM consumption.
I have now tried the following code in which I specified store_models=TRUE for just the at call:
library(mlr3verse)  # attaches mlr3, mlr3learners, mlr3fselect and friends

MSvCon<-read.csv("MS v Control Proteomics Final.csv", row.names=1)
MSvCon$Status<-as.factor(MSvCon$Status)
MSvCon[,2:4399]<-scale(MSvCon[,2:4399], center=TRUE, scale=TRUE)
set.seed(123, "L'Ecuyer")
task = as_task_classif(MSvCon, target = "Status")
learner = lrn("classif.ranger", importance = "impurity", num.trees=10000)
set_threads(learner, n = 8)
measure = msr("classif.fbeta", beta=1, average="micro")
terminator = trm("none")
resampling_inner = rsmp("repeated_cv", folds = 10, repeats = 10)
at = AutoFSelector$new(
  learner = learner,
  resampling = resampling_inner,
  measure = measure,
  terminator = terminator,
  fselect = fs("rfe", n_features = 1, feature_fraction = 0.5, recursive = FALSE),
  store_models = TRUE)
resampling_outer = rsmp("repeated_cv", folds = 10, repeats = 10)
rr = resample(task, at, resampling_outer)
After finishing, I am able to extract the performance measures successfully. However, when I tried to use extract_inner_fselect_results and extract_inner_fselect_archives to check which features were selected and their importance measures, I received a NULL result.
Do you have any suggestions on what I would need to adjust in my code to see this information? I anticipate that adding store_models=TRUE to the resample call would work, but the RAM consumption issue (even with 128GB on RStudio Workbench) prevents that. Is there a way around this?
The archives of the inner resampling are stored in the model slot of the AutoFSelector objects, i.e. without store_models = TRUE in resample() you cannot access the inner results and archives. I will write a workaround for you and answer it in the other question.
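For reference, a minimal sketch of the pattern described above, reusing the task, at and resampling_outer objects from the question and assuming enough memory is available to keep the fitted models:
# store_models = TRUE keeps the fitted AutoFSelector of every outer fold;
# its $model slot carries the inner results and archives
rr = resample(task, at, resampling_outer, store_models = TRUE)
# inner feature selection results and full archives, one entry per outer fold
extract_inner_fselect_results(rr)
extract_inner_fselect_archives(rr)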

How can I account for "failed" learners in benchmark()

I have a large list of task/learner/resampling combinations. I execute the resampling via
design = data.table(
  task = list_of_tasks,
  learner = list_of_learners,
  resampling = list_of_resamplings
)
bmr = benchmark(design)
tab = bmr$aggregate(c(msr("classif.acc")))
The last command fails and I get the following error message:
Error in assert_classif(truth, response = response) : Assertion on 'response' failed: Contains missing values (element 1).
How can I check what went wrong? The learners have worked for slightly different tasks, and they are all combinations of "standard" learners (SVM, naive Bayes) with a preceding po("scale"). There are no missing data in the predictors or the targets of the tasks.
At least one learner predicted NAs. Search for NAs in the predictions to identify the failing learner.
library(mlr3)
library(mlr3misc)
# experiment
learner_rpart = lrn("classif.rpart")
learner_debug = lrn("classif.debug", predict_missing = 0.5)
task = tsk("pima")
resampling = rsmp("cv", folds = 3)
design = benchmark_grid(task, list(learner_rpart, learner_debug), resampling)
bmr = benchmark(design)
# search for predictions with NAs
tab = as.data.table(bmr)
tab[map_lgl(tab$prediction, function(pred) any(is.na(pred$response)))]
You should post a new question with a reprex including the failing learner, task and resampling.

How does the mlrMBO package optimize hyperparameters when no objective function is specified?

I am still very new to the mlrMBO package and hyperparameter tuning in general, so I apologize for the ignorance here. Previously I was using the makeTuneControlGrid() function for grid-search hyperparameter tuning of random forest and decision tree classification models. Then I was introduced to the mlrMBO package and used nearly the same code as for grid search, only with the makeTuneControlMBO() function instead. The performance metrics greatly improved over grid search, but I do not understand how the functions in this package search for optimal hyperparameter combinations differently from grid search when I never specified an objective function. From what I have read, I need to create an objective function to optimize, so if this objective function is never created, how does the package search the hyperparameter space if it is not using grid search? Is it optimizing a default objective function that is built into the package?
Here is my code:
task <- makeClassifTask(data = training_data, target = 'DEATH_EVENT', id = 'Death', positive = 1)
View(task)
# Configure learners with probability type
learner2 <- makeLearner('classif.randomForest', predict.type = 'prob') # Random Forest learner
View(learner2)
learner3 <- makeLearner('classif.kknn', predict.type = 'prob') # kNN learner
library(mlrMBO)
getParamSet("classif.randomForest")
ps2 <- makeParamSet(
  makeDiscreteParam('mtry', values = seq(1, 5, by = 1)),
  makeDiscreteParam('ntree', values = seq(450, 600, by = 50)),
  makeDiscreteParam('nodesize', values = seq(7, 14, by = 1))
)
#creates a control object for MBO optimization
ctrl = makeTuneControlMBO(mbo.control=mlrMBO::makeMBOControl())
rdesc = makeResampleDesc("CV", iters = 5L)
tuned_params <- tuneParams(learner = learner2,
                           task = task,
                           control = ctrl,
                           par.set = ps2,
                           resampling = rdesc,
                           measures = list(tpr, auc, fnr, mmce, tnr, setAggregation(tpr, test.sd)),
                           show.info = TRUE)
tuned_params$x
tuned_params$y
tuned_params$mbo.result
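For reference, the grid-search setup the question refers to differs only in the control object; a sketch reusing learner2, task, ps2 and rdesc from above:
# exhaustive search over every combination in ps2, as used before switching to MBO
ctrl_grid = makeTuneControlGrid()
tuned_grid <- tuneParams(learner = learner2,
                         task = task,
                         control = ctrl_grid,
                         par.set = ps2,
                         resampling = rdesc,
                         measures = list(tpr, auc, fnr, mmce, tnr),
                         show.info = TRUE)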

R - mlr - What is the difference between Benchmark and Resample when searching for hyperparameters

I'm searching for the optimal hyperparameter settings, and I realise I can do that in two ways in mlr: with the benchmark function or with the resample function. What is the difference between the two?
If I do it via benchmark, I can compare multiple models and extract the tuned parameters, which is an advantage over resample. If I use resample instead, I can only tune one model at a time, and I also notice my CPU usage skyrockets.
How and when should I use one over the other?
data(BostonHousing, package = "mlbench")
BostonHousing$chas <- as.integer(levels(BostonHousing$chas))[BostonHousing$chas]
library('mlr')
library('parallel')
library("parallelMap")
# ---- define learning tasks -------
regr.task = makeRegrTask(id = "bh", data = BostonHousing, target = "medv")
# ---- tune Hyperparameters --------
set.seed(1234)
# Define a search space for each learner'S parameter
ps_xgb = makeParamSet(
  makeIntegerParam("nrounds", lower = 5, upper = 50),
  makeIntegerParam("max_depth", lower = 3, upper = 15),
  # makeNumericParam("lambda", lower = 0.55, upper = 0.60),
  # makeNumericParam("gamma", lower = 0, upper = 5),
  makeNumericParam("eta", lower = 0.01, upper = 1),
  makeNumericParam("subsample", lower = 0, upper = 1),
  makeNumericParam("min_child_weight", lower = 1, upper = 10),
  makeNumericParam("colsample_bytree", lower = 0.1, upper = 1)
)
# Choose a resampling strategy
rdesc = makeResampleDesc("CV", iters = 5L)
# Choose a performance measure
meas = rmse
# Choose a tuning method
ctrl = makeTuneControlRandom(maxit = 30L)
# Make tuning wrappers
tuned.lm = makeLearner("regr.lm")
tuned.xgb = makeTuneWrapper(learner = "regr.xgboost", resampling = rdesc, measures = meas,
                            par.set = ps_xgb, control = ctrl, show.info = FALSE)
# -------- Benchmark experiements -----------
# Two learners to be compared
lrns = list(tuned.lm, tuned.xgb)
#setup Parallelization
parallelStart(mode = "socket",  # multicore or socket
              cpus = detectCores(),
              # level = "mlr.tuneParams",
              mc.set.seed = TRUE)
# Conduct the benchmark experiment
bmr = benchmark(learners = lrns,
                tasks = regr.task,
                resamplings = rdesc,
                measures = rmse,
                keep.extract = TRUE,
                models = FALSE,
                show.info = FALSE)
parallelStop()
# ------ Extract HyperParameters -----
bmr_hp <- getBMRTuneResults(bmr)
bmr_hp$bh$regr.xgboost.tuned[[1]]
res <- resample(
  tuned.xgb,
  regr.task,
  resampling = rdesc,
  extract = getTuneResult,  # getTuneResult or getFeatSelResult
  show.info = TRUE,
  measures = meas
)
res$extract
Benchmarking and resampling are orthogonal concepts; you can use both independently or together.
Resampling makes sure that learned models are evaluated appropriately. In particular, we don't want to evaluate a learned model by giving it the same data we used for training it, because then the model could just memorize the data and seem like the perfect model. Instead, we evaluate it on different, held-out data to see whether it has learned the general concept and is able to make good predictions on unseen data as well. The resampling determines how this split into train and test data happens, how many iterations of different train and test splits are used, etc.
Benchmarking allows you to compare different learners on different tasks. It's a convenient way to run large-scale comparison experiments that you would otherwise have to perform manually by combining all learners and all tasks, training and evaluating models, and making sure that everything happens in exactly the same way. To determine the performance of a learner and the models it induces on a given task, a resampling strategy is used, as outlined above.
So in short, the answer to your question is to use resampling when you want to evaluate the performance of a learned model, and benchmarking with resampling when you want to compare the performance of different learners on different tasks.
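As a tiny sketch of the evaluation idea (with mlr loaded and using the built-in bh.task), a 5-fold cross-validated RMSE estimate for a single learner looks like this:
# five train/test splits; performance is averaged over the held-out test sets
r = resample("regr.rpart", bh.task, makeResampleDesc("CV", iters = 5), measures = rmse)
r$aggr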
If I do it via benchmark, I can compare multiple models and extract the tuned parameters, which is an advantage over resample.
You can also do this using resample().
benchmark() is just a wrapper around resample() making it easier to run experiments on multiple tasks/learners/resamplings.
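As a small sketch reusing tuned.xgb, regr.task and rdesc from the question: with a single task and learner, the two calls below do the same work, and the tuned hyperparameters can be pulled from either result object.
# one tuned learner, one task: resample() and benchmark() are equivalent here
res = resample(tuned.xgb, regr.task, rdesc, measures = rmse,
               extract = getTuneResult, show.info = FALSE)
bmr = benchmark(learners = list(tuned.xgb), tasks = regr.task, resamplings = rdesc,
                measures = rmse, keep.extract = TRUE, show.info = FALSE)
res$extract              # tuning result per outer fold
getBMRTuneResults(bmr)   # the same information from the benchmark result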

Can we use a pre-defined column for CV (resampling) in mlr?

To conduct cross-validation (resampling) with the mlr R package, we normally need to call the makeResampleDesc function to specify the method and the number of folds.
My questions are:
Would it be possible to use a pre-defined column as a fold column? Or,
Does makeResampleDesc in mlr make sure that the folds created are consistent between different learners (under the same seed, of course), and can the folds be exported for further manipulation?
The resample description is independent of any learner; you can use one with several learners and get the same folds. You can also extract the fold number from the resample result if you want to link them back to the original data.
You can use a column in the data as the fold column using the blocking argument to makeClassifTask. From the help:
blocking: [‘factor’]
An optional factor of the same length as the number of
observations. Observations with the same blocking level
“belong together”. Specifically, they are either put all in
the training or the test set during a resampling iteration.
Default is ‘NULL’ which means no blocking.
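A minimal sketch of that approach, assuming the fold assignment lives in a hypothetical column named fold (the blocking.cv flag is available in newer mlr releases):
library(mlr)
# toy data with a pre-defined fold column
df = iris
df$fold = rep(1:3, length.out = nrow(df))
# pass the fold column as a blocking factor; rows sharing a fold level are
# never split between training and test set
task = makeClassifTask(data = df[, names(df) != "fold"],
                       target = "Species",
                       blocking = as.factor(df$fold))
# with as many CV iterations as blocks, each block becomes one test fold
rdesc = makeResampleDesc("CV", iters = 3, blocking.cv = TRUE)
r = resample("classif.rpart", task, rdesc, show.info = FALSE)
getResamplingIndices(r)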
I faced a similar problem.
Trying the following code, I was not able to get the same folds for the two different learners:
library(mlr)
set.seed(123)
K_Fold = 3
rdesc <- makeResampleDesc("CV", iters = K_Fold)
r <- resample("regr.rpart", bh.task, rdesc, show.info = FALSE, extract = getFeatureImportance, measures = medae)
KFoldIndex <- getResamplingIndices(r)
r2 <- resample("regr.glm", bh.task, rdesc, show.info = FALSE, measures = medae)
KFoldIndex2 <- getResamplingIndices(r2)
# KFoldIndex and KFoldIndex2 contain different splits: each resample() call
# instantiates the resample description anew
On the other hand, if you use makeResampleInstance you can apply the same indices to different, independent learners. This is described in the mlr resampling tutorial: https://mlr.mlr-org.com/articles/tutorial/resample.html:
rdesc = makeResampleDesc("CV", iters = K_Fold)
rin = makeResampleInstance(rdesc, size = nrow(iris))
r.lda = resample("classif.lda", iris.task, rin, show.info = FALSE)
r.rpart = resample("classif.rpart", iris.task, rin, show.info = FALSE)
getResamplingIndices(r.lda)
getResamplingIndices(r.rpart)
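A quick check (a sketch) that both results above were produced on identical splits:
# should be TRUE: both learners were evaluated on exactly the same train/test indices
identical(getResamplingIndices(r.lda), getResamplingIndices(r.rpart))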
