Lime package not able to get predictions for CaretStack - r

I built a Caret ensemble model by stacking models together.
The model ran successfully and I got encouraging results.
The challenge came when I tried to use Lime to interpret the black-box predictions: I got an error saying "The class of model must have a model_type method".
The only time I had encountered such an error before was when using Lime with H2O; the Lime developers have since released an update that adds H2O support.
Does anyone know if any work has been done to include caretStack for use with Lime? Or does anyone know of a workaround for this issue?

According to the lime documentation, these are the supported models:
Out of the box, lime supports the following model objects:
- train from caret
- WrappedModel from mlr
- xgb.Booster from xgboost
- H2OModel from h2o
- keras.engine.training.Model from keras
- lda from MASS (used for low-dependency examples)
If your model is not one of the above you'll need to implement support yourself. If the model has a predict interface mimicking that of predict.train() from caret, it will be enough to wrap your model in as_classifier()/as_regressor() to gain support.
Otherwise you'll need to implement a predict_model() method and potentially a model_type() method (if the latter is omitted, the model should be wrapped in as_classifier()/as_regressor() every time it is used in lime()).
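For illustration, here is a minimal sketch of what implementing that support yourself might look like, assuming a hypothetical model class "my_model" whose predict() method can return class probabilities (the class name and predict() behaviour are assumptions, not taken from the question):

library(lime)

# Hypothetical S3 methods adding lime support for objects of class "my_model"
model_type.my_model <- function(x, ...) {
  "classification"
}

predict_model.my_model <- function(x, newdata, type, ...) {
  # For classifiers, lime expects a data.frame of class probabilities
  as.data.frame(predict(x, newdata, type = "prob"))
}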
Solution to your question:
In your case, caretStack has a predict interface mimicking that of predict.train(), so wrapping your model in as_classifier() or as_regressor() should suffice, as in the sketch below.
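A minimal sketch of how that could look, assuming stack_model is a caretStack object trained on a data frame train_df with a factor outcome column outcome, and new_df holds the observations to explain (all of these names are assumptions):

library(lime)
library(caretEnsemble)

# Wrap the stacked model so lime treats it as a classifier
wrapped_model <- as_classifier(stack_model)

# Build the explainer on the training predictors (outcome column dropped)
explainer <- lime(train_df[, setdiff(names(train_df), "outcome")], wrapped_model)

# Explain a handful of new observations
explanation <- explain(new_df[1:3, ], explainer, n_labels = 1, n_features = 5)
plot_features(explanation)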

Related

Can no longer make predictions with older Rborist model

I have an older random forest model I built with the Rborist package, and I recently stopped being able to make predictions with it. I think older versions of the package produced models of class Rborist, and starting with version 0.3-1 it produces models of class rfArb, so the predict method also changed from predict.Rborist to predict.rfArb.
I was originally getting a message about how there was "no applicable method for 'predict' applied to an object of class 'Rborist'" (since that class had been deprecated, I think). Then I manually changed the class of my old model to rfArb and started getting a new error message:
Error in predict.rfArb(surv_mod_rf, newdata = trees) :
  Sampler state needed for prediction
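Roughly the steps described above, using the object names from the error message (this is only a sketch of the failing call, not a fix):

library(Rborist)

class(surv_mod_rf) <- "rfArb"           # manually re-label the old Rborist model
predict(surv_mod_rf, newdata = trees)   # dispatches to predict.rfArb and fails with
                                        # "Sampler state needed for prediction"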
It looks to me like this has to do with the way rfArb objects are constructed (they have a sampler vector recording how many times each observation in the sample was sampled, or some such), which is different from the way Rborist objects are constructed. But please let me know if I'm misunderstanding this.
Is there any way to update my old model so I can use the recent version of Rborist to make predictions, or are the only options using an older version of the package or rebuilding the model?

Difference between "mlp" and "mlpML"

I'm using the caret package in R to create prediction models for maximum energy demand. What I need to use is a neural network multilayer perceptron, but in the caret package I found there are two mlp methods, "mlp" and "mlpML". What is the difference between the two?
I have read the description in a book (Advanced R Statistical Programming and Data Models: Analysis, Machine Learning, and Visualization) but it still doesn't answer my question.
Caret has 238 different models available! However, many of them are just different ways of calling the same basic algorithm.
Besides mlp there are 9 other methods for calling a multi-layer perceptron, one of which is mlpML. The real difference lies only in the parameters of the function call, and which model you need depends on your use case and on what you want to adapt about the basic model.
Chances are, if you don't know what mlpML, mlpWeightDecay, etc. do, you are fine just using the basic mlp.
Looking at the official documentation we can see that mlp has a single tuning parameter, size, while mlpML has layer1, layer2 and layer3. So with the first method you can only tune the size of the multi-layer perceptron's single hidden layer, while with the second you can tune the size of each hidden layer individually.
Looking at the source code here:
https://github.com/topepo/caret/blob/master/models/files/mlp.R
and here:
https://github.com/topepo/caret/blob/master/models/files/mlpML.R
It seems that the difference is that mlpML allows several hidden layers:
modelInfo <- list(label = "Multi-Layer Perceptron, with multiple layers",
while mlp has a single layer of hidden units.
The official documentation also hints at this difference. In my opinion, it is not particularly useful to have many different models that differ only very slightly, and the documentation does not explain those slight differences well.
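As a concrete illustration, here is a sketch of how the two methods are tuned differently in caret (the data frame train_df, the outcome y and the grid values are assumptions):

library(caret)

# "mlp": a single tuning parameter, the number of hidden units
grid_mlp <- expand.grid(size = c(3, 5, 7))
fit_mlp <- train(y ~ ., data = train_df, method = "mlp", tuneGrid = grid_mlp)

# "mlpML": one parameter per hidden layer, tuned independently
grid_mlpML <- expand.grid(layer1 = c(5, 10), layer2 = c(3, 5), layer3 = 2)
fit_mlpML <- train(y ~ ., data = train_df, method = "mlpML", tuneGrid = grid_mlpML)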

MXNet-Caret Training and Optimization

I am using the MXNet library in RStudio to train a neural network model.
When training the model using caret, I can tune (among others) the "momentum" parameter. Is this related with the Stochastic Gradient Descent optimizer?
I know that this is the default optimizer when training with "mx.model.FeedForward.create", but what happens when I use caret::train?
Momentum is related to SGD and controls how prone your algorithm is to changing its direction of descent. There are several formulas for this; read more about it here: https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d
The caret package is meant to be general purpose, so it works with MXNet. When you call caret::train you can pass a method parameter, which must name a model from the caret model repository; caret currently supports MXNet. See https://github.com/topepo/caret/issues/887 for an Adam example, or https://github.com/topepo/caret/blob/master/RegressionTests/Code/mxnet.R for regular SGD.
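A sketch of what such a call could look like, assuming the grid columns match the tuning parameters defined in caret's "mxnet" model file (the data frame train_df, the outcome y and the specific values are assumptions):

library(caret)

# "momentum" here is the momentum term of the SGD optimizer used by method = "mxnet"
grid <- expand.grid(layer1 = 10, layer2 = 5, layer3 = 2,
                    learning.rate = 0.05, momentum = 0.9,
                    dropout = 0, activation = "relu")
fit <- train(y ~ ., data = train_df, method = "mxnet",
             trControl = trainControl(method = "cv", number = 5),
             tuneGrid = grid)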

Using a 'gbm' model created in R package 'dismo' with functions in R package 'gbm'

This is a follow-up to a previous question I asked a while back that was recently answered.
I have built several gbm models with dismo::gbm.step, which relies on the gbm fitting functions found in R package gbm, as well as cross validation tools from R package splines.
As part of my analysis, I would like to use some of the graphical tools available in R (e.g. perspective plots) to visualize pairwise interactions in the data. Both the gbm and the dismo packages have functions for detecting and modelling interactions in the data.
The implementation in dismo is explained in Elith et al. (2008) and returns a statistic which indicates departures of the model predictions from a linear combination of the predictors, while holding all other predictors at their means.
The implementation in gbm uses Friedman's H statistic (Friedman & Popescu, 2005), returns a different metric, and does NOT set the other variables at their means.
The interactions modelled and plotted with dismo::gbm.interactions are great and have been very informative. However, I would also like to use gbm::interact.gbm, partly for publication strength and also to compare the results from the two methods.
If I try to run gbm::interact.gbm on a gbm object created with dismo, an error is returned:
"Error in is.factor(data[, x$var.names[j]]) :
argument "data" is missing, with no default"
I understand dismo::gbm.step adds extra data that the authors thought would be useful to the gbm model.
I also understand that the answer to my question lies somewhere in the source code.
My question is:
Is it possible to modify a gbm object created in dismo so it can be used with gbm::interact.gbm? If so, would this be accomplished by:
a. Modifying the gbm object created in dismo::gbm.step?
b. Modifying the source code for gbm::interact.gbm?
c. Doing something else?
I will be going through the source code trying to solve this myself, if I come up with a solution before anyone answers I will answer my own question.
The gbm::interact.gbm function requires data as an argument; its signature is interact.gbm <- function(x, data, i.var = 1, n.trees = x$n.trees).
The dismo gbm object is essentially the same as the gbm gbm.object, but with extra information attached, so I don't imagine changing the gbm.object would help.
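A minimal sketch of what passing the data explicitly might look like, assuming dismo_mod is the gbm object returned by dismo::gbm.step, model_data is the data frame it was fitted to, and var_a / var_b are two of its predictors (all names are assumptions):

library(gbm)
library(dismo)

# interact.gbm has no default for `data`, so supply the fitting data yourself
h_stat <- interact.gbm(dismo_mod, data = model_data,
                       i.var = c("var_a", "var_b"),
                       n.trees = dismo_mod$n.trees)
h_stat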

How are the predictions obtained?

I have been unable to find information on how exactly predict.cv.glmnet works.
Specifically, when a prediction is being made are the predictions based on a fit that uses all the available data? Or are predictions based on a fit where some data has been discarded as part of the cross validation procedure when running cv.glmnet?
I would strongly assume the former, but I was unable to find a sentence in the documentation that clearly states that, after cross-validation finishes, the model is fitted on all available data for new predictions.
If I have overlooked a statement along those lines, I would also appreciate a hint on where to find this.
Thanks!
In the documentation for predict.cv.glmnet:
"This function makes predictions from a cross-validated glmnet model, using the stored "glmnet.fit" object ... "
In the documentation for cv.glmnet (under value):
"glmnet.fit a fitted glmnet object for the full data."
