MXNet-Caret Training and Optimization - r

I am using the MXNet library in RStudio to train a neural network model.
When training the model using caret, I can tune (among others) the "momentum" parameter. Is this related to the Stochastic Gradient Descent optimizer?
I know that this is the default optimizer when training using "mx.model.FeedForward.create", but what happens when I am using caret::train?

Momentum is related to SGD and controls how readily the algorithm changes its direction of descent. There are several formulas for this; read more about it here: https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d
The caret package is meant to be general purpose, so it works with MXNet. When you call caret::train, you pass a method parameter that selects the model. The available methods come from the caret package repository, which at the moment supports MXNet. See https://github.com/topepo/caret/issues/887 for an example with Adam, or https://github.com/topepo/caret/blob/master/RegressionTests/Code/mxnet.R for regular SGD.
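To make the role of the momentum parameter concrete, here is a minimal base R sketch of one common momentum formulation (the classical "velocity" update). It uses a full gradient on a toy quadratic rather than a stochastic one, and the names sgd_momentum, lr, and momentum are illustrative only; this is not the code that MXNet or caret runs internally.

```r
# Minimise f(w) = (w - 3)^2 with gradient descent plus momentum.
# Toy example: the gradient is exact, not a stochastic mini-batch estimate.
grad <- function(w) 2 * (w - 3)

sgd_momentum <- function(w0, lr = 0.1, momentum = 0.9, steps = 500) {
  w <- w0
  v <- 0                                   # velocity: memory of past gradients
  for (i in seq_len(steps)) {
    v <- momentum * v - lr * grad(w)       # blend previous direction with new gradient
    w <- w + v                             # move along the blended direction
  }
  w
}

w_plain    <- sgd_momentum(0, momentum = 0)    # momentum = 0 is ordinary gradient descent
w_momentum <- sgd_momentum(0, momentum = 0.9)  # higher momentum resists changes of direction
```

Both runs approach the minimiser w = 3; the momentum term only changes the path taken to get there.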

Related

Machine learning in R: Using MLR package survival filters in MLR3

I want to run a number of machine learning algorithms with different feature selection methods on survival data using the MLR3 package. For that, I am using the benchmark() function of MLR3.
Unfortunately, the filter feature selection methods of MLR3 do not support survival yet. However, the MLR package supports survival filters.
I can fuse MLR learners with an MLR filter method. After that, I need to convert them to learners in MLR3 in order to be able to use the benchmark_grid() function of MLR3.
Is there any way to use MLR survival filters in MLR3? Or is there any way that I can convert MLR filters to MLR3 filters?
Unfortunately not -- the basic design of mlr and mlr3 is fundamentally different.

Is it possible to build a random forest with model based trees i.e., `mob()` in partykit package

I'm trying to build a random forest using model based regression trees in partykit package. I have built a model based tree using mob() function with a user defined fit() function which returns an object at the terminal node.
In partykit there is cforest() which uses only ctree() type trees. I want to know if it is possible to modify cforest() or write a new function which builds random forests from model based trees which returns objects at the terminal node. I want to use the objects in the terminal node for predictions. Any help is much appreciated. Thank you in advance.
Edit: The tree I have built is similar to the one here -> https://stackoverflow.com/a/37059827/14168775
How do I build a random forest using a tree similar to the one in above answer?
At the moment, there is no canned solution for general model-based forests using mob() although most of the building blocks are available. However, we are currently reimplementing the backend of mob() so that we can leverage the infrastructure underlying cforest() more easily. Also, mob() is quite a bit slower than ctree() which is somewhat inconvenient in learning forests.
The best alternative, currently, is to use cforest() with a custom ytrafo. These can also accommodate model-based transformations, very much like the scores in mob(). In fact, in many situations ctree() and mob() yield very similar results when provided with the same score function as the transformation.
A worked example is available in this conference presentation:
Heidi Seibold, Achim Zeileis, Torsten Hothorn (2017).
"Individual Treatment Effect Prediction Using Model-Based Random Forests."
Presented at Workshop "Psychoco 2017 - International Workshop on Psychometric Computing",
WU Wirtschaftsuniversität Wien, Austria.
URL https://eeecon.uibk.ac.at/~zeileis/papers/Psychoco-2017.pdf
The special case of model-based random forests for individual treatment effect prediction was also implemented in a dedicated package model4you that uses the approach from the presentation above and is available from CRAN. See also:
Heidi Seibold, Achim Zeileis, Torsten Hothorn (2019).
"model4you: An R Package for Personalised Treatment Effect Estimation."
Journal of Open Research Software, 7(17), 1-6.
doi:10.5334/jors.219

Difference between "mlp" and "mlpML"

I'm using the Caret package from R to create prediction models for maximum energy demand. What I need to use is a multilayer perceptron neural network, but in the Caret package I found that there are two mlp methods, "mlp" and "mlpML". What is the difference between the two?
I have read the description in a book (Advanced R Statistical Programming and Data Models: Analysis, Machine Learning, and Visualization), but it still doesn't answer my question.
Caret has 238 different models available! However many of them are just different methods to call the same basic algorithm.
Besides mlp there are 9 other methods of calling a multi-layer-perceptron one of which is mlpML. The real difference is only in the parameters of the function call and which model you need depends on your use case and what you want to adapt about the basic model.
Chances are, if you don't know what mlpML, mlpWeightDecay, etc. do, you are fine just using the basic mlp.
Looking at the official documentation we can see that:
mlp takes a single tuning parameter (size), while mlpML takes three (layer1, layer2, layer3). So with the first method you can only tune the size of the multi-layer perceptron, while with the second you can tune each layer individually.
Looking at the source code here:
https://github.com/topepo/caret/blob/master/models/files/mlp.R
and here:
https://github.com/topepo/caret/blob/master/models/files/mlpML.R
It seems that the difference is that mlpML allows several hidden layers:
modelInfo <- list(label = "Multi-Layer Perceptron, with multiple layers",
while mlp has one single layer with hidden units.
The official documentation also hints at this difference. In my opinion, it is not particularly useful to have many different models that differ only very slightly, and the documentation does not explain those slight differences well.
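As a small illustration of the difference in tuning parameters named above, the following sketch builds one tuning grid per method with base R's expand.grid(). The particular values are arbitrary, and the reading that a layer size of 0 means "layer not used" is an assumption based on caret's default grid for mlpML rather than something stated in this thread.

```r
# Tuning grid for method = "mlp": a single hidden layer with 'size' units.
grid_mlp <- expand.grid(size = c(3, 5, 7))

# Tuning grid for method = "mlpML": up to three hidden layers.
grid_mlpML <- expand.grid(layer1 = c(3, 5),   # units in hidden layer 1
                          layer2 = c(0, 3),   # assumed: 0 drops this layer
                          layer3 = 0)

# Hypothetical usage, assuming predictors x and outcome y exist and that the
# caret package and its backing package for these methods are installed:
# fit <- caret::train(x, y, method = "mlpML", tuneGrid = grid_mlpML)
```

The grid for mlp has one column, while the grid for mlpML has one column per hidden layer, which is the whole practical difference when calling train().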

Set up different activation functions for different layers using "neuralnet" package

Ciao,
I am working with neuralnet in R.
I used to program this kind of stuff using Keras in python so I would expect to be able to set up different activation functions for different layers.
Let me explain. Suppose I want to build a neural net with 2 hidden layers (say with 5 and 4 neurons) and an output between -1 and 1.
I would like to set up RELU or softplus in the hidden layers and tanh in the output layer.
The issue here is that the neuralnet package lets me choose only one activation function via the argument act.fct:
> nn <- neuralnet(data = data, hidden = c(5, 4), act.fct = tanh)
I tried setting the act.fct argument to c(softplus, softplus, tanh), but of course I get an error because the neuralnet function expects only one function for that argument.
Do you know how I can set up neuralnet in this way? On the internet I can only find very basic linear neural nets built with this package. If this is not possible, it would mean that the package is almost useless, because it could only build "linear models" (??!)
Thanks a lot,
ciao
ReLU was added in neuralnet 1.44.4 (not on CRAN yet; you could use devtools::install_github("bips-hb/neuralnet")). In this version it's also possible to change the output activation function separately (output.act.fct). However, different activations for the individual hidden layers are not yet possible.
See also here: https://github.com/bips-hb/neuralnet/issues/18.
"On the internet I can only find very basic linear neural nets built with this package. If this is not possible, it would mean that the package is almost useless, because it could only build 'linear models' (??!)"
No, not only linear models. But note that the package is from the pre-deep learning era (2008) and not made for deep networks. I would also recommend keras (the R package is great) here.
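To show what the requested architecture computes, here is a base R sketch of a forward pass through a 2-5-4-1 network with softplus in both hidden layers and tanh at the output. It is independent of neuralnet and keras, uses random weights, and does no training; the helper names forward and softplus are illustrative only.

```r
softplus <- function(x) log1p(exp(x))

# Forward pass with one activation function per layer.
forward <- function(x, weights, activations) {
  a <- x
  for (i in seq_along(weights)) {
    # Prepend a column of ones so the first weight row acts as the bias.
    a <- activations[[i]](cbind(1, a) %*% weights[[i]])
  }
  a
}

set.seed(1)
sizes <- c(2, 5, 4, 1)                           # inputs, hidden 1, hidden 2, output
weights <- lapply(seq_len(length(sizes) - 1), function(i)
  matrix(rnorm((sizes[i] + 1) * sizes[i + 1]), nrow = sizes[i] + 1))

x   <- matrix(rnorm(6), ncol = 2)                # three observations, two inputs
out <- forward(x, weights, list(softplus, softplus, tanh))
```

Because the hidden activations are nonlinear, the mapping from x to out is not a linear model, and the tanh output layer keeps every prediction between -1 and 1.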

Lime package not able to get predictions for CaretStack

I built a Caret ensemble model by stacking models together.
The model ran successfully and I got encouraging results.
The challenge came when I tried to use Lime to interpret the black box predictions. I got an error saying "The class of model must have a model_type method".
The only other time I encountered such an error was when using Lime with H2O. Subsequently, the developers behind Lime released an update that supports H2O in Lime.
Does anyone know if any work has been done to include CaretStack for use with Lime? Or does anyone know of a workaround to solve this issue?
According to the Lime documentation, these are the supported models:
Out of the box, lime supports the following model objects:
train from caret
WrappedModel from mlr
xgb.Booster from xgboost
H2OModel from h2o
keras.engine.training.Model from keras
lda from MASS (used for low-dependency examples)
If your model is not one of the above you'll need to implement support yourself. If the model has a predict interface mimicking that of predict.train() from caret, it will be enough to wrap your model in as_classifier()/as_regressor() to gain support.
Otherwise you'll need to implement a predict_model() method and potentially a model_type() method (if the latter is omitted, the model should be wrapped in as_classifier()/as_regressor() every time it is used in lime()).
Solution to your question:
For your case, caretStack has a predict interface mimicking that of predict.train(), so wrapping your model in as_classifier() or as_regressor() should suffice.
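A minimal sketch of that wrapping for a classification stack could look like the following. The helper explain_stack and its arguments stack, train_x, and new_x are hypothetical placeholders for your fitted caretStack model, the training predictors, and the rows to explain; whether the predictions pass through cleanly depends on your versions of caretEnsemble and lime, so treat this as a starting point rather than a tested recipe.

```r
# Hypothetical helper: wrap a fitted caretStack classifier so that lime
# knows its model type, then build an explainer and explain new rows.
# Requires the lime package to be installed when the function is called.
explain_stack <- function(stack, train_x, new_x, n_features = 5) {
  wrapped   <- lime::as_classifier(stack)      # supplies the missing model type
  explainer <- lime::lime(train_x, wrapped)    # fit the explainer on training predictors
  lime::explain(new_x, explainer,
                n_labels = 1, n_features = n_features)
}
```

For a regression stack, the same pattern would use lime::as_regressor() and drop the n_labels argument.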

Resources