How to create an updatable Core ML model?

I tried to build a pre-trained Core ML model with the Create ML framework, but the resulting model is not updatable. Is there a way to create a pre-trained Core ML model that can be updated on the device itself (a feature newly introduced in Core ML 3)?

Not directly with Create ML. You'll have to use coremltools to make the model updatable. See here for examples: https://github.com/apple/coremltools/tree/main/examples
However, this only works for neural network and k-nearest neighbors models, and Create ML does not actually produce these kinds of models (at the moment).
For example, an image classifier trained with Create ML is a GLM (generalized linear model) on top of a fixed neural network. You cannot make GLM models updatable at this point.
So in short: no, you can't make models trained with Create ML updatable.
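For a model that is already a neural network classifier (for example, one converted from another framework rather than trained with Create ML), the coremltools route might look roughly like the sketch below. The function name, file paths, layer names, and the softmax output name are all hypothetical placeholders, and the layers marked as updatable are assumed to be inner product or convolution layers. Treat this as an illustration of the general idea, not a tested recipe for any particular model.

```python
def make_model_updatable(src_path, dst_path, trainable_layers, softmax_output,
                         epochs=10, learning_rate=0.01, batch_size=32):
    """Mark layers of an existing neural network classifier as updatable.

    All arguments are placeholders chosen for this sketch; they must match
    the layer and output names of the actual model.
    """
    # Imported inside the function so the sketch can be read and loaded
    # even on a machine where coremltools is not installed.
    import coremltools
    from coremltools.models.neural_network import NeuralNetworkBuilder, SgdParams

    # Load the model specification and wrap it in a builder.
    spec = coremltools.utils.load_spec(src_path)
    builder = NeuralNetworkBuilder(spec=spec)

    # Choose which layers may be retrained on the device.
    builder.make_updatable(trainable_layers)

    # On-device training needs a loss, an optimizer, and an epoch count.
    builder.set_categorical_cross_entropy_loss(name="lossLayer", input=softmax_output)
    builder.set_sgd_optimizer(SgdParams(lr=learning_rate, batch=batch_size))
    builder.set_epochs(epochs)

    # Save the updatable model specification.
    coremltools.utils.save_spec(builder.spec, dst_path)
```

A call could look like make_model_updatable("Classifier.mlmodel", "UpdatableClassifier.mlmodel", ["dense_2"], "classProbabilities"), where every name is again an assumption about the model at hand.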

Related

How to use VGG19 in Flux.jl?

I have a specific computer vision problem that I want to try solving using some pre-trained models. The Flux.jl docs don't actually include any pre-trained models, unlike some other ML frameworks (PyTorch, for example). How would I access those sorts of pre-trained models in Flux?
In the Flux ecosystem, functionality such as pre-trained computer vision models has been split out into a separate package called Metalhead.jl: https://github.com/FluxML/Metalhead.jl
Per the docs there, you can create a VGG19 model by doing:
julia> vgg19 = VGG19()
VGG19()
You can then pass the model to something like the classify function, along with an input image, as a quick validation test.

How to do cross-validation in R using neuralnet?

I'm trying to build a predictive model using the neuralnet package. First I split my dataset into training (80%) and test (20%) sets. But an ANN is such a powerful technique that my model easily overfits the training set and performs poorly on the external test set.
[Figure: predicted vs. true values; the training set is on the right and the test set on the left]
Is there a way to do cross-validation on the training set so that my model doesn't overfit it? How could I do this with a function I write myself?
Plus, are there any other approaches when dealing with deep learning? I've heard you can tweak the weights of the model in order to improve its quality on external data.
Thanks in advance!
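As a language-neutral illustration of the cross-validation idea asked about here, the following is a minimal sketch of k-fold index splitting. It is written in plain Python rather than R, and the function name k_fold_indices is made up for this example. Model fitting is left out entirely: for each split, one would train on the training indices and score on the validation indices, then average the scores across folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training_indices, validation_indices) pairs for k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Example: 10 samples, 5 folds, so each validation fold holds 2 samples.
splits = list(k_fold_indices(10, k=5))
```

Each sample lands in exactly one validation fold, so every observation is used for validation once and for training in the remaining folds.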

Is it possible to build a random forest with model-based trees, i.e., `mob()` in the partykit package?

I'm trying to build a random forest using model-based regression trees from the partykit package. I have built a model-based tree using the mob() function with a user-defined fit() function that returns an object at each terminal node.
In partykit there is cforest(), which uses only ctree()-type trees. I want to know whether it is possible to modify cforest(), or to write a new function, so that it builds random forests from model-based trees that return objects at the terminal nodes. I want to use the objects in the terminal nodes for predictions. Any help is much appreciated. Thank you in advance.
Edit: The tree I have built is similar to the one here -> https://stackoverflow.com/a/37059827/14168775
How do I build a random forest using a tree similar to the one in the answer above?
At the moment, there is no canned solution for general model-based forests using mob(), although most of the building blocks are available. However, we are currently reimplementing the backend of mob() so that we can leverage the infrastructure underlying cforest() more easily. Also, mob() is quite a bit slower than ctree(), which is somewhat inconvenient when learning forests.
The best alternative, currently, is to use cforest() with a custom ytrafo. This can also accommodate model-based transformations, very much like the scores in mob(). In fact, in many situations ctree() and mob() yield very similar results when provided with the same score function as the transformation.
A worked example is available in this conference presentation:
Heidi Seibold, Achim Zeileis, Torsten Hothorn (2017).
"Individual Treatment Effect Prediction Using Model-Based Random Forests."
Presented at Workshop "Psychoco 2017 - International Workshop on Psychometric Computing",
WU Wirtschaftsuniversität Wien, Austria.
URL https://eeecon.uibk.ac.at/~zeileis/papers/Psychoco-2017.pdf
The special case of model-based random forests for individual treatment effect prediction was also implemented in a dedicated package model4you that uses the approach from the presentation above and is available from CRAN. See also:
Heidi Seibold, Achim Zeileis, Torsten Hothorn (2019).
"model4you: An R Package for Personalised Treatment Effect Estimation."
Journal of Open Research Software, 7(17), 1-6.
doi:10.5334/jors.219

Identifying key columns/features used by decision tree regression

In Azure ML, I have a predictive regression model using boosted decision tree regression and it is reasonably accurate.
The input dataset has over 450 columns and the model has done a good job of predicting against test data sets, without over-fitting.
To report on the result I need to know which features/columns the model mainly used to make predictions, but I can't find this information easily when looking at the trained model data.
How do I identify this information? I'm happy to import the result dataset into R to help find this, but I just need pointers on which direction to start working in.
In Microsoft Azure Machine Learning, the features mainly used to make predictions can usually be found in the output of the Train Model module.
With a decision tree algorithm, however, the output of the Train Model module is the set of constructed trees.
To find the features that influenced the predictions of a decision tree algorithm, you can use the Permutation Feature Importance module.
The parameters of Permutation Feature Importance are Random Seed and Metric for Measuring Performance (in this case, Regression - Coefficient of Determination).
The left input of Permutation Feature Importance is your trained model, and the right input is your test data.
The output of Permutation Feature Importance is a list of features with their scores.
You can add an Execute R Script module to extract the features and scores from the Permutation Feature Importance module.
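To make the idea behind permutation feature importance concrete, here is a minimal standalone sketch in plain Python. It is not the Azure ML module itself: the function names, the toy model, and the synthetic data are assumptions made up for this illustration. Each column is shuffled in turn, and its importance is taken as the resulting drop in the coefficient of determination.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination (R^2)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, targets, seed=0):
    """Return, per column, the drop in R^2 after shuffling that column."""
    rng = random.Random(seed)
    baseline = r_squared(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only this one column replaced.
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        score = r_squared(targets, [predict(r) for r in permuted])
        importances.append(baseline - score)
    return importances


# Hypothetical stand-in for a trained model: it relies strongly on
# column 0, weakly on column 1, and ignores column 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets)
```

In this toy setup the ignored column gets an importance of zero, while the column the model depends on most gets the largest score. A real run would use the trained model and held-out test data instead.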

How to retrain model using old model + new data chunk in R?

I'm currently working on trust prediction in social networks, and for obvious reasons I model this problem as a data stream. What I want to do is "update" my trained model using the old model plus a new chunk of the data stream. The classifiers I am using are SVM and NB (e1071 implementation), a neural network (nnet), and a C5.0 decision tree.
Sidenote: I know that this is possible with the RMOA package by defining the "model" argument in the trainMOA function, but I don't think I can use it with those classifier implementations (if I am wrong, please correct me).
According to strange SO rules, I can't post this as a comment, so be it.
The classifiers you've listed need the full data set at the time you train a model, so whenever new data comes in, you should combine it with the previous data and retrain the model. What you are probably looking for is online machine learning. One very popular implementation is Vowpal Wabbit, which also has bindings to R.
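The difference between batch retraining and online learning can be sketched with a tiny perceptron that is updated one chunk at a time. This is plain Python written for illustration only: it is not Vowpal Wabbit or any of the classifiers named in the question, and the class name, method names, and data are made up. The point is that partial_fit only needs the new chunk, not the previously seen data.

```python
class OnlinePerceptron:
    """Minimal binary perceptron that learns incrementally from chunks."""

    def __init__(self, n_features):
        self.weights = [0.0] * n_features
        self.bias = 0.0

    def predict(self, x):
        activation = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1 if activation >= 0 else 0

    def partial_fit(self, chunk):
        """Update the existing weights using only the new (x, label) pairs."""
        for x, label in chunk:
            error = label - self.predict(x)  # -1, 0, or +1
            if error:
                self.weights = [w + error * xi for w, xi in zip(self.weights, x)]
                self.bias += error


# Two chunks of a made-up, linearly separable stream.
first_chunk = [([2.0, 1.0], 1), ([-1.0, -2.0], 0)]
second_chunk = [([1.5, 2.0], 1), ([-2.0, -0.5], 0)]

model = OnlinePerceptron(n_features=2)
model.partial_fit(first_chunk)   # initial training
model.partial_fit(second_chunk)  # later update; old data is not revisited
```

After both updates the model classifies all four example points correctly, without the first chunk ever being replayed.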
