I created an XML file using the pmml function from the pmml package in R.
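library(ada)
library(pmml)
# Train the AdaBoost model (ntrees and defctrl are assumed to be defined earlier)
adamodel_iOS <- ada(label ~ ., data = train_iOS, iter = ntrees, verbose = TRUE, loss = "ada", bag.frac = 0.7, nu = 0.1, control = defctrl, type = "real")
# Class probabilities for the training data
Ptrain_iOS <- predict(adamodel_iOS, newdata = train_iOS, type = "prob")
# Convert the fitted model to PMML and write it to disk
adapmml_iOS <- pmml(adamodel_iOS)
saveXML(adapmml_iOS, "model_iOS.xml")
save.image()
After training the model with the ada call, I computed the corresponding probabilities for the training data.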
Now I want to use this XML file to generate predictions on a data set (basically the training set again). How do I do that in R? I see that in Java and Spark we can load the XML file generated by the pmml function, and there are functions that then make predictions.
Basically, I am looking for a function in R that takes this XML file as input and returns an object which in turn takes data points as input and returns their probabilities of having label 0 or 1.
I found a related question, "Can PMML models be read in R?", but it does not help.
Check this link for the list of PMML producers and consumers. As you can see, R is listed as a producer, not a consumer. The algorithms for which R can produce PMML files are also listed there.
The most comprehensive tool for validating and converting PMML, and for scoring data with PMML models, is ADAPA, which is not free.
KNIME is an open-source drag-and-drop analytics tool that supports both import and export of PMML files (not for all models, and the features are limited). It supports R, Python, and Java too.
Although this question is old, I still want to share that you can use the "reticulate" package to call the Python pypmml package from R. To make this friendlier, and to make prediction look more like the predict function in R, I have encapsulated it in a package; the address of the package is here.
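As a rough illustration of the reticulate route, here is a minimal sketch. It assumes that Python, a Java runtime, pandas, and pypmml are installed where reticulate can find them, and that pypmml can evaluate the PMML that was exported for this particular model, which I have not verified. The helper name score_with_pmml is made up for this example.

```r
# Hypothetical helper: score a data frame with a PMML file via Python's pypmml.
# Assumes reticulate, Python, Java, pandas and pypmml are all available.
score_with_pmml <- function(pmml_path, newdata) {
  stopifnot(file.exists(pmml_path), is.data.frame(newdata))
  pypmml <- reticulate::import("pypmml")
  # Model$fromFile() loads the PMML document from disk
  model <- pypmml$Model$fromFile(pmml_path)
  # reticulate converts the R data frame to a pandas DataFrame and back
  model$predict(newdata)
}

# Usage (not run here, since it needs the Python side set up):
# scores <- score_with_pmml("model_iOS.xml", train_iOS)
# head(scores)
```

The names of the returned columns are decided by the PMML file's output fields, so inspect the result before relying on particular column names.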
Related
I have been working on a ML project for which that work (done inside an R-project) resulted in some ML models (built with caret) ALONG WITH code that uses those models for additional analysis.
As the next phase, I am "deploying" these models by creating an R-package that my collaborators can use for analysis of new data, where that analysis includes USING the trained ML models. This package includes functions that generate reports, where, embedded in that report is the application of the trained ML models against the new data sets.
I am trying to identify the "right" way to include those trained models in the package. (Note, currently each model is saved in its own .rds file).
I want to be able to use those models inside of package functions.
I also want to consider the possibility of "updating" the models to a new version at a later date.
So ... should I:
1. Include the .rds files in inst/extdata
2. Include as part of sysdata.rda
3. Put them in an external data package (which seems reasonable, except almost all examples in tutorials expect a data package to include data.frame-ish objects.)
With regard to that third option ... I note that these models likely imply that there are some additional "NAMESPACE" issues at play, as the models will require a whole bunch of caret related stuff to be useable. Is that NAMESPACE modification required to be in the "data" package or the package that I am building that will "use" the models?
My first choice would be option 1. There is no need for other formats such as PMML, since you only want to run the models within R, so I consider Rda the best native choice. As long as your models are not huge, it should be fine to share them with collaborators (though maybe not in a CRAN package). I see that option 3 sounds convenient, but why separate models and functions? Freshly trained models would then come with a new package version, which you would need anyway if you went with a data package. I don't see much gain this way, but I do not have much experience with data packages.
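To make option 1 concrete, here is a minimal sketch of shipping a serialized model and loading it inside a package function. The package name "mypkg", the file name "model_v1.rds", and the helper load_model are hypothetical, and an lm fit stands in for a caret model so that the example runs on its own.

```r
# Stand-in for a trained caret model; any R object serializes the same way.
fit <- lm(mpg ~ wt, data = mtcars)

# In a real package the file would live at inst/extdata/model_v1.rds and be
# located at run time with:
#   system.file("extdata", "model_v1.rds", package = "mypkg")
# A temporary file is used here so the sketch is self-contained.
model_path <- file.path(tempdir(), "model_v1.rds")
saveRDS(fit, model_path)

# Hypothetical package-internal helper that reads the stored model.
load_model <- function(path) {
  stopifnot(file.exists(path))
  readRDS(path)
}

restored <- load_model(model_path)
pred <- predict(restored, newdata = data.frame(wt = 3))
```

Putting a version in the file name is one way to leave room for updated models later: a new package release can add model_v2.rds next to the old file.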
So I have trained a random forest model in R which I have converted into a pmml file. But how do you import the pmml file into Android Studio and use it for predictions? I have a hard time finding examples of this online. The model is very simple and I only have 4 features which are used to classify the input data as "A","B" or "C". So I want to, given these 4 features, be able to input that into my trained model and get an output in the app. Is there some interface that I can import and use?
Since you already have a PMML file, you may want to check out this library. It is a PMML evaluator for Android, so you should be able to import your PMML file and use it for inference. I hope this is useful.
Note: I haven't tried this particular library myself.
I would like to save the results of a Bayesian estimation function (e.g. brm) to file (e.g. in LaTeX format), for sharing results and preparing publications. It appears that existing packages (e.g. xtable, stargazer) are designed with non-Bayesian statistics in mind and cannot handle these model objects. Are there any existing packages or available code to handle this task (before I begin to reinvent the wheel)? I have found tools for making tables from models estimated using JAGS/BUGS here, but brm uses Stan to estimate models.
If you call launch_shinystan on the object and go to the Estimate tab, there is a link to Generate LaTeX table that gives you a bunch of options to check on the left and it outputs the syntax on the right.
For posterity's sake: for basic tables, one may also access specific parts of the model summary, for example:
summary(model)$fixed
where model is a brmsfit object (as returned by brm), and pass that to xtable or another function to output LaTeX tables.
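A minimal sketch of that last step, assuming the xtable package is installed and that summary(model)$fixed gives a matrix or data frame of coefficients. The helper name coef_table_to_latex is made up for illustration.

```r
# Hypothetical helper: render a coefficient table as a LaTeX string.
# Assumes the xtable package is installed.
coef_table_to_latex <- function(coef_table, digits = 2) {
  tab <- xtable::xtable(as.data.frame(coef_table), digits = digits)
  # print.results = FALSE returns the LaTeX code instead of printing it
  print(tab, type = "latex", print.results = FALSE)
}

# Usage with a fitted brm model (not run here):
# latex <- coef_table_to_latex(summary(model)$fixed)
# writeLines(latex, "fixed_effects.tex")
```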
I am working on the "randomForest" R package to change the sampling method used for feature subset selection at the nodes of the trees in the forest. Currently random forest uses simple random sampling to do that. I tried to look at the R code by using the commands
library(randomForest)
getAnywhere(randomForest.default)
but could not find the relevant code chunk where "mtry" features are selected. How can I do this change in the source code?
I also tried using the S3 and S4 methods described in this SO question, but did not see all the functions in the randomForest package and, more importantly, did not see the randomForest() method listed.
However, if you navigate to the CRAN page for randomForest, you will see a link to the source code for the package:
https://cran.r-project.org/web/packages/randomForest/index.html
You can download a TAR file which has all the source code for the package from the above link. The actual source code seems to be in the src folder, e.g. rf.c, which looks like it might be the file you want to modify.
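The download can also be done from within R. The following is a sketch only: the helper name fetch_package_source and the mirror URL are my own choices, and it needs network access to run.

```r
# Hypothetical helper: download a package's source tarball from CRAN and
# unpack it, returning the path of the unpacked source directory.
fetch_package_source <- function(pkg, dest = tempdir(),
                                 repos = "https://cloud.r-project.org") {
  info <- utils::download.packages(pkg, destdir = dest,
                                   type = "source", repos = repos)
  tarball <- info[1, 2]  # second column holds the downloaded file path
  utils::untar(tarball, exdir = dest)
  file.path(dest, pkg)
}

# Usage (not run here, since it needs network access):
# src_dir <- fetch_package_source("randomForest")
# list.files(file.path(src_dir, "src"))  # compiled code such as rf.c
```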
Let's say I run random forest or kmeans. I get an R object. Now I want to save that model for future use. I thought PMML was a good format but then realized that R can't read PMML and turn it back into an object that can be used for scoring. It can only write it. Is there any alternative to saving it as an R object with the save command? That seems like a bloated solution since the data that was used for training is attached to it.