I have an app that can calculate some predefined models (say, an lm model). The app is hosted on a server. Now I would like to add functionality so that a user can define any model "on the side" (ARIMA or anything user-defined), add it to the app, and then calculate estimates using that model.
The best solution would be if the user could define the model on their own R instance, export it to a file, and import it via the front-end on the server. That would be the best solution for me, because the user wouldn't need any permissions on the server.
I was thinking about saving the model definition as an RDS file and then importing it into the app. However, if the model is saved via:
modelTest <- glm(y ~ x, data = df)
saveRDS(modelTest, file = "modelTest.rds")
And then after import:
modelTest2 <- readRDS("modelTest.rds")
df2$prediction <- predict(modelTest2, newdata=df2)
In the above example, the whole glm object is saved. That means the fitted values, residuals, and a copy of the training data are saved too, so the file can be large when the model was fitted to many observations. Is it possible to use another method and save the model with only the model definition, without the data?
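One approach, sketched below, is to fit without storing the model frame and then drop the bulky per-observation components before saving; predict(..., newdata = ...) still works on the stripped object, but methods such as residuals() will not. (The butcher package automates this kind of pruning, including the formula environment, which can also keep a reference to the data.)
modelTest <- glm(y ~ x, data = df, model = FALSE, x = FALSE, y = FALSE)
slim <- modelTest
slim$data <- NULL               # drop the stored copy of the training data
slim$fitted.values <- NULL      # drop the per-observation components
slim$residuals <- NULL
slim$linear.predictors <- NULL
slim$weights <- NULL
slim$prior.weights <- NULL
saveRDS(slim, file = "modelTest_slim.rds")
# predict() with newdata only needs the terms, coefficients, xlevels, etc.:
modelTest2 <- readRDS("modelTest_slim.rds")
df2$prediction <- predict(modelTest2, newdata = df2)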
Related
I am working with secure datasets in a safe data environment and producing models from them, e.g. Cox proportional hazards models, which are safe to export,
i.e. model <- coxph(survobject ~ x + a + b + c).
I can use expand.grid on all possible values of x, a, b, c and predict(model, grid) to visualise what the model is saying about the risk distribution.
However, I'd like to export the models and carry out all the model visualisation outside the safe space.
What is the best way to export the model, save it, and import it so that I can use predict at a later time on new data, without access to the original data? Bear in mind that this file needs to be readable by a human, so it can be assessed as a data-safe export that does not include any of the original (sensitive) data.
I note that you can export the model as an RDS file.
https://stackoverflow.com/questions/54744797/is-there-a-way-to-export-and-import-models-rather-than-re-running-them-every-ses
However, when opening this file in Notepad it is unreadable.
I have already exported all the coefficients, i.e. the outputs from glance and tidy (both exponentiated and not). Is there any way to reconstruct a model that can carry out predict from this data?
How can I export the model to a human-readable file, perhaps via unlist, saving it in Excel and then reconstructing the model?
I know how to calculate predictions from linear models, but not from Cox PH models.
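For that last point: a relative-risk prediction can be reconstructed from the coefficient table alone, because coxph's predict(..., type = "risk") is exp(sum(beta_i * (x_i - mean_i))), i.e. the exponentiated linear predictor centred at the training covariate means. A minimal sketch with hypothetical values (absolute survival curves would additionally need the baseline hazard, which survival::basehaz() can export):
coefs <- c(x = 0.12, a = -0.45, b = 0.08, c = 0.30)   # log hazard ratios from tidy()
means <- c(x = 1.0, a = 2.0, b = 0.5, c = 3.0)        # training covariate means
new_obs <- c(x = 1.5, a = 1.0, b = 0.0, c = 2.5)      # a new observation
eta <- sum(coefs * (new_obs - means))                 # centred linear predictor
exp(eta)                                              # relative risk vs. an "average" subject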
I am developing an image classification workflow which uses Keras through R. The workflow will likely be run multiple times, potentially by multiple users. I save a custom-trained version of Keras' Iv3 (InceptionV3) model as a .h5 file.
Once the file is saved and loaded back in with load_model_hdf5(), is there a way to see the class labels with which the model has been trained?
I understand that the classes are the alphabetized names of the folders in the training directory but there will be cases where the model is loaded on a new machine without access to the training directory.
Right now I am manually loading in the list of class labels (as strings), which is not a good solution.
Ideally, I would load in my trained model and then access a list of class labels...
Pseudocode might look like this:
model_fn <- "example_model.h5"       # path to some example model file (.h5)
model <- load_model_hdf5(model_fn)
classes <- model$classes             # the accessor I am looking for (does it exist?)
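As far as I know, a Keras .h5 file stores only the architecture, weights, and optimizer state, so there is no built-in model$classes. A minimal workaround, sketched below, is to ship the label vector alongside the model file (the file names are hypothetical):
library(keras)
class_labels <- c("cat", "dog", "horse")        # captured from the training run
save_model_hdf5(model, "model.h5")
writeLines(class_labels, "model_classes.txt")   # travels with the .h5 file
# On the new machine, load both together:
model <- load_model_hdf5("model.h5")
classes <- readLines("model_classes.txt")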
I have 7 trained models in total. All of them were created using the caret::train function.
5 of them were generated with R on one laptop (a Mac), while the other 2 were generated with R on another laptop (a PC). I need to output a chart. Before that, I need to use the caret::resamples function to generate the result, which means I will need all 7 models in the same environment. Is there any code I can write to do that?
Actually, as per Axeman's comment, you can use saveRDS(model, "model.rds") and model <- readRDS("model.rds"); you don't need to use .RData in order to use .rds files. You can just add those lines of code to your scripts!
You can do so for each model in your environment:
saveRDS(model1, "model1.rds")
saveRDS(model2, "model2.rds")
and so on, then transfer them to the other computer however you'd like, and read them back:
model1 <- readRDS("model1.rds")
model2 <- readRDS("model2.rds")
I have trained a model with keras/tensorflow in RStudio and stored it.
I now reload the model with
my_model <- load_model_hdf5("my_model.h5")
With
summary(my_model)
I get a summary of the model, including the number and size of its layers.
Yet I don't see what the activation functions are.
Is there a way to access the activation functions?
Secondly, is there also a way to access the hyperparameters, such as the number of epochs and the batch size, that were used in training this model?
You can check Netron by loading your .h5 file into it. Netron is highly useful.
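If you want to inspect the model from inside R instead, a minimal sketch (assuming the standard keras R interface, where each layer's get_config() lists its activation if it has one):
library(keras)
my_model <- load_model_hdf5("my_model.h5")
# Print the activation of every layer that defines one:
for (layer in my_model$layers) {
  cfg <- layer$get_config()
  if (!is.null(cfg$activation)) cat(layer$name, "->", cfg$activation, "\n")
}
As for the second question: settings such as the number of epochs and the batch size are arguments to fit(), not properties of the model, so they are not stored in the .h5 file and cannot be recovered from it.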
Problem:
I have a Random Forest model trained in R. I need to deploy this model in a standalone Excel tool that will be used by 350 people across a sales network to perform real-time predictions based on data entered into the spreadsheet by users.
How can I do this?
Constraints:
It is not an option to require users to install R on their local machines.
It is not an option to have a server (physical or cloud) providing a scoring API.
What have I done so far?
1. PMML
I can export the model in PMML (an XML structure). From my research I can see there are libraries for loading and executing PMML inputs in Python and Java. However, I haven't found anything implemented in VBA / VB.
2. Zementis
I looked into a solution called Zementis which offers an Excel add-in to deploy PMML models. However from my understanding this requires web-service calls to a cloud server (e.g. AWS) where the actual model execution happens. My IT security department will not allow this.
3. Others
The most common recommendation seems to be to call R to load the model and run the predict function. As noted above, this is not a viable option.
Detailed Context:
The Random Forest model is trained in R, with c. 30 variables. The model is used to recommend "personalised" prices for products as part of a sales process.
The model needs to be distributed to the sales network, with about 350 users. The business's preference is to integrate the model into an existing spreadsheet tool that sales teams currently use to calculate deal profitability.
This means that I need to be able to export the model in a way that it can be implemented in Excel VBA.
Given timescales, the implementation needs to be self-contained with no IT infrastructure or additional application installs. We are working with the organisation's IT team on a server based solution, however their deployment timescales are 12 months+ which means we need a tactical solution in the short-term.
Here's one approach to get the "rules" for the trees (an example using the mtcars dataset):
install.packages("randomForest")
library(randomForest)
head(mtcars)
set.seed(1)
fit <- randomForest(mpg ~ ., data=mtcars, importance=TRUE, proximity=TRUE)
print(fit)
## Look at variable importance:
importance(fit)
# Print the rules for each tree in the forest
install.packages("rattle")
library(rattle)
printRandomForests(fit)
It is probably unrealistic to use the rules for all 500 trees, but maybe you could implement 100 trees in your VBA and then take an average of the results (for a continuous response) or predict the class with the most votes across the trees (for a categorical response).
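An alternative to parsing the printed rules is to export each tree's split table programmatically: every row of randomForest::getTree() is a node with its split variable, split point, and child-node pointers, which maps naturally onto a lookup table a VBA routine could traverse. A minimal sketch:
library(randomForest)
set.seed(1)
fit <- randomForest(mpg ~ ., data = mtcars, ntree = 100)
tree1 <- getTree(fit, k = 1, labelVar = TRUE)      # split table for tree 1
write.csv(tree1, "tree1.csv", row.names = FALSE)   # export one CSV per tree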
Maybe you could recreate the model on a worksheet.
As far as I know, Excel can import XML structures (via the Developer tab on the ribbon).
Edit: 1) Save the PMML structure from a plain-text editor as an .xml file.
2) Open the file in Excel 2013 (other versions may also work).
3) Click through the error message and open the file anyway. The trees open as a table, which looks a bit odd, but is recognizable.
4) Create a prediction calculation (a generic function in VBA) to operate on the tree.
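For step 1, the PMML itself can be produced directly from R. A minimal sketch (assuming the pmml package's randomForest support, which can vary by package version):
library(randomForest)
library(pmml)   # PMML exporter
library(XML)
set.seed(1)
fit <- randomForest(mpg ~ ., data = mtcars, ntree = 100)
# Convert the forest to a PMML document and write the .xml file
# that step 2 above opens in Excel:
saveXML(pmml(fit), file = "fit.xml")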