How to save machine learning models in R

I am using R to create some basic machine learning models with the klaR, caret and e1071 packages. Here is the code that generates my model:
library(e1071)
library(klaR)
library(caret)
x = iris[,-5]
y = iris$Species
model = train(x,y,'nb',trControl = trainControl(method='cv',number=10))
I was wondering whether it is possible to save this model somewhere and reference it later. For example, in Python we can use the pickle package:
nbClassifier = nltk.NaiveBayesClassifier.train(featureSets)
saveNBClassifier = open("abtNBClassifier.pickle","wb")
pickle.dump(nbClassifier, saveNBClassifier)
saveNBClassifier.close()
and later
open_file = open("abtNBClassifier.pickle", "rb")
classifier = pickle.load(open_file)
open_file.close()
Is something similar possible in R?

If you only want to save a single object, you can also use:
saveRDS(model, file = "model.rds")
Afterwards you can use
loadedModel <- readRDS("model.rds")
readRDS() does not restore the object under the name it had when you saved it; you assign the result to whatever name you like.
For more information on the difference between save() and saveRDS() see this link
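A minimal sketch of the saveRDS()/readRDS() round trip, using only base R. The lm() fit on the built-in mtcars data is a stand-in for the caret model in the question, and the temporary file path is illustrative.

```r
# Fit a small model on a built-in dataset (stand-in for the caret model).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Write the single object to a temporary file; the path is illustrative.
path <- tempfile(fileext = ".rds")
saveRDS(fit, file = path)

# Read it back and bind it to any name you like.
restored <- readRDS(path)

# Check that the restored model predicts like the original one.
same <- isTRUE(all.equal(predict(fit, mtcars), predict(restored, mtcars)))
print(same)
```

Because readRDS() returns the object instead of restoring it into the workspace, it cannot silently overwrite an existing variable.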

Yes, you can just use:
save(model, file="model.Rda")
and later:
load("model.Rda")

Related

How to save a ranger model in mlr3 without data?

I have created a ranger model using the mlr3 library and saved it to my machine with the following commands. The resulting file is huge because it contains the data along with the model. Is there a way to save only the model, without the data?
learner_ranger = lrn("classif.ranger", predict_type = "prob", predict_sets = c("train", "test"), importance = "impurity", num.threads = 8)
learner_ranger$train(train_task)
save(learner_ranger, file = "model.rda")
When I try to load this saved model it does not load the model correctly.
learner_ranger = load("model.rda")
str(learner_ranger)
Error in str(learner_ranger) : object 'learner_ranger' not found
To reduce the file size and save only the model, I tried the following, but I get an error:
save(learner_ranger$model, file = "model.rda")
The error I am getting is:
Error in save(learner_ranger$model, file = "model.rda") :
object ‘learner_randomF$model’ not found
After some research I found that there are two ways to save and load a model in R:
using save() and load(): an object written with save() is restored by load() under the same name it was saved with.
using saveRDS() and readRDS(): saveRDS() does not store the object's name, so the object can be assigned to any name when it is read back. However, saveRDS() can only save one object at a time, as it is a lower-level function.
Many people prefer saveRDS() over save() because the saved object is not tied to a fixed name.
I am still looking for a way to save the model without the data.
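The error from save(learner_ranger$model, ...) can be reproduced in base R. The sketch below uses a plain list named learner as a hypothetical stand-in for the mlr3 learner; it only illustrates how save() and saveRDS() treat their arguments and does not address what the ranger model itself contains.

```r
# Hypothetical stand-in for a trained learner: a list with a `model` component.
learner <- list(model = lm(mpg ~ wt, data = mtcars))
rda_path <- tempfile(fileext = ".rda")

# save() takes object names, so an expression such as learner$model fails:
# it looks for an object literally named "learner$model".
msg <- tryCatch(
  {
    save(learner$model, file = rda_path)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Binding the component to a name first makes save() work.
model_only <- learner$model
save(model_only, file = rda_path)

# saveRDS() accepts any expression, so no intermediate name is needed.
rds_path <- tempfile(fileext = ".rds")
saveRDS(learner$model, file = rds_path)
restored <- readRDS(rds_path)
```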

Save the Deep learning model by R language

I have built a deep learning model with the h2o package in R. The model performs well and I want to save it, but every method I have tried has failed. Base R provides save() and save.image(), so I used save() to store my model. However, when I load the saved file and try to run the model on new data, I am told that the model object is not found. I have been stuck on this problem for a few days. If you have any good ideas, please let me know. Thanks for reading.
load("F:/R/Rstudy/myfile") ## load the saved file
library(h2o)
h2o.init()
Te <- read.csv("F:/Rdata/Test.csv") ## import testing data
Te <- as.h2o(Te)
Te[,2] <- as.factor(Te[,2])
perf <- h2o.performance(model, Te) ## test model
ERROR: Unexpected HTTP Status code: 404 Not Found (url = http://localhost:54321/3/ModelMetrics/models/DeepLearning_model_R_1533035975237_1/frames/RTMP_sid_8185_2)
ERROR MESSAGE:
Object 'DeepLearning_model_R_1533035975237_1' not found in function: predict for argument: model
You can use the following to save and retrieve the model. save() only stores the R-side handle; the model itself lives in the H2O cluster, so it is gone once the cluster has been restarted.
Build the model:
model <- h2o.deeplearning(params)
Save the model:
model_path <- h2o.saveModel(object=model, path=getwd(), force=TRUE)
print(model_path)
/tmp/mymodel/DeepLearning_model_R_1441838096933
Load the model:
saved_model <- h2o.loadModel(model_path)
Reference - http://docs.h2o.ai/h2o/latest-stable/h2o-docs/save-and-load-model.html
Hope this helps,
ND

Load classification tree model on new raster data set

I am working with a classification tree model that allows me to classify raster images (to create maps). I would like to reapply this classification tree to a new raster dataset, in another script.
I tried to save the model as an .rda or .RData file, but in both cases it is impossible to predict with the model on the new dataset:
no applicable method for 'predict' applied to an object of class "character"
I'm pretty sure the problem lies in how the model is saved, since the loaded object is not the complete model:
summary(model)
# Length Class Mode
# 1 character character
I'm saving my model with this code:
save(cuttreedown.training, file = "Artmac.rda")
and loading it with:
model <- load("Artmac.rda")
This is my predict call, where satImage is my raster file (the same call works well in another script):
predict(satImage, model, filename="test.tif", progress='text', datatype='INT1U', type='prob', overwrite=TRUE)
I just want to apply my model on a new raster dataset as it is done in the training script.
You can use saveRDS/readRDS for such a task. Here is a full example:
library(rpart)
data(iris)
fit <- rpart(Species~., data = iris)
saveRDS(fit, "fit.rds")
load_fit <- readRDS("fit.rds")
all.equal(predict(load_fit, iris),
predict(fit, iris))
#output
TRUE
Simply save and load the model without an extension. Also, don't assign the result of load() to a new object: load() restores the model into the environment under its original name and only returns that name as a character string. Try:
save(cuttreedown.training, file="Artmac")
load("Artmac")
summary(cuttreedown.training)
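The behaviour of load() can be checked with a small base R sketch. The lm() fit on iris is a stand-in for the classification tree, and the temporary file path is illustrative.

```r
# Stand-in model; any R object behaves the same way with save()/load().
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
path <- tempfile(fileext = ".rda")
save(fit, file = path)
rm(fit)

# load() restores the object under its original name ("fit") and
# returns the names of the restored objects as a character vector.
result <- load(path)
print(class(result))  # "character"
print(result)         # "fit"

# Predict with the restored object, not with the return value of load().
pred <- predict(fit, iris)
```

Calling predict(result, ...) here would reproduce the "no applicable method for 'predict' applied to an object of class "character"" error from the question.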

How to save a model when using MXnet

I am using MXnet for training a CNN (in R) and I can train the model without any error with the following code:
model <- mx.model.FeedForward.create(symbol=network,
                                     X=train.iter,
                                     ctx=mx.gpu(0),
                                     num.round=20,
                                     array.batch.size=batch.size,
                                     learning.rate=0.1,
                                     momentum=0.1,
                                     eval.metric=mx.metric.accuracy,
                                     wd=0.001,
                                     batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
But as this process is time-consuming, I run it on a server during the night and I want to save the model for the purpose of using it after finishing the training.
I used:
save(list = ls(), file="mymodel.RData")
and
mx.model.save("mymodel", 10)
But neither of them saves the model! For example, when I load "mymodel.RData", I cannot predict the labels for the test set.
Another example is when I load the "mymodel.RData" and try to plot it with the following code:
graph.viz(model$symbol$as.json())
I get the following error:
Error in model$symbol$as.json() : external pointer is not valid
Can anybody give me a solution for saving and then loading this model for future use?
Thanks
You can save the model by adding a checkpoint callback to the training call:
model <- mx.model.FeedForward.create(symbol=network,
                                     X=train.iter,
                                     ctx=mx.gpu(0),
                                     num.round=20,
                                     array.batch.size=batch.size,
                                     learning.rate=0.1,
                                     momentum=0.1,
                                     eval.metric=mx.metric.accuracy,
                                     wd=0.001,
                                     epoch.end.callback=mx.callback.save.checkpoint("model_prefix"),
                                     batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
An mxnet model is an R list, but its first component is not an R object. It is a C++ pointer, which can't be saved and reloaded as an R object. Therefore, the model needs to be serialized to behave as an actual R object. The serialized object is also a list, but its first component is a text string containing the model information.
To save a model:
modelR <- mx.serialize(model)
save(modelR, file="~/model1.RData")
To retrieve it and use it again:
load("~/model1.RData", verbose=TRUE)
model <- mx.unserialize(modelR)
The best practice for saving a snapshot of your training progress is to use save_snapshot (http://mxnet.io/api/python/module.html#mxnet.module.Module.save_checkpoint) as part of the callback after every training epoch. In R the equivalent command is probably mx.callback.save.checkpoint, but I don't use R and am not sure about its usage.
Using these snapshots also lets you take advantage of the low-cost AWS Spot market (https://aws.amazon.com/ec2/spot/pricing/ ), which for example now offers an instance with 16 K80 GPUs for $3.8/hour compared to the on-demand price of $14.4. Such 80%-90% discounts are common in the spot market and can optimize the speed and cost of your training, as long as you use these snapshots correctly.

Prediction using saved model object

I am trying to use predict function in R using a model saved earlier. The model was created and saved using the following code:
lrModel1 <- glm(response ~ .,data = modelData,family = binomial,model = TRUE)
save(lrModel1,file = "lrModel100.rda")
When I load the model for later use and call the predict function on it as follows:
bar <- load("lrModel100.rda")
predicted <- predict(bar,validationData,type = "response")
I get the following error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "character"
Is there a way to get the model object name from the saved RDA file and use it for prediction?
Thank you.
Ravi
As @droopy told you, the model's name doesn't change when you save and load it. load() only returns that name as a character string, so you can use get() to retrieve the model:
predicted <- predict(get(bar),validationData,type = "response")
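To get the object out of an .rda file without knowing its name, one option is to load it into a private environment. The helper load_single_object() below is hypothetical (it is not part of base R), and the glm() fit on mtcars is a stand-in for lrModel1 from the question.

```r
# Hypothetical helper: load an .rda file into a private environment and
# return the single object it contains, whatever name it was saved under.
load_single_object <- function(path) {
  env <- new.env()
  object_names <- load(path, envir = env)
  stopifnot(length(object_names) == 1L)
  get(object_names, envir = env)
}

# Stand-in for the logistic regression model in the question.
lrModel1 <- glm(am ~ wt, data = mtcars, family = binomial)
path <- tempfile(fileext = ".rda")
save(lrModel1, file = path)

# The result is the model itself, so predict() dispatches correctly.
bar <- load_single_object(path)
predicted <- predict(bar, mtcars, type = "response")
```

Loading into a separate environment also avoids overwriting an object of the same name in the current workspace.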
If you saved the model in an earlier session, predict may throw this error.
Reload the package with library(glmnet) and make sure that the number of variables in X and Y is the same.
I had the same problem before.
I used caret to build a model and saved it as an .rds file with saveRDS().
When I read the file back with readRDS() and used the model to predict, I ran into this problem.
After I ran library(caret), the problem was solved.
So I think that if you save a model and later reload it to predict, you have to load the package you used to build the model again.
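One way to make this harder to forget is to store the names of the required packages next to the model. This is only a sketch of a possible convention, not an established API: the bundle layout with model and packages fields is an assumption, and lm() from the stats package stands in for a caret or glmnet model.

```r
# Hypothetical convention: bundle the model with the packages it needs.
fit <- lm(mpg ~ wt, data = mtcars)
bundle <- list(model = fit, packages = c("stats"))

path <- tempfile(fileext = ".rds")
saveRDS(bundle, file = path)

# In a later session: read the bundle and attach its packages first.
restored <- readRDS(path)
for (pkg in restored$packages) {
  # character.only = TRUE lets library() take the name from a variable.
  library(pkg, character.only = TRUE)
}

# With the packages attached, the right predict method is available.
predicted <- predict(restored$model, newdata = mtcars)
```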
