I have created a ranger model using the mlr3 library. I saved this model to my machine using the following commands. The created file is huge, because it contains the data along with the model. Is there a way to save only the model, without the data?
learner_ranger = lrn("classif.ranger", predict_type = "prob", predict_sets = c("train", "test"), importance = "impurity", num.threads = 8)
learner_ranger$train(train_task)
save(learner_ranger, file = "model.rda")
When I try to load this saved model it does not load the model correctly.
learner_ranger = load("model.rda")
str(learner_ranger)
Error in str(learner_ranger) : object 'learner_ranger' not found
To reduce the file size and save only the model, I tried the following, but I get an error:
save(learner_ranger$model, file = "model.rda")
The error I get is:
Error in save(learner_ranger$model, file = "model.rda") :
object ‘learner_randomF$model’ not found
After some research, I found that there are two ways to save and load a model in R:
using save(), load(): When we use save(), load() restores the object under the same name it was saved with.
using saveRDS(), readRDS(): saveRDS() does not store the object name, so we can assign the loaded model to any name we like. But saveRDS() can only save one object at a time, as it is a lower-level function.
Many people prefer saveRDS() over save() because it serializes a single object and leaves the naming to whoever loads it.
I am still looking for ways to save the model without data.
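A minimal sketch of the two approaches above, using a toy lm fit on the built-in mtcars data as a stand-in for the ranger learner (the names learner, fitted_model and model_copy are hypothetical). It also shows that save() expects object names rather than expressions, which is why save(learner_ranger$model, ...) fails; it does not address how much data the model object itself carries.

```r
# Toy stand-in for a trained learner: a list that holds a fitted model.
learner <- list(model = lm(mpg ~ wt, data = mtcars))

# save() stores objects by name, so the result of an expression such as
# learner$model has to be assigned to a variable first.
fitted_model <- learner$model
rda_path <- tempfile(fileext = ".rda")
save(fitted_model, file = rda_path)

rm(fitted_model)
restored_names <- load(rda_path)  # restores the object, returns its name
print(restored_names)

# saveRDS() takes any single R value, and readRDS() returns it,
# so the loaded model can be bound to whatever name is convenient.
rds_path <- tempfile(fileext = ".rds")
saveRDS(learner$model, rds_path)
model_copy <- readRDS(rds_path)
print(class(model_copy))
```

Note that load() is called for its side effect here; its return value is only the character vector of restored names.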
Related
I've been able to successfully generate a model to assist in my multilabel assignment using the mlr library using the following setup,
scene.task = makeMultilabelTask(data = training.data, target = labels)
lrn.br = makeLearner("classif.rpart", predict.type = "prob")
lrn.br = makeMultilabelNestedStackingWrapper(lrn.br)
mdl.lrn <- train(lrn.br, scene.task)
I went about saving my model using the following line,
save(mdl.lrn, file = "mdl.rda")
The model that is saved is just over 1GB in size.
I have attempted to load that saved model using the following line,
load("mdl.rda")
However, I continually run out of memory space as I receive messaging that reads, Error: cannot allocate vector of size ### Kb. I believe that my current amount of memory would allow for the loading of the model as it was enough to generate the model.
Has anyone been able to save and load an mlr generated model for reuse? If so, would you be able to recommend or confirm the approach of saving and loading a model?
I am working with a classification tree model that allows me to classify raster images (to create maps). I would like to reapply this classification tree to a new raster dataset, in another script.
I tried to save the model as an .rda or .RData file, but I have the same problem each time: it is impossible to apply the model to the new dataset.
no applicable method for 'predict' applied to an object of class "character"
I'm pretty sure the problem is due to the save function of the model since it does not appear entirely:
summary(model)
# Length Class Mode
# 1 character character
I'm saving my model with this code:
save(cuttreedown.training, file = "Artmac.rda")
and load it with:
model <- load("Artmac.rda")
And this is my predict function where satImage is my raster file (the function is working well on another script) :
predict(satImage, model, filename="test.tif", progress='text', datatype='INT1U', type='prob', overwrite=TRUE)
I just want to apply my model on a new raster dataset as it is done in the training script.
You can use saveRDS/readRDS for such a task. Here is a full example:
library(rpart)
data(iris)
fit <- rpart(Species~., data = iris)
saveRDS(fit, "fit.rds")
load_fit <- readRDS("fit.rds")
all.equal(predict(load_fit, iris),
predict(fit, iris))
#output
TRUE
Simply save and load the model without an extension. Also, don't assign the result of load() to a new object: load() restores the saved object into the environment under its original name and returns only that name as a character string. Try:
save(cuttreedown.training, file="Artmac")
load("Artmac")
summary(cuttreedown.training)
I am using MXnet for training a CNN (in R) and I can train the model without any error with the following code:
model <- mx.model.FeedForward.create(symbol=network,
X=train.iter,
ctx=mx.gpu(0),
num.round=20,
array.batch.size=batch.size,
learning.rate=0.1,
momentum=0.1,
eval.metric=mx.metric.accuracy,
wd=0.001,
batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
But as this process is time-consuming, I run it on a server during the night and I want to save the model for the purpose of using it after finishing the training.
I used:
save(list = ls(), file="mymodel.RData")
and
mx.model.save("mymodel", 10)
But neither of them saves the model. For example, when I load "mymodel.RData", I cannot predict the labels for the test set.
Another example is when I load the "mymodel.RData" and try to plot it with the following code:
graph.viz(model$symbol$as.json())
I get the following error:
Error in model$symbol$as.json() : external pointer is not valid
Can anybody give me a solution for saving and then loading this model for future use?
Thanks
You can save the model by adding a checkpoint callback to the training call:
model <- mx.model.FeedForward.create(symbol=network,
X=train.iter,
ctx=mx.gpu(0),
num.round=20,
array.batch.size=batch.size,
learning.rate=0.1,
momentum=0.1,
eval.metric=mx.metric.accuracy,
wd=0.001,
epoch.end.callback=mx.callback.save.checkpoint("model_prefix"),
batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
An mxnet model is an R list, but its first component is not an R object but a C++ pointer, so it can't be saved and reloaded as an R object. Therefore, the model needs to be serialized to behave as an actual R object. The serialized object is also a list, but its first component is a text string containing the model information.
To save a model:
modelR <- mx.serialize(model)
save(modelR, file="~/model1.RData")
To retrieve it and use it again:
load("~/model1.RData", verbose=TRUE)
model <- mx.unserialize(modelR)
The best practice for saving a snapshot of your training progress is to use save_checkpoint (http://mxnet.io/api/python/module.html#mxnet.module.Module.save_checkpoint) as part of the callback after every training epoch. In R the equivalent command is probably mx.callback.save.checkpoint, but I'm not using R and am not sure about its usage.
Using these snapshots also lets you take advantage of the low-cost AWS Spot market (https://aws.amazon.com/ec2/spot/pricing/ ), which for example now offers an instance with 16 K80 GPUs for $3.8/hour compared to the on-demand price of $14.4. Discounts of 80%-90% are common in the spot market and can optimize the speed and cost of your training, as long as you use these snapshots correctly.
I am using R to create some basic machine learning models. I use the klaR, caret and e1071 packages. Here is the code that generates my model:
library(e1071)
library(klaR)
library(caret)
x = iris[,-5]
y = iris$Species
model = train(x,y,'nb',trControl = trainControl(method='cv',number=10))
I was wondering, is it possible to save this model somewhere and reference it later? For example, in Python we can use the pickle package to do
nbClassifier = nltk.NaiveBayesClassifier.train(featureSets)
saveNBClassifier = open("abtNBClassifier.pickle","wb")
pickle.dump(nbClassifier, saveNBClassifier)
saveNBClassifier.close()
and later
open_file = open("abtNBClassifier.pickle", "rb")
classifier = pickle.load(open_file)
open_file.close()
Is something similar possible in R?
If you only want to save a single object, you can also use:
saveRDS(model, file = "model.rds")
Afterwards you can use
loadedModel <- readRDS("model.rds")
readRDS() does not restore the object under the name it had when you saved it; it returns the object, so you can assign it to a new name.
For more information on the difference between save() and saveRDS() see this link
Yes you can just use:
save(model, file="model.Rda")
and later:
load("model.Rda")
I am trying to use predict function in R using a model saved earlier. The model was created and saved using the following code:
lrModel1 <- glm(response ~ .,data = modelData,family = binomial,model = TRUE)
save(lrModel1,file = "lrModel100.rda")
When I load the model later and try to use the predict function on it as follows:
bar <- load("lrModel100.rda")
predicted <- predict(bar,validationData,type = "response")
I get the following error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "character"
Is there a way to get the model object name from the saved RDA file and use it for prediction?
Thank you.
Ravi
As @droopy told you, the model's name doesn't change when you save and load it; load() only returns that name as a character string. You can use get() to retrieve the model by name:
predicted <- predict(get(bar),validationData,type = "response")
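A minimal sketch of why this works, assuming a toy glm fit on the built-in mtcars data in place of lrModel1 (the names fit, loaded_names and restored are hypothetical):

```r
# Fit and save a small logistic regression as a stand-in for the real model.
fit <- glm(vs ~ mpg, data = mtcars, family = binomial)
path <- tempfile(fileext = ".rda")
save(fit, file = path)
rm(fit)

# load() restores `fit` and returns the restored names as a character vector,
# which is why calling predict() on the return value fails.
loaded_names <- load(path)
print(class(loaded_names))

# get() looks the object up by name, giving back the actual glm object.
restored <- get(loaded_names[1])
pred <- predict(restored, newdata = mtcars, type = "response")
print(head(pred, 3))
```

If the file holds more than one object, loaded_names has more than one element, so picking the first one is only safe when a single model was saved.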
If you saved the model in an earlier session, it may throw this error.
Reload the package with library(glmnet) and make sure that the number of variables in X and Y is the same.
I had the same problem before.
I used caret to build a model and saved it as an .rds file (saveRDS).
When I read the file back with readRDS and used the model to predict, I encountered this problem.
After I ran library(caret), my problem was solved.
So I think that if you save your model and later reload it to predict, you have to reload the package you used for building the model.