Load classification tree model on new raster data set - r

I am working with a classification tree model that allows me to classify raster images (to create maps). I would like to reapply this classification tree to a new raster dataset, in another script.
I tried saving the model as an .rda or .RData file, but in both cases I cannot predict with the model on the new dataset:
no applicable method for 'predict' applied to an object of class "character"
I'm pretty sure the problem comes from how the model is saved, since the loaded object does not contain the full model:
summary(model)
# Length Class Mode
# 1 character character
I'm saving my model with this code :
save(cuttreedown.training, file = "Artmac.rda")
and load it with :
model <- load("Artmac.rda")
And this is my predict function where satImage is my raster file (the function is working well on another script) :
predict(satImage, model, filename="test.tif", progress='text', datatype='INT1U', type='prob', overwrite=TRUE)
I just want to apply my model on a new raster dataset as it is done in the training script.

You can use saveRDS/readRDS for such a task. Here is a full example:
library(rpart)
data(iris)
fit <- rpart(Species~., data = iris)
saveRDS(fit, "fit.rds")
load_fit <- readRDS("fit.rds")
all.equal(predict(load_fit, iris),
predict(fit, iris))
#output
TRUE

Save and load the model as below (the file extension does not matter). Also, don't assign the result of load() to a new object: load() restores the model into the environment under the name it was saved with, here cuttreedown.training. Try:
save(cuttreedown.training, file="Artmac")
load("Artmac")
summary(cuttreedown.training)
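If the restored name is not known in the second script, one option is to load the file into a scratch environment and pull the object out of it. The sketch below is only an illustration, not the asker's workflow: `load_single()` is a hypothetical helper, and an `lm` fit on the built-in `cars` data stands in for the classification tree.

```r
# Hypothetical helper: load an .rda/.RData file that holds exactly one object
# and return that object, instead of the character name that load() returns.
load_single <- function(path) {
  env <- new.env()
  obj_names <- load(path, envir = env)   # load() returns the restored names
  if (length(obj_names) != 1L) {
    stop("expected exactly one object in ", path)
  }
  env[[obj_names]]
}

# Stand-in model (assumption: any fitted model object behaves the same way)
cuttreedown.training <- lm(dist ~ speed, data = cars)
rda_path <- file.path(tempdir(), "Artmac.rda")
save(cuttreedown.training, file = rda_path)

model <- load_single(rda_path)
class(model)                           # a fitted model, not "character"
head(predict(model, newdata = cars))
```

In the raster script, the object returned by such a helper would be what gets passed to predict(satImage, model, ...), rather than the return value of load().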

Related

How to save a ranger model in mlr3 without data?

I have created a ranger model using the mlr3 library. I saved this model to my machine using the following command. The created file is huge because it contains the data along with the model. Is there a way to save only the model, without the data?
learner_ranger = lrn("classif.ranger", predict_type = "prob", predict_sets = c("train", "test"), importance = "impurity", num.threads = 8)
learner_ranger$train(train_task)
save(learner_ranger, file = "model.rda")
When I try to load this saved model it does not load the model correctly.
learner_ranger = load("model.rda")
str(learner_ranger)
Error in str(learner_ranger) : object 'learner_ranger' not found
To reduce the file size and save only the model, I tried the following, but I am getting an error:
save(learner_ranger$model, file = "model.rda")
The error I am getting is -
Error in save(learner_ranger$model, file = "model.rda") :
object ‘learner_randomF$model’ not found
After some research I found that there are two ways to save and load a model in R:
using save(), load(): When we use save(), the object is restored under the same name when it is loaded.
using saveRDS(), readRDS(): saveRDS() does not save the object name, so we can load the model under any other name. But saveRDS() can only save one object at a time, as it is a lower-level function.
Many people prefer saveRDS() over save() because it serializes a single object without tying it to a name.
I am still looking for ways to save the model without data.
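The save() error above arises because save() expects object names, not expressions such as learner_ranger$model. The sketch below illustrates this with a plain list standing in for the learner; `learner`, `model_only`, and `model_component` are hypothetical names, and whether the bare `$model` of an mlr3 learner is enough for prediction through mlr3 is not confirmed here.

```r
# Stand-in for a learner object: a list holding a fitted model plus extra data.
# (Assumption: this only mimics the shape of the problem, not mlr3 itself.)
learner <- list(
  model = lm(dist ~ speed, data = cars),
  data  = cars
)

# save() takes object *names*, so an expression like learner$model fails.
res <- tryCatch(
  save(learner$model, file = file.path(tempdir(), "model.rda")),
  error = function(e) conditionMessage(e)
)
print(res)

# saveRDS() takes a value, so the component can be passed directly.
rds_path <- file.path(tempdir(), "model_only.rds")
saveRDS(learner$model, rds_path)
model_only <- readRDS(rds_path)

# Alternatively, bind the component to a name first and use save().
model_component <- learner$model
rda_path <- file.path(tempdir(), "model_only.rda")
save(model_component, file = rda_path)
```

This only addresses the error message; it does not by itself show how much smaller the saved file becomes.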

How to save a model when using MXnet

I am using MXnet for training a CNN (in R) and I can train the model without any error with the following code:
model <- mx.model.FeedForward.create(symbol=network,
X=train.iter,
ctx=mx.gpu(0),
num.round=20,
array.batch.size=batch.size,
learning.rate=0.1,
momentum=0.1,
eval.metric=mx.metric.accuracy,
wd=0.001,
batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
But as this process is time-consuming, I run it on a server during the night and I want to save the model for the purpose of using it after finishing the training.
I used:
save(list = ls(), file="mymodel.RData")
and
mx.model.save("mymodel", 10)
But neither of them saves the model! For example, when I load "mymodel.RData", I cannot predict the labels for the test set!
Another example is when I load the "mymodel.RData" and try to plot it with the following code:
graph.viz(model$symbol$as.json())
I get the following error:
Error in model$symbol$as.json() : external pointer is not valid
Can anybody give me a solution for saving and then loading this model for future use?
Thanks
You can save the model by
model <- mx.model.FeedForward.create(symbol=network,
X=train.iter,
ctx=mx.gpu(0),
num.round=20,
array.batch.size=batch.size,
learning.rate=0.1,
momentum=0.1,
eval.metric=mx.metric.accuracy,
wd=0.001,
epoch.end.callback=mx.callback.save.checkpoint("model_prefix"),
batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
An mxnet model is an R list, but its first component is not an R object but a C++ pointer, so it can't be saved and reloaded as an R object. Therefore, the model needs to be serialized to behave as an actual R object. The serialized object is also a list, but its first element is a text string containing the model information.
To save a model:
modelR <- mx.serialize(model)
save(modelR, file="~/model1.RData")
To retrieve it and use it again:
load("~/model1.RData", verbose=TRUE)
model <- mx.unserialize(modelR)
The best practice for saving a snapshot of your training progress is to use save_checkpoint (http://mxnet.io/api/python/module.html#mxnet.module.Module.save_checkpoint) as part of the callback after every training epoch. In R the equivalent command is probably mx.callback.save.checkpoint, but I'm not using R and am not sure about its usage.
Using these snapshots can also allow you to take advantage of the low-cost option of the AWS Spot market (https://aws.amazon.com/ec2/spot/pricing/ ), which for example now offers an instance with 16 K80 GPUs for $3.8/hour compared to the on-demand price of $14.4. Such an 80%-90% discount is common in the spot market and can optimize the speed and cost of your training, as long as you use these snapshots correctly.

How to save Machine Learning models in R

I am using R to create some basic machine learning models. I use the klaR, caret and e1071 packages. Here is the code that generates my model:
library(e1071)
library(klaR)
library(caret)
x = iris[,-5]
y = iris$Species
model = train(x,y,'nb',trControl = trainControl(method='cv',number=10))
I was wondering, is it possible to save this model somewhere and reference it later? For example, in Python we can use the pickle package to do
nbClassifier = nltk.NaiveBayesClassifier.train(featureSets)
saveNBClassifier = open("abtNBClassifier.pickle","wb")
pickle.dump(nbClassifier, saveNBClassifier)
saveNBClassifier.close()
and later
open_file = open("abtNBClassifier.pickle", "rb")
classifier = pickle.load(open_file)
open_file.close()
Is something similar possible in R?
If you only want to save a single object, you can also use:
saveRDS(model, file = "model.rds")
Afterwards you can use
loadedModel <- readRDS("model.rds")
readRDS() does not restore the object under the name it had when you saved it; instead, you assign the result to any name you like.
For more information on the difference between save() and saveRDS() see this link
Yes you can just use:
save(model, file="model.Rda")
and later:
load("model.Rda")

Why does load(...) return character name of object instead of the object itself?

The svm model is created with the package e1071 in R. To use the model, I need to save it and read as needed. The package has write.svm, but does not have read.svm. If I use
model <- svm(x, y)
save(model, file = 'modelfile.rdata')
M <- load('modelfile.rdata')
object M contains just the word 'model'.
How to save the svm model and read back later, to apply to some new data?
Look at the return value for the function load in the help file:
Value:
A character vector of the names of objects created, invisibly.
So "model" is indeed the expected value of M. Your svm has been restored under its original name, which is model.
If you find it a bit confusing that load does not return the object loaded but instead restores it under the name used in saving it, consider using saveRDS and readRDS.
saveRDS(model, 'modelfile.rds')
M <- readRDS('modelfile.rds')
and M should contain your svm model.
I prefer saveRDS and readRDS because with them I know what objects I'm creating in my workspace - see the blog post of Gavin Simpson (linked in his answer) for a detailed discussion.
You misunderstand what load does. It restores an object to the same name it had when you save()d it. What you are seeing in M is the return value of the load() function. Calling load() has the additional side effect of loading the object back under the same name that it was saved with.
Consider:
require("e1071")
data(iris)
## classification mode
# default with factor response:
model <- svm (Species~., data=iris)
## Save it
save(model, file = "my-svm.RData")
## delete model
rm(model)
## load the model
M <- load("my-svm.RData")
Now look at the workspace
> ls()
[1] "iris" "M" "model"
Hence model was restored as a side effect of load().
From ?load we see the reason M contains the name of the objects created (and hence saved originally)
Value:
A character vector of the names of objects created, invisibly.
If you want to restore an object to a new name, use saveRDS() and readRDS():
saveRDS(model, "svm-model.rds")
newModel <- readRDS( "svm-model.rds")
> ls()
[1] "iris" "M" "model" "newModel"
If you want to know more about saveRDS() and readRDS(), see the relevant help page, ?saveRDS, and you might be interested in a blog post I wrote on this topic.

Prediction using saved model object

I am trying to use predict function in R using a model saved earlier. The model was created and saved using the following code:
lrModel1 <- glm(response ~ .,data = modelData,family = binomial,model = TRUE)
save(lrModel1,file = "lrModel100.rda")
When I load the model for later use and try to use the predict function on it as follows:
bar <- load("lrModel100.rda")
predicted <- predict(bar,validationData,type = "response")
I get the following error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "character"
Is there a way to get the model object name from the saved RDA file and use it for prediction?
Thank you.
Ravi
As @droopy told you, the model's name doesn't change when you save and load it. You can use get() to retrieve the model by that name:
predicted <- predict(get(bar),validationData,type = "response")
If you saved the model earlier, it may throw this error.
Reload library(glmnet) and make sure that the number of variables in X and Y is the same.
I had the same problem before.
I used caret to build a model and saved it as an .rds file with saveRDS().
When I read the file back with readRDS() and used the model to predict, I encountered this problem.
After calling library(caret), my problem was solved.
So I think that if you save your model and later reload it to predict, you have to load the package you used to build the model.
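The two causes raised in this thread (passing the character vector returned by load(), or not having the modelling package loaded) both surface as a missing predict method. A rough way to tell whether R can currently find one is sketched below; `can_predict()` is a hypothetical helper, and a `glm` fit on the built-in `mtcars` data stands in for the saved model.

```r
# Hypothetical helper: report whether R can currently find an S3 predict()
# method for any of the classes of `x`.
can_predict <- function(x) {
  methods_found <- vapply(
    class(x),
    function(cl) !is.null(getS3method("predict", cl, optional = TRUE)),
    logical(1)
  )
  any(methods_found)
}

fit <- glm(am ~ wt, data = mtcars, family = binomial)
rda_path <- file.path(tempdir(), "lrModel100.rda")
save(fit, file = rda_path)

bar <- load(rda_path)      # bar is the character vector "fit"
can_predict(bar)           # no predict method for class "character"
can_predict(get(bar))      # predict.glm is provided by the stats package
```

For a model built with another package, the same check would be expected to fail until that package's namespace is loaded, which matches the library(caret) fix described above.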
