TensorFlow: Do preprocessing operations get frozen in a graph as well?

I believe that, after training, the model saved to the checkpoint does not contain any preprocessing operations: when I examine the checkpoint, the available operations start from the model input, not from the preprocessing operations that precede it.
However, when freezing a graph restored from a checkpoint file, where the graph has additional preprocessing operations, do the preprocessing operations get frozen as well? I have included a preprocessing operation for test time in the graph and intend to freeze the graph together with the checkpoint model, but the results vary a lot between these two scenarios:
Put the raw image through a frozen graph that includes the preprocessing operations --> very poor accuracy, as if no preprocessing was done.
Preprocess the image first, then put the preprocessed image through a frozen graph that does not include any preprocessing operation --> works as expected, with very high accuracy.
So my question is: do the preprocessing operations effectively get frozen, or is it advisable to preprocess images separately at test time so that the frozen graph performs inference only (and no preprocessing op)? My intention was to include the preprocessing ops within the graph for convenience, but this approach does not seem to work.
What is TensorFlow's take on such a workflow? Should preprocessing be done within the graph and frozen, or should it be a separate task outside of the frozen graph?
Here is how I intended to put the preprocessing ops within a graph and freeze them all:
import tensorflow as tf
from tensorflow.python.framework import graph_util
# Assumed import paths (TF-Slim models repository layout); adjust to your setup.
from nets.inception_resnet_v2 import inception_resnet_v2, inception_resnet_v2_arg_scope
from preprocessing import inception_preprocessing
slim = tf.contrib.slim
# checkpoint_file is assumed to hold the path to the trained checkpoint.
with tf.Graph().as_default() as graph:
    # image = tf.placeholder(shape=[None, None, 3], dtype=tf.float32, name = 'Placeholder_only')
    # preprocessed_image = inception_preprocessing.preprocess_for_eval(image, 299, 299)
    # preprocessed_image = tf.expand_dims(preprocessed_image, 0)
    img_array = tf.placeholder(dtype=tf.float32, shape=[None,None,3], name='Placeholder_only')
    preprocessed_image = inception_preprocessing.preprocess_for_eval(img_array, 299, 299)
    preprocessed_image = tf.expand_dims(preprocessed_image, 0, name='expand_preprocessed_img')
    with slim.arg_scope(inception_resnet_v2_arg_scope()):
        logits, end_points = inception_resnet_v2(preprocessed_image, num_classes = 5, is_training = False)
    variables_to_restore = slim.get_variables_to_restore()
    saver = tf.train.Saver(variables_to_restore)
    #Setup graph def
    input_graph_def = graph.as_graph_def()
    output_node_names = "InceptionResnetV2/Logits/Predictions"
    output_graph_name = "./frozen_flowers_model_IR2_with_preprocesssing.pb"
    with tf.Session() as sess:
        saver.restore(sess, checkpoint_file)
        # count=0
        # for op in graph.get_operations():
        #     print (op.name)
        #     count+=1
        #     if count==50:
        #         assert False
        #Exporting the graph
        print ("Exporting graph...")
        output_graph_def = graph_util.convert_variables_to_constants(
            sess,
            input_graph_def,
            output_node_names.split(","))
        with tf.gfile.GFile(output_graph_name, "wb") as f:
            f.write(output_graph_def.SerializeToString())
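One thing worth checking, offered as a hypothesis rather than a confirmed diagnosis: the placeholder above is declared as float32. Assuming preprocess_for_eval rescales only non-float inputs to [0, 1] and then maps values with (x - 0.5) * 2, raw 0-255 pixel values fed as float32 would skip the rescaling step. The plain-Python arithmetic below illustrates that assumption; the helper inception_scale is hypothetical and only mimics the assumed final scaling step.

```python
# Hypothetical helper mimicking the assumed final scaling step
# of Inception-style eval preprocessing: [0, 1] -> [-1, 1].
def inception_scale(x):
    return (x - 0.5) * 2.0


raw_pixels = [0.0, 128.0, 255.0]                # raw image values fed as float32
unit_pixels = [p / 255.0 for p in raw_pixels]   # values rescaled to [0, 1] first

scaled_raw = [inception_scale(p) for p in raw_pixels]
scaled_unit = [inception_scale(p) for p in unit_pixels]

print(scaled_raw)   # far outside [-1, 1]
print(scaled_unit)  # inside [-1, 1]
```

If this is what happens, the network would see inputs far outside the range it was trained on, which could look like missing preprocessing. Feeding a uint8 image, or dividing by 255 before the preprocessing call, would be the thing to try.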

Related

How do I save my model to use in another project in mlr3?

I would like to divide my working pipeline into two parts:
One place (internal) where I benchmark and auto-tune the algorithms to select the final model.
Another place (external) where I apply the selected models to new datasets.
For the second part, I need to save the resulting model object so that I can later call model$predict_newdata() and transport it without retraining the algorithm or carrying the original training data along.
The idea is summarized in the following example:
library("mlr3")
task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task, row_ids = 1:120)
predictions = learner$predict(task, row_ids = 121:150)
predictions
So far so good, but now I have to save this model to a file outside the R session, and of course this won't work:
store_model = learner$model
save(store_model, 'model_rpart.RData')
The solution is to save the whole object as an .rds object as Brian suggested.
saveRDS(learner, 'learner_rpart.rds')
model <- readRDS('learner_rpart.rds')
predictions = model$predict(task, row_ids = 121:150)
predictions$confusion

Converting a R2jags object into a Stanreg (rstanarm) object

I made a model using R2jags. I like the JAGS syntax, but I find the output produced by R2jags hard to use. I recently read about the rstanarm package. It has many useful functions and is well supported by the tidybayes and bayesplot packages for easy model diagnostics and visualisation. However, I'm not a fan of the syntax used to write a model in rstanarm. Ideally, I would like to get the best of both worlds, that is, writing the model in R2jags and converting the output into a Stanreg object to use rstanarm functions.
Is that possible? If so, how?
I think the question isn't necessarily whether or not it's possible - I suspect it probably is. The question really is how much time you're prepared to spend doing it. All you'd have to do is try to replicate the structure of the object that rstanarm creates, to the extent that it's possible with the R2jags output. That would make it so that some post-processing tasks would probably work.
If I might be so bold, I suspect a better use of your time would be to turn the R2jags object into something that could be used with the post-processing functions you want to use. For example, it only takes a small modification to the JAGS output to make all of the mcmc_*() plotting functions from bayesplot work. Here's an example. Below is the example model from the jags() function help.
library(R2jags)
library(bayesplot)
# An example model file is given in:
model.file <- system.file(package="R2jags", "model", "schools.txt")
# data
J <- 8.0
y <- c(28.4,7.9,-2.8,6.8,-0.6,0.6,18.0,12.2)
sd <- c(14.9,10.2,16.3,11.0,9.4,11.4,10.4,17.6)
jags.data <- list("y","sd","J")
jags.params <- c("mu","sigma","theta")
jags.inits <- function(){
  list("mu"=rnorm(1),"sigma"=runif(1),"theta"=rnorm(J))
}
jagsfit <- jags(data=jags.data, inits=jags.inits, jags.params,
                n.iter=5000, model.file=model.file, n.chains = 2)
Now, what the mcmc_*() plotting functions from bayesplot expect is a list of matrices of MCMC draws where the column names give the name of the parameter. By default, jags() puts all of them into a single matrix. In the above case, there are 5000 iterations in total, with 2500 as burnin (leaving 2500 sampled) and the n.thin is set to 2 in this case (jags() has an algorithm for identifying the thinning parameter), but in any case, the jagsfit$BUGSoutput$n.keep element identifies how many iterations are kept. In this case, it's 1250. So you could use that to make a list of two matrices from the output.
jflist <- list(jagsfit$BUGSoutput$sims.matrix[1:jagsfit$BUGSoutput$n.keep, ],
               jagsfit$BUGSoutput$sims.matrix[(jagsfit$BUGSoutput$n.keep+1):(2*jagsfit$BUGSoutput$n.keep), ])
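The bookkeeping behind that split can be sketched in plain Python. This is an illustration only, not R2jags code: it assumes the stacked matrix holds chain 1's kept draws first and chain 2's after them, and the names n_iter, n_burnin, n_thin, and split_chains are hypothetical stand-ins for the settings described above.

```python
def split_chains(rows, n_chains, n_keep):
    """Split a stacked list of draws into one list per chain."""
    if len(rows) != n_chains * n_keep:
        raise ValueError("row count must equal n_chains * n_keep")
    return [rows[c * n_keep:(c + 1) * n_keep] for c in range(n_chains)]


n_iter, n_burnin, n_thin, n_chains = 5000, 2500, 2, 2
n_keep = (n_iter - n_burnin) // n_thin  # draws kept per chain

# Stand-in for the stacked draws matrix: one row (here, one number) per draw.
stacked = list(range(n_chains * n_keep))
chains = split_chains(stacked, n_chains, n_keep)

print(n_keep, [len(c) for c in chains])
```

If the rows of the stacked matrix are not ordered chain by chain, slices like these would not correspond to the actual chains, so that ordering is worth verifying before reading too much into per-chain trace plots.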
Now, you'd just have to call some of the plotting functions:
mcmc_trace(jflist, regex_pars="theta")
or
mcmc_areas(jflist, regex_pars="theta")
So, instead of trying to replicate all of the output that rstanarm produces, it might be a better use of your time to try to bend the jags output into a format that would be amenable to the post-processing functions you want to use.
EDIT - added possibility for pp_check() from bayesplot.
The posterior draws of y in this case are in the theta parameters. So, we make an object that has elements y and yrep and give it class foo:
x <- list(y = y, yrep = jagsfit$BUGSoutput$sims.list$theta)
class(x) <- "foo"
We can then write a pp_check method for objects of class foo. This comes straight out of the help file for bayesplot::pp_check().
pp_check.foo <- function(object, ..., type = c("multiple", "overlaid")) {
  y <- object[["y"]]
  yrep <- object[["yrep"]]
  switch(match.arg(type),
         multiple = ppc_hist(y, yrep[1:min(8, nrow(yrep)),, drop = FALSE]),
         overlaid = ppc_dens_overlay(y, yrep[1:min(8, nrow(yrep)),, drop = FALSE]))
}
Then, just call the function:
pp_check(x, type="overlaid")

predict_generator in Keras forever run time

Solved: per this GitHub issue (https://github.com/keras-team/keras/issues/3946), when using flow_images_from_directory, images must be in a folder within the specified directory (a folder within a folder). We discovered this from the message "Found 0 images belonging to 0 classes" when trying to run this code in Python. In R, no error message results, and predict_generator runs forever. After putting the images into a folder (titled "folder") within test_dir, predict_generator worked quickly (20 ms/step) and gave sensible results on images of cats and dogs.
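The folder-within-a-folder requirement can be checked before calling the generator. The sketch below is not the Keras implementation; it is a hypothetical helper, count_images_by_class, that mimics the idea of treating each immediate subdirectory as a class, and the extension list is an assumption.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def count_images_by_class(directory):
    """Count image files inside each immediate subdirectory of `directory`."""
    counts = {}
    for entry in sorted(os.listdir(directory)):
        subdir = os.path.join(directory, entry)
        if os.path.isdir(subdir):
            files = os.listdir(subdir)
            counts[entry] = sum(f.lower().endswith(IMAGE_EXTENSIONS) for f in files)
    return counts


root = tempfile.mkdtemp()

# Image placed directly in the target directory: no class folders are found.
open(os.path.join(root, "cat.1.jpg"), "wb").close()
flat_counts = count_images_by_class(root)

# Same image moved into a subfolder: one class folder with one image.
os.mkdir(os.path.join(root, "folder"))
os.rename(os.path.join(root, "cat.1.jpg"), os.path.join(root, "folder", "cat.1.jpg"))
nested_counts = count_images_by_class(root)

print(flat_counts, nested_counts)
```

An empty result for the flat layout corresponds to the "0 images belonging to 0 classes" situation, in which a generator has nothing to yield.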
Original post:
We are trying to use a fine-tuned model to make predictions on unlabeled images. For this example, we are using a model taken from Allaire and Chollet's Deep Learning With R (and available from their GitHub site, linked below in the code). The problem we are encountering is that, even when making predictions on only one image and working on a computer with a GPU, the predict_generator part of this code runs for 18 hours without completing. We intend to make predictions on ~200K images, so we need the run time per image to be short.
(Note: We think our GPU is engaged, based on it taking a few hours to fine-tune VGG on a binary classification task with several hundred training images.)
Our code is adapted from this post:
How to evaluate() and predict() from generator like data in R keras
We have tried, with similar indefinite run-time results, to follow a similar example:
https://www.kaggle.com/dkoops/keras-r-vgg16-base
What changes do we need to make to our code to generate predictions? What is a reasonable expected run time per image (seconds?)?
Here is our code:
library(keras)
# Load the cats and dogs model (taken from: https://github.com/jjallaire/deep-learning-with-r-notebooks)
model <- load_model_hdf5("cats_and_dogs_small_2.h5")
train_datagen = image_data_generator(
  rescale = 1/255,
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.2,
  horizontal_flip = TRUE,
  fill_mode = "nearest"
)
# Get data from https://www.kaggle.com/c/dogs-vs-cats/data. Put one image into folder test1image
test_dir <- "test1image"
test_generator <- flow_images_from_directory(
  test_dir,                  # Target directory
  train_datagen,             # Data generator
  target_size = c(150, 150), # Resizes all images to 150 × 150
  batch_size = 1,
  class_mode = "binary",     # binary_crossentropy loss for binary labels
  shuffle = FALSE
)
num_test_images = 1
y <- predict_generator(model, test_generator, steps = num_test_images,
                       verbose = 1)

How to predict the labels for the test set when using a custom Iterator in MXnet?

I have a big dataset (around 20GB for training and 2GB for testing) and I want to use MXNet with R. Because of limited memory, I searched for a way to load the training and test sets through a custom iterator, and I found this solution.
Now I can train the model using the code on this page, but the problem appears when I read the test set with the same iterator, as follows:
test.iter <- CustomCSVIter$new(iter = NULL, data.csv = "test.csv", data.shape = 480, batch.size = batch.size)
Then the prediction command does not work, and the page gives no prediction template:
preds <- predict(model, test.iter)
So, my specific problem is: if I build my model using the code on the page, how can I read my test set and predict its labels for the evaluation process? My test set and train set are in this format.
Thank you for your help
It actually works exactly as you explained. You just call predict with model and iterator:
preds = predict(model, test.iter)
The only trick here is that the predictions are displayed column-wise. By that I mean, if you take the whole sample you are referring to, execute it and add the following lines:
test.iter <- CustomCSVIter$new(iter = NULL, data.csv = "mnist_train.csv", data.shape = 28, batch.size = batch.size)
preds = predict(model, test.iter)
preds[,1] # index of the sample to see in the column position
You receive:
[1] 5.882561e-11 2.826923e-11 7.873914e-11 2.760162e-04 1.221306e-12 9.997239e-01 4.567645e-11 3.177564e-08 1.763889e-07 3.578671e-09
This shows the softmax output for the 1st element of the training set. If you try to print everything by just writing preds, you will see only empty values because of the RStudio print limit of 1000: the real data never gets a chance to appear.
Notice that I reuse the training data for prediction. I do so because I don't want to adjust the iterator's code, which would need to consume data both with and without a label in front (training and test sets). In a real-world scenario you would need to adjust the iterator so that it works with and without a label.
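To turn column-wise softmax outputs like the one above into labels, take the position of the largest value in each column. The plain-Python sketch below illustrates this with the printed column; the helper names are hypothetical, and it assumes that position i (counting from 0) corresponds to digit class i.

```python
# Class probabilities for one sample, copied from the output above.
first_column = [5.882561e-11, 2.826923e-11, 7.873914e-11, 2.760162e-04,
                1.221306e-12, 9.997239e-01, 4.567645e-11, 3.177564e-08,
                1.763889e-07, 3.578671e-09]


def predicted_label(column):
    """Return the 0-based index of the largest probability in a column."""
    return max(range(len(column)), key=lambda i: column[i])


def labels_from_columns(matrix):
    """matrix[class][sample]: take the argmax down each column."""
    n_samples = len(matrix[0])
    return [predicted_label([row[s] for row in matrix]) for s in range(n_samples)]


label = predicted_label(first_column)
print(label)
```

Applied to a whole prediction matrix with one column per sample, labels_from_columns returns one label per sample, which can then be compared with the true labels for evaluation.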

R problem with randomForest classification with raster package

I am having an issue with randomForest and the raster package. First, I create the classifier:
library(raster)
library(randomForest)
# Set some user variables
fn = "image.pix"
outraster = "classified.pix"
training_band = 2
validation_band = 1
original_classes = c(125,126,136,137,151,152,159,170)
reclassd_classes = c(122,122,136,137,150,150,150,170)
# Get the training data
myraster = stack(fn)
training_class = subset(myraster, training_band)
# Reclass the training data classes as required
training_class = subs(training_class, data.frame(original_classes,reclassd_classes))
# Find pixels that have training data and prepare the data used to create the classifier
is_training = Which(training_class != 0, cells=TRUE)
training_predictors = extract(myraster, is_training)[,3:nlayers(myraster)]
training_response = as.factor(extract(training_class, is_training))
remove(is_training)
# Create and save the forest, use odd number of trees to avoid breaking ties at random
r_tree = randomForest(training_predictors, y=training_response, ntree = 201, keep.forest=TRUE) # Runs out of memory, does not allow more trees than this...
remove(training_predictors, training_response)
Up to this point, all is good. I can see that the forest was created correctly by looking at the error rates, confusion matrix, etc. When I try to classify some data, however, I run into trouble with the following, which returns all NA's in predictions:
# Classify the whole image
predictor_data = subset(myraster, 3:nlayers(myraster))
layerNames(predictor_data) = layerNames(myraster)[3:nlayers(myraster)]
predictions = predict(predictor_data, r_tree, type='response', progress='text')
And gives this warning:
Warning messages:
1: In `[<-.factor`(`*tmp*`, , value = c(1, 1, 1, 1, 1, 1, ... :
invalid factor level, NAs generated
(keeps going like this)...
However, calling predict.randomForest directly works fine and returns the expected predictions (this is not a good option for me because the image is large, and I cannot store the whole matrix in memory):
# Classify the whole image and write it to file
predictor_data = subset(myraster, 3:nlayers(myraster))
layerNames(predictor_data) = layerNames(myraster)[3:nlayers(myraster)]
predictor_data = extract(predictor_data, extent(predictor_data))
predictions = predict(r_tree, newdata=predictor_data)
How can I get it to work directly with the "raster" version? I know that this is possible, as shown in the examples of predict{raster}.
You could try nesting predict.randomForest within the writeRaster function and writing the matrix as a raster in chunks, as described in the PDF included with the raster package. Before that, try the argument 'na.rm=TRUE' when calling predict in the raster function. You might also assign dummy values to the NAs in the predictor rasters, and later rewrite them as NAs using functions in the raster package.
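The chunked approach suggested above follows a general pattern: read a block of rows, predict on it, write the result, and move on, so that only one block is in memory at a time. The sketch below shows that pattern in plain Python with toy stand-ins. It is not raster or randomForest code, and read_rows, write_rows, and the threshold "model" are hypothetical.

```python
def iter_row_blocks(n_rows, block_size):
    """Yield (start, stop) row ranges that cover n_rows in blocks."""
    for start in range(0, n_rows, block_size):
        yield start, min(start + block_size, n_rows)


def predict_in_blocks(read_rows, predict, write_rows, n_rows, block_size, na_value=None):
    """Predict block by block so only one block is in memory at a time."""
    for start, stop in iter_row_blocks(n_rows, block_size):
        block = read_rows(start, stop)
        out = []
        for pixel in block:
            # Skip pixels with missing predictors instead of passing them to the model.
            if any(v is None for v in pixel):
                out.append(na_value)
            else:
                out.append(predict(pixel))
        write_rows(start, out)


# Toy stand-ins (hypothetical): a 5-pixel "image" with 2 predictors per pixel.
image = [[1, 2], [3, 4], [None, 1], [5, 6], [7, 8]]
result = [None] * len(image)


def read_rows(start, stop):
    return image[start:stop]


def write_rows(start, values):
    result[start:start + len(values)] = values


predict_in_blocks(read_rows, lambda p: int(sum(p) > 6), write_rows, len(image), block_size=2)
print(result)
```

Pixels with missing predictors are passed through as missing rather than sent to the model, which is the same idea as handling NAs before or during the raster prediction.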
As for memory problems when calling RFs, I've had a plethora of memory issues dealing with BRTs. They're immense on disk and in memory! (Should a model be more complex than the data?) I've not had them run reliably on 32-bit machines (Windows XP or Linux). Sometimes tweaking the Windows memory allotment for applications has helped, and moving to Linux has helped more, but I get the most from 64-bit Windows or Linux machines, since they impose a higher (or no) limit on the amount of memory applications can take. You may be able to increase the number of trees you can use by doing this.
