Solved: per this GitHub issue (https://github.com/keras-team/keras/issues/3946), when using flow_images_from_directory, images must sit in a subfolder of the specified directory (a folder within a folder). We discovered this from the "Found 0 images belonging to 0 classes" message when running the equivalent code in Python; in R there is no error message, and predict_generator simply runs forever. After putting the images into a subfolder (titled "folder") within test_dir, predict_generator ran quickly (20 ms/step) and gave sensible results on images of cats and dogs.
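For reference, here is a minimal sketch of the working setup, using the model loaded in the code below. The subfolder name "folder" is arbitrary, and the rescale-only generator is our choice for prediction, not a requirement:

# Directory layout that works: test1image/folder/<image files>
# flow_images_from_directory scans the subfolders of test_dir for images
test_dir <- "test1image"

# For prediction we only rescale; augmentation is unnecessary at test time
test_datagen <- image_data_generator(rescale = 1/255)

test_generator <- flow_images_from_directory(
  test_dir,
  test_datagen,
  target_size = c(150, 150),
  batch_size = 1,
  class_mode = "binary",
  shuffle = FALSE
)

y <- predict_generator(model, test_generator, steps = 1, verbose = 1)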
Original post:
We are trying to use a fine-tuned model to make predictions on unlabeled images. For this example, we are using a model taken from Allaire and Chollet's Deep Learning with R (and available from their GitHub site; link below, in the code). The problem we are encountering is that, even when predicting on only one image, and on a computer with a GPU, the predict_generator part of this code runs for 18 hours without completing. We intend to make predictions on ~200K images, so we need the run time per image to be short.
(Note: We think our GPU is engaged, based on it taking a few hours to fine-tune VGG on a binary classification task with several hundred training images.)
Our code is adapted from this post:
How to evaluate() and predict() from generator like data in R keras
We also tried to follow a similar example, with the same indefinite run time:
https://www.kaggle.com/dkoops/keras-r-vgg16-base
What changes do we need to make to our code to generate predictions? What is a reasonable expected run time per image (seconds?)?
Here is our code:
library(keras)
####load cats and dogs model (taken from: https://github.com/jjallaire/deep-learning-with-r-notebooks)
model <- load_model_hdf5("cats_and_dogs_small_2.h5")
train_datagen <- image_data_generator(
  rescale = 1/255,
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.2,
  horizontal_flip = TRUE,
  fill_mode = "nearest"
)
# Get data from https://www.kaggle.com/c/dogs-vs-cats/data. Put one image into the folder test1image
test_dir <- "test1image"
test_generator <- flow_images_from_directory(
  test_dir,                    # Target directory
  train_datagen,               # Data generator
  target_size = c(150, 150),   # Resizes all images to 150 × 150
  batch_size = 1,
  class_mode = "binary",       # binary_crossentropy loss for binary labels
  shuffle = FALSE
)
num_test_images <- 1
y <- predict_generator(model, test_generator, steps = num_test_images,
                       verbose = 1)
Related
I have a series of ARGOS data from various individual animals. I am using the R package aniMotum (previously foieGras) to create move persistence maps like the example output in the documentation. My goal is to create one of these maps for each animal in my dataframe.
When I adapt the example code to my tracking data, it fails on some tracks and not others. When it fails, I receive a series of errors like these:
Error in newton(par = c(X = 4205.28883097774, X = -5904.86998464547, X = 4204.95187507608, :
Newton failed to find minimum.
And a final warning:
Warning message:
The optimiser failed. Try simplifying the model with the following argument:
map = list(rho_o = factor(NA))
Here is a sample of the code I used:
library(aniMotum)
library(tidyverse)
movePersistence <- readRDS("newton_error.rds")
x <- fit_ssm(movePersistence,
             vmax = 3,
             model = "mp",
             time.step = 2,
             control = ssm_control(verbose = 2)
)
aniMotum::map(x, what = "p", normalise = TRUE, silent = TRUE)
From what I can tell, the issue arises in the newton() function in the TMB package (a dependency of aniMotum). I have noticed that changing the time.step value changes the number of errors I get. For example, on track 14, I get 13 Newton errors with a step of 2 but none with a step of 12. I am using 2 because it gives a better approximation than 12. I did try my whole dataset at various steps and, each time, different tracks failed. I also tried changing the tolerance and number of iterations in ssm_control, but that was more of a hail-Mary approach and did not help.
Here is a dput() of track 14 to reproduce the errors on a smaller scale using the above code: https://pastebin.com/xMXzdPDh. If anyone has recommendations I would greatly appreciate them, as I do not understand the inconsistent results.
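For reference, applying the simplification the warning itself suggests would look something like the sketch below. It assumes fit_ssm() accepts a map list of parameters to hold fixed (as the warning message implies); I have not verified that this resolves the Newton errors:

# Sketch: refit with rho_o fixed, as suggested by the optimiser warning.
# The map argument (assumed here) tells TMB to hold that parameter at its
# starting value instead of estimating it.
x_simplified <- fit_ssm(movePersistence,
                        vmax = 3,
                        model = "mp",
                        time.step = 2,
                        map = list(rho_o = factor(NA)),
                        control = ssm_control(verbose = 2)
)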
I have a big dataset (around 20 GB for training and 2 GB for testing) and I want to use MXNet with R. Due to lack of memory, I searched for a way to load the training and test sets with a custom iterator and found this solution.
Now I can train the model using the code on that page, but the problem is that if I read the test set with the same iterator, as follows:
test.iter <- CustomCSVIter$new(iter = NULL, data.csv = "test.csv", data.shape = 480, batch.size = batch.size)
then the prediction command does not work, and there is no prediction example on the page:
preds <- predict(model, test.iter)
So, my specific problem is: if I build my model using the code on that page, how can I read my test set and predict its labels for the evaluation process? My test set and training set are in this format.
Thank you for your help
It actually works exactly as you guessed: you just call predict with the model and the iterator:
preds = predict(model, test.iter)
The only trick here is that the predictions are displayed column-wise. By that I mean, if you take the whole sample you are referring to, execute it and add the following lines:
test.iter <- CustomCSVIter$new(iter = NULL, data.csv = "mnist_train.csv", data.shape = 28, batch.size = batch.size)
preds = predict(model, test.iter)
preds[,1] # index of the sample to see in the column position
You receive:
[1] 5.882561e-11 2.826923e-11 7.873914e-11 2.760162e-04 1.221306e-12 9.997239e-01 4.567645e-11 3.177564e-08 1.763889e-07 3.578671e-09
This shows the softmax output for the 1st element of the training set. If you try to print everything by just typing preds, you will see mostly empty output because of RStudio's print limit of 1000 values; the real data will have no chance to appear.
Notice that I reuse the training data for prediction. I do so because I don't want to adjust the iterator's code, which needs to be able to consume the data with and without a label in front (training and test sets). In a real-world scenario you would need to adjust the iterator so it works both with and without a label.
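If you want hard class labels rather than the softmax probabilities, one way to get them (a sketch assuming the 10-class MNIST layout above, with classes 0-9) is to take the row index of the maximum in each column:

# Each column of preds is one sample; pick the most probable class per column.
# Subtracting 1 maps the 1-based row index to the 0-9 digit labels.
pred_labels <- max.col(t(preds)) - 1
head(pred_labels)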
I believe that, after training, the model saved to the checkpoint does not contain any of the preprocessing operations: on examining the checkpoint model, the available operations start from the model's input, not from the preprocessing operations that precede it.
However, when freezing a graph restored from a checkpoint file, where the graph has additional preprocessing operations, do the preprocessing operations get frozen as well? I have included a preprocessing operation for test time in the graph, and intend to freeze the graph together with the checkpoint model, but the results vary a lot between these two scenarios:
Putting the raw image through a frozen graph that includes the preprocessing operations --> very, very poor accuracy, as if no preprocessing was done.
Preprocessing the image first, then putting the preprocessed image through a frozen graph that does not include any preprocessing operations --> works as expected, with very high accuracy.
So my question is: does the preprocessing operation actually get frozen, or is it advisable to preprocess images at test time outside the graph, leaving the frozen graph for inference only (without any preprocessing ops)? My intention was to include the preprocessing ops within the graph for convenience, but this approach does not seem to work.
What is the TensorFlow's take on such a workflow? Should preprocessing be done within the graph and frozen, or should it be a separate task outside of the frozen graph?
Here is how I intended to put the preprocessing ops within a graph and freeze them all:
with tf.Graph().as_default() as graph:
    # image = tf.placeholder(shape=[None, None, 3], dtype=tf.float32, name='Placeholder_only')
    # preprocessed_image = inception_preprocessing.preprocess_for_eval(image, 299, 299)
    # preprocessed_image = tf.expand_dims(preprocessed_image, 0)
    img_array = tf.placeholder(dtype=tf.float32, shape=[None, None, 3], name='Placeholder_only')
    preprocessed_image = inception_preprocessing.preprocess_for_eval(img_array, 299, 299)
    preprocessed_image = tf.expand_dims(preprocessed_image, 0, name='expand_preprocessed_img')

    with slim.arg_scope(inception_resnet_v2_arg_scope()):
        logits, end_points = inception_resnet_v2(preprocessed_image, num_classes=5, is_training=False)

    variables_to_restore = slim.get_variables_to_restore()
    saver = tf.train.Saver(variables_to_restore)

    # Set up the graph def
    input_graph_def = graph.as_graph_def()
    output_node_names = "InceptionResnetV2/Logits/Predictions"
    output_graph_name = "./frozen_flowers_model_IR2_with_preprocesssing.pb"

    with tf.Session() as sess:
        saver.restore(sess, checkpoint_file)

        # count = 0
        # for op in graph.get_operations():
        #     print(op.name)
        #     count += 1
        #     if count == 50:
        #         assert False

        # Export the graph
        print("Exporting graph...")
        output_graph_def = graph_util.convert_variables_to_constants(
            sess,
            input_graph_def,
            output_node_names.split(","))

        with tf.gfile.GFile(output_graph_name, "wb") as f:
            f.write(output_graph_def.SerializeToString())
For a certain combination of parameters in the deeplearning function of h2o, I get different results each time I run it.
args <- list(list(hidden = c(200, 200, 200),
                  loss = "CrossEntropy",
                  hidden_dropout_ratio = c(0.1, 0.1, 0.1),
                  activation = "RectifierWithDropout",
                  epochs = EPOCHS))

run <- function(extra_params) {
  model <- do.call(h2o.deeplearning,
                   modifyList(list(x = columns, y = c("Response"),
                                   validation_frame = validation, distribution = "multinomial",
                                   l1 = 1e-5, balance_classes = TRUE,
                                   training_frame = training), extra_params))
}

model <- lapply(args, run)
What would I need to do in order to get consistent results for the model each time I run this?
Deep learning with H2O will not be reproducible if it is run on more than a single core: the results and performance metrics may vary slightly each time you train the model. The implementation in H2O uses a technique called "Hogwild!", which increases the speed of training at the cost of reproducibility on multiple cores.
So if you want reproducible results you will need to restrict H2O to run on a single core and make sure to use a seed in the h2o.deeplearning call.
Edit based on comment by Darren Cook:
I forgot to include the reproducible = TRUE parameter, which needs to be set in combination with the seed to make the run truly reproducible. Note that this will make training a lot slower, and it is not advisable with a large dataset.
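For example, adding the two arguments to the parameter list in the question's own wrapper might look like this (the seed value is arbitrary; this is a sketch, not a tuned configuration):

# Sketch: seeded, single-core (reproducible) training.
# reproducible = TRUE forces H2O onto one core, so expect it to be much slower.
args <- list(list(hidden = c(200, 200, 200),
                  loss = "CrossEntropy",
                  hidden_dropout_ratio = c(0.1, 0.1, 0.1),
                  activation = "RectifierWithDropout",
                  epochs = EPOCHS,
                  seed = 1234,          # any fixed seed
                  reproducible = TRUE)) # single-core training

model <- lapply(args, run)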
More information on "Hogwild!"
I am having an issue with randomForest and the raster package. First, I create the classifier:
library(raster)
library(randomForest)
# Set some user variables
fn = "image.pix"
outraster = "classified.pix"
training_band = 2
validation_band = 1
original_classes = c(125,126,136,137,151,152,159,170)
reclassd_classes = c(122,122,136,137,150,150,150,170)
# Get the training data
myraster = stack(fn)
training_class = subset(myraster, training_band)
# Reclass the training data classes as required
training_class = subs(training_class, data.frame(original_classes,reclassd_classes))
# Find pixels that have training data and prepare the data used to create the classifier
is_training = Which(training_class != 0, cells=TRUE)
training_predictors = extract(myraster, is_training)[,3:nlayers(myraster)]
training_response = as.factor(extract(training_class, is_training))
remove(is_training)
# Create and save the forest, use odd number of trees to avoid breaking ties at random
r_tree = randomForest(training_predictors, y=training_response, ntree = 201, keep.forest=TRUE) # Runs out of memory, does not allow more trees than this...
remove(training_predictors, training_response)
Up to this point, all is good. I can see that the forest was created correctly by looking at the error rates, confusion matrix, etc. When I try to classify some data, however, I run into trouble with the following, which returns all NA's in predictions:
# Classify the whole image
predictor_data = subset(myraster, 3:nlayers(myraster))
layerNames(predictor_data) = layerNames(myraster)[3:nlayers(myraster)]
predictions = predict(predictor_data, r_tree, type='response', progress='text')
And gives this warning:
Warning messages:
1: In `[<-.factor`(`*tmp*`, , value = c(1, 1, 1, 1, 1, 1, ... :
invalid factor level, NAs generated
(keeps going like this)...
However, calling predict.randomForest directly works fine and returns the expected predictions (this is not a good option for me because the image is large, and I cannot store the whole matrix in memory):
# Classify the whole image and write it to file
predictor_data = subset(myraster, 3:nlayers(myraster))
layerNames(predictor_data) = layerNames(myraster)[3:nlayers(myraster)]
predictor_data = extract(predictor_data, extent(predictor_data))
predictions = predict(r_tree, newdata=predictor_data)
How can I get it to work directly with the "raster" version? I know that this is possible, as shown in the examples of predict{raster}.
You could try nesting predict.randomForest within the writeRaster function and writing the matrix as a raster in chunks, as described in the pdf vignette included with the raster package. Before that, try the argument na.rm=TRUE when calling predict from the raster package. You might also assign dummy values to the NAs in the predictor rasters and later rewrite them as NAs using functions in the raster package.
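A minimal sketch of those suggestions combined, reusing the question's objects (the filename, na.rm, and progress arguments let raster::predict write the result to disk in chunks rather than holding the whole matrix in memory; treat this as untested):

# Sketch: process the image block by block and write the classification to file,
# skipping cells that have NA predictor values (na.rm = TRUE).
predictions <- predict(predictor_data, r_tree,
                       filename = outraster,
                       type = 'response',
                       na.rm = TRUE,
                       progress = 'text',
                       overwrite = TRUE)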
As for memory problems when calling RFs, I've had a plethora of memory issues dealing with BRTs. They're immense on disk and in memory! (Should a model be more complex than the data?) I've not had them run reliably on 32-bit machines (Windows XP or Linux). Sometimes tweaking the memory Windows allots to applications has helped, and moving to Linux has helped more, but I get the most from 64-bit Windows or Linux machines, since they impose a higher (or no) limit on the amount of memory applications can use. You may be able to increase the number of trees you can use by doing this.