How to define a custom training loop in Flux.jl

I am trying to set up my training loop for an ML workflow using Flux.jl. I know I can use the built-in Flux.train!() function to do the training, but I need a little more customization than the API gives me out of the box. How can I define my own custom training loop in Flux?

Per the Flux.jl docs on Training Loops, you can do something like:
# Assumes Flux's implicit-parameter API (older Flux versions), with e.g.
# `using Flux`, `using Flux: Params` and `using Flux.Optimise: update!` in scope.
function my_custom_train!(loss, ps, data, opt)
  # training_loss is declared local so it will be available for logging outside the gradient calculation.
  local training_loss
  ps = Params(ps)
  for d in data
    gs = gradient(ps) do
      training_loss = loss(d...)
      # Code inserted here will be differentiated; unless you need that gradient information,
      # it is better to do the work outside this block.
      return training_loss
    end
    # Insert whatever code you want here that needs training_loss, e.g. logging.
    # logging_callback(training_loss)
    # Insert whatever code you want here that needs the gradient,
    # e.g. logging it as a histogram with TensorBoardLogger.jl so you can see if it is becoming huge.
    update!(opt, ps, gs)
    # Here you might like to check validation set accuracy, and break out to do early stopping.
  end
end
It is also possible to simplify the above example with a hardcoded loss function.
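The shape of such a loop does not depend on Flux. As a rough, language-neutral sketch (plain Python, not Flux code), here is a loop with the loss hardcoded as mean squared error for a line fit. The function name train_linear, the learning rate and the stopping tolerance are all assumptions made for illustration.

```python
def train_linear(data, lr=0.1, epochs=500, tol=1e-12):
    """Fit y = w*x + b by gradient descent on a hardcoded mean-squared-error loss."""
    w, b = 0.0, 0.0
    history = []  # stands in for a logging callback
    n = len(data)
    for _ in range(epochs):
        # Hardcoded loss and its analytic gradients (no autodiff in this sketch).
        loss = sum((w * x + b - y) ** 2 for x, y in data) / n
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n

        history.append(loss)  # logging happens outside the gradient computation

        if loss < tol:  # early stopping on the training loss
            break

        # Parameter update, the counterpart of update!(opt, ps, gs).
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, history


# Points on the line y = 2x + 1.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, history = train_linear(points)
```

The logging and early-stopping hooks sit in the same places as the comments in the Flux example: after the loss and gradient are available, before and after the update.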

Related

Modelica/Dymola Run Linearized Model with Initial Values

I am new to Dymola and I want to run a linearized model with initial conditions.
I know how to linearize it. I can get the StateSpace object in the Command window or get the dslin.mat.
Now I want to run it with initial conditions. I found them in the dsin.txt file, but can't bring them together.
Is there an implemented way or do I need to write it on my own?
Best regards,
Axel
You can use the block Modelica.Blocks.Continuous.StateSpace to build a model containing a state-space description. The respective code is:
model StateSpaceModel
  Modelica.Blocks.Continuous.StateSpace sys annotation (Placement(transformation(extent={{-10,-10},{10,10}})));
  Modelica.Blocks.Sources.Step step(startTime=0.5) annotation (Placement(transformation(extent={{-60,-10},{-40,10}})));
equation
  connect(step.y, sys.u[1]) annotation (Line(points={{-39,0},{-12,0}}, color={0,0,127}));
  annotation (uses(Modelica(version="4.0.0")));
end StateSpaceModel;
Additionally you can use a script (or a Modelica function) that does some work for you. More precisely, it
1) linearizes any suitable model. I've used the state-space model from the MSL itself, so you can be sure the result is correct.
2) translates the above model to be able to set the parameters from the command line
3) sets the parameters of the state-space block called sys. This includes the ones for the initial conditions in x_start
4) simulates the model with the new parameters
// Get state-space description of a model
ss = Modelica_LinearSystems2.ModelAnalysis.Linearize("Modelica.Blocks.Continuous.StateSpace");
// Translate custom example, set parameters to result of the above linearization, add initial conditions for states and simulate
translateModel("StateSpaceModel")
sys.A = ss.A;
sys.B = ss.B;
sys.C = ss.C; // in case of an error here, check if 'OutputCPUtime == false;'
sys.D = ss.D;
sys.x_start = ones(size(sys.A,1));
simulateModel("StateSpaceModel", resultFile="StateSpaceModel");
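To make explicit what the simulation with x_start computes, here is a minimal sketch in plain Python (not Dymola scripting). It integrates x' = A x + B u, y = C x + D u from the initial state with a fixed-step forward Euler scheme. The function name simulate_state_space, the step size and the scalar example system are assumptions for illustration; Dymola's own solvers are more sophisticated.

```python
def simulate_state_space(A, B, C, D, x_start, u, t_end, dt=1e-3):
    """Forward-Euler integration of x' = A x + B u(t), y = C x + D u(t), x(0) = x_start."""

    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    x = list(x_start)
    steps = int(round(t_end / dt))
    ys = []
    for k in range(steps + 1):
        uk = u(k * dt)
        # Output at the current time point.
        ys.append([cx + du for cx, du in zip(matvec(C, x), matvec(D, uk))])
        if k < steps:
            # Advance the state by one Euler step.
            dx = [ax + bu for ax, bu in zip(matvec(A, x), matvec(B, uk))]
            x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return ys


# Hypothetical first-order system with a unit step input at t = 0.5,
# mirroring the Step source in the Modelica model, and initial state x(0) = 1.
step_input = lambda t: [1.0 if t >= 0.5 else 0.0]
outputs = simulate_state_space(
    A=[[-1.0]], B=[[1.0]], C=[[1.0]], D=[[0.0]],
    x_start=[1.0], u=step_input, t_end=2.0,
)
```

The initial state only enters through the starting value of x, which is the role x_start plays in the StateSpace block.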

How to load an image for inference in Flux.jl?

I have a model which I trained using a specific dataset. I did not originally break the set up into a train and test set (which I should have). With that said, I want to do some ad hoc testing to see how the model performs when I give it specific images. I tried doing something like Images.load("/Users/logankilpatrick/Desktop/train/dog.10697.jpg") to load the image and then pass it directly to the model, but I get input size mismatch errors. What is the correct way to load the image?
To use an image for inference, you need to do a few steps as shown below:
using Images

x = Images.load("/Users/logankilpatrick/Desktop/train/dog.10697.jpg")
x = Images.imresize(x, (224,224)...) # 224x224 depends on the model size
# channelview returns a view of the image, splitting out (if necessary) the color channels into a new first dimension.
x = permutedims(channelview(x), (3,2,1))
x = reshape(x, size(x)..., 1) # Add an extra dim to show we only have 1 image
x = float32.(x) # Convert to Float32 instead of Float64; assign the result, otherwise it is discarded
model(x)
Note that a few of these steps may change depending on the model you are using and other factors, but this is the general idea of what you need to do. It is likely worth it to write a simple function that does this for you.

How to use RWeka classifiers function attribute "options"?

In RWeka classifiers, there is an argument "options" in the classifier's function call, e.g. Bagging(formula, data, subset, na.action, control = Weka_control(), options = NULL). Could someone please give an example (sample R code) of how to define these options?
I would be interested in passing on some options (such as the number of iterations and size of each bag) to Bagging meta learner of RWeka. Thanks in advance!
You can get at the features that you mentioned, but not through options.
First, what does options do? According to the help page ?Bagging:
Argument options allows further customization. Currently, options model and instances (or partial matches for these) are used: if set to TRUE, the model frame or the corresponding Weka instances, respectively, are included in the fitted model object, possibly speeding up subsequent computations on the object. By default, neither is included.
So options simply stores more information in the returned result. To get at the features that you want, you need to use control. You will need to construct the value for control using the function Weka_control. Without some help, it is hard to know how to use that, but luckily, help is available through WOW, the Weka Option Wizard. Because there are many options, the output is long. I am going to truncate it to just the part about the features that you mentioned: the number of iterations and the size of each bag. But do look at what else is available.
WOW(Bagging)
-P      Size of each bag, as a percentage of the training set size. (default 100)
-I <num>
        Number of iterations. (current value 10)
        Number of arguments: 1.
Repeating: I have truncated the output to show just these two options.
Example: Iris data
Suppose that I wanted to use bagging with the iris data with the bag size being 90% of the data (instead of the default 100%) and with 20 iterations (instead of the default 10). First, I would build the Weka_control, then include that in my call to Bagging.
WC = Weka_control(P=90, I=20)
BagOfIrises = Bagging(Species ~ ., data=iris, control=WC)
I hope that this helps.

Exclude specific tensors being updated by optimizer in TensorFlow

I have two graphs that I want to train independently, which means I have two different optimizers, but at the same time one of them uses the tensor values of the other graph. As a result, I need to be able to stop specific tensors from being updated while training one of the graphs. I have assigned two different name scopes to my tensors and use this code to control which tensors each optimizer updates:
mentor_training_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "mentor")
train_op_mentor = mnist.training(loss_mentor, FLAGS.learning_rate, mentor_training_vars)
mentee_training_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "mentee")
train_op_mentee = mnist.training(loss_mentee, FLAGS.learning_rate, mentee_training_vars)
The list of variables is used as shown below, in the training method of the mnist object:
def training(loss, learning_rate, var_list):
    # Add a scalar summary for the snapshot loss.
    tf.summary.scalar('loss', loss)
    # Create the gradient descent optimizer with the given learning rate.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    # Create a variable to track the global step.
    global_step = tf.Variable(0, name='global_step', trainable=False)
    # Use the optimizer to apply the gradients that minimize the loss
    # (and also increment the global step counter) as a single training step.
    train_op = optimizer.minimize(loss, global_step=global_step, var_list=var_list)
    return train_op
I'm using the var_list argument of the optimizer's minimize method to control which variables the optimizer updates.
Right now I'm unsure whether I have done this correctly, and whether there is any way to check that an optimizer updates only part of a graph.
I would appreciate if anyone can help me with this issue.
Thanks!
I have had a similar problem and used the same approach as you, i.e. via the var_list argument of the optimizer. I then checked whether the variables not intended for training stayed the same using:
the_var_np = sess.run(tf.get_default_graph().get_tensor_by_name('the_var:0'))
assert np.equal(the_var_np, pretrained_weights['the_var']).all()
pretrained_weights is a dictionary returned by np.load('some_file.npz') which I used to store the pre-trained weights to disk.
Just in case you need that as well, here is how you can override a tensor with a given value:
value = pretrained_weights['the_var']
variable = tf.get_default_graph().get_tensor_by_name('the_var:0')
sess.run(tf.assign(variable, value))
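The check described above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python of the idea behind var_list: only the named variables receive a gradient step, and a snapshot taken beforehand confirms the rest are untouched. The helper sgd_step and the variable names are hypothetical and only mimic the mentor/mentee scopes from the question.

```python
def sgd_step(variables, grads, var_list, lr):
    """Return a new dict in which only the variables named in var_list are updated."""
    return {
        name: (value - lr * grads[name]) if name in var_list else value
        for name, value in variables.items()
    }


# Hypothetical scalar "variables", one per name scope.
variables = {"mentor/w": 1.0, "mentee/w": -2.0}
grads = {"mentor/w": 0.5, "mentee/w": 0.25}

snapshot = dict(variables)  # values before the step, like pretrained_weights
updated = sgd_step(variables, grads, var_list={"mentee/w"}, lr=0.1)

# Same check as the assert in the answer: frozen variables must be unchanged.
frozen_unchanged = all(
    updated[name] == snapshot[name]
    for name in updated
    if name.startswith("mentor/")
)
```

Comparing against a snapshot after a few real training steps is the same test; it just runs on the evaluated variable values instead of a dictionary.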

Avoiding using global objects when building an R package with multiple separate functions

I have built an R package that runs a complex Bayesian model (Dirichlet Process Mixture model on spatial data) including an MCMC, thinning and validation and interface with Googlemaps. I'm very happy with performance and it runs without problems. The only issue is I would like to get it up on CRAN and it will be rejected because I extensively use global variables.
The package is built around the use of 8 core functions (which the user interacts with):
1) LoadData: Loads in data, extracts key information and sets up a series of global matrices as well as other small list objects.
2) ModelParameters: Sets model parameters, option to plot prior on parameter sigma on Googlemap. Calculates a hyper-prior at this point and saves a large matrix to the global environment
3) GraphicParameters: Sets graphic parameters of maps and plots (see code below)
4) CreateMaps: Creates the prior surface on source location tau and plots the data on a Google map. Keeps a number of global objects saved for repeated plotting of this map.
5) RunMCMC: Runs the bulk of the analysis using MCMC (a time intensive step), creates many global objects.
6) ThinandAnalsye: Thins the posterior samples and constructs the geoprofile (a time intensive step)
7) PlotGP: Plots the data and overlays the geoprofile onto a Google map
8) reporthitscores: OPTIONAL if source data is imported, calculates the hit scores of potential sources
Each one is run in turn before the next, and I pass global variables out which are used by one or more of the other functions.
I built it this way for a reason, as the user must stop and evaluate the results of these functions before rushing ahead to the future ones.
Each of these functions passes not just fixed parameters, but also large map objects, lists and matrices as global objects. I thought it was a nice simple solution with a smooth workflow (you can check the results in your main working environment before moving on, possibly applying transformations etc) and I have given all the objects unique and informative names.
How do I get around this, and pass the checks of CRAN whilst keeping my user friendly workflow of a series of interacting functions?
I don't want to post a lot of code (as just the MCMC part is several hundred lines long)
But I will include one of the simple examples. GraphicParameters is one of my simple parameter setting functions, that comes with the default values set. This is a simple example, there are much more complex ones in the package. There is a model parameters function that pulls many of the variables from an existing data loading function for example.
GraphicParameters <-
  function(Guardrail=0.05, nring=20, transp=0.4, gridsize=640, gridsize2=300, MapType="roadmap", Location=getwd(), pointcol="black") {
    Guardrail <<- Guardrail
    nring <<- nring
    transp <<- transp
    gridsize <<- gridsize
    gridsize2 <<- gridsize2
    MapType <<- MapType
    Location <<- Location
    pointcol <<- pointcol
  }
Most of the material I have seen concerning avoiding global objects revolves around a single function that will do all the work. I want to keep my step-by-step multi-function approach, but lose the global objects.
Any help would be greatly appreciated.
I understand this may be a major reworking of the code (which is several 1000 lines currently), so I would also love solutions that minimally affect the overall structure of the package.
P.S. I wish I had known about CRAN's displeasure with global objects before I started!!!
Your problem is very amenable to OOP-style design. You can use reference classes or S4 to export a single global, e.g., a MapAnalysis class generator. The idea is then that someone creates this using
ma <- new('MapAnalysis', option1 = ..., option2 = ..., ...) # S4
# or
ma <- MapAnalysis$new(option1 = ..., ...) # refClass
and can then call your methods with
ma$loadData(...)
ma$setParameters(...)
with the object doing any bookkeeping of options and auxiliary objects internally. It should not be that much work to refactor. If you read the page I linked to at the top of this post, you should see it's probably possible to just wrap all your functions with a setRefClass('MapAnalysis', fields = list(...), methods = list(...)) call with few further modifications. (Although it would do you a lot of good down the road to re-think the architecture in OOP terms.)
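The design itself is language-independent. As a rough sketch of the pattern (written in Python rather than R, with hypothetical method names and a placeholder standing in for the real sampler), one object carries the state that is currently written to the global environment, and each step reads what earlier steps stored on it:

```python
class MapAnalysis:
    """Hypothetical container for the state the package currently keeps in globals."""

    def __init__(self, guardrail=0.05, nring=20, transp=0.4):
        # Graphic parameters live on the object instead of being assigned globally.
        self.graphics = {"guardrail": guardrail, "nring": nring, "transp": transp}
        self.data = None
        self.samples = None

    def load_data(self, points):
        self.data = list(points)
        return self

    def run_mcmc(self, n_samples):
        # Each step can verify that the earlier steps have been run.
        if self.data is None:
            raise RuntimeError("call load_data() before run_mcmc()")
        # Placeholder for the real sampler: it just repeats the data mean.
        mean = sum(self.data) / len(self.data)
        self.samples = [mean] * n_samples
        return self


ma = MapAnalysis(nring=10)
ma.load_data([1.0, 2.0, 3.0])
ma.run_mcmc(n_samples=4)
```

The user can still stop between steps and inspect ma.data or ma.samples, so the step-by-step workflow survives; the only change is that the intermediate results hang off ma rather than the global environment.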
