Why does lsoda (in R) fail to complete the specified run duration, with warning messages?

I am writing a numerical model in R, for an ecological system, and solving it using "lsoda" from package deSolve.
My model has 14 state variables.
I define the model, set it up fine, and specify the time duration like this:
nyears <- 60
ndays <- nyears*365 + 1
times <- seq(0, nyears*365, by = 1)
Rates of change of the state variables (e.g. the rate of change of variable "A1" is "dA1") are calculated from the current values of the state variables (at time = t) and a set of parameters.
Simplified example:
dA1 <- Tf*A1*(ImaxA*p_sub)
Where Tf, ImaxA and p_sub are parameters, and A1 is my state variable at time=t.
When I solve the model, I use the lsoda solver like this:
out<-as.data.frame(lsoda(start,times,model,parms))
Sometimes (depending on my parameter combination) the model run completes over the entire duration I have specified; sometimes, however, it stops short of the mark (still giving me output up until the point where the solver "crashes"). When it "crashes", this message is displayed:
DLSODA- At current T (=R1), MXSTEP (=I1) steps
taken on this call before reaching TOUT
In above message, I1 = 5000
In above message, R1 = 11535.5
Warning messages:
1: In lsoda(start, times, model, parms) :
an excessive amount of work (> maxsteps ) was done, but integration was not successful - increase maxsteps
2: In lsoda(start, times, model, parms) :
Returning early. Results are accurate, as far as they go
This commonly appears when one of the state variables is growing exponentially or tending very close to zero; however, it sometimes crashes when seemingly not much change is happening. I may be wrong, but is it due to the rate of change of the state variables becoming too large? If so, why might it also "crash" when there is no fast rate of change?
Is there a way to make the solver complete its task with the specified parameter values, perhaps with a more relaxed error tolerance?
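For reference, deSolve's lsoda() exposes these controls directly; a minimal sketch with illustrative values (the defaults are rtol = 1e-6, atol = 1e-6 and maxsteps = 5000, which is the I1 = 5000 in the message above):
out <- as.data.frame(lsoda(start, times, model, parms,
                           rtol = 1e-4,        # relaxed relative tolerance
                           atol = 1e-4,        # relaxed absolute tolerance
                           maxsteps = 50000))  # more internal steps per output interval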

Thank you all for your contributions. I looked at some of the rates and found that, at the point of crashing, the model was switching between two metabolic states, and the fast rate of this binary switch caused the solver to stop, rejecting the solution because the rate of change was too large. I have fixed my model by introducing a gradual switch between states (with a logistic curve) instead of this binary switch. I acknowledge that I didn't give enough information in the original question, so thanks for the help you offered!
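For illustration, a gradual switch of that kind can be written as a logistic blend of the two state-specific rates (rate_on, rate_off, A_crit and the steepness k are hypothetical names, not from the original model):
s <- 1 / (1 + exp(-k * (A1 - A_crit)))    # smooth weight in [0, 1] instead of a hard if/else
rate <- (1 - s) * rate_off + s * rate_on  # blends the two metabolic states; larger k = sharper switch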

Related

Error: Required number of iterations = 1087633109 exceeds iterMax = 1e+06 ; either increase iterMax, dx, dt or reduce sigma

I am getting this error, and this post tells me that I should decrease sigma. But here is the thing: this code was working fine a couple of months ago, and nothing has changed in the data or the code. I am wondering why this error appears out of the blue.
And a second point: when I lower sigma, for example to 13.1, it appears to run (but I have been waiting for an hour).
sigma <- 203.9057
dimyx1 <- 1024
A22den <- density(Lnetwork, sigma, distance = "path", continuous = TRUE, dimyx = dimyx1)
About Lnetwork
Point pattern on linear network
69436 points
Linear network with 8417 vertices and 8563 lines
Enclosing window: rectangle = [143516.42, 213981.05] x [3353367, 3399153] units
Error: Required number of iterations = 1087633109 exceeds iterMax = 1e+06 ; either increase iterMax, dx, dt or reduce sigma
This is a question about the spatstat package.
The code for handling data on a linear network is still under active development. It has changed in recent public releases of spatstat, and has changed again in the development version. You need to specify exactly which version you are using.
The error report says that the required number of iterations of the algorithm is too large. This occurs because either the smoothing bandwidth sigma is too large, or the spacing dx between sample points along the network is too small. The number of iterations is proportional to (sigma/dx)^2 in most cases.
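To put rough numbers on this (ignoring the unknown constant of proportionality): with sigma = 203.9057 as above, the reported 1087633109 iterations would correspond to sigma/dx ≈ sqrt(1.09e9) ≈ 33000, i.e. a dx of roughly 0.006 units, which is implausibly fine for a window about 70000 units wide.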
First, check that the value of sigma is physically reasonable.
Normally you shouldn't have to worry about the algorithm parameter dx because it is determined automatically by default. However, it's possible that your data are causing the code to choose a very small value of dx.
The internal code which automatically determines the spacing dx of sample points along the network has been changed recently, in order to fix several bugs.
I suggest that you specify the algorithm parameters manually. See the help file for densityHeat for information on how to control the spacings. Setting the parameters manually will also ensure greater consistency of the results between different versions of the software.
The quickest solution is to set finespacing=FALSE. This is not the best solution, because it still uses some of the automatic rules that may be causing the problem. Please read the help file to understand what it does.
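For example, assuming the argument is passed through to densityHeat, the call above would become (a sketch, not tested against these data):
A22den <- density(Lnetwork, sigma, distance = "path", continuous = TRUE,
                  dimyx = dimyx1, finespacing = FALSE)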
Did you update spatstat since this last worked? Probably the internal code for determining spacing on the network etc. changed a bit. The actual computations are done by the function densityHeat(), and you can see how to manually set spacing etc. in its help file.

doubts regarding batch size and time steps in RNN

In TensorFlow's RNN tutorial (https://www.tensorflow.org/tutorials/recurrent), two parameters are mentioned: batch size and time steps. I am confused by the concepts. In my opinion, RNNs introduce batches because a to-train sequence can be so long that backpropagation cannot compute over its whole length (exploding/vanishing gradients). So we divide the long to-train sequence into shorter sequences, each of which is a mini-batch and whose size is called "batch size". Am I right here?
Regarding time steps: an RNN consists of only a cell (an LSTM or GRU cell, or another cell), and this cell is sequential. We can understand the sequential concept by unrolling it. But unrolling a sequential cell is a concept, not real, which means we do not implement it in an unrolled way. Suppose the to-train sequence is a text corpus. Then we feed one word at a time to the RNN cell and then update the weights. So why do we have time steps here? Combining this with my understanding of the above "batch size", I am even more confused: do we feed the cell one word or multiple words (batch size)?
Batch size pertains to the number of training samples to consider at a time when updating your network weights. So, in a feedforward network, if you want to update your network weights based on computing your gradients from one word at a time, your batch_size = 1.
As the gradients are computed from a single sample, this is computationally very cheap. On the other hand, it also makes for very erratic training.
To understand what happens during the training of such a feedforward network, I'll refer you to this very nice visual example of single-batch versus mini-batch versus single-sample training.
However, you want to understand what happens with your num_steps variable. This is not the same as your batch_size. As you might have noticed, so far I have referred to feedforward networks. In a feedforward network, the output is determined from the network inputs and the input-output relation is mapped by the learned network relations:
hidden_activations(t) = f(input(t))
output(t) = g(hidden_activations(t)) = g(f(input(t)))
After a training pass of size batch_size, the gradient of your loss function with respect to each of the network parameters is computed and your weights updated.
In a recurrent neural network (RNN), however, your network functions a tad differently:
hidden_activations(t) = f(input(t), hidden_activations(t-1))
output(t) = g(hidden_activations(t))
          = g(f(input(t), hidden_activations(t-1)))
          = g(f(input(t), f(input(t-1), hidden_activations(t-2))))
          = g(f(input(t), f(input(t-1), ..., f(input(0), hidden_initial_state))))
As you might have surmised from the naming, the network retains a memory of its previous state, and the neuron activations are now also dependent on the previous network state and, by extension, on every state the network has ever been in. Most RNNs employ a forgetfulness factor in order to attach more importance to more recent network states, but that is beside the point of your question.
As you might surmise, it is computationally very, very expensive to calculate the gradients of the loss function with respect to the network parameters if you have to consider backpropagation through all states since the creation of your network, so there is a neat little trick to speed up your computation: approximate your gradients with a subset of the historical network states, of size num_steps.
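To make the two parameters concrete, here is a minimal sketch in R (the language used elsewhere on this page) of one truncated forward unroll; all sizes and weights are hypothetical, and a real implementation would backpropagate through only these num_steps states:
batch_size <- 4; num_steps <- 5; n_in <- 6; n_hid <- 3
W <- matrix(rnorm(n_in * n_hid), n_in, n_hid)    # input-to-hidden weights
U <- matrix(rnorm(n_hid * n_hid), n_hid, n_hid)  # hidden-to-hidden weights
x <- array(rnorm(batch_size * num_steps * n_in), c(batch_size, num_steps, n_in))
h <- matrix(0, batch_size, n_hid)                # initial hidden state
for (t in 1:num_steps) {
  # the same cell is applied at every step; the batch rows are processed in parallel
  h <- tanh(x[, t, ] %*% W + h %*% U)
}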
If this conceptual discussion was not clear enough, you can also take a look at a more mathematical description of the above.
I found this diagram which helped me visualize the data structure.
From the image, 'batch size' is the number of example sequences you train your RNN with in that batch; 'values per timestep' are your inputs (in my case, my RNN takes 6 inputs); and finally, your time steps are the 'length', so to speak, of the sequence you are training on.
I'm also learning about recurrent neural nets and how to prepare batches for one of my projects (and stumbled upon this thread trying to figure it out).
Batching for feedforward and recurrent nets is slightly different, and when looking at different forums, the terminology for both gets thrown around and it gets really confusing, so visualizing it is extremely helpful.
Hope this helps.
RNN's "batch size" is to speed up computation (as there're multiple lanes in parallel computation units); it's not mini-batch for backpropagation. An easy way to prove this is to play with different batch size values, an RNN cell with batch size=4 might be roughly 4 times faster than that of batch size=1 and their loss are usually very close.
As for an RNN's "time steps", let's look at the following code snippets from rnn.py. static_rnn() calls the cell for one input_ at a time, and BasicRNNCell::call() implements the forward-pass logic. In a text prediction case, say batch size = 8, we can think of input_ here as 8 words from different sentences in a big text corpus, not 8 consecutive words in one sentence.
In my experience, we decide the value of time steps based on how deeply we would like to model in "time" or "sequential dependency". Again, to predict the next word in a text corpus with BasicRNNCell, a small number of time steps might work. A large time step size, on the other hand, might suffer from the exploding gradient problem.
def static_rnn(cell,
               inputs,
               initial_state=None,
               dtype=None,
               sequence_length=None,
               scope=None):
  """Creates a recurrent neural network specified by RNNCell `cell`.

  The simplest form of RNN network generated is:

    state = cell.zero_state(...)
    outputs = []
    for input_ in inputs:
      output, state = cell(input_, state)
      outputs.append(output)
    return (outputs, state)
  """

class BasicRNNCell(_LayerRNNCell):
  def call(self, inputs, state):
    """Most basic RNN: output = new_state =
    act(W * input + U * state + B).
    """
    gate_inputs = math_ops.matmul(
        array_ops.concat([inputs, state], 1), self._kernel)
    gate_inputs = nn_ops.bias_add(gate_inputs, self._bias)
    output = self._activation(gate_inputs)
    return output, output
To visualize how these two parameters are related to the data set and the weights, Erik Hallström's post is worth reading. From his diagram and the code snippets above, it is clear that an RNN's "batch size" will not affect the weights (wa, wb, and b) but its "time steps" will. So, one could decide an RNN's "time steps" based on the problem and the network model, and its "batch size" based on the computation platform and the data set.

Dymola: Initialization of District Heating Network Simulation model using Non-linear solver

I have been modelling and simulating a number of simple district heating networks in Dymola and am quite often faced with an error during initialisation.
The system we are simulating consists of
A producer: two pressure boundaries, source and sink. The pressure at the source is controlled via a PI controller, which ensures that the source pressure is such that the minimum differential pressure across any consumer is greater than or equal to some set value.
Consumers: each has a PI-controlled valve and a heat exchanger with a fixed return temperature. The valve controls the mass flow, and subsequently the heat flow, to match the consumer load at any given time.
Pipes: a dynamic thermo-hydraulic model with heat loss and the spatialDistribution operator to account for the transport delay. Vectorised ports and mixing volumes are used to reduce the number of nonlinear equations.
The figure below is a heavily simplified version of the network (the same error occurs there):
The consumer model looks as follows:
And the producer:
During initialisation, the following error occurs:
ERROR: Failed to solve non-linear system using Newton solver.
To get more information: Turn on Simulation/Setup/Debug/Nonlinear solver diagnostics/Details
Solution to systems of equations not found at time = 0
Nonlinear system of equations number = 3
Infinity-norm of residue = 118280
Iteration is not making good progress.
Accumulated number of residue calc.: 389
Accumulated number of symbolic Jacobian calc.: 5
Last values of solution vector:
L.PI.gainPID.y = 0
Last values of residual vector:
{ -118280 }
Trying to solve non-linear system using global homotopy-method.
... loading "data" from "C:/Users/Sim1/Desktop/Keith Dymola Files/GrazReininghaus_UseCase/PythonScriptsforTranslation/Reininghaus.txt"
ERROR: Failed to solve non-linear system using Newton solver.
To get more information: Turn on Simulation/Setup/Debug/Nonlinear solver diagnostics/Details
Solution to systems of equations not found at time = 0
Nonlinear system of equations number = 1
Infinity-norm of residue = 2.22814E+018
Iteration is not making good progress.
Accumulated number of residue calc.: 101
Accumulated number of symbolic Jacobian calc.: 9
Last values of solution vector:
M.port_a.m_flow = 0.000485868
N.valveLinear.dp = -55.8243
O.valveLinear.dp = -135.618
P.valveLinear.dp = 550.474
I.port_a.m_flow = 3.20321E-010
C.port_a.m_flow = 2.19343E-011
D.port_a.m_flow = 0.00208272
E.valveLinear.dp = 371.552
L.port_a.m_flow = -7.10982E-012
J.valveLinear.dp = 243.085
K.port_a.m_flow = 1.924E-005
Last values of residual vector:
{ 6.60393E+013, -1.14781E+018, -1.05438E+018, -2.58754E+016, -111988,
-1.56817E+010, 16024.9, 3.14411E+007, 3.99781E+008, 3.14412E+007,
-15012.9 }
Error: could not solve simplified initialization for homotopy method.
Error: could not solve simplified initialization for homotopy method.
FixInitials:Init
The components A, B, C, etc. are the consumers in the network. I am using the Radau IIa 5th-order solver with tol = 1e-06. The init type of the PI controllers in the consumer valves is set to initialise with the integrator state only, and the PI controller in the producer is initialised with an output value.
I have tried playing around with all sorts of nominal values for the mass flows and pressure drops in the network, as well as initial values in the PI controllers, but the same form of error is always returned. The model passes the error check but always fails at initialisation.
I would like to know if anybody has experience in debugging such nonlinear systems; if so, a few tips on how to initialise these models would be a great help.
OK, so for anybody who is interested: I managed to simulate my network in the end. It turns out the initialisation problem was arising in the "first order" block within the consumer, which takes the measured heat flow signal in the heat exchanger and passes it to the PI controller. The default init type for this component was "noinit"; by changing it to take an initial guess value (the nominal consumer load worked in this case), initialisation passed. I guess this problem occurred in this example because my consumer nominal loads were quite a bit higher than in previous examples, and therefore the initial value was outside of the suitable range unless specified manually.

Strange behavior when implementing Back propagation in DBN

Currently I'm trying to implement a Deep Belief Network (DBN). But I've run into a very strange problem. My source code can be found here: https://github.com/mistree/GoDeep/blob/master/GoDeep/
I first implemented the RBM using CD and it works perfectly (using the concurrency features of Golang, it's quite fast). Then I started to implement a normal feedforward network with backpropagation, and that is where the strange thing happens. It seems very unstable: when I run it with an XOR gate test it sometimes fails, and only when I set the hidden layer to 10 or more nodes does it never fail. Below is how I calculate it:
Step 1: calculate all the activations with bias
Step 2: calculate the output error
Step 3: backpropagate the error to each node
Step 4: calculate the delta weight and bias for each node, with momentum
For Steps 1 to 4 I do a full batch and sum up these delta weights and biases.
Step 5: apply the averaged delta weights and biases (a minimal sketch of these steps follows below)
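For reference, here are the five steps above as a minimal runnable sketch, written in R (the language used elsewhere on this page) rather than the asker's Go; sigmoid activations, no momentum, and the learning rate and iteration count are hypothetical:
set.seed(1)
X <- matrix(c(0,0, 0,1, 1,0, 1,1), ncol = 2, byrow = TRUE)  # XOR inputs
Y <- matrix(c(0, 1, 1, 0), ncol = 1)                        # XOR targets
n_hid <- 10                      # the asker reports 10+ hidden nodes never fails
W1 <- matrix(rnorm(2 * n_hid, sd = 0.5), 2, n_hid); b1 <- rep(0, n_hid)
W2 <- matrix(rnorm(n_hid, sd = 0.5), n_hid, 1);     b2 <- 0
sigm <- function(z) 1 / (1 + exp(-z))
lr <- 0.5
for (i in 1:20000) {
  H <- sigm(sweep(X %*% W1, 2, b1, "+"))  # Step 1: activations with bias
  O <- sigm(H %*% W2 + b2)
  dO <- (O - Y) * O * (1 - O)             # Step 2: output error
  dH <- (dO %*% t(W2)) * H * (1 - H)      # Step 3: backpropagate to each node
  W2 <- W2 - lr * t(H) %*% dO / nrow(X)   # Steps 4-5: full-batch averaged update
  b2 <- b2 - lr * mean(dO)
  W1 <- W1 - lr * t(X) %*% dH / nrow(X)
  b1 <- b1 - lr * colMeans(dH)
}
round(O, 3)  # should approach 0, 1, 1, 0 (may need more iterations on some seeds)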
I followed the tutorial here http://ufldl.stanford.edu/wiki/index.php/Backpropagation_Algorithm
And normally it works if I give it more hidden layer nodes. My test code is here https://github.com/mistree/GoDeep/blob/master/Test.go
So I thought it should work, and started to implement the DBN by combining the RBM and the normal NN. However, the result then becomes really bad: it can't even learn an XOR gate in 1000 iterations, and sometimes goes totally wrong. To debug this, after the pre-training of the DBN I do a reconstruction. Most of the time the reconstruction looks good, but the backpropagation fails even when the pre-training result is perfect.
I really don't know what's wrong with the backpropagation. I must have misunderstood the algorithm or made some big mistake in the implementation.
If possible, please run the test code and you'll see how weird it is. The code itself is quite readable. Any hint would be a great help. Thanks in advance.
I remember Hinton saying you can't train an RBM on XOR; something about the vector space does not allow a two-layer network to work. Deeper networks have less linear properties that allow it to work.

arithmetic library for tracking worst case error

Is there any library or tool that allows for knowing the maximum accumulated error in arithmetic operations?
For example if I make some iterative calculation ...
myVars = initialValues;
while (notEnded) {
    myVars = updateMyVars(myVars);
}
... I want to know at the end not only the calculated values, but also the potential error (the range of possible values, if the result of each individual operation took the range limits of its operands).
I have already written a Java class called EADouble.java (EA for Error Accounting) which holds and updates the maximum positive and negative errors along with the calculated value, for some basic operations, but I'm afraid I might be reinventing a square wheel.
Any libraries/whatever in Java/whatever? Any suggestions?
Updated on July 11th: Examined existing libraries and added link to sample code.
As fellows commented, there is the concept of interval arithmetic, and there was a previous question (A good uncertainty (interval) arithmetic library?) on the topic. There are just a couple of small issues with my intent (a tiny sketch of the interval idea follows the list):
I care more about the "main" value than about the upper and lower bounds. However, adding that extra value to an open library should be straightforward.
Accounting for the error as an independent floating-point value might allow for finer accuracy (e.g. for addition, the upper bound would be incremented by just half an ULP instead of a whole ULP).
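To make the intent concrete, here is a minimal interval-style sketch in R (the language used elsewhere on this page); the helpers are hypothetical and, unlike a real library, do not round the bounds outward:
ia     <- function(lo, hi) c(lo, hi)                  # an interval [lo, hi]
ia_add <- function(a, b) ia(a[1] + b[1], a[2] + b[2])
ia_mul <- function(a, b) {
  p <- c(a[1]*b[1], a[1]*b[2], a[2]*b[1], a[2]*b[2])
  ia(min(p), max(p))                                  # tightest enclosing bounds
}
x <- ia(0.1 - 1e-16, 0.1 + 1e-16)  # a value known only to within 1e-16
y <- ia_mul(ia_add(x, x), x)       # (x + x) * x; the bounds travel with the value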
Libraries I had a look at:
ia_math (Java. Just would have to add the main value. My favourite so far)
Boost/numeric/Interval (C++, Very complex/complete)
ErrorProp (Java, accounts value, and error as standard deviation)
The sample code (TestEADouble.java) successfully runs a ballistic simulation and a calculation of the number e. However, those are not very demanding scenarios.
Probably way too late, but look at BIAS/Profil: http://www.ti3.tuhh.de/keil/profil/index_e.html
It is pretty complete and simple, accounts for computer (rounding) error, and if your errors are centered it gives easy access to your nominal value (through Mid(...)).
