I’m currently studying the documentation of DifferentialEquations.jl and trying to port my older computational neuroscience codes for using it instead of my own, less elegant and performant, ODE solvers. While doing this, I stumbled upon the following question: is it possible to access and use the results returned from the solver as soon as the current step is returned (instead of waiting for the problem to finish)?
I’m looking for a way to e.g. plot in real-time the voltage levels of a simulated neuron, which seems like a simple enough task and one that’s probably trivial to do using already existing Julia packages but I can’t figure out how. Does it have to do anything with callbacks? Thanks in advance.
Plots.jl doesn't seem to be animating for me right now, but I'll show you the steps anyways. Yes, you can use a DiscreteCallback for this. If you make condition(u,t,integrator)=true then the affect! is called every step, and you could do that.
But, I think using the integrator interface is perfect for this case. Let me show you an example of this. Take the 2D problem from the tutorial:
using DifferentialEquations
using Plots
A = [1. 0 0 -5
4 -2 4 -3
-4 0 0 1
5 -2 2 3]
u0 = rand(4,2)
tspan = (0.0,1.0)
f(u,p,t) = A*u
prob = ODEProblem(f,u0,tspan)
Now instead of using solve, use init to get an integrator out.
integrator = init(prob,Tsit5())
The integrator interface is defined in full at its documentation page, but the basic usage is that you can step using step!. If you put that in a loop and keep stepping then that's essentially what solve does. But it also has the iterator interface, so if you do something like for integ in integrator then inside of the for loop integ will be the current state of the integrator, with values integ.u at time point integ.t. It also has all sorts of things like a plot recipe for intermediate interpolation integ(t) (this is true even when dense=false because it's free and doesn't require extra saving allocations, so feel free to use it).
So, you can do
p = plot(integrator,markersize=0,legend=false,xlims=tspan)
anim = #animate for integ in integrator
plot!(p,integrator,lw=3)
end
plot(p)
gif(anim, "test.gif", fps = 2)
and Plots.jl will give you the animated gif that adds the current interval at each step. Here's what the end plot looks like:
It colored differently in each step because it was a different plot, so you can see how it continued. Of course, you can do anything inside of that loop, or if you want more control you can manually step!(integrator) as necessary.
Related
I have been working for a couple of months with OpenMDAO and I find myself struggling with my code when I want to impose conditions for trying to replicate a physical/engineering behaviour.
I have tried using sigmoid functions, but I am still not convinced with that, due to the difficulty about trading off sensibility and numerical stabilization. Most of times I found overflows in exp so I end up including other conditionals (like np.where) so loosing linearity.
outputs['sigmoid'] = 1 / (1 + np.exp(-x))
I was looking for another kind of step function or something like that, able to keep linearity and derivability to the ease of the optimization. I don't know if something like that exists or if there is any strategy that can help me. If it helps, I am working with an OpenConcept benchmark, which uses vectorized computations ans Simpson's rule numerical integration.
Thank you very much.
PD: This is my first ever question in stackoverflow, so I would like to apologyze in advance for any error or bad practice commited. Hope to eventually collaborate and become active in the community.
Update after Justin answer:
I will take the opportunity to define a little bit more my problem and the strategy I tried. I am trying to monitorize and control thermodynamics conditions inside a tank. One of the things is to take actions when pressure P1 reaches certein threshold P2, for defining this:
eval= (inputs['P1'] - inputs['P2']) / (inputs['P1'] + inputs['P2'])
# P2 = threshold [Pa]
# P1 = calculated pressure [Pa]
k=100 #steepness control
outputs['sigmoid'] = (1 / (1 + np.exp(-eval * k)))
eval was defined in order avoid overflows normalizing the values, so when the threshold is recahed, corrections are taken. In a very similar way, I defined a function to check if there is still mass (so flowing can continue between systems):
eval= inputs['mass']/inputs['max']
k=50
outputs['sigmoid'] = (1 / (1 + np.exp(-eval*k)))**3
maxis also used for normalizing the value and the exponent is added for reaching zero before entering in the negative domain.
PLot (sorry it seems I cannot post images yet for my reputation)
It may be important to highlight that both mass and pressure are calculated from coupled ODE integration, in which this activation functions take part. I guess OpenConcept nature 'explore' a lot of possible values before arriving the solution, so most of the times giving negative infeasible values for massand pressure and creating overflows. For that sometimes I try to include:
eval[np.where(eval > 1.5)] = 1.5
eval[np.where(eval < -1.5)] = -1.5
That is not a beautiful but sometimes effective solution. I try to avoid using it since I taste that this bounds difficult solver and optimizer work.
I could give you a more complete answer if you distilled your question down to a specific code example of the function you're wrestling with and its expected input range. If you provide that code-sample, I'll update my answer.
Broadly, this is a common challenge when using gradient based optimization. You want some kind of behavior like an if-condition to turn something on/off and in many cases thats a fundamentally discontinuous function.
To work around that we often use sigmoid functions, but these do have some of the numerical challenges you pointed out. You could try a hyberbolic tangent as an alternative, though it may suffer the same kinds of problems.
I will give you two broad options:
Option 1
sometimes its ok (even if not ideal) to leave the purely discrete conditional in the code. Lets say you wanted to represent a kind of simple piecewise function:
y = 2x; x>=0
y = 0; x < 0
There is a sharp corner in that function right at 0. That corner is not differentiable, but the function is fine everywhere else. This is very much like the absolute value function in practice, though you might not draw the analogy looking at the piecewise definition of the function because the piecewise nature of abs is often hidden from you.
If you know (or at least can check after the fact) that your final answer will no lie right on or very near to that C1 discontinuity, then its probably fine to leave the code the way is is. Your derivatives will be well defined everywhere but right at 0 and you can simply pick the left or the right answer for 0.
Its not strictly mathematically correct, but it works fine as long as you're not ending up stuck right there.
Option 2
Apply a smoothing function. This can be a sigmoid, or a simple polynomial. The exact nature of the smoothing function is highly specific to the kind of discontinuity you are trying to approximate.
In the case of the piecewise function above, you might be tempted to define that function as:
2x*sig(x)
That would give you roughly the correct behavior, and would be differentiable everywhere. But wolfram alpha shows that it actually undershoots a little. Thats probably undesirable, so you can increase the exponent to mitigate that. This however, is where you start to get underflow and overflow problems.
So to work around that, and make a better behaved function all around, you could instead defined a three part piecewise polynomial:
y = 2x; x>=a
y = c0 + c1*x + c2*x**2; -a <= x < a
y = 0 x < -a
you can solve for the coefficients as a function of a (please double check my algebra before using this!):
c0 = 1.5a
c1 = 2
c2 = 1/(2a)
The nice thing about this approach is that it will never overshoot and go negative. You can also make a reasonably small and still get decent numerics. But if you try to make it too small, c2 will obviously blow up.
In general, I consider the sigmoid function to be a bit of a blunt instrument. It works fine in many cases, but if you try to make it approximate a step function too closely, its a nightmare. If you want to represent physical processes, I find polynomial fillet functions work more nicely.
It takes a little effort to derive that polynomial, because you want it to be c1 continuous on both sides of the curve. So you have to construct the system of equations to solve for it as a function of the polynomial order and the specific relaxation you want (0.1 here).
My goto has generally been to consult the table of activation functions on wikipedia: https://en.wikipedia.org/wiki/Activation_function
I've had good luck with sigmoid and the hyperbolic tangent, scaling them such that we can choose the lower and upper values as well as choosing the location of the activation on the x-axis and the steepness.
Dymos uses a vectorization that I think is similar to OpenConcept and I've had success with numpy.where there as well, providing derivatives for each possible "branch" taken. It is true that you may have issues with derivative mismatches if you have an analysis point right on the transition, but often I've had success despite that. If the derivative at the transition becomes a hinderance then implementing a sigmoid or relu are more appropriate.
If x is of a magnitude such that it can cause overflows, consider applying units or using scaling to put it within reasonable limits if you cannot bound it directly.
Can anyone help me with taking Wreath Products of Groups in Sagemath?
I haven't been able to find a online reference and it doesn't appear to be built in as far as I can tell.
As far as I know, you would have to use GAP to compute them within Sage (and then can manipulate them from within Sage as well). See e.g. this discussion from 2012. This question has information about it, here is the documentation, and here it is within Sage:
F = AbelianGroup(3,[2]*3)
G = PermutationGroup([[(1,2,3),(4,5)],[(3,4)]])
Gp = gap.StandardWreathProduct(gap(F),gap(G))
print Gp
However, if you try to get this back into Sage, you will get a NotImplementedError because Sage doesn't understand what GAP returns in this wacky case (which I hope even is legitimate). Presumably if a recognized group is returned then one could eventually get it back to Sage for further processing. In this case, you might be better off doing some GAP computations and then putting them back in Sage after doing all your group stuff (which isn't always the case).
I have loaded a model in Torch and I would like to fine-tune it. For now I'd like to retrain the last 2 layers of the network (though in the future I may want to add layers). How can I do this? I have been looking for tutorials, but I haven't found what I am looking for. Any tips?
I don't know if I understood what you are asking for. If you want to leave the net as it was except for the 2 layers you want to train (or fine-tune) you have to stop the backpropagation on the ones you don't want to train, like this:
for i=1, x do
c = model:get(i)
c.updateGradInput = function(self, inp, out) end
c.accGradParameters = function(self,inp, out) end
end
Now only the layers outside of this loop will upgrade their parameters. If you want to add new layers just call model:insert(module, position), you can have a look here Torch containers
If that was not what you were looking for, please elaborate more on the question.
Currently I'm trying to implement the Deep Belief Network. But I've met a very strange problem. My source code can be found here: https://github.com/mistree/GoDeep/blob/master/GoDeep/
I first implemented the RBM using CD and it works perfectly (by using the concurrency feature of Golang it's quite fast). Then I start to implement a normal feed forward network with back propagation and then the strange thing happens. It seems very unstable. When I run it with xor gate test it sometimes fails, only when I set the hidden layer nodes to 10 or more then it never fails. Below is how I calculate it
Step 1 : calculate all the activation with bias
Step 2 : calculate the output error
Step 3 : back propagate the error to each node
Step 4 : calculate the delta weight and bias for each node with momentum
Step 1 to Step 4 I do a full batch and sum up these delta weight and bias
Step 5 : apply the averaged delta weight and bias
I followed the tutorial here http://ufldl.stanford.edu/wiki/index.php/Backpropagation_Algorithm
And normally it works if I give it more hidden layer nodes. My test code is here https://github.com/mistree/GoDeep/blob/master/Test.go
So I think it should work and start to implement the DBN by combining the RBM and normal NN. However then the result becomes really bad. It even can't learn a xor gate in 1000 iteration. And sometimes goes totally wrong. I tried to debug with that so after the PreTrain of DBN I do a reconstruction. Most times the reconstruction looks good but the back propagation even fails when the preTrain result is perfect.
I really don't know what's wrong with the back propagation. I must misunderstood the algorithm or made some big mistakes in the implementation.
If possible please run the test code and you'll see how weird it is. The code it self is quite readable. Any hint will be great help.Thanks in advance
I remember Hinton saying you cant train RBM's on an XOR, something about the vector space that doesnt allow a two layer network to work. Deeper networks have less linear properties that allow it to work.
first off I'm going to say I don't know a whole lot about theory and such. But I was wondering if this was an NP or NP-complete problem. It specifically sounds like a special case of the subset sum problem.
Anyway, there's this game I've been playing recently called Alchemy which prompted this thought. Basically you start off with 4 basic elements and combine them to make other elements.
So, for instance, this is a short "recipe" if you will for making elements
fire=basic element
water=basic element
air=basic element
earth=basic element
sand=earth+earth
glass=sand+fire
energy=fire+air
lightbulb=energy+glass
So let's say a computer could create only the 4 basic elements, but it could create multiple sets of the elements. So you write a program to make any element by combining other elements. How would this program process the list the create a lightbulb?
It's clearly fire+air=energy, earth+earth=sand, sand+fire=glass, energy+glass=lightbulb.
But I can't think of any way to write a program to process a list and figure that out without doing a brute force type method and going over every element and checking its recipe.
Is this an NP problem? Or am I just not able to figure this out?
How would this program process the list the create a lightbulb?
Surely you just run the definitions backwards; e.g.
Creating a lightbulb requires 1 energy + 1 glass
Creating an energy requires 1 fire + 1 air
and so on. This is effectively a simple tree walk.
OTOH, if you want the computer to figure out that energy + glass means lightbulb (rather than "blob of molten glass"), you've got no chance of solving the problem. You probably couldn't get 2 gamers to agree that energy + glass = lightbulb!
You can easily model your problem as a graph and look for a solution with any complete search algorithm. If you don't have any experience, it might also help to look into automated planning. I'm linking to that text because it also features an introduction on complexity and search algorithms.