Heaviside theta integration with Mathematica - math

I am trying to integrate a Heaviside theta function with two signs inside and Mathematica won't give me a solution. Is there any way of improving the approach before just acknowledging that Mathematica cannot integrate it?

Some things you do really worry me.
You use X as a variable and as the name of a function. I've changed those to X and Xf.
You use ω as a variable and as the name of a function. I've changed those to ω and ωf.
You use = and not := in your function definitions. I've changed that.
With those changes I have
Xf[s1_,s2_,α_]:=(1+s2(-2+α)-α+s1(-1+2s2+α))/((-1+2 s1)(-1+2s2));
ωf[s1_,s2_,α_]:=((1-s2+s1(-3+4s2))(1+s2(-2+α)-α+s1(-1+2s2+α)))/((-1+2s1)(-1+s1+s2)(-1+2s2));
λ1[s1_,s2_,α_,X_,ω_]:=1/2(s1-2s2-3s1 X+3s2 X-α+s1 α+s2 α+ω-s1 ω-s2 ω-
1/2Sqrt[(4(α-ω+s2(2-3X-α-ω)+s1(-1+3X-α+ω))^2+8(2-5X+4X^2-2α+2X α+
s1(-2+4s2(1-2X)^2+7X-8X^2+2α-2X α-ω)+ω-s2(4-11X+8X^2-2α+2X α+ω)))]);
λ2[s1_,s2_,α_,X_,ω_]:=1/2(s1-2s2-3s1 X+3s2 X-α+s1 α+s2 α+ω-s1 ω-s2 ω+
1/2Sqrt[(4(α-ω+s2(2-3X-α-ω)+s1(-1+3X-α+ω))^2+8(2-5X+4X^2-2α+2X α+
s1(-2+4s2(1-2X)^2+7X-8X^2+2α-2X α-ω)+ω-s2(4-11X+8X^2-2α+2X α+ω)))]);
λ1simp[s1_,s2_,α_]:=λ1[s1,s2,α,Xf[s1,s2,α],ωf[s1,s2,α]];
λ2simp[s1_,s2_,α_]:=λ2[s1,s2,α,Xf[s1,s2,α],ωf[s1,s2,α]];
fint[s1_,s2_]:=HeavisideTheta[Sign[-λ1simp[s1,s2,α]]*Sign[-λ2simp[s1,s2,α]]];
Please check that very carefully to see if I've made any mistakes.
Now I want to look at your integrand and see what Mathematica sees.
Simplify[fint[s1,s2],1/2<=s1<=1&&1/2<=s2<=s1]
And it responds with
HeavisideTheta[CompexInfinity] 2s1==1||2s2==1||s1+s2==1
HeavisideTheta[True Sign[...]*Sign[...]]
so it looks like your integrand is blowing up at the boundary.
I check that with
Simplify[fint[1/2,s2]]
or
Simplify[fint[s1,1/2]]
and it responds with 1/0 and Indeterminate and HeavisideTheta[Indeterminate]
When it isn't at the boundary, for example
Simplify[fint[3/4,3/4]]
it returns
HeavisideTheta[Sign[5-4*Sqrt[7-28*α+25*α^2]]*Sign[5+4*Sqrt[7-28*α+25*α^2]]]
and that probably says that α is a free variable and we aren't able to determine the value of the Sign without more information.
So I think this is a strong hint where I would begin looking for why your integration is not simply completing.
If you are curious what that integrand looks like then try
α=1/4;
Plot3D[fint[s1,s2],{s1,1/2,1},{s2,1/2,s1}]

Related

Recomendations (functions/solution) to apply in OpenMDAO instead of boolean conditions (if/else)

I have been working for a couple of months with OpenMDAO and I find myself struggling with my code when I want to impose conditions for trying to replicate a physical/engineering behaviour.
I have tried using sigmoid functions, but I am still not convinced with that, due to the difficulty about trading off sensibility and numerical stabilization. Most of times I found overflows in exp so I end up including other conditionals (like np.where) so loosing linearity.
outputs['sigmoid'] = 1 / (1 + np.exp(-x))
I was looking for another kind of step function or something like that, able to keep linearity and derivability to the ease of the optimization. I don't know if something like that exists or if there is any strategy that can help me. If it helps, I am working with an OpenConcept benchmark, which uses vectorized computations ans Simpson's rule numerical integration.
Thank you very much.
PD: This is my first ever question in stackoverflow, so I would like to apologyze in advance for any error or bad practice commited. Hope to eventually collaborate and become active in the community.
Update after Justin answer:
I will take the opportunity to define a little bit more my problem and the strategy I tried. I am trying to monitorize and control thermodynamics conditions inside a tank. One of the things is to take actions when pressure P1 reaches certein threshold P2, for defining this:
eval= (inputs['P1'] - inputs['P2']) / (inputs['P1'] + inputs['P2'])
# P2 = threshold [Pa]
# P1 = calculated pressure [Pa]
k=100 #steepness control
outputs['sigmoid'] = (1 / (1 + np.exp(-eval * k)))
eval was defined in order avoid overflows normalizing the values, so when the threshold is recahed, corrections are taken. In a very similar way, I defined a function to check if there is still mass (so flowing can continue between systems):
eval= inputs['mass']/inputs['max']
k=50
outputs['sigmoid'] = (1 / (1 + np.exp(-eval*k)))**3
maxis also used for normalizing the value and the exponent is added for reaching zero before entering in the negative domain.
PLot (sorry it seems I cannot post images yet for my reputation)
It may be important to highlight that both mass and pressure are calculated from coupled ODE integration, in which this activation functions take part. I guess OpenConcept nature 'explore' a lot of possible values before arriving the solution, so most of the times giving negative infeasible values for massand pressure and creating overflows. For that sometimes I try to include:
eval[np.where(eval > 1.5)] = 1.5
eval[np.where(eval < -1.5)] = -1.5
That is not a beautiful but sometimes effective solution. I try to avoid using it since I taste that this bounds difficult solver and optimizer work.
I could give you a more complete answer if you distilled your question down to a specific code example of the function you're wrestling with and its expected input range. If you provide that code-sample, I'll update my answer.
Broadly, this is a common challenge when using gradient based optimization. You want some kind of behavior like an if-condition to turn something on/off and in many cases thats a fundamentally discontinuous function.
To work around that we often use sigmoid functions, but these do have some of the numerical challenges you pointed out. You could try a hyberbolic tangent as an alternative, though it may suffer the same kinds of problems.
I will give you two broad options:
Option 1
sometimes its ok (even if not ideal) to leave the purely discrete conditional in the code. Lets say you wanted to represent a kind of simple piecewise function:
y = 2x; x>=0
y = 0; x < 0
There is a sharp corner in that function right at 0. That corner is not differentiable, but the function is fine everywhere else. This is very much like the absolute value function in practice, though you might not draw the analogy looking at the piecewise definition of the function because the piecewise nature of abs is often hidden from you.
If you know (or at least can check after the fact) that your final answer will no lie right on or very near to that C1 discontinuity, then its probably fine to leave the code the way is is. Your derivatives will be well defined everywhere but right at 0 and you can simply pick the left or the right answer for 0.
Its not strictly mathematically correct, but it works fine as long as you're not ending up stuck right there.
Option 2
Apply a smoothing function. This can be a sigmoid, or a simple polynomial. The exact nature of the smoothing function is highly specific to the kind of discontinuity you are trying to approximate.
In the case of the piecewise function above, you might be tempted to define that function as:
2x*sig(x)
That would give you roughly the correct behavior, and would be differentiable everywhere. But wolfram alpha shows that it actually undershoots a little. Thats probably undesirable, so you can increase the exponent to mitigate that. This however, is where you start to get underflow and overflow problems.
So to work around that, and make a better behaved function all around, you could instead defined a three part piecewise polynomial:
y = 2x; x>=a
y = c0 + c1*x + c2*x**2; -a <= x < a
y = 0 x < -a
you can solve for the coefficients as a function of a (please double check my algebra before using this!):
c0 = 1.5a
c1 = 2
c2 = 1/(2a)
The nice thing about this approach is that it will never overshoot and go negative. You can also make a reasonably small and still get decent numerics. But if you try to make it too small, c2 will obviously blow up.
In general, I consider the sigmoid function to be a bit of a blunt instrument. It works fine in many cases, but if you try to make it approximate a step function too closely, its a nightmare. If you want to represent physical processes, I find polynomial fillet functions work more nicely.
It takes a little effort to derive that polynomial, because you want it to be c1 continuous on both sides of the curve. So you have to construct the system of equations to solve for it as a function of the polynomial order and the specific relaxation you want (0.1 here).
My goto has generally been to consult the table of activation functions on wikipedia: https://en.wikipedia.org/wiki/Activation_function
I've had good luck with sigmoid and the hyperbolic tangent, scaling them such that we can choose the lower and upper values as well as choosing the location of the activation on the x-axis and the steepness.
Dymos uses a vectorization that I think is similar to OpenConcept and I've had success with numpy.where there as well, providing derivatives for each possible "branch" taken. It is true that you may have issues with derivative mismatches if you have an analysis point right on the transition, but often I've had success despite that. If the derivative at the transition becomes a hinderance then implementing a sigmoid or relu are more appropriate.
If x is of a magnitude such that it can cause overflows, consider applying units or using scaling to put it within reasonable limits if you cannot bound it directly.

Taking advantage of Julia's integration abilities

One of the main reasons I wanted to use Julia for my project is because of its speed, especially for calculating integrals.
I would like to integrate a 1-d function f(x) over some interval [a,b]. In general Julia's quadgk function would be a fast and accurate solution. However, I do not have the function f(x), but only its values f(xi) for a discrete set of points xi in [a,b], stored in an array. The xi's are regularly spaced, and I can get the spacing to be however small I like.
Naively, I could simply define a function f which interpolates using the values f(xi) and feed this to quadgk, (and make the spacing as small as possible), however then I won't know what my error is, which is a shame because QuadGK tells you the error in its estimation.
Another solution is to write a function myself to integrate the array (with trapezoid rule for example), but that would defeat the purpose of using Julia...
What is the easiest way to accurately integrate a function only given discrete values using Julia?
Since you only have values, not the function itself, trapezoid will be your best bet probably. The package Trapz provides this (https://github.com/francescoalemanno/Trapz.jl). However, I think it is worth seeing how easy writing a pretty good implementation yourself would be.
function trap(A)
return sum(A) - (A[begin] + A[end])/2
end
This takes 2.9ms for an array of 10 million floats. If they're Int, then 2.9ms. If they were complex numbers, it would still work (and take 8.9 ms)
A method like this is a good example to show how simple it can be to write pretty fast code in Julia that is still fully generic

Integer ceil(sqrt(x))

The answer gives the following code for computing floor(sqrt(x)) using just integers. Is it possible to use/modify it to return ceil(sqrt(x)) instead? Alternatively, what is the preferred way to calculate such value?
Edit: Thank you all so far and I apologise, I should have make it more explicit: I was hoping there is more "natural" way of doing this that using floor(sqrt(x)), possibly plus one. The floor version uses Newton's method to approach the root from above, I thought that maybe approaching it from below or similar would do the trick.
For example the answer even provides how to round to nearest integer: just input 4*x to the algorithm.
If x is an exact square, the ceiling and the floor of the square root are equal; otherwise, the ceiling is one more than the square root. So you could use (in Python),
result = floorsqrt(x)
if result * result != x:
result += 1
Modifying the code you linked to is not a good idea, since that code uses some properties of the Newton-Raphson method of calculating the square root. Much theory has been developed about that method, and the code uses that theory. The code I show is not as neat as modifying your linked code but it is safer and probably quicker than making a change in the code.
You can use this fact that:
floor(x) = (ceil(x) - 1) if x \not \in Z else ceil(x)
Hence, check if N is in the form 2^k, the code is the same, and if it is not, you can -1 the result of the current code.

Mathematica not evaluating any trigonometric integrals

When I input any integral, say sin[x] or ln[x], all I get back is the input. And from what I understand that usually means that mathematica can't evaluate the integral...What am i missing here?
I'm new to stackoverflow so I can't post screenshots.
Read the documentation:
Sin[x] manual
Log[x] manual
Note also that by default Log function is of base e, therefore it is your Ln.
In addition, if you have further questions mathematica/wolfram has it's own subsite where you should ask related questions https://mathematica.stackexchange.com/
As of version 9 you can query wolfram alpha directly and it can even try to interpret human language (and fix syntax mistakes like this). To query wolfram alpha start line with =.

How to make nonsymbolic plot_vector_field in sage?

I have a function f(x,y) whose outcome is random (I take mean from 20 random numbers depending on x and y). I see no way to modify this function to make it symbolic.
And when I run
x,y = var('x,y')
d = plot_vector_field((f(x),x), (x,0,1), (y,0,1))
it says it can't cast symbolic expression to real or rationa number. In fact it stops when I write:
a=matrix(RR,1,N)
a[0]=x
What is the way to change this variable to real numbers in the beginning, compute f(x) and draw a vector field? Or just draw a lot of arrows with slope (f(x),x)?
I can create something sort of like yours, though with no errors. At least it doesn't do what you want.
def f(m,n):
return m*randint(100,200)-n*randint(100,200)
var('x,y')
plot_vector_field((f(x,y),f(y,x)),(x,0,1),(y,0,1))
The reason is because Python functions immediately evaluate - in this case, f(x,y) was 161*x - 114*y, though that will change with each invocation.
My suspicion is that your problem is similar, the immediate evaluation of the Python function once and for all. Instead, try lambda functions. They are annoying but very useful in this case.
var('x,y')
plot_vector_field((lambda x,y: f(x,y), lambda x,y: f(y,x)),(x,0,1),(y,0,1))
Wow, I now I have to find an excuse to show off this picture, cool stuff. I hope your error ends up being very similar.

Resources