Understanding ForwardDiff.jl issues - Julia

I ran into an apparently quite common Julia error when trying to use AD with ForwardDiff.jl. The error messages vary a bit (sometimes no matching method for the function name, sometimes for Float64):
MethodError: no method matching logL_multinom(::Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(logL_multinom), Real}, Real, 7}})
My goal: transform a probability vector to an unbounded space (θ -> y), do some work there (namely HMC sampling), and transform back to the simplex whenever the unnormalized posterior (logL_multinom()) is evaluated. AD should be used to overcome problems with later, more complex models than this one.
Unfortunately, neither the Julia documentation nor the solutions from other questions helped me figure out this particular problem. In particular, it works when I do the first transformation (y -> z) outside of the function, even though that transformation is a one-to-one mapping via the logistic function and should not cause any harm to differentiation.
Here is an MWE:
using LinearAlgebra
using ForwardDiff
using Base
function logL_multinom(y)
    # transform to constrained
    K = length(y) + 1
    k = collect(1:(K-1))
    # inverse logit:
    z = 1 ./ (1 .+ exp.(-y .- log.(K .- k)))  # if this is outside, it works
    θ = zeros(eltype(y), K); x_cumsum = zeros(eltype(y), K-1)
    typeof(θ)
    for i in k
        x_cumsum[i] = 1 - sum(θ)
        θ[i] = x_cumsum[i] * z[i]
    end
    θ[K] = x_cumsum[K-1] - θ[K-1]
    #log_dens_correction = sum( log(z*(1-z)*x_cumsum) )
    dot(colSums, log.(θ))
end
colSums = [835, 52, 1634, 3469, 3053, 2507, 2279, 1115]
y0 = [-0.8904013824298864, -0.8196709647741431, -0.2676845405543302, 0.31688184351556026, -0.870860684394019,0.15187821053559714,0.39888119498547964]
logL_multinom(y0)
∇L = y -> ForwardDiff.gradient(logL_multinom,y)
∇L(y0)
Thanks a lot; further reading and explanations of the problem are especially appreciated, since I'll be working with this more often :D
Edit: I tried converting the input and every intermediate variable to Real (or arrays of Real), but nothing has helped so far.
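For reference, the inverse-logit step on its own differentiates without problems; here is a quick stand-alone check using only ForwardDiff (the helper name invlogit_step is just for this check):
using ForwardDiff
# The first transformation (y -> z) from the MWE, isolated into a hypothetical helper:
function invlogit_step(y)
    K = length(y) + 1
    k = collect(1:(K-1))
    return 1 ./ (1 .+ exp.(-y .- log.(K .- k)))
end
ForwardDiff.jacobian(invlogit_step, y0)  # returns a 7x7 Jacobian for the y0 above, no MethodError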

Related

How to calculate complex functions in Fortran which are not easy to separate into real and imaginary components?

I need to calculate a complex function for different x values, something like
(i+1) exp(i * x) / x
As far as I can see, my only option is to expand it in terms of sin and cos, separate the real and imaginary parts manually so that I can compute them individually in code, and then assemble my complex function from them. This is a relatively simple function, but I have some bigger ones that are not so easy to split manually into two components.
Am I missing something, or is this the only way?
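For reference, the manual sin/cos expansion for this particular function works out to
$\frac{(1+i)\,e^{ix}}{x} = \frac{\cos x - \sin x}{x} + i\,\frac{\cos x + \sin x}{x}$,
so it is doable by hand here, but this is exactly the kind of separation I would like to avoid for the bigger expressions.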
EDIT: After the helpful comments from everyone below, I am pasting sample code that works; I hope it's useful:
program complex_func
    implicit none
    real(8) :: x
    complex :: CF

    x = 0.7
    call complex_example(x, CF)
    write(*,*) CF
end program complex_func

subroutine complex_example(y, my_CF)
    implicit none
    real(8) :: y
    complex :: my_CF
    complex, parameter :: i = (0, 1)  ! sqrt(-1)

    my_CF = (i+1) * exp(i*y) / y
    !my_CF = cmplx(1, 1) * exp(cmplx(0.0, y)) / y  !! THIS WORKS, TOO
    write(*,*) my_CF
    return
end subroutine complex_example

Plot of function, DomainError. Exponentiation yielding a complex result requires a complex argument

Background
I read here that Newton's method fails on the function x^(1/3) when its initial guess is 1. I am trying to test this in a Julia Jupyter notebook.
I want to plot the function x^(1/3),
and then I want to run this code:
using ForwardDiff, Roots
f = x -> x^(1/3)
D(f) = x -> ForwardDiff.derivative(f, float(x))
x = find_zero((f, D(f)), 1, Roots.Newton(), verbose=true)
Problem:
How do I plot the function x^(1/3) over a range such as (-1, 1)?
I tried
using Plots
f = x -> x^(1/3)
plot(f, -1, 1)
and got a DomainError. I then changed the code to
f = x -> (x+0im)^(1/3)
plot(f, -1, 1)
but that did not give me what I want either. I want my plot to look like the plot of x^(1/3) on Google; however, I cannot get more than half of it.
That's because x^(1/3) does not always return a real result, let alone the real cube root of x. For negative numbers, exponentiation with non-integer powers like 1/3 (or 1.254, and presumably any non-integer) would return a Complex. Because of type-stability requirements in Julia, this operation applied to a negative Real gives a DomainError instead. This behavior is also noted in the Frequently Asked Questions section of the Julia manual.
julia> (-1)^(1/3)
ERROR: DomainError with -1.0:
Exponentiation yielding a complex result requires a complex argument.
Replace x^y with (x+0im)^y, Complex(x)^y, or similar.
julia> Complex(-1)^(1/3)
0.5 + 0.8660254037844386im
Note that the behavior of returning a complex number when exponentiating negative values is not really different from, say, MATLAB's behavior:
>> (-1)^(1/3)
ans =
0.5000 + 0.8660i
What you want, however, is to plot the real cube root.
You can go with
plot(x -> x < 0 ? -(-x)^(1//3) : x^(1//3), -1, 1)
to enforce the real cube root, or use the built-in cbrt function for that instead:
plot(cbrt, -1, 1)
It also has an alias ∛.
plot(∛, -1, 1)
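Tying this back to the Newton experiment in the question: the real cube root can be plugged straight into Roots.jl. This is just a sketch reusing the question's own setup (ForwardDiff and Roots loaded); Newton's method on the cube root from a starting point of 1 produces the iterates 1, -2, 4, -8, ..., so find_zero is expected to report a convergence failure, which is the behaviour the question set out to reproduce:
using ForwardDiff, Roots
f = cbrt                                    # real cube root, defined for negative x as well
D(f) = x -> ForwardDiff.derivative(f, float(x))
find_zero((f, D(f)), 1, Roots.Newton(), verbose=true)  # verbose=true prints the diverging trace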
f(x) = x^(1/3) is an odd function, so you can just use [0, 1] as the input range;
the plot on [-1, 0] is then deduced by odd symmetry.
The code is below:
import numpy as np
import matplotlib.pyplot as plt
# Function f
f = lambda x: x**(1/3)
fig, ax = plt.subplots()
x1 = np.linspace(0, 1, num = 100)
x2 = np.linspace(-1, 0, num = 100)
ax.plot(x1, f(x1))            # right half: x in [0, 1]
ax.plot(x2, -f(x1[::-1]))     # left half by odd symmetry: f(x) = -f(-x)
ax.axhline(y=0, color='k')
ax.axvline(x=0, color='k')
plt.show()
That Google plot makes no sense to me. For x > 0 it's OK, but for negative values of x the correct result of x^(1/3) is complex, and the Google plot appears to be showing the negative of the cube root of the absolute value, which is strange.
Below you can see the output from MATLAB, which is less fussy about types than Julia. As you can see, it does not agree with your plot.
From the plot you can see that positive x values give a real-valued answer, while negative x values give a complex-valued answer. The reason Julia errors for negative inputs is that the language is very concerned with type stability: having the output type of a function depend on the input value would cause a type instability, which harms performance. This is less of a concern for MATLAB or Python, etc.
If you want a plot similar to the one above in Julia, you can define your function like this:
f = x -> sign(x) * abs(complex(x)^(1/3))
Edit: Actually, a better and faster version is
f = x -> sign(x) * abs(x)^(1/3)
Yeah, it looks awkward, but that's because you want a really strange plot, which imho makes no sense for the function x^(1/3).
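For completeness, plotting that version over (-1, 1) gives the sign-symmetric curve the question was after (assuming Plots.jl is loaded, as in the question):
plot(x -> sign(x) * abs(x)^(1/3), -1, 1)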

How do I write a piecewise Differential Equation in Julia?

I am new to Julia, and I would like to solve this system:
where k1 and k2 are constant parameters. However, I = 0 when y < 0 and I = K*y otherwise, where K is a constant.
I followed the ODE tutorial. The question is: how do I solve this piecewise differential equation in DifferentialEquations.jl?
Answered on the OP's cross post on Julia Discourse; copied here for completeness.
Here is a (mildly) interesting example $x''+x'+x=\pm p_1$ where the sign of $p_1$ changes when a switching manifold is encountered at $x=p_2$. To make things more interesting, consider hysteresis in the switching manifold such that $p_2\mapsto -p_2$ whenever the switching manifold is crossed.
The code is relatively straightforward; the StaticArrays/SVector/MVector can be ignored, they are only for speed.
using OrdinaryDiffEq
using StaticArrays
f(x, p, t) = SVector(x[2], -x[2]-x[1]+p[1]) # x'' + x' + x = ±p₁
h(u, t, integrator) = u[1]-integrator.p[2] # switching surface x = ±p₂;
g(integrator) = (integrator.p .= -integrator.p) # impact map (p₁, p₂) = -(p₁, p₂)
prob = ODEProblem(f,                  # RHS
                  SVector(0.0, 1.0),  # initial value
                  (0.0, 100.0),       # time interval
                  MVector(1.0, 1.0))  # parameters
cb = ContinuousCallback(h, g)
sol = solve(prob, Vern6(), callback=cb, dtmax=0.1)
Then plot sol[2,:] against sol[1,:] to see the phase plane - a nice non-smooth limit cycle in this case.
Note that if you try to use interpolation of the resulting solution (i.e., sol(t)), you need to be very careful around the points with a discontinuous derivative, as the interpolant goes a little awry there. That's why I've used dtmax=0.1 to get smoother solution output in this case. (I'm probably not using the most appropriate integrator either, but it's the one I was using in a previous piece of code that I copied and pasted.)
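A minimal way to produce that phase-plane plot, assuming Plots.jl (not part of the original snippet):
using Plots
plot(sol[1,:], sol[2,:], xlabel="x", ylabel="x'", legend=false)  # x' against x: the non-smooth limit cycle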

This is more a matlab/math brain teaser than a question

Here is the setup. No assumptions for the values I am using.
n=2; % dimension of vectors x and (square) matrix P
r=2; % number of x vectors and P matrices
x1 = [3;5]
x2 = [9;6]
x = cat(2,x1,x2)
P1 = [6,11;15,-1]
P2 = [2,21;-2,3]
P(:,1)=P1(:)
P(:,2)=P2(:)
modePr = [-.4;16]
TransPr=[5.9,0.1;20.2,-4.8]
pred_modePr = TransPr'*modePr
MixPr = TransPr.*(modePr*(pred_modePr.^(-1))')
x0 = x*MixPr
Then it was time to apply the following formula to get myP:
$myP_j = \sum_{i=1}^{r} \mu_{ij}\left[P_i + (x_i - x_{0j})(x_i - x_{0j})^T\right]$,
where $\mu_{ij}$ is MixPr, $x_i$ is x(:,i), and $x_{0j}$ is x0(:,j). I used this code to get it:
myP=zeros(n*n,r);
Ptables(:,:,1)=P1;
Ptables(:,:,2)=P2;
for j=1:r
for i = 1:r;
temp = MixPr(i,j)*(Ptables(:,:,i) + ...
(x(:,i)-x0(:,j))*(x(:,i)-x0(:,j))');
myP(:,j)= myP(:,j) + temp(:);
end
end
Some brilliant guy proposed this formula as another way to produce myP
for j=1:r
xk1=x(:,j); PP=xk1*xk1'; PP0(:,j)=PP(:);
xk1=x0(:,j); PP=xk1*xk1'; PP1(:,j)=PP(:);
end
myP = (P+PP0)*MixPr-PP1
I tried to formulate the equality between the two methods, and it seems to be this one (to make things easier, I skipped the summation of the matrices P in both methods):
$\sum_{i} \mu_{ij}\,(x_i - x_{0j})(x_i - x_{0j})^T = \sum_{i} \mu_{ij}\, x_i x_i^T - x_{0j}\, x_{0j}^T$,
where the left-hand side comes from the formula that I used, and the right-hand side comes from his code snippet. Do you think this is an obvious equality? If yes, ignore all the above and just try to explain why. I could only start from the LHS, and after some algebra I think I proved that it equals the RHS. However, I can't see how he (or she) thought of it in the first place.
Using E for expectation, the one-dimensional version of your formula is the familiar
Variance(X) = E((X-E(X))^2) = E(X^2) - E(X)^2
While the second form might make for easier programming, I'd worry about ending up with a negative (or, in the multidimensional case, non-positive-definite) answer by using it, due to rounding error.
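Spelled out in the multivariate case, here is a sketch of why the two computations agree. It uses $\sum_i \mu_{ij} = 1$ (each column of MixPr is normalized by pred_modePr) and $x_{0j} = \sum_i \mu_{ij} x_i$ (which is exactly what x0 = x*MixPr computes):
$\sum_i \mu_{ij}(x_i - x_{0j})(x_i - x_{0j})^T = \sum_i \mu_{ij} x_i x_i^T - x_{0j}\big(\sum_i \mu_{ij} x_i\big)^T - \big(\sum_i \mu_{ij} x_i\big) x_{0j}^T + \big(\sum_i \mu_{ij}\big) x_{0j} x_{0j}^T = \sum_i \mu_{ij} x_i x_i^T - x_{0j} x_{0j}^T$,
which is the posted equality, the summation of the P matrices aside.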

Minimizing a function containing an integral

Does anyone know how to minimize a function containing an integral in MATLAB? The function looks like this:
L = Int(t=0,t=T)[(A*R - x)dt], where A is a system parameter, and R and x are related through:
dR/dt = axRY - bR, where a and b are constants.
dY/dt = -xRY
I read somewhere that I can use fminbnd and quad in combination but I am not able to make it work. Any suggestions?
Perhaps you could give more details of your integral. Is there any dependence of x on t, or can we integrate dR/dt = a*x*R - b*R to give R = C*exp((a*x-b)*t)? In any case, to answer your question on fminbnd and quad, you could set A, C, T, a, b, xmin and xmax (the last two being the range over which you look for the minimum) and use:
[x, fval] = fminbnd(@(x) quad(@(t) A*C*exp((a*x-b)*t) - x, 0, T), xmin, xmax)
This finds the x that minimizes the integral.
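The same pattern (a one-dimensional minimizer wrapped around numerical quadrature of the integrand) can also be sketched in Julia, purely for illustration; QuadGK.jl, Optim.jl, and the parameter values below are assumptions, not part of the original answer:
using QuadGK, Optim
# Toy setup mirroring the MATLAB snippet: R(t) = C*exp((a*x - b)*t), objective L(x) = integral of (A*R - x) over [0, T].
A, C, a, b, T = 1.0, 1.0, 0.5, 1.0, 10.0            # arbitrary example values
L(x) = quadgk(t -> A * C * exp((a*x - b) * t) - x, 0, T)[1]
res = optimize(L, -10.0, 10.0)                       # Brent's method on the bracket [xmin, xmax] = [-10, 10]
Optim.minimizer(res)                                 # the x that minimizes the integral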
If I didn't get it wrong, you are trying to minimize with respect to t:
$\int_0^t (AR - x)\, dt$
Well, then you just need to find the zeros of
AR - x
This is just math, not MATLAB ;)
Here's some manipulation of your equations that might help.
Combining the second and third equations you gave gives
dR/dt = -a*(dY/dt)-bR
Now if we solve for R on the righthand side and plug it into the first equation you gave we get
L = Int(t=0,t=T)[(-A/b*(dR/dt + a*dY/dt) - x)dt]
Now we can integrate the first term to get:
L = -A/b*[R(T) - R(0) + a*(Y(T) - Y(0))] - Int(t=0,t=T)[(x)dt]
So now all that matters with regard to R and Y are their endpoint values. In fact, you may as well define a new function Z which equals R + a*Y. Then you get
L = -A/b*[Z(T) - Z(0)] - Int(t=0,t=T)[(x)dt]
I'm not as confident in this next part. The integral of x with respect to t gives some function evaluated at t = 0 and t = T; call this function X to get:
L = -A/b*[Z(T) - Z(0)] - X(T) + X(0)
This equation holds true for all T, so we can set T to t if we want to.
L = -A/b*[Z(t) - Z(0)] - X(t) + X(0)
Also, we can group a lot of the constants together and call them C to give
X(t) = -A/b*Z(t) + C
where
C = A/b*Z(0) + X(0) - L
So I'm not sure what else to do with this, but I've shown that the integral of x(t) is linearly related to Z(t) = R(t) + a*Y(t). It seems to me that there are many functions that satisfy this. Anyone else see where to go from here? Any problems with my math?
