Automatic Differentiation in Julia: Hessian from ReverseDiffSparse - julia

How can I evaluate the Hessian of a function in Julia using automatic differentiation (preferably using ReverseDiffSparse)? In the following example, I can compute and evaluate the gradient at the point values through JuMP:
m = Model()
@variable(m, x)
@variable(m, y)
@NLobjective(m, Min, sin(x) + sin(y))
values = zeros(2)
values[linearindex(x)] = 2.0
values[linearindex(y)] = 3.0
d = JuMP.NLPEvaluator(m)
MathProgBase.initialize(d, [:Grad])
objval = MathProgBase.eval_f(d, values) # == sin(2.0) + sin(3.0)
∇f = zeros(2)
MathProgBase.eval_grad_f(d, ∇f, values)
# ∇f[linearindex(x)] == cos(2.0)
# ∇f[linearindex(y)] == cos(3.0)
Now I want the Hessian at values. I've tried the following:
MathProgBase.initialize(d, [:Grad,:Hess])
H = zeros(2); # based on MathProgBase.hesslag_structure(d) = ([1,2],[1,2])
MathProgBase.eval_hesslag(d, H, values, 0, [0])
but am getting an error.
For reference, this is cross-posted on the helpful Julia Discourse forum here.
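For what it's worth, the error likely comes from the last two arguments of eval_hesslag. A minimal sketch of the call, assuming the MathProgBase convention that the fourth argument is the objective scaling σ (a Float64) and the fifth is the vector of constraint duals μ (empty here, since the model has no constraints):
MathProgBase.initialize(d, [:Grad, :Hess])
rows, cols = MathProgBase.hesslag_structure(d)   # ([1, 2], [1, 2]) for this model
H = zeros(length(rows))                          # one entry per structural nonzero
MathProgBase.eval_hesslag(d, H, values, 1.0, Float64[])
# expected: H[1] ≈ -sin(2.0) and H[2] ≈ -sin(3.0), in the order given by hesslag_structure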

Related

Is there an R spline that matches the Schoenberg algorithm?

Is there an R algorithm that fits smoothing splines while minimizing L,
L = ρ ∑_{i=0}^{n−1} w_i (y_i − S(x_i))² + (1 − ρ) ∫_{x_0}^{x_{n−1}} (S″(x))² dx
Maybe it's possible with smooth.spline, but I didn't manage to find the right parameters.
(The equation can be seen more clearly here : https://www.iro.umontreal.ca/~simardr/ssj/doc/html/umontreal/iro/lecuyer/functionfit/SmoothingCubicSpline.html)
pspline::smooth.Pspline(x = input_x,
                        y = input_y,
                        norder = 2,
                        method = 1,
                        spar = rho)
will do the job.

Julia ERROR: 'Can't Promote to Common Type' for multivariate polynomials in Nemo Library

I am new to Julia and have been trying to compute some polynomials using the Nemo library's multivariate polynomials. I have the following code:
R = GF(2); # create finite field
S, (z, x) = PolynomialRing(R, ["z", "x"]); # Multivariate polynomial
# polynomial initialisations
L = x^0;
E_L = x^0;
E = x*0;
a = z;
w = [1 1 a^3 a^2 a^1 0 0];
n = length(w); #input vector, in terms of a
j = 1;
for i = 1:n
    if w[i] != 0
        L = L*(1 - (a^i)*x)        # equation for the locator polynomial
        if i != j
            E_L = E_L*(1 - (a^j)*x)
        end
        j = j + 1
        E = E + w[i]*(a^i)*E_L     # LINE WITH ERROR
    end
end
I am not entirely sure why the line with the expression for E throws an error.
I have tried a similar thing with the declaration of L, L = L + L*(1-(a^i)*x), to see whether it was something to do with adding two polynomials; however, that works fine. I am therefore confused as to why the expression for E throws an error.
Any help would be greatly appreciated! Thanks in advance!
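Not a diagnosis, but a hedged debugging sketch that may help narrow things down: since w mixes plain Int literals with Nemo ring elements, it can be worth checking what kind of array the literal actually produces and what type each entry has:
using Nemo

R = GF(2)
S, (z, x) = PolynomialRing(R, ["z", "x"])
a = z
w = [1 1 a^3 a^2 a^1 0 0]
@show typeof(w)      # is this a homogeneous Nemo array or a Matrix of mixed element types?
@show typeof.(w)     # per-element types; the Int entries and the polynomial entries differ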

Double integration with a differentiation inside in R

I need to integrate the following function, which has a derivative inside. Unfortunately, that term is not easily differentiable.
Is it possible to do something like numerical integration to evaluate this in R?
You can assume 30, 50, 0.5, 1, 50, 30 for l, tau, a, b, F and P respectively.
UPDATE: What I tried
InnerFunc4 <- function(t,x){digamma(gamma(a*t*(LF-LP)*b)/gamma(a*t))*(x-t)}
InnerIntegral4 <- Vectorize(function(x) { integrate(InnerFunc4, 1, x, x = x)$value})
integrate(InnerIntegral4, 30, 80)$value
It shows the following error:
Error in integrate(InnerFunc4, 1, x, x = x) : non-finite function value
UPDATE2:
InnerFunc4 <- function(t,L){digamma(gamma(a*t*(LF-LP)*b)/gamma(a*t))*(L-t)}
t_lower_bound = 0
t_upper_bound = 30
L_lower_bound = 30
L_upper_bound = 80
step_size = 0.5
integral = 0
t <- t_lower_bound + 0.5*step_size
while (t < t_upper_bound) {
  L = L_lower_bound + 0.5*step_size
  while (L < L_upper_bound) {
    volume = InnerFunc4(t, L)*step_size**2
    integral = integral + volume
    L = L + step_size
  }
  t = t + step_size
}
Since it seems that your problem is only the derivative, you can get rid of it by means of integration by parts:
Edit: This solution is not applicable for a lower integration bound of 0.
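As a generic sketch of the identity being used (the original integrand was posted as an image and is not reproduced here, so g and h below are placeholders): integration by parts gives
∫_a^b g(t) h′(t) dt = g(b) h(b) − g(a) h(a) − ∫_a^b g′(t) h(t) dt,
which moves the derivative off the factor that is hard to differentiate and onto the one that is easy, leaving an integrand that ordinary numerical integration can handle.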

Julia - MethodError: no method matching current_axis(::Nothing)

I am trying to test a linear approximation function and I am getting the error "no method matching current_axis(::Nothing)".
Here is my linear approximation function:
function linear_approx(A, b, c, p0)
    p0 = [i for i in p0]
    y(p) = p'*A*p .+ b'*p .+ c .- 1
    e = y(p0)
    d = 2*A*p0 + b
    (; d, e)
end
Here is the function that attempts to plot and throws an exception. I also included that value of the parameter when I tried to call it:
pts = [(1,1), (3,2), (4,4)]
function visualize_approx(pts)
    # Use this function to inspect your solution, and
    # ensure that the three points lie on one of
    # the level-sets of your quadratic approximation.
    (; A, b, c) = constant_curvature_approx(pts)
    min_val = Inf
    max_val = -Inf
    for pt in pts
        (; d, e) = linear_approx(A, b, c, pt)
        P = LinRange(pt[1] - 0.2, pt[1] + 0.2, 100)
        Q = linear_segment(pt, d, e, P)
        # the error arises here
        plot!(P, Q)
        plot!([pt[1]], [pt[2]])
    end
    delta = max_val - min_val
    min_val -= 0.25*delta
    max_val += 0.25*delta
    X = Y = LinRange(min_val, max_val, 100)
    Z = zeros(100, 100)
    for i = 1:100
        for j = 1:100
            pt = [X[i]; Y[j]]
            Z[i,j] = pt'*A*pt + pt'*b + c
        end
    end
    contour(X, Y, Z, levels=[-1,0,1,2,3])
    for pt in pts
        plot!([pt[1]], [pt[2]])
    end
    current_figure()
end
Does anyone know why this error arises?
plot! modifies a previously created plot object. It seems like you did not create a plot before calling it. This is why you get the error. Use plot when creating the plot and plot! when modifying it.
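A minimal sketch of that pattern, assuming a Makie backend (current_axis and current_figure are Makie functions); the data here is only illustrative:
using GLMakie   # assumed backend; CairoMakie works the same way

xs = LinRange(-2, 2, 100)
fig = Figure()
ax = Axis(fig[1, 1])
lines!(ax, xs, xs .^ 2)        # mutating calls like lines!/plot! need an existing figure and axis;
scatter!(ax, [1.0], [1.0])     # calling them before one exists triggers current_axis(::Nothing)
fig                            # or current_figure()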

2D curve fitting in Julia

I have an array Z in Julia which represents an image of a 2D Gaussian function, i.e. Z[i,j] is the height of the Gaussian at pixel (i, j). I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (randomly selected points). Then I tried the LsqFit package, but it seems to be tailored for 1D fitting, as it is throwing errors when I try to fit 2D data, and there is no documentation I can find to lead me to a solution.
How can I fit a Gaussian to a 2D array in Julia?
The simplest approach is to use Optim.jl. Here is example code (not optimized for speed, but it should show how you can handle the problem):
using Distributions, Optim
# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])
function objective(x)
    mu, sig, s = decode(x)
    try # sig might be infeasible, so we have to handle this case
        est_d = MvNormal(mu, sig)
        ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
        sum((a-b)^2 for (a, b) in zip(ref_m, m))
    catch
        sum(m)
    end
end
# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively, you could use constrained optimization, e.g. like this:
using Distributions, JuMP, NLopt
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
@variable(m, m1)
@variable(m, m2)
@variable(m, sig11 >= 0.001)
@variable(m, sig12)
@variable(m, sig22 >= 0.001)
@variable(m, sc >= 0.001)
function obj(m1, m2, sig11, sig12, sig22, sc)
    est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
    ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
    sum((a-b)^2 for (a, b) in zip(ref_Z, Z))
end
JuMP.register(m, :obj, 6, obj, autodiff=true)
@NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
@NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])
In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid distance and offset positions), and Ri and Rj are the ranges of the indices. dist is the distance measure you use, e.g. squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = @views v[1:N], reshape(v[N+1:end], N, N)
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidefinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently and optimize over some lower triangular matrix L, such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
    loss(μ, L * L')
end
(This is of course open to further optimization, such as expanding the multiplication directly into loss.) A different way is to specify the condition as constraints to the optimizer.
For the optimizer to work you probably have to provide the derivative of loss_packed. Either you calculate it manually (with a good choice of dist), or perhaps more easily by using a log transformation (if you're lucky, you find a way to reduce it to a linear problem...). Alternatively, you could use an optimizer that does automatic differentiation.
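A minimal sketch of that last route (an optimizer with automatic differentiation), assuming Optim.jl with ForwardDiff-based autodiff; the grid, the sample data, and the explicit Gaussian density used in place of dist/N are illustrative only:
using LinearAlgebra, Optim

# illustrative data: samples of a known 2D Gaussian on a grid
xs = -3:0.1:3
ys = -3:0.1:3
gauss(p, μ, Σ) = exp(-0.5 * dot(p - μ, Σ \ (p - μ))) / (2π * sqrt(det(Σ)))
Z = [gauss([x, y], [1.0, 0.0], [2.0 1.0; 1.0 3.0]) for x in xs, y in ys]

# pack μ and the lower-triangular factor L of Σ = L*L' into one vector
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])

function loss_packed(v)
    μ, L = unpack(v)
    Σ = L * L' + 1e-9 * I     # tiny jitter keeps Σ safely positive definite
    sum(abs2, Z .- [gauss([x, y], μ, Σ) for x in xs, y in ys])
end

res = optimize(loss_packed, [0.0, 0.0, 1.0, 0.0, 1.0], BFGS(); autodiff = :forward)
μ_est, L_est = unpack(Optim.minimizer(res))
Σ_est = L_est * L_est'        # recovered covariance estimate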
