I am trying to minimize a nonlinear function with nonlinear inequality constraints with NLopt and JuMP.
In my test code below, I am minimizing a function with a known global minimum.
Local optimizers such as LD_MMA fail to find this global minimum, so I am trying to use the global optimizers in NLopt that allow nonlinear inequality constraints.
However, when I check my termination status, it says “termination_status(model) = MathOptInterface.OTHER_ERROR”. I am not sure which part of my code to check for this error.
What could be the cause?
I am using JuMP because I plan to use other solvers such as KNITRO in the future, but would it be better to use NLopt's own syntax directly?
Below is my code:
# THIS IS A CODE TO SOLVE FOR THE TOYMODEL
# THE EQUILIBRIUM IS CHARACTERIZED BY A NONLINEAR SYSTEM OF ODEs OF INCREASING FUNCTIONS B(x) and S(y)
# THE GOAL IS TO APPROXIMATE B(x) and S(y) WITH POLYNOMIALS
# FIND THE POLYNOMIAL COEFFICIENTS THAT MINIMIZE THE LEAST SQUARES OF THE EQUILIBRIUM EQUATIONS
# load packages
using Roots, NLopt, JuMP
# model primitives and other parameters
k = .5 # equal split
d = 1 # degree of polynomial
nparam = 2*d+2 # number of parameters to estimate
m = 10 # number of grids
m -= 1
vGrid = range(0,1,m) # discretize values
c1 = 0 # lower bound for B'() and S'()
c2 = 2 # lower and upper bounds for offers
c3 = 1 # lower and upper bounds for the parameters to be estimated
# objective function to be minimized
function obj(α::T...) where {T<:Real}
# split parameters
αb = α[1:d+1] # coefficients for B(x)
αs = α[d+2:end] # coefficients for S(y)
# define B(x), B'(x), S(y), and S'(y)
B(v) = sum([αb[i] * v .^ (i-1) for i in 1:d+1])
B1(v) = sum([αb[i] * (i-1) * v ^ (i-2) for i in 2:d+1])
S(v) = sum([αs[i] * v .^ (i-1) for i in 1:d+1])
S1(v) = sum([αs[i] * (i-1) * v ^ (i-2) for i in 2:d+1])
# the equilibrium is characterized by the following first order conditions
#FOCb(y) = B(k * y * S1(y) + S(y)) - S(y)
#FOCs(x) = S(- (1-k) * (1-x) * B1(x) + B(x)) - B(x)
function FOCb(y)
sy = S(y)
binv = find_zero(q -> B(q) - sy, (-c2, c2))
return k * y * S1(y) + sy - binv
end
function FOCs(x)
bx = B(x)
sinv = find_zero(q -> S(q) - bx, (-c2, c2))
return (1-k) * (1-x) * B1(x) - B(x) + sinv
end
# evaluate the FOCs at each grid point and return the sum of squares
Eb = [FOCb(y) for y in vGrid]
Es = [FOCs(x) for x in vGrid]
E = [Eb; Es]
return E' * E
end
# this is the actual global minimum
αa = [1/12, 2/3, 1/4, 2/3]
obj(αa...)
# do optimization
model = Model(NLopt.Optimizer)
set_optimizer_attribute(model, "algorithm", :GN_ISRES)
@variable(model, -c3 <= α[1:nparam] <= c3)
@NLconstraint(model, [j = 1:m], sum(α[i] * (i-1) * vGrid[j] ^ (i-2) for i in 2:d+1) >= c1) # B should be increasing
@NLconstraint(model, [j = 1:m], sum(α[d+1+i] * (i-1) * vGrid[j] ^ (i-2) for i in 2:d+1) >= c1) # S should be increasing
register(model, :obj, nparam, obj, autodiff=true)
@NLobjective(model, Min, obj(α...))
println("")
println("Initial values:")
for i in 1:nparam
set_start_value(α[i], αa[i]+rand()*.1)
println(start_value(α[i]))
end
JuMP.optimize!(model)
println("")
@show termination_status(model)
@show objective_value(model)
println("")
println("Solution:")
sol = [value(α[i]) for i in 1:nparam]
My output:
Initial values:
0.11233072522513032
0.7631843020124309
0.3331559403539963
0.7161240026812674
termination_status(model) = MathOptInterface.OTHER_ERROR
objective_value(model) = 0.19116585196576466
Solution:
4-element Vector{Float64}:
0.11233072522513032
0.7631843020124309
0.3331559403539963
0.7161240026812674
I answered on the Julia forum: https://discourse.julialang.org/t/mathoptinterface-other-error-when-trying-to-use-isres-of-nlopt-through-jump/87420/2.
Posting my answer for posterity:
You have multiple issues:
range(0,1,m) should be range(0,1; length = m) on Julia 1.6. (The positional range(start, stop, length) method was only added in Julia 1.7, so your code works as written only on newer versions.)
Sometimes your objective function errors because the root doesn't exist. If I run with Ipopt, I get
ERROR: ArgumentError: The interval [a,b] is not a bracketing interval.
You need f(a) and f(b) to have different signs (f(a) * f(b) < 0).
Consider a different bracket or try fzero(f, c) with an initial guess c.
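One plausible way the bracketing error arises (a sketch under assumptions, not a diagnosis of the exact run): find_zero with the bracket (-c2, c2) needs B(q) - sy to change sign on that interval, but an optimizer exploring the coefficient box can try coefficients for which it does not. The helper is_bracketing, the target value sy = 0.5, and the nearly flat coefficients (0.0, 0.01) below are hypothetical and chosen only for illustration.

```julia
# A bracket (a, b) is usable by bracketing root finders only if f changes sign on it.
is_bracketing(f, a, b) = f(a) * f(b) < 0

# Linear B(q) = αb[1] + αb[2] * q, i.e. the d = 1 case from the question.
B(q, αb) = αb[1] + αb[2] * q

c2 = 2.0   # bracket half-width, as in the question
sy = 0.5   # hypothetical value of S(y) that we try to invert

# Near the known optimum, B(q) - sy changes sign on (-c2, c2).
good = is_bracketing(q -> B(q, (1/12, 2/3)) - sy, -c2, c2)

# Hypothetical trial point with a tiny slope: B(q) stays below sy on the whole bracket.
bad = is_bracketing(q -> B(q, (0.0, 0.01)) - sy, -c2, c2)

println("bracket valid near the optimum: ", good)
println("bracket valid for the flat trial point: ", bad)
```

A check like this could be used inside the objective to return a large penalty value instead of throwing, though whether that suits a given solver is a separate design question.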
Here's what I would do:
using JuMP
import Ipopt
import Roots
function main()
k, d, c1, c2, c3, m = 0.5, 1, 0, 2, 1, 10
nparam = 2 * d + 2
m -= 1
vGrid = range(0, 1; length = m)
function obj(α::T...) where {T<:Real}
αb, αs = α[1:d+1], α[d+2:end]
B(v) = sum(αb[i] * v^(i-1) for i in 1:d+1)
B1(v) = sum(αb[i] * (i-1) * v^(i-2) for i in 2:d+1)
S(v) = sum(αs[i] * v^(i-1) for i in 1:d+1)
S1(v) = sum(αs[i] * (i-1) * v^(i-2) for i in 2:d+1)
function FOCb(y)
sy = S(y)
binv = Roots.fzero(q -> B(q) - sy, zero(T))
return k * y * S1(y) + sy - binv
end
function FOCs(x)
bx = B(x)
sinv = Roots.fzero(q -> S(q) - bx, zero(T))
return (1-k) * (1-x) * B1(x) - B(x) + sinv
end
return sum(FOCb(x)^2 + FOCs(x)^2 for x in vGrid)
end
αa = [1/12, 2/3, 1/4, 2/3]
model = Model(Ipopt.Optimizer)
@variable(model, -c3 <= α[i=1:nparam] <= c3, start = αa[i] + 0.1 * rand())
@constraints(model, begin
[j = 1:m], sum(α[i] * (i-1) * vGrid[j]^(i-2) for i in 2:d+1) >= c1
[j = 1:m], sum(α[d+1+i] * (i-1) * vGrid[j]^(i-2) for i in 2:d+1) >= c1
end)
register(model, :obj, nparam, obj; autodiff = true)
@NLobjective(model, Min, obj(α...))
optimize!(model)
print(solution_summary(model))
return value.(α)
end
main()
Related
I am solving a differential equation in Julia; the corresponding equations are given in the code below. Solving the differential equation gives me only the two variables. For my work, I also need the derivatives of the variables at each time step after the integration is done, and I do not know how to obtain them.
using DelimitedFiles
using LightGraphs
using LinearAlgebra
using Random
using PyPlot
using BenchmarkTools
using SparseArrays
const N= 100;
Adj=readdlm("sf_simplicial_100.txt")
G=Graph(Adj)
global A = adjacency_matrix(G)
global deg=degree(G)
global omega=deg
mean(deg)
A2=zeros(N,N,N);
for i in 1:N
for j in 1:N
for k in 1:N
if (A[i,j]==1 && A[j,k]==1 && A[k,i]==1)
A2[i,j,k]=1
A2[i,k,j]=1
A2[j,k,i]=1
A2[j,i,k]=1
A2[k,i,j]=1
A2[k,j,i]=1
end;
end;
end;
end;
K= sum(p -> A2[:,:,p], 1:N)
deg_sim= sum(j -> K[:,j], 1:N)/2;
deg_sim2=2*deg_sim;
function kuramoto(du,u, pp, t)
u1 = @view u[1:N] #### θ
u2 = @view u[N+1:2*N] ####### λ
du1 = @view du[1:N] #### dθ
du2 = @view du[N+1:2*N] ####### dλ
α1=0.08
β1=0.04
σ1=1.0
σ2=1.0
λ0=pp
####### local_order
z1 = Array{Complex{Float64},1}(undef, N)
mul!(z1, A, exp.((u1)im))
z1 = z1 ./ deg
####### generalized_local_order
z2 = Array{Complex{Float64},1}(undef, N)
z2= (diag(A*Diagonal(exp.((u1)im))*A*Diagonal(exp.((u1)im))*A))
z2 = z2 ./ deg_sim2
####### equ of motion
@. du1 = omega + u2 *( σ1 * deg * imag(z1 * exp((-1im) * u1)) + σ2 * deg_sim * imag(z2 * exp((-1im) * 2*u1)))
@. du2 = α1 *(λ0-u2)- β1 * (abs(z1)+ abs(z2))/2.0
return nothing
end;
using DifferentialEquations
# setting up time steps and integration intervals
dt = 0.01 # time step
dts = 0.1 # save time
ti = 0.0
tt = 1000.0
tf = 5000.0
nt = Int(div(tt,dts))
nf = Int(div(tf,dts))
tspan = (ti, tf); # time interval
pp=0.75
ini=readdlm("N=100/initial_condition.txt")
u0=[ini;pp*ones(N)];
du = similar(u0);
prob = ODEProblem(kuramoto,u0, tspan, pp)
sol = solve(prob, RK4(), reltol=1e-4, saveat=dts,maxiters=1e10,progress=true)
Use the derivative of the interpolation, sol(t, Val{1}). For this to work you should not use saveat, because saveat turns off the dense interpolation by default. Otherwise, you can use the SavingCallback.
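As a complement to the interpolation approach, and not something the answer itself proposes: because the ODE right-hand side is the derivative, you can also re-evaluate it at each saved state, which works whether or not saveat is set. The sketch below uses a toy decay! function in place of kuramoto and hand-made ts and us in place of sol.t and sol.u; all of these names and values are assumptions for illustration.

```julia
# Right-hand side in the same in-place form as the question: f!(du, u, p, t).
# Toy stand-in for `kuramoto`: exponential decay u' = -p * u.
function decay!(du, u, p, t)
    du .= -p .* u
    return nothing
end

# Hypothetical stand-ins for `sol.t` and `sol.u` (saved times and states).
p = 0.75
ts = 0.0:0.1:1.0
us = [[exp(-p * t), 2 * exp(-p * t)] for t in ts]

# Re-evaluate the right-hand side at every saved state to recover du/dt there.
function derivatives_at_saved_states(f!, us, p, ts)
    dus = [similar(u) for u in us]
    for (du, u, t) in zip(dus, us, ts)
        f!(du, u, p, t)
    end
    return dus
end

dus = derivatives_at_saved_states(decay!, us, p, ts)
println(dus[1])
```

With the objects from the question, the call would read derivatives_at_saved_states(kuramoto, sol.u, pp, sol.t).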
I have multiple objective functions for the same model in Julia JuMP, created by calling @objective in a for loop. What does it mean to have multiple objective functions in JuMP? Which objective is minimized, or are all the objectives minimized jointly? If so, how are they minimized jointly?
using JuMP
using MosekTools
K = 3
N = 2
penalties = [1.0, 3.9, 8.7]
function fac1(r::Number, i::Number, l::Number)
fac1 = 1.0
for m in 0:r-1
fac1 *= (i-m)*(l-m)
end
return fac1
end
function fac2(r::Number, i::Number, l::Number, tau::Float64)
return tau ^ (i + l - 2r + 1)/(i + l - 2r + 1)
end
function Q_r(i::Number, l::Number, r::Number, tau::Float64)
if i >= r && l >= r
return 2 * fac1(r, i, l) * fac2(r, i, l, tau)
else
return 0.0
end
end
function Q(i::Number, l::Number, tau::Number)
elem = 0
for r in 0:N
elem += penalties[r + 1] * Q_r(i, l, r, tau)
end
return elem
end
# discrete segment starting times
mat = Array{Float64, 3}(undef, K, N+1, N+1)
function Q_mat()
for k in 0:K-1
for i in 1:N+1
for j in 1:N+1
mat[k+1, i, j] = Q(i, j, convert(Float64, k))
end
end
return mat
end
end
function A_tau(r::Number, n::Number, tau::Float64)
fac = 1
for m in 1:r
fac *= (n - (m - 1))
end
if n >= r
return fac * tau ^ (n - r)
else
return 0.0
end
end
function A_tau_mat(tau::Float64)
mat = Array{Float64, 2}(undef, N+1, N+1)
for i in 1:N+1
for j in 1:N+1
mat[i, j] = A_tau(i, j, tau)
end
end
return mat
end
function A_0(r::Number, n::Number)
if r == n
fac = 1
for m in 1:r
fac *= r - (m - 1)
end
return fac
else
return 0.0
end
end
m = Model(optimizer_with_attributes(Mosek.Optimizer, "QUIET" => false, "INTPNT_CO_TOL_DFEAS" => 1e-7))
@variable(m, A[i=1:K+1,j=1:K,k=1:N+1,l=1:N+1])
@variable(m, p[i=1:K+1,j=1:N+1])
# constraint difference might be a small fractional difference.
# assuming that time difference is 1 second starting from 0.
for i in 1:K
@constraint(m, -A_tau_mat(convert(Float64, i-1)) * p[i] .+ A_tau_mat(convert(Float64, i-1)) * p[i+1] .== [0.0, 0.0, 0.0])
end
for i in 1:K+1
@constraint(m, A_tau_mat(convert(Float64, i-1)) * p[i] .== [1.0 12.0 13.0])
end
@constraint(m, A_tau_mat(convert(Float64, K+1)) * p[K+1] .== [0.0 0.0 0.0])
for i in 1:K+1
@objective(m, Min, p[i]' * Q_mat()[i] * p[i])
end
optimize!(m)
println("p value is ", value.(p))
println(A_tau_mat(0.0), A_tau_mat(1.0), A_tau_mat(2.0))
With standard JuMP you can have only one objective function at a time. Running another @objective macro just overwrites the previous objective.
Consider the following code:
julia> m = Model(GLPK.Optimizer);
julia> @variable(m,x >= 0)
x
julia> @objective(m, Max, 2x)
2 x
julia> @objective(m, Min, 2x)
2 x
julia> println(m)
Min 2 x
Subject to
x >= 0.0
As the printout shows, only the last objective (Min 2 x) is left.
That said, there is indeed an area of optimization called multi-criteria optimization, where the goal is to find the Pareto frontier.
There is a Julia package for multi-criteria problems named MultiJuMP. Here is some sample code:
using MultiJuMP, JuMP
using Clp
const mmodel = multi_model(Clp.Optimizer, linear = true)
const y = @variable(mmodel, 0 <= y <= 10.0)
const z = @variable(mmodel, 0 <= z <= 10.0)
@constraint(mmodel, y + z <= 15.0)
const exp_obj1 = @expression(mmodel, -y +0.05 * z)
const exp_obj2 = @expression(mmodel, 0.05 * y - z)
const obj1 = SingleObjective(exp_obj1)
const obj2 = SingleObjective(exp_obj2)
const multim = get_multidata(mmodel)
multim.objectives = [obj1, obj2]
optimize!(mmodel, method = WeightedSum())
This library also supports plotting of the Pareto frontier.
The disadvantage is that as of today it does not seem to be actively maintained (however it works with the current Julia and JuMP versions).
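To make the notion of a Pareto frontier concrete, here is a minimal plain-Julia sketch that filters a set of candidate objective values down to the non-dominated ones. It does not use MultiJuMP; the helper names dominates and pareto_front and the candidate values are hypothetical and only illustrate the definition for two minimization objectives.

```julia
# p dominates q (for minimization) if p is no worse in every objective
# and strictly better in at least one.
dominates(p, q) = all(p .<= q) && any(p .< q)

# Keep only the points that no other point dominates.
pareto_front(points) = [p for p in points if !any(dominates(q, p) for q in points)]

# Hypothetical objective values (obj1, obj2) of five candidate solutions.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]

front = pareto_front(candidates)
println(front)
```

Here (3.0, 4.0) is dropped because (2.0, 3.0) is better in both objectives, and (5.0, 2.0) is dropped because (4.0, 1.0) is. A weighted-sum method picks one point of such a frontier per choice of weights.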
I would like to minimize a distance function ||dz - z|| under the constraint that g(z) = 0.
I wanted to use Lagrange Multipliers to solve this problem. Then I used NLsolve.jl to solve the non-linear equation that I end up with.
using NLsolve
using ForwardDiff
function ProjLagrange(dz, g::Function)
λ_init = ones(size(g(dz...),1))
initial_x = vcat(dz, λ_init)
function gradL!(F, x)
len_dz = length(dz)
z = x[1:len_dz]
λ = x[len_dz+1:end]
F = Array{Float64}(undef, length(x))
my_distance(z) = norm(dz - z)
∇f = z -> ForwardDiff.gradient(my_distance, z)
F[1:len_dz] = ∇f(z) .- dot(λ, g(z...))
if length(λ) == 1
F[end] = g(z...)
else
F[len_dz+1:end] = g(z)
end
end
nlsolve(gradL!, initial_x)
end
g_test(x1, x2, x3) = x1^2 + x2 - x2 + 5
z = [1000,1,1]
ProjLagrange(z, g_test)
But I always end up with Zero: [NaN, NaN, NaN, NaN] and Convergence: false.
Just so you know I have already solved the equation by using Optim.jl and minimizing the following function: Proj(z) = b * sum(abs.(g(z))) + a * norm(dz - z).
But I would really like to know if this is possible with NLsolve. Any help is greatly appreciated!
Starting almost from scratch, and following Wikipedia's Lagrange multiplier page because I found it helpful, the code below seemed to work. I added a λ₀s argument to the ProjLagrange function so that it accepts a vector of initial values for the multipliers λ (I saw you initialized them at 1.0, but I thought this was more generic). Note that this has not been optimized for performance!
using NLsolve, ForwardDiff, LinearAlgebra
function ProjLagrange(x₀, λ₀s, gs, n_it)
# distance function from x₀ and its gradients
f(x) = norm(x - x₀)
∇f(x) = ForwardDiff.gradient(f, x)
# gradients of the constraints
∇gs = [x -> ForwardDiff.gradient(g, x) for g in gs]
# Form the auxiliary function and its gradients
ℒ(x,λs) = f(x) - sum(λ * g(x) for (λ,g) in zip(λs,gs))
∂ℒ∂x(x,λs) = ∇f(x) - sum(λ * ∇g(x) for (λ,∇g) in zip(λs,∇gs))
∂ℒ∂λ(x,λs) = [g(x) for g in gs]
# as a function of a single argument
nx = length(x₀)
ℒ(v) = ℒ(v[1:nx], v[nx+1:end])
∇ℒ(v) = vcat(∂ℒ∂x(v[1:nx], v[nx+1:end]), ∂ℒ∂λ(v[1:nx], v[nx+1:end]))
# and solve
v₀ = vcat(x₀, λ₀s)
nlsolve(∇ℒ, v₀, iterations=n_it)
end
# test
gs_test = [x -> x[1]^2 + x[2] - x[3] + 5]
λ₀s_test = [1.0]
x₀_test = [1000.0, 1.0, 1.0]
n_it = 100
res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
gives me
julia> res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.800027199717013, -49.52026655749088, 51.520266557490885, -0.050887973682118504]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
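As a sanity check on the reported zero, the two Lagrange conditions can be evaluated by hand: stationarity, ∇f(x) - λ ∇g(x) = 0, and feasibility, g(x) = 0. The sketch below uses only the standard library and a hand-written gradient of g instead of ForwardDiff; the variable names stationarity and feasibility are my own.

```julia
using LinearAlgebra

x₀ = [1000.0, 1.0, 1.0]

# Zero reported by nlsolve above: the first three entries are x, the last is λ.
v = [9.800027199717013, -49.52026655749088, 51.520266557490885, -0.050887973682118504]
x, λ = v[1:3], v[4]

g(x) = x[1]^2 + x[2] - x[3] + 5
∇g(x) = [2x[1], 1.0, -1.0]           # hand-written gradient of g
∇f(x) = (x - x₀) / norm(x - x₀)      # gradient of f(x) = norm(x - x₀)

# Both residuals should be close to zero at a stationary point of the Lagrangian.
stationarity = norm(∇f(x) - λ * ∇g(x), Inf)
feasibility = abs(g(x))

println("stationarity residual: ", stationarity)
println("constraint residual:   ", feasibility)
```

Both residuals being tiny is consistent with the solver's "|f(x)| < 1.0e-08: true" line. It confirms a stationary point of the Lagrangian, not by itself that the point is the global minimizer.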
I altered your code as below (see my comments in there) and got the following output. It doesn't throw NaNs anymore, reduces the objective and converges. Does this differ from your Optim.jl results?
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.80003, -49.5203, 51.5203, -0.050888]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
using NLsolve
using ForwardDiff
using LinearAlgebra: norm, dot
using Plots
function ProjLagrange(dz, g::Function, n_it)
λ_init = ones(size(g(dz),1))
initial_x = vcat(dz, λ_init)
# These definitions can go outside as well
len_dz = length(dz)
my_distance = z -> norm(dz - z)
∇f = z -> ForwardDiff.gradient(my_distance, z)
# In fact, this is probably the most vital difference w.r.t. your proposal.
# We need the gradient of the constraints.
∇g = z -> ForwardDiff.gradient(g, z)
function gradL!(F, x)
z = x[1:len_dz]
λ = x[len_dz+1:end]
# `F` is memory allocated by NLsolve to store the residual of the
# respective call of `gradL!` and hence doesn't need to be allocated
# anew every time (or at all).
F[1:len_dz] = ∇f(z) .- λ .* ∇g(z)
F[len_dz+1:end] .= g(z)
end
return nlsolve(gradL!, initial_x, iterations=n_it, store_trace=true)
end
# Presumably something was wrong here: x2 - x2 is unlikely to be intended. Also made it
# callable directly with an array argument
g_test = x -> x[1]^2 + x[2] - x[3] + 5
z = [1000,1,1]
n_it = 10000
res = ProjLagrange(z, g_test, n_it)
# Ugly reformatting here
trace = hcat([[state.iteration; state.fnorm; state.stepnorm] for state in res.trace.states]...)
plot(trace[1,:], trace[2,:], label="f(x) inf-norm", xlabel="steps")
[Figure: evolution of the inf-norm of f(x) over iteration steps]
[Edit: Adapted solution to incorporate correct gradient computation for g()]
I am learning Julia (1.3.1) and how to solve nonlinear equations in general, and want to ask how NLsolve should be used.
As a first step, I try the following:
using NLsolve
uni = 1
z = 3
x = [3,2,4,5,6]
y = z .+ x
print(y)
function g!(F, x)
F[1:uni+4] = y .- z - x
end
nlsolve(g!, [0.5,0.9,1,2,3])
I confirmed that it works, as shown below:
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.5, 0.9, 1.0, 2.0, 3.0]
* Zero: [2.999999999996542, 2.000000000003876, 4.000000000008193, 4.999999999990685, 5.999999999990221]
* Inf-norm of residuals: 0.000000
* Iterations: 2
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 3
Then I tried a more complicated model, as follows:
using SpecialFunctions, NLsolve, Random
Random.seed!(1234)
S = 2
# Setting parameters
ε = 1.3
β = 0.4
γ = gamma((ε-1)/ε)
T = rand(S)
E = rand(S)
B = rand(S)
w = rand(S)
Q = rand(S)
d = [1 2 ;2 1 ]
# Construct a model
rvector = T.*Q.^(1-β).*B.^ε
svector = E.* w.^ε
Φ_all = (sum(sum(rvector * svector' .* d )))
π = rvector * svector' .* d ./ Φ_all
# These two are the outcomes of the model
πR = (sum(π,dims=1))'
πM = sum(π,dims=2)
# E is now set as unknown and we want to estimate it given the outcome of the model
function f!(Res, Unknown)
rvector = T.*Q.^(1-β).*B.^ε
svector = Unknown[1:S].* w.^ε
Φ_all = (sum(sum(rvector * svector' .* d )))
π = rvector * svector' .* d ./ Φ_all
Res = ones(S+S)
Res[1:S] = πR - (sum(π,dims=1))'
Res[S+1:S+S] = πM - sum(π,dims=2)
end
nlsolve(f!, [0.5,0.6])
This code produces a strange result:
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.5, 0.6]
* Zero: [0.5, 0.6]
* Inf-norm of residuals: 0.000000
* Iterations: 0
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 1
* Jacobian Calls (df/dx): 1
So, essentially, the function always yields 0 as the return, and thus the initial input always becomes the solution. I cannot understand why this does not work and also why this behaves differently from the first example. Could I have your suggestion to fix it?
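You need to update Res in place. Currently you do
Res = ones(S+S)
which rebinds the local name Res to a new array and shadows the input, so the array NLsolve passed in is never written to. Remove that line and assign into Res directly.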
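The difference between rebinding and updating in place can be seen in a small standalone sketch. The function names rebind! and update! are hypothetical and NLsolve is not involved; the point is only what happens to the caller's array.

```julia
# Rebinding: `res = ...` makes the local name point at a brand-new array,
# so the array passed in by the caller is never modified.
function rebind!(res, x)
    res = ones(length(x))   # new local array, shadows the argument
    res[1:end] = 2 .* x     # writes go into the local array only
    return nothing
end

# In-place update: writes go into the array the caller passed in.
function update!(res, x)
    res[1:end] = 2 .* x     # equivalently: res .= 2 .* x
    return nothing
end

x = [1.0, 2.0]
a = zeros(2)
rebind!(a, x)
b = zeros(2)
update!(b, x)

println("after rebind!: ", a)
println("after update!: ", b)
```

In f! from the question, the residual buffer plays the role of a here: it keeps whatever values it had before the call, which matches the solver reporting convergence at the starting point.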
I've tried to reproduce the model from a PyMC3 and Stan comparison. But it seems to run slowly and when I look at @code_warntype there are some things -- K and N I think -- which the compiler seemingly calls Any.
I've tried adding types -- though I can't add types to turing_model's arguments and things are complicated within turing_model because it's using autodiff variables and not the usuals. I put all the code into the function do_it to avoid globals, because they say that globals can slow things down. (It actually seems slower, though.)
Any suggestions as to what's causing the problem? The turing_model code is what's iterating, so that should make the most difference.
using Turing, StatsPlots, Random
sigmoid(x) = 1.0 / (1.0 + exp(-x))
function scale(w0::Float64, w1::Array{Float64,1})
scale = √(w0^2 + sum(w1 .^ 2))
return w0 / scale, w1 ./ scale
end
function do_it(iterations::Int64)::Chains
K = 10 # predictor dimension
N = 1000 # number of data samples
X = rand(N, K) # predictors (1000, 10)
w1 = rand(K) # weights (10,)
w0 = -median(X * w1) # 50% of elements for each class (number)
w0, w1 = scale(w0, w1) # unit length (euclidean)
w_true = [w0, w1...]
y = (w0 .+ (X * w1)) .> 0.0 # labels
y = [Float64(x) for x in y]
σ = 5.0
σm = [x == y ? σ : 0.0 for x in 1:K, y in 1:K]
@model turing_model(X, y, σ, σm) = begin
w0_pred ~ Normal(0.0, σ)
w1_pred ~ MvNormal(σm)
p = sigmoid.(w0_pred .+ (X * w1_pred))
@inbounds for n in 1:length(y)
y[n] ~ Bernoulli(p[n])
end
end
@time chain = sample(turing_model(X, y, σ, σm), NUTS(iterations, 200, 0.65));
# ϵ = 0.5
# τ = 10
# @time chain = sample(turing_model(X, y, σ), HMC(iterations, ϵ, τ));
return (w_true=w_true, chains=chain::Chains)
end
chain = do_it(1000)