Julia minimize simple scalar function

How do I minimize a simple scalar function in Julia using Newton's method? (or any other suitable optimization scheme)
using Optim
# Function to optimize
function g(x)
return x^2
end
x0 = 2.0 # Initial value
optimize(g, x0, Newton())
The above doesn't seem to work and returns
ERROR: MethodError: no method matching optimize(::typeof(g), ::Float64, ::Newton{LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}}})

Optim's Newton() method, like its other multivariate solvers, expects a vector input, not a scalar as in your example. You can recast the example as a vector problem with one variable, though:
julia> using Optim
julia> function g(x) # <- g accepts x as a vector
return x[1]^2
end
julia> x0 = [2.0] # <- Make this a vector
1-element Vector{Float64}:
2.0
julia> optimize(g, x0, Newton())
* Status: success
* Candidate solution
Final objective value: 0.000000e+00
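If you really want to stay with a scalar, Newton's method for a one-dimensional minimum is short enough to write by hand. The following is only a sketch: newton_min is a hypothetical helper (not part of Optim), and it assumes you can supply the first and second derivatives yourself.

```julia
# Hypothetical helper, not an Optim function.
# Newton iteration for a 1-D minimum: x ← x - g'(x) / g''(x)
function newton_min(dg, d2g, x0; tol = 1e-10, maxiter = 50)
    x = x0
    for _ in 1:maxiter
        step = dg(x) / d2g(x)
        x -= step
        abs(step) < tol && break   # stop once the update is negligible
    end
    return x
end

dg(x)  = 2x      # first derivative of g(x) = x^2
d2g(x) = 2.0     # second derivative of g(x) = x^2

xmin = newton_min(dg, d2g, 2.0)
```

For a quadratic such as x^2 the first Newton step already lands on the minimizer; for other functions the iteration only converges when started close enough to a minimum with positive second derivative.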

For a univariate function, optimize expects a search interval, not a starting point:
optimize(g, -10, 10)
returns
Results of Optimization Algorithm
* Algorithm: Brent's Method
* Search Interval: [-10.000000, 10.000000]
* Minimizer: 0.000000e+00
* Minimum: 0.000000e+00
* Iterations: 5
* Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
* Objective Function Calls: 6
Concerning the available methods, I have not read the docs, but you can look directly at the source code using the @edit macro:
@edit optimize(g, -10, 10)
Browsing the source you will see:
function optimize(f,
lower::Union{Integer, Real},
upper::Union{Integer, Real},
method::Union{Brent, GoldenSection};
kwargs...)
T = promote_type(typeof(lower/1), typeof(upper/1))
optimize(f,
T(lower),
T(upper),
method;
kwargs...)
end
Hence I think you have only two methods for one-dimensional minimization: Brent and GoldenSection.
For example, you can try:
julia> optimize(g, -10, 10, GoldenSection())
Results of Optimization Algorithm
* Algorithm: Golden Section Search
* Search Interval: [-10.000000, 10.000000]
* Minimizer: 1.110871e-16
* Minimum: 1.234035e-32
* Iterations: 79
* Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
* Objective Function Calls: 80
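To give an idea of what a golden section search does, here is a bare-bones sketch in plain Julia. It is not Optim's implementation: golden_section_sketch is a hypothetical name, the function is assumed to be unimodal on the interval, and for simplicity both interior points are re-evaluated on every pass.

```julia
# Hypothetical helper illustrating the idea; Optim's GoldenSection() differs in details.
function golden_section_sketch(f, a, b; tol = 1e-8)
    invphi = (sqrt(5) - 1) / 2            # 1/φ ≈ 0.618
    c = b - invphi * (b - a)              # left interior point
    d = a + invphi * (b - a)              # right interior point
    while abs(b - a) > tol
        if f(c) < f(d)
            b = d                         # minimum lies in [a, d]
        else
            a = c                         # minimum lies in [c, b]
        end
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
    end
    return (a + b) / 2
end

xmin = golden_section_sketch(x -> x^2, -10.0, 10.0)
```

Each pass shrinks the bracket by the same factor of about 0.618, which is why this method needs many more iterations than Brent's method in the outputs above.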

Related

Lagrange Multiplier Method using NLsolve.jl

I would like to minimize a distance function ||dz - z|| under the constraint that g(z) = 0.
I wanted to use Lagrange Multipliers to solve this problem. Then I used NLsolve.jl to solve the non-linear equation that I end up with.
using NLsolve
using ForwardDiff
function ProjLagrange(dz, g::Function)
λ_init = ones(size(g(dz...),1))
initial_x = vcat(dz, λ_init)
function gradL!(F, x)
len_dz = length(dz)
z = x[1:len_dz]
λ = x[len_dz+1:end]
F = Array{Float64}(undef, length(x))
my_distance(z) = norm(dz - z)
∇f = z -> ForwardDiff.gradient(my_distance, z)
F[1:len_dz] = ∇f(z) .- dot(λ, g(z...))
if length(λ) == 1
F[end] = g(z...)
else
F[len_dz+1:end] = g(z)
end
end
nlsolve(gradL!, initial_x)
end
g_test(x1, x2, x3) = x1^2 + x2 - x2 + 5
z = [1000,1,1]
ProjLagrange(z, g_test)
But I always end up with Zero: [NaN, NaN, NaN, NaN] and Convergence: false.
Just so you know I have already solved the equation by using Optim.jl and minimizing the following function: Proj(z) = b * sum(abs.(g(z))) + a * norm(dz - z).
But I would really like to know if this is possible with NLsolve. Any help is greatly appreciated!
I started almost from scratch, working from Wikipedia's Lagrange multiplier page because I found it helpful, and the code below seemed to work. I added a λ₀s argument to the ProjLagrange function so that it can accept a vector of initial multiplier values (I saw you initialized them at 1.0, but I thought this was more generic). Note that this has not been optimized for performance!
using NLsolve, ForwardDiff, LinearAlgebra
function ProjLagrange(x₀, λ₀s, gs, n_it)
# distance function from x₀ and its gradients
f(x) = norm(x - x₀)
∇f(x) = ForwardDiff.gradient(f, x)
# gradients of the constraints
∇gs = [x -> ForwardDiff.gradient(g, x) for g in gs]
# Form the auxiliary function and its gradients
ℒ(x,λs) = f(x) - sum(λ * g(x) for (λ,g) in zip(λs,gs))
∂ℒ∂x(x,λs) = ∇f(x) - sum(λ * ∇g(x) for (λ,∇g) in zip(λs,∇gs))
∂ℒ∂λ(x,λs) = [g(x) for g in gs]
# as a function of a single argument
nx = length(x₀)
ℒ(v) = ℒ(v[1:nx], v[nx+1:end])
∇ℒ(v) = vcat(∂ℒ∂x(v[1:nx], v[nx+1:end]), ∂ℒ∂λ(v[1:nx], v[nx+1:end]))
# and solve
v₀ = vcat(x₀, λ₀s)
nlsolve(∇ℒ, v₀, iterations=n_it)
end
# test
gs_test = [x -> x[1]^2 + x[2] - x[3] + 5]
λ₀s_test = [1.0]
x₀_test = [1000.0, 1.0, 1.0]
n_it = 100
res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
gives me
julia> res = ProjLagrange(x₀_test, λ₀s_test, gs_test, n_it)
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.800027199717013, -49.52026655749088, 51.520266557490885, -0.050887973682118504]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
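One way to sanity-check a result like this is to plug the reported zero back into the two conditions the solver is supposed to satisfy: the constraint g(x) = 0 and the stationarity condition ∇f(x) - λ ∇g(x) = 0. The sketch below copies the numbers from the output above and writes the gradient of the test constraint out by hand, so it needs nothing beyond LinearAlgebra; the variable names are my own.

```julia
using LinearAlgebra: norm

# Values copied from the solver output above
x  = [9.800027199717013, -49.52026655749088, 51.520266557490885]
λ  = -0.050887973682118504
x₀ = [1000.0, 1.0, 1.0]

g(x)  = x[1]^2 + x[2] - x[3] + 5          # the constraint from gs_test
∇g(x) = [2 * x[1], 1.0, -1.0]             # its gradient, written out by hand
∇f(x) = (x - x₀) / norm(x - x₀)           # gradient of the distance norm(x - x₀)

constraint_residual   = abs(g(x))
stationarity_residual = norm(∇f(x) - λ * ∇g(x), Inf)
```

Both residuals should come out tiny, which confirms that the point lies on the constraint surface and that the gradients of the distance and the constraint are parallel there.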
I altered your code as below (see my comments in there) and got the following output. It doesn't throw NaNs anymore, reduces the objective and converges. Does this differ from your Optim.jl results?
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [1000.0, 1.0, 1.0, 1.0]
* Zero: [9.80003, -49.5203, 51.5203, -0.050888]
* Inf-norm of residuals: 0.000000
* Iterations: 10
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 11
* Jacobian Calls (df/dx): 11
using NLsolve
using ForwardDiff
using LinearAlgebra: norm, dot
using Plots
function ProjLagrange(dz, g::Function, n_it)
λ_init = ones(size(g(dz),1))
initial_x = vcat(dz, λ_init)
# These definitions can go outside as well
len_dz = length(dz)
my_distance = z -> norm(dz - z)
∇f = z -> ForwardDiff.gradient(my_distance, z)
# In fact, this is probably the most vital difference w.r.t. your proposal.
# We need the gradient of the constraints.
∇g = z -> ForwardDiff.gradient(g, z)
function gradL!(F, x)
z = x[1:len_dz]
λ = x[len_dz+1:end]
# `F` is memory allocated by NLsolve to store the residual of the
# respective call of `gradL!` and hence doesn't need to be allocated
# anew every time (or at all).
F[1:len_dz] = ∇f(z) .- λ .* ∇g(z)
F[len_dz+1:end] .= g(z)
end
return nlsolve(gradL!, initial_x, iterations=n_it, store_trace=true)
end
# Presumably something is wrong here: x2 - x2 is unlikely to be intended. I also
# made it callable directly with an array argument
g_test = x -> x[1]^2 + x[2] - x[3] + 5
z = [1000,1,1]
n_it = 10000
res = ProjLagrange(z, g_test, n_it)
# Ugly reformatting here
trace = hcat([[state.iteration; state.fnorm; state.stepnorm] for state in res.trace.states]...)
plot(trace[1,:], trace[2,:], label="f(x) inf-norm", xlabel="steps")
[Figure: evolution of the inf-norm of f(x) over iteration steps]
[Edit: Adapted solution to incorporate correct gradient computation for g()]

Using Lsq-Fit in Julia

I am trying to practice fitting with the curve_fit function from LsqFit in Julia.
The model is the derivative of a Cauchy distribution with parameters γ and x_0.
Following the manual, I tried
f(x, x_0, γ) = -2*(x - x_0)*(π * γ^3 * (1 + ((x - x_0)/γ)^2)^2)^(-1)
x_0 = 3350
γ = 50
xarr = range(3000, length = 5000, stop = 4000)
yarr = [f(x, x_0, γ) for x in xarr]
using LsqFit
# p ≡ [x_0, γ]
model(x, p) = -2*(x - p[1])*(π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)^(-1)
p0 = [3349, 49]
curve_fit(model, xarr, yarr, p0)
param = fit.param
... and it does not work, giving a MethodError: no method matching -(::StepRangeLen[...], leaving me confused.
Can somebody please tell me what I am doing wrong?
There are a few issues with what you've written:
The model function is meant to be called with its first argument (x) being the full vector of independent variables, not just one value. This is where the error you mention comes from:
julia> model(x, p) = -2*(x - p[1])*(π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)^(-1);
julia> p0 = [3349, 49];
julia> model(xarr, p0);
ERROR: MethodError: no method matching -(::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, ::Float64)
One way to fix this is to use the dot notation to broadcast all operators so that they work elementwise:
julia> model(x, p) = -2*(x .- p[1]) ./ (π * (p[2])^3 * (1 .+ ((x .- p[1])/p[2]).^2).^2);
julia> model(xarr, p0); # => No error
but if this is too tedious you can let the @. macro do the work for you:
# just put @. in front of the expression to transform every
# occurrence of a-b into a.-b (and likewise for all operators)
# which means to compute the operation elementwise
julia> model(x, p) = @. -2*(x - p[1])*(π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)^(-1);
julia> model(xarr, p0); # => No error
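To convince yourself that the two spellings are equivalent, you can define both and compare them on a small range. This is just a sketch: model_dots, model_macro and the five-point range are names and values I made up for the comparison.

```julia
p  = [3349.0, 49.0]
xs = range(3000, stop = 4000, length = 5)

# Every operator dotted by hand
model_dots(x, p) = -2 .* (x .- p[1]) ./ (π .* p[2]^3 .* (1 .+ ((x .- p[1]) ./ p[2]).^2).^2)

# Same expression, with @. inserting the dots
model_macro(x, p) = @. -2 * (x - p[1]) / (π * p[2]^3 * (1 + ((x - p[1]) / p[2])^2)^2)

same_result = model_dots(xs, p) ≈ model_macro(xs, p)
```

Both versions return one value per element of xs, which is what curve_fit expects from a model function.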
Another issue is that the parameters you're looking for are meant to be floating-point values. But your initial guess p0 is initialized with integers, which confuses curve_fit. There are two ways of fixing this. Either put floating-point values in p0:
julia> p0 = [3349.0, 49.0]
2-element Array{Float64,1}:
3349.0
49.0
or use a typed array initializer to specify explicitly the element type:
julia> p0 = Float64[3349, 49]
2-element Array{Float64,1}:
3349.0
49.0
This is not really an error, but I would find it more intuitive to compute a/b instead of a*b^(-1). Also, yarr can be computed with a simple broadcast using dot notation instead of a comprehension.
Wrapping all this together:
f(x, x_0, γ) = -2*(x - x_0)*(π * γ^3 * (1 + ((x - x_0)/γ)^2)^2)^(-1)
(x_0, γ) = (3350, 50)
xarr = range(3000, length = 5000, stop = 4000);
# use dot-notation to "broadcast" f and map it
# elementwise to elements of xarr
yarr = f.(xarr, x_0, γ);
using LsqFit
model(x, p) = @. -2*(x - p[1]) / (π * (p[2])^3 * (1 + ((x - p[1])/p[2])^2)^2)
p0 = Float64[3300, 10]
fit = curve_fit(model, xarr, yarr, p0)
yields:
julia> fit.param
2-element Array{Float64,1}:
3349.999986535933
49.99999203625603

non-linear solver always produces zero residual

I am learning how to solve non-linear equations and Julia (1.3.1) in general and want to ask how we should use NLsolve.
As a first step, I try the following:
using NLsolve
uni = 1
z = 3
x = [3,2,4,5,6]
y = z .+ x
print(y)
function g!(F, x)
F[1:uni+4] = y .- z - x
end
nlsolve(g!, [0.5,0.9,1,2,3])
And, I confirm it works as below:
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.5, 0.9, 1.0, 2.0, 3.0]
* Zero: [2.999999999996542, 2.000000000003876, 4.000000000008193, 4.999999999990685, 5.999999999990221]
* Inf-norm of residuals: 0.000000
* Iterations: 2
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 3
Then, I try a more complicated model as follows:
using SpecialFunctions, NLsolve, Random
Random.seed!(1234)
S = 2
# Setting parameters
ε = 1.3
β = 0.4
γ = gamma((ε-1)/ε)
T = rand(S)
E = rand(S)
B = rand(S)
w = rand(S)
Q = rand(S)
d = [1 2 ;2 1 ]
# Construct a model
rvector = T.*Q.^(1-β).*B.^ε
svector = E.* w.^ε
Φ_all = (sum(sum(rvector * svector' .* d )))
π = rvector * svector' .* d ./ Φ_all
# These two are the outcomes of the model
πR = (sum(π,dims=1))'
πM = sum(π,dims=2)
# E is now set as unknown and we want to estimate it given the outcome of the model
function f!(Res, Unknown)
rvector = T.*Q.^(1-β).*B.^ε
svector = Unknown[1:S].* w.^ε
Φ_all = (sum(sum(rvector * svector' .* d )))
π = rvector * svector' .* d ./ Φ_all
Res = ones(S+S)
Res[1:S] = πR - (sum(π,dims=1))'
Res[S+1:S+S] = πM - sum(π,dims=2)
end
nlsolve(f!, [0.5,0.6])
and this code produces a weird result as follows.
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.5, 0.6]
* Zero: [0.5, 0.6]
* Inf-norm of residuals: 0.000000
* Iterations: 0
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 1
* Jacobian Calls (df/dx): 1
So, essentially, the function always yields 0 as the return, and thus the initial input always becomes the solution. I cannot understand why this does not work and also why this behaves differently from the first example. Could I have your suggestion to fix it?
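You need to update Res in place. Currently you do
Res = ones(S+S)
which rebinds the local name Res to a new array and shadows the buffer that was passed in, so the solver never sees your residuals. Just write into Res directly.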
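The difference between rebinding and writing in place can be seen without NLsolve at all. In this sketch, shadowing! and inplace! are hypothetical stand-ins for f!, and buf1 and buf2 play the role of the residual buffer that the solver hands to your function.

```julia
# Rebinding: `Res` now names a brand-new local array; the caller's buffer is untouched.
function shadowing!(Res, x)
    Res = ones(length(x))
    Res[1:end] = x .- 1
    return nothing
end

# In place: the values are written into the array the caller passed in.
function inplace!(Res, x)
    Res[1:end] = x .- 1        # `Res .= x .- 1` would work as well
    return nothing
end

buf1 = zeros(2); shadowing!(buf1, [0.5, 0.6])   # buf1 still holds its old contents
buf2 = zeros(2); inplace!(buf2, [0.5, 0.6])     # buf2 now holds the residuals
```

In the question's output, the solver apparently found nothing but zeros in its buffer and therefore reported convergence at the starting point. Also note that, as far as I know, nlsolve allocates a residual buffer of the same length as the vector of unknowns, so a function that fills S+S residuals for S unknowns will probably need to be reduced to S equations once the shadowing is removed.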

Find fixed point of multivariable function in Julia

I need to find the fixed point of a multivariable function in Julia.
Consider the following minimal example:
function example(p::Array{Float64,1})
q = -p
return q
end
Ideally I'd use a package like Roots.jl and call find_zeros(p -> p - example(p)), but I can't find the analogous package for multivariable functions. I found one called IntervalRootFinding, but it oddly requires unicode characters and is sparsely documented, so I can't figure out how to use it.
There are many options. The best choice depends on the nature of your example function: you have to understand its properties and check the documentation of a given package to see whether it supports them.
E.g., you can use fixedpoint from NLsolve.jl:
julia> using NLsolve
julia> function example!(q, p::Array{Float64,1})
q .= -p
end
example! (generic function with 1 method)
julia> fixedpoint(example!, ones(1))
Results of Nonlinear Solver Algorithm
* Algorithm: Anderson m=1 beta=1 aa_start=1 droptol=0
* Starting Point: [1.0]
* Zero: [0.0]
* Inf-norm of residuals: 0.000000
* Iterations: 3
* Convergence: true
* |x - x'| < 0.0e+00: true
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 0
julia> fixedpoint(example!, ones(3))
Results of Nonlinear Solver Algorithm
* Algorithm: Anderson m=3 beta=1 aa_start=1 droptol=0
* Starting Point: [1.0, 1.0, 1.0]
* Zero: [-2.220446049250313e-16, -2.220446049250313e-16, -2.220446049250313e-16]
* Inf-norm of residuals: 0.000000
* Iterations: 3
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 0
julia> fixedpoint(example!, ones(5))
Results of Nonlinear Solver Algorithm
* Algorithm: Anderson m=5 beta=1 aa_start=1 droptol=0
* Starting Point: [1.0, 1.0, 1.0, 1.0, 1.0]
* Zero: [0.0, 0.0, 0.0, 0.0, 0.0]
* Inf-norm of residuals: 0.000000
* Iterations: 3
* Convergence: true
* |x - x'| < 0.0e+00: true
* |f(x)| < 1.0e-08: true
* Function Calls (f): 3
* Jacobian Calls (df/dx): 0
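Note that the naive iteration p ← example(p) would never settle for this particular function, since it just flips the sign of p on every step. A damped iteration is one simple remedy. The sketch below is not what NLsolve does internally (it uses Anderson acceleration, according to the output above); damped_fixedpoint and the damping factor α = 0.5 are my own assumptions.

```julia
using LinearAlgebra: norm

# Hypothetical helper: damped fixed-point iteration p ← (1 - α) p + α f(p)
function damped_fixedpoint(f, p0; α = 0.5, tol = 1e-12, maxiter = 1000)
    p = copy(p0)
    for _ in 1:maxiter
        q = (1 - α) .* p .+ α .* f(p)
        norm(q - p) < tol && return q     # stop when the iterate no longer moves
        p = q
    end
    return p                              # last iterate if maxiter was reached
end

example(p) = -p
p_star = damped_fixedpoint(example, ones(3))
```

With α = 0.5 the first step already averages p and -p to zero for this example; for other functions a suitable α has to be found by trial, and convergence is not guaranteed.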
If your function requires a global optimization tool to find a fixed point, you can, e.g., use BlackBoxOptim.jl with norm(f(x) .- x) as the objective:
julia> using LinearAlgebra
julia> using BlackBoxOptim
julia> function example(p::Array{Float64,1})
q = -p
return q
end
example (generic function with 1 method)
julia> f(x) = norm(example(x) .- x)
f (generic function with 1 method)
julia> bboptimize(f; SearchRange = (-5.0, 5.0), NumDimensions = 1)
Starting optimization with optimizer DiffEvoOpt{FitPopulation{Float64},RadiusLimitedSelector,BlackBoxOptim.AdaptiveDiffEvoRandBin{3},RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
Optimization stopped after 10001 steps and 0.15 seconds
Termination reason: Max number of steps (10000) reached
Steps per second = 68972.31
Function evals per second = 69717.14
Improvements/step = 0.35090
Total function evaluations = 10109
Best candidate found: [-8.76093e-40]
Fitness: 0.000000000
julia> bboptimize(f; SearchRange = (-5.0, 5.0), NumDimensions = 3);
Starting optimization with optimizer DiffEvoOpt{FitPopulation{Float64},RadiusLimitedSelector,BlackBoxOptim.AdaptiveDiffEvoRandBin{3},RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
Optimization stopped after 10001 steps and 0.02 seconds
Termination reason: Max number of steps (10000) reached
Steps per second = 625061.23
Function evals per second = 631498.72
Improvements/step = 0.32330
Total function evaluations = 10104
Best candidate found: [-3.00106e-12, -5.33545e-12, 5.39072e-13]
Fitness: 0.000000000
julia> bboptimize(f; SearchRange = (-5.0, 5.0), NumDimensions = 5);
Starting optimization with optimizer DiffEvoOpt{FitPopulation{Float64},RadiusLimitedSelector,BlackBoxOptim.AdaptiveDiffEvoRandBin{3},RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
Optimization stopped after 10001 steps and 0.02 seconds
Termination reason: Max number of steps (10000) reached
Steps per second = 526366.94
Function evals per second = 530945.88
Improvements/step = 0.29900
Total function evaluations = 10088
Best candidate found: [-9.23635e-8, -2.6889e-8, -2.93044e-8, -1.62639e-7, 3.99672e-8]
Fitness: 0.000000391
I'm an author of IntervalRootFinding.jl. I'm happy to say that the documentation has been improved a lot recently, and no unicode characters are necessary. I suggest using the master branch.
Here's how to solve your example with the package. Note that this package should be able to find all of the roots within a box, and guarantee that it has found all of them. Yours has only one:
julia> using IntervalArithmetic, IntervalRootFinding
julia> function example(p)
q = -p
return q
end
example (generic function with 2 methods)
julia> X = IntervalBox(-2..2, 2)
[-2, 2] × [-2, 2]
julia> roots(x -> example(x) - x, X, Krawczyk)
1-element Array{Root{IntervalBox{2,Float64}},1}:
Root([0, 0] × [0, 0], :unique)
For more information, I suggest looking at https://discourse.julialang.org/t/ann-intervalrootfinding-jl-for-finding-all-roots-of-a-multivariate-function/9515.

Maximum Likelihood in Julia

I am trying to use maximum likelihood to estimate the normal linear model in Julia. I have used the following code to simulate the process with just an intercept and an anonymous function per the Optim documentation regarding values that do not change:
using Optim
nobs = 500
nvar = 1
β = ones(nvar)*3.0
x = [ones(nobs) randn(nobs,nvar-1)]
ε = randn(nobs)*0.5
y = x*β + ε
function LL_anon(X, Y, β, σ)
-(-length(X)*log(2π)/2 - length(X)*log(σ) - (sum((Y - X*β).^2) / (2σ^2)))
end
LL_anon(X,Y, pars) = LL_anon(X,Y, pars...)
res2 = optimize(vars -> LL_anon(x,y, vars...), [1.0,1.0]) # Start values: β=1.0, σ=1.0
This actually recovered the parameters and I received the following output:
* Algorithm: Nelder-Mead
* Starting Point: [1.0,1.0]
* Minimizer: [2.980587812647935,0.5108406803949835]
* Minimum: 3.736217e+02
* Iterations: 47
* Convergence: true
* √(Σ(yᵢ-ȳ)²)/n < 1.0e-08: true
* Reached Maximum Number of Iterations: false
* Objective Calls: 92
However, when I try and set nvar = 2, i.e. an intercept plus an additional covariate, I get the following error message:
MethodError: no method matching LL_anon(::Array{Float64,2}, ::Array{Float64,1}, ::Float64, ::Float64, ::Float64)
Closest candidates are:
LL_anon(::Any, ::Any, ::Any, ::Any) at In[297]:2
LL_anon(::Array{Float64,1}, ::Array{Float64,1}, ::Any, ::Any) at In[113]:2
LL_anon(::Any, ::Any, ::Any) at In[297]:4
...
Stacktrace:
[1] (::##245#246)(::Array{Float64,1}) at .\In[299]:1
[2] value!!(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\NLSolversBase\src\interface.jl:9
[3] initial_state(::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}, ::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/solvers/zeroth_order\nelder_mead.jl:136
[4] optimize(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\optimize.jl:25
[5] #optimize#151(::Array{Any,1}, ::Function, ::Tuple{##245#246}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:62
[6] #optimize#148(::Array{Any,1}, ::Function, ::Function, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:52
[7] optimize(::Function, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:52
I'm not sure why adding an additional variable is causing this issue but it seems like a type instability problem.
The second issue is that when I use my original working example and set the starting values to [2.0,2.0], I get the following error message:
log will only return a complex result if called with a complex argument. Try log(complex(x)).
Stacktrace:
[1] nan_dom_err at .\math.jl:300 [inlined]
[2] log at .\math.jl:419 [inlined]
[3] LL_anon(::Array{Float64,2}, ::Array{Float64,1}, ::Float64, ::Float64) at .\In[302]:2
[4] (::##251#252)(::Array{Float64,1}) at .\In[304]:1
[5] value(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\NLSolversBase\src\interface.jl:19
[6] update_state!(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Optim.NelderMeadState{Array{Float64,1},Float64,Array{Float64,1}}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/solvers/zeroth_order\nelder_mead.jl:193
[7] optimize(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}, ::Optim.NelderMeadState{Array{Float64,1},Float64,Array{Float64,1}}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\optimize.jl:51
[8] optimize(::NLSolversBase.NonDifferentiable{Float64,Array{Float64,1},Val{false}}, ::Array{Float64,1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Float64,Void}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\optimize.jl:25
[9] #optimize#151(::Array{Any,1}, ::Function, ::Tuple{##251#252}, ::Array{Float64,1}) at C:\Users\dolacomb\.julia\v0.6\Optim\src\multivariate/optimize\interface.jl:62
Again, I'm not sure why this is happening. Since start values are very important, I'd like to know how to overcome this issue, especially as these values are not too far off from the true ones.
Any help would be greatly appreciated!
Splatting causes the problem. E.g. it transforms [1, 2, 3] into three parameters while your function accepts only two.
Use the following call:
res2 = optimize(vars -> LL_anon(x,y, vars[1:end-1], vars[end]), [1.0,1.0,1.0])
and you can remove the following line from your code
LL_anon(X,Y, pars) = LL_anon(X,Y, pars...)
or replace it with:
LL_anon(X,Y, pars) = LL_anon(X,Y, pars[1:end-1], pars[end])
but it would not be called by the optimization routine unless you change the call to:
res2 = optimize(vars -> LL_anon(x,y, vars), [1.0,1.0,1.0])
Finally, to get good performance from this code, I would recommend wrapping it all in a function.
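The effect of splatting is easy to reproduce in isolation. In this sketch, takes_two is a made-up stand-in for LL_anon with the data arguments left out, since only the argument handling matters here.

```julia
# Stand-in for LL_anon(X, Y, β, σ); it just returns its two arguments
takes_two(β, σ) = (β, σ)

vars = [1.0, 2.0, 0.5]            # two coefficients plus σ

# takes_two(vars...) expands to takes_two(1.0, 2.0, 0.5):
# three arguments, but only a two-argument method exists
splat_fails = try
    takes_two(vars...)
    false
catch err
    err isa MethodError
end

# Slicing keeps the coefficients together as one vector argument
β, σ = takes_two(vars[1:end-1], vars[end])
```

With only one coefficient, the splat happens to produce exactly two arguments, which is why the original nvar = 1 example worked.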
EDIT: now I see the second question. The reason is that σ can become negative during the optimization, and then log(σ) fails. The simplest thing to do in this case is to take log(abs(σ)), like this:
function LL_anon(X, Y, β, σ)
-(-length(X)*log(2π)/2 - length(X)*log(abs(σ)) - (sum((Y - X*β).^2) / (2σ^2)))
end
Of course, you then have to take the absolute value of σ from the solution, since the optimization routine might return a negative value.
A cleaner way is to optimize over log(σ) instead of σ, like this:
function LL_anon(X, Y, β, logσ)
-(-length(X)*log(2π)/2 - length(X)*logσ - (sum((Y - X*β).^2) / (2(exp(logσ))^2)))
end
but then you have to use exp(logσ) to recover σ after optimization finishes.
I have asked around regarding this and have another option. The main reason for looking at this problem is twofold. One, to learn how to use the optimization routines in Julia in a canonical situation and two, to expand this to spatial econometric models. With that in mind, I'm posting the other suggested code from the Julia message board so that others may see another solution.
using Optim
nobs = 500
nvar = 2
β = ones(nvar) * 3.0
x = [ones(nobs) randn(nobs, nvar - 1)]
ε = randn(nobs) * 0.5
y = x * β + ε
function LL_anon(X, Y, β, log_σ)
σ = exp(log_σ)
-(-length(X) * log(2π)/2 - length(X) * log(σ) - (sum((Y - X * β).^2) / (2σ^2)))
end
opt = optimize(vars -> LL_anon(x,y, vars[1:nvar], vars[nvar + 1]),
ones(nvar+1))
# Use forward autodiff to get first derivative, then optimize
fun1 = OnceDifferentiable(vars -> LL_anon(x, y, vars[1:nvar], vars[nvar + 1]),
ones(nvar+1))
opt1 = optimize(fun1, ones(nvar+1))
Results of Optimization Algorithm
Algorithm: L-BFGS
Starting Point: [1.0,1.0,1.0]
Minimizer: [2.994204150985705,2.9900626550063305, …]
Minimum: 3.538340e+02
Iterations: 12
Convergence: true
|x - x’| ≤ 1.0e-32: false
|x - x’| = 8.92e-12
|f(x) - f(x’)| ≤ 1.0e-32 |f(x)|: false
|f(x) - f(x’)| = 9.64e-16 |f(x)|
|g(x)| ≤ 1.0e-08: true
|g(x)| = 6.27e-09
Stopped by an increasing objective: true
Reached Maximum Number of Iterations: false
Objective Calls: 50
Gradient Calls: 50
opt1.minimizer
3-element Array{Float64,1}:
2.9942
2.99006
-1.0651 #Note: needs to be exponentiated
# Get Hessian, use Newton!
fun2 = TwiceDifferentiable(vars -> LL_anon(x, y, vars[1:nvar], vars[nvar + 1]),
ones(nvar+1))
opt2 = optimize(fun2, ones(nvar+1))
Results of Optimization Algorithm
Algorithm: Newton’s Method
Starting Point: [1.0,1.0,1.0]
Minimizer: [2.99420415098702,2.9900626550079026, …]
Minimum: 3.538340e+02
Iterations: 9
Convergence: true
|x - x’| ≤ 1.0e-32: false
|x - x’| = 1.36e-11
|f(x) - f(x’)| ≤ 1.0e-32 |f(x)|: false
|f(x) - f(x’)| = 1.61e-16 |f(x)|
|g(x)| ≤ 1.0e-08: true
|g(x)| = 6.27e-09
Stopped by an increasing objective: true
Reached Maximum Number of Iterations: false
Objective Calls: 45
Gradient Calls: 45
Hessian Calls: 9
fieldnames(fun2)
13-element Array{Symbol,1}:
:f
:df
:fdf
:h
:F
:DF
:H
:x_f
:x_df
:x_h
:f_calls
:df_calls
:h_calls
opt2.minimizer
3-element Array{Float64,1}:
2.98627
3.00654
-1.11313
numerical_hessian = (fun2.H) #.H is the numerical Hessian
3×3 Array{Float64,2}:
64.8715 -9.45045 0.000121521
-0.14568 66.4507 0.0
1.87326e-6 4.10675e-9 44.7214
From here, one can use the numerical Hessian to obtain the standard errors for the estimates and form t-statistics, etc. for their own functions.
Again, thank you for providing an answer and I hope people find this information useful.
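One caveat about the likelihood as written: length(X) counts every entry of X, so with nvar = 2 it equals 2*nobs rather than nobs. That may explain why exp(-1.0651) ≈ 0.345 comes out close to 0.5/√2 ≈ 0.354 instead of the true 0.5. The sketch below assumes the intended sample size is size(X, 1) and checks it against the closed-form maximum likelihood estimates; nll, β_hat, σ_hat and the seed are my own choices, and only the standard library is used.

```julia
using Random, LinearAlgebra

Random.seed!(1)                            # arbitrary seed, only for reproducibility
nobs, nvar = 500, 2
X = [ones(nobs) randn(nobs, nvar - 1)]
y = X * fill(3.0, nvar) + 0.5 * randn(nobs)

# Negative log-likelihood using the number of rows as the sample size
function nll(X, y, β, logσ)
    n = size(X, 1)                         # number of observations, not length(X)
    σ = exp(logσ)
    return n * log(2π) / 2 + n * logσ + sum(abs2, y - X * β) / (2σ^2)
end

β_hat = X \ y                                      # least squares, which is the MLE for β
σ_hat = sqrt(sum(abs2, y - X * β_hat) / nobs)      # closed-form MLE for σ

entries_vs_rows = (length(X), size(X, 1))          # (1000, 500) for nvar = 2
```

Passing a function like nll to optimize in place of LL_anon should then give a σ estimate near 0.5 after exponentiating, in line with σ_hat.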
