Optim Julia parameter meaning - julia

I'm trying to use Optim in Julia to solve a two-variable minimization problem, similar to the following:
using Optim
x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]
function sqerror(betas, X, Y)
err = 0.0
for i in 1:length(X)
pred_i = betas[1] + betas[2] * X[i]
err += (Y[i] - pred_i)^2
end
return err
end
res = optimize(b -> sqerror(b, x, y), [0.0,0.0])
res.minimizer
I do not quite understand what [0.0,0.0] means. From the documentation at http://julianlsolvers.github.io/Optim.jl/v0.9.3/user/minimization/, my understanding is that it is the initial condition. However, if I change it to [0.0,0.0,0.0], the algorithm still works even though I only have two unknowns, and it returns three minimizer values instead of two. Does anyone know what [0.0,0.0] really stands for?

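It is the initial value. optimize by itself cannot know how many parameters your sqerror function expects; you specify that through the length of the initial value you pass. With three entries, sqerror only ever reads betas[1] and betas[2], so the third entry is simply carried along unused.
For example, if you add a dimensionality check to sqerror, you will get a proper error: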
julia> function sqerror(betas::AbstractVector, X::AbstractVector, Y::AbstractVector)
@assert length(betas) == 2
err = 0.0
for i in eachindex(X, Y)
pred_i = betas[1] + betas[2] * X[i]
err += (Y[i] - pred_i)^2
end
return err
end
sqerror (generic function with 2 methods)
julia> optimize(b -> sqerror(b, x, y), [0.0,0.0,0.0])
ERROR: AssertionError: length(betas) == 2
Note that I also changed the loop range to eachindex(X, Y), which throws an error if the X and Y vectors do not have matching indices.
Finally, if you care about performance and want to reduce compilation cost (e.g. because you run this optimization many times), it is better to build the objective in a function that returns the closure, so that x and y are captured as local variables rather than read as non-constant globals:
objective_factory(x, y) = b -> sqerror(b, x, y)
optimize(objective_factory(x, y), [0.0,0.0])
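To see why a three-element initial value is accepted, here is a minimal sketch that does not need Optim at all. The name sqerror_unchecked is only illustrative; it is the question's objective without a length check. It shows that the third entry never influences the objective, and it computes the closed-form least-squares fit that res.minimizer should approximate for the two-parameter case.

```julia
using LinearAlgebra  # for the least-squares backslash solve

x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]

# Same objective as in the question, without a length check (illustrative name).
function sqerror_unchecked(betas, X, Y)
    err = 0.0
    for i in eachindex(X, Y)
        err += (Y[i] - (betas[1] + betas[2] * X[i]))^2
    end
    return err
end

# betas[3] is never read, so its value cannot change the objective.
err_a = sqerror_unchecked([1.0, 2.0, 0.0], x, y)
err_b = sqerror_unchecked([1.0, 2.0, 123.0], x, y)
println(err_a == err_b)

# Closed-form least-squares fit, for comparison with `res.minimizer`.
A = [ones(length(x)) x]   # columns: intercept, slope
beta_ls = A \ y
println(beta_ls)
```

Because the objective is flat in the unused direction, an optimizer has no reason to move that entry in any meaningful way, which is why the length check in the answer is worth having.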

Related

Type instability in Julia Generator

While trying to reduce the number of allocations generated by a function computing a likelihood by e.g. using Generator expressions, I came across the following behavior which I do not quite understand. Take the following two functions:
function testMax!(x,X,β)
xmax = 0.0
@inbounds for i ∈ eachindex(x)
x[i] = X[i,2] * β[1] + X[i,2] * β[2]
if x[i] > xmax
xmax = x[i]
end
end
y = 0.0
for i ∈ eachindex(x)
y += exp(x[i]-xmax)
end
return xmax, y
end
function testMaxWeird!(x,X,β)
xmax = 0.0
@inbounds for i ∈ eachindex(x)
x[i] = X[i,2] * β[1] + X[i,2] * β[2]
if x[i] > xmax
xmax = x[i]
end
end
y = sum(exp(x[j]-xmax) for j ∈ eachindex(x))
return xmax, y
end
Both generate the same output
using Random
Random.seed!(1234)
H = 10000;
X = rand(H,2);
β = rand(2);
x = zeros(H);
testMax!(x,X,β)
x = zeros(H);
testMaxWeird!(x,X,β)
Each call returns (1.0772897308017204, 6101.682959406999). However, the first one is type stable, while the second one is not (and is therefore much slower).
@code_warntype testMax!(x,X,β)
@code_warntype testMaxWeird!(x,X,β)
In particular, the problem lies with the types of y and xmax; the @code_warntype outputs differ in the lines
y::Float64
xmax::Float64
versus
y::Any
xmax@_9::Core.Box
I am confused as to why exactly this occurs. Is it due to bad practice in how I assign xmax multiple times within the function, or to the way I am using the Generator expression?
Edit/Follow-up
The references and solutions provided are very helpful. I am still somewhat confused as to when exactly this can be expected to happen. Is it due to the way xmax is updated within the for-loop, and is its scope different from that of any other local variable defined within the function? Why does, for example, the following (less efficient) way of computing the max not lead to the same closure issue?
function testMax2!(x,X,β)
@inbounds for i ∈ eachindex(x)
x[i] = X[i,2] * β[1] + X[i,2] * β[2]
end
xmax = maximum(x)
y = sum(exp(x[j]-xmax) for j ∈ eachindex(x))
return xmax, y
end
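Edit 2: Never mind, I think the Performance Tips explain this: "The parser, when translating it into lower-level instructions, substantially reorganizes the above code by extracting the inner function to a separate code block." I assume this means the problem comes from the variable being assigned multiple times, which the compiler cannot handle well once the inner function has been extracted.
This is a variation on a longstanding issue with variables captured by closures: https://github.com/JuliaLang/julia/issues/15276. A generator expression is lowered to a closure, and because xmax is assigned more than once before being captured, it is stored in a Core.Box, which makes it and everything computed from it type-unstable. In testMax2!, xmax is assigned exactly once, so no box is needed.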
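A common workaround, described in the Performance Tips section on captured variables, is to rebind the variable in a let block right before the closure is created, so the closure captures a binding that is assigned only once. The following is a sketch under that assumption; the function name testMaxLet! is made up here, and the loop body is copied from the question as posted.

```julia
function testMaxLet!(x, X, β)
    xmax = 0.0
    @inbounds for i in eachindex(x)
        x[i] = X[i, 2] * β[1] + X[i, 2] * β[2]
        if x[i] > xmax
            xmax = x[i]
        end
    end
    # Rebind xmax so the generator's closure captures a single-assignment variable.
    y = let xmax = xmax
        sum(exp(x[j] - xmax) for j in eachindex(x))
    end
    return xmax, y
end

H = 1000
X = rand(H, 2)
β = rand(2)
x = zeros(H)
xmax_let, y_let = testMaxLet!(x, X, β)
println((xmax_let, y_let))
```

Running @code_warntype on this version is a quick way to check whether the Core.Box has disappeared on your Julia version.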

How to halt a loop in Julia and print an error message at the same time without using any macros?

I am writing a simple Newton's method
x_(n+1) = x_n - f(x_n) / f_prime(x_n)
to find the roots (can be a real number or a complex number) of a quadratic function:
f(x) = a*x*x + b*x + c
(a, b, c are given constants and are all real numbers). I know Newton's method will fail if the starting point or some iterate in the loop has a zero derivative. I want to use an if statement inside my for/while loop to guard against this situation. Does Julia have something like the stop 0 syntax in Fortran?
The generic Newton's Method root-finding code:
function newton_root_finding(f, f_diff, x0, rtol=1e-8, atol=1e-8)
f_x0 = f(x0)
f_diff_x0 = f_diff(x0)
x1 = x0 - f_x0 / f_diff_x0
f_diff_x1 = f_diff(x1)
@assert abs(f_diff_x0) > atol + rtol * abs(f_diff_x0) "Zero derivative. No solution found."
while abs(f_x0) > atol + rtol * (abs(f_x0))
x0 = x1
f_x0 = f(x0)
f_diff_x0 = f_diff(x0)
x1 = x0 - f_x0 / f_diff_x0
end
return x1
end
function quadratic_func(x)
a = 1.0
b = 0.0
c = 2.0
return a*x*x + b*x + c
end
function quadratic_func_diff(x)
a = 1.0
b = 0.0
c = 2.0
return 2.0*a*x + 1.0*b + 0.0*c
end
newton_root_finding(quadratic_func, quadratic_func_diff, 1.0 + 0.5im)
In the above code I used the @assert macro to make this happen, but I don't want to use any macro. I want to use an if statement inside my while loop to halt it. Another thing I've noticed is that if I change it to @assert abs(f_diff_x0) != 0, the test is effectively ignored. Is that because, due to round-off errors, the "zero derivative" is not exactly equal to 0?
The general way to exit from inside a loop is a break statement; a return serves the same purpose, because it exits the whole function.
For the comparisons you can use Base.isapprox(x, y; atol=atol, rtol=rtol). Its documentation starts with:
Inexact equality comparison: true if norm(x-y) <= max(atol, rtol*max(norm(x), norm(y))).
norm falls back to abs for numbers. I also think you have a bug in both comparisons: each one compares the value at x0 to itself.
As for stopping on zero derivatives, an @assert is, I think, appropriate here: if you get a zero derivative, you don't want to stop iterating and return the result, but to throw an error that signals an infeasible condition. I'd thus write your function as follows:
function newton_root_finding(f, ∂f, x0, rtol=1e-8, atol=1e-8)
x_old = x0
y_old = f(x0)
while true
df_old = ∂f(x_old)
@assert !isapprox(df_old, 0, rtol=rtol, atol=atol) "Zero derivative. No solution found."
x_new = x_old - y_old / df_old
y_new = f(x_new)
isapprox(y_old, y_new, rtol=rtol, atol=atol) && return x_new
x_old, y_old = x_new, y_new
end
end
This returns 3.357392012620626e-26 + 1.4142135623730951im on your test case, approximately sqrt(2)im.
To address your first question, you can use break to exit the while loop, like this:
function test()
i = 0
while true
i += 1
if i > 10
break
end
end
return i
end
As to your second question, when comparing floating point numbers it is often better to use isapprox (provide an atol if you compare against zero) instead of == or !=.
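Combining the two answers, a macro-free variant could look like the sketch below. It is not the code from either answer: the name newton_no_macro, the keyword arguments, and the maxiter cap are assumptions added for illustration. error is an ordinary function, so an if statement plus error(...) halts the loop and reports the message without any macro.

```julia
function newton_no_macro(f, ∂f, x0; rtol=1e-8, atol=1e-8, maxiter=100)
    x = x0
    for _ in 1:maxiter
        d = ∂f(x)
        # Plain `if` plus `error` replaces the @assert from the answer above.
        if isapprox(d, 0; atol=atol)
            error("Zero derivative. No solution found.")
        end
        x_new = x - f(x) / d
        # Stop once successive iterates agree within the tolerances.
        if isapprox(x_new, x; rtol=rtol, atol=atol)
            return x_new
        end
        x = x_new
    end
    error("No convergence after $maxiter iterations.")
end

quad(x) = x * x + 2.0      # same quadratic as in the question (a=1, b=0, c=2)
quad_diff(x) = 2.0 * x

root = newton_no_macro(quad, quad_diff, 1.0 + 0.5im)
println(root)
```

Starting from 0.0 instead hits the zero-derivative branch on the first iteration and throws, which is the behaviour the question asks for.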

"Syntax error" while writing OCaml function?

I am writing a function in OCaml to raise x to the power of y.
My code is:
#let rec pow x y =
if y == 0 then 1 else
if (y mod 2 = 0) then pow x y/2 * pow x y/2 else
x * pow x y/2 * pow x y/2;;
When I try to execute it, I get an error for syntax in line one, but it doesn't tell me what it is.
When you wrote the code, did you type the #? The # is just a character that the OCaml REPL outputs to prompt for input; it is not part of the code. You should not type it.
Here are some other errors that you should fix:
== is physical equality in OCaml. = is structural equality. Although both work the same for unboxed types (such as int), it's better practice to do y = 0. Note that you use =, the recommended equality, in the expression y mod 2 = 0.
You need parentheses around y/2. pow x y/2 parses as (pow x y) / 2, but you want pow x (y / 2).

Load Error when trying to pass complicated function into Simpson's rule

I have written a method that approximates a definite integral by the composite Simpson's rule.
#=
f integrand
a lower integration bound
b upper integration bound
n number of iterations or panels
h step size
=#
function simpson(f::Function, a::Number, b::Number, n::Number)
n % 2 == 0 || error("`n` must be even")
h = (b - a) / n
s = f(a) + f(b)
s += 4*sum(f(a .+ collect(1:2:n) .* h))
s += 2*sum(f(a .+ collect(2:2:n-1) .* h))
return h/3 * s
end
For "simple" functions, like e^(-x^2), the simpson function works.
Input: simpson(x -> exp.(-x.^2), 0, 5, 100)
Output: 0.8862269254513949
However, for the more complicated function f(x)
gArgs(x) = (30 .+ x, 0)
f(x) = exp.(-x.^2) .* maximum(generator.(gArgs.(x)...)[1])
where generator(θ, plotsol) is a function that takes in a defect θ in percent and a boolean value plotsol (either 0 or 1) that determines whether the generator should be plotted, and returns a vector with the magnetization in certain points in the generator.
When I try to compute the integral by running the below code
gArgs(x) = (30 .+ x, 0)
f(x) = exp.(-x.^2) .* maximum(generator.(gArgs.(x)...)[1])
println(simpson(x -> f(x), 0, 5, 10))
I encounter the error MethodError: no method matching generator(::Float64). With slight variants of the expression for f(x) I run into different errors, like DimensionMismatch("array could not be broadcast to match destination") and InexactError: Bool(33.75). In the end, I think the problem boils down to my not knowing how to write the integrand f(x) properly. Could someone help me figure out how to write f(x) correctly? Let me know if anything is unclear in my question.
Given an array x, gArgs.(x) returns an array of tuples, and you are then trying to broadcast over those tuples. Broadcasting treats tuples differently from what you might expect: a tuple is not treated as a single element but is itself broadcast over.
julia> println.(gArgs.([0.5, 1.5, 2.5, 3.5, 4.5])...)
30.531.532.533.534.5
00000
This is not what you expected, is it?
You can also see the problem in the following example:
julia> (2, 5) .!= [(2, 5)]
2-element BitArray{1}:
true
true
I believe f is really a function that takes a scalar and returns a scalar. Instead of making f work on arrays, you should leave the broadcasting to the caller. You are very likely better off implementing f element-wise. This is the more idiomatic Julia way of doing things and will make your job much easier.
That said, I believe your implementation should work with the following modifications, provided there is no error in generator itself.
function simpson(f::Function, a::Number, b::Number, n::Number)
n % 2 == 0 || error("`n` must be even")
h = (b - a) / n
s = f(a) + f(b)
s += 4*sum(f.(a .+ collect(1:2:n) .* h)) # broadcast `f`
s += 2*sum(f.(a .+ collect(2:2:n-1) .* h)) # broadcast `f`
return h/3 * s
end
# define `gArg` and `f` element-wise and `generator`, too.
gArgs(x) = (30 + x, 0) # get rid of broadcasting dot. Shouldn't `0` be `false`?
f(x) = exp(-x^2) * maximum(generator(gArgs(x)...)[1]) # get rid of broadcasting dots
println(simpson(f, 0, 5, 10)) # you can just write `f`
You should also define the generator function element-wise.
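Since generator is not shown in the question, the sketch below uses a hypothetical stand-in, toy_generator, that returns a tuple whose first element is a vector; its formula is invented purely so the example runs. The integrator simpson_scalar is also an illustrative name: it is the same composite Simpson's rule written as a scalar loop, so the integrand only ever sees one number at a time.

```julia
# Composite Simpson's rule that only ever calls f on scalars.
function simpson_scalar(f, a, b, n::Integer)
    iseven(n) || error("`n` must be even")
    h = (b - a) / n
    s = f(a) + f(b)
    for i in 1:n-1
        # Interior points get weight 4 (odd index) or 2 (even index).
        s += (isodd(i) ? 4 : 2) * f(a + i * h)
    end
    return h / 3 * s
end

# Hypothetical stand-in for the real `generator`; the formula is made up.
toy_generator(θ, plotsol::Bool) = ([0.01 * θ, 0.02 * θ], plotsol)

gArgs(x) = (30 + x, false)                               # scalar in, tuple out
f_toy(x) = exp(-x^2) * maximum(toy_generator(gArgs(x)...)[1])

gauss_integral = simpson_scalar(x -> exp(-x^2), 0, 5, 100)
toy_integral = simpson_scalar(f_toy, 0, 5, 10)
println(gauss_integral)
println(toy_integral)
```

The first result should agree closely with the 0.8862269254513949 reported in the question, because both versions evaluate the same Simpson sum.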

For-loop with the dimension flexibility of broadcasting

With the aid of broadcasting, the following code will work whether x, y, and z are scalars, vectors of size n, or any combination thereof.
b = zeros(n)
b .= x.*y.*z .+ x
However, I'd like a for-loop. The following for-loop only works when x is a vector of size n, y is a scalar, and z is a scalar.
for i = 1:n
b[i] = x[i]*y*z + x[i]
end
To write the equivalent of b .= x.*y.*z .+ x as a for-loop for any case, I can only think of writing a for-loop for every combination of x, y, and z within if-statements. This can get messy with more variables in more complicated math expressions.
Is there a more elegant way to do what I'd like than using many if-statements?
You could define a wrapper type whose indexing returns the indexed element if the wrapped variable is an array and repeats the same value for every index if it is a scalar. The example below uses cycle from Base.Iterators instead; it is probably not as efficient as broadcasting, and it does not check that the array lengths are consistent. A custom wrapper type would alleviate that.
julia> function f(x,y,z)
lx,ly,lz = length(x),length(y),length(z)
maxlen = max(lx,ly,lz)
cx = Iterators.cycle(x)
cy = Iterators.cycle(y)
cz = Iterators.cycle(z)
b = zeros(maxlen)
@inbounds for (xi,yi,zi,i) in zip(cx,cy,cz,1:maxlen)
b[i] = xi*yi*zi+xi
end
return b
end
f (generic function with 1 method)
julia> f(1:3,21,2)
3-element Array{Float64,1}:
43.0
86.0
129.0
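The wrapper type mentioned in the answer could be sketched as follows. The names ScalarOrArray and flexible_loop are hypothetical, and this version still does not verify that all arrays have length n; it only shows how dispatch on the wrapped type lets one loop body serve every scalar/vector combination.

```julia
# Hypothetical wrapper: indexing an array wrapper indexes the array,
# indexing a scalar wrapper returns the scalar for every index.
struct ScalarOrArray{T}
    value::T
end
Base.getindex(w::ScalarOrArray{<:AbstractArray}, i::Integer) = w.value[i]
Base.getindex(w::ScalarOrArray{<:Number}, i::Integer) = w.value

function flexible_loop(x, y, z, n)
    wx, wy, wz = ScalarOrArray(x), ScalarOrArray(y), ScalarOrArray(z)
    b = zeros(n)
    for i in 1:n
        b[i] = wx[i] * wy[i] * wz[i] + wx[i]
    end
    return b
end

println(flexible_loop(1:3, 21, 2, 3))
println(flexible_loop([1.0, 2.0], [3.0, 4.0], 2, 2))
```

Because the getindex method is chosen from the wrapper's type parameter, the choice between "index" and "repeat" is made by dispatch rather than by if-statements inside the loop.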

Resources