I'm trying to use JuMP to solve a non-linear problem, where the number of variables are decided by the user - that is, not known at compile time.
To accomplish this, the #NLobjective line looks like this:
#eval #JuMP.NLobjective(m, Min, $(Expr(:call, :myf, [Expr(:ref, :x, i) for i=1:n]...)))
Where, for instance, if n=3, the compiler interprets the line as identical to:
#JuMP.NLobjective(m, Min, myf(x[1], x[2], x[3]))
The issue is that #eval works only in the global scope, and when contained in a function, an error is thrown.
My question is: how can I accomplish this same functionality -- getting #NLobjective to call myf with a variable number of x[1],...,x[n] arguments -- within the local, not-known-at-compilation scope of a function?
def testme(n)
myf(a...) = sum(collect(a).^2)
m = JuMP.Model(solver=Ipopt.IpoptSolver())
JuMP.register(m, :myf, n, myf, autodiff=true)
#JuMP.variable(m, x[1:n] >= 0.5)
#eval #JuMP.NLobjective(m, Min, $(Expr(:call, :myf, [Expr(:ref, :x, i) for i=1:n]...)))
JuMP.solve(m)
end
testme(3)
Thanks!
As explained in http://jump.readthedocs.io/en/latest/nlp.html#raw-expression-input , objective functions can be given without the macro. The relevant expression:
JuMP.setNLobjective(m, :Min, Expr(:call, :myf, [x[i] for i=1:n]...))
is even simpler than the #eval based one and works in the function. The code is:
using JuMP, Ipopt
function testme(n)
myf(a...) = sum(collect(a).^2)
m = JuMP.Model(solver=Ipopt.IpoptSolver())
JuMP.register(m, :myf, n, myf, autodiff=true)
#JuMP.variable(m, x[1:n] >= 0.5)
JuMP.setNLobjective(m, :Min, Expr(:call, :myf, [x[i] for i=1:n]...))
JuMP.solve(m)
return [getvalue(x[i]) for i=1:n]
end
testme(3)
and it returns:
julia> testme(3)
:
EXIT: Optimal Solution Found.
3-element Array{Float64,1}:
0.5
0.5
0.5
Related
I am a newbie to Julia and Flux with some experience in Tensorflow Keras and python. I tried to use the Flux.withgradient command to write a user-defined training function with more flexibility. Here is the training part of my code:
loss, grad = Flux.withgradient(modelDQN.evalParameters) do
qEval = modelDQN.evalModel(evalInput)
Flux.mse(qEval, qTarget)
end
Flux.update!(modelDQN.optimizer, modelDQN.evalParameters, grad)
This code works just fine. But if I put the command qEval = modelDQN.evalModel(evalInput) outside the do end loop, as follows:
qEval = modelDQN.evalModel(evalInput)
loss, grad = Flux.withgradient(modelDQN.evalParameters) do
Flux.mse(qEval, qTarget)
end
Flux.update!(modelDQN.optimizer, modelDQN.evalParameters, grad)
The model parameters will not be updated. As far as I know, the do end loop works as an anonymous function that takes 0 arguments. Then why do we need the command qEval = modelDQN.evalModel(evalInput) inside the loop to get the model updated?
The short answer is that anything to be differentiated has to happen inside the (anonymous) function which you pass to gradient (or withgradient), because this is very much not a standard function call -- Zygote (Flux's auto-differentiation library) traces its execution to compute the derivative, and can't transform what it can't see.
Longer, this is Zygote's "implicit" mode, which relies on global references to arrays. The simplest use is something like this:
julia> using Zygote
julia> x = [2.0, 3.0];
julia> g = gradient(() -> sum(x .^ 2), Params([x]))
Grads(...)
julia> g[x] # lookup by objectid(x)
2-element Vector{Float64}:
4.0
6.0
If you move some of that calculation outside, then you make a new array y with a new objectid. Julia has no memory of where this came from, it is completely unrelated to x. They are ordinary arrays, not a special tracked type.
So if you refer to y in the gradient, Zygote cannot infer how this depends on x:
julia> y = x .^ 2 # calculate this outside of gradient
2-element Vector{Float64}:
4.0
9.0
julia> g2 = gradient(() -> sum(y), Params([x]))
Grads(...)
julia> g2[x] === nothing # represents zero
true
Zygote doesn't have to be used in this way. It also has an "explicit" mode which does not rely on global references. This is perhaps less confusing:
julia> gradient(x1 -> sum(x1 .^ 2), x) # x1 is a local variable
([4.0, 6.0],)
julia> gradient(x1 -> sum(y), x) # sum(y) is obviously indep. x1
(nothing,)
julia> gradient((x1, y1) -> sum(y1), x, y)
(nothing, Fill(1.0, 2))
Flux is in the process of changing to use this second form. On v0.13.9 or later, something like this ought to work:
opt_state = Flux.setup(modelDQN.optimizer, modelDQN) # do this once
loss, grads = Flux.withgradient(modelDQN.model) do m
qEval = m(evalInput) # local variable m
Flux.mse(qEval, qTarget)
end
Flux.update!(opt_state, modelDQN.model, grads[1])
so I wrote a minimum example to show what I'm trying to do. Basically I want to solve a optimization problem with multiple variables. When I try to do this in JuMP I was having issues with my function obj not being able to take a forwardDiff object.
I looked here: and it seemed to do with the function signature :Restricting function signatures while using ForwardDiff in Julia . I did this in my obj function, and for insurance did it in my sub-function as well, but I still get the error
LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#110#112"{typeof(my_fun)},Float64},Float64,2})
Closest candidates are:
Float64(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
Float64(::T) where T<:Number at boot.jl:715
Float64(::Int8) at float.jl:60
This still does not work. I feel like I have the bulk of the code correct, just some weird of type thing going on that I have to clear up so autodifferentiate works...
Any suggestions?
using JuMP
using Ipopt
using LinearAlgebra
function obj(x::Array{<:Real,1})
println(x)
x1 = x[1]
x2 = x[2]
eye= Matrix{Float64}(I, 4, 4)
obj_val = tr(eye-kron(mat_fun(x1),mat_fun(x2)))
println(obj_val)
return obj_val
end
function mat_fun(var::T) where {T<:Real}
eye= Matrix{Float64}(I, 2, 2)
eye[2,2]=var
return eye
end
m = Model(Ipopt.Optimizer)
my_fun(x...) = obj(collect(x))
#variable(m, 0<=x[1:2]<=2.0*pi)
register(m, :my_fun, 2, my_fun; autodiff = true)
#NLobjective(m, Min, my_fun(x...))
optimize!(m)
# retrieve the objective value, corresponding x values and the status
println(JuMP.value.(x))
println(JuMP.objective_value(m))
println(JuMP.termination_status(m))
Use instead
function obj(x::Vector{T}) where {T}
println(x)
x1 = x[1]
x2 = x[2]
eye= Matrix{T}(I, 4, 4)
obj_val = tr(eye-kron(mat_fun(x1),mat_fun(x2)))
println(obj_val)
return obj_val
end
function mat_fun(var::T) where {T}
eye= Matrix{T}(I, 2, 2)
eye[2,2]=var
return eye
end
Essentially, anywhere you see Float64, replace it by the type in the incoming argument.
I found the problem:
in my mat_fun the type of the return had to be "Real" in order for it to propgate through. Before it was Float64, which was not consistent with the fact I guess all types have to be Real with the autodifferentiate. Even though a Float64 is clearly Real, it looks like the inheritence isn't perserved i.e you have to make sure everything that is returned and inputed are type Real.
using JuMP
using Ipopt
using LinearAlgebra
function obj(x::AbstractVector{T}) where {T<:Real}
println(x)
x1 = x[1]
x2 = x[2]
eye= Matrix{Float64}(I, 4, 4)
obj_val = tr(eye-kron(mat_fun(x1),mat_fun(x2)))
#println(obj_val)
return obj_val
end
function mat_fun(var::T) where {T<:Real}
eye= zeros(Real,(2,2))
eye[2,2]=var
return eye
end
m = Model(Ipopt.Optimizer)
my_fun(x...) = obj(collect(x))
#variable(m, 0<=x[1:2]<=2.0*pi)
register(m, :my_fun, 2, my_fun; autodiff = true)
#NLobjective(m, Min, my_fun(x...))
optimize!(m)
# retrieve the objective value, corresponding x values and the status
println(JuMP.value.(x))
println(JuMP.objective_value(m))
println(JuMP.termination_status(m))
The github repo for ForwardDiff.jl has some examples. I am trying to extend the example to take in addition to a vector of variables, a parameter. I cannot get it to work.
This is the example (it is short so I will show it rather than linking)
using ForwardDiff
x = rand(5)
f(x::Vector) = sum(sin, x) .+ prod(tan, x) * sum(sqrt, x);
g = x -> ForwardDiff.gradient(f, x);
g(x) # this outputs the gradient.
I want to modify this since I use functions with multiple parameters as well as variables. As a simple modification I have tried adding a single parameter.
f(x::Vector, y) = (sum(sin, x) .+ prod(tan, x) * sum(sqrt, x)) * y;
I have tried the following to no avail:
fp = x -> ForwardDiff.gradient(f, x);
fp = x -> ForwardDiff.gradient(f, x, y);
y = 1
println("test grad: ", fp(x, y))
I get the following error message:
ERROR: LoadError: MethodError: no method matching (::var"#73#74")(::Array{Float64,1}, ::Int64)
A similar question was not answered in 2017. A comment led me to here and it seems the function can only accept one input?
The target function must be unary (i.e., only accept a single argument). ForwardDiff.jacobian is an exception to this rule.
Has this changed? It seems very limited to only be able to differentiate unary functions.
A possible workaround would be to concatenate the list of variables and parameters and then just slice the returned gradient to not include the gradients with respect to the parameters, but this seems silly.
I personally think it makes sense to have this unary-only syntax for ForwardDiff. In your case, you could just pack/unpack x and y into a single vector (nominally x2 below):
julia> using ForwardDiff
julia> x = rand(5)
5-element Array{Float64,1}:
0.4304735670747184
0.3939269364431113
0.7912705403776603
0.8942024934250143
0.5724373306715196
julia> f(x::Vector, y) = (sum(sin, x) .+ prod(tan, x) * sum(sqrt, x)) * y;
julia> y = 1
1
julia> f(x2::Vector) = f(x2[1:end-1], x2[end]) % unpacking in f call
f (generic function with 2 methods)
julia> fp = x -> ForwardDiff.gradient(f, x);
julia> println("test grad: ", fp([x; y])) % packing in fp call
test grad: [2.6105844240785796, 2.741442601659502, 1.9913192377198885, 1.9382805843854594, 2.26202717745402, 3.434350946190029]
But my preference would be to explicitly name the partial derivatives differently:
julia> ∂f∂x(x,y) = ForwardDiff.gradient(x -> f(x,y), x)
∂f∂x (generic function with 1 method)
julia> ∂f∂y(x,y) = ForwardDiff.derivative(y -> f(x,y), y)
∂f∂y (generic function with 1 method)
julia> ∂f∂x(x, y)
5-element Array{Float64,1}:
2.6105844240785796
2.741442601659502
1.9913192377198885
1.9382805843854594
2.26202717745402
julia> ∂f∂y(x, y)
3.434350946190029
Here's a quick attempt at a function which takes multiple arguments, the same signature as Zygote.gradient:
julia> using ForwardDiff, Zygote
julia> multigrad(f, xs...) = ntuple(length(xs)) do i
g(y) = f(ntuple(j -> j==i ? y : xs[j], length(xs))...)
xs[i] isa AbstractArray ? ForwardDiff.gradient(g, xs[i]) :
xs[i] isa Number ? ForwardDiff.derivative(g, xs[i]) : nothing
end;
julia> f1(x,y,z) = sum(x.^2)/y;
julia> multigrad(f1, [1,2,3], 4)
([0.5, 1.0, 1.5], -0.875)
julia> Zygote.gradient(f1, [1,2,3], 4)
([0.5, 1.0, 1.5], -0.875)
For a function with several scalar arguments, this evaluates each derivative separately, and perhaps it would be more efficient to use one evaluation with some Dual(x, (dx, dy, dz)). With large-enough array arguments, ForwardDiff.gradient will already perform multiple evaluations, each with some number of perturbations (the chunk size, which you can control).
I'm trying to build a function that will output an expression to be assigned to a new in-memory function. I might be misinterpreting the capability of metaprogramming but, I'm trying to build a function that generates a math series and assigns it to a function such as:
main.jl
function series(iter)
S = ""
for i in 1:iter
a = "x^$i + "
S = S*a
end
return chop(S, tail=3)
end
So, this will build the pattern and I'm temporarily working with it in the repl:
julia> a = Meta.parse(series(4))
:(x ^ 1 + x ^ 2 + x ^ 3 + x ^ 4)
julia> f =eval(Meta.parse(series(4)))
120
julia> f(x) =eval(Meta.parse(series(4)))
ERROR: cannot define function f; it already has a value
Obviously eval isn't what I'm looking for in this case but, is there another function I can use? Or, is this just not a viable way to accomplish the task in Julia?
The actual error you get has to do nothing with metaprogramming, but with the fact that you are reassigning f, which was assigned a value before:
julia> f = 10
10
julia> f(x) = x + 1
ERROR: cannot define function f; it already has a value
Stacktrace:
[1] top-level scope at none:0
[2] top-level scope at REPL[2]:1
It just doesn't like that. Call either of those variables differently.
Now to the conceptual problem. First, what you do here is not "proper" metaprogramming in Julia: why deal with strings and parsing at all? You can work directly on expressions:
julia> function series(N)
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :(x ^ $i))
end
return S
end
series (generic function with 1 method)
julia> series(3)
:(x ^ 1 + x ^ 2 + x ^ 3)
This makes use of the fact that + belongs to the class of expressions that are automatically collected in repeated applications.
Second, you don't call eval at the appropriate place. I assume you meant to say "give me the function of x, with the body being what series(4) returns". Now, while the following works:
julia> f3(x) = eval(series(4))
f3 (generic function with 1 method)
julia> f3(2)
30
it is not ideal, as you newly compile the body every time the function is called. If you do something like that, it is preferred to expand the code once into the body at function definition:
julia> #eval f2(x) = $(series(4))
f2 (generic function with 1 method)
julia> f2(2)
30
You just need to be careful with hygiene here. All depends on the fact that you know that the generated body is formulated in terms of x, and the function argument matches that. In my opinion, the most Julian way of implementing your idea is through a macro:
julia> macro series(N::Int, x)
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :($x ^ $i))
end
return S
end
#series (macro with 1 method)
julia> #macroexpand #series(4, 2)
:(2 ^ 1 + 2 ^ 2 + 2 ^ 3 + 2 ^ 4)
julia> #series(4, 2)
30
No free variables remaining in the output.
Finally, as has been noted in the comments, there's a function (and corresponding macro) evalpoly in Base which generalizes your use case. Note that this function does not use code generation -- it uses a well-designed generated function, which in combination with the optimizations results in code that is usually equal to the macro-generated code.
Another elegant option would be to use the multiple-dispatch mechanism of Julia and dispatch the generated code on type rather than value.
#generated function series2(p::Val{N}, x) where N
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :(x ^ $i))
end
return S
end
Usage
julia> series2(Val(20), 150.5)
3.5778761722367333e43
julia> series2(Val{20}(), 150.5)
3.5778761722367333e43
This task can be accomplished with comprehensions. I need to RTFM...
https://docs.julialang.org/en/v1/manual/arrays/#Generator-Expressions
I'm trying to call BLAS in Julia using ccall like this
ccall((BLAS.#blasfunc(:zgemm_), BLAS.libblas),...other arguments)
For as far as I can tell, this is the same way the LinearAlgebra package calls BLAS (link to source)
I get the following error however:
ccall: could not find function :zgemm_64_ in library libopenblas64_
Anyone have any idea what could be the problem?
EDIT: found out that using :zgemm_64_ directly instead of BLAS.#blasfunc(:zgemm_) solved the error, but I'd still like to know why.
In case it becomes necessary, here is the full function where I make the BLAS call.
import LinearAlgebra: norm, lmul!, rmul!, BlasInt, BLAS
# Preallocated version of A = A*B
function rmul!(
A::AbstractMatrix{T},
B::AbstractMatrix{T},
workspace::AbstractVector{T}
) where {T<:Number}
m,n,lw = size(A,1), size(B,2), length(workspace)
if(size(A,2) !== size(B,1))
throw(DimensionMismatch("dimensions of A and B don't match"))
end
if(size(B,1) !== n)
throw(DimensionMismatch("A must be square"))
end
if(lw < m*n)
throw(DimensionMismatch("provided workspace is too small"))
end
# Multiplication via direct blas call
ccall((BLAS.#blasfunc(:zgemm_), BLAS.libblas), Cvoid,
(Ref{UInt8}, Ref{UInt8}, Ref{BlasInt}, Ref{BlasInt},
Ref{BlasInt}, Ref{T}, Ptr{T}, Ref{BlasInt},
Ptr{T}, Ref{BlasInt}, Ref{T}, Ptr{T},
Ref{BlasInt}),
'N', 'N', m, n,n, 1.0, A, max(1,stride(A,2)),B, max(1,stride(B,2)), 0.0, workspace, n)
# Copy temp to A
for j=1:n
for i=1:m
A[i,j] = workspace[j+*(i-1)*n]
end
end
end
function test_rmul(m::Integer, n::Integer)
BLAS.set_num_threads(1)
A = rand(ComplexF64, m,n)
Q = rand(ComplexF64, n,n)
workspace = similar(A, m*n)
A_original = copy(A)
Q_original = copy(Q)
rmul!(A,Q,workspace)
#show norm(A_original*Q_original - A)
#show norm(Q_original - Q)
end
test_rmul(100,50)
BLAS.#blasfunc(:zgemm_) returns Symbol(":zgemm_64_"), and not :zgemm_64_, which looks rather strange in the first place... it's hygienic in the technical sense, but admittedly confusing. The reason it works in the original implementation is because there, the symbol with the name is always spliced into #eval; compare:
julia> #eval begin
BLAS.#blasfunc(:zgemm_)
end
Symbol(":zgemm_64_")
julia> #eval begin
BLAS.#blasfunc($(:zgemm_))
end
:zgemm_64_
So, #blasfunc expects its argument to be a name (i.e., a symbol in the AST), not a symbol literal (a quoted symbol in the AST). You could equivalently write it like a variable name:
julia> #eval begin
BLAS.#blasfunc zgemm_
end
:zgemm_64_
(without zgemm_ being actually defined in this scope!)