I am a newbie to Julia and Flux with some experience in Tensorflow Keras and python. I tried to use the Flux.withgradient command to write a user-defined training function with more flexibility. Here is the training part of my code:
loss, grad = Flux.withgradient(modelDQN.evalParameters) do
qEval = modelDQN.evalModel(evalInput)
Flux.mse(qEval, qTarget)
end
Flux.update!(modelDQN.optimizer, modelDQN.evalParameters, grad)
This code works just fine. But if I put the command qEval = modelDQN.evalModel(evalInput) outside the do end loop, as follows:
qEval = modelDQN.evalModel(evalInput)
loss, grad = Flux.withgradient(modelDQN.evalParameters) do
Flux.mse(qEval, qTarget)
end
Flux.update!(modelDQN.optimizer, modelDQN.evalParameters, grad)
The model parameters will not be updated. As far as I know, the do end loop works as an anonymous function that takes 0 arguments. Then why do we need the command qEval = modelDQN.evalModel(evalInput) inside the loop to get the model updated?
The short answer is that anything to be differentiated has to happen inside the (anonymous) function which you pass to gradient (or withgradient), because this is very much not a standard function call -- Zygote (Flux's auto-differentiation library) traces its execution to compute the derivative, and can't transform what it can't see.
Longer, this is Zygote's "implicit" mode, which relies on global references to arrays. The simplest use is something like this:
julia> using Zygote
julia> x = [2.0, 3.0];
julia> g = gradient(() -> sum(x .^ 2), Params([x]))
Grads(...)
julia> g[x] # lookup by objectid(x)
2-element Vector{Float64}:
4.0
6.0
If you move some of that calculation outside, then you make a new array y with a new objectid. Julia has no memory of where this came from, it is completely unrelated to x. They are ordinary arrays, not a special tracked type.
So if you refer to y in the gradient, Zygote cannot infer how this depends on x:
julia> y = x .^ 2 # calculate this outside of gradient
2-element Vector{Float64}:
4.0
9.0
julia> g2 = gradient(() -> sum(y), Params([x]))
Grads(...)
julia> g2[x] === nothing # represents zero
true
Zygote doesn't have to be used in this way. It also has an "explicit" mode which does not rely on global references. This is perhaps less confusing:
julia> gradient(x1 -> sum(x1 .^ 2), x) # x1 is a local variable
([4.0, 6.0],)
julia> gradient(x1 -> sum(y), x) # sum(y) is obviously indep. x1
(nothing,)
julia> gradient((x1, y1) -> sum(y1), x, y)
(nothing, Fill(1.0, 2))
Flux is in the process of changing to use this second form. On v0.13.9 or later, something like this ought to work:
opt_state = Flux.setup(modelDQN.optimizer, modelDQN) # do this once
loss, grads = Flux.withgradient(modelDQN.model) do m
qEval = m(evalInput) # local variable m
Flux.mse(qEval, qTarget)
end
Flux.update!(opt_state, modelDQN.model, grads[1])
Related
The github repo for ForwardDiff.jl has some examples. I am trying to extend the example to take in addition to a vector of variables, a parameter. I cannot get it to work.
This is the example (it is short so I will show it rather than linking)
using ForwardDiff
x = rand(5)
f(x::Vector) = sum(sin, x) .+ prod(tan, x) * sum(sqrt, x);
g = x -> ForwardDiff.gradient(f, x);
g(x) # this outputs the gradient.
I want to modify this since I use functions with multiple parameters as well as variables. As a simple modification I have tried adding a single parameter.
f(x::Vector, y) = (sum(sin, x) .+ prod(tan, x) * sum(sqrt, x)) * y;
I have tried the following to no avail:
fp = x -> ForwardDiff.gradient(f, x);
fp = x -> ForwardDiff.gradient(f, x, y);
y = 1
println("test grad: ", fp(x, y))
I get the following error message:
ERROR: LoadError: MethodError: no method matching (::var"#73#74")(::Array{Float64,1}, ::Int64)
A similar question was not answered in 2017. A comment led me to here and it seems the function can only accept one input?
The target function must be unary (i.e., only accept a single argument). ForwardDiff.jacobian is an exception to this rule.
Has this changed? It seems very limited to only be able to differentiate unary functions.
A possible workaround would be to concatenate the list of variables and parameters and then just slice the returned gradient to not include the gradients with respect to the parameters, but this seems silly.
I personally think it makes sense to have this unary-only syntax for ForwardDiff. In your case, you could just pack/unpack x and y into a single vector (nominally x2 below):
julia> using ForwardDiff
julia> x = rand(5)
5-element Array{Float64,1}:
0.4304735670747184
0.3939269364431113
0.7912705403776603
0.8942024934250143
0.5724373306715196
julia> f(x::Vector, y) = (sum(sin, x) .+ prod(tan, x) * sum(sqrt, x)) * y;
julia> y = 1
1
julia> f(x2::Vector) = f(x2[1:end-1], x2[end]) % unpacking in f call
f (generic function with 2 methods)
julia> fp = x -> ForwardDiff.gradient(f, x);
julia> println("test grad: ", fp([x; y])) % packing in fp call
test grad: [2.6105844240785796, 2.741442601659502, 1.9913192377198885, 1.9382805843854594, 2.26202717745402, 3.434350946190029]
But my preference would be to explicitly name the partial derivatives differently:
julia> ∂f∂x(x,y) = ForwardDiff.gradient(x -> f(x,y), x)
∂f∂x (generic function with 1 method)
julia> ∂f∂y(x,y) = ForwardDiff.derivative(y -> f(x,y), y)
∂f∂y (generic function with 1 method)
julia> ∂f∂x(x, y)
5-element Array{Float64,1}:
2.6105844240785796
2.741442601659502
1.9913192377198885
1.9382805843854594
2.26202717745402
julia> ∂f∂y(x, y)
3.434350946190029
Here's a quick attempt at a function which takes multiple arguments, the same signature as Zygote.gradient:
julia> using ForwardDiff, Zygote
julia> multigrad(f, xs...) = ntuple(length(xs)) do i
g(y) = f(ntuple(j -> j==i ? y : xs[j], length(xs))...)
xs[i] isa AbstractArray ? ForwardDiff.gradient(g, xs[i]) :
xs[i] isa Number ? ForwardDiff.derivative(g, xs[i]) : nothing
end;
julia> f1(x,y,z) = sum(x.^2)/y;
julia> multigrad(f1, [1,2,3], 4)
([0.5, 1.0, 1.5], -0.875)
julia> Zygote.gradient(f1, [1,2,3], 4)
([0.5, 1.0, 1.5], -0.875)
For a function with several scalar arguments, this evaluates each derivative separately, and perhaps it would be more efficient to use one evaluation with some Dual(x, (dx, dy, dz)). With large-enough array arguments, ForwardDiff.gradient will already perform multiple evaluations, each with some number of perturbations (the chunk size, which you can control).
I'm trying to build a function that will output an expression to be assigned to a new in-memory function. I might be misinterpreting the capability of metaprogramming but, I'm trying to build a function that generates a math series and assigns it to a function such as:
main.jl
function series(iter)
S = ""
for i in 1:iter
a = "x^$i + "
S = S*a
end
return chop(S, tail=3)
end
So, this will build the pattern and I'm temporarily working with it in the repl:
julia> a = Meta.parse(series(4))
:(x ^ 1 + x ^ 2 + x ^ 3 + x ^ 4)
julia> f =eval(Meta.parse(series(4)))
120
julia> f(x) =eval(Meta.parse(series(4)))
ERROR: cannot define function f; it already has a value
Obviously eval isn't what I'm looking for in this case but, is there another function I can use? Or, is this just not a viable way to accomplish the task in Julia?
The actual error you get has to do nothing with metaprogramming, but with the fact that you are reassigning f, which was assigned a value before:
julia> f = 10
10
julia> f(x) = x + 1
ERROR: cannot define function f; it already has a value
Stacktrace:
[1] top-level scope at none:0
[2] top-level scope at REPL[2]:1
It just doesn't like that. Call either of those variables differently.
Now to the conceptual problem. First, what you do here is not "proper" metaprogramming in Julia: why deal with strings and parsing at all? You can work directly on expressions:
julia> function series(N)
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :(x ^ $i))
end
return S
end
series (generic function with 1 method)
julia> series(3)
:(x ^ 1 + x ^ 2 + x ^ 3)
This makes use of the fact that + belongs to the class of expressions that are automatically collected in repeated applications.
Second, you don't call eval at the appropriate place. I assume you meant to say "give me the function of x, with the body being what series(4) returns". Now, while the following works:
julia> f3(x) = eval(series(4))
f3 (generic function with 1 method)
julia> f3(2)
30
it is not ideal, as you newly compile the body every time the function is called. If you do something like that, it is preferred to expand the code once into the body at function definition:
julia> #eval f2(x) = $(series(4))
f2 (generic function with 1 method)
julia> f2(2)
30
You just need to be careful with hygiene here. All depends on the fact that you know that the generated body is formulated in terms of x, and the function argument matches that. In my opinion, the most Julian way of implementing your idea is through a macro:
julia> macro series(N::Int, x)
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :($x ^ $i))
end
return S
end
#series (macro with 1 method)
julia> #macroexpand #series(4, 2)
:(2 ^ 1 + 2 ^ 2 + 2 ^ 3 + 2 ^ 4)
julia> #series(4, 2)
30
No free variables remaining in the output.
Finally, as has been noted in the comments, there's a function (and corresponding macro) evalpoly in Base which generalizes your use case. Note that this function does not use code generation -- it uses a well-designed generated function, which in combination with the optimizations results in code that is usually equal to the macro-generated code.
Another elegant option would be to use the multiple-dispatch mechanism of Julia and dispatch the generated code on type rather than value.
#generated function series2(p::Val{N}, x) where N
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :(x ^ $i))
end
return S
end
Usage
julia> series2(Val(20), 150.5)
3.5778761722367333e43
julia> series2(Val{20}(), 150.5)
3.5778761722367333e43
This task can be accomplished with comprehensions. I need to RTFM...
https://docs.julialang.org/en/v1/manual/arrays/#Generator-Expressions
I am trying to use ForwardDiff in a library where almost all functions are restricted to only take in Floats. I want to generalise these function signatures so that ForwardDiff can be used while still being restrictive enough so functions only take numeric values and not things like Dates. I have alot of functions with the same name but different types (ie functions that take in "time" as either a float or a Date with the same function name) and do not want to remove the type qualifiers throughout.
Minimal Working Example
using ForwardDiff
x = [1.0, 2.0, 3.0, 4.0 ,5.0]
typeof(x) # Array{Float64,1}
function G(x::Array{Real,1})
return sum(exp.(x))
end
function grad_F(x::Array)
return ForwardDiff.gradient(G, x)
end
G(x) # Method Error
grad_F(x) # Method error
function G(x::Array{Float64,1})
return sum(exp.(x))
end
G(x) # This works
grad_F(x) # This has a method error
function G(x)
return sum(exp.(x))
end
G(x) # This works
grad_F(x) # This works
# But now I cannot restrict the function G to only take numeric arrays and not for instance arrays of Dates.
Is there are a way to restict functions to only take numeric values (Ints and Floats) and whatever dual number structs that ForwardDiff uses but not allow Symbols, Dates, etc.
ForwardDiff.Dual is a subtype of the abstract type Real. The issue you have, however, is that Julia's type parameters are invariant, not covariant. The following, then, returns false.
# check if `Array{Float64, 1}` is a subtype of `Array{Real, 1}`
julia> Array{Float64, 1} <: Array{Real, 1}
false
That makes your function definition
function G(x::Array{Real,1})
return sum(exp.(x))
end
incorrect (not suitable for your use). That's why you get the following error.
julia> G(x)
ERROR: MethodError: no method matching G(::Array{Float64,1})
The correct definition should rather be
function G(x::Array{<:Real,1})
return sum(exp.(x))
end
or if you somehow need an easy access to the concrete element type of the array
function G(x::Array{T,1}) where {T<:Real}
return sum(exp.(x))
end
The same goes for your grad_F function.
You might find it useful to read the relevant section of the Julia documentation for types.
You might also want to type annotate your functions for AbstractArray{<:Real,1} type rather than Array{<:Real, 1} so that your functions can work other types of arrays, like StaticArrays, OffsetArrays etc., without a need for redefinitions.
This would accept any kind of array parameterized by any kind of number:
function foo(xs::AbstractArray{<:Number})
#show typeof(xs)
end
or:
function foo(xs::AbstractArray{T}) where T<:Number
#show typeof(xs)
end
In case you need to refer to the type parameter T inside the body function.
x1 = [1.0, 2.0, 3.0, 4.0 ,5.0]
x2 = [1, 2, 3,4, 5]
x3 = 1:5
x4 = 1.0:5.0
x5 = [1//2, 1//4, 1//8]
xss = [x1, x2, x3, x4, x5]
function foo(xs::AbstractArray{T}) where T<:Number
#show xs typeof(xs) T
println()
end
for xs in xss
foo(xs)
end
Outputs:
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
typeof(xs) = Array{Float64,1}
T = Float64
xs = [1, 2, 3, 4, 5]
typeof(xs) = Array{Int64,1}
T = Int64
xs = 1:5
typeof(xs) = UnitRange{Int64}
T = Int64
xs = 1.0:1.0:5.0
typeof(xs) = StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}
T = Float64
xs = Rational{Int64}[1//2, 1//4, 1//8]
typeof(xs) = Array{Rational{Int64},1}
T = Rational{Int64}
You can run the example code here: https://repl.it/#SalchiPapa/Restricting-function-signatures-in-Julia
I'm writing a genetic program in order to test the fitness of randomly generated expressions. Shown here is the function to generate the expression as well a the main function. DIV and GT are defined elsewhere in the code:
function create_single_full_tree(depth, fs, ts)
"""
Creates a single AST with full depth
Inputs
depth Current depth of tree. Initially called from main() with max depth
fs Function Set - Array of allowed functions
ts Terminal Set - Array of allowed terminal values
Output
Full AST of typeof()==Expr
"""
# If we are at the bottom
if depth == 1
# End of tree, return function with two terminal nodes
return Expr(:call, fs[rand(1:length(fs))], ts[rand(1:length(ts))], ts[rand(1:length(ts))])
else
# Not end of expression, recurively go back through and create functions for each new node
return Expr(:call, fs[rand(1:length(fs))], create_single_full_tree(depth-1, fs, ts), create_single_full_tree(depth-1, fs, ts))
end
end
function main()
"""
Main function
"""
# Define functional and terminal sets
fs = [:+, :-, :DIV, :GT]
ts = [:x, :v, -1]
# Create the tree
ast = create_single_full_tree(4, fs, ts)
#println(typeof(ast))
#println(ast)
#println(dump(ast))
x = 1
v = 1
eval(ast) # Error out unless x and v are globals
end
main()
I am generating a random expression based on certain allowed functions and variables. As seen in the code, the expression can only have symbols x and v, as well as the value -1. I will need to test the expression with a variety of x and v values; here I am just using x=1 and v=1 to test the code.
The expression is being returned correctly, however, eval() can only be used with global variables, so it will error out when run unless I declare x and v to be global (ERROR: LoadError: UndefVarError: x not defined). I would like to avoid globals if possible. Is there a better way to generate and evaluate these generated expressions with locally defined variables?
Here is an example for generating an (anonymous) function. The result of eval can be called as a function and your variable can be passed as parameters:
myfun = eval(Expr(:->,:x, Expr(:block, Expr(:call,:*,3,:x) )))
myfun(14)
# returns 42
The dump function is very useful to inspect the expression that the parsers has created. For two input arguments you would use a tuple for example as args[1]:
julia> dump(parse("(x,y) -> 3x + y"))
Expr
head: Symbol ->
args: Array{Any}((2,))
1: Expr
head: Symbol tuple
args: Array{Any}((2,))
1: Symbol x
2: Symbol y
typ: Any
2: Expr
[...]
Does this help?
In the Metaprogramming part of the Julia documentation, there is a sentence under the eval() and effects section which says
Every module has its own eval() function that evaluates expressions in its global scope.
Similarly, the REPL help ?eval will give you, on Julia 0.6.2, the following help:
Evaluate an expression in the given module and return the result. Every Module (except those defined with baremodule) has its own 1-argument definition of eval, which evaluates expressions in that module.
I assume, you are working in the Main module in your example. That's why you need to have the globals defined there. For your problem, you can use macros and interpolate the values of x and y directly inside the macro.
A minimal working example would be:
macro eval_line(a, b, x)
isa(a, Real) || (warn("$a is not a real number."); return :(throw(DomainError())))
isa(b, Real) || (warn("$b is not a real number."); return :(throw(DomainError())))
return :($a * $x + $b) # interpolate the variables
end
Here, #eval_line macro does the following:
Main> #macroexpand #eval_line(5, 6, 2)
:(5 * 2 + 6)
As you can see, the values of macro's arguments are interpolated inside the macro and the expression is given to the user accordingly. When the user does not behave,
Main> #macroexpand #eval_line([1,2,3], 7, 8)
WARNING: [1, 2, 3] is not a real number.
:((Main.throw)((Main.DomainError)()))
a user-friendly warning message is provided to the user at parse-time, and a DomainError is thrown at run-time.
Of course, you can do these things within your functions, again by interpolating the variables --- you do not need to use macros. However, what you would like to achieve in the end is to combine eval with the output of a function that returns Expr. This is what the macro functionality is for. Finally, you would simply call your macros with an # sign preceding the macro name:
Main> #eval_line(5, 6, 2)
16
Main> #eval_line([1,2,3], 7, 8)
WARNING: [1, 2, 3] is not a real number.
ERROR: DomainError:
Stacktrace:
[1] eval(::Module, ::Any) at ./boot.jl:235
EDIT 1. You can take this one step further, and create functions accordingly:
macro define_lines(linedefs)
for (name, a, b) in eval(linedefs)
ex = quote
function $(Symbol(name))(x) # interpolate name
return $a * x + $b # interpolate a and b here
end
end
eval(ex) # evaluate the function definition expression in the module
end
end
Then, you can call this macro to create different line definitions in the form of functions to be called later on:
#define_lines([
("identity_line", 1, 0);
("null_line", 0, 0);
("unit_shift", 0, 1)
])
identity_line(5) # returns 5
null_line(5) # returns 0
unit_shift(5) # returns 1
EDIT 2. You can, I guess, achieve what you would like to achieve by using a macro similar to that below:
macro random_oper(depth, fs, ts)
operations = eval(fs)
oper = operations[rand(1:length(operations))]
terminals = eval(ts)
ts = terminals[rand(1:length(terminals), 2)]
ex = :($oper($ts...))
for d in 2:depth
oper = operations[rand(1:length(operations))]
t = terminals[rand(1:length(terminals))]
ex = :($oper($ex, $t))
end
return ex
end
which will give the following, for instance:
Main> #macroexpand #random_oper(1, [+, -, /], [1,2,3])
:((-)([3, 3]...))
Main> #macroexpand #random_oper(2, [+, -, /], [1,2,3])
:((+)((-)([2, 3]...), 3))
Thanks Arda for the thorough response! This helped, but part of me thinks there may be a better way to do this as it seems too roundabout. Since I am writing a genetic program, I will need to create 500 of these ASTs, all with random functions and terminals from a set of allowed functions and terminals (fs and ts in the code). I will also need to test each function with 20 different values of x and v.
In order to accomplish this with the information you have given, I have come up with the following macro:
macro create_function(defs)
for name in eval(defs)
ex = quote
function $(Symbol(name))(x,v)
fs = [:+, :-, :DIV, :GT]
ts = [x,v,-1]
return create_single_full_tree(4, fs, ts)
end
end
eval(ex)
end
end
I can then supply a list of 500 random function names in my main() function, such as ["func1, func2, func3,.....". Which I can eval with any x and v values in my main function. This has solved my issue, however, this seems to be a very roundabout way of doing this, and may make it difficult to evolve each AST with each iteration.
I'm trying to use JuMP to solve a non-linear problem, where the number of variables are decided by the user - that is, not known at compile time.
To accomplish this, the #NLobjective line looks like this:
#eval #JuMP.NLobjective(m, Min, $(Expr(:call, :myf, [Expr(:ref, :x, i) for i=1:n]...)))
Where, for instance, if n=3, the compiler interprets the line as identical to:
#JuMP.NLobjective(m, Min, myf(x[1], x[2], x[3]))
The issue is that #eval works only in the global scope, and when contained in a function, an error is thrown.
My question is: how can I accomplish this same functionality -- getting #NLobjective to call myf with a variable number of x[1],...,x[n] arguments -- within the local, not-known-at-compilation scope of a function?
def testme(n)
myf(a...) = sum(collect(a).^2)
m = JuMP.Model(solver=Ipopt.IpoptSolver())
JuMP.register(m, :myf, n, myf, autodiff=true)
#JuMP.variable(m, x[1:n] >= 0.5)
#eval #JuMP.NLobjective(m, Min, $(Expr(:call, :myf, [Expr(:ref, :x, i) for i=1:n]...)))
JuMP.solve(m)
end
testme(3)
Thanks!
As explained in http://jump.readthedocs.io/en/latest/nlp.html#raw-expression-input , objective functions can be given without the macro. The relevant expression:
JuMP.setNLobjective(m, :Min, Expr(:call, :myf, [x[i] for i=1:n]...))
is even simpler than the #eval based one and works in the function. The code is:
using JuMP, Ipopt
function testme(n)
myf(a...) = sum(collect(a).^2)
m = JuMP.Model(solver=Ipopt.IpoptSolver())
JuMP.register(m, :myf, n, myf, autodiff=true)
#JuMP.variable(m, x[1:n] >= 0.5)
JuMP.setNLobjective(m, :Min, Expr(:call, :myf, [x[i] for i=1:n]...))
JuMP.solve(m)
return [getvalue(x[i]) for i=1:n]
end
testme(3)
and it returns:
julia> testme(3)
:
EXIT: Optimal Solution Found.
3-element Array{Float64,1}:
0.5
0.5
0.5