An unexpected variable reference changing during Julia Zygote gradient pullback

An unexpected variable reference changing during Julia Zygote gradient pullback - julia

I'm using Julia Flux.jl to train models. But when I customize my model, there is an issue the variable reference in gradient pullback function doesn't behave as expected. For simplicity, the issue code is simplified and shown as following
let
x=rand(Float32, (10,1))
v_layer=[
Dense(10, 11),
Dense(10, 12),
]
ps=Flux.Params([
vcat((collect.(params.(v_layer)))...)...,
])
function model(x)
out=x
for (layer_idx, layer)=enumerate(v_layer)
#show layer_idx, size(x)
out=layer(x)
end
return out
end
gs=gradient(ps) do
sum(model(x))
end
end
If we run the code, the size of x in for loop will change according to the #show macro output as following
(layer_idx, size(x)) = (1, (10, 1))
(layer_idx, size(x)) = (2, (11, 1))
And we will get the following error message
DimensionMismatch("A has dimensions (12,10) but B has dimensions (11,1)")
I think the reference change of out in for loop should not change the reference of x, but that happened. Besides if we put the function model whole body inside gradient do as following
gs=gradient(ps) do
out=x
for (layer_idx, layer)=enumerate(v_layer)
#show layer_idx, size(x)
out=layer(x)
end
sum(out)
end
worked just fine. Why is this?
PS. A slightly more complex but more practical version code is as follows and has the same error.
let
v_layer_A=[
Dense(10, 11),
Dense(11, 12),
]
v_layer_B=[
Dense(10, 11),
Dense(10, 12),
]
x=rand(Float32, (10,1))
ps=Flux.Params([
vcat((collect.(params.(v_layer_A)))...)...,
vcat((collect.(params.(v_layer_B)))...)...,
])
function model(x)
out=x
for (idx, (layer_A, layer_B))=enumerate(zip(v_layer_A, v_layer_B))
#show idx, size(x)
out_A=layer_A(out)
out_B=layer_B(x)
out=out_A.*out_B
end
return out
end
gs=gradient(ps) do
sum(model(x))
end
end

Related

Julia - last() does not work inside the function

I am writing the below code, and in the code the last() function does not work. I get the below error.
ERROR: UndefVarError: last not defined
But when I use last() outside of the function with same logic it works.
I am trying to write the below function -
function mergeOverlappingIntervals(intervals)
sort!(intervals, by = x -> intervals[1])
new_interval = intervals[1]
for i in range(2, length(intervals))
if last(new_intervals)[2] >= intervals[i][1]
last(new_intervals) = [minimum!(last(new_intervals)[1], intervals[i][1]), maximum!(last(new_intervals)[2], intervals[2])]
else
push!(new_interval, intervals[i])
end
end
end
Can you please help?

If you are trying to merge the intervals, note that your sort does not work since `by = x -> intervals[1] is a constant: you wanted to say by = x -> x[1]. and the default sort on vectors does that already.
You could instead do:
using Random
intervals = shuffle([[1, 2], [3, 5], [4, 6]])
function mergeintervals(intervals)
#assert length(intervals) > 1
sort!(intervals)
a = [copy(intervals[begin])]
for v in #view intervals[begin+1:end]
if a[end][end] >= v[begin]
if a[end][end] < v[end]
a[end][end] = v[end]
end
else
push!(a, copy(v))
end
end
return a
end
#show mergeintervals(intervals) # [[1, 2], [3, 6]]
The reason last(x) does not work is that (unlike x[end]) it does not return an lvalue, that is, it does not produce a value or syntactic expression that you can assign to. So Julia thinks you are trying to redefine the function last(x) when you attempt to assign to it (as DNF pointed out). (An lvalue is something that can be used as the left hand side of an assignment, which in Julia does not mean it is a direct memory address: see below).

A straightforward implementation, after fixing two errors. last() cannot be assigned to as others said, and minimum or maximum are used with arrays, you want max here to compare scalars. Finally you should return the new merged stack of intervals.
function mergeIntervals(intervals)
sort!(intervals)
stack = [intervals[1]]
for i in 2:length(intervals)
if stack[end][2] < intervals[i][1]
push!(stack, intervals[i])
else
stack[end][2] = max(stack[end][2], intervals[i][2])
end
end
return stack
end
Test an example:
intervals = [[6, 8], [1, 9], [2, 4], [4, 7]]
mergeIntervals(intervals)
[[1, 9]]

I was able to solve the problem by taking the suggestions from all the answers posted. Thank you.
Below is the code -
function mergeOverlappingIntervals(intervals)
sort!(intervals, by = x -> intervals[1])
new_interval = [intervals[1]]
for i in range(2, length(intervals))
if new_interval[end][2] >= intervals[i][1]
new_interval[end] = [min(new_interval[end][1], intervals[i][1]), max(new_interval[end][2], intervals[i][2])]
else
push!(new_interval, intervals[i])
end
end
return new_interval
end

Julia JuMP making sure nonlinear objective function has correct function signatures so that autodifferentiate works properly?

so I wrote a minimum example to show what I'm trying to do. Basically I want to solve a optimization problem with multiple variables. When I try to do this in JuMP I was having issues with my function obj not being able to take a forwardDiff object.
I looked here: and it seemed to do with the function signature :Restricting function signatures while using ForwardDiff in Julia . I did this in my obj function, and for insurance did it in my sub-function as well, but I still get the error
LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{JuMP.var"#110#112"{typeof(my_fun)},Float64},Float64,2})
Closest candidates are:
Float64(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
Float64(::T) where T<:Number at boot.jl:715
Float64(::Int8) at float.jl:60
This still does not work. I feel like I have the bulk of the code correct, just some weird of type thing going on that I have to clear up so autodifferentiate works...
Any suggestions?
using JuMP
using Ipopt
using LinearAlgebra
function obj(x::Array{<:Real,1})
println(x)
x1 = x[1]
x2 = x[2]
eye= Matrix{Float64}(I, 4, 4)
obj_val = tr(eye-kron(mat_fun(x1),mat_fun(x2)))
println(obj_val)
return obj_val
end
function mat_fun(var::T) where {T<:Real}
eye= Matrix{Float64}(I, 2, 2)
eye[2,2]=var
return eye
end
m = Model(Ipopt.Optimizer)
my_fun(x...) = obj(collect(x))
#variable(m, 0<=x[1:2]<=2.0*pi)
register(m, :my_fun, 2, my_fun; autodiff = true)
#NLobjective(m, Min, my_fun(x...))
optimize!(m)
# retrieve the objective value, corresponding x values and the status
println(JuMP.value.(x))
println(JuMP.objective_value(m))
println(JuMP.termination_status(m))

Use instead
function obj(x::Vector{T}) where {T}
println(x)
x1 = x[1]
x2 = x[2]
eye= Matrix{T}(I, 4, 4)
obj_val = tr(eye-kron(mat_fun(x1),mat_fun(x2)))
println(obj_val)
return obj_val
end
function mat_fun(var::T) where {T}
eye= Matrix{T}(I, 2, 2)
eye[2,2]=var
return eye
end
Essentially, anywhere you see Float64, replace it by the type in the incoming argument.

I found the problem:
in my mat_fun the type of the return had to be "Real" in order for it to propgate through. Before it was Float64, which was not consistent with the fact I guess all types have to be Real with the autodifferentiate. Even though a Float64 is clearly Real, it looks like the inheritence isn't perserved i.e you have to make sure everything that is returned and inputed are type Real.
using JuMP
using Ipopt
using LinearAlgebra
function obj(x::AbstractVector{T}) where {T<:Real}
println(x)
x1 = x[1]
x2 = x[2]
eye= Matrix{Float64}(I, 4, 4)
obj_val = tr(eye-kron(mat_fun(x1),mat_fun(x2)))
#println(obj_val)
return obj_val
end
function mat_fun(var::T) where {T<:Real}
eye= zeros(Real,(2,2))
eye[2,2]=var
return eye
end
m = Model(Ipopt.Optimizer)
my_fun(x...) = obj(collect(x))
#variable(m, 0<=x[1:2]<=2.0*pi)
register(m, :my_fun, 2, my_fun; autodiff = true)
#NLobjective(m, Min, my_fun(x...))
optimize!(m)
# retrieve the objective value, corresponding x values and the status
println(JuMP.value.(x))
println(JuMP.objective_value(m))
println(JuMP.termination_status(m))

Unable to plot solution of ODE in Maxima

Good time of the day!
Here is the code:
eq:'diff(x,t)=(exp(cos(t))-1)*x;
ode2(eq,x,t);
sol:ic1(%,t=1,x=-1);
/*---------------------*/
plot2d(
rhs(sol),
[t,-4*%pi, 4*%pi],
[y,-5,5],
[xtics,-4*%pi, 1*%pi, 4*%pi],
[ytics, false],
/*[yx_ratio , 0.6], */
[legend,"Solution."],
[xlabel, "t"], [ylabel, "x(t)"],
[style, [lines,1]],
[color, blue]
);
and here is the errors:
integrate: variable must not be a number; found: -12.56637061435917
What went wrong?
Thanks.

Here's a way to plot the solution sol which was found by ode2 and ic2 as you showed. First replace the integrate nouns with calls to quad_qags, a numerical quadrature function. I'll introduce a made-up variable name (a so-called gensym) to avoid confusion with the variable t.
(%i59) subst (nounify (integrate) =
lambda ([e, xx],
block ([u: gensym(string(xx))],
quad_qags (subst (xx = u, e), u, -4*%pi, xx)[1])),
rhs(sol));
(%o59) -%e^((-t)-quad_qags(%e^cos(t88373),t88373,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]
+quad_qags(%e^cos(t88336),t88336,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]+1)
Now I'll define a function foo1 with that result. I'll make a list of numerical values to see if it works right.
(%i60) foo1(t) := ''%;
(%o60) foo1(t):=-%e
^((-t)-quad_qags(%e^cos(t88373),t88373,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]
+quad_qags(%e^cos(t88336),t88336,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]+1)
(%i61) foo1(0.5);
(%o61) -1.648721270700128
(%i62) makelist (foo1(t), t, makelist (k, k, -10, 10));
(%o62) [-59874.14171519782,-22026.46579480672,
-8103.083927575384,-2980.957987041728,
-1096.633158428459,-403.4287934927351,
-148.4131591025766,-54.59815003314424,
-20.08553692318767,-7.38905609893065,-2.71828182845904,
-1.0,-0.3678794411714423,-0.1353352832366127,
-0.04978706836786394,-0.01831563888873418,
-0.006737946999085467,-0.002478752176666358,
-9.118819655545163E-4,-3.354626279025119E-4,
-1.234098040866796E-4]
Does %o62 look right to you? I'll assume it is okay. Next I'll define a function foo which calls foo1 defined before when the argument is a number, otherwise it just returns 0. This is a workaround for a bug in plot2d, which incorrectly determines that foo1 is not a function of t alone. Usually that workaround isn't needed, but it is needed in this case.
(%i63) foo(t) := if numberp(t) then foo1(t) else 0;
(%o63) foo(t):=if numberp(t) then foo1(t) else 0
Okay, now the function foo can be plotted!
(%i64) plot2d (foo, [t, -4*%pi, 4*%pi], [y, -5, 5]);
plot2d: some values were clipped.
(%o64) false
That takes about 30 seconds to plot -- calling quad_qags is relatively expensive.

it looks like ode2 does not know how to completely solve the problem, so the result contains an integral:
(%i6) display2d: false $
(%i7) eq:'diff(x,t)=(exp(cos(t))-1)*x;
(%o7) 'diff(x,t,1) = (%e^cos(t)-1)*x
(%i8) ode2(eq,x,t);
(%o8) x = %c*%e^('integrate(%e^cos(t),t)-t)
(%i9) sol:ic1(%,t=1,x=-1);
(%o9) x = -%e^((-%at('integrate(%e^cos(t),t),t = 1))
+'integrate(%e^cos(t),t)-t+1)
I tried it with contrib_ode also:
(%i12) load (contrib_ode);
(%o12) "/Users/dodier/tmp/maxima-code/share/contrib/diffequations/contrib_ode.mac"
(%i13) contrib_ode (eq, x, t);
(%o13) [x = %c*%e^('integrate(%e^cos(t),t)-t)]
So contrib_ode did not solve it completely either.
However the solution returned by ode2 (same for contrib_ode) appears to be a valid solution. I'll post a separate answer describing how to evaluate it numerically for plotting.

In Julia Flux, I keep getting the error LoadError: Mutating arrays is not supported, but I don't see where I am mutating an array

I am new to Julia and for some reason I can't get this very simple code to work. No matter what I try, I get the error LoadError: Mutating arrays is not supported. I understand that this error occurs when I mutate an array during the course of optimization so that the code is no longer differentiable. I clearly do not understand Julia enough to see where I am doing this.
If it helps the error seems to be occurring in the line for d in dataset.
using Statistics
using Flux: onehotbatch, onehot, onecold, crossentropy, throttle
using Flux
using Base.Iterators:repeated
using Plots:heatmap
using ImageView:imshow
images = Flux.Data.MNIST.images()[1:10]
labels = Flux.Data.MNIST.labels()[1:10]
heatmap(images[4], color=:grays, aspect_ratio=1)
X = float.(reshape.(images, :))
encode(x) = onehot(x, 0:9)
Y = encode.(labels)
m = Chain(Dense(28^2, 32, relu), Dense(32, 10), softmax)
loss(x, y) = crossentropy(m(x), y)
opt = ADAM()
accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))
dataset = zip(X, Y)
print(size(X))
evalcb = () -> #show(loss(X, Y))
print("Training...")
# Flux.train!(loss, params(m), dataset, opt, cb=throttle(evalcb, 5));
for d in dataset
print(d[2])
gs = gradient(params(m)) do
l = loss(d...)
end
update!(opt, params(m), gs)
end

It looks like I did have an old version of Flux (but not that old). I had to uninstall and reinstall Julia to install the new version of Flux.

Evaluate expression with local variables

I'm writing a genetic program in order to test the fitness of randomly generated expressions. Shown here is the function to generate the expression as well a the main function. DIV and GT are defined elsewhere in the code:
function create_single_full_tree(depth, fs, ts)
"""
Creates a single AST with full depth
Inputs
depth Current depth of tree. Initially called from main() with max depth
fs Function Set - Array of allowed functions
ts Terminal Set - Array of allowed terminal values
Output
Full AST of typeof()==Expr
"""
# If we are at the bottom
if depth == 1
# End of tree, return function with two terminal nodes
return Expr(:call, fs[rand(1:length(fs))], ts[rand(1:length(ts))], ts[rand(1:length(ts))])
else
# Not end of expression, recurively go back through and create functions for each new node
return Expr(:call, fs[rand(1:length(fs))], create_single_full_tree(depth-1, fs, ts), create_single_full_tree(depth-1, fs, ts))
end
end
function main()
"""
Main function
"""
# Define functional and terminal sets
fs = [:+, :-, :DIV, :GT]
ts = [:x, :v, -1]
# Create the tree
ast = create_single_full_tree(4, fs, ts)
#println(typeof(ast))
#println(ast)
#println(dump(ast))
x = 1
v = 1
eval(ast) # Error out unless x and v are globals
end
main()
I am generating a random expression based on certain allowed functions and variables. As seen in the code, the expression can only have symbols x and v, as well as the value -1. I will need to test the expression with a variety of x and v values; here I am just using x=1 and v=1 to test the code.
The expression is being returned correctly, however, eval() can only be used with global variables, so it will error out when run unless I declare x and v to be global (ERROR: LoadError: UndefVarError: x not defined). I would like to avoid globals if possible. Is there a better way to generate and evaluate these generated expressions with locally defined variables?

Here is an example for generating an (anonymous) function. The result of eval can be called as a function and your variable can be passed as parameters:
myfun = eval(Expr(:->,:x, Expr(:block, Expr(:call,:*,3,:x) )))
myfun(14)
# returns 42
The dump function is very useful to inspect the expression that the parsers has created. For two input arguments you would use a tuple for example as args[1]:
julia> dump(parse("(x,y) -> 3x + y"))
Expr
head: Symbol ->
args: Array{Any}((2,))
1: Expr
head: Symbol tuple
args: Array{Any}((2,))
1: Symbol x
2: Symbol y
typ: Any
2: Expr
[...]
Does this help?

In the Metaprogramming part of the Julia documentation, there is a sentence under the eval() and effects section which says
Every module has its own eval() function that evaluates expressions in its global scope.
Similarly, the REPL help ?eval will give you, on Julia 0.6.2, the following help:
Evaluate an expression in the given module and return the result. Every Module (except those defined with baremodule) has its own 1-argument definition of eval, which evaluates expressions in that module.
I assume, you are working in the Main module in your example. That's why you need to have the globals defined there. For your problem, you can use macros and interpolate the values of x and y directly inside the macro.
A minimal working example would be:
macro eval_line(a, b, x)
isa(a, Real) || (warn("$a is not a real number."); return :(throw(DomainError())))
isa(b, Real) || (warn("$b is not a real number."); return :(throw(DomainError())))
return :($a * $x + $b) # interpolate the variables
end
Here, #eval_line macro does the following:
Main> #macroexpand #eval_line(5, 6, 2)
:(5 * 2 + 6)
As you can see, the values of macro's arguments are interpolated inside the macro and the expression is given to the user accordingly. When the user does not behave,
Main> #macroexpand #eval_line([1,2,3], 7, 8)
WARNING: [1, 2, 3] is not a real number.
:((Main.throw)((Main.DomainError)()))
a user-friendly warning message is provided to the user at parse-time, and a DomainError is thrown at run-time.
Of course, you can do these things within your functions, again by interpolating the variables --- you do not need to use macros. However, what you would like to achieve in the end is to combine eval with the output of a function that returns Expr. This is what the macro functionality is for. Finally, you would simply call your macros with an # sign preceding the macro name:
Main> #eval_line(5, 6, 2)
16
Main> #eval_line([1,2,3], 7, 8)
WARNING: [1, 2, 3] is not a real number.
ERROR: DomainError:
Stacktrace:
[1] eval(::Module, ::Any) at ./boot.jl:235
EDIT 1. You can take this one step further, and create functions accordingly:
macro define_lines(linedefs)
for (name, a, b) in eval(linedefs)
ex = quote
function $(Symbol(name))(x) # interpolate name
return $a * x + $b # interpolate a and b here
end
end
eval(ex) # evaluate the function definition expression in the module
end
end
Then, you can call this macro to create different line definitions in the form of functions to be called later on:
#define_lines([
("identity_line", 1, 0);
("null_line", 0, 0);
("unit_shift", 0, 1)
])
identity_line(5) # returns 5
null_line(5) # returns 0
unit_shift(5) # returns 1
EDIT 2. You can, I guess, achieve what you would like to achieve by using a macro similar to that below:
macro random_oper(depth, fs, ts)
operations = eval(fs)
oper = operations[rand(1:length(operations))]
terminals = eval(ts)
ts = terminals[rand(1:length(terminals), 2)]
ex = :($oper($ts...))
for d in 2:depth
oper = operations[rand(1:length(operations))]
t = terminals[rand(1:length(terminals))]
ex = :($oper($ex, $t))
end
return ex
end
which will give the following, for instance:
Main> #macroexpand #random_oper(1, [+, -, /], [1,2,3])
:((-)([3, 3]...))
Main> #macroexpand #random_oper(2, [+, -, /], [1,2,3])
:((+)((-)([2, 3]...), 3))

Thanks Arda for the thorough response! This helped, but part of me thinks there may be a better way to do this as it seems too roundabout. Since I am writing a genetic program, I will need to create 500 of these ASTs, all with random functions and terminals from a set of allowed functions and terminals (fs and ts in the code). I will also need to test each function with 20 different values of x and v.
In order to accomplish this with the information you have given, I have come up with the following macro:
macro create_function(defs)
for name in eval(defs)
ex = quote
function $(Symbol(name))(x,v)
fs = [:+, :-, :DIV, :GT]
ts = [x,v,-1]
return create_single_full_tree(4, fs, ts)
end
end
eval(ex)
end
end
I can then supply a list of 500 random function names in my main() function, such as ["func1, func2, func3,.....". Which I can eval with any x and v values in my main function. This has solved my issue, however, this seems to be a very roundabout way of doing this, and may make it difficult to evolve each AST with each iteration.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

An unexpected variable reference changing during Julia Zygote gradient pullback - julia

Related

Julia - last() does not work inside the function

Julia JuMP making sure nonlinear objective function has correct function signatures so that autodifferentiate works properly?

Unable to plot solution of ODE in Maxima

In Julia Flux, I keep getting the error LoadError: Mutating arrays is not supported, but I don't see where I am mutating an array

Evaluate expression with local variables

Categories

Resources