Type instability in Julia Generator

While trying to reduce the number of allocations generated by a function computing a likelihood by e.g. using Generator expressions, I came across the following behavior which I do not quite understand. Take the following two functions:
function testMax!(x,X,β)
    xmax = 0.0
    @inbounds for i ∈ eachindex(x)
        x[i] = X[i,2] * β[1] + X[i,2] * β[2]
        if x[i] > xmax
            xmax = x[i]
        end
    end
    y = 0.0
    for i ∈ eachindex(x)
        y += exp(x[i]-xmax)
    end
    return xmax, y
end
function testMaxWeird!(x,X,β)
    xmax = 0.0
    @inbounds for i ∈ eachindex(x)
        x[i] = X[i,2] * β[1] + X[i,2] * β[2]
        if x[i] > xmax
            xmax = x[i]
        end
    end
    y = sum(exp(x[j]-xmax) for j ∈ eachindex(x))
    return xmax, y
end
Both generate the same output
using Random
Random.seed!(1234)
H = 10000;
X = rand(H,2);
β = rand(2);
x = zeros(H);
testMax!(x,X,β)
x = zeros(H);
testMaxWeird!(x,X,β)
returns (1.0772897308017204, 6101.682959406999). However, the first one is type stable, while the second one is not (and therefore much slower).
@code_warntype testMax!(x,X,β)
@code_warntype testMaxWeird!(x,X,β)
In particular, the problem lies with the types of y and xmax; the @code_warntype outputs differ in the lines
y::Float64
xmax::Float64
versus
y::Any
xmax@_9::Core.Box
I am just confused as to why exactly this occurs, and whether it is due to bad practice in how I assign xmax multiple times within the function, or to the way I am using the Generator expression?
Edit/Follow-up
The references and solutions provided are very helpful. I am still somewhat confused as to when exactly this can be expected to happen -- is it due to the way that xmax is updated within the for-loop, is its scope different from any other local variable defined within the function? Why does e.g. the (less efficient) way of computing the max not lead to the same closure issue?
function testMax2!(x,X,β)
    @inbounds for i ∈ eachindex(x)
        x[i] = X[i,2] * β[1] + X[i,2] * β[2]
    end
    xmax = maximum(x)
    y = sum(exp(x[j]-xmax) for j ∈ eachindex(x))
    return xmax, y
end
Edit 2: Never mind, I think the Performance Tips explain this: "The parser, when translating it into lower-level instructions, substantially reorganizes the above code by extracting the inner function to a separate code block." I assume this means the problem comes from the variable being assigned multiple times, so that the "reordering" of the code might lead to confusion.

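This is a variation on a longstanding issue with variables captured by closures: https://github.com/JuliaLang/julia/issues/15276. The generator passed to sum is a closure over xmax, and because xmax is assigned more than once in testMaxWeird!, it is stored in a Core.Box and its type can no longer be inferred. In testMax2!, xmax is assigned exactly once before the generator is created, so no box is needed.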

Related

Optim Julia parameter meaning

I'm trying to use Optim in Julia to solve a two variable minimization problem, similar to the following
using Optim
x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]
function sqerror(betas, X, Y)
    err = 0.0
    for i in 1:length(X)
        pred_i = betas[1] + betas[2] * X[i]
        err += (Y[i] - pred_i)^2
    end
    return err
end
res = optimize(b -> sqerror(b, x, y), [0.0,0.0])
res.minimizer
I do not quite understand what [0.0,0.0] means. From the documentation at http://julianlsolvers.github.io/Optim.jl/v0.9.3/user/minimization/, my understanding is that it is the initial condition. However, if I change it to [0.0,0.0,0.0], the algorithm still works even though I only have two unknowns, and it returns three minimizer values instead of two. Does anyone know what [0.0,0.0] really stands for?

It is the initial value. optimize by itself cannot know how many values your sqerror function takes; you specify that by passing this initial value.
For example, if you add a dimensionality check to sqerror, you will get a proper error:
julia> function sqerror(betas::AbstractVector, X::AbstractVector, Y::AbstractVector)
           @assert length(betas) == 2
           err = 0.0
           for i in eachindex(X, Y)
               pred_i = betas[1] + betas[2] * X[i]
               err += (Y[i] - pred_i)^2
           end
           return err
       end
sqerror (generic function with 2 methods)
julia> optimize(b -> sqerror(b, x, y), [0.0,0.0,0.0])
ERROR: AssertionError: length(betas) == 2
Note that I also changed the loop condition to eachindex(X, Y) to ensure that your function checks that the X and Y vectors have aligned indices.
Finally, if you want better performance and lower compilation cost (e.g. because you run this optimization many times), it would be better to define the objective function like this:
objective_factory(x, y) = b -> sqerror(b, x, y)
optimize(objective_factory(x, y), [0.0,0.0])
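The point that the optimizer takes the problem dimension from the initial value can be illustrated with a minimal Python sketch. The `minimize` function below is a hypothetical toy optimizer (plain gradient descent with central differences), not Optim's algorithm; it only shows that an extra entry in the starting vector is accepted silently and never moves, because the objective does not read it.

```python
def sqerror(betas, X, Y):
    # Only betas[0] and betas[1] are ever read; extra entries are ignored.
    return sum((y - (betas[0] + betas[1] * x)) ** 2 for x, y in zip(X, Y))


def minimize(f, x0, step=0.02, iters=5000, h=1e-6):
    """Hypothetical toy optimizer: gradient descent with central differences."""
    x = list(x0)  # the problem dimension is simply len(x0)
    for _ in range(iters):
        grad = []
        for i in range(len(x)):
            hi = x[:i] + [x[i] + h] + x[i + 1:]
            lo = x[:i] + [x[i] - h] + x[i + 1:]
            grad.append((f(hi) - f(lo)) / (2 * h))
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


X = [1.0, 2.0, 3.0]
Y = [2.7, 5.3, 6.9]  # 1 + 2x plus the noise terms from the question

two = minimize(lambda b: sqerror(b, X, Y), [0.0, 0.0])
three = minimize(lambda b: sqerror(b, X, Y), [0.0, 0.0, 0.0])
print(two)
print(three)  # third entry stays at its starting value
```

With three starting values the sketch returns three numbers, the last of which is just the untouched initial guess; this mirrors the behaviour described in the question.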

Julia & Functions - NoMethodError

I'm not understanding why the following snippet of code is returning a NoMethodError in Julia
using Calculus
nx = 101
nt = 101
dx = 2*pi / (nx - 1)
nu = 0.07
dt = dx*nu
function init(x, nu, t)
    phi = exp( -x^2 / 4.0*nu ) + exp( -(x - 2.0*pi)^2 / 4.0*nu )
    dphi_dx = derivative(phi)
    u = ( 2.0*nu /phi )*dphi_dx + 4.0
    return u
end
x = range(0.0,stop=2*pi,length=nx)
t = 0.0
u = [init(x0,nu,t) for x0 in x]
My aim here is to populate the elements of an array named u with values as calculated by my function init. The u array should have nx elements with u calculated at every x value in the range between 0.0 and 2*pi.

Next time, please also post the error message and take a detailed look at it first, so you can try to spot the mistake yourself.
I don't really know the Calculus package but it seems you are using it wrong. Your phi is a number and not a function. You can't take a derivative from just a single number. Change it to
phi = x -> exp( -x^2 / 4.0*nu ) + exp( -(x - 2.0*pi)^2 / 4.0*nu )
and then call phi and derivative at the argument x, i.e. phi(x) and derivative(phi,x) or dphi_dx(x). As I don't know much about the Calculus package, you should check its documentation again to verify that the derivative command does exactly what you want when used like that.
Little extra: there are also element-wise operations in Julia (similar to Matlab for example) that apply functions to the whole array. Instead of [init(x0,nu,t) for x0 in x], you can also write init.(x,nu,t).
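The distinction between differentiating a function and differentiating a number can be sketched in Python. The `derivative` helper below is a hypothetical central-difference stand-in, not the Calculus package's implementation; the expression for phi is copied from the question unchanged.

```python
import math


def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x); f must be callable."""
    if not callable(f):
        raise TypeError("derivative expects a function, not a number")
    return (f(x + h) - f(x - h)) / (2 * h)


nu = 0.07


def phi(x):
    # Same expression as in the question, kept as a function of x
    return math.exp(-x**2 / 4.0 * nu) + math.exp(-(x - 2.0 * math.pi)**2 / 4.0 * nu)


def init(x):
    # phi is evaluated at x, and its derivative is taken from the function itself
    return (2.0 * nu / phi(x)) * derivative(phi, x) + 4.0


xs = [2.0 * math.pi * i / 100 for i in range(101)]
u = [init(x) for x in xs]
```

Calling `derivative(phi(1.0), 1.0)` in this sketch raises a TypeError, which is the analogue of the method error in the question: a plain number carries no information about how it depends on x.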

2D curve fitting in Julia

I have an array Z in Julia which represents an image of a 2D Gaussian function. I.e. Z[i,j] is the height of the Gaussian at pixel i,j. I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (randomly selected points). Then I tried the LsqFit package, but it seems to be tailored for 1D fitting, as it is throwing errors when I try to fit 2D data, and there is no documentation I can find to lead me to a solution.
How can I fit a Gaussian to a 2D array in Julia?

The simplest approach is to use Optim.jl. Here is some example code (it is not optimized for speed, but it should show you how to handle the problem):
using Distributions, Optim
# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])
function objective(x)
    mu, sig, s = decode(x)
    try # sig might be infeasible so we have to handle this case
        est_d = MvNormal(mu, sig)
        ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
        sum((a-b)^2 for (a,b) in zip(ref_m, m))
    catch
        sum(m)
    end
end
# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively you could use constrained optimization e.g. like this:
using Distributions, JuMP, NLopt
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
@variable(m, m1)
@variable(m, m2)
@variable(m, sig11 >= 0.001)
@variable(m, sig12)
@variable(m, sig22 >= 0.001)
@variable(m, sc >= 0.001)
function obj(m1, m2, sig11, sig12, sig22, sc)
    est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
    ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
    sum((a-b)^2 for (a,b) in zip(ref_Z, Z))
end
JuMP.register(m, :obj, 6, obj, autodiff=true)
@NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
@NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])

In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid distance and offset positions), and Ri and Rj the ranges of the indices. dist is the distance measure you use, eg. squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = @views v[1:N], reshape(v[N+1:end], N, N)
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidefinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently and optimize over a lower triangular matrix L such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
    loss(μ, L * L')
end
(This is of course amenable to further optimization, such as expanding the multiplication directly into loss.) A different way is to specify the condition as a constraint in the optimizer.
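As a small check of the lower-triangular parametrization described above, here is a minimal Python sketch for N = 2 with the product L * L' expanded by hand. The helper name `covariance` and the sample vector are illustrative assumptions; the point is only that any real vector v yields a symmetric Σ with non-negative determinant.

```python
def unpack(v):
    """Split v into the mean and a 2x2 lower-triangular factor L."""
    mu = [v[0], v[1]]
    L = [[v[2], 0.0],
         [v[3], v[4]]]
    return mu, L


def covariance(L):
    """Return Sigma = L * L' for a 2x2 lower-triangular L (expanded by hand)."""
    a, b, c = L[0][0], L[1][0], L[1][1]
    return [[a * a, a * b],
            [a * b, b * b + c * c]]


# Arbitrary illustrative parameters, including a negative diagonal entry in L
mu, L = unpack([1.0, 0.0, -1.5, 0.7, 0.2])
sigma = covariance(L)
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]  # equals (a*c)**2
```

Because the determinant reduces to (a*c)^2 and the diagonal entries are sums of squares, the optimizer can search over v without any explicit feasibility constraint on Σ.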
For the optimizer to work, you probably need the derivative of loss_packed. Either you calculate it manually (helped by a good choice of dist), or, maybe more easily, you use a log transformation (if you're lucky, you find a way to reduce it to a linear problem...). Alternatively, you could try to find an optimizer that does automatic differentiation.

Functionally idiomatic FFT

I've written this radix-2 FFT with the goal of making it functionally idiomatic without sacrificing too much performance:
let reverse x bits =
    let rec reverse' x bits y =
        match bits with
        | 0 -> y
        | _ -> ((y <<< 1) ||| (x &&& 1))
               |> reverse' (x >>> 1) (bits - 1)
    reverse' x bits 0
let radix2 (vector: Complex[]) (direction: int) =
    let z = vector.Length
    let depth = floor(Math.Log(double z, 2.0)) |> int
    if (1 <<< depth) <> z then failwith "Vector length is not a power of 2"
    // Complex roots of unity; "twiddle factors"
    let unity: Complex[] =
        let xpn = float direction * Math.PI / double z
        Array.Parallel.init<Complex> (z/2) (fun i ->
            Complex.FromPolarCoordinates(1.0, (float i) * xpn))
    // Permutes elements of input vector via bit-reversal permutation
    let pvec = Array.Parallel.init z (fun i -> vector.[reverse i depth])
    let outerLoop (vec: Complex[]) =
        let rec recLoop size =
            if size <= z then
                let mid, step = size / 2, z / size
                let rec inrecLoop i =
                    if i < z then
                        let rec bottomLoop idx k =
                            if idx < i + mid then
                                let temp = vec.[idx + mid] * unity.[k]
                                vec.[idx + mid] <- (vec.[idx] - temp)
                                vec.[idx] <- (vec.[idx] + temp)
                                bottomLoop (idx + 1) (k + step)
                        bottomLoop i 0
                        inrecLoop (i + size)
                inrecLoop 0
                recLoop (size * 2)
        recLoop 2
        vec
    outerLoop pvec
The outerLoop segment is the biggest nested tail-recursive mess I have ever written. I replicated the algorithm in the Wikipedia article for the Cooley-Tukey algorithm, but the only functional constructs I could think to implement using higher-order functions result in massive hits to both performance and memory efficiency. Are there other solutions that would yield the same results without resulting in massive slow-downs, while still being idiomatic?

I'm not an expert on how the algorithm works, so there might be a nice functional implementation, but it is worth noting that using a localised mutation is perfectly idiomatic in F#.
Your radix2 function is functional from the outside - it takes vector array as an input, never mutates it, creates a new array pvec which it then initializes (using some mutation along the way) and then returns it. This is a similar pattern to what built-in functions like Array.map use (which initializes a new array, mutates it and then returns it). This is often a sensible way of doing things, because some algorithms are better written using mutation.
In this case, it's perfectly reasonable to also use local mutable variables and loops. Doing that will make your code more readable compared to the tail-recursive version. I have not tested this, but my naive translation of your outerLoop function would just be to use three nested loops - something like this:
let mutable size = 2
while size <= z do
    let mid, step = size / 2, z / size
    let mutable i = 0
    while i < z do
        for j in 0 .. mid - 1 do
            let idx, k = i + j, step * j
            let temp = pvec.[idx + mid] * unity.[k]
            pvec.[idx + mid] <- (pvec.[idx] - temp)
            pvec.[idx] <- (pvec.[idx] + temp)
        i <- i + size
    size <- size * 2
This might not be exactly right (I did this just by refactoring your code), but I think it's actually more idiomatic than using complex nested tail-recursive functions in this case.
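The same three-loop structure, with mutation confined to a locally created array, can be sketched in Python and cross-checked against a naive DFT. This is an illustrative translation, not the code from the question: it assumes the standard forward twiddle factors exp(-2πik/z), and the `dft` helper exists only for the comparison.

```python
import cmath


def fft_radix2(vector):
    """Iterative radix-2 FFT (forward transform); returns a new list."""
    z = len(vector)
    depth = z.bit_length() - 1
    if z == 0 or (1 << depth) != z:
        raise ValueError("Vector length is not a power of 2")

    def reverse(x, bits):
        # Reverse the lowest `bits` bits of x
        y = 0
        for _ in range(bits):
            y = (y << 1) | (x & 1)
            x >>= 1
        return y

    # Twiddle factors exp(-2*pi*i*k/z) for k = 0 .. z/2 - 1
    unity = [cmath.exp(-2j * cmath.pi * k / z) for k in range(z // 2)]
    # Bit-reversal permutation into a fresh list; the input is not mutated
    pvec = [vector[reverse(i, depth)] for i in range(z)]

    size = 2
    while size <= z:
        mid, step = size // 2, z // size
        for i in range(0, z, size):
            for j in range(mid):
                idx, k = i + j, step * j
                temp = pvec[idx + mid] * unity[k]
                pvec[idx + mid] = pvec[idx] - temp
                pvec[idx] = pvec[idx] + temp
        size *= 2
    return pvec


def dft(vector):
    """Naive O(n^2) DFT, used only to cross-check the result."""
    n = len(vector)
    return [sum(vector[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


data = [complex(i, -i / 2) for i in range(8)]
result = fft_radix2(data)
```

From the caller's point of view the function is still pure: `data` is left untouched and all in-place updates happen on `pvec`, which is the pattern the answer describes.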

Unclassified statement at (1) in a mathematical expression

My first Fortran lesson is to plot the probability density function of the radial Sturmian functions. In case you are interested, the radial Sturmian functions are used to graph the momentum space eigenfunctions for the hydrogen atom.
In order to produce these radial functions, one needs to first produce some polynomials called the Gegenbauer polynomials, denoted
$C_b^a(x)$.
One needs these polynomials because the Sturmians (let's call them $R_{n,l}$) are defined like so,
$R_{n,l}(p) = N \frac{p^l}{(p^2 + k^2)^{l+2}} \, C_{n-l-1}^{l+1}\left(\frac{p^2 - k^2}{p^2 + k^2}\right),$
where N is a normalisation constant, p is the momentum, n is the principal quantum number, l is the angular momentum and k is a constant. The normalisation constant is there so that when I come to square this function, it will produce a probability distribution for the momentum of the electron in a hydrogen atom.
Gegenbauer polynomials are generated using the following recurrence relation:
$C_n^l(x) = \frac{1}{n}\left[2(l + n - 1)\, x \, C_{n-1}^l(x) - (2l + n - 2)\, C_{n-2}^l(x)\right],$
with $C_0^l(x) = 1$ and $C_1^l(x) = 2lx$. As you may have noticed, l is fixed but n is not. At the start of my program, I will specify both l and n and work out the Gegenbauer polynomial I need for the radial function I wish to plot.
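For reference, the recurrence as stated can be evaluated with a few lines of Python. This is a minimal sketch of the three-term recurrence only (the function name `gegenbauer` is an illustrative choice), not a translation of the Fortran program below.

```python
def gegenbauer(n, l, x):
    """Evaluate C_n^l(x) with the three-term recurrence from the text."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 2.0 * l * x  # C_0^l(x) and C_1^l(x)
    for m in range(2, n + 1):
        # C_m = (1/m) * [2(l + m - 1) x C_{m-1} - (2l + m - 2) C_{m-2}]
        prev, curr = curr, (2.0 * (l + m - 1) * x * curr - (2 * l + m - 2) * prev) / m
    return curr


# For l = 1 the recurrence gives C_2(x) = 4x^2 - 1 and C_3(x) = 8x^3 - 4x
x = 0.3
c2 = gegenbauer(2, 1, x)
c3 = gegenbauer(3, 1, x)
```

Only the two most recent values are kept, so a full table of intermediate polynomials is not needed when just the degree-n value at each x is wanted.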
The problems I am having with my code at the moment are all in my subroutine for working out the value of the Gegenbauer polynomial $C_{n-l-1}^{l+1}\left(\frac{p^2 - k^2}{p^2 + k^2}\right)$ for incremental values of p between 0 and 3. I keep getting the error
Unclassified statement at (1)
but I cannot see what the issue is.
program Radial_Plot
  implicit none
  real, parameter :: pi = 4*atan(1.0)
  integer, parameter :: top = 1000, l = 50, n = 100
  real, dimension(1:top) :: x, y
  real increment
  real :: a=0.0, b = 2.5, k = 0.3
  integer :: i
  real, dimension(1:top) :: C
  increment = (b-a)/(real(top)-1)
  x(1) = 0.0
  do i = 2, top
    x(i) = x(i-1) + increment
  end do
  Call Gegenbauer(top, n, l, k, C)
  y = x*C
  ! y is the function that I shall be plotting between values a and b.
end program Radial_Plot
Subroutine Gegenbauer(top1, n1, l1, k1, CSub)
  ! This subroutine is my attempt to calculate the Gegenbauer polynomials evaluated at a certain number of values between c and d.
  implicit none
  integer :: top1, i, j, n1, l1
  real :: k1, increment1, c, d
  real, dimension(1:top1) :: x1
  real, dimension(1:n1 - l1, 1:top1) :: C1
  real, dimension(1:n1 - l1) :: CSub
  c = 0.0
  d = 3.0
  k1 = 0.3
  n1 = 50
  l1 = 25
  top1 = 1000
  increment1 = (d - c)/(real(top1) - 1)
  x1(1) = 0.0
  do i = 2, top1
    x1(i) = x1(i-1) + increment1
  end do
  do j = 1, top1
    C1(1,j) = 1
    C1(2,j) = 2(l1 + 1)(x1(i)^2 - k1^2)/(x1(i)^2 + k1^2)
    ! All the errors occurring here are all due to, and I quote, 'Unclassifiable statement at (1)', I can't see what the heck I have done wrong.
    do i = 3, n1 - l1
      C1(i,j) = 2(((l1 + 1)/n1) + 1)(x1(i)^2 - k1^2)/(x1(i)^2 + k1^2)C1(i,j-1) - ((2(l1+1)/n1) + 1)C1(i,j-2)
    end do
    CSub(j) = Cn(n1 - l1,j)^2
  end do
  return
end Subroutine Gegenbauer

As francescalus correctly pointed out, the problem is that you use ^ instead of ** for exponentiation. Additionally, you do not put * between the terms you are multiplying.
C1(1,j) = 1
C1(2,j) = 2*(l1 + 1)*(x1(i)**2 - k1**2)/(x1(i)**2 + k1**2)
do i = 3, n1 - l1
  C1(i,j) = 2 * (((l1 + 1)/n1) + 1) * (x1(i)**2 - k1**2) / &
            (x1(i)**2 + k1**2)*C1(i,j-1) - ((2*(l1+1)/n1) + 1) * &
            C1(i,j-2)
end do
CSub(j) = Cn(n1 - l1,j)**2
Since you are beginning, I have some advice. Learn to put all subroutines and functions into modules (unless they are internal). There is no reason for the return statement at the end of the subroutine, just as a stop statement isn't necessary at the end of the program.