I have two arrays of tuples and a loop that checks whether an element of one is in the other.
At each step I ask if the tuple contained in the coord array is in the Y array. The loop works fine except for one element, and I can't explain why. Here is what I have:
Y[55:65] # This is the part of Y I check at each step to see whether my state is in it.
11-element Array{Any,1}:
 (2.0, 1.0)
 (3.0, 1.0)
 (4.0, 1.0)
 (5.0, 1.0)
 (6.0, 1.0)
 (7.0, 1.0)
 (8.0, 1.0)
 (9.0, 1.0)
 (10.0, 1.0)
 (11.0, 1.0)
 (12.0, 1.0)
coord[i-1] # this is one element of coord that is in Y
0-dimensional Array{Tuple{Float64,Float64},0}: (6.0, 1.0)
coord[i] # this is another element of coord that is in Y
0-dimensional Array{Tuple{Float64,Float64},0}: (7.0, 1.0)
But then when I test whether they are in Y:
in(coord[i],Y[55:65]) # testing if coord[i] is in Y
false
in(coord[i-1],Y[55:65]) # testing if coord[i-1] is in Y
true
I don't understand: they are both displayed the same way as the entries of Y and they have the same type, so why does in() report that one is in Y and the other is not?
I use Julia version 0.6.3.
Thanks in advance for the help!
How did you get coord and Y? If they come from calculations rather than direct assignments, the values may not be exactly equal even though they are displayed the same. For example:
julia> p1 = fill((6.0, 1.0))
0-dimensional Array{Tuple{Float64,Float64},0}:
(6.0, 1.0)
julia> p2 = fill((7.0 + 3eps(), 1.0))
0-dimensional Array{Tuple{Float64,Float64},0}:
(7.000000000000001, 1.0)
julia> Y = [p1, p2]
2-element Array{Array{Tuple{Float64,Float64},0},1}:
(6.0, 1.0)
(7.0, 1.0) # NOTE that it gets truncated in the display, but the content did not change!
julia> x = fill((6.0, 1.0))
0-dimensional Array{Tuple{Float64,Float64},0}:
(6.0, 1.0)
julia> x in Y
true
julia> x = fill((7.0, 1.0))
0-dimensional Array{Tuple{Float64,Float64},0}:
(7.0, 1.0)
julia> x in Y
false
If this is the case, you can either round the values before comparing, or write the membership check manually using isapprox (or the ≈ operator, typed in Julia as \approx followed by Tab), as in the sketch below.
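A minimal sketch of such a manual check (approx_in is just an illustrative helper name; coord[i][] extracts the tuple from the 0-dimensional array shown in the question):
# true if any tuple in ys matches x element-wise, up to the default isapprox tolerance
approx_in(x, ys) = any(y -> all(isapprox(a, b) for (a, b) in zip(x, y)), ys)
approx_in(coord[i][], Y[55:65])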
I'm trying to use Optim in Julia to solve a two-variable minimization problem, similar to the following:
using Optim
x = [1.0, 2.0, 3.0]
y = 1.0 .+ 2.0 .* x .+ [-0.3, 0.3, -0.1]
function sqerror(betas, X, Y)
err = 0.0
for i in 1:length(X)
pred_i = betas[1] + betas[2] * X[i]
err += (Y[i] - pred_i)^2
end
return err
end
res = optimize(b -> sqerror(b, x, y), [0.0,0.0])
res.minimizer
I do not quite understand what [0.0,0.0] means. Looking at the documentation http://julianlsolvers.github.io/Optim.jl/v0.9.3/user/minimization/, my understanding is that it is the initial condition. However, if I change it to [0.0, 0.0, 0.0], the algorithm still works despite the fact that I only have two unknowns, and it gives me three minimizers instead of two. I was wondering if anyone knows what [0.0,0.0] really stands for.
It is the initial value. optimize by itself cannot know how many parameters your sqerror function takes; you specify that by passing this initial value, whose length sets the dimension of the search space.
For example, if you add a dimensionality check to sqerror you will get a proper error:
julia> function sqerror(betas::AbstractVector, X::AbstractVector, Y::AbstractVector)
@assert length(betas) == 2
err = 0.0
for i in eachindex(X, Y)
pred_i = betas[1] + betas[2] * X[i]
err += (Y[i] - pred_i)^2
end
return err
end
sqerror (generic function with 2 methods)
julia> optimize(b -> sqerror(b, x, y), [0.0,0.0,0.0])
ERROR: AssertionError: length(betas) == 2
Note that I also changed the loop condition to eachindex(X, Y) to ensure that your function checks if X and Y vectors have aligned indices.
Finally, if you want performance and want to reduce compilation cost (e.g. assuming you run this optimization many times), it would be better to define your objective function like this:
objective_factory(x, y) = b -> sqerror(b, x, y)
optimize(objective_factory(x, y), [0.0,0.0])
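As a small usage sketch (assuming Optim is loaded and x, y are defined as in the question), the result can then be inspected with Optim.minimizer:
res = optimize(objective_factory(x, y), [0.0, 0.0]) # a two-element start, so two parameters are fitted
Optim.minimizer(res) # the fitted [intercept, slope]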
I'm attempting to translate the equivalent of the following Python code (from SMT GEKPLS) into Julia:
def differences(X, Y):
D = X[:, np.newaxis, :] - Y[np.newaxis, :, :]
return D.reshape((-1, X.shape[1]))
So, given an input like this:
X = np.array([[1.0,1.0,1.0], [2.0,2.0,2.0]])
Y = np.array([[1.0,2.0,3.0], [4.0,5.0,6.0], [7.0,8.0,9.0]])
diff = differences(X,Y)
We get an output (diff) that looks like this:
[[ 0. -1. -2.]
[-3. -4. -5.]
[-6. -7. -8.]
[ 1. 0. -1.]
[-2. -3. -4.]
[-5. -6. -7.]]
What is an efficient way to do this with Julia code? I expect the X and Y input matrices to be quite large.
After some thinking, I came to this function:
function differences(X, Y)
Rx = repeat(X, inner=(size(Y, 1), 1))
Ry = repeat(Y, size(X, 1))
Rx - Ry
end
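As a quick check, this reproduces the expected output for the matrices from the question (a sketch):
X = [1.0 1.0 1.0; 2.0 2.0 2.0]
Y = [1.0 2.0 3.0; 4.0 5.0 6.0; 7.0 8.0 9.0]
differences(X, Y) # 6×3 matrix with rows [0 -1 -2; -3 -4 -5; -6 -7 -8; 1 0 -1; -2 -3 -4; -5 -6 -7]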
I hope I was helpful.
Here's a version that avoids repeat, which creates unnecessary data duplication:
function diffs_row(X, Y)
N = size(X, 2)
return reshape(reshape(X', 1, N, :) .- Y', N, :)'
end
The reason for all the adjoints ' is that it isn't really natural to operate row-wise in Julia. Julia arrays are column-major so reshape will retrieve data column-wise. If you decide instead to change the orientation of the data, you could write
function diffs_col(X, Y)
N = size(X, 1)
return reshape(reshape(X, N, 1, :) .- Y, N, :)
end
instead.
One often sees this when translating NumPy code to Julia. NumPy is natively row-major, so the translation becomes a bit awkward. In many cases you should consider changing your data layout to be column-major, as sketched below.
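For illustration, here is a sketch with the question's points stored as columns instead of rows (Xc and Yc are just hypothetical names for that layout):
Xc = [1.0 2.0; 1.0 2.0; 1.0 2.0] # 3×2: the two X points as columns
Yc = [1.0 4.0 7.0; 2.0 5.0 8.0; 3.0 6.0 9.0] # 3×3: the three Y points as columns
diffs_col(Xc, Yc) # 3×6: each column is one pairwise difference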
This might be faster than other alternatives, while still being easy to understand (here X and Y are iterated as collections of points, e.g. vectors of vectors or eachrow views of the matrices from the question):
[x .- y for x ∈ X for y ∈ Y]
6-element Vector{Vector{Float64}}:
[0.0, -1.0, -2.0]
[-3.0, -4.0, -5.0]
[-6.0, -7.0, -8.0]
[1.0, 0.0, -1.0]
[-2.0, -3.0, -4.0]
[-5.0, -6.0, -7.0]
The one thing I disliked about NumPy is that one has to remember exactly each function together with its combination of input parameters. In Julia, a plain loop can serve as an efficient drop-in replacement for most such algorithms.
Addendum: The above might be the fastest solution as I said, provided that working with a Vector{Vector{Float64}} is not an issue. If it is, here is another solution that outputs a Matrix{Float64} while being fast as well.
function diffr(X, Y)
    i, l, m, n = 0, length(first(X)), length(X), length(Y)
    Z = Matrix{Float64}(undef, m*n, l)  # preallocate the output
    for x in X, y in Y
        Z[i+=1, :] .= x .- y  # write each pairwise difference into the next row
    end
    Z
end
And here is a performance comparison of all posted solutions on my computer (timed with @btime from BenchmarkTools.jl):
@btime [x.-y for x∈$X for y∈$Y] # 312.245 ns (9 allocations: 656 bytes)
@btime diffr($X, $Y) # 73.868 ns (1 allocation: 208 bytes)
@btime differences($X, $Y) # 439.000 ns (12 allocations: 896 bytes)
@btime diffs_row($X, $Y) # 463.131 ns (11 allocations: 784 bytes)
I have a complex matrix (i.e. Array{Complex{Float64},2}) in julia that I would like to upsample in one dimension.
My equivalent python code is:
data_package['time_series'] = sp.signal.resample(data_package['time_series'].astype('complex64'), data_package['time_series'].shape[1]*upsample_factor, axis=1)
A resample() function can be found in DSP.jl, but it only works on Vectors, so one has to apply it manually along the desired dimension. One possible way looks like this (resampling each column, i.e. upsampling along the first dimension, with a new rate of 2):
julia> using DSP
julia> test = reshape([1.0im, 2.0im, 3.0im, 4., 5., 6.], 3, 2)
3×2 Matrix{ComplexF64}:
0.0+1.0im 4.0+0.0im
0.0+2.0im 5.0+0.0im
0.0+3.0im 6.0+0.0im
julia> newRate = 2
2
julia> up = [resample(test[:, i], newRate) for i in 1:size(test, 2)] # gives a vector of vectors
2-element Vector{Vector{ComplexF64}}:
[0.0 + 0.9999042566881922im, 0.0 + 1.2801955476665785im, 0.0 + 1.9998085133763843im, 0.0 + 2.968204475861045im, 0.0 + 2.9997127700645763im]
[3.9996170267527686 + 0.0im, 4.466495565312296 + 0.0im, 4.999521283440961 + 0.0im, 6.154504493506763 + 0.0im, 5.9994255401291525 + 0.0im]
julia> cat(up..., dims = 2) # fuse to matrix
5×2 Matrix{ComplexF64}:
0.0+0.999904im 3.99962+0.0im
0.0+1.2802im 4.4665+0.0im
0.0+1.99981im 4.99952+0.0im
0.0+2.9682im 6.1545+0.0im
0.0+2.99971im 5.99943+0.0im
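The per-column loop can be packed into a small helper (a sketch; upsample_cols is just an illustrative name). It upsamples along the first dimension, so for the question's axis=1 case one would permutedims before and after:
upsample_cols(A, rate) = reduce(hcat, [resample(A[:, i], rate) for i in 1:size(A, 2)])
upsample_cols(test, 2) # same 5×2 matrix as above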
Please consider the package FFTResampling.jl
The method is based on the FFT, assuming periodic and band-limited input.
I'm using SymPy in Julia. My goal is to solve a homogeneous system of linear equations Ax = 0 with more unknowns than equations (A is not square).
I'm using the following code.
using SymPy
x, y, z, w = symbols("x y z w")
M = sympy.Matrix(((9, 2, 1,- 4, 0), (-4, -3, -1, -5, 0)))
s = linsolve(M, (x, y, z, w))
With this code I'm able to get the correct solution. However, I don't know how to manipulate that solution.
The final goal is to get the solution in matrix form, with rows corresponding to (x, y) and columns to (z, w), since x and y are functions of z and w.
Thanks
If numerical (rather than symbolic) computations are acceptable, then this will get the job done:
julia> using LinearAlgebra
julia> M = rand(2,4)
2×4 Array{Float64,2}:
0.497965 0.704514 0.152799 0.69448
0.594486 0.695488 0.327688 0.710573
julia> Q, R = qr(M);
julia> C = -R[1:2,1:2] \ R[1:2,3:4];  # solve R1*xy = -R2*zw for the dependent block
julia> xy(zw) = C*zw;
# Check that `[xy(zw); zw]` is indeed in the nullspace of `M`:
julia> zw = rand(2);
julia> M*[xy(zw); zw]
2-element Array{Float64,1}:
-2.7755575615628914e-17
-1.1102230246251565e-16
I have an array Z in Julia which represents an image of a 2D Gaussian function. I.e. Z[i,j] is the height of the Gaussian at pixel i,j. I would like to determine the parameters of the Gaussian (mean and covariance), presumably by some sort of curve fitting.
I've looked into various methods for fitting Z: I first tried the Distributions package, but it is designed for a somewhat different situation (randomly selected points). Then I tried the LsqFit package, but it seems to be tailored for 1D fitting, as it is throwing errors when I try to fit 2D data, and there is no documentation I can find to lead me to a solution.
How can I fit a Gaussian to a 2D array in Julia?
The simplest approach is to use Optim.jl. Here is example code (not optimized for speed, but it should show you how to handle the problem):
using Distributions, Optim
# generate some sample data
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const m = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
decode(x) = (mu=x[1:2], sig=[x[3] x[4]; x[4] x[5]], s=x[6])
function objective(x)
mu, sig, s = decode(x)
try # sig might be infeasible so we have to handle this case
est_d = MvNormal(mu, sig)
ref_m = [s * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_m, m))
catch
sum(m)
end
end
# test for an example starting point
result = optimize(objective, [1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
decode(result.minimizer)
Alternatively you could use constrained optimization e.g. like this:
using Distributions, JuMP, NLopt
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
const xr = -3:0.1:3
const yr = -3:0.1:3
const s = 5.0
const Z = [s * pdf(true_d, [x, y]) for x in xr, y in yr]
m = Model(solver=NLoptSolver(algorithm=:LD_MMA))
@variable(m, m1)
@variable(m, m2)
@variable(m, sig11 >= 0.001)
@variable(m, sig12)
@variable(m, sig22 >= 0.001)
@variable(m, sc >= 0.001)
function obj(m1, m2, sig11, sig12, sig22, sc)
est_d = MvNormal([m1, m2], [sig11 sig12; sig12 sig22])
ref_Z = [sc * pdf(est_d, [x, y]) for x in xr, y in yr]
sum((a-b)^2 for (a,b) in zip(ref_Z, Z))
end
JuMP.register(m, :obj, 6, obj, autodiff=true)
@NLobjective(m, Min, obj(m1, m2, sig11, sig12, sig22, sc))
@NLconstraint(m, sig12*sig12 + 0.001 <= sig11*sig22)
setvalue(m1, 0.0)
setvalue(m2, 0.0)
setvalue(sig11, 1.0)
setvalue(sig12, 0.0)
setvalue(sig22, 1.0)
setvalue(sc, 1.0)
status = solve(m)
getvalue.([m1, m2, sig11, sig12, sig22, sc])
In principle, you have a loss function
loss(μ, Σ) = sum(dist(Z[i,j], N([x(i), y(j)], μ, Σ)) for i in Ri, j in Rj)
where x and y convert your indices to points on the axes (for which you need to know the grid spacing and offset), N(p, μ, Σ) is the Gaussian density at point p, and Ri and Rj are the ranges of the indices. dist is the distance measure you use, e.g. squared difference.
You should be able to pass this into an optimizer by packing μ and Σ into a single vector:
pack(μ, Σ) = [μ; vec(Σ)]
unpack(v) = @views v[1:N], reshape(v[N+1:end], N, N)
loss_packed(v) = loss(unpack(v)...)
where in your case N = 2. (Maybe the unpacking deserves some optimization to get rid of unnecessary copying.)
Another thing is that we have to ensure that Σ is positive semidefinite (and hence also symmetric). One way to do that is to parametrize the packed loss function differently and optimize over some lower-triangular matrix L such that Σ = L * L'. In the case N = 2, we can write this as
unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
loss_packed(v) = let (μ, L) = unpack(v)
loss(μ, L * L')
end
(This is of course open to further optimization, such as expanding the multiplication directly into loss.) A different way is to pass the condition to the optimizer as constraints.
For the optimizer to work you probably need the derivative of loss_packed. Either you calculate it manually (made easier by a good choice of dist), or perhaps more easily by using a log transformation (if you're lucky, you find a way to reduce it to a linear problem...). Alternatively, you could use an optimizer that does automatic differentiation, as sketched below.
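Here is a minimal end-to-end sketch of this approach, using Optim.jl with forward-mode automatic differentiation (the synthetic grid, the hand-written density gauss, and all names below are illustrative assumptions, not part of the original question):
using Distributions, LinearAlgebra, Optim

xs = -3:0.1:3  # grid coordinates for the rows of Z
ys = -3:0.1:3  # grid coordinates for the columns of Z
true_d = MvNormal([1.0, 0.0], [2.0 1.0; 1.0 3.0])
Z = [pdf(true_d, [x, y]) for x in xs, y in ys]  # synthetic "image" to fit

# 2D Gaussian density written out by hand so it works with ForwardDiff dual numbers
gauss(p, μ, Σ) = exp(-0.5 * dot(p - μ, Σ \ (p - μ))) / (2π * sqrt(det(Σ)))
dist(a, b) = (a - b)^2  # squared difference as the distance measure

unpack(v) = v[1:2], LowerTriangular([v[3] zero(v[3]); v[4] v[5]])
function loss_packed(v)
    μ, L = unpack(v)
    Σ = L * L'  # positive semidefinite by construction
    sum(dist(Z[i, j], gauss([xs[i], ys[j]], μ, Σ)) for i in eachindex(xs), j in eachindex(ys))
end

v0 = [0.0, 0.0, 1.0, 0.0, 1.0]  # start at μ = (0, 0), L = I
res = optimize(loss_packed, v0, LBFGS(); autodiff = :forward)
μ_est, L_est = unpack(Optim.minimizer(res))
Σ_est = L_est * L_est'  # estimated covariance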