I am very new to the Julia programming language and I am testing out some Euclidean distance operations I typically perform in other languages. The functions work when called serially, but the pmap calls are not returning the desired results. Could someone take a look and let me know if I am going about this the right way? Is pmap even the best way to approach this?
using Distributed
#Example data
d1 = randn(50000,3)
d2 = randn(50000,3)
First Function: Euclidean Distance Matrix
function EDM(m1, m2)
    n1 = size(m1, 1)
    n2 = size(m2, 1)
    k = size(m1, 2)
    Dist = zeros(n1, n2)
    for i in 1:n1
        for j in 1:n2
            dtemp = 0.0
            for a in 1:k
                dtemp += (m1[i,a] - m2[j,a]) ^ 2
            end
            Dist[i,j] = sqrt(dtemp)
        end
    end
    return Dist
end
#pmap call
function pmap_EDM(m1, m2)
    return pmap(EDM, m1, m2)
end
Second Function: Minimum Euclidean Distances Unidirectional
function MED(m1, m2)
    n1 = size(m1, 1)
    n2 = size(m2, 1)
    k = size(m1, 2)
    Dist = zeros(n1, 1)
    for i in 1:n1
        dsum = Inf
        for j in 1:n2
            dtemp = 0.0
            for a in 1:k
                dtemp += (m1[i,a] - m2[j,a]) ^ 2
            end
            dtemp = sqrt(dtemp)
            if dtemp < dsum
                dsum = dtemp
            end
        end
        Dist[i,1] = dsum
    end
    return Dist
end
#pmap call
function pmap_MED(m1, m2)
    return pmap(MED, m1, m2)
end
Third Function: Minimum Euclidean Distances and Corresponding Indices Unidirectional
function MEDI(m1, m2)
    n1 = size(m1, 1)
    n2 = size(m2, 1)
    k = size(m1, 2)
    Dist = zeros(n1, 2)
    for i in 1:n1
        dsum = Inf
        dsum_ind = 0
        for j in 1:n2
            dtemp = 0.0
            for a in 1:k
                dtemp += (m1[i,a] - m2[j,a]) ^ 2
            end
            dtemp = sqrt(dtemp)
            if dtemp < dsum
                dsum = dtemp
                dsum_ind = j
            end
        end
        Dist[i,1] = dsum
        Dist[i,2] = dsum_ind
    end
    return Dist
end
#pmap call
function pmap_MEDI(m1, m2)
    return pmap(MEDI, m1, m2)
end
#Calling functions
r1 = EDM(d1,d2) #serial
r2 = pmap_EDM(d1,d2)
r3 = MED(d1,d2) #serial
r4 = pmap_MED(d1,d2)
r5 = MEDI(d1,d2) #serial
r6 = pmap_MEDI(d1,d2)
Edited:
The first function should return a simple Euclidean distance matrix with the distances between each row in one array and every row in the second array. The second and third functions are variations of this that return a subset of those distances based on the minimum distance from each row in one array to every row in the other array (with the third function also returning the index position of the minimum distance). The distances do not appear to be calculated correctly, and the latter two functions using pmap are returning an nx3 matrix rather than nx1 and nx2 respectively.
Edited 2: example using smaller data set to show results
d1 = randn(5,3)
d2 = randn(5,3)
julia> EDM(d1,d2)
5×5 Array{Float64,2}:
2.60637 3.18867 1.0745 2.60328 1.58608
1.2763 2.31037 3.04379 2.74113 2.00452
1.70024 2.07731 3.12397 2.60893 2.05932
2.44581 1.57345 0.910323 1.08718 0.407675
3.42936 1.13001 2.18345 1.08764 1.70883
julia> pmap_EDM(d1,d2)
5×3 Array{Array{Float64,2},2}:
[0.397928] [2.39283] [0.953501]
[1.06776] [0.815057] [1.87973]
[0.151963] [3.05161] [0.650967]
[0.571021] [0.275554] [0.883151]
[0.109293] [0.635398] [1.58254]
julia> MED(d1,d2)
5×1 Array{Float64,2}:
1.0744953977891307
1.2762979313081781
1.7002448697495505
0.40767454400155695
1.0876399289364607
julia> pmap_MED(d1,d2)
5×3 Array{Array{Float64,2},2}:
[0.397928] [2.39283] [0.953501]
[1.06776] [0.815057] [1.87973]
[0.151963] [3.05161] [0.650967]
[0.571021] [0.275554] [0.883151]
[0.109293] [0.635398] [1.58254]
julia> MEDI(d1,d2)
5×2 Array{Float64,2}:
1.0745 3.0
1.2763 1.0
1.70024 1.0
0.407675 5.0
1.08764 4.0
julia> pmap_MEDI(d1,d2)
5×3 Array{Array{Float64,2},2}:
[0.397928 1.0] [2.39283 1.0] [0.953501 1.0]
[1.06776 1.0] [0.815057 1.0] [1.87973 1.0]
[0.151963 1.0] [3.05161 1.0] [0.650967 1.0]
[0.571021 1.0] [0.275554 1.0] [0.883151 1.0]
[0.109293 1.0] [0.635398 1.0] [1.58254 1.0]
Edited 3: @distributed version of function two
using Distributed
using SharedArrays
#Minimum Euclidean Distances Unidirectional
@everywhere function MD(v1, m2)
    n = size(m2, 1)
    dsum = Inf
    for j in 1:n
        dtemp = sqrt((v1[1] - m2[j,1]) ^ 2 + (v1[2] - m2[j,2]) ^ 2 + (v1[3] - m2[j,3]) ^ 2)
        if dtemp < dsum
            dsum = dtemp
        end
    end
    return dsum
end
function MED(m1, m2)
    n1 = size(m1, 1)
    Dist = SharedArray{Float64}(n1)
    m3 = convert(SharedArray, m2)
    @sync @distributed for k in 1:n1
        Dist[k] = MD(m1[k,:], m3)
    end
    return Dist
end
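(Setup assumed for this version, since the @distributed loop only runs in parallel if worker processes exist before the @everywhere definition is evaluated; the worker count of 4 here is an arbitrary choice, not part of the original snippet:)
using Distributed
addprocs(4)  # hypothetical worker count; at most your number of cores
@everywhere using SharedArrays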
I did not go into the details of your code, but could it be that you are applying pmap at the wrong code level?
For instance if you have the following serial code
for i = 1:imax
    # do some work
end
You would write this as:
function function_for_single_iteration(i)
    # do some work
end
pmap(function_for_single_iteration, 1:imax)
Essentially, pmap replaces an (outer) for loop.
Before using pmap, I usually first use the serial map function to check that I have the same results.
Note that pmap and map return a vector; in your case, probably a vector of vectors of distances. You would need to use cat (or vcat/hcat) to turn this into a matrix.
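For example, for your second function the natural unit of work is one row of m1. A minimal sketch of that pattern (assuming workers were already added with addprocs; min_dist and pmap_MED2 are illustrative names, not from your post):

using Distributed
# Minimum distance from one row (a vector) to every row of m2.
@everywhere function min_dist(v, m2)
    best = Inf
    for j in 1:size(m2, 1)
        d = 0.0
        for a in 1:length(v)
            d += (v[a] - m2[j, a]) ^ 2
        end
        d = sqrt(d)
        if d < best
            best = d
        end
    end
    return best
end
function pmap_MED2(m1, m2)
    # one pmap task per row of m1, instead of mapping over single elements
    rows = [m1[i, :] for i in 1:size(m1, 1)]
    return pmap(v -> min_dist(v, m2), rows)
end
r = pmap_MED2(d1, d2)   # Vector{Float64} of length size(d1, 1)
Dist = reshape(r, :, 1) # n×1 matrix, matching the serial MED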
Related
I want to sum all elements in a matrix A with dimension n times n. The matrix is symmetric and has 0s on the diagonal. The fastest way to do so that I have found is simply sum(A). However, this seems wasteful since it doesn't use the fact that I only need to calculate the lower triangle of the matrix. Yet sum(tril(A, -1)) is significantly slower, and sum(A[i, j] for i = 1:n-1 for j = i+1:n) even more so. Is there a more efficient way to sum the matrix?
Edit: The solution by @AboAmmar performs well. Here is code (with summing the diagonal separately, something that can be removed if there are only zeros on the diagonal) to compare:
using BenchmarkTools
using LinearAlgebra
function sum_triu(A)
    m, n = size(A)
    @assert m == n
    s = zero(eltype(A))
    for j = 2:n
        @simd for i = 1:j-1
            s += @inbounds A[i,j]
        end
    end
    s *= 2
    for i = 1:n
        s += A[i, i]
    end
    return s
end
N = 1000
A = Symmetric(rand(0:9, N, N))
A -= diagm(diag(A))
@btime sum(A)
@btime 2 * sum(tril(A))
@btime sum_triu(A)
This is 2.7X faster than sum for an n = 1000 matrix. Make sure to add a @simd before the loop and use @inbounds. Also, use the correct loop order for fast memory access.
function sum_triu(A)
    m, n = size(A)
    @assert m == n
    s = zero(eltype(A))
    for j = 1:n
        @simd for i = 1:j
            s += @inbounds A[i,j]
        end
    end
    return 2 * s
end
Example run on my PC:
sum_triu(A) = 499268.7328022966
sum(A) = 499268.73280229873
93.000 μs (0 allocations: 0 bytes)
249.900 μs (0 allocations: 0 bytes)
How about
2 * sum(LowerTriangular(A))
help?> LinearAlgebra.LowerTriangular
LowerTriangular(A::AbstractMatrix)
Construct a LowerTriangular view of the matrix A.
tril creates a new matrix, which allocates memory. Since a LowerTriangular is a view into the existing matrix, there's no memory allocation.
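A quick way to see the allocation difference (a sketch; exact timings will vary by machine):

using LinearAlgebra, BenchmarkTools
A = Symmetric(rand(0:9, 1000, 1000))
@btime sum(tril($A))                 # allocates a full n×n copy
@btime 2 * sum(LowerTriangular($A))  # no allocation; LowerTriangular wraps a view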
I have multiple objective functions for the same model in Julia JuMP, created using an @objective in a for loop. What does it mean to have multiple objective functions in Julia? What objective is minimized, or is it that all the objectives are minimized jointly? How are the objectives minimized jointly?
using JuMP
using MosekTools
K = 3
N = 2
penalties = [1.0, 3.9, 8.7]
function fac1(r::Number, i::Number, l::Number)
    fac1 = 1.0
    for m in 0:r-1
        fac1 *= (i - m) * (l - m)
    end
    return fac1
end
function fac2(r::Number, i::Number, l::Number, tau::Float64)
    return tau ^ (i + l - 2r + 1) / (i + l - 2r + 1)
end
function Q_r(i::Number, l::Number, r::Number, tau::Float64)
    if i >= r && l >= r
        return 2 * fac1(r, i, l) * fac2(r, i, l, tau)
    else
        return 0.0
    end
end
function Q(i::Number, l::Number, tau::Number)
    elem = 0.0
    for r in 0:N
        elem += penalties[r + 1] * Q_r(i, l, r, tau)
    end
    return elem
end
# discrete segment starting times
mat = Array{Float64, 3}(undef, K, N+1, N+1)
function Q_mat()
    for k in 0:K-1
        for i in 1:N+1
            for j in 1:N+1
                mat[k+1, i, j] = Q(i, j, convert(Float64, k))
            end
        end
    end
    return mat
end
function A_tau(r::Number, n::Number, tau::Float64)
    fac = 1
    for m in 1:r
        fac *= (n - (m - 1))
    end
    if n >= r
        return fac * tau ^ (n - r)
    else
        return 0.0
    end
end
function A_tau_mat(tau::Float64)
    mat = Array{Float64, 2}(undef, N+1, N+1)
    for i in 1:N+1
        for j in 1:N+1
            mat[i, j] = A_tau(i, j, tau)
        end
    end
    return mat
end
function A_0(r::Number, n::Number)
    if r == n
        fac = 1
        for m in 1:r
            fac *= r - (m - 1)
        end
        return fac
    else
        return 0.0
    end
end
m = Model(optimizer_with_attributes(Mosek.Optimizer, "QUIET" => false, "INTPNT_CO_TOL_DFEAS" => 1e-7))
@variable(m, A[i=1:K+1, j=1:K, k=1:N+1, l=1:N+1])
@variable(m, p[i=1:K+1, j=1:N+1])
# constraint difference might be a small fractional difference.
# assuming that time difference is 1 second starting from 0.
for i in 1:K
    @constraint(m, -A_tau_mat(convert(Float64, i-1)) * p[i] .+ A_tau_mat(convert(Float64, i-1)) * p[i+1] .== [0.0, 0.0, 0.0])
end
for i in 1:K+1
    @constraint(m, A_tau_mat(convert(Float64, i-1)) * p[i] .== [1.0 12.0 13.0])
end
@constraint(m, A_tau_mat(convert(Float64, K+1)) * p[K+1] .== [0.0 0.0 0.0])
for i in 1:K+1
    @objective(m, Min, p[i]' * Q_mat()[i] * p[i])
end
optimize!(m)
println("p value is ", value.(p))
println(A_tau_mat(0.0), A_tau_mat(1.0), A_tau_mat(2.0))
With standard JuMP you can have only one objective function at a time. Running another @objective macro just overwrites the previous objective function.
Consider the following code:
julia> m = Model(GLPK.Optimizer);
julia> @variable(m, x >= 0)
x
julia> @objective(m, Max, 2x)
2 x
julia> @objective(m, Min, 2x)
2 x
julia> println(m)
Min 2 x
Subject to
 x >= 0.0
It can be clearly seen that only one objective function is left.
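If you actually want several objectives minimized jointly in plain JuMP, the usual workaround is scalarization: combine them into a single weighted-sum objective. A minimal self-contained sketch (the model, weights, and expressions here are illustrative, not taken from the question):

using JuMP, GLPK
m = Model(GLPK.Optimizer)
@variable(m, x >= 0)
@variable(m, y >= 0)
@constraint(m, x + y >= 4)
w1, w2 = 1.0, 2.0  # hypothetical trade-off weights
# one objective that blends the two criteria 2x + y and x + 3y
@objective(m, Min, w1 * (2x + y) + w2 * (x + 3y))
optimize!(m)
value(x), value(y)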
However, there is indeed an area in optimization called multi-criteria optimization. The goal here is to find a Pareto frontier.
There is a Julia package for handling multi-criteria optimization named MultiJuMP. Here is a sample code:
using MultiJuMP, JuMP
using Clp
const mmodel = multi_model(Clp.Optimizer, linear = true)
const y = @variable(mmodel, 0 <= y <= 10.0)
const z = @variable(mmodel, 0 <= z <= 10.0)
@constraint(mmodel, y + z <= 15.0)
const exp_obj1 = @expression(mmodel, -y + 0.05 * z)
const exp_obj2 = @expression(mmodel, 0.05 * y - z)
const obj1 = SingleObjective(exp_obj1)
const obj2 = SingleObjective(exp_obj2)
const multim = get_multidata(mmodel)
multim.objectives = [obj1, obj2]
optimize!(mmodel, method = WeightedSum())
This library also supports plotting of the Pareto frontier.
The disadvantage is that as of today it does not seem to be actively maintained (however it works with the current Julia and JuMP versions).
I've tried to reproduce the model from a PyMC3 and Stan comparison, but it seems to run slowly, and when I look at @code_warntype there are some things (K and N, I think) which the compiler seemingly marks as Any.
I've tried adding types, though I can't add types to turing_model's arguments, and things are complicated within turing_model because it uses autodiff variables rather than the usual types. I put all the code into the function do_it to avoid globals, since globals can slow things down. (It actually seems slower, though.)
Any suggestions as to what's causing the problem? The turing_model code is what's iterating, so that should make the most difference.
using Turing, StatsPlots, Random, Statistics
sigmoid(x) = 1.0 / (1.0 + exp(-x))
function scale(w0::Float64, w1::Array{Float64,1})
    scale = √(w0^2 + sum(w1 .^ 2))
    return w0 / scale, w1 ./ scale
end
function do_it(iterations::Int64)
    K = 10 # predictor dimension
    N = 1000 # number of data samples
    X = rand(N, K) # predictors (1000, 10)
    w1 = rand(K) # weights (10,)
    w0 = -median(X * w1) # 50% of elements for each class (number)
    w0, w1 = scale(w0, w1) # unit length (euclidean)
    w_true = [w0, w1...]
    y = (w0 .+ (X * w1)) .> 0.0 # labels
    y = [Float64(x) for x in y]
    σ = 5.0
    σm = [x == y ? σ : 0.0 for x in 1:K, y in 1:K]
    @model turing_model(X, y, σ, σm) = begin
        w0_pred ~ Normal(0.0, σ)
        w1_pred ~ MvNormal(σm)
        p = sigmoid.(w0_pred .+ (X * w1_pred))
        @inbounds for n in 1:length(y)
            y[n] ~ Bernoulli(p[n])
        end
    end
    @time chain = sample(turing_model(X, y, σ, σm), NUTS(iterations, 200, 0.65));
    # ϵ = 0.5
    # τ = 10
    # @time chain = sample(turing_model(X, y, σ), HMC(iterations, ϵ, τ));
    return (w_true=w_true, chains=chain::Chains)
end
chain = do_it(1000)
I have a long vector V and a large matrix M. What I want to compute is shown in the Julia code below.
using LinearAlgebra
function myfunction(M, V)
    n = size(V, 1)
    sum = 0
    summ = 0
    for i = 1:n-1
        for j = i+1:n
            a = [i, j]
            Y = V[a]
            X = M[a, a]
            sum += Y' * inv(X) * Y
            summ += tr(X) * Y' * Y
        end
    end
    return sum, summ
end
M = randn(10000, 10000)
V = randn(10000)
@time myfunction(M, V)
Since the vector is very long and the matrix is very large, this procedure takes a long time. I spent a long time on this issue. I really appreciate your help!
I would just manually unroll the calculations to avoid allocations:
function myfunction2(M::AbstractMatrix{T}, V::AbstractVector{T}) where {T}
    n = size(V, 1)
    sum = zero(T)
    summ = zero(T)
    for i = 2:n
        for j = 1:i-1
            @inbounds y1, y2 = V[i], V[j]
            y11 = y1 * y1
            y12 = y1 * y2
            y22 = y2 * y2
            @inbounds a, b, c, d = M[i,i], M[i,j], M[j,i], M[j,j]
            sum += (d * y11 - (c + b) * y12 + a * y22) / (a * d - b * c)
            summ += (a + d) * (y11 + y22)
        end
    end
    return sum, summ
end
(note that I make explicit assumptions about M and V)
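(For reference, the unrolled expressions come from the closed-form 2×2 inverse: with X = [a b; c d], inv(X) = [d -b; -c a] / (a*d - b*c), so Y' * inv(X) * Y = (d*y1^2 - (b + c)*y1*y2 + a*y2^2) / (a*d - b*c), and tr(X) * Y' * Y = (a + d) * (y1^2 + y2^2), which is exactly what the loop body computes.)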
EDIT: this is minimally faster
function myfunction3(M::AbstractMatrix{T}, V::AbstractVector{T}) where {T}
    n = size(V, 1)
    sum = zero(T)
    summ = zero(T)
    for i = 2:n
        @inbounds y1 = V[i]
        @inbounds a = M[i,i]
        y11 = y1 * y1
        for j = 1:i-1
            @inbounds y2 = V[j]
            y12 = y1 * y2
            y22 = y2 * y2
            @inbounds b, c, d = M[i,j], M[j,i], M[j,j]
            sum += (d * y11 - (c + b) * y12 + a * y22) / (a * d - b * c)
            summ += (a + d) * (y11 + y22)
        end
    end
    return sum, summ
end
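A quick sanity check that the unrolled versions match the original (a sketch; agreement is only up to floating-point rounding, hence ≈ rather than ==):

M = randn(1000, 1000)
V = randn(1000)
all(myfunction(M, V) .≈ myfunction2(M, V))  # expected: true
all(myfunction2(M, V) .≈ myfunction3(M, V)) # expected: true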
I am trying to call both functions, starting with sort_exercise.
# reference https://www.geeksforgeeks.org/merge-sort/
# Merges two subarrays of A[]
# First subarray is A[p..m]
# Second subarray is A[m+1..r]
julia> function sort_exercise(A::Vector{Int}, p, m, r)
           n1 = m - p + 1
           n2 = r - m
           # create temp arrays
           L = zeros(Int, n1)
           R = zeros(Int, n2)
           # copy data to temp arrays L[] and R[]
           for i = 1:n1
               L[i] = A[p + i]
           end
           for j = 1:n2
               R[j] = A[m + 1 + j]
           end
           # Merge temp arrays back to A[1..r]
           i = 0 # Initial index of first subarray
           j = 0 # Initial index of second subarray
           k = p # Initial index of merged subarray
           while i < n1; j < n2
               if L[i] <= R[j]
                   A[k] = L[i]
                   i += 1
               else
                   A[k] = R[j]
                   j += 1
               end
               k += 1
           end
           # Copy any possible remaining elements of L[]
           while i < n1
               A[k] = L[i]
               i += 1
               k += 1
           end
           # Copy any possible remaining elements of R[]
           while j < n2
               A[k] = R[j]
               j += 1
               k += 1
           end
       end
sort_exercise (generic function with 1 method)
julia> sort_exercise([4, 5, 22, 1, 3], 1, 3, 5)
ERROR: BoundsError: attempt to access 5-element Array{Int64,1} at index [6]
Stacktrace:
[1] sort_exercise(::Array{Int64,1}, ::Int64, ::Int64, ::Int64) at ./REPL[1]:14
julia> function merge_exercise(A::Vector{Int}, p, r)
           if p < r
               # equivalent to `(p + r) / 2` w/o overflow for big p and h (no idea what h is)
               m = (p + (r - 1)) / 2
               # merge first half
               merge_exercise(A, p, m)
               # with second half
               merge_exercise(A, m + 1, r)
               # sort merged halves
               sort_exercise(A, p, m, r)
           end
       end
merge_exercise (generic function with 1 method)
It seems that you have translated the Python code.
In fact, in Python L = [0] * n1 creates an array of size n1 filled with 0s. In Julia you can use L = zeros(Int, n1) to accomplish the same.
L = zeros(Int, 1) * n1 is just the array [0]; hence the out-of-bounds error.
Note that for i in range(1,n1) can also be written as for i = 1:n1.
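For reference, here is a corrected sketch of the whole translation (my adaptation, not the original Python): shift the copies and the midpoint to Julia's 1-based indexing, and use && plus integer division so the indices stay integers and in bounds.

function sort_exercise(A::Vector{Int}, p, m, r)
    # copy the two halves A[p..m] and A[m+1..r] into temp arrays
    L = A[p:m]
    R = A[m+1:r]
    i = 1; j = 1; k = p
    while i <= length(L) && j <= length(R)
        if L[i] <= R[j]
            A[k] = L[i]; i += 1
        else
            A[k] = R[j]; j += 1
        end
        k += 1
    end
    while i <= length(L)  # copy remaining elements of L
        A[k] = L[i]; i += 1; k += 1
    end
    while j <= length(R)  # copy remaining elements of R
        A[k] = R[j]; j += 1; k += 1
    end
    return A
end

function merge_exercise(A::Vector{Int}, p = 1, r = length(A))
    if p < r
        m = (p + r) ÷ 2  # integer midpoint; / would produce a Float64 index
        merge_exercise(A, p, m)
        merge_exercise(A, m + 1, r)
        sort_exercise(A, p, m, r)
    end
    return A
end

merge_exercise([4, 5, 22, 1, 3])  # returns [1, 3, 4, 5, 22]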