Julia: Searching for a column in a sorted matrix

I have a matrix that is sorted like the one shown below
1 1 2 2 3
1 2 3 4 1
2 1 2 1 1
It's a bit hard for me to describe the ordering, but hopefully it's clear from the example. The rough idea is that we first sort on the first row, then the second, etc.
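To make the ordering concrete, it is the column order that sortslices produces along dims=2 (this snippet is only illustrative):
M = [1 1 2 2 3;
     1 2 3 4 1;
     2 1 2 1 1]
sortslices(M, dims=2) == M   # true: the columns are already sorted lexicographically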
I would like to find a specific column in the matrix, and that column may or may not exist in it.
I tried the following code:
index = searchsortedfirst(1:total_cols, col, lt=(index,x) -> (matrix[:, index] < x))
The above code works, but it is slow. I profiled the code, and it spends a lot of time in "_get_index". I then tried the following
@views index = searchsortedfirst(1:total_cols, col, lt=(index,x) -> (matrix[:, index] < x))
As expected this helped a lot, likely due to the slices I'm taking. However, is there a better way to go about this? There still seems to be a lot of overhead, and I feel like there might be a cleaner way to write this, which would be easier to optimize.
That said, I absolutely value speed over clarity.
Here is some code I wrote to compare binary vs. linear search.
using Profile
function test_search()
    max_val = 20
    rows = 4
    matrix = rand(1:max_val, rows, 10^5)
    matrix = Array{Int64,2}(sortslices(matrix, dims=2))
    indices = @time @profile lin_search(matrix, rows, max_val, 10^3)
    indices = @time @profile bin_search(matrix, rows, max_val, 10^3)
end
function bin_search(matrix, rows, max_val, repeats)
    indices = zeros(repeats)
    x = zeros(Int64, rows)
    cols = size(matrix)[2]
    for i = 1:repeats
        x = rand(1:max_val, rows)
        @inbounds @views index = searchsortedfirst(1:cols, x, lt=(index,x)->(matrix[:,index] < x))
        indices[i] = index
    end
    return indices
end
function array_eq(matrix, index, y, rows)
    for i = 1:rows
        @inbounds if matrix[i, index] != y[i]
            return false
        end
    end
    return true
end
function lin_search(matrix, rows, max_val, repeats)
    indices = zeros(repeats)
    x = zeros(Int64, rows)
    cols = size(matrix)[2]
    for i = 1:repeats
        index = cols + 1
        x = rand(1:max_val, rows)
        for j = 1:cols
            if array_eq(matrix, j, x, rows)
                index = j
                break
            end
        end
        indices[i] = index
    end
    return indices
end
Profile.clear()
test_search()
Here is some sample output
0.041356 seconds (68.90 k allocations: 3.431 MiB)
0.070224 seconds (110.45 k allocations: 5.418 MiB)
After adding some more @inbounds, it looks like linear search is faster than binary search, which seems strange when there are 10^5 columns.

If speed is most important, why not simply use the fact that Julia allows you to write fast loops?
julia> function findcol(M, col)
           @inbounds @views for c in axes(M, 2)
               M[:,c] == col && return c
           end
           return nothing
       end
findcol (generic function with 1 method)
julia> col = [2,3,2];
julia> M = [1 1 2 2 3;
            1 2 3 4 1;
            2 1 2 1 1];
julia> using BenchmarkTools
julia> @btime findcol($M, $col)
  32.854 ns (3 allocations: 144 bytes)
3
This should probably be fast enough and does not even take into account any ordering.

I discovered two issues that, when fixed, make both the linear and binary searches much faster, with the binary search becoming faster than the linear one.
First, there was some type instability. I changed one of the lines to
matrix::Array{Int64,2} = Array{Int64,2}(sortslices(matrix, dims=2))
This resulted in an order of magnitude speedup. It also turns out that using @views does not do anything in the following code:
@inbounds @views index = searchsortedfirst(1:cols, x, lt=(index,x)->(matrix[:,index] < x))
I am new to Julia, but my hunch is that matrix[:,index] is copied no matter what inside the anonymous function. This would make sense, since it allows for closures.
If I write a separate non-anonymous function, then that copy goes away. The linear search didn't copy the slices, so this change really sped up the binary search.
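For reference, here is a minimal sketch of what I mean (the names col_less and bin_search_col are just illustrative): the comparison goes through a small named helper that builds an explicit view, so searchsortedfirst never copies a column.
col_less(matrix, index, x) = view(matrix, :, index) < x

function bin_search_col(matrix, x)
    cols = size(matrix, 2)
    return searchsortedfirst(1:cols, x, lt=(i, y) -> col_less(matrix, i, y))
end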

Related

Fast summation of symmetric matrix in Julia

I want to sum all elements in a matrix A with dimension n × n. The matrix is symmetric and has 0s on the diagonal. The fastest way to do so that I have found is simply
sum(A). However, this seems wasteful since it doesn't use the fact that I only need to calculate the lower triangle of the matrix. Yet sum(tril(A, -1)) is significantly slower, and sum(A[i, j] for i = 1:n-1 for j = i+1:n) even more so. Is there a more efficient way to sum the matrix?
Edit: The solution by @AboAmmar performs well. Here is code (with the diagonal summed separately, something that can be removed if there are only zeros on the diagonal) to compare:
using BenchmarkTools
using LinearAlgebra
function sum_triu(A)
    m, n = size(A)
    @assert m == n
    s = zero(eltype(A))
    for j = 2:n
        @simd for i = 1:j-1
            s += @inbounds A[i,j]
        end
    end
    s *= 2
    for i = 1:n
        s += A[i, i]
    end
    return s
end
N = 1000
A = Symmetric(rand(0:9, N, N))
A -= diagm(diag(A))
@btime sum(A)
@btime 2 * sum(tril(A))
@btime sum_triu(A)
This is 2.7X faster than sum for an n = 1000 matrix. Make sure to add a @simd before the inner loop and use @inbounds. Also, use the correct loop order for fast memory access.
function sum_triu(A)
    m, n = size(A)
    @assert m == n
    s = zero(eltype(A))
    for j = 1:n
        @simd for i = 1:j
            s += @inbounds A[i,j]
        end
    end
    return 2 * s
end
Example run on my PC:
sum_triu(A) = 499268.7328022966
sum(A) = 499268.73280229873
93.000 μs (0 allocations: 0 bytes)
249.900 μs (0 allocations: 0 bytes)
How about
2 * sum(LowerTriangular(A))
help?> LA.LowerTriangular
LowerTriangular(A::AbstractMatrix)
Construct a LowerTriangular view of the matrix A.
tril creates a new matrix, which allocates memory. Since a LowerTriangular is a view into the existing matrix, there's no memory allocation.
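To see the difference on the A defined in the benchmark above, something like this works (a quick sketch; exact timings will vary):
using BenchmarkTools, LinearAlgebra
@btime 2 * sum(tril($A));              # tril materializes a new N×N matrix
@btime 2 * sum(LowerTriangular($A));   # the wrapper reuses A's memory, no copy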

Is it possible to find a few common multiples of a list of numbers, without them having to be integers?

I don't even know if something like this is possible, but:
Let us say we have three numbers:
A = 6
B = 7.5
C = 24
I would like to find a few evenly spaced common multiples of these numbers between 0 and 2.
So the requirement is: one_of_these_numbers / common_multiple = an_integer (or almost an integer with a particular tolerance)
For example, a good result would be [0.1 , 0.5 , 1 , 1.5]
I have no idea if this is possible, because one can not iterate through a range of floats, but is there a smart way to do it?
I am using python, but a solution could be represented in any language of your preference.
Thank you for your help!
While I was writing my question, I actually came up with an idea for the solution.
To find common divisors using code, we have to work with integers.
My solution is to multiply all numbers by a factor = 1, 10, 100, ...
so that we can act as if they are integers, find their integer common divisors, and then redivide them by the factor to get a result.
Better explained in code:
a = 6
b = 7.5
c = 24
# Find a few possible divisors between 0 and 2 so that all numbers are divisible by div.
# We define a function that finds all divisors in a range of numbers, supposing all numbers are integers.
def find_common_divisors(numbers, range_start, range_end):
    results = []
    for i in range(range_start + 1, range_end + 1):
        if all([e % i == 0 for e in numbers]):
            results.append(i)
    return results

def main():
    nums = [a, b, c]
    range_start = 0
    range_end = 2
    factor = 1
    results = [1]
    while factor < 11:
        nums_i = [e * factor for e in nums]
        range_end_i = range_end * factor
        results += [e / factor for e in find_common_divisors(nums_i, range_start, range_end_i)]
        factor *= 10
    print(sorted(set(results)))

if __name__ == '__main__':
    main()
For these particular numbers, I get the output:
[0.1, 0.3, 0.5, 1, 1.5]
If we need more results, we can adjust while factor < 11: to a higher number than 11 like 101.
I am curious to see if I made any mistake in my code.
Happy to hear some feedback.
Thank you!
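Since the rest of this page is Julia, here is a rough Julia sketch of the same scaling idea (illustrative only; the function names are made up, and it does not seed the results with 1, so the printed list differs slightly from the Python output):
# Divisors i in range_start+1:range_end that divide every number exactly.
function find_common_divisors(numbers, range_start, range_end)
    return [i for i in range_start+1:range_end if all(e % i == 0 for e in numbers)]
end

# Scale the numbers by 1, 10, 100, ... so they behave like integers,
# collect the divisors at each scale, and rescale the results back.
function common_divisors_scaled(nums, range_end)
    results = Float64[]
    factor = 1
    while factor < 11
        nums_i = nums .* factor
        append!(results, find_common_divisors(nums_i, 0, range_end * factor) ./ factor)
        factor *= 10
    end
    return sort(unique(results))
end

common_divisors_scaled([6, 7.5, 24], 2)   # [0.1, 0.3, 0.5, 1.5]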

In Julia, how to skip certain numbers in a for loop

In the Julia for loop below, I want to skip the numbers that are divisible by 500. For example, I want to skip i = 500, 1000, 1500, 2000, 2500, ..., 10000. How can I do that?
n = 10000
result = zeros(n)
for i = 1:n
    result[i] = i
end
just use continue:
for i = 1:n
    iszero(i % 500) && continue
    result[i] = i
end
A more efficient option than continue would be to use Iterators.filter to create a lazy iterator which filters the values you don't require.
For the example given in the OP, this would be:
n = 10000
result = zeros(n)
iter = Iterators.filter(i -> (i % 500) != 0, 1:n)
for i = iter
    result[i] = i
end
A benchmark comparing with @jling's answer:
julia> @btime for i = 1:n
           iszero(i % 500) && continue
           result[i] = i
       end
1.693 ms (28979 allocations: 609.06 KiB)
julia> @btime for i = iter
           result[i] = i
       end
1.177 ms (28920 allocations: 607.81 KiB)
While the answer by @jling is perfectly correct, sometimes you have a list of specific values that you want to skip. In such cases Not may be an elegant solution:
julia> using InvertedIndices
julia> skipvals = [2,4,5];
julia> for i in (1:7)[Not(skipvals)]
           print("$i ")
       end
1 3 6 7
Not is part of the InvertedIndices package, but it is also re-exported by DataFrames, so in most data science code it is already there for you to use :)

Improving the speed of a for loop in Julia

Here is my code on the Julia platform, and I would like to speed it up. Is there any way I can make this faster? It takes 0.5 seconds for a dataset of 50k*50k. I was expecting Julia to be a lot faster than this, or maybe I am doing a silly implementation.
ar = [[1,2,3,4,5], [2,3,4,5,6,7,8], [4,7,8,9], [9,10], [2,3,4,5]]
SV = rand(10,5)
function h_score_0(ar, SV)
    m = length(ar)
    SC = Array{Float64,2}(undef, size(SV, 2), m)
    for iter = 1:m
        nodes = ar[iter]
        for jj = 1:size(SV, 2)
            mx = maximum(SV[nodes, jj])
            mn = minimum(SV[nodes, jj])
            term1 = (mx - mn)^2
            SC[jj, iter] = term1
        end
    end
    return score = sum(SC, dims = 1)
end
You have some unnecessary allocations in your code:
mx = maximum(SV[nodes, jj])
mn = minimum(SV[nodes, jj])
Slices allocate, so each of these lines makes a copy of the data; you're actually copying the data twice, once on each line. You can either make sure to copy only once, or, even better, use view so there is no copy at all (note that view is much faster on Julia v1.5, in case you are using an older version).
SC = Array{Float64,2}(undef, size(SV, 2), m)
And no reason to create a matrix here, and sum over it afterwards, just accumulate while you are iterating:
score[i] += (mx - mn)^2
Here's a function that is >5x as fast on my laptop for the input data you specified:
function h_score_1(ar, SV)
    score = zeros(eltype(SV), length(ar))
    @inbounds for i in eachindex(ar)
        nodes = ar[i]
        for j in axes(SV, 2)
            SVview = view(SV, nodes, j)
            mx = maximum(SVview)
            mn = minimum(SVview)
            score[i] += (mx - mn)^2
        end
    end
    return score
end
This function outputs a one-dimensional vector instead of a 1xN matrix in your original function.
In principle, this could be even faster if we replace
mx = maximum(SVview)
mn = minimum(SVview)
with
(mn, mx) = extrema(SVview)
which only traverses the vector once, instead of twice. Unfortunately, there is a performance issue with extrema, so it is currently not as fast as separate maximum/minimum calls: https://github.com/JuliaLang/julia/issues/31442
Finally, for absolutely getting the best performance at the cost of brevity, we can avoid creating a view at all and turn the calls to maximum and minimum into a single explicit loop traversal:
function h_score_2(ar, SV)
    score = zeros(eltype(SV), length(ar))
    @inbounds for i in eachindex(ar)
        nodes = ar[i]
        for j in axes(SV, 2)
            mx, mn = -Inf, +Inf
            for node in nodes
                x = SV[node, j]
                mx = ifelse(x > mx, x, mx)
                mn = ifelse(x < mn, x, mn)
            end
            score[i] += (mx - mn)^2
        end
    end
    return score
end
This also avoids the performance issue that extrema suffers, and looks up the SV element once per node. Although this version is annoying to write, it's substantially faster, even on Julia 1.5 where views are free. Here are some benchmark timings with your test data:
julia> using BenchmarkTools
julia> @btime h_score_0($ar, $SV)
2.344 μs (52 allocations: 6.19 KiB)
1×5 Matrix{Float64}:
1.95458 2.94592 2.79438 0.709745 1.85877
julia> @btime h_score_1($ar, $SV)
392.035 ns (1 allocation: 128 bytes)
5-element Vector{Float64}:
1.9545848011260765
2.9459235098820167
2.794383144368953
0.7097448590904598
1.8587691646610984
julia> @btime h_score_2($ar, $SV)
118.243 ns (1 allocation: 128 bytes)
5-element Vector{Float64}:
1.9545848011260765
2.9459235098820167
2.794383144368953
0.7097448590904598
1.8587691646610984
So explicitly writing out the innermost loop is worth it here, reducing time by another 3x or so. It's annoying that the Julia compiler isn't yet able to generate code this efficient, but it does get smarter with every version. On the other hand, the explicit loop version will be fast forever, so if this code is really performance critical, it's probably worth writing it out like this.

Sorting with parity in julia

Suppose I have the following array:
[6,3,3,5,6],
Is there an already implemented way to sort the array that also returns the number of swaps the algorithm had to make to sort it?
For instance, I have to move the first 6 three places to the right for the array to be ordered, which would give me parity -1.
The general problem would be to order an arbitrary array (all integers, possibly with repeated entries!) and to know the parity of the permutation the algorithm performed to order it.
a = [6,3,3,5,6]
sortperm(a) .- (1:length(a))
Results in
5-element Array{Int64,1}:
1
1
1
-3
0
sortperm tells you, for each position in the sorted array, which index of the original array ends up there. Subtracting 1:length(a) compares each of those indices to its position.
If your array is small, you can compute the determinant of the permutation matrix
using LinearAlgebra

function permutation_sign_1(p)
    n = length(p)
    A = zeros(n, n)
    for i in 1:n
        A[i, p[i]] = 1
    end
    det(A)
end
In general, you can decompose the permutation as a product of cycles,
count the number of even cycles, and return its parity.
function permutation_sign_2(p)
    n = length(p)
    not_seen = Set{Int}(1:n)
    seen = Set{Int}()
    cycles = Array{Int,1}[]
    while !isempty(not_seen)
        cycle = Int[]
        x = pop!(not_seen)
        while !in(x, seen)
            push!(cycle, x)
            push!(seen, x)
            x = p[x]
            pop!(not_seen, x, 0)
        end
        push!(cycles, cycle)
    end
    cycle_lengths = map(length, cycles)
    even_cycles = filter(i -> i % 2 == 0, cycle_lengths)
    length(even_cycles) % 2 == 0 ? 1 : -1
end
The parity of a permutation can also be obtained from the
number of inversions.
It can be computed by slightly modifying the merge sort algorithm.
Since it is also used to compute Kendall's tau (check less(corkendall)),
there is already an implementation.
using StatsBase
function permutation_sign_3(p)
    x = copy(p)
    number_of_inversions = StatsBase.swaps!(x)
    number_of_inversions % 2 == 0 ? +1 : -1
end
On your example, those three functions give the same result:
x = [6,3,3,5,6]
p = sortperm(x)
permutation_sign_1( p )
permutation_sign_2( p )
permutation_sign_3( p ) # -1
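If you would rather avoid the StatsBase dependency, counting inversions directly also works. A minimal sketch (O(n^2), which is fine for small permutations; the name is illustrative and not from the answers above):
function permutation_sign_naive(p)
    inversions = 0
    for i in 1:length(p)-1, j in i+1:length(p)
        # count every pair that is out of order
        inversions += p[i] > p[j]
    end
    return iseven(inversions) ? 1 : -1
end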
