How to break out of nested for loops in Julia

I have tried to break out of nested loops in a rather inelegant way:
BreakingPoint = false
a = ["R1", "R2", "R3"]
b = ["R2", "R3", "R4"]
for i in a
    for j in b
        if i == j
            BreakingPoint = true
            println("i = $i, j = $j.")
        end
        if BreakingPoint == true; break; end
    end
    if BreakingPoint == true; break; end
end
Is there an easier way to do that? In my actual problem, I know nothing about the contents of arrays a and b, apart from the fact that they hold ASCIIStrings. The array names (a and b in the sample code) are also auto-generated through metaprogramming.

You can do one of two things:
1. Write the loops as a single for statement with several iterators (if that's what it's called); a break then exits the whole nest at once:
for i in a, j in b
    if i == j
        break
    end
end
This is clean, but not always possible.
2. I will be crucified for suggesting this, but you can use @goto and @label:
for i in a
    for j in b
        if i == j
            @goto escape_label
        end
    end
end
@label escape_label
If you go the @goto/@label way, document your use properly for the sake of the people maintaining and reviewing the code, as navigating code with labels is breathtakingly annoying.
For the discussion on the multi-loop break, see this
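To see that a break in the combined form really leaves both loops, here is a small sketch using the arrays from the question. The function name visited_until_match and the visited list are made up for illustration; they only record which pairs were reached before the break.

```julia
a = ["R1", "R2", "R3"]
b = ["R2", "R3", "R4"]

# Hypothetical helper: record every (i, j) pair visited before the first match.
function visited_until_match(a, b)
    visited = Tuple{String,String}[]
    for i in a, j in b          # j is the inner (fastest-varying) iterator
        push!(visited, (i, j))
        if i == j
            break               # exits the whole `for i in a, j in b` nest
        end
    end
    return visited
end

pairs_seen = visited_until_match(a, b)
println(pairs_seen)
```

If the break only left the inner loop, pairs starting with "R3" would also show up in the result.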

Put the 2D loop into a function, and do an early return when you want to break.
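A minimal sketch of this suggestion, using the arrays from the question. The function name first_common is made up for illustration, and returning nothing when no match exists is an assumption about what the caller wants.

```julia
# Hypothetical helper: return the first matching pair, or `nothing` if there is none.
function first_common(a, b)
    for i in a
        for j in b
            if i == j
                return (i, j)   # early return leaves both loops at once
            end
        end
    end
    return nothing
end

a = ["R1", "R2", "R3"]
b = ["R2", "R3", "R4"]
result = first_common(a, b)
println(result)
```

Wrapping the loops in a function also keeps the loop variables out of global scope, which usually helps performance in Julia.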

Related

looping over an array and checking its elements

I have an array 'A'. I want to do a loop over all the elements of A, checking to see if any are greater than or equal to 1. If they are, I would like to assign a '1' to a new array 'B' in the same element index of A.
How would I go about implementing this?
I have the cumbersome idea of:
for i in 1:size(A, 1)
    for j in 1:size(A, 2)
        if A[i,j] >= 1
            B[i,j] = 1
        else
            B[i,j] = 0
        end
    end
end
but I would prefer something more succinct.
Just use broadcast:
B = A .≥ 1
You can certainly use broadcasting as Oscar suggested (e.g. B = A .>= 1), but there's also nothing wrong with loops, since loops are fast and avoid excess allocations. You really only need one loop though, and the if statement is slightly superfluous, so:
B = similar(A, Int64) # If B doesn't already exist, otherwise omit this line
@inbounds for i in eachindex(A)
    B[i] = A[i] >= 1
end
The @inbounds is optional; it skips bounds checking, which can improve speed.
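For comparison, here is a small sketch that runs both approaches on a made-up 2x2 matrix (the values in A are only an example). The broadcast version yields a Bool array, while the loop version fills an Int array with 0s and 1s.

```julia
A = [0.5 1.0; 2.0 -3.0]          # example data, not from the question

# Broadcasting: elementwise comparison, result is a BitMatrix
B_broadcast = A .>= 1

# Single loop over all indices, storing 0/1 in a preallocated Int array
B_loop = similar(A, Int)
for i in eachindex(A)
    B_loop[i] = A[i] >= 1        # Bool is converted to Int on assignment
end

println(B_broadcast)
println(B_loop)
```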

How to make use of Threads optional in a Julia function

I have a function that optionally uses threads for its main loop, doing so when an argument usingthreads is true. At the moment, the code looks like this:
function dosomething(usingthreads::Bool)
    n = 1000
    if usingthreads
        Threads.@threads for i = 1:n
            #20 lines of code here
        end
    else
        for i = 1:n
            #same 20 lines of code repeated here
        end
    end
end
Less nasty than the above would be to put the "20 lines" in a separate function. Is there another way?
You could use a macro that changes its behavior depending on the result of Threads.nthreads():
macro maybe_threaded(ex)
    if Threads.nthreads() == 1
        return esc(ex)
    else
        return esc(:(Threads.@threads $ex))
    end
end
Without threading, this macro will be a no-op:
julia> @macroexpand @maybe_threaded for i in 1:5
           print(i)
       end
:(for i = 1:5
      #= REPL[2]:2 =#
      print(i)
  end)
But when threading is enabled and e.g. JULIA_NUM_THREADS=4 it will expand to the threaded version:
julia> @maybe_threaded for i in 1:5
           print(i)
       end
41325
Edit: Upon rereading the question, I realize this doesn't really answer it but it might be useful anyway.
You can use ThreadsX as suggested in this discourse link.
The answer from the thread (all credit to oxinabox):
using ThreadsX
function foo(multi_thread=true)
    _foreach = multi_thread ? ThreadsX.foreach : Base.foreach
    _foreach(1:10) do ii
        @show ii
    end
end
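For reference, the "separate function" option mentioned in the question could look roughly like the sketch below, using only Base. The helper process! and the result vector out are hypothetical stand-ins for the "20 lines"; each iteration writes to its own slot, so the threaded branch has no data race.

```julia
# Hypothetical stand-in for the "20 lines": fill one slot of a result vector.
process!(out, i) = (out[i] = i^2)

function dosomething(usingthreads::Bool; n = 1000)
    out = zeros(Int, n)
    if usingthreads
        Threads.@threads for i in 1:n
            process!(out, i)
        end
    else
        for i in 1:n
            process!(out, i)
        end
    end
    return out
end

println(dosomething(false; n = 3))
```

The loop header is still written twice, but the body lives in one place, so the two branches cannot drift apart.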

Combinatorics through Recursion: How to be functional

I am working on a combinatorics function. I want to input a string and output all possible combinations of that string using each character. For instance, this will print all the combinations of ME = {MM, ME, EM, EE}.
function combo_recursive(a)
    arr_combo = split(a, "")
    arr_size = size(arr_combo)
    arr_max = arr_size[1]
    mutab!e = []
    function combo_recurse(b)
        if b ≤ 1
            for i in 1:arr_max
                append!(mutab!e,arr_combo[i])
                println(join(mutab!e))
                pop!(mutab!e)
            end
        else
            for j in 1:arr_max
                append!(mutab!e,arr_combo[j])
                combo_recurse(b-1)
                pop!(mutab!e)
            end
        end
    end
    combo_recurse(arr_max)
end
It works fine, but I relied on a mutable array to achieve the desired result. Any recommendations on how to apply a functional style to this?
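One possible direction, offered only as a sketch: build the result recursively and return it as a new array instead of pushing to and popping from shared state. The name combos is made up for illustration, and returning a vector of strings rather than printing is an assumption about the desired interface.

```julia
# Hypothetical helper: all length-k strings over `chars`, built without mutation.
function combos(chars, k)
    k == 0 && return [""]
    # Prepend each character to every shorter combination.
    return [string(c, rest) for c in chars for rest in combos(chars, k - 1)]
end

# Convenience method: use every character of `s`, with length(s) positions.
combos(s::AbstractString) = combos(collect(s), length(s))

result = combos("ME")
println(result)
```

Base also provides Iterators.product, which can generate the same tuples lazily if the full list should not be materialized.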

Julia BoundsError when deleting items of a list while iterating over it

I would like to iterate over a list and occasionally delete items of said list. Below a toy example:
function delete_item!(myarray, item)
    deleteat!(myarray, findin(myarray, [item]))
end
n = 1000
myarray = [i for i = 1:n];
for a in myarray
    if a%2 == 0
        delete_item!(myarray, a)
    end
end
However I get error:
BoundsError: attempt to access 500-element Array{Int64,1} at index [502]
How can I fix it (as efficiently as possible)?
Additional information. The above seems like a silly example, in my original problem I have a list of agents which interact. Therefore I am not sure if iterating over a copy would be the best solution. For example:
#creating my agent
mutable struct agent <: Any
    id::Int
end
function delete_item!(myarray::Array{agent, 1}, item::agent)
    deleteat!(myarray, findin(myarray, [item]))
end
#having my list of agents
n = 1000
myarray = agent[agent(i) for i = 1:n];
#trying to remove agents from list while having them interact
for a in myarray
    #agent does stuff
    if a.id%2 == 0 #if something happens remove
        delete_item!(myarray, a)
    end
end
Unfortunately, there is no single answer to this question, as the most efficient approach depends on the logic of the whole model (in particular, whether other agents' actions depend on an entry actually having been deleted from the array).
In most cases the following approach should be the simplest (I am keeping findin, which is inefficient, because I understand that myarray may contain duplicates in general):
n = 1000
myarray = [i for i = 1:n];
keep = trues(n)
for (i, a) in enumerate(myarray)
    keep[i] || continue # do not process an agent that is marked for deletion
    if a%2 == 0 # here application logic might also need to check keep in some cases
        keep[findin(myarray, [a])] = false
    end
end
myarray = myarray[keep]
If for some reason you really need to delete elements of myarray in each iteration here is how you can do it:
n = 1000
myarray = [i for i = 1:n];
i = 1
while i <= length(myarray)
    a = myarray[i]
    if a%2 == 0
        todelete = findin(myarray, [a])
        i -= count(x -> x < i, todelete) # if myarray has duplicates of a you have to move the counter back
        deleteat!(myarray, todelete)
    else
        i += 1
    end
end
In general, the code you give will not be very fast. If you know that myarray contains no duplicates (and I guess you do), it can be much simpler.
EDIT: Here is how you can implement both versions if you know you do not have duplicates (you can simply use the agent's index; observe that we can also avoid unnecessary checks):
n = 1000
myarray = [i for i = 1:n];
keep = trues(n)
for (i, a) in enumerate(myarray)
    if a%2 == 0 # here application logic might also need to check keep in some cases
        keep[i] = false
    end
end
myarray = myarray[keep]
If for some reason you really need to delete elements of myarray in each iteration here is how you can do it:
n = 1000
myarray = [i for i = 1:n];
i = 1
while i <= length(myarray)
    a = myarray[i]
    if a%2 == 0
        deleteat!(myarray, i)
    else
        i += 1
    end
end
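When the removal condition depends only on the element itself, as in the toy example, a shorter option is Base's filter!, which deletes in place in a single pass. This is a sketch under that assumption and does not cover agents whose removal depends on interactions during the loop. The struct name Agent is capitalized here only for illustration. Note also that the findin used above was removed in Julia 1.0, where findall(in(b), a) is the replacement.

```julia
# Plain integers: keep the odd entries, delete the even ones in place.
myarray = collect(1:1000)
filter!(a -> a % 2 != 0, myarray)

# Same idea for a list of agents (hypothetical struct mirroring the question).
mutable struct Agent
    id::Int
end
agents = [Agent(i) for i in 1:1000]
filter!(ag -> ag.id % 2 != 0, agents)

println(length(myarray), " ", length(agents))
```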

Identify which rows (or columns) have values in sparse Matrix

I need to identify the rows (/columns) that have defined values in a large sparse Boolean Matrix. I want to use this to 1. slice (actually view) the Matrix by those rows/columns; and 2. slice (/view) vectors and matrices that have the same dimensions as the margins of a Matrix. I.e. the result should probably be a Vector of indices / Bools or (preferably) an iterator.
I've tried the obvious:
a = sprand(10000, 10000, 0.01)
cols = unique(a.colptr)
rows = unique(a.rowval)
but each of these takes about 20 ms on my machine, probably because they allocate about 1 MB (at least they allocate cols and rows). This is inside a performance-critical function, so I'd like the code to be optimized. The Base code seems to have an nzrange iterator for sparse matrices, but it is not easy for me to see how to apply that to my case.
Is there a suggested way of doing this?
Second question: I'd need to also perform this operation on views of my sparse Matrix - would that be something like x = view(a,:,:); cols = unique(x.parent.colptr[x.indices[:,2]]) or is there specialized functionality for this? Views of sparse matrices appear to be tricky (cf https://discourse.julialang.org/t/slow-arithmetic-on-views-of-sparse-matrices/3644 – not a cross-post)
Thanks a lot!
Regarding getting the non-zero rows and columns of a sparse matrix, the following functions should be pretty efficient:
nzcols(a::SparseMatrixCSC) = collect(i for i in 1:a.n if a.colptr[i]<a.colptr[i+1])
function nzrows(a::SparseMatrixCSC)
    active = falses(a.m)
    for r in a.rowval
        active[r] = true
    end
    return find(active)
end
For a 10_000x10_000 matrix with 0.1 density it takes 0.2 ms and 2.9 ms for cols and rows, respectively. It should also be quicker than the method in the question (which has a correctness issue as well).
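The functions above use find and direct field access, which matches older Julia versions. On Julia 1.x the same idea could be sketched as follows, using the SparseArrays standard library with its accessors nzrange and rowvals and with findall in place of find. The small test matrix is made up for illustration. As with the original, "has values" here means "has stored entries", so explicitly stored zeros (or false values) would still count.

```julia
using SparseArrays

# Columns that own at least one stored entry.
nzcols(a::SparseMatrixCSC) = [j for j in 1:size(a, 2) if !isempty(nzrange(a, j))]

# Rows that appear at least once among the stored row indices.
function nzrows(a::SparseMatrixCSC)
    active = falses(size(a, 1))
    for r in rowvals(a)
        active[r] = true
    end
    return findall(active)
end

# 4x6 Boolean matrix with stored entries at (1,2), (3,2) and (3,5)
a = sparse([1, 3, 3], [2, 2, 5], [true, true, true], 4, 6)
println(nzcols(a))
println(nzrows(a))
```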
Regarding views of sparse matrices, a quick solution would be to turn view into a sparse matrix (e.g. using b = sparse(view(a,100:199,100:199))) and use functions above. In code:
nzcols(b::SubArray{T,2,P}) where {T,P<:AbstractSparseArray} = nzcols(sparse(b))
nzrows(b::SubArray{T,2,P}) where {T,P<:AbstractSparseArray} = nzrows(sparse(b))
A better solution would be to customize the functions according to view. For example, when the view uses UnitRanges for both rows and columns:
# utility predicate returning true if element of sorted v in range r
inrange(v,r) = searchsortedlast(v,last(r))>=searchsortedfirst(v,first(r))
function nzcols(b::SubArray{T,2,P,Tuple{UnitRange{Int64},UnitRange{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    return collect(i+1-start(b.indexes[2])
        for i in b.indexes[2]
        if b.parent.colptr[i]<b.parent.colptr[i+1] &&
            inrange(b.parent.rowval[nzrange(b.parent,i)],b.indexes[1]))
end
function nzrows(b::SubArray{T,2,P,Tuple{UnitRange{Int64},UnitRange{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    active = falses(length(b.indexes[1]))
    for c in b.indexes[2]
        for r in nzrange(b.parent,c)
            if b.parent.rowval[r] in b.indexes[1]
                active[b.parent.rowval[r]+1-start(b.indexes[1])] = true
            end
        end
    end
    return find(active)
end
These work faster than the versions for the full matrices: for a 100x100 submatrix of the above 10,000x10,000 matrix, cols and rows take 16μs and 12μs, respectively, on my machine, but these are unstable results.
A proper benchmark would use fixed matrices (or at least fix the random seed). I'll edit this line with such a benchmark if I do it.
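The inrange predicate used above can be tried on its own. It does two binary searches on a sorted vector and reports whether any element falls inside the range; the sample vector below is made up for illustration.

```julia
# true if some element of the sorted vector v lies within the range r
inrange(v, r) = searchsortedlast(v, last(r)) >= searchsortedfirst(v, first(r))

v = [2, 5, 9]              # must be sorted
println(inrange(v, 3:6))   # 5 lies in 3:6
println(inrange(v, 6:8))   # nothing between 6 and 8
```

Because the stored row indices within one column of a SparseMatrixCSC are sorted, this check costs two binary searches per column rather than a full scan.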
If the indices are not ranges, the fallback of converting to a sparse matrix works, but here are versions for indices which are Vectors. If the indices are mixed, yet another set of versions needs to be made. This is quite repetitive, but it is also a strength of Julia: once the versions are written, dispatch picks the optimized method from the argument types at the call site without much further effort.
function sortedintersecting(v1, v2)
    i,j = start(v1), start(v2)
    while i <= length(v1) && j <= length(v2)
        if v1[i] == v2[j] return true
        elseif v1[i] > v2[j] j += 1
        else i += 1
        end
    end
    return false
end
function nzcols(b::SubArray{T,2,P,Tuple{Vector{Int64},Vector{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    brows = sort(unique(b.indexes[1]))
    return [k
        for (k,i) in enumerate(b.indexes[2])
        if b.parent.colptr[i]<b.parent.colptr[i+1] &&
            sortedintersecting(brows,b.parent.rowval[nzrange(b.parent,i)])]
end
function nzrows(b::SubArray{T,2,P,Tuple{Vector{Int64},Vector{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    active = falses(length(b.indexes[1]))
    for c in b.indexes[2]
        active[findin(b.indexes[1],b.parent.rowval[nzrange(b.parent,c)])] = true
    end
    return find(active)
end
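The two-pointer test behind sortedintersecting can be checked in isolation. The sketch below is a Julia 1.x variant, where start is no longer available, so it begins at firstindex instead. The name sorted_intersecting and the sample vectors are made up for illustration, and both inputs are assumed to be sorted in ascending order.

```julia
# true if two ascending-sorted vectors share at least one element
function sorted_intersecting(v1, v2)
    i, j = firstindex(v1), firstindex(v2)
    while i <= lastindex(v1) && j <= lastindex(v2)
        if v1[i] == v2[j]
            return true
        elseif v1[i] > v2[j]
            j += 1          # v2 is behind, advance it
        else
            i += 1          # v1 is behind, advance it
        end
    end
    return false
end

println(sorted_intersecting([1, 4, 9], [2, 4, 6]))
println(sorted_intersecting([1, 3], [2, 4]))
```

Each step advances one of the two cursors, so the check takes at most length(v1) + length(v2) comparisons.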
-- ADDENDUM --
Since it was noted nzrows for Vector{Int} indices is a bit slow, this is an attempt to improve its speed by replacing findin with a version exploiting sortedness:
function findin2(inds,v,w)
    i,j = start(v),start(w)
    res = Vector{Int}()
    while i<=length(v) && j<=length(w)
        if v[i]==w[j]
            push!(res,inds[i])
            i += 1
        elseif (v[i]<w[j]) i += 1
        else j += 1
        end
    end
    return res
end
function nzrows(b::SubArray{T,2,P,Tuple{Vector{Int64},Vector{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    active = falses(length(b.indexes[1]))
    inds = sortperm(b.indexes[1])
    brows = (b.indexes[1])[inds]
    for c in b.indexes[2]
        active[findin2(inds,brows,b.parent.rowval[nzrange(b.parent,c)])] = true
    end
    return find(active)
end
