looping over an array and checking its elements - julia

I have an array 'A'. I want to do a loop over all the elements of A, checking to see if any are greater than or equal to 1. If they are, I would like to assign a '1' to a new array 'B' in the same element index of A.
How would I go about implementing this?
I have the cumbersome idea of:
for i in 1:size(A, 1)
    for j in 1:size(A, 2)
        if A[i,j] >= 1
            B[i,j] = 1
        else
            B[i,j] = 0
        end
    end
end
but I would prefer something more succinct.

Just use broadcast:
B = A .≥ 1

You can certainly use broadcasting as Oscar suggested (e.g. B = A .>= 1), but there's also nothing wrong with loops, since loops are fast and avoid excess allocations. You really only need one loop though, and the if statement is slightly superfluous, so:
B = similar(A, Int64) # If B doesn't already exist, otherwise omit this line
@inbounds for i in eachindex(A)
    B[i] = A[i] >= 1
end
The @inbounds is optional, but it can improve speed by skipping bounds checks.
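As a small sketch of how the two suggestions compare, the snippet below runs both on a made-up 2x2 matrix (the values and the names B_bits, B_int and B_loop are only for illustration). Broadcasting a comparison gives an array of Bool values, so converting with Int.(...) is one way to get literal 0/1 entries.

```julia
# Illustrative input; any numeric array works the same way.
A = [0.5 1.0; 2.0 -3.0]

# Broadcasting the comparison yields an array of true/false values.
B_bits = A .>= 1

# Convert to integers if literal 0/1 entries are wanted.
B_int = Int.(A .>= 1)

# Loop version writing into a preallocated integer array.
B_loop = similar(A, Int)
@inbounds for i in eachindex(A)
    B_loop[i] = A[i] >= 1   # the Bool is converted to Int on assignment
end
```

All three arrays mark the same positions; they differ only in element type.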

Related

Get a number from an array of digits

To split a number into digits in a given base, Julia has the digits() function:
julia> digits(36, base = 4)
3-element Array{Int64,1}:
0
1
2
What's the reverse operation? If you have an array of digits and the base, is there a built-in way to convert that to a number? I could print the array to a string and use parse(), but that sounds inefficient, and also wouldn't work for bases > 10.
The previous answers are correct, but there is also the matter of efficiency:
sum([x[k]*base^(k-1) for k=1:length(x)])
collects the numbers into an array before summing, which causes unnecessary allocations. Skip the brackets to get better performance:
sum(x[k]*base^(k-1) for k in 1:length(x))
This also allocates an array before summing: sum(d.*4 .^(0:(length(d)-1)))
If you really want good performance, though, write a loop and avoid repeated exponentiation:
function undigit(d; base=10)
    s = zero(eltype(d))
    mult = one(eltype(d))
    for val in d
        s += val * mult
        mult *= base
    end
    return s
end
This has one extra unnecessary multiplication; you could try to figure out some way of skipping that. But the performance is 10-15x better than the other approaches in my tests, and it has zero allocations.
Edit: There's actually a slight risk to the type handling above. If the input vector and base have different integer types, you can get a type instability. This code should behave better:
function undigits(d; base=10)
    (s, b) = promote(zero(eltype(d)), base)
    mult = one(s)
    for val in d
        s += val * mult
        mult *= b
    end
    return s
end
The answer seems to be written directly within the documentation of digits:
help?> digits
search: digits digits! ndigits isdigit isxdigit disable_sigint
digits([T<:Integer], n::Integer; base::T = 10, pad::Integer = 1)
Return an array with element type T (default Int) of the digits of n in the given base,
optionally padded with zeros to a specified size. More significant digits are at higher
indices, such that n == sum([digits[k]*base^(k-1) for k=1:length(digits)]).
So for your case this will work:
julia> d = digits(36, base = 4);
julia> sum([d[k]*4^(k-1) for k=1:length(d)])
36
And the above code can be shortened with the dot operator:
julia> sum(d.*4 .^(0:(length(d)-1)))
36
Using foldr and muladd for maximum conciseness and efficiency
undigits(d; base = 10) = foldr((a, b) -> muladd(base, b, a), d, init=0)
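Another option, assuming Julia 1.4 or later: since digits returns the least significant digit first, the digit vector is exactly the coefficient list of a polynomial in the base, and Base's evalpoly evaluates such a polynomial with Horner's rule. The variable names below are only for illustration.

```julia
# digits returns least-significant digit first: 36 in base 4 is [0, 1, 2].
d = digits(36, base = 4)

# evalpoly(x, p) computes p[1] + p[2]*x + p[3]*x^2 + ...
n = evalpoly(4, d)

# Round trip in another base.
m = evalpoly(16, digits(255, base = 16))
```

This is the same recurrence as the foldr/muladd one-liner, just spelled with a built-in.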

Julia BoundsError when deleting items of a list while iterating over it

I would like to iterate over a list and occasionally delete items of said list. Below a toy example:
function delete_item!(myarray, item)
deleteat!(myarray, findin(myarray, [item]))
end
n = 1000
myarray = [i for i = 1:n];
for a in myarray
if a%2 == 0
delete_item!(myarray, a)
end
end
However I get error:
BoundsError: attempt to access 500-element Array{Int64,1} at index [502]
How can I fix it (as efficiently as possible)?
Additional information. The above seems like a silly example, in my original problem I have a list of agents which interact. Therefore I am not sure if iterating over a copy would be the best solution. For example:
#creating my agent
mutable struct agent <: Any
id::Int
end
function delete_item!(myarray::Array{agent, 1}, item::agent)
deleteat!(myarray, findin(myarray, [item]))
end
#having my list of agents
n = 1000
myarray = agent[agent(i) for i = 1:n];
#trying to remove agents from list while having them interact
for a in myarray
#agent does stuff
if a.id%2 == 0 #if something happens remove
delete_item!(myarray, a)
end
end
Unfortunately there is no single answer to this question, as the most efficient approach depends on the logic of the whole model (in particular, whether other agents' actions depend on an entry actually being deleted from the array).
In most cases the following approach should be the simplest (I am keeping findin, which is inefficient, because I understand that you may have duplicates in myarray in general):
n = 1000
myarray = [i for i = 1:n];
keep = trues(n)
for (i, a) in enumerate(myarray)
    keep[i] || continue # do not process an agent that is marked for deletion
    if a%2 == 0 # here application logic might also need to check keep in some cases
        keep[findin(myarray, [a])] = false
    end
end
myarray = myarray[keep]
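The snippet above targets an older Julia release. On Julia 1.x, findin no longer exists and assigning a scalar to several positions needs a broadcasted .=, so a sketch of the same mark-then-filter idea (an adaptation, not the answer's original code) could look like this:

```julia
n = 1000
myarray = collect(1:n)
keep = trues(n)                      # one flag per entry; true means "still alive"
for (i, a) in enumerate(myarray)
    keep[i] || continue              # skip entries already marked for deletion
    if a % 2 == 0
        # findall(isequal(a), myarray) replaces findin(myarray, [a]);
        # it also catches duplicates of a, if there are any.
        keep[findall(isequal(a), myarray)] .= false
    end
end
myarray = myarray[keep]
```

The array is only rebuilt once at the end, so the loop never iterates over a container that is shrinking underneath it.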
If for some reason you really need to delete elements of myarray in each iteration here is how you can do it:
n = 1000
myarray = [i for i = 1:n];
i = 1
while i <= length(myarray)
    a = myarray[i]
    if a%2 == 0
        todelete = findin(myarray, [a])
        i -= count(x -> x < i, todelete) # if myarray has duplicates of a you have to move the counter back
        deleteat!(myarray, todelete)
    else
        i += 1
    end
end
In general the code you give will not be very fast (e.g. if you know myarray does not contain duplicates it can be much simpler - and I guess you can).
EDIT: Here is how you can implement both versions if you know you do not have duplicates (you can simply use agent's index - observe that we can also avoid unnecessary checks):
n = 1000
myarray = [i for i = 1:n];
keep = trues(n)
for (i, a) in enumerate(myarray)
    if a%2 == 0 # here application logic might also need to check keep in some cases
        keep[i] = false
    end
end
myarray = myarray[keep]
If for some reason you really need to delete elements of myarray in each iteration here is how you can do it:
n = 1000
myarray = [i for i = 1:n];
i = 1
while i <= length(myarray)
    a = myarray[i]
    if a%2 == 0
        deleteat!(myarray, i)
    else
        i += 1
    end
end
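When the decision to remove an entry does not depend on which other entries have already been removed (an assumption that may not hold for interacting agents), the deletion can also be done in one pass with Base functions. A minimal sketch with illustrative variable names:

```julia
# In-place removal of every element that fails the predicate.
myarray = collect(1:1000)
filter!(a -> a % 2 != 0, myarray)

# Equivalent idea with explicit indices: collect them first, delete once.
other = collect(1:10)
deleteat!(other, findall(iseven, other))
```

Both forms avoid changing the array while a for loop is still walking over it.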

Identify which rows (or columns) have values in sparse Matrix

I need to identify the rows (/columns) that have defined values in a large sparse Boolean Matrix. I want to use this to 1. slice (actually view) the Matrix by those rows/columns; and 2. slice (/view) vectors and matrices that have the same dimensions as the margins of a Matrix. I.e. the result should probably be a Vector of indices / Bools or (preferably) an iterator.
I've tried the obvious:
a = sprand(10000, 10000, 0.01)
cols = unique(a.colptr)
rows = unique(a.rowval)
but each of these takes about 20 ms on my machine, probably because they allocate about 1MB (at least they allocate cols and rows). This is inside a performance-critical function, so I'd like the code to be optimized. The Base code seems to have an nzrange iterator for sparse matrices, but it is not easy for me to see how to apply that to my case.
Is there a suggested way of doing this?
Second question: I'd need to also perform this operation on views of my sparse Matrix - would that be something like x = view(a,:,:); cols = unique(x.parent.colptr[x.indices[:,2]]) or is there specialized functionality for this? Views of sparse matrices appear to be tricky (cf https://discourse.julialang.org/t/slow-arithmetic-on-views-of-sparse-matrices/3644 – not a cross-post)
Thanks a lot!
Regarding getting the non-zero rows and columns of a sparse matrix, the following functions should be pretty efficient:
nzcols(a::SparseMatrixCSC) = collect(i for i in 1:a.n if a.colptr[i]<a.colptr[i+1])
function nzrows(a::SparseMatrixCSC)
    active = falses(a.m)
    for r in a.rowval
        active[r] = true
    end
    return find(active)
end
For a 10_000x10_000 matrix with 0.1 density it takes 0.2ms and 2.9ms for cols and rows, respectively. It should also be quicker than the method in the question (which, in addition, has a correctness issue).
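The functions above use names from an older Julia release (find, direct field access). A sketch of the same idea for Julia 1.x, using the SparseArrays standard library accessors nzrange and rowvals and findall in place of find, is shown below; the small test matrix is made up for illustration, and no timings are claimed for this version.

```julia
using SparseArrays

# A column has stored entries when its nzrange is non-empty.
nzcols(a::SparseMatrixCSC) = [j for j in 1:size(a, 2) if !isempty(nzrange(a, j))]

# Mark every row index that appears among the stored entries.
function nzrows(a::SparseMatrixCSC)
    active = falses(size(a, 1))
    for r in rowvals(a)
        active[r] = true
    end
    return findall(active)
end

# 4x3 Boolean matrix with stored entries at (1,2) and (3,2).
a = sparse([1, 3], [2, 2], [true, true], 4, 3)
cols = nzcols(a)
rows = nzrows(a)
```

Note that both functions report stored entries, so an explicitly stored zero would still count.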
Regarding views of sparse matrices, a quick solution would be to turn view into a sparse matrix (e.g. using b = sparse(view(a,100:199,100:199))) and use functions above. In code:
nzcols(b::SubArray{T,2,P}) where {T,P<:AbstractSparseArray} = nzcols(sparse(b))
nzrows(b::SubArray{T,2,P}) where {T,P<:AbstractSparseArray} = nzrows(sparse(b))
A better solution would be to customize the functions according to view. For example, when the view uses UnitRanges for both rows and columns:
# utility predicate returning true if element of sorted v in range r
inrange(v,r) = searchsortedlast(v,last(r))>=searchsortedfirst(v,first(r))
function nzcols(b::SubArray{T,2,P,Tuple{UnitRange{Int64},UnitRange{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    return collect(i+1-start(b.indexes[2])
        for i in b.indexes[2]
        if b.parent.colptr[i]<b.parent.colptr[i+1] &&
            inrange(b.parent.rowval[nzrange(b.parent,i)],b.indexes[1]))
end
function nzrows(b::SubArray{T,2,P,Tuple{UnitRange{Int64},UnitRange{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    active = falses(length(b.indexes[1]))
    for c in b.indexes[2]
        for r in nzrange(b.parent,c)
            if b.parent.rowval[r] in b.indexes[1]
                active[b.parent.rowval[r]+1-start(b.indexes[1])] = true
            end
        end
    end
    return find(active)
end
which work faster than the versions for the full matrices (for 100x100 submatrix of above 10,000x10,000 matrix cols and rows take 16μs and 12μs, respectively on my machine, but these are unstable results).
A proper benchmark would use fixed matrices (or at least fix the random seed). I'll edit this line with such a benchmark if I do it.
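For readers on Julia 1.x, where start, find and the indexes field are no longer available, the range-view versions can be sketched with parent and parentindices instead. The names nzcols_view and nzrows_view are hypothetical, the code assumes both view indices are unit ranges, and it has not been benchmarked against the versions above.

```julia
using SparseArrays

# Local column numbers of the view that contain a stored entry within the viewed rows.
function nzcols_view(b::SubArray)
    p = parent(b)
    rows, cols = parentindices(b)    # assumed to be unit ranges
    rv = rowvals(p)
    return [k for (k, j) in enumerate(cols)
            if any(idx -> rv[idx] in rows, nzrange(p, j))]
end

# Local row numbers of the view that contain a stored entry within the viewed columns.
function nzrows_view(b::SubArray)
    p = parent(b)
    rows, cols = parentindices(b)
    rv = rowvals(p)
    active = falses(length(rows))
    for j in cols, idx in nzrange(p, j)
        r = rv[idx]
        if r in rows
            active[r - first(rows) + 1] = true   # shift to view-local numbering
        end
    end
    return findall(active)
end

# 6x6 matrix with stored entries at (1,2), (3,2) and (5,4); view rows 2:4, columns 1:3.
a = sparse([1, 3, 5], [2, 2, 4], [1.0, 1.0, 1.0], 6, 6)
b = view(a, 2:4, 1:3)
vcols = nzcols_view(b)
vrows = nzrows_view(b)
```

Only the entry at (3,2) falls inside the view, which is row 2 and column 2 in view-local numbering.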
In case the indices are not ranges, the fallback of converting to a sparse matrix works, but here are versions for indices which are Vectors. If the indices are mixed, yet another set of versions needs to be made. Quite repetitive, but this is the strength of Julia: once the versions are written, the optimized method is chosen by dispatch on the argument types in the caller, without too much effort.
function sortedintersecting(v1, v2)
    i,j = start(v1), start(v2)
    while i <= length(v1) && j <= length(v2)
        if v1[i] == v2[j] return true
        elseif v1[i] > v2[j] j += 1
        else i += 1
        end
    end
    return false
end
function nzcols(b::SubArray{T,2,P,Tuple{Vector{Int64},Vector{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    brows = sort(unique(b.indexes[1]))
    return [k
        for (k,i) in enumerate(b.indexes[2])
        if b.parent.colptr[i]<b.parent.colptr[i+1] &&
            sortedintersecting(brows,b.parent.rowval[nzrange(b.parent,i)])]
end
function nzrows(b::SubArray{T,2,P,Tuple{Vector{Int64},Vector{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    active = falses(length(b.indexes[1]))
    for c in b.indexes[2]
        active[findin(b.indexes[1],b.parent.rowval[nzrange(b.parent,c)])] = true
    end
    return find(active)
end
-- ADDENDUM --
Since it was noted nzrows for Vector{Int} indices is a bit slow, this is an attempt to improve its speed by replacing findin with a version exploiting sortedness:
function findin2(inds,v,w)
    i,j = start(v),start(w)
    res = Vector{Int}()
    while i<=length(v) && j<=length(w)
        if v[i]==w[j]
            push!(res,inds[i])
            i += 1
        elseif (v[i]<w[j]) i += 1
        else j += 1
        end
    end
    return res
end
function nzrows(b::SubArray{T,2,P,Tuple{Vector{Int64},Vector{Int64}}}
        ) where {T,P<:SparseMatrixCSC}
    active = falses(length(b.indexes[1]))
    inds = sortperm(b.indexes[1])
    brows = (b.indexes[1])[inds]
    for c in b.indexes[2]
        active[findin2(inds,brows,b.parent.rowval[nzrange(b.parent,c)])] = true
    end
    return find(active)
end

How to break out of nested for loops in Julia

I have tried to break out of nested loops in a quite ineffective way:
BreakingPoint = false
a=["R1","R2","R3"]
b=["R2","R3","R4"]
for i in a
for j in b
if i == j
BreakingPoint = true
println("i = $i, j = $j.")
end
if BreakingPoint == true; break; end
end
if BreakingPoint == true; break; end
end
Is there an easier way to do that? In my actual problem, I have no idea about what are in arrays a and b, apart from they are ASCIIStrings. The array names (a and b in sample code) are also auto-generated through meta-programming methods.
You can do one of two things.
The first is to put both iterations in a single for statement (a multiple outer loop), where a break leaves the whole loop:
for i in a, j in b
    if i == j
        break
    end
end
which is clean, but not always possible.
The second: I will be crucified for suggesting this, but you can use @goto and @label:
for i in a
    for j in b
        if i == j
            @goto escape_label
        end
    end
end
@label escape_label
If you go the @goto/@label way, then for the sake of the people maintaining or reviewing the code, document your use properly, as navigating code with labels is breathtakingly annoying.
For the discussion on the multi-loop break, see this
Put the 2D loop into a function, and do an early return when you want to break.
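A minimal sketch of that suggestion, using the arrays from the question; the function name first_match and the choice to return nothing when there is no match are illustrative assumptions.

```julia
# Return the first pair of equal elements, or nothing if there is none.
function first_match(a, b)
    for i in a
        for j in b
            if i == j
                return (i, j)   # leaves both loops at once
            end
        end
    end
    return nothing
end

a = ["R1", "R2", "R3"]
b = ["R2", "R3", "R4"]
result = first_match(a, b)
if result !== nothing
    println("i = $(result[1]), j = $(result[2]).")
end
```

Because return exits the function, no flag variable or label is needed.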

if a<x<b in matlab

I need some help with MATLAB's way of thinking. I think I can explain my problem better with a simple example. Let's say that I have a characteristic function x=y+x0, where the x0's are my starting values. Then I want to define my function on a grid. Then I define a finer grid and I want to ask MATLAB whether it knows where an arbitrary (x*,y*) is. To determine it mathematically I should ask where the corresponding starting point (x0*) is. If this starting point stays between x(i,1)
clear
%%%%%%%%%%&First grid%%%%%%%%%%%%%%%%%%%%
x0=linspace(0,10,6);
y=linspace(0,5,6);
for i=1:length(x0)
for j=1:length(y)
x(i,j)=y(j)+x0(i);
%%%%%%%%%%%%%%%%%%%Second grid%%%%%%%%%%%%%%%%%%
x0fine=linspace(0,10,10);
yfine=linspace(0,5,10);
for p=1:length(x0fine)
for r=1:length(yfine)
xfine(p,r)=yfine(r)+x0fine(p);
if (x(i,1)<xfine(p,1)')&(x0fine(p,1)'<x(i+1,1))%%%%I probabliy have my first mistake %here
% if y(j)<yfine(r)<y(j+1)
% xint(i,j)=(x(i,j)+x(i,j+1)+x(i+1,j)+x(i+1,j+1))./4;
% else
% xint(i,j)= x(i,j);
%end
end
end
end
end
While a < b < c is legal MATLAB syntax, I doubt that it does what you think it does. It does not check that a < b and b < c. What it does is, it checks whether a < b, returning a logical value (maybe an array of logicals) and then, interpreting this logical as 0 or 1, compares it against c:
>> 2 < 0 < 2
ans =
1
>> 2 < 0 < 1
ans =
1
>> 0 < 0 < 1
ans =
1
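To make the left-to-right evaluation concrete, here is a small sketch in Julia (used only for illustration; it is not MATLAB code). Writing the parentheses explicitly reproduces what MATLAB does with a < b < c, while the intended range test needs both comparisons joined with a logical and, which in MATLAB would be written (a < b) && (b < c). Julia happens to treat the unparenthesised chain as the range test, so the contrast shows up in one place.

```julia
# What MATLAB computes for 2 < 0 < 2: compare first, then compare the logical result.
matlab_style = (2 < 0) < 2      # false < 2

# The intended test: is 0 strictly between 2 and 2?
explicit = (2 < 0) && (0 < 2)

# Julia's chained comparison means the same as the explicit form.
chained = 2 < 0 < 2
```

Here matlab_style is true while explicit and chained are both false.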
First, in MATLAB you should avoid loops as much as possible.
For instance, you can compute x and xfine with the following code:
x0=linspace(0,10,6);
y=linspace(0,5,6);
x=bsxfun(@plus,x0',y);
x0fine=linspace(0,10,10);
yfine=linspace(0,5,10);
xfine=bsxfun(@plus,x0fine',yfine);
Then, given (x*,y*), you want to find x0*. In your simple example, you can just do x0*=x*-y*, I think.
