Bad results in sparse matrix assignment with logical indexing - julia

In Matlab/Octave, I can use logical indexing to assign a value to matrix B in every location that meets a certain requirement in matrix A.
octave:1> A = [.1;.2;.3;.4;.11;.13;.14;.01;.04;.09];
octave:2> C = A < .12
C =
1
0
0
0
1
0
0
1
1
1
octave:3> B = spalloc(10,1);
octave:4> B(C) = 1
B =
Compressed Column Sparse (rows = 10, cols = 1, nnz = 5 [50%])
(1, 1) -> 1
(5, 1) -> 1
(8, 1) -> 1
(9, 1) -> 1
(10, 1) -> 1
However, if I attempt essentially the same code in Julia, the results are incorrect:
julia> A = [.1;.2;.3;.4;.11;.13;.14;.01;.04;.09];
julia> B = spzeros(10,1)
10x1 sparse matrix with 0 Float64 entries:
julia> C = A .< .12
10-element BitArray{1}:
true
false
false
false
true
false
false
true
true
true
julia> B[C] = 1
1
julia> B
10x1 sparse matrix with 5 Float64 entries:
[0 , 1] = 1.0
[0 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
Have I made a mistake in the syntax somewhere, am I misunderstanding something, or is this a bug? Note that I get the correct results if I use full matrices in Julia, but since the matrix in my application is really sparse (essential boundary conditions in a finite element simulation), I would much prefer to use sparse matrices.

It looks as if sparse has some problems with BitArrays.
julia> VERSION
v"0.3.5"
julia> A = [.1;.2;.3;.4;.11;.13;.14;.01;.04;.09]
julia> B = spzeros(10,1)
julia> C = A .< .12
julia> B[C] = 1
julia> B
10x1 sparse matrix with 5 Float64 entries:
[0 , 1] = 1.0
[0 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
So I get the same thing as the questioner. However, when I do things "my way":
julia> B = sparse(C)
ERROR: `sparse` has no method matching sparse(::BitArray{1})
julia> B = sparse(float(C))
10x1 sparse matrix with 5 Float64 entries:
[1 , 1] = 1.0
[5 , 1] = 1.0
[8 , 1] = 1.0
[9 , 1] = 1.0
[10, 1] = 1.0
So this works if you convert the BitArray to Float. I imagine that this workaround will get you going, but it does seem that sparse should work with BitArray.
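For readers on a current Julia release, here is a minimal sketch of another way to get the same result: build the sparse matrix directly from the positions where the mask is true, so no logical indexing into the sparse matrix is involved. It assumes Julia 1.x, where the sparse types live in the SparseArrays standard library and findall replaces the older find. The names rows, r, c and vals are only illustrative.

```julia
using SparseArrays

A = [0.1, 0.2, 0.3, 0.4, 0.11, 0.13, 0.14, 0.01, 0.04, 0.09]
C = A .< 0.12                      # BitVector mask

# Row indices where the mask is true
rows = findall(C)

# Build a 10x1 sparse matrix holding 1.0 at (rows[k], 1) for every k
B = sparse(rows, fill(1, length(rows)), fill(1.0, length(rows)), length(C), 1)

# Stored entries as (row, column, value) vectors
r, c, vals = findnz(B)
```

Here r should list exactly the rows that the Octave session reports as nonzero.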
Some Additional Thoughts (Edit)
As I thought further about this, it occurs to me that one reason why there is no BitArray method for sparse() is that it is not terribly useful to implement sparse storage for an already highly compact type. Considering B and C from above:
julia> sizeof(C)
8
julia> sizeof(B)
40
So for these data, the sparse version is much larger than the original. It's actually worse than this simple (perhaps simplistic) check shows at first glance. sizeof(::BitArray{1}) appears to be the size of the entire array, but sizeof(::SparseMatrixCSC{}) shows the size of each element stored. So the real size disparity is something like 8 versus 200 bytes.
Of course, if the data is sparse enough (somewhat less than 1% true), sparse storage begins to win out, despite its high overhead.
julia> C = rand(10^6) .< 0.01
julia> B = sparse(float(C))
julia> sizeof(C)
125000
julia> sum(C)*sizeof(B)
394520
julia> C = rand(10^6) .< 0.001
julia> B = sparse(float(C))
julia> sizeof(C)
125000
julia> sum(C)*sizeof(B)
40280
So perhaps it is not an oversight that sparse() has no BitArray method. Cases where it would represent a significant space saving may be less common than one might think at first glance.
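Since sizeof only reports the top-level object, it is an awkward tool for this comparison. A rough sketch of an alternative, assuming Julia 1.x: Base.summarysize walks the whole object, including the index and value arrays that a sparse matrix holds. The helper compare_sizes and its field names are made up for this illustration, and the exact byte counts depend on the Julia version and on the random draw, so none are shown.

```julia
using SparseArrays

# Compare the total memory of a Bool mask stored as a BitVector
# with the same mask stored as an n x 1 sparse Float64 matrix.
function compare_sizes(n, density)
    C = rand(n) .< density              # BitVector, one bit per element
    rows = findall(C)
    k = length(rows)
    B = sparse(rows, fill(1, k), fill(1.0, k), n, 1)
    return (bitvector = Base.summarysize(C), sparsematrix = Base.summarysize(B))
end

small = compare_sizes(10, 0.5)          # tiny, half-full mask
large = compare_sizes(10^6, 0.0001)     # large, very sparse mask
```

With these parameters the sparse form should lose on the small mask and win on the large, very sparse one, which matches the conclusion above.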

Related

Add matrix dimension with map()

I have a 2D array of colours in Julia
using Images
white = RGB{Float32}(1, 1, 1)
green = RGB{Float32}(0.1, 1, 0.1)
blue = RGB{Float32}(0, 0.1, 1)
A = [white white;
green blue;
blue blue]
I want to turn each RGB colour into a Array{Float32, 3} in the higher dimension. This is what I tried:
B = map(A) do a
[a.r, a.g, a.b]
end
size(B) == (3, 2, 3) # (rows, cols, channels)
# => false
Instead, B is a 2D-matrix of 1D-arrays.
Does Julia have a map-like method for expanding the dimensions of a matrix?
You should use ImageCore's channelview instead:
julia> Av = channelview(A)
3×3×2 reinterpret(reshape, Float32, ::Array{RGB{Float32},2}) with eltype Float32:
[:, :, 1] =
1.0 0.1 0.0
1.0 1.0 0.1
1.0 0.1 1.0
[:, :, 2] =
1.0 0.0 0.0
1.0 0.1 0.1
1.0 1.0 1.0
The color channel is the first dimension (the fastest dimension). You can check that by setting some values and seeing the impact on the original, since Av is a view of A:
julia> Av[1,2,1] = -5
-5
julia> Av[1,2,2] = -10
-10
julia> A
3×2 Array{RGB{Float32},2} with eltype RGB{Float32}:
RGB{Float32}(1.0,1.0,1.0) RGB{Float32}(1.0,1.0,1.0)
RGB{Float32}(-5.0,1.0,0.1) RGB{Float32}(-10.0,0.1,1.0)
RGB{Float32}(0.0,0.1,1.0) RGB{Float32}(0.0,0.1,1.0)
In both cases we tweaked the red channel, because of using 1 as the first index.
map doesn't work because it is elementwise, so it can only allocate an output with the same size as the input. It's not difficult to allocate your output array in this case:
function RGBtoT_loop(x::Array{RGB{T}, N}) where {T,N}
# allocate output array without any (valid) instances
result = Array{T, N+1}(undef, size(x)..., 3)
# write RGB values into output array
for i in CartesianIndices(x)
result[i, 1] = x[i].r
result[i, 2] = x[i].g
result[i, 3] = x[i].b
end
result
end
EDIT: I figured out how to do the same thing by broadcasting the getfield method where getfield(RGBvalue, 1) is equivalent to RGBvalue.r. Interestingly, ndims(A) and thus the Tuple of ones is calculated at compile-time from A's type parameters, so this method ends up type-stable and only allocates 1 thing at run-time: the result array.
RGBtoT(x) = getfield.(x, reshape(1:3, ntuple(i->1, ndims(x))..., 3) )
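To see the broadcasting trick in isolation, here is a small sketch that swaps RGB{Float32} for a hypothetical plain struct called Pixel, so it runs without the Images package. The struct and the function name channels are assumptions made for this example only. The idea is the same: the field indices 1:3 are reshaped to lie along a new trailing dimension, and broadcasting getfield expands the result to that shape.

```julia
# Hypothetical stand-in for RGB{Float32}, so the sketch runs without Images.jl
struct Pixel
    r::Float32
    g::Float32
    b::Float32
end

white = Pixel(1, 1, 1)
green = Pixel(0.1, 1, 0.1)
blue  = Pixel(0, 0.1, 1)
A = [white white;
     green blue;
     blue  blue]

# Field indices 1:3 sit along a new trailing dimension (shape 1x1x3 for a matrix),
# so broadcasting getfield yields an array of size (size(x)..., 3)
channels(x) = getfield.(x, reshape(1:3, ntuple(i -> 1, ndims(x))..., 3))

B = channels(A)
size(B) == (3, 2, 3)   # (rows, cols, channels)
```

Unlike channelview, this allocates a new array and puts the channel in the last dimension rather than the first.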

indices of unique elements of vector in Julia

How to get indexes of unique elements of a vector?
For instance if you have a vector v = [1,2,1,3,5,3], the unique elements are [1,2,3,5] (output of unique) and their indexes are ind = [1,2,4,5]. What function allows me to compute ind so that v[ind] = unique(v) ?
This is a solution for Julia 0.7:
findfirst.(isequal.(unique(x)), [x])
or similar working under Julia 0.6.3 and Julia 0.7:
findfirst.(map(a -> (y -> isequal(a, y)), unique(x)), [x])
and a shorter version (but it will not work under Julia 0.7):
findfirst.([x], unique(x))
It will probably not be the fastest.
If you need speed you can write something like (should work both under Julia 0.7 and 0.6.3):
function uniqueidx(x::AbstractArray{T}) where T
uniqueset = Set{T}()
ex = eachindex(x)
idxs = Vector{eltype(ex)}()
for i in ex
xi = x[i]
if !(xi in uniqueset)
push!(idxs, i)
push!(uniqueset, xi)
end
end
idxs
end
Another suggestion is
unique(i -> x[i], 1:length(x))
which is about as fast as the function in the accepted answer (in Julia 1.1), but a bit briefer.
If you don't care about finding the first index for each unique element, then you can use a combination of the unique and indexin functions:
julia> indexin(unique(v), v)
4-element Array{Int64,1}:
3
2
6
5
This gets one index for each unique element of v in v. These functions are all in Base and work in 0.6. This is about 2.5 times slower than @Bogumił's function, but it's a simple alternative.
A mix of mattswon's and Bogumił Kamiński's answers (thanks!):
uniqueidx(v) = unique(i -> v[i], eachindex(v))
eachindex makes this work with any kind of array, even views.
julia> v = [1,2,1,3,5,3];
julia> uniqueidx(v)
4-element Vector{Int64}:
1
2
4
5
julia> v2 = reshape(v, 2, 3)
2×3 Matrix{Int64}:
1 1 5
2 3 3
julia> subv2 = view(v2, 1:2, 1:2)
2×2 view(::Matrix{Int64}, 1:2, 1:2) with eltype Int64:
1 1
2 3
julia> uniqueidx(subv2)
3-element Vector{CartesianIndex{2}}:
CartesianIndex(1, 1)
CartesianIndex(2, 1)
CartesianIndex(2, 2)
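A short sanity check for any of these approaches is the property the question asks for, v[ind] == unique(v). The sketch below assumes Julia 1.x. The conversion with Int.(...) is there because indexin returns a vector whose element type also allows nothing.

```julia
v = [1, 2, 1, 3, 5, 3]

# First index of each distinct value, via unique with a key function
ind = unique(i -> v[i], eachindex(v))

# One matching index in v for every element of unique(v);
# convert to plain Ints so the result can be used for indexing
ind2 = Int.(indexin(unique(v), v))

v[ind] == unique(v)    # the property asked for in the question
v[ind2] == unique(v)   # also holds, whichever matching index indexin picks
```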

How do I add a dimension to an array? (opposite of `squeeze`)

I can never remember how to do this.
How can I go
from a Vector (size (n1)) to a Column Matrix (size (n1,1))?
or from a Matrix (size (n1,n2)) to an Array{T,3} (size (n1,n2,1))?
or from an Array{T,3} (size (n1,n2,n3)) to an Array{T,4} (size (n1,n2,n3,1))?
and so forth.
I want to know how to take an Array and use it to define a new Array with an extra singleton trailing dimension.
I.e. the opposite of squeeze.
You can do this with reshape.
You could define a method for this:
add_dim(x::Array) = reshape(x, (size(x)...,1))
julia> add_dim([3;4])
2×1 Array{Int64,2}:
3
4
julia> add_dim([3 30;4 40])
2×2×1 Array{Int64,3}:
[:, :, 1] =
3 30
4 40
julia> add_dim(rand(4,3,2))
4×3×2×1 Array{Float64,4}:
[:, :, 1, 1] =
0.483307 0.826342 0.570934
0.134225 0.596728 0.332433
0.597895 0.298937 0.897801
0.926638 0.0872589 0.454238
[:, :, 2, 1] =
0.531954 0.239571 0.381628
0.589884 0.666565 0.676586
0.842381 0.474274 0.366049
0.409838 0.567561 0.509187
Another easy way other than reshaping to an exact shape, is to use cat and ndims together. This has the added benefit that you can specify "how many extra (singleton) dimensions you would like to add". e.g.
a = [1 2 3; 2 3 4];
cat(ndims(a) + 0, a) # add zero singleton dimensions (i.e. stays the same)
cat(ndims(a) + 1, a) # add one singleton dimension
cat(ndims(a) + 2, a) # add two singleton dimensions
etc.
UPDATE (julia 1.3). The syntax for cat has changed in julia 1.3 from cat(dims, A...) to cat(A...; dims=dims).
Therefore the above example would become:
a = [1 2 3; 2 3 4];
cat(a; dims = ndims(a) + 0 )
cat(a; dims = ndims(a) + 1 )
cat(a; dims = ndims(a) + 2 )
etc.
Obviously, like Dan points out below, this has the advantage that it's nice and clean, but it comes at the cost of allocation, so if speed is your top priority and you know what you're doing, then in-place reshape operations will be faster and are to be preferred.
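The difference between the two approaches can be seen in a small sketch (assuming Julia 1.x keyword syntax for cat): the reshaped array shares memory with the original, while the result of cat is an independent copy.

```julia
a = [1 2 3; 2 3 4]

r = reshape(a, size(a)..., 1)        # 2x3x1, shares memory with a
c = cat(a; dims = ndims(a) + 1)      # 2x3x1, freshly allocated copy

r[1, 1, 1] = 100                     # write through the reshaped array

a[1, 1]      # now 100, because r and a share the same data
c[1, 1, 1]   # still 1, because c was copied before the write
```

So reshape is the cheaper choice, but a write to the result also changes the source array.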
Some time before the Julia 1.0 release, a reshape(x, Val{N}) overload was added which, for N > ndims(x), adds rightmost singleton dimensions.
So the following works:
julia> add_dim(x::Array{T, N}) where {T,N} = reshape(x, Val(N+1))
add_dim (generic function with 1 method)
julia> add_dim([3;4])
2×1 Array{Int64,2}:
3
4
julia> add_dim([3 30;4 40])
2×2×1 Array{Int64,3}:
[:, :, 1] =
3 30
4 40
julia> add_dim(rand(4,3,2))
4×3×2×1 Array{Float64,4}:
[:, :, 1, 1] =
0.0737563 0.224937 0.6996
0.523615 0.181508 0.903252
0.224004 0.583018 0.400629
0.882174 0.30746 0.176758
[:, :, 2, 1] =
0.694545 0.164272 0.537413
0.221654 0.202876 0.219014
0.418148 0.0637024 0.951688
0.254818 0.624516 0.935076
Try this
function extend_dims(A,which_dim)
s = [size(A)...]
insert!(s,which_dim,1)
return reshape(A, s...)
end
The argument which_dim specifies the position at which the new singleton dimension is inserted.
Thus
extend_dims(randn(3,3),1)
will produce a 1 x 3 x 3 array and so on.
I find this utility helpful when passing data into convolutional neural networks.
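A quick usage sketch, with the function repeated so the snippet runs on its own. The 3x4 input is an arbitrary choice for illustration. It shows that which_dim can be any position from 1 up to ndims(A) + 1.

```julia
# Same function as above, repeated so the snippet is self-contained
function extend_dims(A, which_dim)
    s = [size(A)...]
    insert!(s, which_dim, 1)
    return reshape(A, s...)
end

x = randn(3, 4)

size(extend_dims(x, 1))   # (1, 3, 4): new leading dimension
size(extend_dims(x, 2))   # (3, 1, 4): new middle dimension
size(extend_dims(x, 3))   # (3, 4, 1): new trailing dimension
```

Because the size is collected into a Vector, the number of dimensions of the result is only known at run time. That is fine for occasional calls but may matter in performance-critical code.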

How do you select a subset of an array based on a condition in Julia

How do you simply select a subset of an array based on a condition? I know Julia doesn't use vectorization, but there must be a simple way of doing the following without an ugly-looking multi-line for loop:
julia> map([1,2,3,4]) do x
return (x%2==0)?x:nothing
end
4-element Array{Any,1}:
nothing
2
nothing
4
Desired output:
[2, 4]
Observed output:
[nothing, 2, nothing, 4]
You are looking for filter
http://docs.julialang.org/en/release-0.4/stdlib/collections/#Base.filter
Here is an example:
filter(x->x%2==0,[1,2,3,5]) # answers with [2]
There are element-wise operators (beginning with a "."):
julia> [1,2,3,4] % 2 .== 0
4-element BitArray{1}:
false
true
false
true
julia> x = [1,2,3,4]
4-element Array{Int64,1}:
1
2
3
4
julia> x % 2 .== 0
4-element BitArray{1}:
false
true
false
true
julia> x[x % 2 .== 0]
2-element Array{Int64,1}:
2
4
julia> x .% 2
4-element Array{Int64,1}:
1
0
1
0
You can use the find() function (or the .== syntax) to accomplish this. E.g.:
julia> x = collect(1:4)
4-element Array{Int64,1}:
1
2
3
4
julia> y = x[find(x%2.==0)]
2-element Array{Int64,1}:
2
4
julia> y = x[x%2.==0] ## more concise and slightly quicker
2-element Array{Int64,1}:
2
4
Note the .== syntax for the element-wise operation. Also, note that find() returns the indices that match the criteria. In this case, the indices matching the criteria are the same as the array elements that match the criteria. For the more general case though, we want to put the find() function in brackets to denote that we are using it to select indices from the original array x.
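The distinction between indices and elements is easier to see when the two differ. A minimal sketch, assuming Julia 1.x, where find has been replaced by findall. The values in x are arbitrary.

```julia
x = [10, 21, 32, 43]

idx = findall(iseven, x)   # positions of the even elements: [1, 3]
y = x[idx]                 # the even elements themselves: [10, 32]
```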
Update: Good point @Lutfullah Tomak about the filter() function. I believe though that find() can be quicker and more memory efficient. (though I understand that anonymous functions are supposed to get better in version 0.5 so perhaps this might change?) At least in my trial, I got:
x = collect(1:100000000);
@time y1 = filter(x->x%2==0,x);
# 9.526485 seconds (100.00 M allocations: 1.554 GB, 2.76% gc time)
@time y2 = x[find(x%2.==0)];
# 3.187476 seconds (48.85 k allocations: 1.504 GB, 4.89% gc time)
@time y3 = x[x%2.==0];
# 2.570451 seconds (57.98 k allocations: 1.131 GB, 4.17% gc time)
Update2: Good points in comments to this post that x[x%2.==0] is faster than x[find(x%2.==0)].
Another updated version:
v[v .% 2 .== 0]
In newer versions of Julia, a broadcasting dot is needed before both % and ==.
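For comparison, here are three ways to write the selection that should all work on Julia 1.x. This is a sketch rather than a benchmark, and the variable names are only illustrative.

```julia
v = [1, 2, 3, 4]

a = filter(iseven, v)                  # predicate-based selection
b = v[v .% 2 .== 0]                    # logical indexing with a broadcast mask
c = [x for x in v if x % 2 == 0]       # comprehension with a filter clause
```

All three give [2, 4]. Which one is fastest depends on the Julia version and the data, so the timings quoted above should not be assumed to carry over.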

How to append a vector to Julia matrix as a row?

I have an empty matrix initially:
m = Matrix(0, 3)
and a row that I want to add:
v = [2,3]
I try to do this:
[m v]
But I get an error
ERROR: ArgumentError: number of rows of each array must match
What's the proper way to do this?
That is because your matrix sizes don't match. Specifically, v does not contain enough elements to match the columns of m, and it is oriented as a column rather than a row.
So this doesn't work:
m = Matrix(0, 3)
v = [2,3]
m = cat(1, m, v) # or a = [m; v]
>> ERROR: DimensionMismatch("mismatch in dimension 2 (expected 3 got 1)")
whereas this does
m = Matrix(0, 3)
v = [2 3 4]
m = cat(1, m, v) # or m = [m; v]
>> 1x3 Array{Any,2}:
>> 2 3 4
and if you run it again it creates another row
m = cat(1, m, v) # or m = [m; v]
>> 2x3 Array{Any,2}:
>> 2 3 4
>> 2 3 4
Use the vcat (concatenate vertically) function:
help?> vcat
search: vcat hvcat VecOrMat DenseVecOrMat StridedVecOrMat AbstractVecOrMat levicivita is_valid_char @vectorize_2arg
vcat(A...)
Concatenate along dimension 1
Notice you have to transpose the vector v, ie. v', else you get a DimensionMismatch error:
julia> v = zeros(3)
3-element Array{Float64,1}:
0.0
0.0
0.0
julia> m = ones(3, 3)
3x3 Array{Float64,2}:
1.0 1.0 1.0
1.0 1.0 1.0
1.0 1.0 1.0
julia> vcat(m, v') # '
4x3 Array{Float64,2}:
1.0 1.0 1.0
1.0 1.0 1.0
1.0 1.0 1.0
0.0 0.0 0.0
julia> v' # '
1x3 Array{Float64,2}:
0.0 0.0 0.0
julia> vcat(m, v)
ERROR: DimensionMismatch("mismatch in dimension 2 (expected 3 got 1)")
in cat_t at abstractarray.jl:850
in vcat at abstractarray.jl:887
Note: the trailing # ' comments are there just to make syntax highlighting work well.
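On Julia 1.x the Matrix(0, 3) constructor from the question no longer exists. A minimal sketch of the same row-appending pattern with current syntax follows. The element type Int is an assumption made for this example.

```julia
m = Matrix{Int}(undef, 0, 3)   # empty matrix with 0 rows and 3 columns
v = [2, 3, 4]                  # a plain Vector, which behaves as a column

m = vcat(m, v')                # v' is 1x3, so it stacks as a new row
m = [m; permutedims(v)]        # same idea; permutedims also works for non-numeric elements

size(m)                        # (2, 3)
```

Each call allocates a new matrix, so for many rows it is usually cheaper to collect the rows first and concatenate once.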
Doesn't Matrix create a two-dimensional array in Julia? If you try m = [0, 3] instead, which creates a one-dimensional Vector, you can append to it with [m; v].
I think using [m v] creates a two-dimensional array as well, according to the Julia documentation.
