Julia: what's the difference between map and broadcast? - julia

I have found some discussions on Julia Users group but they were too technical for me.
I would like to know what are the criteria to choose between the two.
I am following the JuliaBox tutorial but it doesn't explain much. Thanks

map and broadcast differ when given multiple collections of different sizes or dimensions. broadcast tries to expand all of its arguments to a common shape, whereas map iterates over its arguments in lockstep, applies the given function to each tuple of elements, and stops as soon as the shortest argument is exhausted.
julia> map(+, 1, [2,2,2])
1-element Array{Int64,1}:
3
julia> broadcast(+, 1, [2,2,2])
3-element Array{Int64,1}:
3
3
3
The broadcast example gives the same result as map(+, [1,1,1], [2,2,2]).
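The comparison above can be reproduced in a few lines. This is only a minimal sketch: the variable names are illustrative, and the last line relies on the dot syntax being shorthand for broadcast.

```julia
v = [2, 2, 2]

# map zips its arguments and stops at the shortest one;
# a scalar iterates as a one-element collection.
m = map(+, 1, v)

# broadcast expands the scalar to the shape of v first.
b = broadcast(+, 1, v)

# The dot syntax is shorthand for broadcast.
d = 1 .+ v

@show m b d
```

In everyday code the dotted form is usually the most convenient way to write a broadcast.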
Also note how broadcast behaves when it fails to find a common shape for two arguments:
julia> broadcast(+, [1,2,3], [2 2; 2 0])
ERROR: DimensionMismatch("arrays could not be broadcast to a common size")
Stacktrace:
[1] _bcs1 at ./broadcast.jl:439 [inlined]
[2] _bcs at ./broadcast.jl:433 [inlined]
[3] broadcast_shape at ./broadcast.jl:427 [inlined]
[4] combine_axes at ./broadcast.jl:422 [inlined]
[5] instantiate at ./broadcast.jl:266 [inlined]
[6] materialize at ./broadcast.jl:748 [inlined]
[7] broadcast(::typeof(+), ::Array{Int64,1}, ::Array{Int64,2}) at ./broadcast.jl:702
[8] top-level scope at none:0
julia> map(+, [1,2,3], [2 2; 2 0])
3-element Array{Int64,1}:
3
4
5

Both methods apply a function to their arguments. The difference between the two is how they treat multidimensional arrays.
map zips the arrays in column-major (storage) order, regardless of their shapes, creating an iterator of tuples, which are used as arguments to the given function. Thus
julia> map(+, [1,2,3,4], [10, 20, 30, 40])
4-element Array{Int64,1}:
11
22
33
44
while
julia> map(+, [1,2,3,4], [10 20; 30 40])
4-element Array{Int64,1}:
11
32
23
44
Notice how the pairing of elements changes with the layout of the second array: the matrix is traversed column by column (10, 30, 20, 40). On the other hand, broadcast applies the given function element-wise by position. If the two array arguments have the same dimensions, the pairing of elements is the same as with map.
julia> broadcast(+, [1,2,3,4], [10, 20, 30, 40])
4-element Array{Int64,1}:
11
22
33
44
Otherwise, broadcast compares the sizes dimension by dimension (a missing trailing dimension counts as size 1) and requires that, in each dimension, the two sizes are either equal or one of them is 1. If this condition is satisfied, broadcast extends the arrays to a common size by virtually repeating the array data along every dimension of size 1. This is how you get
julia> broadcast(+, [1, 2, 3, 4], [10 20 30 40])
4×4 Array{Int64,2}:
11 21 31 41
12 22 32 42
13 23 33 43
14 24 34 44
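The size rule can be sketched as a small function. Note that common_size is a hypothetical helper written for illustration, not a function from Base; it only mirrors the rule described above and is checked against what broadcast actually returns.

```julia
# Hypothetical helper (not part of Base): compute the broadcast size of two
# size tuples, padding the shorter one with trailing 1s.
function common_size(sa::Tuple, sb::Tuple)
    n = max(length(sa), length(sb))
    pa = (sa..., ntuple(_ -> 1, n - length(sa))...)
    pb = (sb..., ntuple(_ -> 1, n - length(sb))...)
    return ntuple(n) do i
        if pa[i] == pb[i] || pb[i] == 1
            pa[i]
        elseif pa[i] == 1
            pb[i]
        else
            throw(DimensionMismatch("sizes $sa and $sb cannot be broadcast"))
        end
    end
end

col = [1, 2, 3, 4]        # size (4,)
row = [10 20 30 40]       # size (1, 4)

expected = common_size(size(col), size(row))
actual = size(broadcast(+, col, row))
@show expected actual
```

With sizes (3,) and (2, 2), as in the failing example from the other answer, the helper throws a DimensionMismatch because 3 and 2 differ and neither is 1.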

Related

How can I put a slice of a Matrix into a 3D Array with SMatrix type of inner structure?

Suppose I have this Matrix:
julia> mat = [
1 2 3 4
5 6 7 8
9 8 7 6
];
Then I want to put slices of this Matrix into a 3D Array with types of SMatrix{Int64}, like below:
julia> using StaticArrays
julia> arr = Array{SMatrix{Int64}, 3}(undef, 3, 2, 3);
julia> col_idx = [1, 2, 3];
julia> foreach(x->arr[:, :, x] = mat[:, x:x+1], col_idx)
ERROR: MethodError: Cannot `convert` an object of type
Int64 to an object of type
SMatrix{Int64}
Closest candidates are:
convert(::Type{T}, ::LinearAlgebra.Factorization) where T<:AbstractArray at C:\Users\JUL\.julia\juliaup\julia-1.8.3+0.x64\share\julia\stdlib\v1.8\LinearAlgebra\src\factorization.jl:58
convert(::Type{SA}, ::Tuple) where SA<:StaticArray at C:\Users\JUL\.julia\packages\StaticArrays\x7lS0\src\convert.jl:179
convert(::Type{SA}, ::SA) where SA<:StaticArray at C:\Users\JUL\.julia\packages\StaticArrays\x7lS0\src\convert.jl:178
...
Stacktrace:
[1] setindex!
@ .\array.jl:968 [inlined]
[2] macro expansion
@ .\multidimensional.jl:946 [inlined]
[3] macro expansion
@ .\cartesian.jl:64 [inlined]
[4] macro expansion
@ .\multidimensional.jl:941 [inlined]
[5] _unsafe_setindex!(::IndexLinear, ::Array{SMatrix{Int64}, 3}, ::Matrix{Int64}, ::Base.Slice{Base.OneTo{Int64}}, ::Base.Slice{Base.OneTo{Int64}}, ::Int64)
@ Base .\multidimensional.jl:953
[6] _setindex!
@ .\multidimensional.jl:930 [inlined]
[7] setindex!(::Array{SMatrix{Int64}, 3}, ::Matrix{Int64}, ::Function, ::Function, ::Int64)
@ Base .\abstractarray.jl:1344
[8] (::var"#5#6")(x::Int64)
@ Main .\REPL[20]:1
[9] foreach(f::var"#5#6", itr::Vector{Int64})
@ Base .\abstractarray.jl:2774
[10] top-level scope
@ REPL[20]:1
How can I achieve it?
P.S.:
This is just a minimal and reproducible example. In the practical sense, I have a size of (10, 10, 2000) for arr and a big size for mat as well (10x2000, I guess)!
If I understood correctly, you want an Array of SMatrices:
mat = [ 1 2 3 4
5 6 7 8
9 8 7 6 ];
using StaticArrays
col_idx = [1, 2, 3];
arr = [SMatrix{3,2}(mat[:, x:x+1]) for x in col_idx]
3-element Vector{SMatrix{3, 2, Int64, 6}}:
[1 2; 5 6; 9 8]
[2 3; 6 7; 8 7]
[3 4; 7 8; 7 6]
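If statically sized elements are not essential, the same column windows can also be collected with Base alone. This is a swapped-in technique rather than the approach of the answer above: it is a minimal sketch that stores views instead of SMatrix values, so no data is copied and no package is needed.

```julia
mat = [1 2 3 4
       5 6 7 8
       9 8 7 6]

# Each element is a 3×2 view into mat; the underlying data is shared, not copied.
windows = [view(mat, :, x:x+1) for x in 1:3]

second = windows[2]      # columns 2 and 3 of mat
@show size(windows[1]) second
```

Because views share memory with mat, later changes to mat are visible through windows; copy a window if an independent array is needed.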
Then, what if I say:
julia> using StaticArrays
julia> mat = [
1 2 3 4
5 6 7 8
9 8 7 6
];
julia> arr = Array{Int64, 3}(undef, 3, 2, 3);
julia> foreach(x->arr[:, :, x] = mat[:, x:x+1], [1, 2, 3]);
julia> sarr = SArray{Tuple{3, 2, 3}}(arr)
3×2×3 SArray{Tuple{3, 2, 3}, Int64, 3, 18} with indices SOneTo(3)×SOneTo(2)×SOneTo(3):
[:, :, 1] =
1 2
5 6
9 8
[:, :, 2] =
2 3
6 7
8 7
[:, :, 3] =
3 4
7 8
7 6
julia> typeof(sarr[:, :, 1])
SMatrix{3, 2, Int64, 6} (alias for SArray{Tuple{3, 2}, Int64, 2, 6})
First, I created a regular 3D Array, then constructed a SArray based on it.
However, in the case of your practical situation, I tried the following:
julia> mat = rand(10, 2000);
julia> arr = Array{Float64, 3}(undef, 10, 2, 1999);
julia> foreach(x->arr[:, :, x] = mat[:, x:x+1], 1:1999);
julia> sarr = SArray{Tuple{10, 2, 1999}}(arr);
But constructing such a container takes too much time (I canceled it, so I don't know the actual runtime). Hence, in these cases, it's better to take @AboAmmar's advice.
Inspired by @Shayan and @AboAmmar, this answer explores using the BlockArrays.jl package to construct the desired result. BlockArrays puts existing arrays into a 'meta-array'. The sub-arrays can be of SMatrix type.
In code:
using StaticArrays, BlockArrays
mat = rand(10,2000) # random demo matrix
# make all the slice SArrays
arr = [SArray{Tuple{10,2,1}, Float64, 3}(mat[:,i:i+1])
for i=1:1999]
arr = reshape(arr,1,1,1999)
# glue them into a BlockArray
bricked = mortar(arr)
After that:
julia> size(bricked)
(10, 2, 1999)
julia> bricked[:,:,25]
1×1-blocked 10×2 BlockMatrix{Float64}:
0.265972 0.258414
0.396142 0.863366
0.41708 0.648276
0.960283 0.773064
0.62513 0.268989
0.132796 0.0493077
0.844674 0.791772
0.59638 0.0769661
0.221536 0.388623
0.595742 0.50732
Hopefully this method gets the performance trade-off you wanted (or at least introduces some new ideas).

Converting a n-element Vector{Vector{Int64}} to a Vector{Int64}

I have a list of vectors (vector of vectors) like the following:
A 2-element Vector{Vector{Int64}}:
A = [347229118, 1954075737, 6542148346, 347229123, 1954075753, 6542148341]
[247492691, 247490813, -2796091443606465490, 247491615, 247492910, 247491620, -4267071114472318843, 747753505]
The goal is to have them all in just one vector. I tried collect, A[:], vec(A), and flatten(A), but each still returns a 2-element Vector{Vector{Int64}}.
I don't know what command I should use. Is there anything
Assuming your input data is:
julia> x = [[1, 2], [3, 4], [5, 6]]
3-element Vector{Vector{Int64}}:
[1, 2]
[3, 4]
[5, 6]
here are some natural options you have.
Option 1: use Iterators.flatten:
julia> collect(Iterators.flatten(x))
6-element Vector{Int64}:
1
2
3
4
5
6
You can omit collect, in which case you get a lazy iterator over the source data, which is more memory efficient.
Option 2: use vcat:
julia> reduce(vcat, x)
6-element Vector{Int64}:
1
2
3
4
5
6
You could also write:
julia> vcat(x...)
6-element Vector{Int64}:
1
2
3
4
5
6
but splatting can become problematic if your x vector is very long, in which case I recommend using the reduce function as shown above.
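To put the options side by side, here is a small sketch using the same x as above. It also shows why vec(x) does not help: it only operates on the outer vector, so the nesting is preserved. The variable names are illustrative.

```julia
x = [[1, 2], [3, 4], [5, 6]]

# Lazy: no intermediate vector is allocated; elements are visited in order.
total = sum(Iterators.flatten(x))

# Eager: both calls produce a new flat vector.
flat1 = collect(Iterators.flatten(x))
flat2 = reduce(vcat, x)

# vec only touches the outer container, so the result is still nested.
still_nested = vec(x)

@show total flat1 typeof(still_nested)
```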

Create a Vector of Integers and missing Values

What a hassle...
I'm trying to create a vector of integers and missing values. This works fine:
b = [4, missing, missing, 3]
But I would actually like the vector to be longer, with more missing values, and therefore want to use repeat(), but this doesn't work:
append!([1,2,3], repeat([missing], 1000))
and this also doesn't work
[1,2,3, repeat([missing], 1000)]
Please, help me out, here.
It is also worth noting that if you do not need an in-place operation with append!, it is much easier to do vertical concatenation in such cases:
julia> [[1, 2, 3]; repeat([missing], 2); 4; 5] # note ; that denotes vcat
7-element Array{Union{Missing, Int64},1}:
1
2
3
missing
missing
4
5
julia> vcat([1,2,3], repeat([missing], 2), 4, 5) # this is the same but using a different syntax
7-element Array{Union{Missing, Int64},1}:
1
2
3
missing
missing
4
5
The benefit of vcat is that it automatically does the type promotion (as opposed to append! in which case you have to correctly specify the eltype of the target container before the operation).
Note that because vcat does automatic type promotion in corner cases you might get a different eltype of the result of the operation:
julia> x = [1, 2, 3]
3-element Array{Int64,1}:
1
2
3
julia> append!(x, [1.0, 2.0]) # conversion from Float64 to Int happens here
5-element Array{Int64,1}:
1
2
3
1
2
julia> [[1, 2, 3]; [1.0, 2.0]] # promotion of Int to Float64 happens in this case
5-element Array{Float64,1}:
1.0
2.0
3.0
1.0
2.0
See also https://docs.julialang.org/en/v1/manual/arrays/#man-array-literals.
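The conversion done by append! only succeeds when every new value can be represented exactly in the target element type. A minimal sketch of the failing case, with illustrative variable names, might look like this:

```julia
x = [1, 2, 3]

# vcat promotes the element type, so a non-integer float is fine.
y = vcat(x, [1.5])

# append! must convert each new element to eltype(x), which is Int here.
# 1.5 has no exact Int representation, so the conversion throws.
err = try
    append!(copy(x), [1.5])
    nothing
catch e
    e
end

@show eltype(y) typeof(err)
```

The copy is appended to so that x itself is not touched by the failed call.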
This will work:
append!(Union{Int,Missing}[1,2,3], repeat([missing], 1000))
[1,2,3] creates a plain Vector{Int}, and a Vector{Int} can only hold values that convert to Int, which missing does not. Hence, when you define a container that is meant to hold more than one type, you need to state this explicitly; here we have defined a Vector{Union{Int,Missing}}.
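A short sketch of working with such a vector follows. It assumes only Base functions; fill(missing, n) is used here as an alternative to repeat([missing], n), and the variable names are illustrative.

```julia
# Start from a vector whose element type already allows missing.
v = Union{Int,Missing}[1, 2, 3]
append!(v, fill(missing, 1000))

# An existing Vector{Int} can be widened with convert before appending.
w = convert(Vector{Union{Int,Missing}}, [1, 2, 3])
push!(w, missing)

# skipmissing iterates over the non-missing values only.
s = sum(skipmissing(v))

@show length(v) eltype(v) s
```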

Vectorized splatting

I'd like to be able to splat an array of tuples into a function in a vectorized fashion. For example, if I have the following function,
function foo(x, y)
x + y
end
and the following array of tuples,
args_array = [(1, 2), (3, 4), (5, 6)]
then I could use a list comprehension to obtain the desired result:
julia> [foo(args...) for args in args_array]
3-element Array{Int64,1}:
3
7
11
However, I would like to be able to use the dot vectorization notation for this operation:
julia> foo.(args_array...)
ERROR: MethodError: no method matching foo(::Int64, ::Int64, ::Int64)
But as you can see, that particular syntax doesn't work. Is there a vectorized way to do this?
foo.(args_array...) doesn't work because it's doing:
foo.((1, 2), (3, 4), (5, 6))
# which is roughly equivalent to
[foo(1,3,5), foo(2,4,6)]
In other words, it's taking each element of args_array as a separate argument and then broadcasting foo over those arguments. You want to broadcast foo over the elements directly. The trouble is that running:
foo.(args_array)
# is roughly equivalent to:
[foo((1,2)), foo((3,4)), foo((5,6))]
In other words, the broadcast syntax is just passing each tuple as a single argument to foo. We can fix that with a simple intermediate function:
julia> bar(args) = foo(args...);
julia> bar.(args_array)
3-element Array{Int64,1}:
3
7
11
Now that's doing what you want! You don't even need to define a named intermediate function if you don't want to. This is exactly equivalent:
julia> (args->foo(args...)).(args_array)
3-element Array{Int64,1}:
3
7
11
And in fact you can generalize this quite easily:
julia> splat(f) = args -> f(args...);
julia> (splat(foo)).(args_array)
3-element Array{Int64,1}:
3
7
11
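For comparison, here are a few equivalent spellings in one place. This is a sketch only: splat_args is a hypothetical helper name, chosen so that it cannot clash with any existing function called splat.

```julia
foo(x, y) = x + y
args_array = [(1, 2), (3, 4), (5, 6)]

# map with an anonymous function that splats each tuple
r1 = map(args -> foo(args...), args_array)

# broadcast over the first and second components separately
r2 = foo.(first.(args_array), last.(args_array))

# hypothetical helper: wrap f so that it accepts a single tuple
splat_args(f) = args -> f(args...)
r3 = (splat_args(foo)).(args_array)

@show r1 r2 r3
```

The first.()/last.() form only fits two-argument functions, while the splatting forms work for tuples of any length that foo accepts.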
You could zip the args_array, which effectively transposes the array of tuples:
julia> collect(zip(args_array...))
2-element Array{Tuple{Int64,Int64,Int64},1}:
(1, 3, 5)
(2, 4, 6)
Then you can broadcast foo over the transposed array (actually an iterator) of tuples:
julia> foo.(zip(args_array...)...)
(3, 7, 11)
However, this returns a tuple instead of an array. If you need the return value to be an array, you could use any of the following somewhat cryptic solutions:
julia> foo.(collect.(zip(args_array...))...)
3-element Array{Int64,1}:
3
7
11
julia> collect(foo.(zip(args_array...)...))
3-element Array{Int64,1}:
3
7
11
julia> [foo.(zip(args_array...)...)...]
3-element Array{Int64,1}:
3
7
11
How about
[foo(x,y) for (x,y) in args_array]

Can broadcast be applied to subarrays/slices of array in julia

I would like to broadcast over subarrays (i.e. broadcast over slices of an array). This is useful in GPU programming. For example, I'd like to have:
X,Y,Z = (rand(3,3,3) for _=1:3)
@.[1,2] X = f(2X^2 + 6X^3 - sqrt(X)) + Y*Z
where @.[1,2] means broadcasting along dim 3, i.e. applying colons to dims 1 and 2 in the expression.
Is there a way to support this "sub-broadcast"?
Edit: add an example
julia> a = reshape(1:8, (2,2,2))
2×2×2 Base.ReshapedArray{Int64,3,UnitRange{Int64},Tuple{}}:
[:, :, 1] =
1 3
2 4
[:, :, 2] =
5 7
6 8
julia> broadcast(*, a, a)
2×2×2 Array{Int64,3}:
[:, :, 1] =
1 9
4 16
[:, :, 2] =
25 49
36 64
julia> broadcast(*, a, a, dim=3) # I would like to broadcast the matrix multiplication (batch mode) instead of elementwise multiplication.
2×2×2 Array{Int64,3}:
[:, :, 1] =
7 15
10 22
[:, :, 2] =
67 91
78 106
Edit 2: I am trying the different vectorization methods described at https://arrayfire.com/introduction-to-vectorization/ via the ArrayFire.jl package (a wrapper of arrayfire), including vectorization, parallel for-loops, batching, and advanced vectorizations. arrayfire has the gfor method (http://arrayfire.org/docs/page_gfor.htm) to run parallel computations on slices of matrices, and it is implemented via broadcast in ArrayFire.jl. Currently, Julia's broadcast acts element-wise. I wonder whether it could act "slice-wise", which would allow pure Julia 3D and 4D support for Linear Algebra functions (https://github.com/arrayfire/arrayfire/issues/483).
Of course, ordinary nested for loops will get the job done. I am just excited about the broadcast . syntax and wonder if it can be extended.
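I think you're looking for mapslices. (The call below uses the older positional form; from Julia 1.0 on, dims is a keyword argument, i.e. mapslices(x->x*x, a, dims=(1,2)).)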
mapslices(x->x*x, a, (1,2))
2×2×2 Array{Int64,3}:
[:, :, 1] =
7 15
10 22
[:, :, 2] =
67 91
78 106
mapslices(f, A, dims)
Transform the given dimensions of array A using function f. f is
called on each slice of A of the form A[...,:,...,:,...]. dims is an
integer vector specifying where the colons go in this expression.
The results are concatenated along the remaining dimensions. For
example, if dims is [1,2] and A is 4-dimensional, f is called on
A[:,:,i,j] for all i and j.
Use setdiff if you want to specify which dimension to concatenate along instead of on which to apply the function.
(If you need a multi-argument version check out this gist https://gist.github.com/alexmorley/e585df0d8d857d7c9e4a5af75df43d00)
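For reference, a minimal sketch of the same computation with the keyword form of dims used by Julia 1.x, together with one possible two-argument variant. The two-argument version is an assumption-laden illustration written as a plain comprehension over the third dimension; it is not the code from the linked gist.

```julia
a = collect(reshape(1:8, (2, 2, 2)))

# Julia 1.x: dims is a keyword argument.
sq = mapslices(x -> x * x, a; dims=(1, 2))

# Two-argument "batched" matrix product: multiply matching slices of a and b,
# then glue the results back together along dim 3.
b = a .+ 1
prod3 = cat((a[:, :, k] * b[:, :, k] for k in axes(a, 3))...; dims=3)

@show sq[:, :, 1] size(prod3)
```

Splatting the generator into cat is fine for a handful of slices; for many slices, preallocating the result and filling it in a loop is likely the safer choice.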
