Peculiar behavior with two dimensional array of vectors in Julia - julia

I want to create a two dimensional array of 2D vectors (to represent a vector field).
My code is something like this
N=10
dx=1/(N-1)
dy=1/(N-1)
#initial data
U=fill(zeros(2), (N, N))
for i=1:1:N
for j=1:1:N
U[i,j][1]=(i-1)*dx
U[i,j][2]=(j-1)*dy
end
end
print(U[5, 7])
The result is [1.0, 1.0], which is not what I want. I have no idea why. However, if I change the code to something like this
N=10
dx=1/(N-1)
dy=1/(N-1)
#initial data
U=fill(zeros(2), (N, N))
for i=1:1:N
for j=1:1:N
U[i,j]=[(i-1)*dx, (j-1)*dy]
end
end
print(U[5, 7])
Then it prints the correct result, which is [0.4444444444444444, 0.6666666666666666]. So, what is going on?

This behaviour is expected. Note the following:
julia> x = fill(zeros(1), 2)
2-element Array{Array{Float64,1},1}:
[0.0]
[0.0]
julia> x[1][1] = 5.0
5.0
julia> x[2][1]
5.0
I managed to change x[2][1] just by changing x[1][1]. You can probably guess the problem at this point. You are populating all the elements of your matrix with the same vector. Therefore when you mutate one, you are mutating all.
To get the behaviour you want, you could build your initial matrix like this:
x = [ zeros(2) for n in 1:N, m in 1:N ]
The key point here is to consider whether the first argument to your fill call is, or contains, a mutable. If it does not, then it'll work like you expected, e.g. fill(0.0, 2). But if it does contain a mutable, then the output of fill will contain pointers to the single mutable object, and you'll get the behaviour you've encountered above.
Note, my use of the word "contains" here is important, since an immutable that contains a mutable will still result in a pointer to the single mutable object and hence the behaviour you have encountered. So for example:
struct T1 ; x::Vector{Float64} ; end
T1() = T1(zeros(1))
x = fill(T1(), 2)
x[1].x[1] = 5.0
x[2].x[1]
still mutates the second element of x.
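One way to confirm the aliasing described above is to compare elements with ===, which tests object identity. The following is a minimal sketch, not part of the original answer; the names shared and separate are illustrative only.

```julia
N = 3

# fill evaluates zeros(2) once and stores that same vector in every slot
shared = fill(zeros(2), (N, N))
# the comprehension evaluates zeros(2) once per element
separate = [zeros(2) for i in 1:N, j in 1:N]

println(shared[1, 1] === shared[2, 3])      # true: one object, many references
println(separate[1, 1] === separate[2, 3])  # false: distinct objects

shared[1, 1][1] = 5.0
separate[1, 1][1] = 5.0
println(shared[2, 3][1])    # 5.0, every slot refers to the same vector
println(separate[2, 3][1])  # 0.0, the other elements are untouched
```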

The issue is that fill(zeros(2), (N, N)) creates an array in which every entry refers to the same single 2-element vector. So if you change the contents of one, they all change:
julia> U = fill(ones(2), 2)
2-element Vector{Vector{Float64}}:
[1.0, 1.0]
[1.0, 1.0]
julia> U[1][1] = 2
2
julia> U
2-element Vector{Vector{Float64}}:
[2.0, 1.0]
[2.0, 1.0]
Thus, in your first snippet, you only change the contents of the unique 2-element vector. Check your U, it should be filled with [1.0, 1.0].
In your second snippet however, you allocate new vectors in each entry of U, so you don't have this problem.
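A solution is to preallocate differently, leaving the entries unassigned and then assigning a new vector to each one, as your second snippet does. You could try, e.g.,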
U = Array{Vector{Float64}}(undef, (N, N))
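As a sketch of how that preallocation might be used with the grid from the question (N, dx and dy are taken from the question; the loop order is my own choice): the entries start out unassigned, so each one has to be given its own vector before it is read.

```julia
N = 10
dx = 1 / (N - 1)
dy = 1 / (N - 1)

# Entries are unassigned (#undef) until each one is given a vector
U = Array{Vector{Float64}}(undef, (N, N))
for j in 1:N, i in 1:N
    U[i, j] = [(i - 1) * dx, (j - 1) * dy]   # allocates a fresh vector per cell
end

println(U[5, 7])
```

Reading an entry before it has been assigned throws an UndefRefError, which makes a forgotten initialization easy to spot.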

Related

Why doesn't julia broadcasting work when deals with more than one array?

I have defined two structs and a function like this
struct A
x::Float64
end
struct B
y::Float64
end
f(a::A, b::B) = a.x*sin(b.y)
f.([A(0.1), A(0.2)], [B(1.), B(2.), B(3.)])
But f returns this error:
DimensionMismatch("arrays could not be broadcast to a common size")
How can I solve this error? I expect an array with 6 elements as the function output.
The problem is that your first argument is a 2-element Vector and your second argument is a 3-element Vector.
If you e.g. make the first argument a 1x2 Matrix, then all works fine:
julia> f.([A(0.1) A(0.2)], [B(1.), B(2.), B(3.)])
3×2 Array{Float64,2}:
0.0841471 0.168294
0.0909297 0.181859
0.014112 0.028224
(note that missing or length-1 dimensions are broadcast automatically)
Note that you could also broadcast calls to A and B constructors:
f.(A.([0.1 0.2]), B.(1.:3.))
The arrays have to have compatible dimensions: either they are identical in size and shape (elementwise operations), or each has singleton dimensions where the others have non-singleton dimensions, so that together they span a larger array. For example, in terms of dimensions, the .* operator gives the mapping
(1 x 1 x n) .* (p x q x 1) => p x q x n
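The shape rule can be checked directly with size. This is a small sketch with arbitrary sizes (n = 4, p = 2, q = 3); the variable names are illustrative.

```julia
a = rand(1, 1, 4)   # 1 x 1 x n
b = rand(2, 3, 1)   # p x q x 1
c = a .* b
println(size(c))    # (2, 3, 4)

# A Vector behaves like n x 1, so pairing it with a 1 x m row gives n x m
col = [1.0, 2.0, 3.0]
row = [10.0 20.0]
println(size(col .+ row))  # (3, 2)
```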

In-place modification/reassignment of vector in Julia without getting copies

Here's some toy code:
type MyType
x::Int
end
vec = [MyType(1), MyType(2), MyType(3), MyType(4)]
ids = [2, 1, 3, 1]
vec = vec[ids]
julia> vec
4-element Array{MyType,1}:
MyType(2)
MyType(1)
MyType(3)
MyType(1)
That looks fine, except for this behavior:
julia> vec[2].x = 60
60
julia> vec
4-element Array{MyType,1}:
MyType(2)
MyType(60)
MyType(3)
MyType(60)
I want to be able to rearrange the contents of a vector, with the possibility that I eliminate some values and duplicate others. But when I duplicate values, I don't want this copy behavior. Is there an "elegant" way to do this? Something like this works, but yeesh:
vec = [deepcopy(vec[ids[i]]) for i in 1:4]
The issue is that you're creating mutable types, and your vector therefore contains references to the instantiated data - so when you create a vector based on ids, you're creating what amounts to a vector of pointers to the structures. This further means that the elements in the vector with the same id are actually pointers to the same object.
There's no good way to do this without ensuring that your references are different. That either means 1) immutable types, which means you can't reassign x, or 2) copy/deepcopy.
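Here is a sketch of option 2 on Julia 1.x, where type is spelled mutable struct. Item is a hypothetical stand-in for MyType, and the vector names are illustrative.

```julia
mutable struct Item   # hypothetical stand-in for MyType
    x::Int
end

items = [Item(1), Item(2), Item(3), Item(4)]
ids = [2, 1, 3, 1]

# Indexing copies references, so positions 2 and 4 share one object
aliased = items[ids]
println(aliased[2] === aliased[4])   # true

# deepcopy applied per element gives every position its own object
independent = [deepcopy(items[k]) for k in ids]
independent[2].x = 60
println(independent[4].x)            # 1
println(items[1].x)                  # 1, the source vector is unchanged too
```

Note that the copy has to happen per element: calling deepcopy once on the whole reordered vector preserves the sharing between positions 2 and 4.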

Creating vector from two vectors in Julia

I have two vectors, say, x=[1;1] and y=[2;2]
I want to construct a vector whose element is a combination of these two, i.e. z=[[1,2],[1,2]]
What is the most efficient way to do that?
Just use zip. By default, this will create a vector of tuples:
julia> z = collect(zip(x,y))
2-element Array{Tuple{Int64,Int64},1}:
(1,2)
(1,2)
Note that this is different than what you wanted, but it'll be much more efficient. If you really want an array of arrays, you can use a comprehension:
julia> [[a,b] for (a,b) in zip(x,y)]
2-element Array{Array{Int64,1},1}:
[1,2]
[1,2]
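To make the efficiency remark concrete, here is a small sketch comparing the two element types. The explanation in the comments is my own reading: tuples of integers are plain bits values, while each inner Vector is a separate heap object.

```julia
x = [1, 1]
y = [2, 2]

tuples = collect(zip(x, y))                  # Vector of (Int, Int) tuples
vectors = [[a, b] for (a, b) in zip(x, y)]   # Vector of 2-element Vectors

println(isbitstype(eltype(tuples)))    # true: plain bits values
println(isbitstype(eltype(vectors)))   # false: each element is its own array
println(tuples[1][2])                  # 2
println(vectors[1][2])                 # 2
```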

Slicing and broadcasting multidimensional arrays in Julia : meshgrid example

I recently started learning Julia by coding a simple implementation of Self Organizing Maps. I want the size and dimensions of the map to be specified by the user, which means I can't really use for loops to work on the map arrays because I don't know in advance how many layers of loops I will need. So I absolutely need broadcasting and slicing functions that work on arrays of arbitrary dimensions.
Right now, I need to construct an array of indices of the map. Say my map is defined by an array of size mapsize = (5, 10, 15), I need to construct an array indices of size (3, 5, 10, 15) where indices[:, a, b, c] should return [a, b, c].
I come from a Python/NumPy background, in which the solution is already given by a specific "function", mgrid :
indices = numpy.mgrid[:5, :10, :15]
print indices.shape # gives (3, 5, 10, 15)
print indices[:, 1, 2, 3] # gives [1, 2, 3]
I didn't expect Julia to have such a function from the get-go, so I turned to broadcasting. In NumPy, broadcasting is based on a set of rules that I find quite clear and logical. You can use arrays of different dimensions in the same expression as long as the sizes in each dimension match or one of them is 1 :
(5, 10, 15) and (10, 1) broadcast to (5, 10, 15)
(5, 1, 15) and (1, 10, 1) also broadcast to (5, 10, 15)
To help with this, you can also use numpy.newaxis or None to easily add new dimensions to your array :
array = numpy.zeros((5, 15))
array[:,None,:] # has shape (5, 1, 15)
This helps broadcast arrays easily :
A = numpy.arange(5)
B = numpy.arange(10)
C = numpy.arange(15)
bA, bB, bC = numpy.broadcast_arrays(A[:,None,None], B[None,:,None], C[None,None,:])
bA.shape == bB.shape == bC.shape == (5, 10, 15)
Using this, creating the indices array is rather straightforward :
indices = numpy.array(numpy.broadcast_arrays(A[:,None,None], B[None,:,None], C[None,None,:]))
(indices == numpy.mgrid[:5,:10,:15]).all() # returns True
The general case is of course a bit more complicated, but can be worked around using list comprehension and slices :
arrays = [ numpy.arange(i)[tuple([None if m!=n else slice(None) for m in range(len(mapsize))])] for n, i in enumerate(mapsize) ]
indices = numpy.array(numpy.broadcast_arrays(*arrays))
So back to Julia. I tried to apply the same kind of rationale and ended up achieving the equivalent of the arrays list of the code above. This ended up being rather simpler than the NumPy counterpart thanks to the compound expression syntax :
arrays = [ (idx = ones(Int, length(mapsize)); idx[n] = i;reshape([1:i], tuple(idx...))) for (n,i)=enumerate(mapsize) ]
Now I'm stuck here, as I don't really know how to apply the broadcasting to my list of generating arrays here... The broadcast[!] functions ask for a function f to apply, and I don't have any. I tried using a for loop to try forcing the broadcasting:
indices = Array(Int, tuple(unshift!([i for i=mapsize], length(mapsize))...))
for i=1:length(mapsize)
A[i] = arrays[i]
end
But this gives me an error : ERROR: convert has no method matching convert(::Type{Int64}, ::Array{Int64,3})
Am I doing this the right way? Did I overlook something important? Any help is appreciated.
If you're running julia 0.4, you can do this:
julia> function mgrid(mapsize)
T = typeof(CartesianIndex(mapsize))
indices = Array(T, mapsize)
for I in eachindex(indices)
indices[I] = I
end
indices
end
It would be even nicer if one could just say
indices = [I for I in CartesianRange(CartesianIndex(mapsize))]
I'll look into that :-).
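For readers on Julia 1.x, where CartesianRange was replaced by CartesianIndices and Array(T, dims) is no longer valid syntax, a sketch along the same lines might look like this. The small mapsize is chosen only to keep the output short.

```julia
mapsize = (2, 3, 4)
N = length(mapsize)

# An array whose entry at [a, b, c] is CartesianIndex(a, b, c)
grid = collect(CartesianIndices(mapsize))
println(grid[1, 2, 3])         # CartesianIndex(1, 2, 3)

# NumPy-style layout: indices[:, a, b, c] == [a, b, c]
indices = [I[d] for d in 1:N, I in CartesianIndices(mapsize)]
println(size(indices))         # (3, 2, 3, 4)
println(indices[:, 1, 2, 3])   # [1, 2, 3]
```

In many cases the CartesianIndices object can be iterated or indexed directly, without collecting it into an array at all.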
Broadcasting in Julia has been modelled pretty much on broadcasting in NumPy, so you should hopefully find that it obeys more or less the same simple rules (not sure if the way to pad dimensions when not all inputs have the same number of dimensions is the same though, since Julia arrays are column-major).
A number of useful things like newaxis indexing and broadcast_arrays have not been implemented (yet) however. (I hope they will.) Also note that indexing works a bit differently in Julia compared to NumPy: when you leave off indices for trailing dimensions in NumPy, the remaining indices default to colons. In Julia they could be said to default to ones instead.
I'm not sure if you actually need a meshgrid function; most things that you would want to use it for could be done by using the original entries of your arrays array with broadcasting operations. The major reason that meshgrid is useful in MATLAB is that MATLAB is terrible at broadcasting.
But it is quite straightforward to accomplish what you want to do using the broadcast! function:
# assume mapsize is a vector with the desired shape, e.g. mapsize = [2,3,4]
N = length(mapsize)
# Your line to create arrays below, with an extra initial dimension on each array
arrays = [ (idx = ones(Int, N+1); idx[n+1] = i;reshape([1:i], tuple(idx...))) for (n,i) in enumerate(mapsize) ]
# Create indices and fill it one coordinate at a time
indices = zeros(Int, tuple(N, mapsize...))
for (i,arr) in enumerate(arrays)
dest = sub(indices, i, [Colon() for j=1:N]...)
broadcast!(identity, dest, arr)
end
I had to add an initial singleton dimension on the entries of arrays to line up with the axes of indices (newaxis would have been useful here...).
Then I go through each coordinate, create a subarray (a view) on the relevant part of indices, and fill it. (Indexing will default to returning subarrays in Julia 0.4, but for now we have to use sub explicitly).
The call to broadcast! just evaluates the identity function identity(x)=x on the input arr=arrays[i], broadcasts to the shape of the output. There's no efficiency lost in using the identity function for this; broadcast! generates a specialized function based on the given function, number of arguments, and number of dimensions of the result.
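The same approach can be sketched for Julia 1.x, under the assumption that view replaces sub and that reshape is applied to the range directly. Because the integer index in view drops the leading dimension, the extra singleton dimension is not needed in this variant. The names coords and dest are illustrative.

```julia
mapsize = (2, 3, 4)
N = length(mapsize)

indices = zeros(Int, N, mapsize...)
for (n, len) in enumerate(mapsize)
    # 1:len laid out along dimension n, with singleton dimensions elsewhere
    shape = ntuple(d -> d == n ? len : 1, N)
    coords = reshape(1:len, shape)
    # View on the n-th coordinate plane; it has size mapsize
    dest = view(indices, n, ntuple(_ -> Colon(), N)...)
    broadcast!(identity, dest, coords)
end

println(size(indices))         # (3, 2, 3, 4)
println(indices[:, 1, 2, 3])   # [1, 2, 3]
```

The line broadcast!(identity, dest, coords) could also be written as dest .= coords.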
I guess this is the same as the MATLAB meshgrid functionality. I've never really thought about the generalization to more than two dimensions, so it's a bit harder to get my head around.
First, here is my completely general version, which is kinda crazy but I can't think of a better way to do it without generating code for common dimensions (e.g. 2, 3)
function numpy_mgridN(dims...)
X = Any[zeros(Int,dims...) for d in 1:length(dims)]
for d in 1:length(dims)
base_idx = Any[1:nd for nd in dims]
for i in 1:dims[d]
cur_idx = copy(base_idx)
cur_idx[d] = i
X[d][cur_idx...] = i
end
end
@show X
end
X = numpy_mgridN(3,4,5)
@show X[1][1,2,3] # 1
@show X[2][1,2,3] # 2
@show X[3][1,2,3] # 3
Now, what I mean by code generation is that, for the 2D case, you can simply do
function numpy_mgrid(dim1,dim2)
X = [i for i in 1:dim1, j in 1:dim2]
Y = [j for i in 1:dim1, j in 1:dim2]
return X,Y
end
and for the 3D case:
function numpy_mgrid(dim1,dim2,dim3)
X = [i for i in 1:dim1, j in 1:dim2, k in 1:dim3]
Y = [j for i in 1:dim1, j in 1:dim2, k in 1:dim3]
Z = [k for i in 1:dim1, j in 1:dim2, k in 1:dim3]
return X,Y,Z
end
Test with, e.g.
X,Y,Z=numpy_mgrid(3,4,5)
@show X
@show Y
@show Z
I guess mgrid shoves them all into one tensor, so you could do that like this
all = cat(4,X,Y,Z)
which is still slightly different:
julia> all[1,2,3,:]
1x1x1x3 Array{Int64,4}:
[:, :, 1, 1] =
1
[:, :, 1, 2] =
2
[:, :, 1, 3] =
3
julia> vec(all[1,2,3,:])
3-element Array{Int64,1}:
1
2
3
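On Julia 1.x the dimension is passed to cat as a keyword argument, and indexing with scalars drops those dimensions, so the vec call is no longer needed. A sketch under those assumptions, with stacked as an illustrative name that avoids shadowing the built-in all:

```julia
X = [i for i in 1:3, j in 1:4, k in 1:5]
Y = [j for i in 1:3, j in 1:4, k in 1:5]
Z = [k for i in 1:3, j in 1:4, k in 1:5]

# Stack the three coordinate arrays along a new fourth dimension
stacked = cat(X, Y, Z; dims=4)
println(size(stacked))         # (3, 4, 5, 3)
println(stacked[1, 2, 3, :])   # [1, 2, 3]
```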

Julia: append to an empty vector

I would like to create an empty vector and append to it an array in Julia. How do I do that?
x = Vector{Float64}
append!(x, rand(10))
results in
`append!` has no method matching append!(::Type{Array{Float64,1}}, ::Array{Float64,1})
Thanks.
Your variable x does not contain an array but a type.
x = Vector{Float64}
typeof(x) # DataType
You can create an array with Array(Float64, n) (on Julia 1.x this is written Array{Float64}(undef, n)), but beware that it is uninitialized and contains arbitrary values, or with zeros(Float64, n), where n is the desired size.
Since Float64 is the default element type for zeros, we can leave it out.
Your example becomes:
x = zeros(0)
append!( x, rand(10) )
I am somewhat new to Julia and came across this question after getting a similar error. To answer the original question for Julia version 1.2.0, all that is missing are ():
x = Vector{Float64}()
append!(x, rand(10))
This solution (unlike x=zeros(0)) works for other data types, too. For example, to create an empty vector to store dictionaries use:
d = Vector{Dict}()
push!(d, Dict("a"=>1, "b"=>2))
A note regarding use of push! and append!:
According to the Julia help, push! is used to add individual items to a collection, while append! adds a collection of items to a collection. So, the following pieces of code create the same array:
Push individual items:
a = Vector{Float64}()
push!(a, 1.0)
push!(a, 2.0)
Append items contained in an array:
a = Vector{Float64}()
append!(a, [1.0, 2.0])
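The difference matters most when the elements are themselves arrays. A brief sketch (nested is an illustrative name):

```julia
a = Vector{Float64}()
push!(a, 1.0)             # adds one element
append!(a, [2.0, 3.0])    # adds every element of the given collection
println(a)                # [1.0, 2.0, 3.0]

nested = Vector{Vector{Float64}}()
push!(nested, [1.0, 2.0])         # the array itself becomes one element
append!(nested, [[3.0], [4.0]])   # each inner array becomes an element
println(length(nested))           # 3
```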
You can initialize an empty Vector of any type by typing the type in front of []. Like:
Float64[] # Returns what you want
Array{Float64, 2}[] # Vector of Array{Float64,2}
Any[] # Can contain anything
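New answer, for Julia 1. append! is still available and is not deprecated; use push!(array, element) to add a single element to an array, and append! to add several at once. The array must be a Vector whose element type can hold the new element:
my_stuff = String[]
push!(my_stuff, "new element")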
