Unique elements of an array of arrays with "issetequal" - julia

For an array like
v=[[1,2],[1,2,3],[2,3,1]]
I am looking for a method to delete all entries that are duplicated in the sense that they are equal when considered as sets.
In this example, issetequal([1,2,3],[2,3,1]) = true, so the method should return the array [[1,2],[1,2,3]].
In principle, something like unique(issetequal, v) would solve the problem. But in practice, this option gives the error
ERROR: MethodError: no method matching issetequal(::Array{Int64,1})
Does anybody have a sugestion?

From the documentation, we see that this form of unique takes as first argument a unary function:
unique(f, itr)
Returns an array containing one value from itr for each unique value produced by f applied to elements of itr.
Examples
≡≡≡≡≡≡≡≡≡≡
julia> unique(x -> x^2, [1, -1, 3, -3, 4])
3-element Array{Int64,1}:
1
3
4
In your example, issetequal is a binary function that directly checks the set-equality of two values. What you want instead is the Set constructor, which constructs a Set out of an Array. You can then let unique test the equality between sets:
julia> unique(Set, [[1,2],[1,2,3],[2,3,1]])
2-element Array{Array{Int64,1},1}:
[1, 2]
[1, 2, 3]

Related

Why doesn't julia broadcasting work when deals with more than one array?

I have defined two structs and a function like this
struct A
x::Float64
end
struct B
y::Float64
end
f(a::A, b::B) = a.x*sin(b.y)
f.([A(0.1), A(0.2)], [B(1.), B(2.), B(3.)])
But f returns this error:
DimensionMismatch("arrays could not be broadcast to a common size")
How can I solve this error? I expect an array with 6 elements as the function output.
The problem is that your first argument is a 2-element Vector, and a second argument is 3-element Vector.
If you e.g. make the first argument a 1x2 Matrix, then all works fine:
julia> f.([A(0.1) A(0.2)], [B(1.), B(2.), B(3.)])
3×2 Array{Float64,2}:
0.0841471 0.168294
0.0909297 0.181859
0.014112 0.028224
(note that the missing or 1-length dimensions get automatically broadcasted)
Note that you could also broadcast calls to A and B constructors:
f.(A.([0.1 0.2]), B.(1.:3.))
The arrays have to have compatible dimensions - either identical in size and shape (local operations), or they span a larger vector space where each has singleton dimensions where the others have non-singleton dimensions, e.g. as an operation on the dimensions, the .* operator will cause the mapping
(1 x 1 x n) .* (p x q x 1) => p x q x n

In-place modification/reassignment of vector in Julia without getting copies

Here's some toy code:
type MyType
x::Int
end
vec = [MyType(1), MyType(2), MyType(3), MyType(4)]
ids = [2, 1, 3, 1]
vec = vec[ids]
julia> vec
4-element Array{MyType,1}:
MyType(2)
MyType(1)
MyType(3)
MyType(1)
That looks fine, except for this behavior:
julia> vec[2].x = 60
60
julia> vec
4-element Array{MyType,1}:
MyType(2)
MyType(60)
MyType(3)
MyType(60)
I want to be able to rearrange the contents of a vector, with the possibility that I eliminate some values and duplicate others. But when I duplicate values, I don't want this copy behavior. Is there an "elegant" way to do this? Something like this works, but yeesh:
vec = [deepcopy(vec[ids[i]]) for i in 1:4]
The issue is that you're creating mutable types, and your vector therefore contains references to the instantiated data - so when you create a vector based on ids, you're creating what amounts to a vector of pointers to the structures. This further means that the elements in the vector with the same id are actually pointers to the same object.
There's no good way to do this without ensuring that your references are different. That either means 1) immutable types, which means you can't reassign x, or 2) copy/deepcopy.

How to get minimal value of Array in Julia?

How do I get the minimum value of an Array, Vector, or Matrix in Julia?
The code min([1, 2, 3]) doesn't work.
The Julia manual:
https://docs.julialang.org/en/v1/base/math/#Base.min
https://docs.julialang.org/en/v1/base/collections/#Base.minimum
min(x, y, ...)
Return the minimum of the arguments. Operates elementwise over arrays.
julia> min([1, 2, 3]...)
1
julia> min(2,3)
2
minimum(A, dims)
Compute the minimum value of an array over the given dimensions.
minimum!(r, A)
Compute the minimum value of A over the singleton dimensions of r, and write results to r.
julia> minimum([1, 2, 3])
1

Slicing and broadcasting multidimensional arrays in Julia : meshgrid example

I recently started learning Julia by coding a simple implementation of Self Organizing Maps. I want the size and dimensions of the map to be specified by the user, which means I can't really use for loops to work on the map arrays because I don't know in advance how many layers of loops I will need. So I absolutely need broadcasting and slicing functions that work on arrays of arbitrary dimensions.
Right now, I need to construct an array of indices of the map. Say my map is defined by an array of size mapsize = (5, 10, 15), I need to construct an array indices of size (3, 5, 10, 15) where indices[:, a, b, c] should return [a, b, c].
I come from a Python/NumPy background, in which the solution is already given by a specific "function", mgrid :
indices = numpy.mgrid[:5, :10, :15]
print indices.shape # gives (3, 5, 10, 15)
print indices[:, 1, 2, 3] gives [1, 2, 3]
I didn't expect Julia to have such a function on the get-go, so I turned to broadcasting. In NumPy, broadcasting is based on a set of rules that I find quite clear and logical. You can use arrays of different dimensions in the same expression as long as the sizes in each dimension match or one of it is 1 :
(5, 10, 15) broadcasts to (5, 10, 15)
(10, 1)
(5, 1, 15) also broadcasts to (5, 10, 15)
(1, 10, 1)
To help with this, you can also use numpy.newaxis or None to easily add new dimensions to your array :
array = numpy.zeros((5, 15))
array[:,None,:] has shape (5, 1, 15)
This helps broadcast arrays easily :
A = numpy.arange(5)
B = numpy.arange(10)
C = numpy.arange(15)
bA, bB, bC = numpy.broadcast_arrays(A[:,None,None], B[None,:,None], C[None,None,:])
bA.shape == bB.shape == bC.shape = (5, 10, 15)
Using this, creating the indices array is rather straightforward :
indices = numpy.array(numpy.broadcast_arrays(A[:,None,None], B[None,:,None], C[None,None,:]))
(indices == numpy.mgrid[:5,:10,:15]).all() returns True
The general case is of course a bit more complicated, but can be worked around using list comprehension and slices :
arrays = [ numpy.arange(i)[tuple([None if m!=n else slice(None) for m in range(len(mapsize))])] for n, i in enumerate(mapsize) ]
indices = numpy.array(numpy.broadcast_arrays(*arrays))
So back to Julia. I tried to apply the same kind of rationale and ended up achieving the equivalent of the arrays list of the code above. This ended up being rather simpler than the NumPy counterpart thanks to the compound expression syntax :
arrays = [ (idx = ones(Int, length(mapsize)); idx[n] = i;reshape([1:i], tuple(idx...))) for (n,i)=enumerate(mapsize) ]
Now I'm stuck here, as I don't really know how to apply the broadcasting to my list of generating arrays here... The broadcast[!] functions ask for a function f to apply, and I don't have any. I tried using a for loop to try forcing the broadcasting:
indices = Array(Int, tuple(unshift!([i for i=mapsize], length(mapsize))...))
for i=1:length(mapsize)
A[i] = arrays[i]
end
But this gives me an error : ERROR: convert has no method matching convert(::Type{Int64}, ::Array{Int64,3})
Am I doing this the right way? Did I overlook something important? Any help is appreciated.
If you're running julia 0.4, you can do this:
julia> function mgrid(mapsize)
T = typeof(CartesianIndex(mapsize))
indices = Array(T, mapsize)
for I in eachindex(indices)
indices[I] = I
end
indices
end
It would be even nicer if one could just say
indices = [I for I in CartesianRange(CartesianIndex(mapsize))]
I'll look into that :-).
Broadcasting in Julia has been modelled pretty much on broadcasting in NumPy, so you should hopefully find that it obeys more or less the same simple rules (not sure if the way to pad dimensions when not all inputs have the same number of dimensions is the same though, since Julia arrays are column-major).
A number of useful things like newaxis indexing and broadcast_arrays have not been implemented (yet) however. (I hope they will.) Also note that indexing works a bit differently in Julia compared to NumPy: when you leave off indices for trailing dimensions in NumPy, the remaining indices default to colons. In Julia they could be said to default to ones instead.
I'm not sure if you actually need a meshgrid function, most things that you would want to use it for could be done by using the original entries of your arrays array with broadcasting operations. The major reason that meshgrid is useful in matlab is because it is terrible at broadcasting.
But it is quite straightforward to accomplish what you want to do using the broadcast! function:
# assume mapsize is a vector with the desired shape, e.g. mapsize = [2,3,4]
N = length(mapsize)
# Your line to create arrays below, with an extra initial dimension on each array
arrays = [ (idx = ones(Int, N+1); idx[n+1] = i;reshape([1:i], tuple(idx...))) for (n,i) in enumerate(mapsize) ]
# Create indices and fill it one coordinate at a time
indices = zeros(Int, tuple(N, mapsize...))
for (i,arr) in enumerate(arrays)
dest = sub(indices, i, [Colon() for j=1:N]...)
broadcast!(identity, dest, arr)
end
I had to add an initial singleton dimension on the entries of arrays to line up with the axes of indices (newaxis had been useful here...).
Then I go through each coordinate, create a subarray (a view) on the relevant part of indices, and fill it. (Indexing will default to returning subarrays in Julia 0.4, but for now we have to use sub explicitly).
The call to broadcast! just evaluates the identity function identity(x)=x on the input arr=arrays[i], broadcasts to the shape of the output. There's no efficiency lost in using the identity function for this; broadcast! generates a specialized function based on the given function, number of arguments, and number of dimensions of the result.
I guess this is the same as the MATLAB meshgrid functionality. I've never really thought about the generalization to more than two dimensions, so its a bit harder to get my head around.
First, here is my completely general version, which is kinda crazy but I can't think of a better way to do it without generating code for common dimensions (e.g. 2, 3)
function numpy_mgridN(dims...)
X = Any[zeros(Int,dims...) for d in 1:length(dims)]
for d in 1:length(dims)
base_idx = Any[1:nd for nd in dims]
for i in 1:dims[d]
cur_idx = copy(base_idx)
cur_idx[d] = i
X[d][cur_idx...] = i
end
end
#show X
end
X = numpy_mgridN(3,4,5)
#show X[1][1,2,3] # 1
#show X[2][1,2,3] # 2
#show X[3][1,2,3] # 3
Now, what I mean by code generation is that, for the 2D case, you can simply do
function numpy_mgrid(dim1,dim2)
X = [i for i in 1:dim1, j in 1:dim2]
Y = [j for i in 1:dim1, j in 1:dim2]
return X,Y
end
and for the 3D case:
function numpy_mgrid(dim1,dim2,dim3)
X = [i for i in 1:dim1, j in 1:dim2, k in 1:dim3]
Y = [j for i in 1:dim1, j in 1:dim2, k in 1:dim3]
Z = [k for i in 1:dim1, j in 1:dim2, k in 1:dim3]
return X,Y,Z
end
Test with, e.g.
X,Y,Z=numpy_mgrid(3,4,5)
#show X
#show Y
#show Z
I guess mgrid shoves them all into one tensor, so you could do that like this
all = cat(4,X,Y,Z)
which is still slightly different:
julia> all[1,2,3,:]
1x1x1x3 Array{Int64,4}:
[:, :, 1, 1] =
1
[:, :, 1, 2] =
2
[:, :, 1, 3] =
3
julia> vec(all[1,2,3,:])
3-element Array{Int64,1}:
1
2
3

Initialize an array of arrays in Julia

I'm trying to create an array of two arrays. However, a = [[1, 2], [3, 4]] doesn't do that, it actually concats the arrays. This is true in Julia: [[1, 2], [3, 4]] == [1, 2, 3, 4]. Any idea?
As a temporary workaround, I use push!(push!(Array{Int, 1}[], a), b).
If you want an array of arrays as opposed to a matrix (i.e. 2-dimensional Array):
a = Array[ [1,2], [3,4] ]
You can parameterize (specify the type of the elements) an Array literal by putting the type in front of the []. So here we are parameterizing the Array literal with the Array type. This changes the interpretation of brackets inside the literal declaration.
Sean Mackesey's answer will give you something of type Array{Array{T,N},1} (or Array{Array{Int64,N},1}, if you put the type in front of []). If you instead want something more strongly typed, for example a vector of vectors of Int (i.e. Array{Array{Int64,1},1}), use the following:
a = Vector{Int}[ [1,2], [3,4] ]
In Julia v0.5, the original syntax now produces the desired result:
julia> a = [[1, 2], [3, 4]]
2-element Array{Array{Int64,1},1}:
[1,2]
[3,4]
julia> VERSION
v"0.5.0"
For a general answer on constructing Arrays of type Array:
In Julia, you can have an Array that holds other Array type objects. Consider the following examples of initializing various types of Arrays:
A = Array{Float64}(10,10) # A single Array, dimensions 10 by 10, of Float64 type objects
B = Array{Array}(10,10,10) # A 10 by 10 by 10 Array. Each element is an Array of unspecified type and dimension.
C = Array{Array{Float64}}(10) ## A length 10, one-dimensional Array. Each element is an Array of Float64 type objects but unspecified dimensions
D = Array{Array{Float64, 2}}(10) ## A length 10, one-dimensional Array. Each element of is an 2 dimensional array of Float 64 objects
Consider for instance, the differences between C and D here:
julia> C[1] = rand(3)
3-element Array{Float64,1}:
0.604771
0.985604
0.166444
julia> D[1] = rand(3)
ERROR: MethodError:
rand(3) produces an object of type Array{Float64,1}. Since the only specification for the elements of C are that they be Arrays with elements of type Float64, this fits within the definition of C. But, for D we specified that the elements must be 2 dimensional Arrays. Thus, since rand(3) does not produce a 2 dimensional array, we cannot use it to assign a value to a specific element of D
Specify Specific Dimensions of Arrays within an Array
Although we can specify that an Array will hold elements which are of type Array, and we can specify that, e.g. those elements should be 2-dimensional Arrays, we cannot directly specify the dimenions of those elements. E.g. we can't directly specify that we want an Array holding 10 Arrays, each of which being 5,5. We can see this from the syntax for the Array() function used to construct an Array:
Array{T}(dims)
constructs an uninitialized dense array with element type T. dims may be a tuple or a series of integer arguments. The syntax Array(T, dims) is also available, but deprecated.
The type of an Array in Julia encompasses the number of the dimensions but not the size of those dimensions. Thus, there is no place in this syntax to specify the precise dimensions. Nevertheless, a similar effect could be achieved using an Array comprehension:
E = [Array{Float64}(5,5) for idx in 1:10]
For those wondering, in v0.7 this is rather similar:
Array{Array{Float64,1},2}(undef, 10,10) #creates a two-dimensional array, ten rows and ten columns where each element is an array of type Float64
Array{Array{Float64, 2},1}(undef,10) #creates a one-dimensional array of length ten, where each element is a two-dimensional array of type Float64
You probably want a matrix:
julia> a = [1 2; 3 4]
2x2 Int64 Array:
1 2
3 4
Maybe a tuple:
julia> a = ([1,2],[3,4,5])
([1,2],[3,4,5])
You can also do {[1,2], [3,4]} which creates an Array{Any,1} containing [1,2] and [3,4] instead of an Array{Array{T,N},1}.

Resources