Generating only unique permutations - julia

I am using permutations from the Combinatorics library on a list with many repeated values. My issue is that permutations is creating all permutations, leading to overflow, even though many of the permutations are identical.
julia> collect(permutations([1, 1, 2, 2], 4))
24-element Array{Array{Int64,1},1}:
[1, 1, 2, 2]
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[1, 2, 1, 2]
[1, 2, 2, 1]
[1, 1, 2, 2]
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
[2, 2, 1, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
[2, 2, 1, 1]
Lots of identical values. What I really want are only the unique permutations, without needing to first generate all permutations:
julia> unique(collect(permutations([1, 1, 2, 2], 4)))
6-element Array{Array{Int64,1},1}:
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
I could see the argument that permutations should always return all permutations, whether unique or not, but is there a way to generate only the unique permutations so I don't run out of memory?

Going through unique can be prohibitive even for vectors of relatively small size (e.g. 14 is I think already problematic). In such cases you can consider something like this:
using Combinatorics, StatsBase
function trans(x, v::Dict{T, Int}, l) where T
z = collect(1:l)
idxs = Vector{Int}[]
for k in x
push!(idxs, z[k])
deleteat!(z, k)
end
res = Vector{T}(undef, l)
for (j, k) in enumerate(keys(v))
for i in idxs[j]
res[i] = k
end
end
res
end
function myperms(x)
v = countmap(x)
s = Int[length(x)]
for (k,y) in v
l = s[end]-y
l > 0 && push!(s, l)
end
iter = Iterators.product((combinations(1:s[i], vv) for (i, vv) in enumerate(values(v)))...)
(trans(z, v, length(x)) for z in iter)
end
(this is a quick writeup so the code quality is not production grade - in terms of style and squeezing out maximum performance, but I hope it gives you the idea how this can be approached)
This gives you a generator of unique permutations taking into account duplicates. It is reasonably fast:
julia> x = [fill(1, 7); fill(2, 7)]
14-element Array{Int64,1}:
1
1
1
1
1
1
1
2
2
2
2
2
2
2
julia> #time length(collect(myperms(x)))
0.002902 seconds (48.08 k allocations: 4.166 MiB)
3432
While this operation for unique(permutations(x)) would not terminate in any reasonable size.

There is a multiset_permutation in the package Combinatorics:
julia> for p in multiset_permutations([1,1,2,2],4) p|>println end
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]

I ran into this problem too, and use IterTools's distinct:
using IterTools, Combinatorics
distinct(permutations([1, 1, 2, 2], 4))

Related

Algorithm to generate all nonnegative solutions to x1 + x2 + ....xk = n. (Stars and bars)

What is an algorithm to generate every solution to x1 + x2 + ....xk = n, xi >= 0? I am aware it is some form of backtracking but I am not sure how to implement this.
Without recursion, quicky modified from my answer with non-zero partitions, implements stars and bars principle
from itertools import combinations
#print(parts(7,4))
def partswithzeros(n,r):
result = []
for comb in combinations(range(n+r-1), r-1):
zpart = [comb[0]] + [comb[i]-comb[i-1]-1 for i in range(1,r-1)] + [n+r-2 - comb[-1]]
result.append(zpart)
return(result)
print(partswithzeros(5,3))
>>>[[0, 0, 5], [0, 1, 4], [0, 2, 3], [0, 3, 2], [0, 4, 1], [0, 5, 0], [1, 0, 4],
[1, 1, 3], [1, 2, 2], [1, 3, 1], [1, 4, 0], [2, 0, 3], [2, 1, 2], [2, 2, 1],
[2, 3, 0], [3, 0, 2], [3, 1, 1], [3, 2, 0], [4, 0, 1], [4, 1, 0], [5, 0, 0]]
Straightforward recursion will work here. The first element of the solution can take any value xi from 0 to n. To find the second and subsequent values, find the possible partitions for (n−xi, k−1). The recursion stops when k=1, in which case the final element must be equal to the remaining value of n.
I hope that makes sense. Perhaps the following Python code will help:
def get_partitions(n, k, p=[]):
# End point. If k=1, then the last element must equal n
# p[] starts as an empty list; we will append values to it
# at each recursive step
if k <= 1:
if k == 1 and n >= 0:
yield(p + [n])
return
# If k is zero, then there is no solution
yield None
# Recursively loop over every possible value for the current element
for i in range(0,n+1):
yield from get_partitions(n-i, k-1, p+[i])
>>> for p in get_partitions(4,3):
... print(p)
...
[0, 0, 4]
[0, 1, 3]
[0, 2, 2]
[0, 3, 1]
[0, 4, 0]
[1, 0, 3]
[1, 1, 2]
[1, 2, 1]
[1, 3, 0]
[2, 0, 2]
[2, 1, 1]
[2, 2, 0]
[3, 0, 1]
[3, 1, 0]
[4, 0, 0]

Generating vectors fixing some values in Julia

I need vectors such that
[1,1,1,1,...,1]
[2,1,1,1,...,1]
[3,1,1,1,...,1]
.
.
.
[J1,1,1,1,...,1]
[1,2,1,1,...,1]
.
.
.
[1,J2,1,1,...,1]
[1,1,2,1,...,1]
.
.
.
[1,1,J3,1,...,1]
.
.
.
[1,1,1,1,1,JD]
In the case of D=5, it is easy to implement.
J = [3,4,5,3,4]
D = length(J)
H = Vector{Vector{Int8}}(undef,1+sum( J.-1 ))
cum = cumsum(J)
H[1:cum[1]] = [[i,1,1,1,1] for i=1:J[1]]
H[cum[1]+1:cum[2]-1] = [[1,j,1,1,1] for j=2:J[2]]
H[cum[2]+0:cum[3]-2] = [[1,1,k,1,1] for k=2:J[3]]
H[cum[3]-1:cum[4]-3] = [[1,1,1,l,1] for l=2:J[4]]
H[cum[4]-2:cum[5]-4] = [[1,1,1,1,m] for m=2:J[5]];
#15-element Vector{Vector{Int8}}:
# [1, 1, 1, 1, 1]
# [2, 1, 1, 1, 1]
# [3, 1, 1, 1, 1]
# [1, 2, 1, 1, 1]
# [1, 3, 1, 1, 1]
# [1, 4, 1, 1, 1]
# [1, 1, 2, 1, 1]
# [1, 1, 3, 1, 1]
# [1, 1, 4, 1, 1]
# [1, 1, 5, 1, 1]
# [1, 1, 1, 2, 1]
# [1, 1, 1, 3, 1]
# [1, 1, 1, 1, 2]
# [1, 1, 1, 1, 3]
# [1, 1, 1, 1, 4]
How can I implement it with general D?
I wrote the code as follows:
J = [3,5,4,2,5,8]
D = length(J)
H = Vector{Vector{Int8}}(undef,1+sum( J.-1 ))
cum = cumsum(J)
for d = 1:D
if d==1
for q = 1:J[1]
one_hot = ones(Int8,D)
one_hot[d] = q
H[q] = one_hot
end
else
for q = 2:J[d]
one_hot = ones(Int8,D)
one_hot[d] = q
H[cum[d-1]+(2-d)+q-1] = one_hot
end
end
end
But I think there is a better method.
Do you have any idea?
EDIT
Thank you for providing ideas.
I conducted a numerical experiment to compare your code. Apparently, AboAmmar's code is the best in terms of efficiency.
using BenchmarkTools
J = [120,120,120,120,120]
#btime get_H_August(J)
16.900 μs (600 allocations: 79.59 KiB)
#btime get_H_Stepan(J)
1.733 μs (2 allocations: 23.39 KiB)
#btime get_H_AboAmmar(J)
705.755 ns (1 allocation: 3.06 KiB)
#btime get_H_Dan(J)
72.900 μs (1795 allocations: 177.88 KiB)
function get_H_August(J)
H = typeof(J)[ones(size(J))] # first row of 1's
sizehint!(H, 1+sum(J.-1)) # we know the final size
for (idx, j) in enumerate(J)
for i = 2:j
# Place `i` at index `idx` and 1's elsewhere
row = ifelse.(1:length(J) .== idx, i, 1)
push!(H, row)
end
end
return H
end
function get_H_Stepan(J)
colsize = sum(J) - (length(J) - 1)
M = fill(1, (colsize, length(J)))
for (j, jval) in enumerate(J)
if j == 1
M[1:J[1], 1] .= 1:J[1]
continue
end
s = sum(#view J[1:j-1]) - j + 3 # start index is
# sum of previous J's - number of intersections + 1
# number of intersections = length of previous J's array - 1
# length of previous J's array is j - 1
# so, sum - (j - 1 - 1) + 1
f = s + (jval - 2) # final index
M[s:f, j] .= 2:jval # filling
end
return M
end
function get_H_AboAmmar(J)
l = 1
H = ones(Int8, sum(J)-length(J)+1,length(J))
for (i,j) in pairs(J)
for k in 2:j
H[l+=1,i] = k
end
end
return H
end
function get_H_Dan(J)
D = length(J)
H = vcat([[vcat(ones(Int8,i-1),Int8(k),ones(Int8,D-i))
for k=1+(i>1):J[i]] for i=1:D]...)
return H
end
Can be written quite easily in array comprehension:
julia> J = [3, 4, 5, 3, 4];
julia> l = length(J);
julia> H = [(v=ones(Int8,l);v[i]=k;v) for (i,j) in pairs(J) for k in 2:j];
julia> H = [[ones(Int8,l)]; H]
15-element Vector{Vector{Int8}}:
[1, 1, 1, 1, 1]
[2, 1, 1, 1, 1]
[3, 1, 1, 1, 1]
[1, 2, 1, 1, 1]
[1, 3, 1, 1, 1]
[1, 4, 1, 1, 1]
[1, 1, 2, 1, 1]
[1, 1, 3, 1, 1]
[1, 1, 4, 1, 1]
[1, 1, 5, 1, 1]
[1, 1, 1, 2, 1]
[1, 1, 1, 3, 1]
[1, 1, 1, 1, 2]
[1, 1, 1, 1, 3]
[1, 1, 1, 1, 4]
If you want something 10X faster, then build H as a matrix and use its rows like this:
l = 1
H = ones(Int8, sum(J)-length(J)+1,length(J))
for (i,j) in pairs(J)
for k in 2:j
H[l+=1,i] = k
end
end
I've come with
J = [3, 4, 5, 3, 4]
colsize = sum(J) - (length(J) - 1)
M = fill(1, (colsize, length(J)))
for (j, jval) in enumerate(J)
if j == 1
M[1:J[1], 1] .= 1:J[1]
continue
end
s = sum(#view J[1:j-1]) - j + 3 # start index is
# sum of previous J's - number of intersections + 1
# number of intersections = length of previous J's array - 1
# length of previous J's array is j - 1
# so, sum - (j - 1 - 1) + 1
f = s + (jval - 2) # final index
M[s:f, j] .= 2:jval # filling
end
# Your vectors
#show M[1, :]
#show M[2, :]
#show M[3, :]
#show M[4, :]
#show M[5, :]
# For debug
# for row in eachrow(M)
# println(row)
# end
My idea is to look at the desired vectors as rows of a matrix and to fill the matrix' columns.
There's many ways to do this of course - I think this is readable and concise enough.
J = [3,4,5,3,4]
H = typeof(J)[ones(size(J))] # first row of 1's
sizehint!(H, 1+sum(J.-1)) # we know the final size
for (idx, j) in enumerate(J)
for i = 2:j
# Place `i` at index `idx` and 1's elsewhere
row = ifelse.(1:length(J) .== idx, i, 1)
push!(H, row)
end
end
This version looks quite short and cute:
julia> J = [3,4,5,3,4];
julia> D = length(J);
julia> H = vcat([[vcat(ones(Int8,i-1),Int8(k),ones(Int8,D-i))
for k=1+(i>1):J[i]] for i=1:D]...)
15-element Vector{Vector{Int8}}:
[1, 1, 1, 1, 1]
[2, 1, 1, 1, 1]
[3, 1, 1, 1, 1]
[1, 2, 1, 1, 1]
[1, 3, 1, 1, 1]
[1, 4, 1, 1, 1]
[1, 1, 2, 1, 1]
[1, 1, 3, 1, 1]
[1, 1, 4, 1, 1]
[1, 1, 5, 1, 1]
[1, 1, 1, 2, 1]
[1, 1, 1, 3, 1]
[1, 1, 1, 1, 2]
[1, 1, 1, 1, 3]
[1, 1, 1, 1, 4]

Convert Array of NamedTuple to Arrays in Julia

I have an Array of NamedTuple I read out of an hdf5 file in Julia. It has names X, Y, and Z. Is there an succinct way to convert this to three arrays containing the values of X, Y, and Z respectively?
typeof(science_h5["/Nav/Position"][:])
Array{NamedTuple{(:X, :Y, :Z),Tuple{Float32,Float32,Float32}},1}
You can use Tables.columntable from Tables.jl:
julia> a = [(X=i, Y=i+1, Z=i+2) for i in 1:5]
5-element Vector{NamedTuple{(:X, :Y, :Z), Tuple{Int64, Int64, Int64}}}:
(X = 1, Y = 2, Z = 3)
(X = 2, Y = 3, Z = 4)
(X = 3, Y = 4, Z = 5)
(X = 4, Y = 5, Z = 6)
(X = 5, Y = 6, Z = 7)
julia> Tables.columntable(a)
(X = [1, 2, 3, 4, 5], Y = [2, 3, 4, 5, 6], Z = [3, 4, 5, 6, 7])
If you want to only use Julia Base you could do:
julia> X, Y, Z = [getindex.(a, i) for i in 1:3]
3-element Vector{Vector{Int64}}:
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
[3, 4, 5, 6, 7]
or
julia> X, Y, Z = [getproperty.(a, i) for i in (:X, :Y, :Z)]
3-element Vector{Vector{Int64}}:
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
[3, 4, 5, 6, 7]

Removing items from a vector in a loop - PARI/GP

I am looking at vectors [a,b,c], for a,b,c in [-1,0,1] along with a function, cycle, which shifts each entry of a vector one to the left: cycle( v ) = [v[3], v[1], v[2]].
I want to only consider vectors such that no two vectors are "cycle-equivalent"; ie: if I look at vectors x, y, I don't want y = cycle( x ).
What I tried was setting up a vector V which had all 27 of my possible vectors, and then defining the following:
removecycle( V, n ) = {
local( N );
N = setsearch( V, cycle( V[n] ) );
return( V[^N] );
}
This allows me to specify a specific vector, apply the function, and then return a new vector with the outcome, if there is one, removed. The issue of course is then I have to repeat this with the new vector, and repeat again and again, and opens myself up to human error.
How can I automate this? I imagine it is possible to set it up to have my vector of vectors V, test cycle( V[1] ), throw away the result, return a new vector W, then test cycle( W[2] ), etc etc until all possibilities have been tested. But I'm just not sure how to set it up!
Edit: MNWE, with numbers changed to above for convenience.
V=[[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 1], [1, 2, 2], [1, 2, 3], [1, 3, 1], [1, 3, 2], [1, 3, 3], [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 1], [2, 2, 2], [2, 2, 3], [2, 3, 1], [2, 3, 2], [2, 3, 3], [3, 1, 1], [3, 1, 2], [3, 1, 3], [3, 2, 1], [3, 2, 2], [3, 2, 3], [3, 3, 1], [3, 3, 2], [3, 3, 3]];
vecsort(vecsort(V),are_cycles,8)
> [[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 2, 3], [1, 3, 2], [1, 3, 3], [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 1], [2, 2, 2], [2, 2, 3], [2, 3, 1], [2, 3, 3], [3, 1, 1], [3, 1, 2], [3, 1, 3], [3, 2, 1], [3, 2, 2], [3, 2, 3], [3, 3, 1], [3, 3, 3]]
#vecsort(vecsort(V),are_cycles,8)
> 23
In my case, I would have cycle( [1, 1, 2] ) = [2, 1, 1], so I would want [2, 1, 1] removed as well, but this hasn't happened. As said, I guess the comparator needs improving, but I'm not sure how!
You can remove the duplicates with help of the custom comparator via vecsort(_, _, 8). See the MWE below:
all_cycles(v) = [[v[3], v[1], v[2]], [v[2], v[3], v[1]]];
contains(list, value) = {
#select(n -> n == value, list) > 0
};
\\ your comparator here.
are_cycles(v1, v2) = {
if(contains(all_cycles(v1), v2), 0, lex(v1, v2));
};
V = [];
vecsort(vecsort(V), are_cycles, 8)
> []
V = all_cycles([1, 2, 3]);
vecsort(vecsort(V), are_cycles, 8)
> [[2, 3, 1]]
V = concat([[1, 2, 3], [3, 2, 1]], all_cycles([1, 2, 3]));
vecsort(vecsort(V), are_cycles, 8)
> [[1, 2, 3], [3, 2, 1]]
V = concat(V, all_cycles([3, 2, 1]));
vecsort(vecsort(V), are_cycles, 8)
> [[1, 2, 3], [1, 3, 2]]
Edit: much easier approach is to substitute each element with the representative of equivalence class it belongs to.
Now no custom comparator is required.
all_cycles(v) = [v, cycle(v), cycle(cycle(v))];
representative(v) = vecsort(all_cycles(v))[1];
V=[[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 1], [1, 2, 2], [1, 2, 3], [1, 3, 1], [1, 3, 2], [1, 3, 3], [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 1], [2, 2, 2], [2, 2, 3], [2, 3, 1], [2, 3, 2], [2, 3, 3], [3, 1, 1], [3, 1, 2], [3, 1, 3], [3, 2, 1], [3, 2, 2], [3, 2, 3], [3, 3, 1], [3, 3, 2], [3, 3, 3]];
#vecsort(apply(representative, V),,8)
> 11
vecsort(apply(representative, V),,8)
> [[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 2, 3], [1, 3, 2], [1, 3, 3], [2, 2, 2], [2, 2, 3], [2, 3, 3], [3, 3, 3]]

Pytorch. How I can expand dimension in tensor (from [[1, 2, 3]] to [[1, 2, 3, 4]])?

x = torch.tensor([1, 2, 3])
print(x)
# some operatons?
print(x)
Out:
tensor([[ 1, 2, 3]])
tensor([[ 1, 2, 3, 4]])
I tried tensor.expand() method, but nor success...
I find solution...
>>> x = torch.tensor([[1, 2, 3]])
>>> print(x)
>>> print(x.size())
tensor([[1, 2, 3]])
torch.Size([1, 3])
>>> x = x.repeat_interleave(torch.tensor([1, 1, 2]))
>>> print(x)
>>> print(x.size())
tensor([1, 2, 3, 3])
torch.Size([4])
>>> x[3] = 4
>>> x = x.unsqueeze(0)
>>> print(x)
>>> print(x.size())
tensor([[1, 2, 3, 4]])
torch.Size([1, 4])

Resources