I need vectors such that
[1,1,1,1,...,1]
[2,1,1,1,...,1]
[3,1,1,1,...,1]
.
.
.
[J1,1,1,1,...,1]
[1,2,1,1,...,1]
.
.
.
[1,J2,1,1,...,1]
[1,1,2,1,...,1]
.
.
.
[1,1,J3,1,...,1]
.
.
.
[1,1,1,1,1,JD]
In the case of D=5, it is easy to implement.
J = [3,4,5,3,4]
D = length(J)
H = Vector{Vector{Int8}}(undef,1+sum( J.-1 ))
cum = cumsum(J)
H[1:cum[1]] = [[i,1,1,1,1] for i=1:J[1]]
H[cum[1]+1:cum[2]-1] = [[1,j,1,1,1] for j=2:J[2]]
H[cum[2]+0:cum[3]-2] = [[1,1,k,1,1] for k=2:J[3]]
H[cum[3]-1:cum[4]-3] = [[1,1,1,l,1] for l=2:J[4]]
H[cum[4]-2:cum[5]-4] = [[1,1,1,1,m] for m=2:J[5]];
#15-element Vector{Vector{Int8}}:
# [1, 1, 1, 1, 1]
# [2, 1, 1, 1, 1]
# [3, 1, 1, 1, 1]
# [1, 2, 1, 1, 1]
# [1, 3, 1, 1, 1]
# [1, 4, 1, 1, 1]
# [1, 1, 2, 1, 1]
# [1, 1, 3, 1, 1]
# [1, 1, 4, 1, 1]
# [1, 1, 5, 1, 1]
# [1, 1, 1, 2, 1]
# [1, 1, 1, 3, 1]
# [1, 1, 1, 1, 2]
# [1, 1, 1, 1, 3]
# [1, 1, 1, 1, 4]
How can I implement it with general D?
I wrote the code as follows:
J = [3,5,4,2,5,8]
D = length(J)
H = Vector{Vector{Int8}}(undef,1+sum( J.-1 ))
cum = cumsum(J)
for d = 1:D
    if d==1
        for q = 1:J[1]
            one_hot = ones(Int8,D)
            one_hot[d] = q
            H[q] = one_hot
        end
    else
        for q = 2:J[d]
            one_hot = ones(Int8,D)
            one_hot[d] = q
            H[cum[d-1]+(2-d)+q-1] = one_hot
        end
    end
end
But I think there is a better method.
Do you have any idea?
EDIT
Thank you for providing ideas.
I conducted a numerical experiment to compare your code. Apparently, AboAmmar's code is the best in terms of efficiency.
using BenchmarkTools
J = [120,120,120,120,120]
@btime get_H_August(J)
16.900 μs (600 allocations: 79.59 KiB)
@btime get_H_Stepan(J)
1.733 μs (2 allocations: 23.39 KiB)
@btime get_H_AboAmmar(J)
705.755 ns (1 allocation: 3.06 KiB)
@btime get_H_Dan(J)
72.900 μs (1795 allocations: 177.88 KiB)
function get_H_August(J)
    H = typeof(J)[ones(size(J))] # first row of 1's
    sizehint!(H, 1+sum(J.-1)) # we know the final size
    for (idx, j) in enumerate(J)
        for i = 2:j
            # Place `i` at index `idx` and 1's elsewhere
            row = ifelse.(1:length(J) .== idx, i, 1)
            push!(H, row)
        end
    end
    return H
end
function get_H_Stepan(J)
    colsize = sum(J) - (length(J) - 1)
    M = fill(1, (colsize, length(J)))
    for (j, jval) in enumerate(J)
        if j == 1
            M[1:J[1], 1] .= 1:J[1]
            continue
        end
        s = sum(@view J[1:j-1]) - j + 3 # start index is
        # sum of previous J's - number of intersections + 1
        # number of intersections = length of previous J's array - 1
        # length of previous J's array is j - 1
        # so, sum - (j - 1 - 1) + 1
        f = s + (jval - 2) # final index
        M[s:f, j] .= 2:jval # filling
    end
    return M
end
function get_H_AboAmmar(J)
    l = 1
    H = ones(Int8, sum(J)-length(J)+1,length(J))
    for (i,j) in pairs(J)
        for k in 2:j
            H[l+=1,i] = k
        end
    end
    return H
end
function get_H_Dan(J)
    D = length(J)
    H = vcat([[vcat(ones(Int8,i-1),Int8(k),ones(Int8,D-i))
               for k=1+(i>1):J[i]] for i=1:D]...)
    return H
end
Can be written quite easily in array comprehension:
julia> J = [3, 4, 5, 3, 4];
julia> l = length(J);
julia> H = [(v=ones(Int8,l);v[i]=k;v) for (i,j) in pairs(J) for k in 2:j];
julia> H = [[ones(Int8,l)]; H]
15-element Vector{Vector{Int8}}:
[1, 1, 1, 1, 1]
[2, 1, 1, 1, 1]
[3, 1, 1, 1, 1]
[1, 2, 1, 1, 1]
[1, 3, 1, 1, 1]
[1, 4, 1, 1, 1]
[1, 1, 2, 1, 1]
[1, 1, 3, 1, 1]
[1, 1, 4, 1, 1]
[1, 1, 5, 1, 1]
[1, 1, 1, 2, 1]
[1, 1, 1, 3, 1]
[1, 1, 1, 1, 2]
[1, 1, 1, 1, 3]
[1, 1, 1, 1, 4]
If you want something 10X faster, then build H as a matrix and use its rows like this:
l = 1
H = ones(Int8, sum(J)-length(J)+1,length(J))
for (i,j) in pairs(J)
    for k in 2:j
        H[l+=1,i] = k
    end
end
I've come up with
J = [3, 4, 5, 3, 4]
colsize = sum(J) - (length(J) - 1)
M = fill(1, (colsize, length(J)))
for (j, jval) in enumerate(J)
    if j == 1
        M[1:J[1], 1] .= 1:J[1]
        continue
    end
    s = sum(@view J[1:j-1]) - j + 3 # start index is
    # sum of previous J's - number of intersections + 1
    # number of intersections = length of previous J's array - 1
    # length of previous J's array is j - 1
    # so, sum - (j - 1 - 1) + 1
    f = s + (jval - 2) # final index
    M[s:f, j] .= 2:jval # filling
end
# Your vectors
@show M[1, :]
@show M[2, :]
@show M[3, :]
@show M[4, :]
@show M[5, :]
# For debug
# for row in eachrow(M)
#     println(row)
# end
My idea is to look at the desired vectors as rows of a matrix and to fill the matrix' columns.
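The same idea, sketched in Python rather than Julia for illustration only. The function name `build_rows` is hypothetical and not taken from any of the answers; it assumes `J` is a list of positive integers and uses plain nested lists in place of a matrix.

```python
def build_rows(J):
    """Start from an all-ones table and fill it column by column.

    Row 0 stays [1, 1, ..., 1]; for each column d the values 2..J[d]
    are written into consecutive rows, one value per row.
    """
    D = len(J)
    n_rows = sum(J) - D + 1          # 1 + sum(j - 1 for j in J)
    H = [[1] * D for _ in range(n_rows)]
    row = 0                          # index of the last row written
    for col, j in enumerate(J):
        for k in range(2, j + 1):
            row += 1
            H[row][col] = k
    return H


H = build_rows([3, 4, 5, 3, 4])
print(len(H))
```

Because the row counter simply advances by one for every value written, no cumulative-sum index arithmetic is needed.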
There are many ways to do this, of course. I think this is readable and concise enough.
J = [3,4,5,3,4]
H = typeof(J)[ones(size(J))] # first row of 1's
sizehint!(H, 1+sum(J.-1)) # we know the final size
for (idx, j) in enumerate(J)
    for i = 2:j
        # Place `i` at index `idx` and 1's elsewhere
        row = ifelse.(1:length(J) .== idx, i, 1)
        push!(H, row)
    end
end
This version looks quite short and cute:
julia> J = [3,4,5,3,4];
julia> D = length(J);
julia> H = vcat([[vcat(ones(Int8,i-1),Int8(k),ones(Int8,D-i))
for k=1+(i>1):J[i]] for i=1:D]...)
15-element Vector{Vector{Int8}}:
[1, 1, 1, 1, 1]
[2, 1, 1, 1, 1]
[3, 1, 1, 1, 1]
[1, 2, 1, 1, 1]
[1, 3, 1, 1, 1]
[1, 4, 1, 1, 1]
[1, 1, 2, 1, 1]
[1, 1, 3, 1, 1]
[1, 1, 4, 1, 1]
[1, 1, 5, 1, 1]
[1, 1, 1, 2, 1]
[1, 1, 1, 3, 1]
[1, 1, 1, 1, 2]
[1, 1, 1, 1, 3]
[1, 1, 1, 1, 4]
I am looking at vectors [a,b,c] with a,b,c in [-1,0,1], along with a function, cycle, which cyclically shifts the entries of a vector by one position: cycle( v ) = [v[3], v[1], v[2]].
I only want to consider vectors such that no two of them are "cycle-equivalent"; i.e., if I look at vectors x, y, I don't want y = cycle( x ).
What I tried was setting up a vector V which had all 27 of my possible vectors, and then defining the following:
removecycle( V, n ) = {
local( N );
N = setsearch( V, cycle( V[n] ) );
return( V[^N] );
}
This allows me to pick a specific vector, apply the function, and get back a new vector with the outcome, if there is one, removed. The issue, of course, is that I then have to repeat this with the new vector again and again, which opens me up to human error.
How can I automate this? I imagine it is possible to set it up to have my vector of vectors V, test cycle( V[1] ), throw away the result, return a new vector W, then test cycle( W[2] ), etc etc until all possibilities have been tested. But I'm just not sure how to set it up!
Edit: MNWE, with numbers changed to above for convenience.
V=[[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 1], [1, 2, 2], [1, 2, 3], [1, 3, 1], [1, 3, 2], [1, 3, 3], [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 1], [2, 2, 2], [2, 2, 3], [2, 3, 1], [2, 3, 2], [2, 3, 3], [3, 1, 1], [3, 1, 2], [3, 1, 3], [3, 2, 1], [3, 2, 2], [3, 2, 3], [3, 3, 1], [3, 3, 2], [3, 3, 3]];
vecsort(vecsort(V),are_cycles,8)
> [[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 2, 3], [1, 3, 2], [1, 3, 3], [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 1], [2, 2, 2], [2, 2, 3], [2, 3, 1], [2, 3, 3], [3, 1, 1], [3, 1, 2], [3, 1, 3], [3, 2, 1], [3, 2, 2], [3, 2, 3], [3, 3, 1], [3, 3, 3]]
#vecsort(vecsort(V),are_cycles,8)
> 23
In my case, I would have cycle( [1, 1, 2] ) = [2, 1, 1], so I would want [2, 1, 1] removed as well, but this hasn't happened. As said, I guess the comparator needs improving, but I'm not sure how!
You can remove the duplicates with the help of a custom comparator via vecsort(_, _, 8). See the MWE below:
all_cycles(v) = [[v[3], v[1], v[2]], [v[2], v[3], v[1]]];
contains(list, value) = {
#select(n -> n == value, list) > 0
};
\\ your comparator here.
are_cycles(v1, v2) = {
if(contains(all_cycles(v1), v2), 0, lex(v1, v2));
};
V = [];
vecsort(vecsort(V), are_cycles, 8)
> []
V = all_cycles([1, 2, 3]);
vecsort(vecsort(V), are_cycles, 8)
> [[2, 3, 1]]
V = concat([[1, 2, 3], [3, 2, 1]], all_cycles([1, 2, 3]));
vecsort(vecsort(V), are_cycles, 8)
> [[1, 2, 3], [3, 2, 1]]
V = concat(V, all_cycles([3, 2, 1]));
vecsort(vecsort(V), are_cycles, 8)
> [[1, 2, 3], [1, 3, 2]]
Edit: a much easier approach is to replace each element with the representative of the equivalence class it belongs to.
No custom comparator is required then.
all_cycles(v) = [v, cycle(v), cycle(cycle(v))];
representative(v) = vecsort(all_cycles(v))[1];
V=[[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 1], [1, 2, 2], [1, 2, 3], [1, 3, 1], [1, 3, 2], [1, 3, 3], [2, 1, 1], [2, 1, 2], [2, 1, 3], [2, 2, 1], [2, 2, 2], [2, 2, 3], [2, 3, 1], [2, 3, 2], [2, 3, 3], [3, 1, 1], [3, 1, 2], [3, 1, 3], [3, 2, 1], [3, 2, 2], [3, 2, 3], [3, 3, 1], [3, 3, 2], [3, 3, 3]];
#vecsort(apply(representative, V),,8)
> 11
vecsort(apply(representative, V),,8)
> [[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 2, 3], [1, 3, 2], [1, 3, 3], [2, 2, 2], [2, 2, 3], [2, 3, 3], [3, 3, 3]]
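For readers who do not use PARI/GP, here is a minimal Python sketch of the same representative idea. The names `cycle` and `representative` mirror the GP functions above, but the implementation is an illustration that assumes the entries can be compared with `<`; it picks the lexicographically smallest rotation as the representative, as `vecsort(...)[1]` does.

```python
from itertools import product


def cycle(v):
    # Same convention as the question: cycle(v) = [v[3], v[1], v[2]] (1-based).
    return (v[-1],) + tuple(v[:-1])


def representative(v):
    # Collect every rotation of v and keep the lexicographically smallest one.
    rotations = []
    cur = tuple(v)
    for _ in range(len(v)):
        rotations.append(cur)
        cur = cycle(cur)
    return min(rotations)


# All 27 vectors with entries in {1, 2, 3}, as in the example above.
V = list(product([1, 2, 3], repeat=3))
reps = sorted({representative(v) for v in V})
print(len(reps))
```

Two vectors are cycle-equivalent exactly when they map to the same representative, so collecting the representatives in a set removes the duplicates in one pass.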
x = torch.tensor([1, 2, 3])
print(x)
# some operations?
print(x)
Out:
tensor([[ 1, 2, 3]])
tensor([[ 1, 2, 3, 4]])
I tried the tensor.expand() method, but without success...
I found a solution:
>>> x = torch.tensor([[1, 2, 3]])
>>> print(x)
>>> print(x.size())
tensor([[1, 2, 3]])
torch.Size([1, 3])
>>> x = x.repeat_interleave(torch.tensor([1, 1, 2]))
>>> print(x)
>>> print(x.size())
tensor([1, 2, 3, 3])
torch.Size([4])
>>> x[3] = 4
>>> x = x.unsqueeze(0)
>>> print(x)
>>> print(x.size())
tensor([[1, 2, 3, 4]])
torch.Size([1, 4])
I was trying some operations on the List object and wanted to see some "broadcast" behavior :
x = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x = -1*x
In [46]: x
Out[46]: []
I was expecting something like x = [1, -1, -2, -3, -4, -5, -6, -7, -8, -9].
What is actually happening?
You can only do this kind of multiplication with a pandas Series (or, better, the underlying NumPy array). If you write something like
List = n * List
with n an integer, your list is repeated n times:
x = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x = 3*x
print(x)
>> [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Negative numbers empty your list, because they are treated as 0. From the Python documentation on sequence operations:
Values of n less than 0 are treated as 0 (which yields an empty
sequence of the same type as s).
So you have to use one of these methods to multiply each list element:
# list comprehension
NewList = [i * 5 for i in List]
# explicit loop
NewList = []
for i in List:
    NewList.append(i * 5)
# pandas
import pandas as pd
s = pd.Series(List)
NewList = (s * 5).tolist()
You want the following:
x = [-1 * i for i in x]
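A short sketch contrasting the behaviours discussed in this thread. The variable names `repeated`, `emptied`, and `negated` are illustrative only.

```python
x = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# int * list repeats the list; it does not multiply the elements.
repeated = 2 * x

# A multiplier below 0 is treated as 0, so the result is an empty list.
emptied = -1 * x

# Element-wise multiplication needs an explicit comprehension (or NumPy/pandas).
negated = [-1 * i for i in x]

print(len(repeated), emptied, negated)
```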
I am using permutations from the Combinatorics library on a list with many repeated values. My issue is that permutations is creating all permutations, leading to overflow, even though many of the permutations are identical.
julia> collect(permutations([1, 1, 2, 2], 4))
24-element Array{Array{Int64,1},1}:
[1, 1, 2, 2]
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[1, 2, 1, 2]
[1, 2, 2, 1]
[1, 1, 2, 2]
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
[2, 2, 1, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
[2, 2, 1, 1]
Lots of identical values. What I really want are only the unique permutations, without needing to first generate all permutations:
julia> unique(collect(permutations([1, 1, 2, 2], 4)))
6-element Array{Array{Int64,1},1}:
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
I could see the argument that permutations should always return all permutations, whether unique or not, but is there a way to generate only the unique permutations so I don't run out of memory?
Going through unique can be prohibitive even for vectors of relatively small size (e.g. 14 is I think already problematic). In such cases you can consider something like this:
using Combinatorics, StatsBase
function trans(x, v::Dict{T, Int}, l) where T
    z = collect(1:l)
    idxs = Vector{Int}[]
    for k in x
        push!(idxs, z[k])
        deleteat!(z, k)
    end
    res = Vector{T}(undef, l)
    for (j, k) in enumerate(keys(v))
        for i in idxs[j]
            res[i] = k
        end
    end
    res
end
function myperms(x)
    v = countmap(x)
    s = Int[length(x)]
    for (k,y) in v
        l = s[end]-y
        l > 0 && push!(s, l)
    end
    iter = Iterators.product((combinations(1:s[i], vv) for (i, vv) in enumerate(values(v)))...)
    (trans(z, v, length(x)) for z in iter)
end
(this is a quick writeup so the code quality is not production grade - in terms of style and squeezing out maximum performance, but I hope it gives you the idea how this can be approached)
This gives you a generator of unique permutations taking into account duplicates. It is reasonably fast:
julia> x = [fill(1, 7); fill(2, 7)]
14-element Array{Int64,1}:
1
1
1
1
1
1
1
2
2
2
2
2
2
2
julia> @time length(collect(myperms(x)))
0.002902 seconds (48.08 k allocations: 4.166 MiB)
3432
By contrast, the same operation with unique(permutations(x)) would not terminate in any reasonable time.
There is multiset_permutations in the Combinatorics package:
julia> for p in multiset_permutations([1,1,2,2],4) p|>println end
[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]
I ran into this problem too, and used IterTools's distinct:
using IterTools, Combinatorics
distinct(permutations([1, 1, 2, 2], 4))
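To illustrate how unique permutations can be generated directly, without first producing the duplicates, here is a minimal Python sketch. The function `unique_permutations` is a hypothetical helper and not how Combinatorics.jl or IterTools.jl implement it; it assumes the items are hashable and sortable. It backtracks over a count of the remaining copies of each distinct value, so each distinct arrangement is produced exactly once.

```python
from collections import Counter


def unique_permutations(items):
    """Lazily yield each distinct permutation of `items` exactly once."""
    counts = Counter(items)      # remaining copies of every distinct value
    keys = sorted(counts)        # fixed order gives lexicographic output
    n = len(items)
    current = []

    def backtrack():
        if len(current) == n:
            yield tuple(current)
            return
        for k in keys:
            if counts[k] > 0:
                # Use one copy of k, recurse, then put it back.
                counts[k] -= 1
                current.append(k)
                yield from backtrack()
                current.pop()
                counts[k] += 1

    yield from backtrack()


perms = list(unique_permutations([1, 1, 2, 2]))
print(len(perms))
```

Since the generator is lazy, memory use stays proportional to the length of the input when the results are consumed one at a time instead of being collected.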