Roll array, uniform circular shift - julia

Given an array:
arr = [1, 2, 3, 4, 5]
I would like to shift all the elements.
shift!(arr, 2) => [4, 5, 1, 2, 3]
In Python, this is accomplished with NumPy using numpy.roll. How is this done in Julia?

No need to implement it yourself; there is a built-in function for this:
julia> circshift(arr, 2)
5-element Array{Int64,1}:
4
5
1
2
3
It's also (slightly) more efficient than the roll2 function proposed in another answer:
julia> @btime circshift($arr, 2);
68.563 ns (1 allocation: 128 bytes)
julia> @btime roll2($arr, 2);
70.605 ns (4 allocations: 256 bytes)
Note, however, that none of the proposed functions operates in-place. They all create a new array. There is also the built-in circshift!(dest, src, shift) which operates in a preallocated dest (which, however, must be != src).
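As a minimal sketch of the circshift! form mentioned above (the buffer name dest is only illustrative, and the buffer is assumed to be allocated once and reused):

```julia
arr = [1, 2, 3, 4, 5]
dest = similar(arr)          # preallocated output buffer; must be a different array than arr
circshift!(dest, arr, 2)     # writes the shifted elements into dest
# dest is now [4, 5, 1, 2, 3]; arr is unchanged
back = circshift(dest, -2)   # a negative shift rolls in the other direction
```

This avoids a fresh allocation on every call, but it still needs a second array of the same size.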

The function by Seanny123 does a lot of copying and can be improved to have a smaller memory footprint and execute faster. Consider:
function roll2(arr, step)
    len = length(arr)
    [view(arr,len-step+1:len); view(arr,1:len-step)]
end
arr = [1,2,3,4,5,6,7,8,9,10];
And now the times (REPL output):
julia> using BenchmarkTools
julia> @btime roll($arr,2);
124.254 ns (3 allocations: 400 bytes)
julia> @btime roll2($arr,2);
73.386 ns (4 allocations: 288 bytes)
Of course the fastest way is to change arr in-place.
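One way to do that without a second array is the classic three-reversal trick. The sketch below is only an illustration: roll! is a hypothetical name (not a Base function), and it assumes an ordinary 1-based Vector.

```julia
# Hypothetical helper: rotate `arr` right by `step` positions in place.
# Uses three reversals, so no temporary array is allocated.
function roll!(arr::Vector, step::Integer)
    n = length(arr)
    n == 0 && return arr
    k = mod(step, n)          # normalise negative or oversized steps
    k == 0 && return arr
    reverse!(arr)             # reverse the whole vector
    reverse!(arr, 1, k)       # restore order of the first k elements
    reverse!(arr, k + 1, n)   # restore order of the remaining elements
    return arr
end

a = [1, 2, 3, 4, 5]
roll!(a, 2)   # a is now [4, 5, 1, 2, 3]
```

Whether this beats circshift! in practice depends on the array size, so it is worth measuring with @btime before relying on it.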

You can write a simple function for this:
function roll(arr, step)
    return vcat(arr[end-step+1:end], arr[1:end-step])
end
println(roll(1:5, 2))
# => [4, 5, 1, 2, 3]
println(roll(1:6, 4))
# => [3, 4, 5, 6, 1, 2]

Related

How to efficiently generate random unique series of bits with specific length?

Let's say I want to generate 3 unique random series of bits with a length of three. The possible output can be:
001 or [0, 0, 1]
010 or [0, 1, 0]
111 or [1, 1, 1]
#or
011 or [0, 1, 1]
110 or [1, 1, 0]
111 or [1, 1, 1]
# etc.
I provided two notations above (the Vector notation is preferred). The point is that they should be unique. I tried:
julia> unique(convert.(BitVector, rand.(Ref([0, 1]), repeat([3], 3))))
2-element Vector{BitVector}:
[0, 1, 1]
[0, 1, 0]
As you can see, there might be a set of two unique BitVectors rather than 3, which is natural here. I can replace repeat([3], 3) with a larger count such as repeat([3], 5) to make it more likely that I get three unique vectors:
julia> unique(convert.(BitVector, rand.(Ref([0, 1]), repeat([3], 5))))[1:3]
3-element Vector{BitVector}:
[1, 0, 0]
[1, 1, 1]
[1, 0, 1]
But I wonder if there's any better idea for this?
*However, I'm really curious about how I can efficiently generate the first notation for this question (like 101, 001, etc.).
Update: The following randBitSeq is more than 2X faster in the benchmark below. It generates unique random numbers first, then it fills a Boolean matrix with their binary values.
using StatsBase
function randBitSeq(N, L)
    M = Matrix{Bool}(undef,N,L)
    S = sample(0:2^L-1, N; replace=false)
    i = 0
    for n in S
        i += 1
        for j = 1:L
            if n > 0
                M[i,j] = isodd(n)
                n ÷= 2
            else
                M[i,j] = false
            end
        end
    end
    return M
end
@btime randBitSeq(50, 10)
1.350 μs (3 allocations: 9.17 KiB)
# vs.
@btime randseqset(50, 10)
3.050 μs (5 allocations: 10.84 KiB)
Constructing all possible combinations will exponentially eat memory. A better option is to generate a Set of N random binary series of length L each. Then add more series if the required number is not achieved. This seems much faster for N,L > 3.
function randSeq(N, L)
    s = Set(rand(Bool,L) for i=1:N)
    while length(s) < N
        push!(s, rand(Bool,L))
    end
    s
end
N = 50; L = 10
@btime randSeq($N, $L)
4.071 μs (57 allocations: 4.71 KiB)
Another nice option is:
using Random, StatsBase
function randseqset(N, L)
    L < sizeof(Int)*8-1 || error("Short seqs only")
    m = BitArray(undef, L, N)
    s = sample(0:(1<<L)-1, N; replace=false)
    map(i->digits!(@view(m[:,i]), s[i]; base=2), 1:N)
end
A version with simple vectors instead of @views is:
function randseqset(N, L)
    L < sizeof(Int)*8-1 || error("Short seqs only")
    s = sample(0:(1<<L)-1, N; replace=false)
    map(i->digits!(Vector{Bool}(undef, L),
        s[i]; base=2), 1:N)
end
It has the benefit of adapting to parameters a bit (inherited from the no-replace sample code). And it is quite performant and allocation thrifty.
For example:
julia> N = 10; L = 4;
julia> @btime randSeq($N, $L);
1.126 μs (16 allocations: 1.12 KiB)
julia> @btime randseqset($N, $L);
654.878 ns (5 allocations: 944 bytes)
PS If Matrix{Bool} is preferable to BitMatrix, then replace the m = ... line with m = Matrix{Bool}(undef, L, N)
PPS As for the question about the strings, the following works (using same logic as above):
randseqstrset(N, L) = getindex.(
bitstring.(sample(0:(1<<L)-1, N; replace=false)),
Ref(sizeof(Int)*8-L+1:sizeof(Int)*8))
for example:
julia> randseqstrset(3,3)
3-element Vector{String}:
"101"
"000"
"011"
UPDATE: If speed is really an issue, another version can use some BitMatrix trickery:
function randBitSeq2(N, L)
    BM = BitMatrix(undef, 0,0)
    BM.chunks = sample(0:2^L-1, N; replace=false)
    BM.dims = (sizeof(Int64)*8, N)
    BM.len = sizeof(Int64)*8*N
    return @view(BM[1:L,:])
end
This version is called randBitSeq2 because it returns a matrix like randBitSeq but is twice as fast:
julia> @btime randBitSeq2(50,10);
1.882 μs (5 allocations: 9.20 KiB)
julia> @btime randBitSeq(50,10);
3.634 μs (3 allocations: 9.17 KiB)
Here's a thought: given you're only talking about bit strings of length 3, instead of trying to randomly generate them and then enforce uniqueness, how about just take the set of all 3 bit strings and shuffle it around, and then select 3.
For example, you could use collect(Iterators.product([0,1],[0,1],[0,1])) to generate all length 3 bit strings, and then shuffle(x)[1:3] from Random to sample without replacement, e.g.
julia> using Random
julia> shuffle(collect(Iterators.product([0,1],[0,1],[0,1])))[1:3]
3-element Array{Tuple{Int64,Int64,Int64},1}:
(0, 1, 1)
(1, 1, 1)
(1, 0, 1)
Also if you want BitVectors instead you can do
julia> shuffle(BitArray.(Iterators.product([0,1],[0,1],[0,1])))[1:3]
3-element Array{BitArray{1},1}:
[1, 0, 1]
[0, 1, 1]
[1, 1, 1]
Obviously this won't scale well with the length of the strings, but a suggestion for this small case.
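The same shuffle-and-take idea can be written for an arbitrary (small) length. This is only a sketch: shuffled_bit_tuples is a made-up name, and it still enumerates all 2^L sequences, so it is meant for short lengths only. Joining each tuple also gives the string notation (like 101) asked about in the question.

```julia
using Random

# Hypothetical helper: enumerate every bit tuple of length L, shuffle them,
# and keep the first N (i.e. sample without replacement).
function shuffled_bit_tuples(N, L)
    all_tuples = vec(collect(Iterators.product(ntuple(_ -> (0, 1), L)...)))
    N <= length(all_tuples) || error("only $(length(all_tuples)) distinct sequences of length $L exist")
    return shuffle(all_tuples)[1:N]
end

picked = shuffled_bit_tuples(3, 4)    # three distinct 4-tuples of 0s and 1s
strs = [join(t) for t in picked]      # the same sequences as strings such as "0110"
```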

How to calculate Euclidean distance between a tuple and each tuple in a Vector using map in Julia?

I want to calculate the Euclidean distance between a tuple and each tuple within a Vector in Julia using the map function, like below (but I get two values instead of three):
julia> tups = [
(1, 3),
(11, 2),
(0, 1)
];
julia> map((x, y) -> √(sum((x.-y).^2)), tups, (3, 3))
2-element Vector{Float64}:
2.0
8.06225774829855
How can I make it work correctly?
Julia has the Distances package especially for these types of calculations. The 'Julian way' encourages interoperability between packages to allow benefitting from future development of the ecosystem. For example, new metric definitions, or specialized hardware code to compute distances.
For the problem in the post, the code would look like this:
julia> using Distances
julia> tups = [
(1, 3),
(11, 2),
(0, 1)
];
julia> euclidean.(tups,Ref((3,3)))
3-element Vector{Float64}:
2.0
8.06225774829855
3.605551275463989
Notice the use of broadcasting (the dot syntax euclidean.) instead of map. The Ref((3,3)) causes broadcasting to treat (3,3) as a single element to broadcast rather than breaking it into a pair of Ints.
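If you would rather not add a package, the same Ref trick works with a hand-written distance. A minimal sketch, where dist is just an illustrative helper name and not part of Distances:

```julia
# Illustrative helper: Euclidean distance between two equal-length tuples.
dist(a, b) = sqrt(sum(abs2, a .- b))

tups = [(1, 3), (11, 2), (0, 1)]

d1 = dist.(tups, Ref((3, 3)))   # Ref makes (3, 3) behave as a scalar in broadcasting
d2 = dist.(tups, ((3, 3),))     # wrapping it in a 1-element tuple has the same effect
# both give [2.0, 8.06225774829855, 3.605551275463989]
```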
The code you've written is roughly equivalent to this:
[
func((1, 3), 3),
func((11, 2), 3)
]
The map function iterates over the given collections in lockstep and stops as soon as the shortest one is exhausted:
julia> length((3, 3)), length(tups)
(2, 3)
So it iterates two times, not three. To make that work, you can repeat the (3, 3), three times or even omit the (3, 3) argument:
julia> map((x, y) -> √(sum((x.-y).^2)), tups, ((3, 3), (3, 3), (3, 3)))
3-element Vector{Float64}:
2.0
8.06225774829855
3.605551275463989
# OR
julia> map((x, y) -> √(sum((x.-y).^2)), tups, ((3, 3) for _∈1:3))
3-element Vector{Float64}:
2.0
8.06225774829855
3.605551275463989
# Or omit the last argument
julia> map(arg -> √((3 - arg[1])^2 + (3 - arg[2])^2), tups)
3-element Vector{Float64}:
2.0
8.06225774829855
3.605551275463989
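To see the stop-at-the-shortest behaviour in isolation, and one way to sidestep it by capturing the fixed point in a closure, here is a small sketch (the variable names are only illustrative):

```julia
# map pairs elements positionally and stops at the shorter collection:
short = map(+, [1, 2, 3], (10, 20))   # [11, 22]; the third element is never visited

# Capturing the fixed point in the closure avoids the second collection entirely:
tups = [(1, 3), (11, 2), (0, 1)]
p = (3, 3)
dists = map(t -> sqrt(sum(abs2, t .- p)), tups)
```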

Combining vectors of unequal length

x = [1, 2, 3, 4]
y = [1, 2]
If I want to be able to operate on the two vectors with a default value filling in, what are the strategies?
E.g. would like to do the following and implicitly fill in with 0 or missing
x + y # would like [2, 4, 3, 4]
Ideally would like to do this in a generic way so that I could do arbitrary operations with the two.
Disregarding whether Julia has something built-in to do this, remember that Julia is fast. This means that you can write code to support this kind of need.
extend!(x, y::Vector, default=0) = extend!(x, length(y), default)
extend!(x, n::Int, default=0) = begin
while length(x) < n
push!(x, default)
end
x
end
Then when you have code such as you describe, you can symmetrically extend x and y:
x = [1, 2, 3, 4]
y = [1, 2]
extend!(x, y)
extend!(y, x)
x + y
==> [2, 4, 3, 4]
Note that this mutates y. In many cases, the desired length would come from outside the code and would be applied to both x and y. I can also imagine that 0 is a bad default in general (even though it is completely appropriate in your context of addition).
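If mutating the inputs is not wanted, a non-mutating variant for arbitrary operations could look like the sketch below. The name padded_op and its default keyword are assumptions made for illustration, and it assumes ordinary 1-based vectors.

```julia
# Hypothetical helper: apply `op` elementwise, treating positions past the
# end of the shorter vector as `default`. Neither input is modified.
function padded_op(op, x::AbstractVector, y::AbstractVector; default=0)
    n = max(length(x), length(y))
    pick(v, i) = i <= length(v) ? v[i] : default
    return [op(pick(x, i), pick(y, i)) for i in 1:n]
end

x = [1, 2, 3, 4]
y = [1, 2]
sums = padded_op(+, x, y)                # [2, 4, 3, 4]
prods = padded_op(*, x, y; default=1)    # [1, 4, 3, 4]
```

The default usually has to match the operation (0 for addition, 1 for multiplication), which is why it is left as a parameter.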
A comment below makes the worthy point that you should consider using append! instead of looping over push!. In fact, it is best to measure differences like that if you care about very small differences. I went ahead and tested:
julia> using BenchmarkTools
julia> extend1(x, n) = begin
           while length(x) < n
               push!(x, 0)
           end
           x
       end
julia> @btime begin
           x = rand(10)
           sum(x)
       end
59.815 ns (1 allocation: 160 bytes)
5.037723569560573
julia> @btime begin
           x = rand(10)
           extend1(x, 1000)
           sum(x)
       end
7.281 μs (8 allocations: 20.33 KiB)
6.079832879992913
julia> x = rand(10)
julia> @btime begin
           x = rand(10)
           append!(x, zeros(990))
           sum(x)
       end
1.290 μs (3 allocations: 15.91 KiB)
3.688526541987817
julia>
Pushing primitives in a loop is damned fast; allocating a vector of zeros so we can use append! is faster still (1.290 μs versus 7.281 μs in the runs above).
But the real lesson here is seen in the fact that the loop version takes microseconds to append nearly 1000 values (reallocating the array several times). Appending 10 values one by one takes just over 150ns (and append! is slightly faster). This is blindingly fast. Literally doing nothing in R or Python can take longer than this.
This difference would matter in some situations and would be undetectable in many others. If it matters, measure. If it doesn't, do the simplest thing that comes to mind because Julia has your back (performance-wise).
FURTHER UPDATE
Taking a hint from another of Colin's comments, here are results where we use append! but we don't allocate a list. Instead, we use a generator ... that is, a data structure that invents data when asked for it with an interface much like a list. The results are much better than what I showed above.
julia> @btime begin
           x = rand(10)
           append!(x, (0 for i in 1:990))
           sum(x)
       end
565.814 ns (2 allocations: 8.03 KiB)
Note the round brackets around the 0 for i in 1:990.
In the end, Colin was right. Using append! is much faster if we can avoid related overheads. Surprisingly, the base function Iterators.repeated(0, 990) is much slower.
But, no matter what, all of these options are pretty blazingly fast and all of them would probably be so fast that none of these subtle differences would matter.
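Yet another variant, sketched here without any timing claims, is to grow the vector once with resize! and then fill the new tail. The name extend_resize! is made up for this example; measure it with @btime before preferring it over the versions above.

```julia
# Hypothetical helper: grow `x` to length `n` in one step, then fill the new
# slots with `default`. Does nothing if `x` is already long enough.
function extend_resize!(x::Vector, n::Integer, default=zero(eltype(x)))
    old = length(x)
    if old < n
        resize!(x, n)                          # new slots are uninitialised
        fill!(view(x, old+1:n), default)       # so overwrite them explicitly
    end
    return x
end

v = [1.5, 2.5]
extend_resize!(v, 5)    # v is now [1.5, 2.5, 0.0, 0.0, 0.0]
```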
Julia is fun!
Note that if you want to fill with missing or some other type different from the element type in your original vector, then you will need to change the type of your vectors to allow those new elements. The function below will handle any case.
function fillvectors(x, y, fillvalue=missing)
    xl = length(x)
    yl = length(y)
    if xl < yl
        x::Vector{Union{eltype(x), typeof(fillvalue)}} = x
        for i in xl+1:yl
            push!(x, fillvalue)
        end
    end
    if yl < xl
        y::Vector{Union{eltype(y), typeof(fillvalue)}} = y
        for i in yl+1:xl
            push!(y, fillvalue)
        end
    end
    return x, y
end
x = [1, 2, 3, 4]
y = [1, 2]
julia> (x, y) = fillvectors(x, y)
([1, 2, 3, 4], Union{Missing, Int64}[1, 2, missing, missing])
julia> y
4-element Vector{Union{Missing, Int64}}:
1
2
missing
missing
julia> (x, y) = fillvectors(x, y, 0)
([1, 2, 3, 4], [1, 2, 0, 0])
julia> y
4-element Vector{Int64}:
1
2
0
0
julia> (x, y) = fillvectors(x, y, 1.001)
([1, 2, 3, 4], Union{Float64, Int64}[1, 2, 1.001, 1.001])
julia> y
4-element Vector{Union{Float64, Int64}}:
1
2
1.001
1.001
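Once a vector has been padded with missing, keep in mind that missing propagates through arithmetic. A small sketch of what that looks like, and of turning the padding back into a number with coalesce (here the padded copy is built with vcat rather than the function above, purely to keep the example self-contained):

```julia
x = [1, 2, 3, 4]
y = [1, 2]

# vcat promotes the element type, so the padded copy can hold missing.
y_padded = vcat(y, fill(missing, length(x) - length(y)))

s = x .+ y_padded                  # [2, 4, missing, missing]
t = x .+ coalesce.(y_padded, 0)    # [2, 4, 3, 4]
```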

Julia: Generate all non-repeating permutations in set with duplicates

Let's say, I have a vector x = [0, 0, 1, 1]
I want to generate all different permutations. However, the current permutations function in Julia does not recognize the presence of duplicates in the vector. Therefore in this case, it will output the exact same permutation three times (this one, one where both zeros are swapped and one where the ones are swapped).
Does anybody know a workaround? Because in larger system I end up with an out of bounds error...
Many thanks in advance! :)
permutations returns an iterator and hence running it through unique could be quite efficient with regard to memory usage.
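julia> using Combinatorics  # permutations is provided by Combinatorics.jl in current Julia versions
julia> unique(permutations([0, 0, 1, 1]))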
6-element Array{Array{Int64,1},1}:
[0, 0, 1, 1]
[0, 1, 0, 1]
[0, 1, 1, 0]
[1, 0, 0, 1]
[1, 0, 1, 0]
[1, 1, 0, 0]
I found this answer that I adapted. It expects a sorted vector (or at least repeated values should be together in the list).
julia> function unique_permutations(x::T, prefix=T()) where T
           if length(x) == 1
               return [[prefix; x]]
           else
               t = T[]
               for i in eachindex(x)
                   if i > firstindex(x) && x[i] == x[i-1]
                       continue
                   end
                   append!(t, unique_permutations([x[begin:i-1];x[i+1:end]], [prefix; x[i]]))
               end
               return t
           end
       end
julia> @btime unique_permutations([0,0,0,1,1,1]);
57.100 μs (1017 allocations: 56.83 KiB)
julia> @btime unique(permutations([0,0,0,1,1,1]));
152.400 μs (2174 allocations: 204.67 KiB)
julia> @btime unique_permutations([1;zeros(Int,100)]);
7.047 ms (108267 allocations: 10.95 MiB)
julia> @btime unique(permutations([1;zeros(Int,8)]));
88.355 ms (1088666 allocations: 121.82 MiB)
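The gap in these timings follows from the counts involved: unique(permutations(x)) has to walk all n! orderings, while the number of distinct ones is only n! divided by the factorials of the multiplicities. A quick sanity-check sketch (count_unique_permutations is an illustrative name, not a library function):

```julia
# Illustrative helper: number of distinct permutations of a collection with
# repeated elements, n! / (k1! * k2! * ...), using BigInt to avoid overflow.
function count_unique_permutations(x)
    counts = Dict{eltype(x),Int}()
    for v in x
        counts[v] = get(counts, v, 0) + 1
    end
    return factorial(big(length(x))) ÷ prod(factorial(big(c)) for c in values(counts))
end

count_unique_permutations([0, 0, 1, 1])             # 6, matching the output above
count_unique_permutations([1; zeros(Int, 100)])     # 101, although 101! orderings exist
```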

Julia - combining vectors into the matrix

Let's assume I have two vectors x = [1, 2] and y = [3, 4]. How to best combine them to get a matrix m = [1 2; 3 4] in Julia Programming language? Thanks in advance for your support.
Note that in vcat(x', y') the operation x' is the adjoint, so it should not be used if you are working with complex numbers or with vector elements that do not have an adjoint defined (e.g. strings). In those cases permutedims should be used, but it will be slower as it allocates. A third way to do it (admittedly more cumbersome to type) is:
julia> [reshape(x, 1, :); reshape(y, 1, :)]
2×2 Array{Int64,2}:
1 2
3 4
It is non allocating like [x'; y'] but does not do a recursive adjoint.
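To make the adjoint caveat concrete, here is a small sketch with complex and string vectors (the variable names are only for illustration):

```julia
z = [1 + 2im, 3 + 0im]
w = [5 + 0im, 6 - 1im]

adj = vcat(z', w')                        # adjoint: entries are complex-conjugated
tr  = [transpose(z); transpose(w)]        # transpose: entries are left untouched

a = ["a", "b"]
b = ["c", "d"]
strs = [permutedims(a); permutedims(b)]   # works for strings, which have no adjoint
```

Here adj[1, 1] is 1 - 2im while tr[1, 1] is 1 + 2im, which is why the adjoint form is only safe for real numbers.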
EDIT:
Note for Cameron:
julia> x = repeat(string.('a':'z'), 10^6);
julia> @btime $x';
1.199 ns (0 allocations: 0 bytes)
julia> @btime reshape($x, 1, :);
36.455 ns (2 allocations: 96 bytes)
so reshape allocates but only minimally (it needs to create an array object, while x' creates an immutable struct which does not require allocation).
Also I think it was a design decision to allocate. For isbitsunion element types, reshape actually returns a struct, so it does not allocate (similarly to ranges):
julia> @btime reshape($x, 1, :)
12.211 ns (0 allocations: 0 bytes)
1×2 reshape(::Array{Union{Missing, Int64},1}, 1, 2) with eltype Union{Missing, Int64}:
1 missing
Two ways I know of:
julia> x = [1,2];
julia> y = [3,4];
julia> vcat(x', y')
2×2 Array{Int64,2}:
1 2
3 4
julia> permutedims(hcat(x, y))
2×2 Array{Int64,2}:
1 2
3 4
One more option - this one works both with numbers and with other objects such as Strings:
julia> rotl90([y x])
2×2 Array{Int64,2}:
1 2
3 4
What about
vcat(transpose(x), transpose(y))
or
[transpose(x); transpose(y)]
