Julia - Repeat entries of a vector by another vector (inner)

I have an array x and I would like to repeat each entry of x a number of times specified by the corresponding entry of another array y, of the same length as x.
x = [1, 2, 3, 4, 5] # Array to be repeated
y = [3, 2, 1, 2, 3] # Repetitions for each element of x
# result should be [1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5]
Is there a way to do this in Julia?

Your x and y vectors constitute what is called a run-length encoding of the vector [1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5]. So if you take the inverse of the run-length encoding, you will get the vector you are looking for. The StatsBase.jl package contains the rle and inverse_rle functions. We can use inverse_rle like this:
julia> using StatsBase
julia> x = [1, 2, 3, 4, 5];
julia> y = [3, 2, 1, 2, 3];
julia> inverse_rle(x, y)
11-element Vector{Int64}:
1
1
1
2
2
3
4
4
5
5
5
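To make the run-length idea concrete without installing a package, here is a minimal Base-only sketch of the forward direction. The name simple_rle is hypothetical (it is not the StatsBase implementation); it just shows that encoding the target vector gives back your x and y:

```julia
# Hypothetical helper (not part of StatsBase): a naive run-length encoder.
function simple_rle(v::AbstractVector{T}) where T
    vals = T[]     # distinct consecutive values
    lens = Int[]   # length of each run
    for item in v
        if !isempty(vals) && vals[end] == item
            lens[end] += 1        # same value as the current run: extend it
        else
            push!(vals, item)     # new value: start a new run of length 1
            push!(lens, 1)
        end
    end
    return vals, lens
end

vals, lens = simple_rle([1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5])
```

Here vals and lens should come out equal to the x and y from the question.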

You've given what I would have suggested as the answer already in your comment:
vcat(fill.(x, y)...)
How does this work? Start with fill:
help?> fill
fill(x, dims::Tuple)
fill(x, dims...)
Create an array filled with the value x. For example, fill(1.0, (5,5)) returns a 5×5 array of floats, with each element initialized to 1.0.
This is a bit more complicated than it needs to be for our case (where we only have one dimension to fill into), so let's look at a simple example:
julia> fill(1, 3)
3-element Vector{Int64}:
1
1
1
so fill(1, 3) just means "take the number one, and put this number into a one-dimensional array 3 times."
This of course is exactly what we want to do here: for every element in x, we want an array that holds this element multiple times, with the multiple given by the corresponding element in y. We could therefore loop over x and y and do something like:
julia> for (xᵢ, yᵢ) ∈ zip(x, y)
           fill(xᵢ, yᵢ)
       end
Now this loop doesn't return anything, so we'd have to preallocate some storage and assign to that within the loop. A more concise way of writing this while automatically returning an object would be a comprehension:
julia> [fill(xᵢ, yᵢ) for (xᵢ, yᵢ) ∈ zip(x, y)]
5-element Vector{Vector{Int64}}:
[1, 1, 1]
[2, 2]
[3]
[4, 4]
[5, 5, 5]
and even more concisely, we can just use broadcasting:
julia> fill.(x, y)
5-element Vector{Vector{Int64}}:
[1, 1, 1]
[2, 2]
[3]
[4, 4]
[5, 5, 5]
so from the comprehension or the broadcast we are getting a vector of vectors, each vector being an element of x repeated y times. Now all that remains is to put these together into a single vector by concatenating them vertically:
julia> vcat(fill.(x, y)...)
11-element Vector{Int64}:
1
1
1
2
2
3
4
4
5
5
5
Here we are using splatting to essentially do:
z = fill.(x, y)
vcat(z[1], z[2], z[3], z[4], z[5])
Note that splatting can have suboptimal performance for arrays of variable length, so a better way is to use reduce which is special cased for this and will give the same result:
reduce(vcat, fill.(x, y))
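As a quick sanity check, the two spellings can be compared directly. This is only a small sketch using the vectors from the question; the zero-count case at the end is an extra I added to show that an entry repeated 0 times simply disappears:

```julia
x = [1, 2, 3, 4, 5]
y = [3, 2, 1, 2, 3]

# Splatting passes each inner vector as a separate argument to vcat.
splatted = vcat(fill.(x, y)...)

# reduce(vcat, ...) concatenates the same inner vectors without splatting.
reduced = reduce(vcat, fill.(x, y))

# A count of zero yields an empty inner vector, which contributes nothing.
with_zero = reduce(vcat, fill.([7, 8, 9], [2, 0, 1]))
```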

If performance is a priority, you can also do it the long, manual way:
function runlengthdecode(vals::Vector{T}, reps::Vector{<:Integer}) where T
    length(vals) == length(reps) || throw(ArgumentError("Same number of values and counts expected"))
    result = Vector{T}(undef, sum(reps))
    resind = 1
    for (valind, numrep) in enumerate(reps)
        for i in 1:numrep
            @inbounds result[resind] = vals[valind]
            resind += 1
        end
    end
    result
end
This runs about 12 times faster than the vcat/fill based method for the given data, likely because of avoiding creating all the intermediate filled vectors.
You can also use fill! on views of the preallocated result instead, by replacing the loop in the code above with:
for (val, numrep) in zip(vals, reps)
    fill!(@view(result[resind:resind + numrep - 1]), val)
    resind += numrep
end
which has comparable performance.
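For reference, here is how the fill!-on-views variant might look as a complete function. The name runlengthdecode_fill is my own, and I loosened the argument types to AbstractVector as an assumption; the body otherwise follows the loop shown above:

```julia
# Hypothetical name; same idea as runlengthdecode, but filling views of the result.
function runlengthdecode_fill(vals::AbstractVector{T}, reps::AbstractVector{<:Integer}) where T
    length(vals) == length(reps) || throw(ArgumentError("Same number of values and counts expected"))
    result = Vector{T}(undef, sum(reps))
    resind = 1
    for (val, numrep) in zip(vals, reps)
        # The view covers exactly numrep slots; for numrep == 0 it is empty.
        fill!(@view(result[resind:resind + numrep - 1]), val)
        resind += numrep
    end
    result
end

decoded = runlengthdecode_fill([1, 2, 3, 4, 5], [3, 2, 1, 2, 3])
```

I have not benchmarked this version myself, so treat the performance remarks above as applying to the original answer's code.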

Also, for completeness, a comprehension can be quite handy for this. And it's faster than fill and vcat.
julia> [x[i] for i=1:length(x) for j=1:y[i]]
11-element Vector{Int64}:
1
1
1
2
2
3
4
4
5
5
5
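The same nested comprehension can also be written over zip(x, y), which avoids indexing by hand. This is just an alternative spelling, not something from the answer above:

```julia
x = [1, 2, 3, 4, 5]
y = [3, 2, 1, 2, 3]

# The outer loop pairs each value with its count; the inner loop emits the value n times.
repeated = [xi for (xi, n) in zip(x, y) for _ in 1:n]
```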

Related

Apply function with multiple arguments to a vector in Julia

I would like to apply function with multiple arguments to a vector.
It seems that both map() and map!() can be helpful.
It works perfectly if the function has one argument:
f = function(a)
    a+a
end
x=[1,2,3,4,5]
map(f, x)
output: [2, 4, 6, 8, 10]
However, it is not clear how to pass arguments to the function, if possible, and the vector to broadcast, if the function has multiple arguments.
f = function(a,b)
    a*b
end
However, none of the following work:
b=3
map(f(a,b), x, 3)
map(f, x, 3)
map(f, a=x, b=3)
map(f(a,b), x, 3)
map(f(a,b), a=x,b=3)
Expected output:
[3,6,9,12,15]
Use broadcast - just as you suggested in the question:
julia> f = function(a,b)
           a*b
       end
#1 (generic function with 1 method)
julia> x=[1,2,3,4,5]
5-element Vector{Int64}:
1
2
3
4
5
julia> b=3
3
julia> f.(x, b)
5-element Vector{Int64}:
3
6
9
12
15
map does not broadcast, so if b is a scalar you would manually need to write:
julia> map(f, x, Iterators.repeated(b, length(x)))
5-element Vector{Int64}:
3
6
9
12
15
You can, however, pass two iterables to map without a problem:
julia> map(f, x, x)
5-element Vector{Int64}:
1
4
9
16
25
One possible solution is to create an anonymous function inside map, as follows:
x = [1, 2, 3, 4, 5]
b = 3
f = function(a, b)
    a * b
end
map(x -> f(x, b), x)
which produces the following output:
5-element Vector{Int64}:
3
6
9
12
15
Explanation: the anonymous function takes each value from the vector as its first argument, while the second argument is fixed at b = 3.
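If you would rather not write the closure by hand, Base.Fix2 should do the same job of fixing the second argument. A small sketch, assuming the same f, x and b as above:

```julia
f = function(a, b)
    a * b
end
x = [1, 2, 3, 4, 5]
b = 3

# Base.Fix2(f, b) is a callable that behaves like a -> f(a, b).
times_b = Base.Fix2(f, b)
result = map(times_b, x)
```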
A few other options (func here is the two-argument function, called f above):
julia> map(Base.splat(func), Iterators.product(3, x))
5-element Vector{Int64}:
3
6
9
12
15
Iterators.product returns a list of tuples (3, 1), (3, 2), etc. Since our function func takes multiple separate arguments and not a tuple, we use Base.splat on it which takes the tuple and splats it into separate arguments to pass on to func.
julia> using SplitApplyCombine: product
julia> product(func, x, 3)
5-element Vector{Int64}:
3
6
9
12
15
SplitApplyCombine.jl's product function can directly map a given function over each combination (Cartesian product) of the given arguments.
julia> map(func, x, Iterators.cycle(3))
5-element Vector{Int64}:
3
6
9
12
15
A difference from the two previous ways is that if the shorter argument were a vector with more than one element, the previous methods would apply the function to each combination of elements from the two arguments, whereas this one recycles the shorter vector, repeating it until it is as long as the other one, and then applies the function elementwise (much like zipping against Python's itertools.cycle).
julia> y = [10, 1000];
julia> product(func, x, y) # previous method
5×2 Matrix{Int64}:
10 1000
20 2000
30 3000
40 4000
50 5000
julia> map(func, x, Iterators.cycle(y))
5-element Vector{Int64}:
10
2000
30
4000
50
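The same contrast can be reproduced with Base alone, which may help if SplitApplyCombine.jl is not installed. In this sketch I assume func is plain multiplication, and a two-dimensional comprehension stands in for the "every combination" behaviour:

```julia
func(a, b) = a * b   # assumed definition, matching the outputs shown above
x = [1, 2, 3, 4, 5]
y = [10, 1000]

# Every combination: one row per element of x, one column per element of y.
combos = [func(a, b) for a in x, b in y]

# Cycling: y is repeated (10, 1000, 10, ...) and paired elementwise with x.
cycled = map(func, x, Iterators.cycle(y))
```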

Julia transpose vector of vectors

I have a vector of vectors, say
julia> m=[[1,2],[3,4],[5,6]]
3-element Vector{Vector{Int64}}:
[1, 2]
[3, 4]
[5, 6]
which I want to transpose, meaning that I want a 2-element vector with the corresponding 3-element vectors (1,3,5 and 2,4,6).
This could obviously be done with loops, but I suspect that this is slow and am sure that Julia has a better solution for it. The best one I could come up with so far looks like this:
julia> matrixM=reshape(collect(Iterators.flatten(m)), (size(m[1],1),size(m,1)))
2×3 Matrix{Int64}:
1 3 5
2 4 6
julia> map(i->matrixM[i,:], 1:size(matrixM,1))
2-element Vector{Vector{Int64}}:
[1, 3, 5]
[2, 4, 6]
You can use:
julia> using SplitApplyCombine
julia> invert([[1,2],[3,4],[5,6]])
2-element Vector{Vector{Int64}}:
[1, 3, 5]
[2, 4, 6]
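If you want to stay within Base, a comprehension over broadcasted getindex seems to give the same result. This sketch assumes all inner vectors have the same length:

```julia
m = [[1, 2], [3, 4], [5, 6]]

# For each position i within the inner vectors, gather the i-th entry of every inner vector.
transposed = [getindex.(m, i) for i in eachindex(first(m))]
```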

How to search and manipulate sparse matrices in Julia

I have a large sparse matrix. I would like to be able to do two things:
1. Find a row that has only one non-zero value. Let's call its row index idx.
2. Zero out column idx, found in step 1.
I would like to do this efficiently, as the matrix is large.
I have tried reading https://docs.julialang.org/en/v1/stdlib/SparseArrays/ but I can't see how to do either.
If I understand you correctly this should work:
julia> using SparseArrays
# Dummy data
julia> A = sparse([1, 1, 2, 2, 3, 3], [1, 2, 3, 1, 2, 3], [2, 3, 0, 0, 0, 5])
3×3 SparseMatrixCSC{Int64,Int64} with 6 stored entries:
[1, 1] = 2
[2, 1] = 0
[1, 2] = 3
[3, 2] = 0
[2, 3] = 0
[3, 3] = 5
# Count non-zero elements across rows
julia> using StatsBase
julia> valcounts = countmap(A.rowval[A.nzval .!= 0])
Dict{Int64,Int64} with 2 entries:
3 => 1
1 => 2
# Find the row(s) with only one non-zero element
julia> [k for k ∈ keys(valcounts) if valcounts[k] == 1]
1-element Array{Int64,1}:
3
# Set the non-zero element in the third row to zero
julia> A[3, A[3, :] .> 0] .= 0
1-element view(::SparseMatrixCSC{Int64,Int64}, 3, [3]) with eltype Int64:
0
julia> A
3×3 SparseMatrixCSC{Int64,Int64} with 6 stored entries:
[1, 1] = 2
[2, 1] = 0
[1, 2] = 3
[3, 2] = 0
[2, 3] = 0
[3, 3] = 0
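For the second step as literally asked (zeroing column idx rather than the entry in row idx), something along these lines might work with the SparseArrays standard library alone. The data here is my own made-up example, and I am assuming "zero out" means clearing every stored entry of that column:

```julia
using SparseArrays

# Made-up 3×3 example: row 2 is the only row with exactly one non-zero.
A = sparse([1, 1, 2, 3, 3], [1, 2, 1, 2, 3], [2, 3, 4, 1, 5])

# Count the non-zero stored entries in each row.
rows, _, vals = findnz(A)
counts = zeros(Int, size(A, 1))
for (r, v) in zip(rows, vals)
    v != 0 && (counts[r] += 1)
end

# First row with exactly one non-zero value.
idx = findfirst(==(1), counts)

# Zero the stored entries of column idx directly in the CSC storage,
# then drop the explicit zeros so they are no longer stored.
nonzeros(A)[nzrange(A, idx)] .= 0
dropzeros!(A)
```

Working on nzrange touches only the entries stored for that one column, which is why I would expect it to scale reasonably for a large matrix, but I have not measured it.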

How to generate all possible sample paths in Julia from vectors of unequal length

I have 5 vectors t1,...,t5, of respective unequal lengths n1, .. ,n5. How can I generate an (n1*...*n5)x(5) matrix in Julia, which would be:
What you may be looking for is Iterators.product, though it does not generate exactly what you requested:
julia> n1, n2, n3, n4, n5 = 2, 3, 4, 5, 6;
julia> a = Iterators.product(1:n1, 1:n2, 1:n3, 1:n4, 1:n5)
Base.Iterators.ProductIterator{NTuple{5,UnitRange{Int64}}}((1:2, 1:3, 1:4, 1:5, 1:6))
julia> first(a)
(1, 1, 1, 1, 1)
julia> reduce(vcat, a)
720-element Array{NTuple{5,Int64},1}:
(1, 1, 1, 1, 1)
(2, 1, 1, 1, 1)
(1, 2, 1, 1, 1)
(2, 2, 1, 1, 1)
....
It doesn't create the Matrix you requested, but most of the time you'll generate a Matrix like that to use it for something else. In this case this is better, as it avoids allocating a temporary Matrix.
@BogumiłKamiński wrote in a comment below that you can get a Matrix (not ordered exactly like the one in your example though) from the object by
julia> reduce(vcat, reduce.(hcat, a))
720×5 Array{Int64,2}:
1 1 1 1 1
2 1 1 1 1
1 2 1 1 1
...
which is maybe not the first thing one would think about, but gets the job done nicely.
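Another way to get a matrix with one row per path is to collect the tuples and index into them with a two-dimensional comprehension. The three short vectors below are placeholders I made up to keep the output small; the same pattern should extend to five vectors:

```julia
# Placeholder input vectors of unequal length.
t1, t2, t3 = [1, 2], [10, 20, 30], [100, 200]

# All combinations as a flat vector of tuples (the first vector varies fastest).
paths = vec(collect(Iterators.product(t1, t2, t3)))

# One row per path, one column per input vector.
M = [p[j] for p in paths, j in 1:3]
```

This does allocate the full matrix, so it only makes sense when you really need the matrix rather than the iterator.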

Number of linearly independent subsets with cardinality 4

I have a list of vectors in the 5-dimensional vector space over Q, which I want to put in a list and use Combinations(list, 4) to get all sublists with 4 elements. I then want to
check how many of those sublists are linearly independent in the vector space, using V.linear_dependence(vs) == [].
I'm running into an error when running my code:
V = VectorSpace(QQ,5)
V.list = ([2, 2, 2,-3,-3],[2, 2,-3,2,-3],[2,2,-3,-3,2],[2,-3,2,2,-3],[2,-3,2,-3,2],[2,-3,-3,2,2],[-3,2,2,2,-3],[-3,2,2,-3,2],[-3,2,-3,2,2],[-3,-3,2,2,2])
C = Combinations(list, 4)
V.linear_dependence(C) == []
"ValueError: vector [[2, 2, 2, -3, -3], [2, 2, -3, 2, -3], [2, 2, -3, -3, 2], [2, -3, 2, 2, -3]] is not an element of Vector space of dimension 5 over Rational Field"
Anyone got any clues as to what I'm missing?
You are asking it to just take a list (or actually, tuple) and put it in the vector space, but I think Sage doesn't do that automatically. Try this.
V = VectorSpace(QQ,5)
list = ([2, 2, 2,-3,-3],[2, 2,-3,2,-3],[2,2,-3,-3,2],[2,-3,2,2,-3],[2,-3,2,-3,2],[2,-3,-3,2,2],[-3,2,2,2,-3],[-3,2,2,-3,2],[-3,2,-3,2,2],[-3,-3,2,2,2])
C = Combinations(list, 4)
for c in C:
    if V.linear_dependence([V(x) for x in c]) == []:
        print(c)
The reason for the list comprehension is that the entries of c are plain lists and are not inherently elements of a vector space, so each one has to be converted with V(x) first.
A slight modification to this, replacing print(c) with z += 1 (having predefined z = 0), says that 185 of your 210 combinations appear to be linearly independent.
By the way, comparing to the empty list might not be as efficient as other options.
