How can I concisely define multidimensional `Vec`s in Rust? - vector

When initializing a multidimensional Vec in Rust, I can use the vec!-macro like this:
vec![vec![0; 100]; 200]
However, this gets messy for Vecs of higher dimensions. Currently, I am using this:
vec![vec![vec![vec![vec![vec![vec![vec![0; N-1]; N-1]; N-1]; N-1]; 2]; 2]; 2]; 2]
This is not very concise, and also the order in which the dimensions are written is reverse to the indexing order. Is there a more concise way to do this? I am looking for something like
vec![0; 2, 2, 2, 2, N-1, N-1, N-1, N-1]

The ndarray crate allows you to have an N-dimensional array. For anything above 6 dimensions, you can use the ArrayD type. You can create a dynamic dimension using IxDyn; the documentation has examples.
Example for an eight-dimensional 7x7x...x7 array initialization and element access:
use ndarray::{ArrayD, IxDyn};
let mut array_8d = ArrayD::<f64>::zeros(IxDyn(&[7, 7, 7, 7, 7, 7, 7, 7]));
let index = IxDyn(&[0, 0, 0, 0, 0, 0, 0, 0]);
array_8d[&index] = 1.0;


Using gather() to retrieve rows from 3d tensor with 2d tensor in PyTorch

I'm new to PyTorch and having an issue with the gather() function:
I have a 3d tensor, x[i,j,k]:
I have an index tensor:
I want to use the values of index to iterate over x[j] and fetch the (complete) rows. I've tried gather() with all dims, squeezing, unsqueezing and it never seems to get the output I'm looking for, which would be:
I've also tried repeating the values of index to get the same shape as x but it did not work.
I know I can do this with a for loop, but I'm pretty sure I can do it with gather() as well. Thanks for the help.
Let us set up the two tensors x and index:
>>> x = torch.arange(1,19).view(2,3,3)
>>> index = torch.tensor([[1,2,0]])
What you are looking for is the torch.gather operation, which for dim=1 computes:
out[i][j][k] = x[i][index[i][j][k]][k]
In order to apply this function, you need to expand index to the same shape as out. Additionally, a transpose operation is required to flip your original index tensor.
>>> i = index.T.expand_as(x)
>>> i
tensor([[[1, 1, 1],
[2, 2, 2],
[0, 0, 0]],
[[1, 1, 1],
[2, 2, 2],
[0, 0, 0]]])
If you compare with the pseudo code line above, you can see how every element of i represents the row of the original tensor x the operator will gather values from.
Applying the function gets us to the desired result:
>>> x.gather(dim=1, index=index.T.expand_as(x))
tensor([[[ 4, 5, 6],
[ 7, 8, 9],
[ 1, 2, 3]],
[[13, 14, 15],
[16, 17, 18],
[10, 11, 12]]])
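To check the indexing rule by hand, here is a minimal pure-Python sketch that models gather along dim=1 on nested lists. The helper name gather_dim1 and the list-based stand-ins for x and the expanded index are assumptions made for illustration; this is not how PyTorch implements the operation.

```python
def gather_dim1(x, index):
    """Model of gather along dim=1: out[i][j][k] = x[i][index[i][j][k]][k]."""
    return [
        [
            [x[i][index[i][j][k]][k] for k in range(len(index[i][j]))]
            for j in range(len(index[i]))
        ]
        for i in range(len(index))
    ]


# Nested-list stand-in for torch.arange(1, 19).view(2, 3, 3)
x = [[[9 * i + 3 * j + k + 1 for k in range(3)] for j in range(3)] for i in range(2)]

# Nested-list stand-in for index.T.expand_as(x) with index = [[1, 2, 0]]
rows = [1, 2, 0]
expanded = [[[r] * 3 for r in rows] for _ in range(2)]

out = gather_dim1(x, expanded)
for block in out:
    print(block)
```

Each inner list of expanded repeats one row number three times, which is why whole rows of x are copied rather than individual elements.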

Broadcasting struct creation with `Base.@kwdef`

If I have a large struct that I want to create an array of (e.g. to later create a StructArray), how can I create an array of structs when I have keyword defaults?
Base.@kwdef struct MyType
    a = 0
    b = 0
    c = 0
    d = 0
    ... # can be up to 10 or 20 fields
end
Base.@kwdef is nice because I can create objects with MyType(b=10,e=5) but sometimes I have arrays of the arguments. I would like to be able to broadcast or succinctly construct an array of the structs.
That is, I would like the following to create an array of three MyTypes: MyType.(c=[5,6,7],d = [1,2,3])
Instead, it creates a single MyType where c and d are arrays rather than scalar values.
What are ways to keep the convenience of both Base.@kwdef and easy array of struct construction?
Seems like a good use case for a comprehension:
julia> [MyType(c=cval, d=dval) for (cval, dval) in zip([5, 6, 7], [1, 2, 3])]
3-element Vector{MyType}:
MyType(0, 0, 5, 1)
MyType(0, 0, 6, 2)
MyType(0, 0, 7, 3)
Another possibility (based on this answer) is to explicitly do the broadcast call yourself:
julia> broadcast((cval, dval) -> MyType(c = cval, d = dval), [5, 6, 7], [1, 2, 3])
3-element Vector{MyType}:
MyType(0, 0, 5, 1)
MyType(0, 0, 6, 2)
MyType(0, 0, 7, 3)
or the equivalent ((cval, dval) -> MyType(c = cval, d = dval)).([5, 6, 7], [1, 2, 3]) as mentioned in the comment there.
Out of these, the array comprehension seems to me the clearest and most obvious way to go about it.
According to this post, there is no nice built-in syntax for your case.
One option is a comprehension (see the other answer). A second option (which I prefer here) is to build an anonymous function and broadcast over it, such as:
julia> ((x,y)->MyType(;c=x,d=y)).([1,2],[3,5])
2-element Vector{MyType}:
MyType(0, 0, 1, 3)
MyType(0, 0, 2, 5)
It is also possible to call broadcast directly as:
julia> broadcast((x,y)->MyType(;c=x,d=y), [1,2],[3,5])
2-element Vector{MyType}:
MyType(0, 0, 1, 3)
MyType(0, 0, 2, 5)

Most common term in a vector - PARI/GP

I feel like I'm being really stupid here as I would have thought there's a simple command already in Pari, or it should be a simple thing to write up, but I simply cannot figure this out.
Given a vector, say V, which will have duplicate entries, how can one determine what the most common entry is?
For example, say we have:
V = [ 0, 1, 2, 2, 3, 4, 6, 8, 8, 8 ]
I want something which would return the value 8.
I'm aware of things like vecsearch, but I can't see how that can be tweaked to make this work?
Very closely related to this, I want this result to return the most common non-zero entry, and some vectors I look at will have 0 as the most common entry. Eg: V = [ 0, 0, 0, 0, 3, 3, 5 ]. So whatever I execute here I would like to return 3.
I tried writing up something which would remove all zero terms, but again struggled.
The thing I have tried in particular is:
rem( v ) = {
  my( c );
  while( c = vecsearch( v, 0 ); #c, v = vecextract( v, "^c" ) ); v
}
but vecextract doesn't seem to like this set up.
If you can ensure that all the elements lie within some fixed range, then a counting sort is enough. In PARI/GP it looks like this:
counts_for(v: t_VEC, lower: t_INT, upper: t_INT) = {
  my(counts = vector(1+upper-lower));
  for(i=1, #v, counts[1+v[i]-lower]++);
  \\ pair each value in lower..upper with its count
  vector(#counts, i, [lower+i-1, counts[i]])
}
V1 = [0, 1, 2, 2, 3, 4, 6, 8, 8, 8];
vecsort(counts_for(V1, 0, 8), [2], 4)[1][1]
> 8
V2 = [0, 0, 0, 0, 3, 3, 5];
vecsort(counts_for(V2, 0, 5), [2], 4)[1][1]
> 0
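The counting idea does not depend on GP, so it can be cross-checked in a few lines of Python. The sketch below mirrors counts_for and adds a skip_zero flag to cover the "most common non-zero entry" case from the question; that flag and the Python function names are assumptions for illustration, not part of the GP answer.

```python
def counts_for(values, lower, upper):
    """Return (value, count) pairs for every integer in lower..upper."""
    counts = [0] * (upper - lower + 1)
    for v in values:
        counts[v - lower] += 1
    return [(lower + i, c) for i, c in enumerate(counts)]


def most_frequent(values, skip_zero=False):
    """Most common entry; with skip_zero=True the value 0 is ignored."""
    pairs = counts_for(values, min(values), max(values))
    if skip_zero:
        pairs = [(v, c) for v, c in pairs if v != 0]
    # max() keeps the first pair on ties, i.e. the smallest such value
    return max(pairs, key=lambda p: p[1])[0]


V1 = [0, 1, 2, 2, 3, 4, 6, 8, 8, 8]
V2 = [0, 0, 0, 0, 3, 3, 5]
print(most_frequent(V1))
print(most_frequent(V2))
print(most_frequent(V2, skip_zero=True))
```

Like the GP version, this only makes sense for integer entries in a modest range, and it raises an error on an empty input.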
You can also implement the following shortcut for convenience:
counts_for1(v: t_VEC) = {
  counts_for(v, vecmin(v), vecmax(v))
}
most_frequent(v: t_VEC) = {
  my(counts = counts_for1(v));
  vecsort(counts, [2], 4)[1][1]
}
most_frequent(V1)
> 8
most_frequent(V2)
> 0
The function matreduce provides this in a more general setting: applied to a vector of objects, it returns a 2-column matrix whose first column contains the distinct objects and the second their multiplicity in the vector. (The function has a more general form that takes the union of multisets.) The rows are ordered by the objects, not by multiplicity, so the row with the largest multiplicity still has to be located, here with vecmax:
most_frequent(v) = my(M = matreduce(v), i); vecmax(M[,2], &i); M[i, 1];
most_frequent_non0(v) = most_frequent(select(x -> x != 0, v));
? most_frequent([ 0, 1, 2, 2, 3, 4, 6, 8, 8, 8 ])
%1 = 8
? most_frequent([x, x, Mod(1,3), [], [], []])
%2 = []
? most_frequent_non0([ 0, 0, 0, 0, 3, 3, 5 ])
%3 = 3
? most_frequent_non0([x, x, Mod(1,3), [], [], []])
%4 = x
The first function will error out if fed an empty vector, and the second one if there are no non-zero entries. The second function tests for "0" using the x != 0 test (and we famously have [] == 0 in GP); for a more rigorous semantic, use !(x === 0) in the function definition.

Julia idiomatic way to split vector to subvectors based on condition

Let's say I have a vector a = [1, 0, 1, 2, 3, 4, 5, 0, 5, 6, 7, 8, 0, 9, 0] and I want to split it to smaller vectors based on a condition depending on value in that array. E.g. value being zero.
Thus I want to obtain vector of following vectors
[1, 0]
[1, 2, 3, 4, 5, 0]
[5, 6, 7, 8, 0]
[9, 0]
So far this was working for me as a naive solution, but it loses the type.
function split_by_λ(a::Vector, λ)
    b = []
    temp = []
    for i in a
        push!(temp, i)
        if λ(i)
            push!(b, temp)
            temp = []
        end
    end
    b
end
split_by_λ(a, isequal(0))
Then I tried to play with ranges, which feels a bit more idiomatic, and does not lose the type.
function split_by_λ(a::Vector, λ)
    idx = findall(λ, a)
    ranges = [(:)(i==1 ? 1 : idx[i-1]+1, idx[i]) for i in eachindex(idx)]
    map(x->a[x], ranges)
end
split_by_λ(a, isequal(0))
but it still feels very cumbersome considering it's a rather simple task.
Is there something I'm missing, some easier way?
Maybe someone has a shorter idea but here is mine:
julia> inds = vcat(0,findall(==(0),a),length(a))
julia> getindex.(Ref(a), (:).(inds[1:end-1].+1,inds[2:end]))
5-element Array{Array{Int64,1},1}:
[1, 0]
[1, 2, 3, 4, 5, 0]
[5, 6, 7, 8, 0]
[9, 0]
Int64[]
Or, if you want to avoid copying a:
julia> view.(Ref(a), (:).(inds[1:end-1].+1,inds[2:end]))
5-element Array{SubArray{Int64,1,Array{Int64,1},Tuple{UnitRange{Int64}},true},1}:
[1, 0]
[1, 2, 3, 4, 5, 0]
[5, 6, 7, 8, 0]
[9, 0]
0-element view(::Array{Int64,1}, 16:15) with eltype Int64
Pretty much the same as Przemyslaw's answer, but maybe less cryptically dense:
function split_by(λ, a::Vector)
    first, last = firstindex(a), lastindex(a)
    splits = [first-1; findall(λ, a); last]
    s1, s2 = @view(splits[1:end-1]), @view(splits[2:end])
    return [view(a, i1+1:i2) for (i1, i2) in zip(s1, s2)]
end
Also, I changed the signature to the conventional one of "functions first", which allows you to use do-blocks. Additionally, this should work with offset indexing.
One could surely get rid of the intermediate allocations, but I think that gets ugly without yield:
function split_by(λ, a::Vector)
    result = Vector{typeof(view(a, 1:0))}()
    l = firstindex(a)
    r = firstindex(a)
    while r <= lastindex(a)
        if λ(a[r])
            push!(result, @view(a[l:r]))
            l = r + 1
        end
        r += 1
    end
    push!(result, @view(a[l:end]))
    return result
end
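For comparison, the same single-pass idea can be sketched in Python, where slicing makes the bookkeeping short. The function name split_after is an assumption for illustration. Unlike the Julia versions above, this sketch copies the chunks and drops the empty trailing chunk when the input ends on a split element.

```python
def split_after(pred, items):
    """Split items into chunks, closing a chunk after each element matching pred."""
    result = []
    start = 0
    for pos, item in enumerate(items):
        if pred(item):
            result.append(items[start:pos + 1])
            start = pos + 1
    # Keep whatever follows the last match, but skip an empty remainder
    if start < len(items):
        result.append(items[start:])
    return result


a = [1, 0, 1, 2, 3, 4, 5, 0, 5, 6, 7, 8, 0, 9, 0]
chunks = split_after(lambda v: v == 0, a)
for chunk in chunks:
    print(chunk)
```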

Transforming matrix diagonals to ragged array?

I'm trying to come up with a non brute-force solution to the following problem. Given a matrix of arbitrary size:
[6 0 3 5]
[3 7 1 4]
[1 4 8 2]
[0 2 5 9]
Transform its diagonals to a list of vectors, like so:
(1, 2)
(3, 4, 5)
(6, 7, 8, 9)
(0, 1, 2)
(3, 4)
(Working from bottom left to top right in this example)
Is there an elegant way to do this short of iterating up the left column and across the top row?
I would just write a little function to transform the vector indices into matrix indices.
Say the matrix is NxN square, then there will be 2N-1 vectors; if we number the vectors from 0 to 2N-2, element k of vector n will be at row max(N-1-n+k,k) and column max(n+k-N+1,k) (or in reverse, the matrix element at row i, column j will be element min(i,j) of vector N-1+j-i). Then whenever you need to access an element of a vector, just convert the coordinates from k,n to i,j (that is, convert vector indices to matrix indices) and access the appropriate element of the matrix. Instead of actually having a list of vectors, you'll wind up with something that emulates a list of vectors, in the sense that it can give you any desired element of any vector in the list - which is really just as good. (Welcome to duck typing ;-)
If you're going to access every element of the matrix, though, it might just be quicker to iterate, rather than doing this computation every time.
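The index conversion described above can be sketched in Python as follows, using 0-based indices and assuming a square NxN matrix. The function names and the diag_length helper (the number of elements on diagonal n) are not from the answer; they are added here so the emulated list can be materialised and compared with the expected output.

```python
def diag_to_matrix(size, n, k):
    """Map element k of diagonal n to (row, col) in a size x size matrix."""
    row = max(size - 1 - n + k, k)
    col = max(n + k - size + 1, k)
    return row, col


def diag_length(size, n):
    """Number of elements on diagonal n, for n in 0 .. 2*size - 2."""
    return size - abs(size - 1 - n)


m = [
    [6, 0, 3, 5],
    [3, 7, 1, 4],
    [1, 4, 8, 2],
    [0, 2, 5, 9],
]
N = len(m)

diagonals = []
for n in range(2 * N - 1):
    cells = [diag_to_matrix(N, n, k) for k in range(diag_length(N, n))]
    diagonals.append([m[r][c] for r, c in cells])

for d in diagonals:
    print(d)
```

Note that this enumeration also yields the two single-element corner diagonals, which the question's sample output leaves out.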
(non-checked code)
Something like this (java code):
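import java.util.ArrayList;
import java.util.List;

// suppose m is the matrix, so basically an int[rows][cols] array
List<List<Integer>> result = new ArrayList<>(rows + cols - 1);
for (int i = 0; i < rows + cols - 1; i++) {
    int y;
    int x;
    if (i < rows) {
        // diagonals that start in the left column, bottom row first
        x = 0;
        y = rows - i - 1;
    } else {
        // diagonals that start in the top row, left to right
        x = i - rows + 1;
        y = 0;
    }
    List<Integer> v = new ArrayList<>();
    // walk down and to the right until we leave the matrix
    while (y < rows && x < cols) {
        v.add(m[y][x]);
        x++;
        y++;
    }
    result.add(v);
}
// result now contains the vectors you wanted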
Edit: i had x and y mixed up, corrected now.
m = {{6, 0, 3, 5},
{3, 7, 1, 4},
{1, 4, 8, 2},
{0, 2, 5, 9}};
Table[Diagonal[m, i], {i, 1 - Length@m, Length@m[[1]] - 1}]
Which gives a list of the i'th diagonals where the 0th diagonal is the main diagonal, i = -1 gives the one below it, etc. In other words, it returns:
{{0}, {1, 2}, {3, 4, 5}, {6, 7, 8, 9}, {0, 1, 2}, {3, 4}, {5}}
Of course using the built-in Diagonal function is kind of cheating. Here's an implementation of Diagonal from scratch:
(* Grab the diagonal starting from element (i,j). *)
diag0[m_,i_,j_] := Table[m[[i+k, j+k]], {k, 0, Min[Length[m]-i, Length@m[[1]]-j]}]
(* The i'th diagonal -- negative means below the main diagonal, positive above. *)
Diagonal[m_, i_] := If[i < 0, diag0[m, 1-i, 1], diag0[m, 1, i+1]]
The Table function is basically a for loop that collects into a list. For example,
Table[2*i, {i, 1, 5}]
returns {2,4,6,8,10}.
