Efficiently chunk large vector into a vector of vectors


I want to chunk a large vector into a vector of vectors. I know about chunks(), but am not sure of the best way to go from the iterator to a 2D Vec. I have found the following to work, but is there a better way to write this?
let v: Vec<i32> = vec![1, 1, 1, 2, 2, 2, 3, 3, 3];
let v_chunked: Vec<Vec<i32>> = v.chunks(3).map(|x| x.to_vec()).collect();
println!("{:?}", v_chunked); // [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5031d4d0e43470242b8304d483967a25
Profiling shows that an operation similar to this is one of the slowest parts of my program, and I was wondering how to improve it.

If a Vec<Vec<i32>> is what you really want then this is a pretty good way of doing it. Any other approach (excluding unsafe code, see below) is unlikely to be significantly faster or use noticeably less memory. Regardless of the actual code, each nested Vec is a new memory allocation and all the data will need to be copied, and that's essentially all that your code does.
A more "Rusty" way to represent a 2D structure like this is a Vec of slices into the original data. That way no elements are copied, and the only new allocation is the outer Vec that holds the slices.
let v_slices: Vec<&[i32]> = v.chunks(3).collect();
println!("{:?}", v_slices); // [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
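To make the "no copying" point concrete, here is a minimal sketch. The helper names chunk_views and is_view_at are hypothetical (they are not part of the answer above or of the standard library); they only wrap chunks() and compare raw pointers to show that each row points into the original buffer.

```rust
/// Hypothetical helper (not from the original answer): borrow `data` as rows
/// of `width` elements. Only the outer Vec of (pointer, length) pairs is
/// allocated; the elements themselves are never copied.
/// Note: `chunks` panics if `width` is 0.
fn chunk_views(data: &[i32], width: usize) -> Vec<&[i32]> {
    data.chunks(width).collect()
}

/// Hypothetical check: true if `row` starts at `data[offset]`,
/// i.e. it is a view into `data` rather than a separate copy.
fn is_view_at(row: &[i32], data: &[i32], offset: usize) -> bool {
    row.as_ptr() == data[offset..].as_ptr()
}
```

With the v from the question, chunk_views(&v, 3) yields the same three rows as above, and is_view_at(rows[1], &v, 3) holds because the second row begins at v[3]. If the length is not a multiple of the chunk size, the last row is simply shorter.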
Edit: I did have an extra bit here with some unsafe code that would transform a Vec<i32> into a Vec<Vec<i32>> without reallocating. However, it has been pointed out that it still has Undefined Behaviour, and that the problem is fundamentally not fixable.

With the help of the comments, I found storing the data as a 1D Vec to be much more efficient. Then to deal with it conveniently I use chunks and work with the Vec of slices as needed within the function bodies using the data.
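A rough sketch of what that layout might look like, assuming row-major storage and a fixed row width. The Grid type and the row_sums function are hypothetical names used only for illustration; the actual program is not shown in the question.

```rust
use std::slice::Chunks;

/// Hypothetical wrapper: a 2D grid stored as one flat, row-major Vec.
struct Grid {
    data: Vec<i32>,
    width: usize, // elements per row; assumed non-zero
}

impl Grid {
    /// Iterate over the rows as borrowed slices; nothing is copied.
    fn rows(&self) -> Chunks<'_, i32> {
        self.data.chunks(self.width)
    }

    /// Element at (row, col), or None if either index is out of bounds.
    fn get(&self, row: usize, col: usize) -> Option<&i32> {
        if col >= self.width {
            return None;
        }
        self.data.get(row * self.width + col)
    }
}

/// Example consumer: sum each row without ever building a Vec<Vec<i32>>.
fn row_sums(grid: &Grid) -> Vec<i32> {
    grid.rows().map(|row| row.iter().sum::<i32>()).collect()
}
```

For the data in the question (width 3), row_sums returns [3, 6, 9]. Functions that need random access can use get, while functions that walk the data row by row can call rows and work on the slices directly.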

Related

How to find the index of an array, where the element has value x, in R

I have a very large array (RFO_2003; dim = c(360, 180, 13, 12)) of numeric data. I made the array using a for-loop that does some calculations based on another array. I am trying to check some samples of data in this array to ensure I have generated it properly.
To do this, I want to apply a function that returns the index of the array where that element equals a specific value. For example, I want to start by looking at a few examples where the value == 100.
I tried
which(RFO_2003 == 100)
That returned (first line of results)
[1] 459766 460208 460212 1177802 1241374 1241498 1241499 1241711 1241736 1302164 1302165
match gave the same results. What I was expecting was something more like
[8, 20, 3, 6], [12, 150, 4, 7], [16, 170, 4, 8]
Is there a way to get the indices in that format?
My searches have turned up solutions in other languages, lots of material on vectors, or cases where the index is never output because it is immediately fed into another part of a custom function. In those cases I can't see which part would output the index in a way I understand, such as this question, although that one also returns dimnames, not an index.

Interpolation of missing values in Julia

Suppose I have an array where some values are NaN. For example [1, 2, NaN, 4].
Is there any Julia library that is able to fill the array with interpolated values?
So that the result would be [1, 2, 3, 4].
I can not see if I am able to do it with Interpolations.jl.
For now, I am not concerned with the interpolation function. I just need to fill this linearly.
Thanks in advance.

You can just do it yourself: as long as the missing value is not at an edge, find the previous and next values and take their average.
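The idea is language-independent. Below is a minimal sketch of it in Rust (the language of the main question on this page, not Julia), generalised slightly so that a run of several consecutive NaNs is filled linearly between its two neighbours; for a single NaN this reduces to the average described above. The function name fill_linear is hypothetical, and leaving NaNs at the edges untouched is an assumption, since the answer does not say what to do with them.

```rust
/// Hypothetical helper: fill interior NaN runs by linear interpolation
/// between the nearest non-NaN neighbours on each side.
/// Leading and trailing NaNs have only one neighbour and are left as NaN.
fn fill_linear(values: &mut [f64]) {
    let mut prev: Option<usize> = None; // index of the last non-NaN value seen
    for i in 0..values.len() {
        if values[i].is_nan() {
            continue;
        }
        if let Some(p) = prev {
            let gap = i - p;
            if gap > 1 {
                // values[p] and values[i] are known; fill everything between.
                let (start, end) = (values[p], values[i]);
                for k in 1..gap {
                    let t = k as f64 / gap as f64;
                    values[p + k] = start + t * (end - start);
                }
            }
        }
        prev = Some(i);
    }
}
```

Applied to [1, 2, NaN, 4], the single gap is filled with 2 + 0.5 * (4 - 2) = 3, which gives the [1, 2, 3, 4] asked for in the question.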

Output of this strange loop related to matrices

Let us consider the following pseudocode:
int n=n;
int A[][]
scanf(A[][],%d);
for i=1:n;i++
{
    x=A[i][i]
    for j=1:n;j++
    {
        if x<A[i][j]
            a=x;
        x=A[i][j];
        A[i][i]=x;
        A[i][j]=a;
    }
}
return A[][]
I am fumbling with this pseudocode. The question, I think, is just that the diagonal entries are compared and exchanged for the greatest entries. But, will the output depend on the entries of the matrix or will it be independent of them? That is my main question. Specifically, is there any general formula for the output? Is it dependent on the type of matrix A? I think it should be some power of A. Any hints? Thanks beforehand.

You could just write your code in any language you like.
n = 3
A = [[1,2,3], [3,5,6], [7,8,9]]
for i in range(n):
    x=A[i][i]
    for j in range(n):
        a = None
        if x < A[i][j]:
            a = x
        x=A[i][j]
        A[i][i]=x
        A[i][j]=a
print (A)
Gives you:
[[3, 1, 2], [None, 6, 3], [None, 7, None]]
But, will the output depend on the entries of the matrix or will be
independent of it is my main question.
Of course it depends. You can see the initial data in the output, which means the output depends on the data.
Specifically, is there any general formula for the output?
I believe not, but I can't prove it mathematically. Just look at the Nones that appear in the output; I can hardly imagine such a formula.
Is it dependent on the type of matrix A I think it should some power
of A.
What is the 'type of matrix'?

converting a for cycle in R

I would like to convert a for cycle into a faster operation such as apply.
Here is my code
for(a in 1:dim(k)[1]){
  for(b in 1:dim(k)[2]){
    if( (k[a,b,1,1]==0) & (k[a,b,1,2]==0) & (k[a,b,1,3]==0) ){
      k[a,b,1,1]<-1
      k[a,b,1,2]<-1
      k[a,b,1,3]<-1
    }
  }
}
It's simple code that checks each position of the multidimensional array k and, if the three elements are all equal to 0, assigns them the value 1.
Is there a way to make it faster? The array k has 1,444,000 elements and it takes too long to run. Can anyone help?
Thanks

With apply you can return all your 3-combinations as a numeric vector and then check for your specific condition:
# This creates an array with the same properties as yours
array <- array(data = sample(c(0, 1), 81, replace = TRUE,
                             prob = c(0.9, 0.1)), c(3, 3, 3, 3))
# This loops over all vectors in the fourth dimension and returns a
# vector of ones if your condition is met
apply(array, MARGIN = c(1, 2, 3), FUN = function(x) {
  if (sum(x) == 0 & length(unique(x)) == 1)
    return(c(1, 1, 1))
  else
    return(x)
})
Note that the MARGIN argument specifies the dimensions over which to loop. You want the fourth dimension vectors so you specify c(1, 2, 3).
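If you then assign this newly created array to the old one, you will have replaced all vectors where the condition is met with ones. Be aware that apply places the returned vectors along the first dimension of its result, so the dimensions need to be permuted back before the assignment, for example with aperm(result, c(2, 3, 4, 1)), where result stands for the value returned by apply.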

You should first use the filter function twice (composed), and then the apply (lapply?) function on the filtered array. Maybe you can also reduce the array, because it looks like you're not very interested in the third dimension (always accessing the 1st item). You should probably do some reading about functional programming in R here http://adv-r.had.co.nz/Functionals.html
Note that I'm not an R programmer, but I'm quite familiar with functional programming (Haskell etc.), so this might give you an idea. This might be faster, but it depends a bit on how R is designed (lazy or eager evaluation etc.).

create new vector from existing vectors by using "rep"

Suppose I have the following two vectors,
a<-c(2,3,5)
b<-c(1,3,2)
Now I want to create a new vector c with these results from a and b,
2, 3, 3, 3, 5, 5
I tried this code, but it just does not work, and I am stuck here. Help please. How can I get the results shown above?
for (i in 1:3){
c<-rep(a[i], each=b[i])
}