R: subsetting N-dimensional arrays

Consider the following 3-dimensional array:
set.seed(123)
arr = array(sample(c(1:10)), dim=c(3,4,2))
which yields
> arr
, , 1
[,1] [,2] [,3] [,4]
[1,] 10 9 8 2
[2,] 5 1 4 10
[3,] 6 7 3 5
, , 2
[,1] [,2] [,3] [,4]
[1,] 6 7 3 5
[2,] 9 8 2 6
[3,] 1 4 10 9
I'd like to subset it like
arr[c(1,2), c(2,4), c(1)]
but the catch is that I don't know (a) which indices or (b) which dimension the indices are.
What is the best way to access an N-dimensional array with index variables?
ll = list(c(1,2), c(2,4), c(1))
arr[ll] # doesn't work
arr[expand.grid(ll)] # doesn't work
# ..what else?

Use do.call, for example:
do.call(`[`, c(list(arr), ll))
or more cleanly, using a wrapper function:
getArr <- function(...)
`[`(arr, ...)
do.call(getArr, ll)
[,1] [,2]
[1,] 9 2
[2,] 1 10
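As a side note on the grid-expansion attempt in the question: `[` also accepts a numeric matrix with one column per dimension, so the grid has to be converted with as.matrix first, and the values come back as a plain vector without dimensions. The sketch below uses a small illustrative array (arr2, an assumption, not the array from the question) and restores the shape by hand. It is only meant to show why the attempt failed, not to replace the do.call approach.

```r
# Illustrative data (assumed; not the array from the question)
arr2 <- array(1:24, dim = c(3, 4, 2))
ll <- list(c(1, 2), c(2, 4), 1)

# One row per requested cell, one column per dimension
grid <- as.matrix(expand.grid(ll))

# Matrix indexing returns a plain vector, first index varying fastest
vals <- arr2[grid]

# Restore the shape from the number of indices per dimension
res <- array(vals, dim = lengths(ll))
```

Note that this keeps every dimension (like drop = FALSE), whereas the do.call version drops dimensions of length one by default.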

There is the asub function from the abind package:
library(abind)
asub(arr, ll)
which can also do a lot more, in particular extract along a subset of the dimensions (https://stackoverflow.com/a/17752012/1201032). Worth having in your toolbox.
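If adding a package is not an option, indexing along only some dimensions can be sketched in base R by filling the unspecified dimensions with TRUE. The helper name subset_dims and its arguments are hypothetical, chosen for this illustration; this is not how asub is implemented.

```r
# Hypothetical helper: index `arr` along the dimensions in `dims` only,
# keeping every other dimension whole.
subset_dims <- function(arr, dims, idx, drop = TRUE) {
  stopifnot(length(dims) == length(idx))
  # Start with "take everything" for each dimension
  full <- rep(list(TRUE), length(dim(arr)))
  # Overwrite the requested dimensions with the supplied indices
  full[dims] <- idx
  do.call(`[`, c(list(arr), full, list(drop = drop)))
}

# Illustrative data (assumed)
arr3 <- array(1:24, dim = c(3, 4, 2))

# Rows 1:2 of the first slab, all columns: same as arr3[1:2, , 1]
out <- subset_dims(arr3, dims = c(1, 3), idx = list(1:2, 1))
```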

Related

How to apply a function on every element of all elements in a list in R

I have a list containing matrices of the same size in R. I would like to apply a function over the same element of all matrices. Example:
> a <- matrix(1:4, ncol = 2)
> b <- matrix(5:8, ncol = 2)
> c <- list(a,b)
> c
[[1]]
[,1] [,2]
[1,] 1 3
[2,] 2 4
[[2]]
[,1] [,2]
[1,] 5 7
[2,] 6 8
Now I want to apply the mean function and would like to get a matrix like that:
[,1] [,2]
[1,] 3 5
[2,] 4 6
One conceptual way to do this would be to sum up the matrices and then take the average value of each entry. Try using Reduce:
Reduce('+', c) / length(c)
Output:
[,1] [,2]
[1,] 3 5
[2,] 4 6
Another option is to construct an array and then use apply.
Step 1: construct the array.
Using the abind library and do.call, you can do this:
library(abind)
myArray <- do.call(function(...) abind(..., along=3), c)
Using base R, you can strip out the structure and then rebuild it like this:
myArray <- array(unlist(c), dim=c(dim(a), length(c)))
In both instances, these return the desired array
, , 1
[,1] [,2]
[1,] 1 3
[2,] 2 4
, , 2
[,1] [,2]
[1,] 5 7
[2,] 6 8
Step 2: use apply to calculate the mean along the first and second dimensions.
apply(myArray, 1:2, mean)
[,1] [,2]
[1,] 3 5
[2,] 4 6
This will be more flexible than Reduce, since you can swap out many more functions, but it will be slower for this particular application.
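For the specific case of a mean (or a sum), base R's rowMeans and rowSums take a dims argument that treats the first dims dimensions as "rows" and aggregates over the rest. A small sketch, assuming the same two matrices as above but stored in a list called mats (renamed here only to avoid masking c()):

```r
a <- matrix(1:4, ncol = 2)
b <- matrix(5:8, ncol = 2)
mats <- list(a, b)  # called `mats` rather than `c` in this sketch

# Stack the matrices along a third dimension
myArray <- array(unlist(mats), dim = c(dim(a), length(mats)))

# Keep the first two dimensions, average over the third
means <- rowMeans(myArray, dims = 2)

# The two approaches from the answers, for comparison
via_reduce <- Reduce(`+`, mats) / length(mats)
via_apply  <- apply(myArray, 1:2, mean)
```

All three give the same 2 x 2 matrix; apply remains the option to reach for when the function is something other than a mean or a sum.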

How to delete an element at a time from a vector while retaining the others?

I have a vector x containing 5 elements.
x <- c(1,2,3,4,5)
I want to delete one element at each iteration and retain the other elements in the vector (as shown below).
x <- c(2,3,4,5) # vector after first iteration
x <- c(1,3,4,5) # vector after second iteration
x <- c(1,2,4,5) # vector after third iteration
x <- c(1,2,3,5) # vector after fourth iteration
Also, is it possible to store these new vectors in a list?
Is there a way to extend this to multiple vectors?
You could use combn:
combn(5,4)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 2
[2,] 2 2 2 3 3
[3,] 3 3 4 4 4
[4,] 4 5 5 5 5
To get the data as a list:
as.list(data.frame(combn(5,4)))
To use this on multiple vectors or a matrix, first transform it into a data.frame, to make it easier for lapply to go over the length (columns) of the data.frame. Then you can use lapply with combn like so:
mat <- data.frame(matrix(1:10,5))
lapply(mat, function(x) combn(x,length(x)-1))
$X1
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 2
[2,] 2 2 2 3 3
[3,] 3 3 4 4 4
[4,] 4 5 5 5 5
$X2
[,1] [,2] [,3] [,4] [,5]
[1,] 6 6 6 6 7
[2,] 7 7 7 8 8
[3,] 8 8 9 9 9
[4,] 9 10 10 10 10
We can do
lapply(seq_along(x), function(i) x[-i])
Or, with a small helper function:
drop_n <- function(n, x) x[-n]
lapply(seq_along(x), drop_n, x)
Here is a loop-based way to get what you want. You only need to change the parameter n to make it more general.
# Generate a list
L <- list()
# Define the number of elements
n <- 5
# Define the values
values <- 1:n
# Complete the list
for (i in 1:n){
L[[i]] <- values[-i]
}
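To cover the "multiple vectors" part of the question with the negative-index approach, the vectors can be kept in a list and the same helper applied to each. The helper name leave_one_out and the example data are assumptions made for this sketch.

```r
# Hypothetical helper: list of copies of x, each missing one element
leave_one_out <- function(x) lapply(seq_along(x), function(i) x[-i])

# Several vectors kept in a named list (example data, assumed)
vecs <- list(x = c(1, 2, 3, 4, 5), y = c(10, 20, 30))

# Outer list: one entry per vector
# Inner list: one entry per dropped position, in order
res <- lapply(vecs, leave_one_out)
```

Here res$x[[1]] is x without its first element, res$x[[2]] is x without its second, and so on, which matches the order asked for in the question. The vectors do not need to have the same length.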

Represent nested for-loops for function with two parameters using same looping variables

I'm trying to find a more efficient way to write the piece of code below.
I considered apply, mapply and sweep, but I don't see how to rewrite it with them.
points.proj is an m x k matrix, data.proj is an n x k matrix.
Essentially, I'd like to apply fun to each element of points.proj together with the column of data.proj that has the same column number. The result should again be an m x k matrix.
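# fun counts how many entries of the vector b exceed the scalar a
fun <- function(a,b) sum(a<b)
m <- nrow(points.proj)
k <- ncol(points.proj)
Bounds <- matrix(NA_integer_, m, k)
for(i in 1:m){
  for(j in 1:k){
    Bounds[i,j] <- fun(points.proj[i,j], data.proj[,j])
  }
}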
> points.proj
[,1] [,2]
[1,] 6 5
[2,] 7 6
[3,] 8 5
> data.proj
[,1] [,2]
[1,] 8 3
[2,] 2 0
[3,] 9 4
[4,] 6 7
[5,] 2 9
> Bounds
[,1] [,2]
[1,] 2 2
[2,] 2 2
[3,] 1 2
Thanks for helping
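One possible rewrite, offered as a sketch rather than a definitive answer: loop over the columns with sapply and, within each column, apply fun to every point with vapply. The data below are copied from the question; everything else follows the question's own names.

```r
fun <- function(a, b) sum(a < b)

# Example data copied from the question
points.proj <- matrix(c(6, 7, 8, 5, 6, 5), ncol = 2)
data.proj   <- matrix(c(8, 2, 9, 6, 2, 3, 0, 4, 7, 9), ncol = 2)

# Outer step: one pass per column j
# Inner step: fun(point, column j of data.proj) for every point in column j
Bounds <- sapply(seq_len(ncol(points.proj)), function(j) {
  vapply(points.proj[, j], fun, integer(1), b = data.proj[, j])
})
```

This reproduces the Bounds matrix shown in the question. One caveat: when points.proj has a single row, sapply returns a plain vector instead of a 1 x k matrix, so the result would need to be reshaped in that case.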

How to slice a n dimensional array with a m*(n-i) dimensional matrix?

If I have an n-dimensional array, it can be sliced by an m * n matrix like this:
a <- array(1:27,c(3,3,3))
b <- matrix(rep(1:3,3),3)
# This will return the index a[1,1,1] a[2,2,2] and a[3,3,3]
a[b]
# Output
[1] 1 14 27
Is there an "effective and easy" way to do a similar slice but keep some dimensions free?
That is, slice an n-dimensional array with an m * (n-i) matrix and
get an (i+1)-dimensional array as the result.
a <- array(1:27,c(3,3,3))
b <- matrix(rep(1:2,2),2)
# This will return a vector of the index a[1] a[2] a[1] and a[2]
a[b]
# Output
[1] 1 2 1 2
# This will return the indexes of the cartesian product between the vectors,
# that is a array consisting of a[1,,1] a[1,,2] a[2,,1] and a[2,,2]
a[c(1,2),,c(1,2)]
# Output
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
The desired result would be for the last command to return an array
with only a[1,,1] and a[2,,2].
For now I solve the problem with a for loop and abind, but I'm sure there must be a better way.
# Desired functionality
a <- array(1:27,c(3,3,3))
b <- array(c(c(1,2),c(1,2)),c(2,2))
sliceem(a,b,freeDimension=2)
# Desired output (In this case rbind(a[1,,1],a[2,,2]) )
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 11 14 17
I think this is the cleanest way -- making a separate function:
slicem <- function(a,idx,drop=FALSE) do.call(`[`,c(list(a),idx,list(drop=drop)))
# usage for OP's example
a <- array(1:27, c(3,3,3))
idx <- list(1:2, TRUE, 1:2)
slicem(a,idx)
which gives
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
You have to write TRUE for each dimension that you aren't selecting from.
Following the OP's new expectations...
library(abind)
nistfun <- function(a,list_o_idx,drop=FALSE){
lens <- lengths(list_o_idx)
do.call(abind, lapply(seq.int(max(lens)), function(i)
slicem(a, mapply(`[`, list_o_idx, pmin(lens,i), SIMPLIFY=FALSE), drop=drop)
))
}
# usage for OP's new example
nistfun(a, idx)
# , , 1
#
# [,1] [,2] [,3]
# [1,] 1 4 7
#
# , , 2
#
# [,1] [,2] [,3]
# [1,] 11 14 17
Now, any non-TRUE indices must have the same length, since they will be matched up.
abind is used here instead of rbind (see an earlier edit on this answer) because it is the only sensible general way to think about slicing up an array. If you really want to drop dimensions, it's quite ambiguous which should be dropped and how, so the vector alone is returned:
nistfun(a, idx, drop=TRUE)
# [1] 1 4 7 11 14 17
If you want to throw this back into an array of some sort, you can do that after the fact:
matrix(nistfun(a, idx), max(lengths(idx)), dim(a)[sapply(idx, isTRUE)], byrow=TRUE)
# [,1] [,2] [,3]
# [1,] 1 4 7
# [2,] 11 14 17
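For comparison, here is a base-R sketch that takes the index matrix in the form the question asked for (one row per slice, one column per fixed dimension). The name slice_free and its free argument are hypothetical, and the sketch assumes exactly one free dimension, so that each slice is a vector and the slices can be stacked with rbind.

```r
# Hypothetical helper, assuming a single free dimension
slice_free <- function(a, b, free) {
  n <- length(dim(a))
  fixed <- setdiff(seq_len(n), free)
  stopifnot(length(free) == 1, ncol(b) == length(fixed))
  rows <- lapply(seq_len(nrow(b)), function(r) {
    # TRUE keeps the free dimension whole; the rest come from row r of b
    idx <- rep(list(TRUE), n)
    idx[fixed] <- as.list(b[r, ])
    do.call(`[`, c(list(a), idx))
  })
  # One row of the result per row of b
  do.call(rbind, rows)
}

a <- array(1:27, c(3, 3, 3))
b <- matrix(c(1, 2, 1, 2), 2)   # rows are (1, 1) and (2, 2)
out <- slice_free(a, b, free = 2)
```

For this input, out is the same as rbind(a[1,,1], a[2,,2]). With more than one free dimension each slice would be a matrix or array, and something like abind would be needed again to combine them.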

Functional way to stack list of 2d matrices into 3d matrix

After a clever lapply, I'm left with a list of 2-dimensional matrices.
For example:
set.seed(1)
test <- replicate( 5, matrix(runif(25),ncol=5), simplify=FALSE )
> test
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0.8357088 0.29589546 0.9994045 0.2862853 0.6973738
[2,] 0.2377494 0.14704832 0.0348748 0.7377974 0.6414624
[3,] 0.3539861 0.70399206 0.3383913 0.8340543 0.6439229
[4,] 0.8568854 0.10380669 0.9150638 0.3142708 0.9778534
[5,] 0.8537634 0.03372777 0.6172353 0.4925665 0.4147353
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0.1194048 0.9833502 0.9674695 0.6687715 0.1928159
[2,] 0.5260297 0.3883191 0.5150718 0.4189159 0.8967387
[3,] 0.2250734 0.2292448 0.1630703 0.3233450 0.3081196
[4,] 0.4864118 0.6232975 0.6219023 0.8352553 0.3633005
[5,] 0.3702148 0.1365402 0.9859542 0.1438170 0.7839465
[[3]]
...
I'd like to turn that into a 3-dimensional array:
set.seed(1)
replicate( 5, matrix(runif(25),ncol=5) )
Obviously, if I'm using replicate I can just turn on simplify, but sapply does not simplify the result properly, and stack fails utterly. do.call(rbind,mylist) turns it into a 2d matrix rather than 3d array.
I can do this with a loop, but I'm looking for a neat and functional way to handle it.
The closest way I've come up with is:
array( do.call( c, test ), dim=c(dim(test[[1]]),length(test)) )
But I feel like that's inelegant, because it disassembles and then reassembles the array attributes of the vectors, and needs a lot of testing to make safe (e.g. that the dimensions of each element are the same).
Try this:
simplify2array(test)
You can use the abind package and then use abind(test, along = 3)
library(abind)
testArray <- abind(test, along = 3)
Or you could use simplify = 'array' in a call to sapply (instead of lapply). simplify = 'array' is not the same as simplify = TRUE, because it sets the higher argument of simplify2array to TRUE, e.g.
foo <- function(x) matrix(1:10, ncol = 5)
# the default is simplify = TRUE
sapply(1:5, foo)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 2 2 2
[3,] 3 3 3 3 3
[4,] 4 4 4 4 4
[5,] 5 5 5 5 5
[6,] 6 6 6 6 6
[7,] 7 7 7 7 7
[8,] 8 8 8 8 8
[9,] 9 9 9 9 9
[10,] 10 10 10 10 10
# which is *not* what you want
# so set `simplify = 'array'`
sapply(1:5, foo, simplify = 'array')
, , 1
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
, , 2
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
, , 3
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
, , 4
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
, , 5
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
An array is simply an atomic vector with dimensions. Each of the matrix components of test is really just a vector with dimensions too. Hence the simplest solution I can think of is to unroll the list test into a vector and convert that to an array using array and suitably supplied dimensions.
set.seed(1)
foo <- replicate( 5, matrix(runif(25),ncol=5) )
tmp <- array(unlist(test), dim = c(5,5,5))
> all.equal(foo, tmp)
[1] TRUE
> is.array(tmp)
[1] TRUE
> dim(tmp)
[1] 5 5 5
If you don't want to hardcode the dimensions, we have to make some assumptions but can easily fill in the dimension from test, e.g.
tmp2 <- array(unlist(test), dim = c(dim(test[[1]]), length(test)))
> all.equal(foo, tmp2)
[1] TRUE
This assumes that the dimensions of each component are all the same, but then I don't see how you could put sub-matrices into a 3-d array if that condition doesn't hold.
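The equal-dimensions assumption can be checked explicitly before unrolling. A minimal sketch, with stack_matrices as a hypothetical helper name:

```r
# Hypothetical helper: stack equally sized matrices into a 3-d array,
# failing early if the dimensions differ.
stack_matrices <- function(mats) {
  dims <- lapply(mats, dim)
  same <- vapply(dims, identical, logical(1), dims[[1]])
  if (!all(same)) stop("all matrices must have the same dimensions")
  array(unlist(mats), dim = c(dims[[1]], length(mats)))
}

# Small example (assumed data)
set.seed(1)
small <- replicate(3, matrix(runif(4), ncol = 2), simplify = FALSE)
stacked <- stack_matrices(small)
```

Each slab stacked[, , i] is then the i-th matrix of the list, and a list with mismatched matrices raises an error instead of silently producing a misaligned array.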
This may seem hacky, to unroll the list, but this is simply exploiting how R handles matrices and arrays as vectors with dimensions.
test2 <- unlist(test)
dim(test2) <- c(dim(test[[1]]),5)
or if you do not know the expected size ahead of time:
dim3 <- c(dim(test[[1]]), length(test2)/prod(dim(test[[1]])))
dim(test2) <- dim3

Resources