I have a 3D R array, e.g.:
a <- array(1:27, dim = c(3,3,3))
How can I (efficiently) convert this into a matrix in which the 3rd dimension is bound / stacked below each other, i.e.:
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
[4,] 10 13 16
[5,] 11 14 17
[6,] 12 15 18
[7,] 19 22 25
[8,] 20 23 26
[9,] 21 24 27
I can clumsily achieve this with:
rbind(a[,,1], a[,,2], a[,,3])
but this does not generalize well when the 3rd dimension has many entries (except by looping).
There must be a more elegant way to achieve this, but I could not find it. Ideas like
apply(a, 3, rbind)
apply(a, 3, c)
create a matrix, but the 3rd dimension simply becomes the columns. I want to keep the 2D matrices of the first 2 dimensions and just bind them together. I am aware this will mess up the indices, but we can disregard this for my use case.
I would be especially happy about a base R solution, but am also interested if this can be achieved with a (lightweight) package.
Edit:
This answer to a seemingly unrelated question provided a useful hint. This approach seems to achieve the desired result:
matrix(aperm(a, c(1, 3, 2)), nrow = dim(a)[1] * dim(a)[3])
Are there other ideas?
With aperm we can permute the dimensions of the array (a generalized transpose) and then reset dim to reshape the result into a matrix:
y <- aperm(a, c(1, 3, 2))
dim(y) <- c(prod(dim(a)[-2]), dim(a)[2])
y
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
[4,] 10 13 16
[5,] 11 14 17
[6,] 12 15 18
[7,] 19 22 25
[8,] 20 23 26
[9,] 21 24 27
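The same reshaping can be wrapped in a small helper and checked against the slice-by-slice rbind from the question. This is only a minimal sketch for 3D arrays; the name stack_slices is made up for illustration and is not part of base R.

```r
# Hypothetical helper: stack the slices a[, , k] of a 3D array below each other.
stack_slices <- function(a) {
  d <- dim(a)
  stopifnot(length(d) == 3L)
  y <- aperm(a, c(1, 3, 2))       # move the 3rd dimension next to the rows
  dim(y) <- c(d[1] * d[3], d[2])  # collapse (row, slice) into one row index
  y
}

a <- array(1:24, dim = c(2, 4, 3))
stacked <- stack_slices(a)

# Reference result built slice by slice, as in the question
reference <- do.call(rbind, lapply(seq_len(dim(a)[3]), function(k) a[, , k]))
all(stacked == reference)
```

Using a non-square example (2 x 4 x 3) makes it easier to spot a wrong permutation, since a mistake then usually shows up as a wrong shape.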
I have a number of subarrays, say 2 (for simplicity), each with the same number of rows and columns. Each spot in the subarrays is occupied by a number in [1, 10].
What I would like to do is move rows randomly between subarrays according to some rate of movement m in [0, 1]. m = 0 corresponds to no movement, while m = 1 means that any rows across all subarrays can be moved.
I take inspiration from:
How to swap a number of the values between 2 rows in R
but my problem is a bit different. I do know that sample() would be needed here.
Is there an easy way to go about accomplishing this?
This doesn't do it, but I believe I'm on the right track anyway.
m <- 0.2
a <- array(dim = c(5, 5, 2)) # 5 rows, 5 columns, 2 subarrays
res <- rep(sample(nrow(a), size = ceiling(nrow(a)*m), replace = FALSE)) # sample 20% of rows from array a.
Any assistance is appreciated.
It is significantly easier if you can use a matrix (2-dim array).
set.seed(2)
m <- 0.2
d <- c(10, 4)
a <- array(sample(prod(d)), dim = d)
a
# [,1] [,2] [,3] [,4]
# [1,] 8 17 14 1
# [2,] 28 37 40 26
# [3,] 22 38 16 29
# [4,] 7 35 3 32
# [5,] 34 11 23 4
# [6,] 36 33 19 31
# [7,] 5 24 30 13
# [8,] 39 6 27 25
# [9,] 15 10 12 9
# [10,] 18 2 21 20
(I'm going to set the seed again to something that conveniently gives me something "interesting" to show.)
set.seed(2)
ind <- which(runif(d[1]) < m)
ind
# [1] 1 4 7
The first randomness, runif, is compared against m and generates the indices that may change. The second randomness, sample below, takes those indices and possibly reorders them. (In this case, it reorders "1,4,7" to "4,1,7", meaning the third of the rows-that-may-change will be left unchanged.)
a[ind,] <- a[sample(ind),]
a
# [,1] [,2] [,3] [,4]
# [1,] 7 35 3 32 # <-- row 4
# [2,] 28 37 40 26
# [3,] 22 38 16 29
# [4,] 8 17 14 1 # <-- row 1
# [5,] 34 11 23 4
# [6,] 36 33 19 31
# [7,] 5 24 30 13 # <-- row 7, unchanged
# [8,] 39 6 27 25
# [9,] 15 10 12 9
# [10,] 18 2 21 20
Note that this is probabilistic, which means a probability of 0.2 does not guarantee you 20% (or even any) of the rows will be swapped.
(Since I'm guessing you'd really like to preserve your 3-dim (or even n-dim) array, you might be able to use aperm to transfer between array <--> matrix.)
EDIT 1
As an alternative to a probabilistic use of runif, you can use:
ind <- head(sample(d[1]), n = floor(d[1]*m))
to get closer to your goal of "20%". Since d[1]*m will often not be an integer, floor rounds it down, so you'll get the price-is-right winner: closest to but not over your desired percentage.
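A fixed-fraction selection can also be sketched by asking sample() for the number of rows directly. The values of n and m below are assumptions chosen only for illustration.

```r
set.seed(1)
n <- 10      # number of rows (assumed for this example)
m <- 0.25    # desired fraction of rows that may move

n_move <- floor(n * m)           # 2 rows: never more than the requested share
ind <- sample(n, size = n_move)  # distinct row indices, in random order
length(ind)
```

Unlike the runif comparison, this always selects the same number of rows; only which rows are chosen varies with the seed.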
EDIT 2
A reversible method for transforming an n-dimensional array into a matrix and back again. Caveat: though the logic appears solid, my testing has only included a couple of arrays.
array2matrix <- function(a) {
d <- dim(a)
ind <- seq_along(d)
a2 <- aperm(a, c(ind[2], ind[-2]))
dim(a2) <- c(d[2], prod(d[-2]))
a2 <- t(a2)
attr(a2, "origdim") <- d
a2
}
The reversal uses the "origdim" attribute if still present; this will work as long as your modifications to the matrix do not clear its attributes. (Simple row-swapping does not.)
matrix2array <- function(m, d = attr(m, "origdim")) {
ind <- seq_along(d)
m2 <- t(m)
dim(m2) <- c(d[2], d[-2])
aperm(m2, c(ind[2], ind[-2]))
}
(These two functions should probably do some more error-checks, such as is.null(d).)
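As a rough idea of what such checks might look like, here is a sketch of the reverse function with a few guards added. The name matrix2array_checked and the specific checks are assumptions for illustration, not a tested replacement.

```r
# Hypothetical variant of the reverse function with basic input checks.
matrix2array_checked <- function(m, d = attr(m, "origdim")) {
  if (is.null(d)) stop("no 'origdim' attribute found; supply 'd' explicitly")
  if (length(d) < 2L) stop("'d' must describe at least two dimensions")
  if (length(m) != prod(d)) stop("'m' does not hold prod(d) elements")
  ind <- seq_along(d)
  m2 <- t(m)
  dim(m2) <- c(d[2], d[-2])
  aperm(m2, c(ind[2], ind[-2]))
}

a <- array(1:24, dim = c(2, 3, 4))
# Build the stacked matrix by hand (rows of all slices, one slice after another)
m <- t(matrix(aperm(a, c(2, 1, 3)), nrow = 3))
r <- matrix2array_checked(m, d = dim(a))
all(r == a)

# A plain matrix without the attribute is rejected instead of failing obscurely
res <- tryCatch(matrix2array_checked(matrix(1:4, 2)), error = function(e) "error")
res
```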
A sample run:
set.seed(2)
dims <- 5:2
a <- array(sample(prod(dims)), dim=dims)
Quick show:
a[,,1,1:2,drop=FALSE]
# , , 1, 1
# [,1] [,2] [,3] [,4]
# [1,] 23 109 61 90
# [2,] 84 15 27 102
# [3,] 68 95 83 24
# [4,] 20 53 117 46
# [5,] 110 62 43 8
# , , 1, 2
# [,1] [,2] [,3] [,4]
# [1,] 118 25 14 93
# [2,] 65 21 16 77
# [3,] 87 82 3 38
# [4,] 92 12 78 17
# [5,] 49 4 75 80
The transformation:
m <- array2matrix(a)
dim(m)
# [1] 30 4
head(m)
# [,1] [,2] [,3] [,4]
# [1,] 23 109 61 90
# [2,] 84 15 27 102
# [3,] 68 95 83 24
# [4,] 20 53 117 46
# [5,] 110 62 43 8
# [6,] 67 47 1 54
Proof of reversibility:
identical(matrix2array(m), a)
# [1] TRUE
EDIT 3, "WRAP UP of all code"
Creating fake data:
dims <- c(5,4,2)
(a <- array(seq(prod(dims)), dim=dims))
# , , 1
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 2 7 12 17
# [3,] 3 8 13 18
# [4,] 4 9 14 19
# [5,] 5 10 15 20
# , , 2
# [,1] [,2] [,3] [,4]
# [1,] 21 26 31 36
# [2,] 22 27 32 37
# [3,] 23 28 33 38
# [4,] 24 29 34 39
# [5,] 25 30 35 40
(m <- array2matrix(a))
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 2 7 12 17
# [3,] 3 8 13 18
# [4,] 4 9 14 19
# [5,] 5 10 15 20
# [6,] 21 26 31 36
# [7,] 22 27 32 37
# [8,] 23 28 33 38
# [9,] 24 29 34 39
# [10,] 25 30 35 40
# attr(,"origdim")
# [1] 5 4 2
The random-swapping of rows. I'm using 50% here.
pct <- 0.5
nr <- nrow(m)
set.seed(3)
(ind1 <- sample(nr, size = ceiling(nr * pct)))
# [1] 2 8 4 3 9
(ind2 <- sample(ind1))
# [1] 3 2 9 8 4
m[ind1,] <- m[ind2,]
m
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 3 8 13 18
# [3,] 23 28 33 38
# [4,] 24 29 34 39
# [5,] 5 10 15 20
# [6,] 21 26 31 36
# [7,] 22 27 32 37
# [8,] 2 7 12 17
# [9,] 4 9 14 19
# [10,] 25 30 35 40
# attr(,"origdim")
# [1] 5 4 2
(Note that I pre-made ind1 and ind2 here, mostly to see what was going on internally. You can replace m[ind2,] with m[sample(ind1),] for the same effect.)
BTW: if we had instead used a seed of 2, we would notice that 2 rows are not swapped:
set.seed(2)
(ind1 <- sample(nr, size = ceiling(nr * pct)))
# [1] 2 7 5 10 6
(ind2 <- sample(ind1))
# [1] 6 2 5 10 7
Because of this, I chose a seed of 3 for the demonstration. Unchanged rows like these may give the appearance that things are not working, but without additional controlling code, sample does not ensure that positions change: it is entirely reasonable for "randomly swap rows" to randomly move row 2 to row 2. Take for example:
set.seed(267)
(ind1 <- sample(nr, size = ceiling(nr * pct)))
# [1] 3 6 5 7 2
(ind2 <- sample(ind1))
# [1] 3 6 5 7 2
The first randomly chooses five rows, and then reorders them randomly into an unchanged order. (I suggest that if you want to force that they are all movements, you should ask a new question asking about just forcing a sample vector to change.)
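One simple way to guarantee that every chosen row moves is to rotate the chosen indices by one position instead of calling sample a second time. This is only a sketch of that idea: it produces a random cycle rather than a uniformly random rearrangement, which may or may not be acceptable for your use.

```r
set.seed(267)
nr <- 10
ind1 <- sample(nr, size = 5)   # rows allowed to move (all distinct)
ind2 <- c(ind1[-1], ind1[1])   # rotate by one: each slot receives the next chosen row
any(ind1 == ind2)              # FALSE whenever at least two rows were chosen
```

Because the entries of ind1 are distinct, no position can keep its own row after the rotation.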
Anyway, we can regain the original dimensionality with the second function:
(a2 <- matrix2array(m))
# , , 1
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 16
# [2,] 3 8 13 18
# [3,] 23 28 33 38
# [4,] 24 29 34 39
# [5,] 5 10 15 20
# , , 2
# [,1] [,2] [,3] [,4]
# [1,] 21 26 31 36
# [2,] 22 27 32 37
# [3,] 2 7 12 17
# [4,] 4 9 14 19
# [5,] 25 30 35 40
In the first plane of the array, rows 1 and 5 are unchanged; in the second plane, rows 1, 2, and 5 are unchanged. Five rows the same, five rows moved around (but otherwise unchanged within each row).
Suppose I have the following matrix:
mat <- matrix(1:20, ncol=5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 11 15 19
[4,] 4 8 12 16 20
and the following vector
counts=c(2,1,2)
I need to collapse this matrix by summing groups of adjacent columns, with the group sizes given by the vector counts. That means the first two columns must be added together, the third remains as it is, and the last two columns are summed. My resulting matrix must look like this
[,1] [,2] [,3]
[1,] 6 9 30
[2,] 8 10 32
[3,] 10 11 34
[4,] 12 12 36
How could I do this in an automatic way, given that in my case I have a very big matrix and with a vector of counts with different values?
One way is to build a group index by repeating seq_along(counts) according to 'counts', use it to split the column indices of 'mat' into a list, and then loop over that list with sapply, subsetting 'mat' by each set of column indices and taking the rowSums.
mat2 <- sapply(split(1:ncol(mat), rep(seq_along(counts), counts)),
function(i) rowSums(mat[,i,drop=FALSE]))
dimnames(mat2) <- NULL
mat2
# [,1] [,2] [,3]
#[1,] 6 9 30
#[2,] 8 10 32
#[3,] 10 11 34
#[4,] 12 12 36
Another idea, conceptually similar to akrun's:
t(rowsum(t(mat), rep(seq_along(counts), counts)))
# 1 2 3
#[1,] 6 9 30
#[2,] 8 10 32
#[3,] 10 11 34
#[4,] 12 12 36
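A third variant, sketched here as an alternative rather than taken from the answers above, expresses the grouping as a 0/1 indicator matrix and lets matrix multiplication do the sums. It also checks that counts accounts for every column, an assumption both answers rely on.

```r
mat <- matrix(1:20, ncol = 5)
counts <- c(2, 1, 2)
stopifnot(sum(counts) == ncol(mat))  # every column must belong to exactly one group

grp <- rep(seq_along(counts), counts)           # group of each column: 1 1 2 3 3
ind <- outer(grp, seq_along(counts), "==") * 1  # 5 x 3 matrix of 0/1 indicators
collapsed <- mat %*% ind                        # column g = sum of the columns in group g
collapsed
```

For a very large matrix with many groups the dense indicator matrix can itself become large, so the rowsum approach may be the safer default.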
I have a character matrix mtr of n rows and 3 columns.
I have a numeric vector nmb with some numbers, for example, 4,5,6
I want to sort only those rows of mtr whose row numbers are contained in nmb, ordering them by the first column of my matrix.
So in my case I want to leave my matrix untouched except for rows 4,5,6 which I would like to be sorted by the first column and, of course, written back into my matrix mtr.
How could I do that? Thanks.
You can do it in this way:
mtr[nmb,] <- mtr[nmb[order(mtr[nmb,1])],]
I think this will do it
mtr[nmb,] <- mtr[nmb,][order(mtr[nmb,1]),]
An example:
nmb <- 4:6
mtr <- matrix(30:1, ncol=3)
> mtr
[,1] [,2] [,3]
[1,] 30 20 10
[2,] 29 19 9
[3,] 28 18 8
[4,] 27 17 7
[5,] 26 16 6
[6,] 25 15 5
[7,] 24 14 4
[8,] 23 13 3
[9,] 22 12 2
[10,] 21 11 1
> mtr[nmb,] <- mtr[nmb,][order(mtr[nmb,1]),]
> mtr
[,1] [,2] [,3]
[1,] 30 20 10
[2,] 29 19 9
[3,] 28 18 8
[4,] 25 15 5 <-
[5,] 26 16 6 <- sorted
[6,] 27 17 7 <-
[7,] 24 14 4
[8,] 23 13 3
[9,] 22 12 2
[10,] 21 11 1
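Since the question is about a character matrix, here is a small sketch of the same idea with character data, wrapped in a helper. The function name sort_rows_subset is made up for illustration. It sorts the row numbers first so that the smallest value always lands in the lowest of the selected rows, even if nmb is given out of order.

```r
# Hypothetical helper: sort only the given rows of x by column 'by'.
sort_rows_subset <- function(x, rows, by = 1) {
  rows <- sort(unique(rows))                   # target positions, lowest first
  x[rows, ] <- x[rows[order(x[rows, by])], ]   # write the sorted rows back
  x
}

mtr <- matrix(c("d", "c", "b", "a", "f", "e",
                "g", "h", "i", "j", "k", "l",
                "A", "B", "C", "D", "E", "F"), ncol = 3)
res <- sort_rows_subset(mtr, c(4, 2, 3))
res[2:4, ]
```

Note that order() on character data sorts as text, so "10" comes before "9"; convert the key with as.numeric() first if the column holds numbers stored as strings.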
I have the following problem:
myvec <- c(1:3)
mymat <- as.matrix(cbind(a = 6:15, b = 16:25, c= 26:35))
mymat
a b c
[1,] 6 16 26
[2,] 7 17 27
[3,] 8 18 28
[4,] 9 19 29
[5,] 10 20 30
[6,] 11 21 31
[7,] 12 22 32
[8,] 13 23 33
[9,] 14 24 34
[10,] 15 25 35
I want to multiply mymat by myvec and construct a new vector such that
sum(6*1, 16*2, 26*3)
sum(7*1, 17*2, 27*3)
....................
sum(15*1, 25*2, 35*3)
Sorry, this is a simple question, but I do not know the answer...
Edit: typo corrected
The %*% operator in R does matrix multiplication:
> mymat %*% myvec
[,1]
[1,] 116
[2,] 122
...
[10,] 170
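The product is a one-column matrix; if a plain vector is wanted, as in the question, drop() removes the extra dimension. A short sketch using the data from the question:

```r
myvec <- 1:3
mymat <- cbind(a = 6:15, b = 16:25, c = 26:35)

res <- drop(mymat %*% myvec)  # 10 x 1 matrix reduced to a vector of length 10
res
```

Each entry is 6 larger than the previous one, because moving down one row adds 1 to every column and the weights sum to 1 + 2 + 3 = 6.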
An alternative, but longer, way is this one:
rowSums(t(apply(mymat, 1, function(x) myvec*x)), na.rm=TRUE)
It is the only way I found that can ignore NAs inside the matrix.
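The same NA-tolerant weighted sum can also be sketched with sweep(), which scales each column by the matching weight without an explicit apply loop. The missing value below is inserted only for illustration.

```r
myvec <- 1:3
mymat <- cbind(a = 6:15, b = 16:25, c = 26:35)
mymat[2, 2] <- NA                        # hypothetical missing value

weighted <- sweep(mymat, 2, myvec, `*`)  # multiply column j by myvec[j]
res <- rowSums(weighted, na.rm = TRUE)   # an NA term is skipped, i.e. counts as 0
res[1:3]
```

Keep in mind that na.rm = TRUE silently treats a missing term as zero, so row 2 here is 7*1 + 27*3 = 88 rather than NA; whether that is the right answer depends on what the NA means in your data.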
Matrices are vectors in column-major order, so myvec is recycled down each column of t(mymat):
colSums( t(mymat) * myvec )
(Edited after hopefully reading question correctly this time.)