Split matrices in R

I am trying to split a matrix along its rows but am not sure how to do it. For example, if I have an N x M matrix and want to split it into n matrices of size (N/n) x M, how would I do that?
So if I had a matrix X:
[,1] [,2]
[1,] 1 21
[2,] 2 22
[3,] 3 23
[4,] 4 24
[5,] 5 25
[6,] 6 26
[7,] 7 27
[8,] 8 28
[9,] 9 29
[10,] 10 30
[11,] 11 31
[12,] 12 32
[13,] 13 33
[14,] 14 34
[15,] 15 35
[16,] 16 36
[17,] 17 37
[18,] 18 38
[19,] 19 39
[20,] 20 40
The output of a function block(X,n) if n = 2 would be
[[1]]
[,1] [,2]
[1,] 1 21
[2,] 2 22
[3,] 3 23
[4,] 4 24
[5,] 5 25
[6,] 6 26
[7,] 7 27
[8,] 8 28
[9,] 9 29
[10,] 10 30
[[2]]
[,1] [,2]
[1,] 11 31
[2,] 12 32
[3,] 13 33
[4,] 14 34
[5,] 15 35
[6,] 16 36
[7,] 17 37
[8,] 18 38
[9,] 19 39
[10,] 20 40
Thanks for any help in advance!

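We can create a grouping vector and split on it. Note that n here is the number of rows per block (not the number of blocks), and the result is a list of data frames: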
n <- 10
grp <- (seq_len(nrow(X)) - 1) %/% n + 1
split(as.data.frame(X), grp)
Or use an index to subset the rows, which keeps each block as a matrix:
lapply(seq(1, nrow(X), by = n), function(i) X[i:min(i + n - 1, nrow(X)), , drop = FALSE])
Data:
X <- matrix(1:40, ncol = 2)
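The question asks for block(X, n), where n is the number of blocks rather than the number of rows per block. A minimal sketch of such a wrapper, assuming nrow(X) is an exact multiple of n (the name block comes from the question; the body is only one possible implementation):

```r
# Sketch of the block(X, n) helper from the question: n is the number of blocks.
block <- function(X, n) {
  # assumption: the rows divide evenly into n blocks
  stopifnot(is.matrix(X), nrow(X) %% n == 0)
  size <- nrow(X) / n
  # block label for every row: 1,1,...,2,2,...,n,n
  grp <- rep(seq_len(n), each = size)
  # split the row indices by block, then subset the matrix
  lapply(split(seq_len(nrow(X)), grp), function(i) X[i, , drop = FALSE])
}

X <- matrix(1:40, ncol = 2)
res <- block(X, 2)
res[[1]]  # rows 1 to 10 of X
res[[2]]  # rows 11 to 20 of X
```

Splitting the row indices instead of as.data.frame(X) means every element of the result is still a matrix.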

Related

How to lag a matrix in R

I want to know the command in R to lag a matrix.
I have defined x as:
> (x <- matrix(1:50, 10, 5))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 11 21 31 41
[2,] 2 12 22 32 42
[3,] 3 13 23 33 43
[4,] 4 14 24 34 44
[5,] 5 15 25 35 45
[6,] 6 16 26 36 46
[7,] 7 17 27 37 47
[8,] 8 18 28 38 48
[9,] 9 19 29 39 49
[10,] 10 20 30 40 50
I want to create l.x:
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 11 21 31 41
[3,] 2 12 22 32 42
[4,] 3 13 23 33 43
[5,] 4 14 24 34 44
[6,] 5 15 25 35 45
[7,] 6 16 26 36 46
[8,] 7 17 27 37 47
[9,] 8 18 28 38 48
[10,] 9 19 29 39 49
lag will coerce your object to a time-series (ts class to be specific) and only shifts the time index. It does not change the underlying data.
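A small illustration of that behaviour on a toy series (the names tx and shifted are only for this example):

```r
# stats::lag() on a ts object: the values stay put, only the time index moves
tx <- ts(1:5)
shifted <- stats::lag(tx, k = 1)

as.vector(shifted)  # same values as as.vector(tx)
tsp(tx)             # start, end and frequency of the original series
tsp(shifted)        # start and end are moved back by one time step
```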
You need to manually lag the matrix yourself by adding rows of NA at the beginning and removing the same number of rows at the end. Here's an example of a function that does just that:
lagmatrix <- function(x, k) {
  # ensure 'x' is a matrix
  stopifnot(is.matrix(x))
  if (k == 0)
    return(x)
  # block of NA rows that fills the gap left by the shift
  na <- matrix(NA, nrow=abs(k), ncol=ncol(x))
  if (k > 0) {
    nr <- nrow(x)
    # prepend NA and remove the last k rows
    rbind(na, x[-((nr-k+1):nr), , drop=FALSE])
  } else {
    # append NA and remove the first abs(k) rows
    rbind(x[-seq_len(abs(k)), , drop=FALSE], na)
  }
}
Or you can use a lag function that does what you expect. For example, xts::lag.xts.
> xts::lag.xts(x)
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 11 21 31 41
[3,] 2 12 22 32 42
[4,] 3 13 23 33 43
[5,] 4 14 24 34 44
[6,] 5 15 25 35 45
[7,] 6 16 26 36 46
[8,] 7 17 27 37 47
[9,] 8 18 28 38 48
[10,] 9 19 29 39 49
> is.matrix(xts::lag.xts(x))
[1] TRUE
Here is one manual method in base R with head and rbind:
rbind(NA, head(x, 9))
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 11 21 31 41
[3,] 2 12 22 32 42
[4,] 3 13 23 33 43
[5,] 4 14 24 34 44
[6,] 5 15 25 35 45
[7,] 6 16 26 36 46
[8,] 7 17 27 37 47
[9,] 8 18 28 38 48
[10,] 9 19 29 39 49
More generally, as noted by @akrun, head(., -1) will work for a matrix of any size:
rbind(NA, head(x, -1))
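The same idea extends to a lag of k rows. A minimal sketch, assuming 0 <= k <= nrow(x) (the function name lag_rows is made up for this example):

```r
# Hypothetical helper: shift the rows of a matrix down by k, padding with NA
lag_rows <- function(x, k = 1) {
  stopifnot(is.matrix(x), k >= 0, k <= nrow(x))
  # k rows of NA on top, followed by the first nrow(x) - k rows of x
  rbind(matrix(NA, nrow = k, ncol = ncol(x)), head(x, nrow(x) - k))
}

x <- matrix(1:50, 10, 5)
l1 <- lag_rows(x, 1)  # same layout as rbind(NA, head(x, -1))
l3 <- lag_rows(x, 3)  # three NA rows, then rows 1 to 7 of x
```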
We can use apply with dplyr's lag (loading dplyr masks stats::lag):
library(dplyr)
apply(x, 2, lag)
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA NA NA NA NA
# [2,] 1 11 21 31 41
# [3,] 2 12 22 32 42
# [4,] 3 13 23 33 43
# [5,] 4 14 24 34 44
# [6,] 5 15 25 35 45
# [7,] 6 16 26 36 46
# [8,] 7 17 27 37 47
# [9,] 8 18 28 38 48
#[10,] 9 19 29 39 49
Or
rbind(NA, x[-nrow(x),])
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA NA NA NA NA
# [2,] 1 11 21 31 41
# [3,] 2 12 22 32 42
# [4,] 3 13 23 33 43
# [5,] 4 14 24 34 44
# [6,] 5 15 25 35 45
# [7,] 6 16 26 36 46
# [8,] 7 17 27 37 47
# [9,] 8 18 28 38 48
#[10,] 9 19 29 39 49
Below is a pure dplyr solution (dplyr 1.0 or later) without the need for apply. The only annoyance is that the matrix has to be converted to a data.frame first.
library(dplyr)
x %>% as.data.frame() %>% mutate(across(everything(), lag))

R: How to sum pairs in a Matrix by row?

Probably this would be easy. I have a Matrix:
testM <- matrix(1:40, ncol = 4, byrow = FALSE)
testM
[,1] [,2] [,3] [,4]
[1,] 1 11 21 31
[2,] 2 12 22 32
[3,] 3 13 23 33
[4,] 4 14 24 34
[5,] 5 15 25 35
[6,] 6 16 26 36
[7,] 7 17 27 37
[8,] 8 18 28 38
[9,] 9 19 29 39
[10,] 10 20 30 40
and I want to "reduce" the matrix summing column pairs by row. Expected result:
[,1] [,2]
[1,] 12 52
[2,] 14 54
[3,] 16 56
[4,] 18 58
[5,] 20 60
[6,] 22 62
[7,] 24 64
[8,] 26 66
[9,] 28 68
[10,] 30 70
I tried this, but it doesn't work:
X <- apply(1:(ncol(testM)/2), 1, function(x) sum(testM[x], testM[x+1]) )
Error in apply(1:(ncol(testM)/2), 1, function(x) sum(testM[x], testM[x + :
dim(X) must have a positive length
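A vectorized option: logical column indices are recycled, so c(TRUE, FALSE) picks the odd columns and c(FALSE, TRUE) the even ones.
testM[, c(TRUE, FALSE)] + testM[, c(FALSE, TRUE)]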
## [,1] [,2]
## [1,] 12 52
## [2,] 14 54
## [3,] 16 56
## [4,] 18 58
## [5,] 20 60
## [6,] 22 62
## [7,] 24 64
## [8,] 26 66
## [9,] 28 68
## [10,] 30 70
Here's a solution using rowSums()
sapply( list(1:2,3:4) , function(i) rowSums(testM[,i]) )
If the number of columns is arbitrary (but even), it gets slightly more complicated:
li <- split(seq_len(ncol(testM)), rep(seq_len(ncol(testM) / 2), each = 2))
sapply( li , function(i) rowSums(testM[,i]) )
We can do a matrix multiplication:
M <- matrix(c(1,1,0,0, 0,0,1,1), 4, 2)
testM %*% M
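The indicator matrix M above is written out by hand for four columns. For any even number of columns it can be built with kronecker(); a sketch under that assumption (pair_sums is a made-up name):

```r
# Hypothetical helper: sum adjacent column pairs via a 0/1 indicator matrix
pair_sums <- function(m) {
  stopifnot(is.matrix(m), ncol(m) %% 2 == 0)
  # identity (one entry per pair) expanded with a 2 x 1 block of ones:
  # column j of M has ones in rows 2j - 1 and 2j, zeros elsewhere
  M <- kronecker(diag(ncol(m) / 2), matrix(1, nrow = 2, ncol = 1))
  m %*% M
}

testM <- matrix(1:40, ncol = 4, byrow = FALSE)
res <- pair_sums(testM)
```

Note that the result of %*% is a double matrix even when the input is integer.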
Another solution with tapply():
g <- gl(ncol(testM)/2, 2)
t(apply(testM, 1, FUN=tapply, INDEX=g, sum))
How about:
matrix(c(testM[, 1] + testM[, 2], testM[, 3] + testM[, 4]), nrow = 10)
A solution built around your initial idea:
sapply(seq(2, ncol(testM), 2), function(x) apply(testM[, (x-1):x], 1, sum))

Exclude specific columns from a matrix

I have a list of numbers (example below):
[[178]]
NULL
[[179]]
[1] 179 66
[[180]]
[1] 180 67
[[181]]
[1] 181 123
[[182]]
[1] 182
This list contains columns (179, 66, 180, 67, 181, 123) I want to exclude from a matrix.
I tried the commands below, but they didn't work:
MyMatrix[, !(unlist(MyList))]
MyMatrix[, -(unlist(MyList))]
MyMatrix[, !unlist(MyList)]
MyMatrix[, -unlist(MyList)]
My question: what is a right way to exclude specific columns from a matrix?
Here's my small replication of your problem.
listOfColumns<-list(NULL, c(2,3), 5, NULL)
listOfColumns #print for viewing
#output
#[[1]]
#NULL
#[[2]]
#[1] 2 3
#[[3]]
#[1] 5
#[[4]]
#NULL
MyMatrix<-matrix(1:50, nrow=10, ncol=5)
MyMatrix #print for viewing
#output
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 11 21 31 41
#[2,] 2 12 22 32 42
#[3,] 3 13 23 33 43
#[4,] 4 14 24 34 44
#[5,] 5 15 25 35 45
#[6,] 6 16 26 36 46
#[7,] 7 17 27 37 47
#[8,] 8 18 28 38 48
#[9,] 9 19 29 39 49
#[10,] 10 20 30 40 50
First, to subset your matrix so that the given column numbers are omitted, use
MyMatrix[, -columnNumbers]
In R, negative indices mark the entries that should be omitted.
The following call outputs what you want
MyMatrix[, -unlist(listOfColumns)]
#output
# [,1] [,2]
# [1,] 1 31
# [2,] 2 32
# [3,] 3 33
# [4,] 4 34
# [5,] 5 35
# [6,] 6 36
# [7,] 7 37
# [8,] 8 38
# [9,] 9 39
# [10,] 10 40
If you want to keep this result for later use, you'll need to store it (as David Robinson pointed out)
MySmallerMatrix <- MyMatrix[, -unlist(listOfColumns)]
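One caveat with negative indexing: if the list happens to contain no column numbers at all, unlist() returns NULL and the negative index selects zero columns instead of all of them. A sketch of a safer variant using setdiff() (the names keep, keepAll and emptyList are only illustrative):

```r
MyMatrix <- matrix(1:50, nrow = 10, ncol = 5)
listOfColumns <- list(NULL, c(2, 3), 5, NULL)
emptyList <- list(NULL, NULL)  # illustrative: nothing to exclude

# keep every column index that is not listed for exclusion
keep <- setdiff(seq_len(ncol(MyMatrix)), unlist(listOfColumns))
smaller <- MyMatrix[, keep, drop = FALSE]

# with an empty exclusion list, setdiff() keeps all columns ...
keepAll <- setdiff(seq_len(ncol(MyMatrix)), unlist(emptyList))
unchanged <- MyMatrix[, keepAll, drop = FALSE]

# ... whereas the negative-index form drops every column
dropped <- MyMatrix[, -unlist(emptyList), drop = FALSE]
```

The drop = FALSE argument keeps the result a matrix even when only one column survives.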

Sorting specified rows in a matrix by the first column in R

I have a character matrix mtr of n rows and 3 columns.
I have a numeric vector nmb with some row numbers, for example 4, 5, 6.
I want to sort only those rows of mtr whose row numbers are in nmb, ordering them by the first column of the matrix.
So in my case I want to leave the matrix untouched except for rows 4, 5 and 6, which should be sorted by the first column and written back into mtr.
How could I do that? Thanks.
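You can do it this way (order() returns positions within the subset, so they are mapped back to row numbers through nmb):
mtr[nmb,] <- mtr[nmb[order(mtr[nmb,1])],]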
I think this will do it
mtr[nmb,] <- mtr[nmb,][order(mtr[nmb,1]),]
An example:
nmb <- 4:6
mtr <- matrix(30:1, ncol=3)
> mtr
[,1] [,2] [,3]
[1,] 30 20 10
[2,] 29 19 9
[3,] 28 18 8
[4,] 27 17 7
[5,] 26 16 6
[6,] 25 15 5
[7,] 24 14 4
[8,] 23 13 3
[9,] 22 12 2
[10,] 21 11 1
> mtr[nmb,] <- mtr[nmb,][order(mtr[nmb,1]),]
> mtr
[,1] [,2] [,3]
[1,] 30 20 10
[2,] 29 19 9
[3,] 28 18 8
[4,] 25 15 5 <-
[5,] 26 16 6 <- sorted
[6,] 27 17 7 <-
[7,] 24 14 4
[8,] 23 13 3
[9,] 22 12 2
[10,] 21 11 1
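The question mentions a character matrix, so here is the same idea on made-up character data (the contents of mtr are invented for illustration). Keep in mind that character columns sort as text, so "10" would come before "9":

```r
# invented 7 x 3 character matrix; rows 4 to 6 are out of order in column 1
mtr <- cbind(c("a", "b", "c", "z", "y", "x", "g"), letters[1:7], LETTERS[1:7])
nmb <- 4:6

# order() gives positions within the subset; nmb[...] maps them to row numbers
mtr[nmb, ] <- mtr[nmb[order(mtr[nmb, 1])], ]
mtr
```

Rows 4 to 6 now start with "x", "y", "z", each carrying its own second and third column along, and all other rows are unchanged.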

Probe-level mean-centering of RMA-normalized probes with R Bioconductor

I have several microarray datasets from leukaemia and lymphoma patients, which I have normalized with RMA: eset <- rma(Data). I want to apply probe-level mean centering. What package can I use for that? Could anyone recommend a useful and robust script to apply to my datasets?
for instance:
Data <- ReadAffy (....) #read raw CEL files
eset <- rma (Data) #rma normalization
expr_mat <- exprs(eset) #get expression
This gives me a table of RMA-normalized probe values for my samples. What code can I add here to obtain log2 probe-level mean centering?
Thanks!
Note that RMA data is already log2-valued. The following will mean-center by probe (row).
Dummy data example:
> m=1:40
> dim(m)=c(10,4)
> m
[,1] [,2] [,3] [,4]
[1,] 1 11 21 31
[2,] 2 12 22 32
[3,] 3 13 23 33
[4,] 4 14 24 34
[5,] 5 15 25 35
[6,] 6 16 26 36
[7,] 7 17 27 37
[8,] 8 18 28 38
[9,] 9 19 29 39
[10,] 10 20 30 40
Then, by the magic of recycling, center each row by subtracting its mean:
> m-rowMeans(m)
[,1] [,2] [,3] [,4]
[1,] -15 -5 5 15
[2,] -15 -5 5 15
[3,] -15 -5 5 15
[4,] -15 -5 5 15
[5,] -15 -5 5 15
[6,] -15 -5 5 15
[7,] -15 -5 5 15
[8,] -15 -5 5 15
[9,] -15 -5 5 15
[10,] -15 -5 5 15
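The recycling trick works because R stores matrices column by column, so the vector of row means (one entry per row) lines up with the rows each time it is recycled. A more explicit equivalent uses sweep(); a small sketch on made-up data (expr_like is random and only stands in for exprs(eset)):

```r
set.seed(1)
# stand-in for an expression matrix: 4 probes (rows) x 3 samples (columns)
expr_like <- matrix(rnorm(12, mean = 8), nrow = 4, ncol = 3)

# recycling form from the answer above
centered_recycle <- expr_like - rowMeans(expr_like)

# explicit form: subtract the row means along margin 1 (rows)
centered_sweep <- sweep(expr_like, 1, rowMeans(expr_like), FUN = "-")

all.equal(centered_recycle, centered_sweep)
rowMeans(centered_sweep)  # numerically zero for every probe
```

With real data this would be applied to expr_mat from the question in the same way.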
