Dear StackOverflowers,
I have an integer matrix in R and I would like to subset it so that I remove 1 specified cell in each column. So that, for instance, a 4x3 matrix becomes a 3x3 matrix. I have tried doing it by creating the second logical matrix of the same dimensions.
(subject.matrix <- matrix(1:12, nrow = 4))
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
(query.matrix <- matrix(c(T, T, F, T, T, F, T, T, T, T, T, F), nrow = 4))
[,1] [,2] [,3]
[1,] TRUE TRUE TRUE
[2,] TRUE FALSE TRUE
[3,] FALSE TRUE TRUE
[4,] TRUE TRUE FALSE
The problem is that, when I index the first matrix by the second one, it is simplified to an integer vector.
subject.matrix[query.matrix]
[1] 1 2 4 5 7 8 9 10 11
I've tried adding drop=F, but to no avail. I know, I can just wrap the resulting vector into a 3x3 matrix. So the expected outcome would be:
matrix(subject.matrix[query.matrix], nrow = 3)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
But I wonder if there's a more elegant/direct solution. I'm also not attached to using a logical matrix as the index, if that means a simpler solution. Perhaps, I could subset it with a vector of indices for the rows to be removed in each column, which in this case would translate into c(3, 2, 4).
Many thanks!
Edit based on #LyzandeR suggestion: My final goal was to take column sums of the resulting matrix. So replacing the redundant values with NA's seems to be the best way to go.
I think that the only way you can preserve the matrix structure would be to use a more general way of your question edit i.e.:
matrix(subject.matrix[query.matrix], ncol = ncol(subject.matrix))
You could even convert it into a function if you plan on using it multiple times:
subset.mat <- function(mat, index, cols=ncol(mat)) {
matrix(mat[index], ncol = cols)
}
Output:
> subset.mat(subject.matrix, query.matrix)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
Also (sorry just read your updated comment) you might consider using NAs in the matrix instead of subsetting them out, which will allow you to calculate the column sums as you say:
subject.matrix[!query.matrix] <- NA
subject.matrix
# [,1] [,2] [,3]
#[1,] 1 5 9
#[2,] 2 NA 10
#[3,] NA 7 11
#[4,] 4 8 NA
This is a little brute-forceish, but I think you'll be able to extrapolate it into something more general:
new.matrix = matrix(ncol = ncol(subject.matrix), nrow = nrow(subject.matrix) - 1)
for(i in 1:ncol(subject.matrix)){
new.matrix[,i] = subject.matrix[,i][query.matrix[,i] == TRUE]
}
new.matrix
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
Essentially, I just initialized an empty matrix, and then iterated through each column of subject.matrix taking only the TRUE values for query.matrix.
Related
First I create a 5x4 matrix with random numbers from 1 to 10:
A <- matrix(sample(1:10, 20, TRUE), 5, 4)
> A
[,1] [,2] [,3] [,4]
[1,] 1 5 6 6
[2,] 5 9 9 4
[3,] 10 6 1 8
[4,] 4 4 10 2
[5,] 10 9 7 5
In the following step I would like to obtain the returns by row (for row 1: (5-1)/1, (6-5)/5, (6-6)/6 and the same procedure for the other rows). The final matrix should therefore be a 5x3 matrix.
You can make use of the Base R funtion diff() applied to your transposed matrix:
Code:
# Data
set.seed(1)
A <- matrix(sample(1:10, 20, TRUE), 5, 4)
# [,1] [,2] [,3] [,4]
#[1,] 9 7 5 9
#[2,] 4 2 10 5
#[3,] 7 3 6 5
#[4,] 1 1 10 9
#[5,] 2 5 7 9
# transpose so we get per row and not column returns
t(diff(t(A))) / A[, -ncol(A)]
[,1] [,2] [,3]
[1,] -0.2222222 -0.2857143 0.8000000
[2,] -0.5000000 4.0000000 -0.5000000
[3,] -0.5714286 1.0000000 -0.1666667
[4,] 0.0000000 9.0000000 -0.1000000
[5,] 1.5000000 0.4000000 0.2857143
A <- matrix(sample(1:10, 20, TRUE), 5, 4)
fn.Calc <- function(a,b){(a-b)/a}
B <- matrix(NA, nrow(A), ncol(A)-1)
for (ir in 1:nrow(B)){
for (ic in 1:ncol(B)){
B[ir, ic] <- fn.Calc(A[ir, ic+1], A[ir, ic])
}
}
small note: when working with random functions providing a seed is welcomed ;)
So what we have here:
fn.Calc is just the calculation you are trying to do, i've isolated it in a function so that it's easier to change if needed
then a new B matrix is created having 1 column less then A but the same rows
finally we are going to loop every element in this B matrix, I like to use ir standing for incremental rows and ic standing for incremental column and finally inside the loop (B[ir, ic] <- fn.Calc(A[ir, ic+1], A[ir, ic])) is when the magic happens where the actual values are calculated and stored in B
it's a very basic approach without calling any package, there's probably many other ways to solve this that require less code.
It is probably fairly basic but I have not found an easy solution.
Assume I have a three-dimensional matrix:
m <- array(seq_len(18),dim=c(3,3,2))
and I would like to subset the matrix with the arrays of indexes:
idxrows <- c(1,2,3)
idxcols <- c(1,1,2)
obtaining the arrays in position (1,1),(2,1) and (3,2), that is:
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 10 14 18
I have tried m[idxrows,idxcols,] but without any luck.
Is there anyway to do it (without obviously using a for loop)?
Not sure if there is any easy built in extract syntax, but you can work around this with mapply:
mapply(function(i, j) m[i,j,], idxrows, idxcols)
# [,1] [,2] [,3]
#[1,] 1 2 6
#[2,] 10 11 15
Or slightly more convoluted, create a index matrix whose columns match the dimensions of the original array:
thirdDim <- dim(m)[3]
index <- cbind(rep(idxrows, each = thirdDim), rep(idxcols, each = thirdDim), 1:thirdDim)
matrix(m[index], nrow = thirdDim)
# [,1] [,2] [,3]
#[1,] 1 2 6
#[2,] 10 11 15
I am working with the hand-written zip codes dataset. I have loaded the dataset like this:
digits <- read.table("./zip.train",
quote = "",
comment.char = "",
stringsAsFactors = F)
Then I get only the ones:
ones <- digits[digits$V1 == 1, -1]
Right now, in ones I have 442 rows, with 256 column. I need to transform each row in ones to a 16x16 matrix. I think what I am looking for is a list of 16x16 matrix like the ones in this question:
How to create a list of matrix in R
But I tried with my data and did not work.
At first I tried ones <- apply(ones, 1, matrix, nrow = 16, ncol = 16) but is not working as I thought it was. I also tried lapply with no luck.
An alternative is to just change the dims of your matrix.
Consider the following matrix "M":
M <- matrix(1:12, ncol = 4)
M
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 5 8 11
# [3,] 3 6 9 12
We are looking to create a three dimensional array from this, so you can specify the dimensions as "row", "column", "third-dimension". However, since the matrix is constructed by column, you first need to transpose it before changing the dimensions.
`dim<-`(t(M), c(2, 2, nrow(M)))
# , , 1
#
# [,1] [,2]
# [1,] 1 7
# [2,] 4 10
#
# , , 2
#
# [,1] [,2]
# [1,] 2 8
# [2,] 5 11
#
# , , 3
#
# [,1] [,2]
# [1,] 3 9
# [2,] 6 12
though there are probably simple ways, you can try with lapply:
ones_matrix <- lapply(1:nrow(ones), function(i){matrix(ones[i, ], nrow=16)})
Consider the following 3-dimensional array:
set.seed(123)
arr = array(sample(c(1:10)), dim=c(3,4,2))
which yields
> arr
, , 1
[,1] [,2] [,3] [,4]
[1,] 10 9 8 2
[2,] 5 1 4 10
[3,] 6 7 3 5
, , 2
[,1] [,2] [,3] [,4]
[1,] 6 7 3 5
[2,] 9 8 2 6
[3,] 1 4 10 9
I'd like to subset it like
arr[c(1,2), c(2,4), c(1)]
but the catch is that I don't know (a) which indices or (b) which dimension the indices are.
What is the best way to access an N-dimensional array with index variables?
ll = list(c(1,2), c(2,4), c(1))
arr[ll] # doesn't work
arr[grid.expand(ll)] # doesn't work
# ..what else?
use do.call, such as:
do.call(`[`, c(list(arr), ll))
or more cleanly, using a wrapper function:
getArr <- function(...)
`[`(arr, ...)
do.call(getArr, ll)
[,1] [,2]
[1,] 10 5
[2,] 7 3
There is the asub function from the abind package:
library(abind)
asub(arr, ll)
which can also do a lot more, in particular extract along a subset of the dimensions (https://stackoverflow.com/a/17752012/1201032). Worth having in your toolbox.
I have X, a three-dimensional array in R. I want to take a vector of indices indx (length equal to dim(X)[1]) and form a matrix where the first row is the first row of X[ , , indx[1]], the second row is the second row of X[ , , indx[2]], and so on.
For example, I have:
R> X <- array(1:18, dim = c(3, 2, 3))
R> X
, , 1
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
, , 2
[,1] [,2]
[1,] 7 10
[2,] 8 11
[3,] 9 12
, , 3
[,1] [,2]
[1,] 13 16
[2,] 14 17
[3,] 15 18
R> indx <- c(2, 3, 1)
My desired output is
R> rbind(X[1, , 2], X[2, , 3], X[3, , 1])
[,1] [,2]
[1,] 7 10
[2,] 14 17
[3,] 3 6
As of now I'm using the inelegant (and slow) sapply(1:dim(X)[2], function(x) X[cbind(1:3, x, indx)]). Is there any way to do this using the built-in indexing functions? I had no luck experimenting with the matrix indexing methods described in ?Extract, but I may just be doing it wrong.
Maybe like this:
t(sapply(1:3,function(x) X[,,idx][x,,x]))
I may be answering the wrong question (I can't reconcile your first description and your sample output)... This produces your sample output, but I can't say that it's much faster without running it on your data.
do.call(rbind, lapply(1:dim(X)[1], function(i) X[i, , indx[i]]))
Matrix indexing to the rescue! No applys needed.
Figure out which indices you want:
n <- dim(X)[2]
foo <- cbind(rep(seq_along(indx),n),
rep(seq.int(n), each=length(indx)),
rep(indx,n))
(the result is this)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 1 3
[3,] 3 1 1
[4,] 1 2 2
[5,] 2 2 3
[6,] 3 2 1
and use it as index, converting back to a matrix to make it look like your output.
> matrix(X[foo],ncol=n)
[,1] [,2]
[1,] 7 10
[2,] 14 17
[3,] 3 6