R: Picking values from a matrix by an index matrix

I have a data matrix with n rows and m columns (in this case n=192, m=1142) and an index matrix of size n x p (192x114). Each row of the index matrix holds the column numbers of the elements that I would like to pick from the matching row of the data matrix. So I have a situation something like this (with example values):
data<-matrix(1:30, nrow=3)
data
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 4 7 10 13 16 19 22 25 28
[2,] 2 5 8 11 14 17 20 23 26 29
[3,] 3 6 9 12 15 18 21 24 27 30
columnindices<-matrix(sample(1:10,size=9, replace=TRUE),nrow=3)
columnindices
[,1] [,2] [,3]
[1,] 8 7 4
[2,] 10 8 10
[3,] 8 10 2
I would like to pick values from the rows of the data matrix using the indices in the columnindices matrix, so that the resulting matrix would look like this:
[,1] [,2] [,3]
[1,] 22 19 10
[2,] 29 23 29
[3,] 24 30 6
I tried using a for loop:
result<-0
for(i in 1:3) {
result[i]<-data[i,][columnindices[,i]]
print[i]
}
but this doesn't give the desired result. I guess my problem should be rather simple to solve, but unfortunately, despite many hours of work and multiple searches, I still haven't been able to solve it (I am a rookie). I would really appreciate some help!

Your loop is just a little bit off:
result <- matrix(rep(NA, 9), nrow = 3)
for(i in 1:3){
result[i,] <- data[i, columnindices[i,]]
}
> result
[,1] [,2] [,3]
[1,] 25 13 7
[2,] 29 29 23
[3,] 15 15 18
Note that this matrix is not the one you posted as the expected result, because columnindices is generated with sample() and no seed, so the random indices I got differ from the ones you posted. The code itself should work as you want it.

The for-loop approach described by @LAP is easier to understand and to implement.
If you would like something more general, i.e. one where you don't need to
adjust the row count every time, you can use the mapply function:
result <- mapply(
FUN = function(i, j) data[i,j],
row(columnindices),
columnindices)
dim(result) <- dim(columnindices)
mapply loops through the elements of the two matrices in parallel:
row(columnindices) supplies the row index i,
columnindices supplies the column index j.
It returns a vector, so you have to restore the dimensions of columnindices afterwards.
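
As a possible vectorised alternative to both the loop and mapply, base R also lets you index a matrix with a two-column matrix of (row, column) pairs. The sketch below assumes the example values from the question; the name idx is only illustrative.

```r
# Example data from the question
data <- matrix(1:30, nrow = 3)
# The index matrix exactly as printed in the question (filled column by column)
columnindices <- matrix(c(8, 10, 8, 7, 8, 10, 4, 10, 2), nrow = 3)

# Build a two-column matrix of (row, column) pairs, one pair per wanted element
idx <- cbind(c(row(columnindices)), c(columnindices))

# Indexing with a two-column matrix picks one element per pair;
# the result is a vector, so reshape it to the shape of columnindices
result <- matrix(data[idx], nrow = nrow(columnindices))
result
```

This avoids the explicit loop and should scale to the 192 x 114 case without changes, as long as every index is a valid column number of data.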

Related

Column having maximum L2 norm

How can I find the column of a matrix that has the maximum L2 norm? The matrix has NA values in some columns, and we want to ignore those columns.
I am trying the following code, but it throws an error due to the NA values.
#The matrix is T
for(i in 1:ncol(T)){
if(norm(y,type='2') < norm(T[,i],type = '2'))
y = T[,i]
}
I think it would also be useful if we could somehow get the columns of T as a list, since we could use which.max function then, but I could not do that. Is that possible?
Please help
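Maybe you can write your own L2 norm and find the column with the maximum. colSums() returns NA for any column that contains an NA, and which.max() skips NA entries, so those columns are ignored automatically, i.e.,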
which.max(sqrt(colSums(T**2)))
Example
T <- matrix(c(1:10,NA,12:19,NA),nrow = 4)
> T
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 NA 15 19
[4,] 4 8 12 16 NA
> which.max(sqrt(colSums(T**2)))
[1] 4
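
The question also asked whether the columns can be obtained as a list. A minimal sketch of that route, assuming the same example values: the matrix is named M here (my choice, not from the original) because T is also R's shorthand for TRUE, and the helper names cols and norms_from_list are illustrative.

```r
# Same example values as above, under a name that does not mask T (TRUE)
M <- matrix(c(1:10, NA, 12:19, NA), nrow = 4)

# Column norms computed directly; columns containing NA give NA
col_norms <- sqrt(colSums(M^2))

# which.max() discards NA entries, so NA columns can never win
best <- which.max(col_norms)

# The list-based route: one list element per column
cols <- lapply(seq_len(ncol(M)), function(j) M[, j])
norms_from_list <- sapply(cols, function(v) sqrt(sum(v^2)))
best_from_list <- which.max(norms_from_list)
```

Both routes should pick the same column; the colSums() version simply avoids building the intermediate list.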

Flipping matrix columns

I have a matrix:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 11 15 19
[4,] 4 8 12 16 20
I want to flip it so that the last column becomes the first and the first becomes the last.
I know how to do it with a loop but is there any other quicker way to do this, e.g. with a function.
Here is the code that creates the matrix:
mat=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20), ncol=5)
We can use reverse sequencing from last column index to the first one to do the flipping
mat[,ncol(mat):1]
It can be wrapped into a function
revflip <- function(matr) {
matr[, ncol(matr):1, drop = FALSE]
}
revflip(mat)
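
One edge case worth noting: ncol(matr):1 evaluates to c(0, 1) for a matrix with zero columns, which makes the subscript go out of bounds. A slightly more defensive sketch uses rev(seq_len(...)) instead; flip_cols is a hypothetical name, not part of the original answer.

```r
# Hypothetical variant of revflip that also handles zero-column matrices
flip_cols <- function(m) {
  # seq_len(0) is an empty vector, so an empty matrix stays empty
  m[, rev(seq_len(ncol(m))), drop = FALSE]
}

mat <- matrix(1:20, ncol = 5)
flipped <- flip_cols(mat)

# A matrix with four rows and no columns is returned unchanged
empty <- matrix(numeric(0), nrow = 4, ncol = 0)
dim(flip_cols(empty))
```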

Writing a loop/function that compares adjacent columns in a matrix and picks the max value so to reduce the number of columns

I'm new to R and stuck. I want to reduce the number of columns in a 92x8192 matrix. The matrix consists of 92 observations, and each column represents a data point in a spectrum. The value corresponds to an intensity, which is an integer. I want to reduce the "resolution" (i.e. the number of data points = columns) of the spectrum in a somewhat controlled way.
Example:
[,1] [,2] [,3] [,4] [,5] [,6] [...]
[1,] 1 2 3 4 5 6
[2,] 7 8 9 10 11 12
[3,] 13 14 15 16 17 18
[4,] 19 20 21 22 23 24
[5,] 25 26 27 28 29 30
[6,] 31 32 33 34 35 36
What I would like to do is compare adjacent columns (for each row), e.g. [1,1] and [1,2], and find the max value of those two entries (that would be [1,2] in this case). The smaller value should be dropped, and then the next two adjacent columns should be evaluated, so that in the end only ncol/2 columns are left. I know there is something like pmax, but since my knowledge of loops and functions is far too limited at this point, I don't know how to compare not just two columns at a time but all 4096 pairs of values in each row. In the end the matrix should look like this:
[,1] [,2] [,3] [...]
[1,] 2 4 6
[2,] 8 10 12
[3,] 14 16 18
[4,] 20 22 24
[5,] 26 28 30
[6,] 32 34 36
The values I have used are not a good example, because I know that in this case it looks like I could just drop every other column, and I know how to do that.
Apologies if the question is worded in a complicated way, but I think the task isn't really all that complicated.
Thanks for any help or suggestions on how to go about this task.
Example matrix:
> set.seed(101)
> x_full <- matrix(runif(30), nrow=5)
> x_full
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.37219838 0.3000548 0.8797957 0.59031973 0.7007115 0.79571976
[2,] 0.04382482 0.5848666 0.7068747 0.82043609 0.9568375 0.07121255
[3,] 0.70968402 0.3334671 0.7319726 0.22411848 0.2133520 0.38940777
[4,] 0.65769040 0.6220120 0.9316344 0.41166683 0.6610615 0.40645122
[5,] 0.24985572 0.5458286 0.4551206 0.03861056 0.9233189 0.65935508
Now reduce:
> x_reduced <- sapply(seq(1, ncol(x_full), 2), function(colnum) { pmax(x_full[, colnum], x_full[, colnum + 1]) })
> x_reduced
[,1] [,2] [,3]
[1,] 0.3721984 0.8797957 0.7957198
[2,] 0.5848666 0.8204361 0.9568375
[3,] 0.7096840 0.7319726 0.3894078
[4,] 0.6576904 0.9316344 0.6610615
[5,] 0.5458286 0.4551206 0.9233189
How it works: seq(1, ncol(x_full), 2) generates a sequence of integers representing the odd numbers up to the number of columns of x_full. Then sapply() applies a function to this sequence and presents the results in a tidy format (in this case it happens to be a matrix as we require). The function being applied is one that we specify using function: for column numbered colnum it just applies pmax() across that column and the next.
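
Because pmax() works elementwise and keeps the dimensions of its first argument, the same reduction can probably be written without sapply() by passing the odd and even column blocks in one call. This is a sketch that assumes an even number of columns; the name odd is illustrative.

```r
set.seed(101)
x_full <- matrix(runif(30), nrow = 5)

# Positions of the odd columns: 1, 3, 5, ...
odd <- seq(1, ncol(x_full), by = 2)

# Compare each odd column with its right-hand neighbour in a single call;
# the result keeps the 5 x 3 shape of the first argument
x_reduced <- pmax(x_full[, odd], x_full[, odd + 1])
x_reduced
```

For the 92 x 8192 case this makes one pmax() call over two 92 x 4096 blocks instead of 4096 separate calls.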
Example solution
mat <- matrix(1:16, nrow = 4)
m <- matrix(nrow = nrow(mat), ncol = ncol(mat) / 2) # preallocate the result matrix to save time
for (i in seq(1, ncol(mat), 2)) {
m[, (i + 1) / 2] <- pmax(mat[, i], mat[, i + 1]) # odd column i goes to result column (i + 1) / 2
}
Your solution is then stored in m.

How to unroll a matrix in R

For example, I have a matrix:
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
I want it to become
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 2 3 4 5 6 7 8
[2,] 9 10 11 12 13 14 15 16
Thanks.
Let me expand on Zheyuan Li's answer, since these things can be a bit mysterious for the uninitiated. Basically, the same matrix function used to create a matrix from a vector, can also be used to reshape a matrix.
All one needs to realize is that a matrix is much like a vector but with a dim attribute for its shape, and that the values of that underlying vector are stored by column.
To create your original matrix, you could do:
A <- matrix(1:16, nrow=4, byrow=TRUE)
print(attributes(A))
The byrow argument tells matrix to allocate the elements of the input vector in a rowwise fashion to the matrix, instead of columnwise.
However, it does not change the fact that after this allocation, the internal storing of values in the matrix is still by column. The byrow argument has then simply changed the ordering of elements in the underlying vector, as can easily be seen:
print(as.numeric(A))
What we need to get your desired output, is to first get the sequence in your matrix ordered by column - so that the underlying vector is 1:16 again. For this we can use the transpose function t(). After the transpose, we can bring the now nicely ordered values into the desired 2x8 shape in a rowwise fashion. So:
B <- matrix(t(A), nrow=2, byrow=TRUE)
print(B)
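
To make the storage order concrete, here is a short sketch that inspects the underlying vectors at each step, using only the objects defined above.

```r
A <- matrix(1:16, nrow = 4, byrow = TRUE)

# Underlying storage is column by column: 1 5 9 13 2 6 10 14 ...
as.vector(A)

# After transposing, the underlying vector is 1:16 again
as.vector(t(A))

# Refill row by row into two rows of eight
B <- matrix(t(A), nrow = 2, byrow = TRUE)
B
```

The first row of B holds 1 to 8 and the second row holds 9 to 16, which is the layout asked for in the question.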

How to pass "nothing" as an argument to `[` for subsetting?

I was hoping to be able to construct a do.call formula for subsetting without having to identify the actual range of every dimension in the input array.
The problem I'm running into is that I can't figure out how to mimic the direct function x[,,1:n,] , where no entry in the other dimensions means "grab all elements."
Here's some sample code, which fails. So far as I can tell, either [ or do.call replaces my NULL list values with 1 for the index.
x<-array(1:6,c(2,3))
dimlist<-vector('list', length(dim(x)))
shortdim<-2
dimlist[[shortdim]] <- 1: (dim(x)[shortdim] -1)
flipped <- do.call(`[`,c(list(x),dimlist))
I suppose I could kludge a solution by assigning the value -2*max(dim(x)) to each element of dimlist, but yuck.
(FWIW, I have alternate functions which do the desired job either via melt/recast or the dreaded "build a string and then eval(parse(mystring))", but I wanted to do it "better.")
Edit: as an aside, I ran a version of this code (with the equivalent of DWin's TRUE setup) against a function which used melt & acast ; the latter was several times slower to no real surprise.
After some poking around, alist seems to do the trick:
x <- matrix(1:6, nrow=3)
x
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
# 1st row
do.call(`[`, alist(x, 1, ))
[1] 1 4
# 2nd column
do.call(`[`, alist(x, , 2))
[1] 4 5 6
From ?alist:
‘alist’ handles its arguments as if they described function
arguments. So the values are not evaluated, and tagged arguments
with no value are allowed whereas ‘list’ simply ignores them.
‘alist’ is most often used in conjunction with ‘formals’.
A way of dynamically selecting which dimension is extracted. To create the initial alist of the desired length, see here (Hadley, using bquote) or here (using alist).
m <- array(1:24, c(2,3,4))
ndims <- 3
a <- rep(alist(,)[1], ndims)
for(i in seq_len(ndims))
{
slice <- a
slice[[i]] <- 1
print(do.call(`[`, c(list(m), slice)))
}
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 3 9 15 21
[3,] 5 11 17 23
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 2 8 14 20
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
I've always used TRUE as a placeholder in this instance:
> x
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> do.call("[", list(x, TRUE,1))
[1] 1 2
Let's use a somewhat more complex x example: x <- array(1:36, c(2,9,2)). If the goal is to substitute a vector into a list of subscripts so as to recover all of the first and second dimensions and only the second "slice" of the third dimension:
shortdim <- 3
short.idx <- 2
dlist <- as.list(rep(TRUE, length(dim(x)) ))
> dlist
[[1]]
[1] TRUE
[[2]]
[1] TRUE
[[3]]
[1] TRUE
> dlist[shortdim] <- 2
> do.call("[", c(list(x), dlist) )
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 19 21 23 25 27 29 31 33 35
[2,] 20 22 24 26 28 30 32 34 36
Another point sometimes useful is that the logical indices get recycled so you can use c(TRUE,FALSE) to pick out every other item:
(x<-array(1:36, c(2,9,2)))
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 3 5 7 9 11 13 15 17
[2,] 2 4 6 8 10 12 14 16 18
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 19 21 23 25 27 29 31 33 35
[2,] 20 22 24 26 28 30 32 34 36
> x[TRUE,c(TRUE,FALSE), TRUE]
, , 1
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
, , 2
[,1] [,2] [,3] [,4] [,5]
[1,] 19 23 27 31 35
[2,] 20 24 28 32 36
And further variations on every-other-item are possible. Try c(FALSE, FALSE, TRUE) to get every third item starting with item-3.
Not a straight answer, but I'll demo asub as an alternative as I am pretty sure this is what the OP is eventually after.
library(abind)
Extract 1st row:
asub(x, idx = list(1), dims = 1)
Extract second and third column:
asub(x, idx = list(2:3), dims = 2)
Remove the last item from dimension shortdim as the OP wanted:
asub(x, idx = list(1:(dim(x)[shortdim]-1)), dims = shortdim)
You can also use negative indexing so this will work too:
asub(x, idx = list(-dim(x)[shortdim]), dims = shortdim)
Last, I will mention that the function has a drop option just like [ does.
Ok, here's the code for four versions, followed by microbenchmark results. The speed appears to be pretty much the same for all of these. I'd like to mark all answers as accepted, but since I can't, here are the chintzy criteria used:
DWin loses because you have to enter "TRUE" for placeholders.
flodel loses because it requires a non-base library
My original loses, of course, because of eval(parse()). So Hong Ooi wins. He advances to the next round of Who Wants to be a Chopped Idol :-)
flip1<-function(x,flipdim=1) {
if (flipdim > length(dim(x))) stop("Dimension selected exceeds dim of input")
a <-"x["
b<-paste("dim(x)[",flipdim,"]:1",collapse="")
d <-"]"
#now the trick: get the right number of commas
lead<-paste(rep(',',(flipdim-1)),collapse="")
follow <-paste(rep(',',(length(dim(x))-flipdim)),collapse="")
thestr<-paste(a,lead,b,follow,d,collapse="")
flipped<-eval(parse(text=thestr))
return(invisible(flipped))
}
flip2<-function(x,flipdim=1) {
if (flipdim > length(dim(x))) stop("Dimension selected exceeds dim of input")
dimlist<-vector('list', length(dim(x)) )
dimlist[]<-TRUE #placeholder to make do.call happy
dimlist[[flipdim]] <- dim(x)[flipdim]:1
flipped <- do.call(`[`,c(list(x),dimlist) )
return(invisible(flipped))
}
# and another...
flip3 <- function(x,flipdim=1) {
if (flipdim > length(dim(x))) stop("Dimension selected exceeds dim of input")
flipped <- asub(x, idx = list(dim(x)[flipdim]:1), dims = flipdim)
return(invisible(flipped))
}
#and finally,
flip4 <- function(x,flipdim=1) {
if (flipdim > length(dim(x))) stop("Dimension selected exceeds dim of input")
dimlist <- rep(list(bquote()), length(dim(x)))
dimlist[[flipdim]] <- dim(x)[flipdim]:1
flipped<- do.call(`[`, c(list(x), dimlist))
return(invisible(flipped))
}
Rgames> foo<-array(1:1e6,c(100,100,100))
Rgames> microbenchmark(flip1(foo),flip2(foo),flip3(foo),flip4(foo))
Unit: milliseconds
expr min lq median uq max neval
flip1(foo) 18.40221 18.47759 18.55974 18.67384 35.65597 100
flip2(foo) 21.32266 21.53074 21.76426 31.56631 76.87494 100
flip3(foo) 18.13689 18.18972 18.22697 18.28618 30.21792 100
flip4(foo) 21.17689 21.57282 21.73175 28.41672 81.60040 100
You can use substitute() to obtain an empty argument. This can be included in a normal list.
Then, to programmatically generate a variable number of empty arguments, just rep() it:
n <- 4
rep(list(substitute()), n)
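
Putting that together with do.call, a minimal sketch of a general slicing helper might look like the following. The function name take_slice and its argument names are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: take index idx along dimension dim_index,
# leaving every other dimension empty (i.e. "grab all elements")
take_slice <- function(x, dim_index, idx) {
  # One empty argument per dimension of x
  args <- rep(list(substitute()), length(dim(x)))
  # Fill in only the dimension we want to subset
  args[[dim_index]] <- idx
  do.call(`[`, c(list(x), args))
}

m <- array(1:24, c(2, 3, 4))

# Equivalent to m[, , 1]
s <- take_slice(m, 3, 1)
s
```

Passing a vector such as dim(m)[2]:1 as idx would reverse that dimension, which is the flip operation from the benchmark above.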

Resources