I have a list of matrices with identical dimensions, for example:
mat.list=rep(list(matrix(rnorm(n=12,mean=1,sd=1), nrow = 3, ncol=4)),3)
I'm looking for an efficient way to retrieve a column from each matrix in the list where the column index of interest from each matrix is specified by a vector. For example, for this vector of column indices:
idx.vec=c(3,2,3)
I would like to obtain column 3 from matrix 1, column 2 from matrix 2, and column 3 from matrix 3, combined into a matrix whose dimensions are the number of rows of the matrices in the list by the number of matrices in the list.
For this example the result would therefore be:
cbind(mat.list[[1]][,3],mat.list[[2]][,2],mat.list[[3]][,3])
[,1] [,2] [,3]
[1,] 1.4852810 1.305448 1.4852810
[2,] 1.8647327 -1.237507 1.8647327
[3,] -0.0416013 2.156055 -0.0416013
One possible approach would be mapply('[', mat.list, TRUE, idx.vec). The trick is to use '[' for subsetting and TRUE as an argument to select all the rows. Here is how it works:
'['(matrix(1:4, ncol = 2), TRUE, 2)
# [1] 3 4
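Since rep() in the question produces three identical matrices, it is hard to see from the output that each column really comes from a different matrix. Here is a minimal sketch of the same mapply call on three different matrices. The name mat.list2, the seed, and the use of replicate() are my own assumptions, not part of the question:

```r
set.seed(42)  # assumed seed, only for reproducibility

# Three *different* 3 x 4 matrices: replicate() re-evaluates the expression each time
mat.list2 <- replicate(3, matrix(rnorm(12, mean = 1, sd = 1), nrow = 3, ncol = 4),
                       simplify = FALSE)
idx.vec <- c(3, 2, 3)

# mapply() walks mat.list2 and idx.vec in parallel; TRUE is recycled as the row index
res <- mapply(`[`, mat.list2, TRUE, idx.vec)

dim(res)  # number of rows by number of matrices
```

Column k of res should then be column idx.vec[k] of the k-th matrix.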
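Another (ugly) approach would be lapply(mat.list, "[",,idx.vec)[[1]]. Note that this takes all the columns in idx.vec from the first matrix only, so it matches the desired result only because the three matrices in this example are identical: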
> set.seed(1)
> mat.list=rep(list(matrix(rnorm(n=12,mean=1,sd=1), nrow = 3, ncol=4)),3)
> idx.vec=c(3,2,3)
> lapply(mat.list, "[",,idx.vec)[[1]]
[,1] [,2] [,3]
[1,] 1.487429 2.5952808 1.487429
[2,] 1.738325 1.3295078 1.738325
[3,] 1.575781 0.1795316 1.575781
I'm trying to learn how to use the apply() functions.
Suppose we have a 3-row, 2-column matrix, test <- matrix(c(1,2,3,4,5,6), ncol = 2), and we would like to cap each element in the first column (1, 2, 3) at 2, so that we end up with a matrix of (1,2,2,4,5,6).
How would one write an apply() function to do this?
Here's my latest attempt: test1 <- apply(test[,1], 2, function(x) {if(x > 2){return(x = 2)} else {return(x)}})
We may use pmin on the first column with 2 as the second argument. The 2 is recycled, so pmin compares it elementwise with each value in the first column and returns the smaller of the two:
test[,1] <- pmin(test[,1], 2)
-output
> test
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 2 6
Note that apply needs 'X' to be an array or matrix, i.e. an object with dimensions. Subsetting a single column or row drops the dimensions, because drop = TRUE by default, so apply(test[,1], 2, ...) fails.
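To make the point about dropped dimensions concrete, here is a small sketch using the test matrix from the question. The name capped is just an illustrative variable:

```r
test <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)

dim(test[, 1])                # NULL: the single column was dropped to a plain vector
dim(test[, 1, drop = FALSE])  # 3 1: still a matrix, so apply() accepts it

# apply() over the one-column matrix, capping each value at 2
capped <- apply(test[, 1, drop = FALSE], 2, function(x) pmin(x, 2))
```

This is only meant to show why the original attempt errors out; the direct pmin assignment above is still the simpler route.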
If you really want to use the apply() function, I guess you're looking for something like this:
t(apply(test, 1, function(x) c(min(x[1], 2), x[2])))
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 2 6
But if you want my opinion, akrun's suggestion is definitely better.
Suppose we have a "test" matrix that looks like this: (1,2,3, 4,5,6, 7,8,9, 10,11,12) generated by running test <- matrix(1:12, ncol = 4). A simple 3 x 4 (rows x columns) matrix of numbers running from 1 to 12.
Now suppose we'd like to add a value of 1 to each element in each odd-numbered matrix column, so we end up with a matrix of the following values: (2,3,4, 4,5,6, 8,9,10, 10,11,12). How would we use an apply() function to do this?
Note that this is a simplified example. In the more complete code I'm working with, the matrix dynamically expands/contracts based on user inputs, so I need an apply() function that counts the actual number of matrix columns rather than assuming a fixed 4 columns as in the example above. (And I'm not adding a value of 1 to the elements; I'm running the parallel minima function, test[,1] <- pmin(test1[,1], 5), to limit each value to a maximum of, say, 5.)
With my current limited understanding of the apply() family of functions, all I can do so far is apply(test, 2, function(x) {return(x+1)}), but this adds a value of 1 to all elements in all columns rather than only the odd-numbered columns.
You may simply subset the input data frame to access only odd or even numbered columns. Consider (here f stands for whatever function you want to apply):
test[c(TRUE, FALSE)] <- apply(test[c(TRUE, FALSE)], 2, function(x) f(x))
test[c(FALSE, TRUE)] <- apply(test[c(FALSE, TRUE)], 2, function(x) f(x))
This works because the recycling rules in R will cause e.g. c(TRUE, FALSE) to be repeated however many times is needed to cover all columns in the input test data frame.
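As a rough illustration of that recycling, here is a sketch on a made-up five-column data frame. The name df and its contents are hypothetical, and I use lapply() rather than apply() so the columns stay a list of vectors:

```r
# Hypothetical data frame with five columns, used only to illustrate recycling
df <- as.data.frame(matrix(1:15, ncol = 5))

# c(TRUE, FALSE) is recycled to TRUE FALSE TRUE FALSE TRUE, i.e. columns 1, 3 and 5
names(df[c(TRUE, FALSE)])

# Add 1 to the odd-numbered columns only
df[c(TRUE, FALSE)] <- lapply(df[c(TRUE, FALSE)], function(x) x + 1L)
```

The same index works unchanged if the data frame later gains or loses columns.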
For a matrix, we need to index the columns as test[, ...] and pass drop=FALSE when subsetting, so that the result stays in matrix form for apply():
test <- matrix(1:12, ncol = 4)
test[,c(TRUE, FALSE)] <- apply(test[,c(TRUE, FALSE),drop=FALSE], 2, function(x) x+1)
test
[,1] [,2] [,3] [,4]
[1,] 2 4 8 10
[2,] 3 5 9 11
[3,] 4 6 10 12
^ ^ ... these columns incremented by 1
You may use modulo %% 2.
odd <- !seq(ncol(test)) %% 2 == 0
test[, odd] <- apply(test[, odd, drop = FALSE], 2, function(x) {return(x + 1)})
# [,1] [,2] [,3] [,4]
# [1,] 2 4 8 10
# [2,] 3 5 9 11
# [3,] 4 6 10 12
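Since the question says the real operation is pmin with a cap of 5 rather than adding 1, here is a sketch of that case. It assumes apply() is not strictly required: pmin is already vectorised and keeps the dimensions of its first argument, so it can work on the whole sub-matrix at once:

```r
test <- matrix(1:12, ncol = 4)

# Odd-numbered columns, computed from the actual column count
odd <- seq_len(ncol(test)) %% 2 == 1

# pmin() caps the whole sub-matrix in one call;
# drop = FALSE keeps the matrix shape even if only one column is selected
test[, odd] <- pmin(test[, odd, drop = FALSE], 5)
```

Columns 2 and 4 are left untouched, and nothing here depends on the matrix having exactly 4 columns.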
Suppose I have a matrix like the example below called m1:
m1<-matrix(6:1,nrow=3,ncol=2)
[,1] [,2]
[1,] 6 3
[2,] 5 2
[3,] 4 1
How do I get the index row for the minimum value of each column?
I know which.min() will return the column index value for each row.
The output should be 3 and 3, because the minimum of column [,1] is 4, in row [3,], and the minimum of column [,2] is 1, also in row [3,].
If we need the column-wise index, use apply with MARGIN=2 and apply which.min:
apply(m1, 2, which.min)
#[1] 3 3
If 1 column at a time is needed:
apply(as.matrix(m1[,1, drop = FALSE]), 2, which.min)
If we check ?Extract, the default usage is
x[i, j, ... , drop = TRUE]
drop - For matrices and arrays. If TRUE the result is coerced to the lowest possible dimension (see the examples). This only works for extracting elements, not for the replacement. See drop for further details.
To avoid getting dimensions dropped, use drop = FALSE
If we need the min values of each row
do.call(pmin, as.data.frame(m1))
Or
apply(m1, 1, min)
Or
library(matrixStats)
rowMins(m1)
data
m1 <- matrix(6:1,nrow=3,ncol=2)
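If you want both the row index and the minimum value for each column, one possible sketch is to feed the which.min result back into matrix indexing. The names row.idx and col.min are just illustrative:

```r
m1 <- matrix(6:1, nrow = 3, ncol = 2)

# Row index of the minimum in each column
row.idx <- apply(m1, 2, which.min)

# Index with a two-column (row, column) matrix to pull out the minima themselves
col.min <- m1[cbind(row.idx, seq_len(ncol(m1)))]
```

Here row.idx should be 3 and 3, and col.min should be 4 and 1, matching the values described in the question.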
I created a matrix in R
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
Now I would like to replace the first column of the matrix with the value 1, the second and third columns with standard normal random variables, and the last three columns with the values of another matrix.
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
other.matrix<-matrix(runif(18), nrow = 6, ncol = 3)
C[,1]<-1
C[,2:3]<-rnorm(12)
C[,4:6]<-other.matrix
To access the rows and columns of matrices (and for that matter, data.frames) in R you can use [] brackets and i,j notation, where i is the row and j is the column. For example, the 3rd row and 2nd column of your matrix C can be addressed with
C[3,2]
#[1] 0
Use <- to assign new values to the rows/columns you have selected.
For the first three columns, you can use
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
C[ ,1] <- 1; C[ ,2] <- rnorm(6); C[ ,3] <- rnorm(6)
Let's now say your other matrix is called D and looks like
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0.6527716 0.81793644 0.67138209 0.3175264 0.1067119 0.5907180 0.4619992
[2,] 0.2268516 0.90893913 0.62917211 0.1768426 0.3659889 0.0339911 0.2322981
[3,] 0.9264116 0.81693835 0.59555163 0.6960895 0.1667125 0.6631861 0.9718530
[4,] 0.2613363 0.06515864 0.04971742 0.7277188 0.2580444 0.3718222 0.8028141
[5,] 0.2526979 0.49294947 0.97502566 0.7962410 0.8321882 0.2981480 0.7098733
[6,] 0.4245959 0.95951112 0.45632856 0.8227812 0.3542232 0.2680804 0.7042317
Now let's say you want columns 3, 4, and 5 from D as the last three columns of C; then you can simply write
C[ ,4:6] <- D[ ,3:5]
And your result is
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 -1.76111875 0.4621061 0.67138209 0.3175264 0.1067119
[2,] 1 0.40036245 0.9054436 0.62917211 0.1768426 0.3659889
[3,] 1 -1.03238266 -0.6705829 0.59555163 0.6960895 0.1667125
[4,] 1 -0.47064774 0.3119684 0.04971742 0.7277188 0.2580444
[5,] 1 -0.01436411 -0.4688032 0.97502566 0.7962410 0.8321882
[6,] 1 -1.18711832 0.8227810 0.45632856 0.8227812 0.3542232
One thing to note is that C and D must have the same number of rows for this to work.
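Putting the steps together, here is a sketch that also checks the row counts before assigning. The D built here with runif() is only a hypothetical stand-in for your other matrix, and the seed is assumed:

```r
set.seed(1)  # assumed seed, only for reproducibility

C <- matrix(0, nrow = 6, ncol = 6)
D <- matrix(runif(42), nrow = 6, ncol = 7)  # hypothetical stand-in for the other matrix

# Fail early if the row counts differ, instead of getting a replacement-length error
stopifnot(nrow(C) == nrow(D))

C[, 1]   <- 1          # first column: constant 1
C[, 2:3] <- rnorm(12)  # second and third columns: standard normal draws
C[, 4:6] <- D[, 3:5]   # last three columns: columns 3 to 5 of D
```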
I want to find the maximum value in each column for every 2 rows (say). How can I do that in R? For example:
matrix(c(3,1,20,5,4,12,6,2,9,7,8,7), byrow=T, ncol=3)
I want the output like this
matrix(c(5,4,20,7,8,9), byrow=T, ncol=3)
Here is one way of doing it.
Define a vector that contains information about the groups you want. In this case, I use rep to repeat a sequence of numbers.
Then define a helper function to calculate the column maximum of an array — this is a simple apply of max.
Finally, use sapply with an anonymous function that applies colMax to each of your grouped array subsets (x is your matrix).
The code:
groups <- rep(1:2, each=2)
colMax <- function(x)apply(x, 2, max)
t(
sapply(unique(groups), function(i)colMax(x[which(groups==i), ]))
)
The results:
[,1] [,2] [,3]
[1,] 5 4 20
[2,] 7 8 9
As one long line:
t(sapply(seq(1,nrow(df1),by=2),function(i) apply(df1[seq(i,1+i),],2,max)))
Another option,
do.call(rbind, by(m, gl(nrow(m)/2, 2), function(x) apply(x, 2, max)))
apply(mat, 2, function(x) tapply(x, # work on each column
# create groups of 2 vector of proper length: 1,1,2,2,3,3,4,4 ....
rep(1:(length(x)/2), each=2, len=length(x)),
max))
[,1] [,2] [,3]
1 5 4 20
2 7 8 9
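The answers above each use a different name for the matrix (x, df1, m, mat), so here is a self-contained sketch on the question's data. The names m, block, groups and res are my own choices, and the ceiling() grouping is an assumption that should also cope with a row count that is not a multiple of the block size:

```r
m <- matrix(c(3, 1, 20, 5, 4, 12, 6, 2, 9, 7, 8, 7), byrow = TRUE, ncol = 3)

block  <- 2                                   # rows per group
groups <- ceiling(seq_len(nrow(m)) / block)   # 1 1 2 2 for four rows

# For each group of row indices, take the column-wise maximum of those rows
res <- t(sapply(split(seq_len(nrow(m)), groups),
                function(rows) apply(m[rows, , drop = FALSE], 2, max)))
```

For the example data, res should hold the rows 5 4 20 and 7 8 9, as in the desired output.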