I'd like to look up matrix cells using row and column indices stored in a data frame. Preferably, I'd like to do this in a vectorized way for best performance. However, the most obvious syntax looks up every possible row-column combination, not only the pairs that come from a single data frame row.
Here is a small example:
> m1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), 3, 3)
>
> m1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
>
> p1 <- data.frame(row = c(2, 3, 1), column = c(3, 1, 2))
>
> p1
row column
1 2 3
2 3 1
3 1 2
>
> # vectorized indexing that does not work as intended
> m1[p1$row, p1$column]
[,1] [,2] [,3]
[1,] 8 2 5
[2,] 9 3 6
[3,] 7 1 4
>
> # this works as intended, but is possibly slow due to R-level looping
> sapply(1 : nrow(p1), function (i) { m1[p1[i, "row"], p1[i, "column"]] })
[1] 8 3 4
The sapply call computes the output I expect (only m1[2, 3], m1[3, 1] and m1[1, 2]), but I expect it to be slow for larger data frames because it loops at the R level.
Any thoughts on a better (ideally vectorized) way?
To look up specific (row, column) pairs, index the matrix with a two-column matrix: each row of the index matrix is read as one (row, column) pair. So you can try:
m1[as.matrix(p1)]
# [1] 8 3 4
Or if you have two vectors:
m1[cbind(row_idx, col_idx)]
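As a minimal sketch of why this works: when a matrix is indexed with a single two-column numeric matrix, R reads each row of the index as one (row, column) pair. The example below reuses m1 and p1 from the question and checks the result against the sapply loop. Selecting the two columns by name before calling as.matrix is my own precaution, in case the data frame carries other columns.

```r
m1 <- matrix(1:9, 3, 3)
p1 <- data.frame(row = c(2, 3, 1), column = c(3, 1, 2))

# Two-column index matrix: one (row, column) pair per line
idx <- as.matrix(p1[, c("row", "column")])
vectorized <- m1[idx]

# Reference result from the R-level loop in the question
looped <- sapply(seq_len(nrow(p1)), function(i) m1[p1$row[i], p1$column[i]])

all(vectorized == looped)  # TRUE
```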
Suppose you have a matrix a
a <- matrix(1:9, 3, 3)
a
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
and a vector b indicating which element of each row you want to extract. That is, the vector b indicates the column of the element, for instance:
b <- c(1, 3, 1)
If we want to extract the indicated data points, we can simply index each desired element like this:
a[cbind(1:nrow(a),b)]
[1] 1 8 3
I would like to do it with a negative index vector. That is, R should return a matrix where exactly one element per row is omitted (in this case, a 3x2 matrix). If I try it in a naive approach, R throws an error:
c = -b
a[cbind(1:nrow(a),c)]
Error in a[cbind(1:nrow(a), c)] :
negative values are not allowed in a matrix subscript
Thank you!
Not pretty, but you could do
b <- c(1, 3, 1) + 3 * 0:2
matrix(c(t(a))[-b], 3, 2, byrow = TRUE)
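The offsets above are hard-coded for a 3x3 matrix. A hedged generalization, assuming b holds exactly one column index per row: c(t(a)) lists the elements row by row, so the element in row i and column b[i] sits at position (i - 1) * ncol(a) + b[i]. The name drop_idx is mine.

```r
a <- matrix(1:9, 3, 3)
b <- c(1, 3, 1)

# Position of each dropped element in the row-major vector c(t(a))
drop_idx <- (seq_len(nrow(a)) - 1) * ncol(a) + b

# Remove those positions, then refill row by row
res <- matrix(c(t(a))[-drop_idx], nrow = nrow(a), byrow = TRUE)
res
```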
Maybe this is another naive approach. We loop over every row of the matrix and remove the column index specified in b.
t(sapply(seq_len(nrow(a)), function(x) a[x, -b[x]]))
# [,1] [,2]
#[1,] 4 7
#[2,] 2 5
#[3,] 6 9
Or using mapply with split
t(mapply(`[`, split(a, seq_len(nrow(a))), -b))
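Another possible variant, not taken from the answers above, avoids both the loop and the index arithmetic by building a logical mask. This is a sketch that assumes exactly one element is dropped per row; keep is a name introduced here.

```r
a <- matrix(1:9, 3, 3)
b <- c(1, 3, 1)

# Start with "keep everything", then switch off one cell per row
keep <- matrix(TRUE, nrow(a), ncol(a))
keep[cbind(seq_len(nrow(a)), b)] <- FALSE

# Transpose both so the surviving elements come out row by row
res <- matrix(t(a)[t(keep)], nrow = nrow(a), byrow = TRUE)
res
```

The positive two-column index is used only to mark the cells to drop, which sidesteps the "negative values are not allowed in a matrix subscript" error.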
I have 2 vectors of different sizes:
vector1 = [1, 2, 3]
vector2 = [1, 2, 3, 4, 5]
I want to combine them element by element: each number from vector1 plus each number from vector2.
I tried a for loop nested inside another for loop, without success. Any help?
vector1 <- data.frame(c(1, 2, 3))
vector2 <- data.frame(c(1, 2, 3, 4, 5))
for (i in vector1) {
for (j in vector2) {
a <- i + j
}
}
This is the message I get:
Warning message:
In i + j : longer object length is not a multiple of shorter object length
You can use outer
> vector1 <- c(1, 2, 3)
> vector2 <- c(1, 2, 3, 4, 5)
> outer(vector1, vector2, FUN="+")
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 3 4 5 6 7
[3,] 4 5 6 7 8
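As for the warning in the question, it appears to come from wrapping the vectors in data.frame: a for loop over a data frame iterates over its columns, so i and j are whole columns of length 3 and 5, and i + j recycles the shorter one. A small check of that reading, with vector1_df, n_iter and seen as illustrative names:

```r
vector1_df <- data.frame(c(1, 2, 3))

n_iter <- 0
seen <- NULL
for (i in vector1_df) {   # iterates over columns, not over 1, 2, 3
  n_iter <- n_iter + 1
  seen <- i
}
n_iter  # 1
seen    # 1 2 3

# With plain vectors, outer pairs every element with every element
vector1 <- c(1, 2, 3)
vector2 <- c(1, 2, 3, 4, 5)
sums <- outer(vector1, vector2, FUN = "+")
sums[2, 4] == vector1[2] + vector2[4]  # TRUE
```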
If you really want to use a loop, you can use a nested for loop:
> result <- matrix(0, nrow = length(vector1), ncol=length(vector2))
> for (i in seq_along(vector1)) {
+   for (j in seq_along(vector2)) {
+     result[i, j] <- vector1[i] + vector2[j]
+   }
+ }
> result
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 3 4 5 6 7
[3,] 4 5 6 7 8
A for loop is an inefficient way to do this. Here's a way using sapply from base R, although I feel there should be an even better way:
c(sapply(vector1, function(x) x + vector2))
# [1] 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8
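If the flat vector is what you need, the same ordering can probably be had without sapply. The sketch below compares two candidates against the sapply result; via_outer and via_rep are names made up for the comparison.

```r
vector1 <- c(1, 2, 3)
vector2 <- c(1, 2, 3, 4, 5)

via_sapply <- c(sapply(vector1, function(x) x + vector2))

# outer with the arguments swapped puts vector2 down the columns,
# so flattening column by column gives the same order
via_outer <- c(outer(vector2, vector1, FUN = "+"))

# Repeat each element of vector1 and let vector2 recycle against it
via_rep <- rep(vector1, each = length(vector2)) + vector2

all(via_sapply == via_outer)  # TRUE
all(via_sapply == via_rep)    # TRUE
```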
I have a data frame like
df<-data.frame(a=c(1,2,3),b=c(4,5,6),c=c(7,8,9),d=c(10,11,12))
a b c d
1 1 4 7 10
2 2 5 8 11
3 3 6 9 12
I want to use every row to create a 2x2 matrix, giving nrow(df) = 3 matrices in total: the first from 1, 4, 7, 10, the second from 2, 5, 8, 11, and the third from 3, 6, 9, 12. Thank you.
We can use split to break the dataset into a list with one element per row, and then apply matrix to each element
lapply(split.default(as.matrix(df), row(df)), matrix, 2)
If we need the matrix columns to be 1, 7 followed by 4, 10, use byrow=TRUE
lapply(split.default(as.matrix(df), row(df)), matrix, 2, byrow=TRUE)
Or use apply with MARGIN = 1 and wrap it with list to get a list output
do.call("c", apply(df, 1, function(x) list(matrix(x, ncol=2))))
If we need a for loop, preassign a as a list with length equal to the number of rows of 'df'
a <- vector("list", nrow(df))
for(i in 1:nrow(df)){ a[[i]] <- matrix(unlist(df[i,]), ncol=2)}
a
Or if it can be stored as array
array(t(df), c(2, 2, 3))
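A short sketch of how the array form might be used, assuming the df from the question. Each slice arr[, , k] is the 2x2 matrix built from row k, filled column by column. aperm is used here to get the byrow layout instead; arr, arr_byrow and as_list are illustrative names.

```r
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9), d = c(10, 11, 12))

arr <- array(t(df), c(2, 2, 3))
arr[, , 1]           # matrix for the first row, filled by column

# Swap the first two dimensions to get the byrow = TRUE layout
arr_byrow <- aperm(arr, c(2, 1, 3))
arr_byrow[, , 1]

# Convert back to a list of matrices if a list is preferred
as_list <- lapply(seq_len(dim(arr)[3]), function(k) arr[, , k])
```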
Or using Map:
m <- matrix(c(t(df)), ncol = 2, byrow = TRUE)
p <- 2 # number of rows
Map(function(i,j) m[i:j,], seq(1,nrow(m),p), seq(p,nrow(m),p))
# [[1]]
# [,1] [,2]
# [1,] 1 4
# [2,] 7 10
# [[2]]
# [,1] [,2]
# [1,] 2 5
# [2,] 8 11
# [[3]]
# [,1] [,2]
# [1,] 3 6
# [2,] 9 12
Let's say we have a matrix something like this:
> A = matrix(
+ c(2, 4, 3, 1, 5, 7), # the data elements
+ nrow=2, # number of rows
+ ncol=3, # number of columns
+ byrow = TRUE) # fill matrix by rows
> A # print the matrix
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 5 7
I used a small example here, but imagine the matrix is much bigger, say 200 rows and 5 columns. What I want to do is find the minimum value in column 3 and extract the row that contains it. In other words, get the row whose third column holds the lowest value in that column.
dataToReturn <- which(A == min(A[, 3]))
but this doesn't work.
Another way is to use which.min
A[which.min(A[, 3]), ]
##[1] 2 4 3
You can do this with a simple subsetting via [] and min:
A[A[,3] == min(A[,3]),]
[1] 2 4 3
This reads: Return those row(s) of A where the value of column 3 equals the minimum of column 3 of A.
The two approaches differ when the minimum is tied. If you have a matrix like this:
A <- matrix(c(2,4,3,1,5,7,1,3,3), nrow=3, byrow = TRUE)
> A
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 5 7
[3,] 1 3 3
> A[which.min(A[, 3]), ] #returns only the first row with minimum condition
[1] 2 4 3
> A[A[,3] == min(A[,3]),] #returns all rows with minimum condition
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 3 3
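One thing to watch with either form: when exactly one row matches, R drops the result to a plain vector. If downstream code expects a matrix, adding drop = FALSE should keep the shape stable. A minimal sketch using the tied example above (first_min and all_min are illustrative names):

```r
A <- matrix(c(2, 4, 3, 1, 5, 7, 1, 3, 3), nrow = 3, byrow = TRUE)

# Always a 1-row matrix, even though only one row is selected
first_min <- A[which.min(A[, 3]), , drop = FALSE]

# A matrix with one row per tied minimum
all_min <- A[A[, 3] == min(A[, 3]), , drop = FALSE]

dim(first_min)
dim(all_min)
```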
I have a problem that I think is best solved using combinatorics.
Let's say you have 4 values (2, 5, 6, 7). I would like to get all vectors where I pick out 3 of them, that is, a matrix with (2,5,6), (2,5,7), (2,6,7), (5,6,7). I would like to do this for a general vector. How do I do it?
x <- c(2, 5, 6, 7)
combn(x, 3)
gives
> combn(x, 3)
[,1] [,2] [,3] [,4]
[1,] 2 2 2 5
[2,] 5 5 6 6
[3,] 6 7 7 7
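combn returns one combination per column. If one combination per row is preferred, transposing should be enough, and the FUN argument can apply a function to each combination directly. One caveat for a "general vector": if x is a single positive integer, combn treats it as seq_len(x), so a length-one input behaves differently. The names combos and combo_sums are illustrative.

```r
x <- c(2, 5, 6, 7)

# One combination per row instead of per column
combos <- t(combn(x, 3))
combos

# Apply a function to each combination, here the sum
combo_sums <- combn(x, 3, FUN = sum)
combo_sums

nrow(combos) == choose(length(x), 3)  # TRUE
```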
I have created a package iterpc which is able to solve most of the combination/permutation related problems.
library(iterpc)
x <- c(2, 5, 6, 7)
# combinations
getall(iterpc(4, 3, label=x))
# combinations with replacement
getall(iterpc(4, 3, replace=TRUE, label=x))
# permutations
getall(iterpc(4, 3, label=x, ordered=TRUE))