I have two matrices that look similar to this example:
> matrix(1:9, nrow = 3, ncol = 3)
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> matrix(rexp(9), 3) < 1
[,1] [,2] [,3]
[1,] TRUE TRUE FALSE
[2,] FALSE TRUE FALSE
[3,] FALSE FALSE TRUE
I want to sum the entries of each row, but only where the logical matrix of the same size is TRUE; entries where it is FALSE should be left out of the sum. Every row has at least one TRUE in the second matrix.
The result should look like this:
[,1]
[1,] 12
[2,] 5
[3,] 9
Thanks for the help.
Multiplying your T/F matrix by your other one will zero out all the elements where FALSE. You can then sum by row.
m1 <- matrix(1:9, nrow = 3, ncol = 3)
m2 <- matrix(rexp(9), 3) < 1
matrix(rowSums(m1 * m2), ncol = 1)
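As a reproducible sketch of the multiplication approach, the block below fixes the logical matrix to the one printed in the question instead of drawing it with rexp. The name keep is introduced here for illustration only. Because rexp is random, a fresh draw will generally give different sums.

```r
m1 <- matrix(1:9, nrow = 3, ncol = 3)

# Logical mask copied from the printed example in the question (filled by row).
keep <- matrix(c(TRUE,  TRUE,  FALSE,
                 FALSE, TRUE,  FALSE,
                 FALSE, FALSE, TRUE),
               nrow = 3, byrow = TRUE)

# TRUE/FALSE are coerced to 1/0, so masked-out entries contribute nothing.
masked_sums <- matrix(rowSums(m1 * keep), ncol = 1)
masked_sums
#      [,1]
# [1,]    5
# [2,]    5
# [3,]    9
```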
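We replace the elements to be excluded with NA and use rowSums with na.rm = TRUE. Note that m2 in this answer (see the data at the end) is defined with >= 1, so it is TRUE where an element should be dropped, i.e. the negation of the question's matrix.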
matrix(rowSums(replace(m1, m2, NA), na.rm = TRUE))
# [,1]
#[1,] 12
#[2,] 5
#[3,] 9
Or use ifelse
matrix(rowSums(ifelse(m2, 0, m1)))
data
m1 <- matrix(1:9, nrow = 3, ncol = 3)
m2 <- matrix(rexp(9), 3) >= 1
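The two answers use opposite orientations for the logical matrix, which is easy to trip over. The following is a small check, under the assumption that keep marks the entries to include and exclude is its negation (both names are illustrative, and the seed is arbitrary), that the product, replace and ifelse variants give the same row sums.

```r
set.seed(42)  # arbitrary seed, only to make the draw repeatable
m1 <- matrix(1:9, nrow = 3, ncol = 3)

keep    <- matrix(rexp(9), 3) < 1  # TRUE = include in the sum
exclude <- !keep                   # TRUE = leave out of the sum

by_product <- rowSums(m1 * keep)
by_replace <- rowSums(replace(m1, exclude, NA), na.rm = TRUE)
by_ifelse  <- rowSums(ifelse(exclude, 0, m1))

# All three approaches should agree once the mask orientation is consistent.
stopifnot(all(by_product == by_replace), all(by_replace == by_ifelse))
```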
M = matrix(data = 1:9, nrow = 3, ncol = 3)
print (M)
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 1 4 8
## [3,] 3 6 9
I want to check which elements in row [1,] are equal to the elements in row [2,], or which elements in row [1,] are equal to elements in column [,3], for example.
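One way to read this request is as an element-wise comparison plus a membership test. The sketch below types the matrix in from the printed output, since matrix(1:9, nrow = 3) would actually give 2 5 8 in the second row; the names same_pos and in_col3 are illustrative.

```r
# Matrix typed in from the printed output in the question.
M <- rbind(c(1, 4, 7),
           c(1, 4, 8),
           c(3, 6, 9))

# Position-by-position comparison of row 1 with row 2.
same_pos <- M[1, ] == M[2, ]
same_pos          # TRUE TRUE FALSE
which(same_pos)   # 1 2

# Which elements of row 1 occur anywhere in column 3?
in_col3 <- M[1, ] %in% M[, 3]
in_col3           # FALSE FALSE TRUE
```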
I'm having trouble reproducing Excel's SUMIF function in R. There are two matrices, m and n. I want matrix n to hold the sum of each column j of matrix m, limited to the rows up to the i-th row if row i+1 in column j+1 is not empty (I'm not sure I have made this clear; the code and output below show what I want to do).
Below is my code:
m <- matrix(c(1,2,3,4,5,'',7,'',''), nrow = 3)
n <- matrix('', nrow = 2, ncol = 3)
for (j in 1:2) {
n[2,j] <- sum(as.numeric(m[,j])[!is.na(m[,j+1])])
}
n[2,3] <- ''
Below is matrix m:
> m <- matrix(c(1,2,3,4,5,'',7,'',''), nrow = 3)
> m
[,1] [,2] [,3]
[1,] "1" "4" "7"
[2,] "2" "5" ""
[3,] "3" "" ""
The above code yields this result for matrix n:
> n <- matrix('', nrow = 2, ncol = 3)
> n
[,1] [,2] [,3]
[1,] "" "" ""
[2,] "" "" ""
But I want the code to yield this result:
> n <- matrix('', nrow = 2, ncol = 3)
> n
[,1] [,2] [,3]
[1,] "" "" ""
[2,] "3" "4" ""
Please help! Thanks!
Using numeric data:
m <- matrix(c(1,2,3,4,5,NA,7,NA,NA), nrow = 3)
n <- matrix(NA, nrow = 2, ncol = 3)
Bottom line up-front:
n[2,] <- colSums(m * cbind(!is.na(m)[,-1], FALSE), na.rm = TRUE)
Stepping through the logic:
Find the non-NA entries and shift that mask one column to the left:
cbind(!is.na(m)[,-1], FALSE)
# [,1] [,2] [,3]
# [1,] TRUE TRUE FALSE
# [2,] TRUE FALSE FALSE
# [3,] FALSE FALSE FALSE
We can multiply that by the original m, where FALSE is effectively 0.
m * cbind(!is.na(m)[,-1], FALSE)
# [,1] [,2] [,3]
# [1,] 1 4 0
# [2,] 2 0 NA
# [3,] 0 NA NA
Column sums, using colSums(..., na.rm = TRUE)
colSums(m * cbind(!is.na(m)[,-1], FALSE), na.rm = TRUE)
# [1] 3 4 0
Assign that value to the second row of n:
n[2,] <- colSums(m * cbind(!is.na(m)[,-1], FALSE), na.rm = TRUE)
n
# [,1] [,2] [,3]
# [1,] NA NA NA
# [2,] 3 4 0
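For comparison, here is a sketch of the same logic written as an explicit loop, closer in shape to the code in the question. The names vectorised, looped and has_neighbour are illustrative; the last column has no right-hand neighbour, so its sum is left at 0, as in the one-liner.

```r
m <- matrix(c(1, 2, 3, 4, 5, NA, 7, NA, NA), nrow = 3)

vectorised <- colSums(m * cbind(!is.na(m)[, -1], FALSE), na.rm = TRUE)

# Loop version: for column j, sum the rows whose neighbour in column j + 1 is present.
looped <- numeric(ncol(m))  # the last column has no neighbour and stays 0
for (j in seq_len(ncol(m) - 1)) {
  has_neighbour <- !is.na(m[, j + 1])
  looped[j] <- sum(m[has_neighbour, j], na.rm = TRUE)
}

looped
# [1] 3 4 0
stopifnot(identical(looped, vectorised))
```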
A matrix can hold only one class so having empty character values ("") changes all the numeric variables to character. You can use NA instead which will keep the class intact and you can sum it. Also, I don't really understand why you need additional empty (or NA) rows when your actual data is present only in the last row.
Having said that, you can use apply column-wise to sum each column's values up to, but not including, its last non-NA value.
m <- matrix(c(1,2,3,4,5,NA,7,NA,NA), nrow = 3)
n <- matrix(NA, nrow = 2, ncol = 3)
n[2, ] <- apply(m, 2, function(x) sum(x[seq_len(max(which(!is.na(x))) - 1)]))
n
# [,1] [,2] [,3]
#[1,] NA NA NA
#[2,] 3 4 0
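The point about "" turning the whole matrix into character can be checked directly. The sketch below starts from the character matrix in the question and converts it to the numeric form both answers use; m_chr and m_num are illustrative names.

```r
m_chr <- matrix(c(1, 2, 3, 4, 5, "", 7, "", ""), nrow = 3)
typeof(m_chr)   # "character": one "" turns every entry into a string

# Mark the blanks as missing, then switch the storage mode to double.
m_num <- m_chr
m_num[m_num == ""] <- NA
storage.mode(m_num) <- "double"

typeof(m_num)   # "double"
is.na(m_num)    # TRUE exactly where the blanks were
```

Changing the storage mode rather than calling as.numeric keeps the matrix dimensions intact.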
I have matrix M and N given by
> M
[,1] [,2] [,3] [,4] [,5]
[1,] 5 1 1 7 7
[2,] 4 7 4 2 7
[3,] 11 19 20 50 30
> N
[,1] [,2]
[1,] 7 1
[2,] 7 7
I want to find the column values in M that should be paired with N to get
[,1] [,2]
7 1
7 7
30 19
I tried the code below. Is there a more efficient way of doing it, ideally without using for loops?
E=numeric()
for (i in 1:2){
for (j in 1:5) {
if (N[1,i]==M[1,j] & N[2,i]==M[2,j]){
E[i]= M[3,j]
}
}
}
E
rbind(N,E)
Well here is your loop re-written
E <- vapply(seq_len(ncol(N)), function(i) M[3, M[1,] == N[1,i] & M[2,] == N[2,i]], numeric(1))
# with
> rbind(N,E)
[,1] [,2]
7 1
7 7
E 30 19
There is only one loop (vapply is a wrapper for a loop), which runs through the columns of N.
Here's a way using multiple calls to apply. We iterate over the columns of M and N to find which column in M matches the first column in N and then which matches the second column in N.
logicals <- apply(M[-3,], # exclude third row
2, # iterate over columns
FUN = function(x)
apply(N, 2, #then iterate over columns of N
FUN = function(y) all(x == y)))
# [,1] [,2] [,3] [,4] [,5]
# [1,] FALSE FALSE FALSE FALSE TRUE
# [2,] FALSE TRUE FALSE FALSE FALSE
M[,apply(logicals, 1, which)]
[,1] [,2]
[1,] 7 1
[2,] 7 7
[3,] 30 19
data
M <- structure(c(5, 4, 11, 1, 7, 19,
1, 4, 20, 7, 2, 50,
7, 7, 30),
.Dim = c(3L, 5L))
N <- structure(c(7, 7, 1, 7), .Dim = c(2L, 2L))
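A loop-free alternative, which neither answer above uses, is to build one string key per column and look the keys up with match. This is only a sketch: it assumes the values paste to unambiguous strings (true for the whole numbers here) and that each column of N occurs in M; key_M, key_N and idx are illustrative names.

```r
M <- matrix(c(5, 4, 11,  1, 7, 19,  1, 4, 20,  7, 2, 50,  7, 7, 30), nrow = 3)
N <- matrix(c(7, 7, 1, 7), nrow = 2)

# One string key per column, built from the first two rows.
key_M <- paste(M[1, ], M[2, ], sep = "_")
key_N <- paste(N[1, ], N[2, ], sep = "_")

# First matching column of M for each column of N.
idx <- match(key_N, key_M)
idx
# [1] 5 2

rbind(N, E = M[3, idx])
```

If M contained duplicate key columns, match would return only the first of them.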
I am trying to map matching columns between 2 matrices. For simplicity, I have 2 simple matrices, a and b:
a <- matrix(c(1, 2), nrow = 2, ncol = 2)
b <- matrix(c(1,2,1,2,3:8), nrow = 2, ncol = 5)
> a
[,1] [,2]
[1,] 1 1
[2,] 2 2
> b
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 3 5 7
[2,] 2 2 4 6 8
I want to create a vector of length length(a[, 1]) = 2, ie
> out
[1] 1 2
Where the first element of out is the column number in b that matches the first column of a, and the second element of out is the column number in b that matches the second column in a. I have tried
> match(data.frame(a), data.frame(b))
[1] 1 1
but I need each element of the resulting vector to be unique. There is probably a simple solution, but I am not seeing it. Thanks!
Maybe you are looking for something like intersect.
a <- matrix(c(10, 20), nrow = 2, ncol = 2)
b <- matrix(c(10,20,1,2,3:6,10,20), nrow = 2, ncol = 5)
#> b
# [,1] [,2] [,3] [,4] [,5]
#[1,] 10 1 3 5 10
#[2,] 20 2 4 6 20
#Finding matching columns in b from a. Only 1st column of a is considered
matched <- b[,1:ncol(b)] == a[,1:1]
#> matched
# [,1] [,2] [,3] [,4] [,5]
#[1,] TRUE FALSE FALSE FALSE TRUE
#[2,] TRUE FALSE FALSE FALSE TRUE
desired <- which(matched[1,], arr.ind = TRUE)
#> desired
#[1] 1 5
Columns 1 and 5 are returned as the matches.
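I guess I'm not allowed to comment on here. Anyhoo... the above answer by MKR looks good, but I would add this line before creating the "desired" object. This is to ensure every element of a column matches (instead of testing the first row only). Note that matched is then a plain logical vector, so "desired" should be built with which(matched) rather than by indexing matched[1,].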
matched<-sapply(1:ncol(matched),function(x) all(matched[,x]))
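Putting the two answers together, a minimal sketch of the full-column test could look like this. It uses the example data from the first answer; full_match is an illustrative name, and apply(matched, 2, all) stands in for the sapply call above.

```r
a <- matrix(c(10, 20), nrow = 2, ncol = 2)
b <- matrix(c(10, 20, 1, 2, 3:6, 10, 20), nrow = 2, ncol = 5)

# a[, 1] is recycled down each column of b because its length equals nrow(b).
matched <- b == a[, 1]

# Keep a column only if every row matched.
full_match <- apply(matched, 2, all)
desired <- which(full_match)
desired
# [1] 1 5
```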
I would like to quickly find the top k values in each row of a matrix and set every value that is not among the top k to zero. I currently use the solution below. Can somebody improve it? It is not very fast when the matrix has many rows.
Thanks.
mat <- matrix(c(5, 1, 6, 4, 9, 1, 8, 9, 10), nrow = 3, byrow = TRUE)
sortedMat <- t(apply(mat, 1, function(x) sort(x, decreasing = TRUE, method = "quick")))
topK <- 2
sortedMat <- sortedMat[, 1:topK, drop = FALSE]
lmat <- mat
for (i in 1:nrow(mat)) {
lmat[i, ] <- mat[i, ] %in% sortedMat[i, ]
}
kMat <- mat * lmat
> mat
[,1] [,2] [,3]
[1,] 5 1 6
[2,] 4 9 1
[3,] 8 9 10
> kMat
[,1] [,2] [,3]
[1,] 5 0 6
[2,] 4 9 0
[3,] 0 9 10
In Rfast, the command sort_mat sorts the columns of a matrix, colOrder gives the order for each column, colRanks gives the ranks for each column, and colnth gives the nth value for each column. I believe at least one of them will suit your needs.
You could use rank to speed this up. In case there are ties, you would have to decide on a method to break these (e.g. ties.method = "random").
kmat <- function(mat, k){
mat[t(apply(mat, 1, rank)) <= (ncol(mat)-k)] <- 0
mat
}
kmat(mat, 2)
## [,1] [,2] [,3]
## [1,] 5 0 6
## [2,] 4 9 0
## [3,] 0 9 10
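To make the remark about ties concrete, here is a sketch of the same rank-based idea with the tie rule exposed as an argument. top_k_rows and its ties parameter are hypothetical names introduced for illustration; the value is passed straight to rank's ties.method.

```r
# Hypothetical helper: same idea as kmat above, with the tie rule exposed.
top_k_rows <- function(mat, k, ties = "first") {
  ranks <- t(apply(mat, 1, rank, ties.method = ties))  # rank within each row
  mat[ranks <= ncol(mat) - k] <- 0                     # zero everything below the top k
  mat
}

mat <- matrix(c(5, 1, 6, 4, 9, 1, 8, 9, 10), nrow = 3, byrow = TRUE)
top_k_rows(mat, 2)  # no ties here, so this matches kmat(mat, 2)

tied <- matrix(c(9, 9, 1), nrow = 1)
top_k_rows(tied, 1, ties = "average")  # both 9s survive: 9 9 0
top_k_rows(tied, 1, ties = "first")    # exactly one 9 survives: 0 9 0
```

With the default "average" rule, tied values share a rank, so a row can keep more than k entries; "first" or "random" guarantees exactly k.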