I am trying to improve my code, but there is one task where I am not able to vectorize a loop. I have 5 sparse matrices that I loop through; for each non-zero element, I add a 1x3 vector of that value to another matrix. In this example I work with only 1 sparse matrix to keep it simple.
m <- matrix(0, nrow = 3, ncol = 3)
m[1, 2] <- 5
m[3, 3] <- 8
n <- matrix(0, nrow=3, ncol=5)
for (i in 1:nrow(m)){
for (j in 1:ncol(m)){
if (!m[i,j]==0){
n[i, j:(j+2)] <- n[i, j:(j+2)] + rep(m[i, j], 3)
}
}
}
The code works, but I have the feeling that there are much better solutions with vectorized functions. I tried and failed with the apply functions, as the new matrix n has different dimensions than m.
Would be great to get some ideas here.
Here's a vectorized base R solution:
First, create the sequence of indices where you want your values to be inserted. Check ?sequence to understand how this works under the hood (the from and by arguments need R 4.0.0 or later).
Then insert the values:
idx <- sequence(nvec = rep(3, length(m[m != 0])), from = which(m != 0), by = nrow(m))
n[idx] <- rep(m[m != 0], each = 3)
n
[,1] [,2] [,3] [,4] [,5]
[1,] 0 5 5 5 0
[2,] 0 0 0 0 0
[3,] 0 0 8 8 8
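One thing to keep in mind: n[idx] <- ... assigns rather than adds, so it matches the loop only when the 1x3 windows never overlap. If overlaps are possible (say, two neighbouring non-zero entries in one row), one option is to add shifted copies of m instead. This is only a sketch and not part of the answer above; the names width and n_shift are made up for illustration.

```r
m <- matrix(0, nrow = 3, ncol = 3)
m[1, 2] <- 5
m[3, 3] <- 8

width <- 3                                   # length of the vector added per non-zero entry
n_shift <- matrix(0, nrow = nrow(m), ncol = ncol(m) + width - 1)

# Adding m at column offsets 0, 1, 2 is the same as adding m[i, j]
# to n[i, j:(j + 2)] for every cell; zero cells contribute nothing.
for (k in seq_len(width) - 1) {
  cols <- seq_len(ncol(m)) + k
  n_shift[, cols] <- n_shift[, cols] + m
}
n_shift
```

The loop runs over the window width (3 iterations here) rather than over every cell, so it stays cheap even for large matrices.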
I'm not sure this is all that much better, but you could use which() to identify the row and column indices of the non-zero values in m, then use a bit of dplyr code to expand the list of column indices by the required length. Then use those indices and the values to replace the appropriate cells in the matrix.
library(dplyr)
library(tidyr)
m <- matrix(0, nrow = 3, ncol = 3)
m[1, 2] <- 5
m[3, 3] <- 8
n <- matrix(0, nrow=3, ncol=5)
w <- which(m != 0, arr.ind=TRUE)
w <- cbind(w, value=m[w])
w <- as_tibble(w)
w <- w %>%
rowwise() %>%
mutate(col = list(col:(col+2))) %>%
unnest(col)
n[cbind(w$row, w$col)] <- w$value
n
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 5 5 5 0
#> [2,] 0 0 0 0 0
#> [3,] 0 0 8 8 8
Created on 2023-02-10 by the reprex package (v2.0.1)
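The same index expansion can probably be done in base R without dplyr or tidyr. A rough sketch, assuming a fixed window of 3 columns; n_base, rows and cols are names I made up here. Like the version above, it assigns values, so overlapping windows would overwrite rather than add.

```r
m <- matrix(0, nrow = 3, ncol = 3)
m[1, 2] <- 5
m[3, 3] <- 8
n_base <- matrix(0, nrow = 3, ncol = 5)

w <- which(m != 0, arr.ind = TRUE)           # one row per non-zero cell: (row, col)
rows <- rep(w[, "row"], each = 3)            # each row index repeated three times
cols <- rep(w[, "col"], each = 3) + 0:2      # col, col + 1, col + 2 (0:2 is recycled)
n_base[cbind(rows, cols)] <- rep(m[w], each = 3)
n_base
```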
Nested loops can often be solved with mapply() or outer().
#Sample Data
m <- matrix(0, nrow = 3, ncol = 3)
m[1, 2] <- 5
m[3, 3] <- 8
n <- matrix(0, nrow=3, ncol=5)
#Build the function using the <<- operators, which are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned
FN <- function (i, j){
if (m[i, j] != 0){
n[i, j:(j+2)] <<- n[i, j:(j+2)] + rep(m[i, j], 3)
}
}
#Outer() function to iterate, with mapply() to vectorize the if statement
outer(1:nrow(m), 1:ncol(m), FUN=function(x, y) mapply(FN, x, y))
n
# [,1] [,2] [,3] [,4] [,5]
#[1,] 0 5 5 5 0
#[2,] 0 0 0 0 0
#[3,] 0 0 8 8 8
I would like to write one function whose input is a square matrix, and it returns a square matrix whose numbers from the upper right corner down to lower left corner are preserved and other numbers are zero.
For example, suppose A is the following 4x4 matrix (sorry, I do not know how to type the matrix expression):
[1,2,3,4]
[5,6,7,8]
[9,10,11,12]
[13,14,15,16]
How can I write a function in R without any loops to transform the matrix into this?
[0,0,0,4]
[0,0,7,0]
[0,10,0,0]
[13,0,0,0]
This feels like a gymnastics exercise...
xy <- matrix(1:16, ncol = 4, byrow = TRUE)
xy <- apply(xy, MARGIN = 1, rev)
xy[lower.tri(xy)] <- 0
xy[upper.tri(xy)] <- 0
t(apply(xy, MARGIN = 1, rev))
[,1] [,2] [,3] [,4]
[1,] 0 0 0 4
[2,] 0 0 7 0
[3,] 0 10 0 0
[4,] 13 0 0 0
Here is another option.
mat <- matrix(1:16, 4, byrow = TRUE)
idx <- cbind(seq_len(nrow(mat)),
ncol(mat):1)
values <- mat[idx]
mat <- matrix(0, nrow = dim(mat)[1], ncol = dim(mat)[2])
mat[idx] <- values
mat
# [,1] [,2] [,3] [,4]
#[1,] 0 0 0 4
#[2,] 0 0 7 0
#[3,] 0 10 0 0
#[4,] 13 0 0 0
A non-apply solution using some maths to generate the indices, stealing xy from @Roman:
xy <- matrix(1:16, ncol = 4, byrow = TRUE)
ind <- nrow(xy)
xy[setdiff(1:length(xy), seq(ind, by = ind -1, length.out = ind))] <- 0
xy
# [,1] [,2] [,3] [,4]
#[1,] 0 0 0 4
#[2,] 0 0 7 0
#[3,] 0 10 0 0
#[4,] 13 0 0 0
Trying it on a 5 x 5 matrix:
xy <- matrix(1:25, 5, byrow = TRUE)
ind <- nrow(xy)
xy[setdiff(1:length(xy), seq(ind, by = ind -1, length.out = ind))] <- 0
xy
# [,1] [,2] [,3] [,4] [,5]
#[1,] 0 0 0 0 5
#[2,] 0 0 0 9 0
#[3,] 0 0 13 0 0
#[4,] 0 17 0 0 0
#[5,] 21 0 0 0 0
This answer takes a slightly different approach than the other answers. Instead of trying to zero out everything except for the diagonal, we can just build the diagonal by itself:
m <- matrix(rep(0,16), nrow = 4, byrow = TRUE)
for (i in 0:15) {
row <- floor(i / 4)
col <- i %% 4
if (i == 3 + (row*3)) {
m[row+1, col+1] <- i+1
}
}
m
[,1] [,2] [,3] [,4]
[1,] 0 0 0 4
[2,] 0 0 7 0
[3,] 0 10 0 0
[4,] 13 0 0 0
I just thought about a way to reverse the original diag function from base R.
You can see it by just typing diag in the console.
Here is the one change I made in my diag_reverse:
y <- x[((m - 1L):0L * (dim(x)[1L])) + (1L:m)] # m is min(dim(x))
And here's the complete function (I kept all the code except that one line):
diag_reverse <- function (x = 1, nrow, ncol, names = TRUE)
{
if (is.matrix(x)) {
if (nargs() > 1L && (nargs() > 2L || any(names(match.call()) %in%
c("nrow", "ncol"))))
stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix")
if ((m <- min(dim(x))) == 0L)
return(vector(typeof(x), 0L))
y <- x[((m - 1L):0L * (dim(x)[1L])) + (1L:m)] # HERE I made the change
if (names) {
nms <- dimnames(x)
if (is.list(nms) && !any(vapply(nms, is.null, NA)) &&
identical((nm <- nms[[1L]][seq_len(m)]), nms[[2L]][seq_len(m)]))
names(y) <- nm
}
return(y)
}
if (is.array(x) && length(dim(x)) != 1L)
stop("'x' is an array, but not one-dimensional.")
if (missing(x))
n <- nrow
else if (length(x) == 1L && nargs() == 1L) {
n <- as.integer(x)
x <- 1
}
else n <- length(x)
if (!missing(nrow))
n <- nrow
if (missing(ncol))
ncol <- n
.Internal(diag(x, n, ncol))
}
Then we can call it:
m <- matrix(1:16, nrow = 4, ncol = 4, byrow = TRUE)
diag_reverse(m)
#[1] 4 7 10 13
I'll test it on other matrices to see if it always gives the correct answer.
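If only the extraction of the anti-diagonal is needed, matrix indexing might be a shorter route than copying the body of diag. This is a sketch, not a drop-in replacement for diag_reverse: anti_diag is a hypothetical name, and it skips the names handling and the vector-input branches of the function above.

```r
# Hypothetical helper: anti-diagonal of the leading m x m block, where m = min(dim(x))
anti_diag <- function(x) {
  stopifnot(is.matrix(x))
  m <- min(dim(x))
  x[cbind(seq_len(m), rev(seq_len(m)))]      # cells (1, m), (2, m - 1), ..., (m, 1)
}

mat4 <- matrix(1:16, nrow = 4, ncol = 4, byrow = TRUE)
anti_diag(mat4)
#[1]  4  7 10 13
```

It picks the same cells as the changed line in diag_reverse, just addressed by (row, column) pairs instead of linear indices.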
The apply family are really just loops with a bow tie.
Here is a way to do it without apply. It includes some input checking and should work on a square matrix of any size.
off_diag = function(X)
{
if(!is.matrix(X)) stop('Argument is not a matrix')
n <- nrow(X)
if(ncol(X) != n) stop('Matrix is not square')
if(n<2) return(X)
Y <- X * c(0,rep(rep(c(0,1),c(n-2,1)),n),rep(0,n-1))
return(Y)
}
For a numeric matrix without NAs it gives the expected result:
mat <- matrix(1:16, 4, byrow = TRUE)
off_diag(mat)
# [,1] [,2] [,3] [,4]
# [1,] 0 0 0 4
# [2,] 0 0 7 0
# [3,] 0 10 0 0
# [4,] 13 0 0 0
Edit: improvement
I realised my function fails if there are NAs, since NA*0 is NA. It also does not work on character matrices, yet it never checks that the matrix is numeric. So instead I use the same setup to build a logical index vector and assign 0 to the selected cells. Now it can handle numeric matrices, character matrices and NAs:
minor_diag = function(X)
{
if(!is.matrix(X)) stop('Argument is not a matrix')
n <- nrow(X)
if(ncol(X) != n) stop('Matrix is not square')
if(n<2) return(X)
index = c(TRUE,rep(rep(c(TRUE,FALSE),c(n-2,1)),n),rep(TRUE,n-1))
X[index]=0
return(X)
}
mat <- matrix(letters[1:16], 4, byrow = TRUE)
minor_diag(mat)
## [,1] [,2] [,3] [,4]
## [1,] "0" "0" "0" "d"
## [2,] "0" "0" "g" "0"
## [3,] "0" "j" "0" "0"
## [4,] "m" "0" "0" "0"
minor_diag(matrix(NA,2,2))
## [,1] [,2]
## [1,] 0 NA
## [2,] NA 0
A one liner without loops
#setup
n <- 5
A <- matrix(1:(n^2), n)
#solution
diag(diag(A[n:1,]))[n:1,]
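Another loop-free option, as far as I can tell, is to build a mask from row() and col(): a cell (i, j) of an n x n matrix lies on the anti-diagonal exactly when i + j == n + 1. The sketch below assumes a square matrix; keep_anti_diag and n_size are made-up names.

```r
n_size <- 5
A <- matrix(1:(n_size^2), n_size)

keep_anti_diag <- function(X) {               # hypothetical helper name
  stopifnot(is.matrix(X), nrow(X) == ncol(X))
  X[row(X) + col(X) != nrow(X) + 1] <- 0      # zero every cell off the anti-diagonal
  X
}
keep_anti_diag(A)
```

Because it assigns 0 rather than multiplying by a mask, NAs off the anti-diagonal are overwritten instead of propagating.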
I would like to transform an object like this
df <- data.frame(ROW = c(1,3),COLUMN =c(2,3),VALUE = c(10,20))
df
ROW COLUMN VALUE
1 2 10
3 3 20
to a matrix like this
m <-matrix(c(0,10,0,0,0,0,0,0,20),ncol = 3,nrow = 3)
m
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 10 0 0
[3,] 0 0 20
I know that the data frame represents a sparse matrix but I did not find any other question that relates exactly to what I am looking for.
We can use sparseMatrix
library(Matrix)
as.matrix( sparseMatrix(i = df$COLUMN, j= df$ROW, x= df$VALUE))
# [,1] [,2] [,3]
#[1,] 0 0 0
#[2,] 10 0 0
#[3,] 0 0 20
Or create a matrix of 0's and then assign
m1 <- matrix(0, 3, 3)
m1[as.matrix(df[2:1])] <- df$VALUE
Note: Based on the output, the ROW/COLUMN index seems to be reversed
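To make the note about the reversed indices concrete, here is a small sketch that fills the matrix both ways. The names as_labelled and swapped are only for illustration, and the 3 x 3 size is taken from the expected output in the question.

```r
df <- data.frame(ROW = c(1, 3), COLUMN = c(2, 3), VALUE = c(10, 20))

# Indices used as labelled: 10 lands in row 1, column 2
as_labelled <- matrix(0, 3, 3)
as_labelled[cbind(df$ROW, df$COLUMN)] <- df$VALUE

# Indices swapped: 10 lands in row 2, column 1, as in the expected output
swapped <- matrix(0, 3, 3)
swapped[cbind(df$COLUMN, df$ROW)] <- df$VALUE

as_labelled
swapped
```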
We can iterate over the rows of df and fill a matrix according to the row and column indexes included in df:
# initialize
new_mat <- matrix(0, ncol = max(df$ROW), nrow = max(df$COLUMN))
for(i in 1:nrow(df)){
new_mat[df[i,]$COLUMN, df[i,]$ROW] <- df[i,]$VALUE
}
# [,1] [,2] [,3]
# [1,] 0 0 0
# [2,] 10 0 0
# [3,] 0 0 20
As @akrun has noted, it seems as though the row and column indexes are flipped.
Here is a solution with apply
mat <- matrix( 0, ncol = max(df$ROW), nrow = max(df$COLUMN) )
f <- function( x ) { mat[x[1],x[2]] <<- x[3] }
apply( df, 1, f )
The <<- operator applies the value to the externally defined mat.
May need to use as.numeric etc. for data type conversions.
I've written a function to compute a matrix where each column is the corresponding input matrix column minus the column mean.
# compute the deviation matrix
deviation <- function(X) {
one <- rep(1, nrow(X))
n <- ncol(X)
d <- matrix(data = NA, nrow = nrow(X), ncol = ncol(X))
for(i in seq.int(from = 1, to = n)) {
d[,i] <- X[,i] - mean(X[,i], na.rm = TRUE) * one
}
d
}
Could this function be written more idiomatically in R (using functional programming, perhaps)?
Use sweep and colMeans:
sweep(mat, 2, colMeans(mat))
By default, sweep uses subtraction (FUN = "-"), so this subtracts the column means computed by colMeans from the values in each column (MARGIN = 2). It gives the same result:
mat <- matrix(1:12,nrow=3)
deviation(mat)
# [,1] [,2] [,3] [,4]
#[1,] -1 -1 -1 -1
#[2,] 0 0 0 0
#[3,] 1 1 1 1
sweep(mat, 2, colMeans(mat))
# [,1] [,2] [,3] [,4]
#[1,] -1 -1 -1 -1
#[2,] 0 0 0 0
#[3,] 1 1 1 1
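One difference worth checking: the original deviation() uses mean(..., na.rm = TRUE), while colMeans(mat) returns NA for any column containing an NA. Passing na.rm = TRUE to colMeans should bring the two in line. A small sketch; deviation_sweep and mat_na are made-up names.

```r
deviation_sweep <- function(X) {              # hypothetical name
  sweep(X, 2, colMeans(X, na.rm = TRUE))      # subtract each column's mean, ignoring NAs
}

mat_na <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)
deviation_sweep(mat_na)
```

scale(X, center = TRUE, scale = FALSE) should give the same values, though it also attaches a "scaled:center" attribute to the result.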
I would like to quickly find the top k values in each row of a matrix and set all other values to zero. Currently I have the following solution. Can somebody improve it? It is not very fast when the matrix has many rows.
thanks.
mat <- matrix(c(5, 1, 6, 4, 9, 1, 8, 9, 10), nrow = 3, byrow = TRUE)
sortedMat <- t(apply(mat, 1, function(x) sort(x, decreasing = TRUE, method = "quick")))
topK <- 2
sortedMat <- sortedMat[, 1:topK, drop = FALSE]
lmat <- mat
for (i in 1:nrow(mat)) {
lmat[i, ] <- mat[i, ] %in% sortedMat[i, ]
}
kMat <- mat * lmat
> mat
[,1] [,2] [,3]
[1,] 5 1 6
[2,] 4 9 1
[3,] 8 9 10
> kMat
[,1] [,2] [,3]
[1,] 5 0 6
[2,] 4 9 0
[3,] 0 9 10
In Rfast, sort_mat sorts the columns of a matrix, colOrder gives the order for each column, colRanks gives the ranks for each column, and colnth gives the nth value for each column. I believe at least one of them will suit you.
You could use rank to speed this up. In case there are ties, you would have to decide on a method to break these (e.g. ties.method = "random").
kmat <- function(mat, k){
mat[t(apply(mat, 1, rank)) <= (ncol(mat)-k)] <- 0
mat
}
kmat(mat, 2)
## [,1] [,2] [,3]
## [1,] 5 0 6
## [2,] 4 9 0
## [3,] 0 9 10
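To illustrate the point about ties: extra arguments given to apply are passed on to rank, so the tie-breaking rule can be set there. This is a sketch of one possible variant; kmat_ties and the matrix tied are made up for the example.

```r
kmat_ties <- function(mat, k, ties.method = "first") {   # hypothetical variant of kmat
  r <- t(apply(mat, 1, rank, ties.method = ties.method)) # per-row ranks, 1 = smallest
  mat[r <= ncol(mat) - k] <- 0
  mat
}

tied <- matrix(c(9, 9, 1,
                 5, 1, 6), nrow = 2, byrow = TRUE)
kmat_ties(tied, 1)                           # keeps exactly one value per row
kmat_ties(tied, 1, ties.method = "average")  # keeps both 9s in the first row
```

With "first", each row keeps exactly k values; with the default "average", tied values at the cutoff can all survive, so a row may keep more than k.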
I have generated a matrix using the following for loop.
Now I am trying to generate the same matrix using a while loop, but I don't know how to do so.
Can anyone help with this? Thank you so much.
i<- 1
j<- 1
m1<- matrix(0, nrow=5, ncol=5)
for(i in 1:5) {
for(j in i:5) {
m1[i,j]<- (j-i)*2+1
}
}
m1
i <- 1
j <- 1
m1 <- matrix(0, nrow = 5, ncol = 5)
while (i <= 5) {
while (j <= 5) {
m1[i, j] <- (j - i) * 2 + 1
j <- j + 1
}
i <- i + 1
j <- i
}
m1
What about this:
> m <- matrix(0, nrow=5, ncol=5)
> mu <- upper.tri(m)
> m[mu] <- apply(which(mu, arr.ind=TRUE), 1, function(x) (x[2] - x[1]) * 2 + 1)
> diag(m) <- 1
> m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 0 1 3 5 7
[3,] 0 0 1 3 5
[4,] 0 0 0 1 3
[5,] 0 0 0 0 1
Edit: sorry, I just realized that you do not want to vectorize the task but to rewrite the code using while instead of for. Although I have no idea why :)
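For anyone who does want the vectorized version, row() and col() can stand in for the loop counters i and j, which avoids apply as well. A minimal sketch; m_vec is a made-up name.

```r
m_vec <- matrix(0, nrow = 5, ncol = 5)
m_vec <- (col(m_vec) - row(m_vec)) * 2 + 1   # (j - i) * 2 + 1 for every cell
m_vec[lower.tri(m_vec)] <- 0                 # keep only j >= i, as in the loops
m_vec
```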