How to calculate matrix cumsum by condition in R?

Let's say I have a 2x2 matrix like:
1 3
2 4
and I try to find the cumsum up to position [0,1], which would be 1+3 = 4,
or up to [1,0], which equals 1+2 = 3.
So only the values that match the criteria are summed together.
Is there a function/method for this?

Are you looking for the sum of a leading block of a matrix? This is most straightforward if you work with a numeric index. With a character index (i.e., row names and column names), we can use match to find the numeric index before taking the sum.
mat <- matrix(1:4, 2, 2, dimnames = list(0:1, 0:1))
rn <- "0"; cn <- "1"
sum(mat[1:match(rn, rownames(mat)), 1:match(cn, colnames(mat))])
#[1] 4
rn <- "1"; cn <- "0"
sum(mat[1:match(rn, rownames(mat)), 1:match(cn, colnames(mat))])
#[1] 3
Could you maybe explain to me why this code works?
In general, you can extract a block of a matrix mat, between rows i1 ~ i2 and columns j1 ~ j2 using mat[i1:i2, j1:j2]. A leading block means that the starting row and column are i1 = 1 and j1 = 1. In your case, the terminating row and column are to be determined by names, so I do match to first find the right i2 and j2.
I could sort of see your motivation. This is like selecting a region in an Excel sheet. :)
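The leading-block idea above can be wrapped in a small function. This is only a sketch: the name leading_block_sum and its error check are made up for illustration and are not part of the answer.

```r
# Hypothetical helper: sum the leading block of `mat` that ends at
# the row named `rn` and the column named `cn`.
leading_block_sum <- function(mat, rn, cn) {
  i2 <- match(rn, rownames(mat))  # terminating row index
  j2 <- match(cn, colnames(mat))  # terminating column index
  if (is.na(i2) || is.na(j2)) stop("row or column name not found")
  # drop = FALSE keeps a matrix even for a 1 x 1 block
  sum(mat[seq_len(i2), seq_len(j2), drop = FALSE])
}

mat <- matrix(1:4, 2, 2, dimnames = list(0:1, 0:1))
leading_block_sum(mat, "0", "1")  # 4
leading_block_sum(mat, "1", "0")  # 3
```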

Another possibility:
m <- matrix(1:4,nrow=2)
m
#> [,1] [,2]
#> [1,] 1 3
#> [2,] 2 4
pos <- c(1,0)
pos <- pos + 1
sum(m[1:pos[1],1:pos[2]])
#> [1] 3
pos <- c(0,1)
pos <- pos + 1
sum(m[1:pos[1],1:pos[2]])
#> [1] 4
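If the sums for many positions are needed, one option is to compute every leading-block sum once and then look positions up. The sketch below assumes a matrix with at least two rows and two columns; the names col_sums and S are illustrative only.

```r
m <- matrix(1:4, nrow = 2)

col_sums <- apply(m, 2, cumsum)     # cumulative sums down each column
S <- t(apply(col_sums, 1, cumsum))  # then cumulative sums across each row
# S[i, j] now holds sum(m[1:i, 1:j])

pos <- c(0, 1) + 1                  # zero-based position [0,1]
S[pos[1], pos[2]]                   # 4
pos <- c(1, 0) + 1                  # zero-based position [1,0]
S[pos[1], pos[2]]                   # 3
```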

Take the cumsum of the first column of the matrix as is (via I), then that of its transpose (via t), and keep the last value of each.
lapply(list(I, t), \(f) {r <- unname(cumsum(f(m)[, 1])); r[length(r)]})
# [[1]]
# [1] 3
#
# [[2]]
# [1] 4
Data:
m <- matrix(c(1, 2, 3, 4), 2, 2)

Related

How to write an apply() function that only applies to odd-numbered columns in r matrix?

Suppose we have a "test" matrix that looks like this: (1,2,3, 4,5,6, 7,8,9, 10,11,12) generated by running test <- matrix(1:12, ncol = 4). A simple 3 x 4 (rows x columns) matrix of numbers running from 1 to 12.
Now suppose we'd like to add a value of 1 to each element in each odd-numbered matrix column, so we end up with a matrix of the following values: (2,3,4, 4,5,6, 8,9,10, 10,11,12). How would we use an apply() function to do this?
Note that this is a simplified example. In the more complete code I'm working with, the matrix dynamically expands/contracts based on user inputs, so I need an apply() function that counts the actual number of matrix columns rather than using a fixed assumption of 4 columns as in the example above. (And I'm not adding a value of 1 to the elements; I'm running the parallel minima function test[,1] <- pmin(test[,1], 5) to, say, limit each value to a max of 5.)
With my current limited understanding of the apply() family of functions, all I can so far do is apply(test, 2, function(x) {return(x+1)}) but this is adding a value of 1 to all elements in all columns rather than only the odd-numbered columns.
You may simply subset the input data frame to access only odd or even numbered columns. Consider:
test[c(TRUE, FALSE)] <- apply(test[c(TRUE, FALSE)], 2, function(x) f(x))
test[c(FALSE, TRUE)] <- apply(test[c(FALSE, TRUE)], 2, function(x) f(x))
This works because the recycling rules in R will cause e.g. c(TRUE, FALSE) to be repeated however many times is needed to cover all columns in the input test data frame.
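A runnable version of the data frame case might look like this. It is a sketch under assumptions: f is a stand-in function, and lapply() is used in place of apply() so that each selected column is replaced by a plain vector.

```r
df <- as.data.frame(matrix(1:12, ncol = 4))  # columns V1..V4
f <- function(x) x + 1                       # hypothetical stand-in for the real function

# c(TRUE, FALSE) is recycled across the columns: TRUE FALSE TRUE FALSE
df[c(TRUE, FALSE)] <- lapply(df[c(TRUE, FALSE)], f)
df
```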
For a matrix, we need to use the drop=FALSE flag when subsetting the matrix in order to keep it in matrix form when using apply():
test <- matrix(1:12, ncol = 4)
test[,c(TRUE, FALSE)] <- apply(test[,c(TRUE, FALSE),drop=FALSE], 2, function(x) x+1)
test
     [,1] [,2] [,3] [,4]
[1,]    2    4    8   10
[2,]    3    5    9   11
[3,]    4    6   10   12
        ^         ^   ... these columns incremented by 1
You may use modulo %% 2.
odd <- seq_len(ncol(test)) %% 2 == 1
# drop = FALSE keeps a matrix even when only one column is odd-numbered
test[, odd] <- apply(test[, odd, drop = FALSE], 2, function(x) x + 1)
test
#      [,1] [,2] [,3] [,4]
# [1,]    2    4    8   10
# [2,]    3    5    9   11
# [3,]    4    6   10   12
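For the capping mentioned in the question, apply() may not be needed at all, since pmin() is already vectorized. The following is a sketch of that alternative, not the approach from the answers above; the cap of 5 comes from the question.

```r
test <- matrix(1:12, ncol = 4)

# Odd-numbered columns, computed from the actual number of columns
odd <- seq_len(ncol(test)) %% 2 == 1

# Cap every value in the odd-numbered columns at 5 in one vectorized call
test[, odd] <- pmin(test[, odd, drop = FALSE], 5)
test
```

Here columns 1 and 3 become 1, 2, 3 and 5, 5, 5, while columns 2 and 4 are left unchanged.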

Create vector p for a sparse Matrix from a matrix of positions

Let's say I want to create a sparse matrix SMatrix where all non-zero values are 1.
I already have a matrix of positions, where column 1 stores row index and column 2 stores col index:
vec1 <- c(10,1)
vec2 <- c(12,1)
vec3 <- c(2,3)
positions <- matrix(c(vec1, vec2, vec3),
ncol=2,
dimnames = list(NULL, c("row", "col")),
byrow = T)
positions
row col
[1,] 10 1
[2,] 12 1
[3,] 2 3
I can create the vectors x and i, which will be the equivalent of SMatrix@x and SMatrix@i, like this:
x <- rep(1, nrow(positions))
i <- positions[order(positions[,2]),1] - 1
But how can I create the vector p, which should be the equivalent of SMatrix@p?
You can use Matrix::sparseMatrix to get the compressed, or pointer, representation of the row or column indices.
Matrix::sparseMatrix(positions[,1], positions[,2], x=1)@p
#[1] 0 2 2 3
or use diffinv like:
diffinv(c(table(factor(positions[,2], seq_len(max(positions[,2]))))))
#[1] 0 2 2 3
This does the opposite of the following, which the manual gives for expanding p into row or column indices:
dp <- diff(p)
rep(seq_along(dp),dp)
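The relationship between p and the column indices can be checked in base R alone. This sketch uses tabulate() and cumsum() instead of the diffinv() call above; the names counts, ord and cols are illustrative. It also orders the row indices by row within each column, which is an assumption about how the slot should be laid out.

```r
positions <- matrix(c(10, 1, 12, 1, 2, 3), ncol = 2, byrow = TRUE,
                    dimnames = list(NULL, c("row", "col")))

# Number of non-zero entries in each column, including empty columns
counts <- tabulate(positions[, "col"], nbins = max(positions[, "col"]))  # 2 0 1

# Column pointers: a leading 0 followed by the running total of counts
p <- c(0L, cumsum(counts))           # 0 2 2 3

# Zero-based row indices, ordered by column and then by row
ord <- order(positions[, "col"], positions[, "row"])
i <- positions[ord, "row"] - 1       # 9 11 1

# Round trip: expanding p recovers the sorted column indices
dp <- diff(p)
cols <- rep(seq_along(dp), dp)       # 1 1 3
```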

How to locate elements that meet a condition in one matrix to identify elements in a second matrix

I want to identify the positions of elements in one matrix that meet a condition to then apply those positions to another matrix and find the means of those.
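my_vector_1<-c(1,2,1,4,1,1,7,8,9) # last value was missing in the original; 9 is an assumed placeholder (any value other than 1 gives the same result)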
my_matrix_1<-matrix(data=my_vector_1, nrow=3, ncol=3)
my_vector_2<-c(2,4,6,8,10,11,12,13,14)
my_matrix_2<-matrix(data=my_vector_2, nrow=3, ncol=3)
First locate the positions of my_matrix_1==1 in the first matrix to find...
[1,1]
[2,2]
[3,1]
[3,2]
Then find the mean of the elements in the second matrix that are in the positions identified above...
7.25 #mean of 2, 10, 6, 11 in my_matrix_2
You could subset my_matrix_2 where my_matrix_1 has value 1 and take mean of those values.
mean(my_matrix_2[my_matrix_1 == 1])
#[1] 7.25
We can use arr.ind to find the row/column position
ind <- which(my_matrix_1 == 1, arr.ind = TRUE)
ind
# row col
#[1,] 1 1
#[2,] 3 1
#[3,] 2 2
#[4,] 3 2
mean(my_matrix_2[ind])
#7.25
Another way to do this would be
mean(my_matrix_2 * NA^(my_matrix_1 != 1), na.rm = TRUE)
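The three approaches above can be compared side by side. In this sketch the names m1, m2, mask and w are illustrative, and the last value of m1 (9) is an assumption, since the question's vector has only eight values.

```r
m1 <- matrix(c(1, 2, 1, 4, 1, 1, 7, 8, 9), nrow = 3)  # final 9 is assumed
m2 <- matrix(c(2, 4, 6, 8, 10, 11, 12, 13, 14), nrow = 3)

mask <- m1 == 1                      # logical matrix, TRUE where m1 is 1

# 1) Logical subsetting
a <- mean(m2[mask])

# 2) Row/column positions from which()
b <- mean(m2[which(mask, arr.ind = TRUE)])

# 3) NA^0 is 1 and NA^1 is NA, so non-matching cells turn into NA
w <- NA^(!mask)
d <- mean(m2 * w, na.rm = TRUE)

c(a, b, d)                           # 7.25 7.25 7.25
```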

r: how to partition a list or vector into pairs at an offset of 1

Sorry for the elementary question, but I need to partition a list of numbers at an offset of 1.
e.g.,
I have a list like:
c(194187, 193668, 192892, 192802 ..)
and need a list of lists like:
c(c(194187, 193668), c(193668, 192892), c(192892, 192802)...)
where the last element of list n is the first of list n+1. There must be a way to do this with
split()
but I can't figure it out.
In Mathematica, the command I need is Partition[list,2,1].
You can try this, using the zoo library:
library(zoo)
x <- 1:10 # Vector of 10 numbers
m <- rollapply(data = x, 2, by=1, c) # Creates a matrix with n-1 rows, each row holding one pair
l <- split(m, row(m)) # Splits the matrix into a list with one element per row
Output:
> l
$`1`
[1] 1 2
$`2`
[1] 2 3
$`3`
[1] 3 4
Here is an option using base R to create a vector of elements
v1 <- rbind(x[-length(x)], x[-1])
c(v1)
#[1] 194187 193668 193668 192892 192892 192802
If we need a list
split(v1, col(v1))
data
x <- c(194187, 193668, 192892, 192802);
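Two further base R variants, shown as a sketch: Map() pairs each element with its successor, and embed() builds the same pairs as matrix rows. The names pairs and pair_mat are illustrative.

```r
x <- c(194187, 193668, 192892, 192802)

# Pair each element with the one that follows it
pairs <- Map(c, head(x, -1), tail(x, -1))
pairs[[1]]  # 194187 193668
pairs[[3]]  # 192892 192802

# embed() puts the later value first, so reverse the columns
pair_mat <- embed(x, 2)[, 2:1]
pair_mat[2, ]  # 193668 192892
```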

subsetting matrix while preserving row.names

I'm trying to subset a matrix so that I only get the matrix where the first variable is larger than the second variable. I have the matrix out which is a 3000x2 matrix.
I tried
out<-out[out[,1] > out[,2]]
but this eliminates the row.names altogether, and I get a string of integers between 1 to 3000. Would there be a way to preserve the row.names?
Note that if the subset is a single row, R drops the result to a plain vector and the row name is lost:
m <- matrix(1:9, ncol = 3)
rownames(m) <- c("a", "b", "c")
m[1, ] # lost the row name
m[1, , drop = FALSE] # got row name back and a matrix
m[c(1,1), ] # the row name is back when result has nrow > 1
Apart from passing drop = FALSE, there appears to be no simple workaround other than checking for a one-row result and reassigning the row name.
A matrix is treated by R as a vector with columns and rows.
> A <- matrix(1:9, ncol=3)
# A is filled with 1,...,9 columnwise
> A
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
# only elements with even number in 2nd column of same row
> v <- A[A[,2] %% 2 == 0]
> m <- A[A[,2] %% 2 == 0,]
> v
[1] 1 3 4 6 7 9
> m
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 3 6 9
# The result of evaluating odd/even-ness of middle column.
# This boolean vector is repeated column-wise by default
# until all element's fate in A is determined.
> A[,2] %% 2 == 0
[1] TRUE FALSE TRUE
When you leave out the comma (v), you address A as a 1-dimensional data structure and R implicitly handles your expression as a vector.
In that sense, v is not a "string of integers" but a vector of integers. When you add the comma, you tell R that your condition only addresses the first dimension while indicating a second one (after the comma), which causes R to handle your expression as a matrix (m).
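Applied to the question, the fix is to add the comma, and drop = FALSE if a single matching row should stay a matrix. The small matrix below is made-up data standing in for the 3000x2 matrix out.

```r
# Made-up stand-in for the asker's matrix, with row names
out <- matrix(c(5, 1, 7, 2, 3, 9), ncol = 2,
              dimnames = list(c("a", "b", "c"), NULL))

# Without the comma the matrix is indexed as a vector: names are lost
wrong <- out[out[, 1] > out[, 2]]

# With the comma, whole rows are selected; drop = FALSE keeps the
# matrix shape and the row name even though only row "a" matches
kept <- out[out[, 1] > out[, 2], , drop = FALSE]
rownames(kept)  # "a"
```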
