How to use R to do this matrix calculation?

I have a matrix calculation that I need to carry out in R.

z <- matrix(1:9,nrow = 3,byrow = TRUE)
z
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
t(z)%*%z
[,1] [,2] [,3]
[1,] 66 78 90
[2,] 78 93 108
[3,] 90 108 126

Try this (with matrix multiplication):
library(optimbase)
Z <- matrix(1:9, byrow = TRUE, nrow = 3)
A <- transpose(Z[1, ]) %*% Z[1, ] + transpose(Z[2, ]) %*% Z[2, ] + transpose(Z[3, ]) %*% Z[3, ]
A # output
[,1] [,2] [,3]
[1,] 66 78 90
[2,] 78 93 108
[3,] 90 108 126
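The same sum of outer products can also be written in base R, without the optimbase dependency. A minimal sketch, assuming the same 3 x 3 matrix Z as above; outer() and Reduce() are base functions, and the name parts is only illustrative:

```r
Z <- matrix(1:9, byrow = TRUE, nrow = 3)

# One outer product (column vector times row vector) per row of Z
parts <- lapply(seq_len(nrow(Z)), function(i) outer(Z[i, ], Z[i, ]))

# Add the pieces together
A <- Reduce(`+`, parts)

# Should agree with the direct matrix product
all.equal(A, t(Z) %*% Z)
```

Because the row index comes from seq_len(nrow(Z)), this should work for any number of rows, not only three.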

Try this:
> z <- matrix(c(1,2,3,4,5,6,7,8,9), 3, byrow=TRUE)
> z
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
> z[1,]
[1] 1 2 3
> matrix(z[1,])
[,1]
[1,] 1
[2,] 2
[3,] 3
Matrix-Multiplication (Column-Vector * Row-Vector):
> matrix(z[1,]) %*% z[1,] + matrix(z[2,]) %*% z[2,] + matrix(z[3,]) %*% z[3,]
[,1] [,2] [,3]
[1,] 66 78 90
[2,] 78 93 108
[3,] 90 108 126
See:
http://www.philender.com/courses/multivariate/notes/matr.html
The result is not the same when you calculate Row-Vector * Column-Vector:
> z[1,] %*% matrix(z[1,]) + z[2,] %*% matrix(z[2,]) + z[3,] %*% matrix(z[3,])
[,1]
[1,] 285
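The scalar 285 is the sum of the squared entries of z, which is also the trace (sum of the diagonal) of t(z) %*% z. A small check, offered as a sketch using only base R:

```r
z <- matrix(1:9, nrow = 3, byrow = TRUE)

# Sum of all squared entries
total_sq <- sum(z^2)

# Trace of t(z) %*% z
trace_cp <- sum(diag(crossprod(z)))

c(total_sq, trace_cp)  # both are 285
```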

Try this:
z<-matrix(seq(1:9),ncol=3,nrow=3,byrow=TRUE)
A=z[1,]*t(z[1,])+z[2,]*t(z[2,])+z[3,]*t(z[3,])
A
# [,1] [,2] [,3]
# [1,] 66 93 126
Edit1: The following code works for any dimension as long as it's a square matrix:
colSums(t(apply(z,1,function(x)(x[1:nrow(z)]*t(x[1:nrow(z)])))))
#[1] 66 93 126
Edit2:
t(z)%*%z #the following 3 lines will all give you the same thing!
crossprod(z)
matrix(colSums(t(apply(z,1,
function(x)(matrix(x[1:nrow(z)])%*% x[1:nrow(z)])))),ncol=nrow(z),byrow = TRUE)
# [,1] [,2] [,3]
#[1,] 66 78 90
#[2,] 78 93 108
#[3,] 90 108 126
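The vector from Edit1 is the diagonal of t(z) %*% z, that is, the column sums of the squared entries. A short sketch, assuming only base R, that does not require a square matrix; the matrix w is a made-up example:

```r
z <- matrix(1:9, nrow = 3, byrow = TRUE)

# Column sums of squares: the diagonal of crossprod(z)
d_z <- colSums(z^2)

# A non-square (2 x 3) example
w <- matrix(1:6, nrow = 2)
d_w <- colSums(w^2)

all.equal(d_z, diag(crossprod(z)))
all.equal(d_w, diag(crossprod(w)))
```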

Related

Applying a function to sub-matrices within a larger matrix

So I have a 1256 by 5 matrix.
> head(retmatx12.30.3)
AMT HON KO
[1,] -0.006673489 -0.001292867 -0.0033654493
[2,] 0.004447249 0.002848406 0.0082009877
[3,] 0.001789891 0.002754232 -0.0035886573
[4,] -0.003479321 0.002231823 0.0024011113
[5,] -0.006605786 0.015159190 -0.0002394852
[6,] -0.002375004 -0.008267790 -0.0100625938
NEM NVAX
[1,] -0.034023392 -0.023255737
[2,] 0.016436786 0.007936468
[3,] 0.009529404 0.031496102
[4,] 0.046052588 0.007633549
[5,] -0.031446425 0.037878788
[6,] -0.001694084 0.036496350
I want to apply a function I've made to rows 1-126, then 2-127, and so on. The function is a block of matrix algebra that uses a matrix and a few vectors. Is it wise to somehow break the larger matrix into 1,131 126-by-5 matrices and apply the function over each (hopefully all at once)? Or is there some way to do this with apply?
Any help is greatly appreciated. Thanks
The actual numbers in the matrix are immaterial, so I'll use much smaller data to demonstrate one method, and a simple function to demonstrate the rolling calculation:
m <- matrix(1:24, nrow=8)
somefunc <- function(x) x %*% seq(ncol(x))
wid <- 4 # 126
somefunc(m[1:4,])
# [,1]
# [1,] 70
# [2,] 76
# [3,] 82
# [4,] 88
somefunc(m[2:5,])
# [,1]
# [1,] 76
# [2,] 82
# [3,] 88
# [4,] 94
The actual rolling work:
lapply(seq(nrow(m) - wid + 1), function(i) somefunc(m[i - 1 + seq(wid),]))
# [[1]]
# [,1]
# [1,] 70
# [2,] 76
# [3,] 82
# [4,] 88
# [[2]]
# [,1]
# [1,] 76
# [2,] 82
# [3,] 88
# [4,] 94
# [[3]]
# [,1]
# [1,] 82
# [2,] 88
# [3,] 94
# [4,] 100
# [[4]]
# [,1]
# [1,] 88
# [2,] 94
# [3,] 100
# [4,] 106
# [[5]]
# [,1]
# [1,] 94
# [2,] 100
# [3,] 106
# [4,] 112
where the first element of the output is from rows 1-4, then 2-5, then 3-6, etc.
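The rolling step can be wrapped in a small helper so the window width and the function are arguments. This is a sketch under the same assumptions as the answer; roll_apply is a hypothetical name, not an existing function:

```r
# Apply f to every block of `wid` consecutive rows of m
roll_apply <- function(m, wid, f) {
  starts <- seq_len(nrow(m) - wid + 1)
  # drop = FALSE keeps each window a matrix even when wid is 1
  lapply(starts, function(i) f(m[i:(i + wid - 1), , drop = FALSE]))
}

m <- matrix(1:24, nrow = 8)
somefunc <- function(x) x %*% seq(ncol(x))

res <- roll_apply(m, 4, somefunc)
length(res)  # number of windows: 8 - 4 + 1 = 5
```

For the 1256-row matrix in the question, roll_apply(retmatx12.30.3, 126, yourfunc) would produce 1256 - 126 + 1 = 1131 results, where yourfunc stands for the asker's own function.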

Extract submatrix from matrix

I created a 10x10 matrix in R (10 rows and 10 columns):
matriz <- matrix(1:100, nrow = 10, ncol = 10, byrow=T)
I want to extract square submatrices (3x3) from the matrix (matriz), randomly and without overlap.
I saw a package in R named "subset.matrix", but I couldn't use it to select submatrices at random.
Any suggestions?
You can define the following function f
f <- function(mat, submat.size = 3) {
ridx <- Filter(function(x) length(x) == submat.size, split(sample(seq(nrow(mat))), ceiling(seq(nrow(mat)) / submat.size)))
cidx <- Filter(function(x) length(x) == submat.size, split(sample(seq(ncol(mat))), ceiling(seq(ncol(mat)) / submat.size)))
replicate(2, mat[ridx[[sample(length(ridx), 1)]], cidx[[sample(length(cidx), 1)]]], simplify = FALSE)
}
and this function generates a pair of random sub-matrices. The two draws are independent, so they usually do not overlap, but the same block can occasionally be drawn twice.
Example Result
> f(matriz)
[[1]]
[,1] [,2] [,3]
[1,] 68 67 70
[2,] 38 37 40
[3,] 88 87 90
[[2]]
[,1] [,2] [,3]
[1,] 63 62 69
[2,] 33 32 39
[3,] 83 82 89
If you want all possible exclusive random sub-matrices each time, you can try
f2 <- function(mat, submat.size = 3) {
ridx <- Filter(function(x) length(x) == submat.size, split(sample(seq(nrow(mat))), ceiling(seq(nrow(mat)) / submat.size)))
cidx <- Filter(function(x) length(x) == submat.size, split(sample(seq(ncol(mat))), ceiling(seq(ncol(mat)) / submat.size)))
r <- list()
for (i in seq_along(ridx)) {
for (j in seq_along(cidx)) {
r[[length(r) + 1]] <- mat[ridx[[i]], cidx[[j]]]
}
}
r
}
and you will obtain
> f2(matriz)
[[1]]
[,1] [,2] [,3]
[1,] 3 6 5
[2,] 63 66 65
[3,] 83 86 85
[[2]]
[,1] [,2] [,3]
[1,] 2 8 4
[2,] 62 68 64
[3,] 82 88 84
[[3]]
[,1] [,2] [,3]
[1,] 1 10 7
[2,] 61 70 67
[3,] 81 90 87
[[4]]
[,1] [,2] [,3]
[1,] 13 16 15
[2,] 33 36 35
[3,] 23 26 25
[[5]]
[,1] [,2] [,3]
[1,] 12 18 14
[2,] 32 38 34
[3,] 22 28 24
[[6]]
[,1] [,2] [,3]
[1,] 11 20 17
[2,] 31 40 37
[3,] 21 30 27
[[7]]
[,1] [,2] [,3]
[1,] 43 46 45
[2,] 53 56 55
[3,] 73 76 75
[[8]]
[,1] [,2] [,3]
[1,] 42 48 44
[2,] 52 58 54
[3,] 72 78 74
[[9]]
[,1] [,2] [,3]
[1,] 41 50 47
[2,] 51 60 57
[3,] 71 80 77
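If contiguous blocks are wanted instead of shuffled rows and columns, one alternative is to lay a fixed grid of 3x3 blocks over the matrix and sample blocks from that grid. This is a different technique from f and f2, shown only as a sketch; the names k, grid and picked are illustrative, and blocks are assumed to start at rows and columns 1, 4 and 7:

```r
set.seed(1)
matriz <- matrix(1:100, nrow = 10, ncol = 10, byrow = TRUE)
k <- 3  # sub-matrix size

# Top-left corners of a grid of non-overlapping k x k blocks
starts_r <- seq(1, nrow(matriz) - k + 1, by = k)
starts_c <- seq(1, ncol(matriz) - k + 1, by = k)
grid <- expand.grid(r0 = starts_r, c0 = starts_c)

# Pick 4 of the 9 blocks at random, without replacement
picked <- grid[sample(nrow(grid), 4), ]

subs <- Map(function(r0, c0) matriz[r0:(r0 + k - 1), c0:(c0 + k - 1)],
            picked$r0, picked$c0)
```

Sampling without replacement from the grid guarantees that no two blocks share a cell, at the cost of never producing blocks that start at other positions.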
I agree with the comment from user2974951 regarding randomness. However, this code block will do what you asked.
matriz <- matrix(1:100, nrow = 10, ncol = 10, byrow=T)
attempts <- 50
# Initialize a list to hold the results
sub_mats <- vector(mode = "list", length = attempts)
# The top left corner of a 3x3 submatrix can't have an index > 8
rand_x <- sample(1:8, attempts, replace = T)
rand_y <- sample(1:8, attempts, replace = T)
for (i in 1:attempts) {
# Get the three-length vectors
x_range <- rand_x[i] : (rand_x[i] + 2)
y_range <- rand_y[i] : (rand_y[i] + 2)
# Subset the matrix
sub_mat <- matriz[x_range, y_range]
# We'll use NAs to mark submatrices from previous loops
if (any(is.na(sub_mat))) next
# If there's no overlap, add it to the list
sub_mats[[i]] <- sub_mat
# Set this submatrix as NAs
matriz[x_range, y_range] <- rep(NA, 9)
}
# Remove failed attempts
sub_mats <- sub_mats[!sapply(sub_mats, is.null)]
Instead of a set number of attempts for the loop, you could use a counter. With 50 attempts, I get 4-6 sub-matrices. 1000 gives 6-8.
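The counter idea could look like the following sketch. It keeps drawing until a wanted number of sub-matrices is found or a cap on tries is reached; wanted, max_tries and the logical mask used are assumptions introduced for illustration, and the mask replaces the NA marking so that matriz itself is left unchanged:

```r
set.seed(42)
matriz <- matrix(1:100, nrow = 10, ncol = 10, byrow = TRUE)
k <- 3            # sub-matrix size
wanted <- 5       # stop once this many are found
max_tries <- 1000 # safety cap so the loop always ends

used <- matrix(FALSE, nrow(matriz), ncol(matriz))  # cells already taken
sub_mats <- list()
tries <- 0

while (length(sub_mats) < wanted && tries < max_tries) {
  tries <- tries + 1
  r0 <- sample(nrow(matriz) - k + 1, 1)  # top-left row
  c0 <- sample(ncol(matriz) - k + 1, 1)  # top-left column
  rows <- r0:(r0 + k - 1)
  cols <- c0:(c0 + k - 1)
  if (any(used[rows, cols])) next        # overlaps an earlier pick
  sub_mats[[length(sub_mats) + 1]] <- matriz[rows, cols]
  used[rows, cols] <- TRUE
}

length(sub_mats)
```

The loop may stop with fewer than wanted sub-matrices if the earlier picks leave no free 3x3 area, which is why the cap on tries is there.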

Apply function on array returning original number of dimensions

Take this example array:
set.seed(1)
rows <- 5
cols <- 4
dept <- 3
a <- array(sample(1:100, rows*cols*dept), dim = c(rows, cols, dept))
returning
> a
, , 1
[,1] [,2] [,3] [,4]
[1,] 68 43 85 73
[2,] 39 14 21 79
[3,] 1 82 54 37
[4,] 34 59 74 83
[5,] 87 51 7 97
, , 2
[,1] [,2] [,3] [,4]
[1,] 44 96 72 99
[2,] 84 42 80 91
[3,] 33 38 40 75
[4,] 35 20 69 6
[5,] 70 28 25 24
, , 3
[,1] [,2] [,3] [,4]
[1,] 32 22 100 50
[2,] 94 92 62 65
[3,] 2 90 23 11
[4,] 45 98 67 17
[5,] 18 64 49 36
For each "dept" dimension, I want to get the sum over the rows, while keeping the original three dimensions of the array. I tried
b <- apply(a, c(2,3), sum)
> b
[,1] [,2] [,3]
[1,] 229 266 191
[2,] 249 224 366
[3,] 241 286 301
[4,] 369 295 179
which gives the correct result but reduces it to a 4 by 3 matrix, since the row dimension is collapsed to 1 and is no longer strictly needed. However, for my calculations it is inconvenient when the interpretation of the dimensions changes every time I perform an operation, so I want to obtain a 1x4x3 array instead:
c <- array(b, dim = c(1, 4, 3))
> c
, , 1
[,1] [,2] [,3] [,4]
[1,] 229 249 241 369
, , 2
[,1] [,2] [,3] [,4]
[1,] 266 224 286 295
, , 3
[,1] [,2] [,3] [,4]
[1,] 191 366 301 179
This accomplishes what I want, but I think it is a bit cumbersome, and I am not sure how to generalize it to different operations on any number of dimensions. There has to be a more compact way of doing these operations. I found the `rray` package, but it is not compatible with R 4.0.2. Note that my actual arrays are much larger than this example and I will have to apply these types of operations many times in a numerical optimization problem, so computational efficiency is important.
To generalize and keep calculations in one line you could do:
array(apply(a, 2:3, sum), c(1, dim(a)[-1]))
# , , 1
#
# [,1] [,2] [,3] [,4]
# [1,] 229 249 241 369
#
# , , 2
#
# [,1] [,2] [,3] [,4]
# [1,] 266 224 286 295
#
# , , 3
#
# [,1] [,2] [,3] [,4]
# [1,] 191 366 301 179
Or, since it's vectorized and thus much faster, using colSums
array(colSums(a, dims=1), c(1, dim(a)[-1]))
# , , 1
#
# [,1] [,2] [,3] [,4]
# [1,] 229 249 241 369
#
# , , 2
#
# [,1] [,2] [,3] [,4]
# [1,] 266 224 286 295
#
# , , 3
#
# [,1] [,2] [,3] [,4]
# [1,] 191 366 301 179
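To sum over a dimension other than the first while keeping that dimension with length 1, the apply version can be wrapped in a helper. This is only a sketch; sum_keepdims is a hypothetical name, and it relies on apply, so it is expected to be slower than colSums:

```r
# Sum an array over one dimension, keeping that dimension with length 1
sum_keepdims <- function(a, margin) {
  d <- dim(a)
  keep <- setdiff(seq_along(d), margin)  # dimensions that survive
  s <- apply(a, keep, sum)
  d[margin] <- 1L                        # collapsed dimension stays, length 1
  array(s, dim = d)
}

a <- array(1:60, dim = c(5, 4, 3))
s1 <- sum_keepdims(a, 1)  # 1 x 4 x 3
s2 <- sum_keepdims(a, 2)  # 5 x 1 x 3
dim(s1)
dim(s2)
```

Reshaping with array() is safe here because a dimension of length 1 does not change the order in which the remaining elements are stored.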
Benchmark:
set.seed(42)
A <- array(rnorm(5e4*100*10), dim=c(5e4, 100, 10))
library(rray)
microbenchmark::microbenchmark(apply=array(apply(A, 2:3, sum), c(1, dim(A)[-1])),
colSums=array(colSums(A, dims=1), c(1, dim(A)[-1])),
rray_sum=rray_sum(A, 1)) ## rray: see other answer
# Unit: milliseconds
# expr min lq mean median uq max neval cld
# apply 1273.51152 1381.72037 1416.33429 1395.84693 1433.72407 1848.88436 100 b
# colSums 72.07086 73.02890 73.85052 73.63013 74.38916 79.70227 100 a
# rray_sum 71.46261 72.50294 73.27564 73.00747 73.70348 80.36409 100 a
I was able to install an R 4.0 compatible version of the rray package using
remotes::install_github("r-lib/rray")
The desired result is then achieved (much faster) with
# Increasing the array size for more realistic benchmarking
rows <- 500
cols <- 100
dept <- 10
draws <- rnorm(rows*cols*dept) # Standard normal draws instead of sampling from integers
a <- rray(draws, dim = c(rows, cols, dept))
b <- rray_sum(a, 1)
Benchmark code:
library(microbenchmark)
library(rray)
bm <- microbenchmark(
base = {
a <- array(draws, dim = c(rows, cols, dept))
b <- apply(a, c(2,3), sum)
c <- array(b, dim = c(1, cols, dept)) # keep the summed dimension with length 1
c
},
rray = {
a <- rray(draws, dim = c(rows, cols, dept))
b <- rray_sum(a, 1)
b
}, times = 100)
> bm
Unit: microseconds
expr min lq mean median uq max neval
base 8619.9 8763.9 9245.898 8832.05 8984.25 20968.5 100
rray 838.6 939.6 1186.008 1103.50 1134.40 13580.8 100

Apply function in R behaving differently

I was studying the apply functions in R. Here is what I tried out:
> x <- array(1:9,c(3,3))
> x
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> apply(x,1,function(x) x * 10)
[,1] [,2] [,3]
[1,] 10 20 30
[2,] 40 50 60
[3,] 70 80 90
My question is: why is it like this? I was thinking apply would apply the user-defined function to each element in the row, so that the result would look as below. Why does it differ? Is there anything wrong in my understanding?
[,1] [,2] [,3]
[1,] 10 40 70
[2,] 20 50 80
[3,] 30 60 90
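The behaviour follows from how apply assembles its result: the function is called once per row, and each returned vector is stored as a column of the output. With MARGIN = 1 the result is therefore the transpose of the expected layout. A short sketch illustrating this with the same x:

```r
x <- array(1:9, c(3, 3))

# Each row's result becomes a column of the output
res <- apply(x, 1, function(r) r * 10)
all.equal(res, t(x * 10))

# Transposing restores the row layout
fixed <- t(apply(x, 1, function(r) r * 10))
all.equal(fixed, x * 10)
```

For a purely element-wise operation like this one, x * 10 gives the expected matrix directly and apply is not needed.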

Quick way to perform this entry-wise operation on matrices in R?

Suppose I have two matrices:
A of size n by m, and B of size n by 2.
I would like to obtain a matrix C of size n by m, so that
C[i,j] = A[i,j] * B[i,2] + B[i,1].
How can I do that easily and quickly in R? Thanks!
R>A <- matrix(1:20,5,4)
R>A
[,1] [,2] [,3] [,4]
[1,] 1 6 11 16
[2,] 2 7 12 17
[3,] 3 8 13 18
[4,] 4 9 14 19
[5,] 5 10 15 20
R>B <- matrix(1:10,5,2)
R>B
[,1] [,2]
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
R>A * B[,2] + B[,1]
[,1] [,2] [,3] [,4]
[1,] 7 37 67 97
[2,] 16 51 86 121
[3,] 27 67 107 147
[4,] 40 85 130 175
[5,] 55 105 155 205
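This works because R stores matrices column by column and recycles the length-n vectors B[,2] and B[,1] down each column of A, so element i of each vector always lines up with row i. A sketch that checks the one-liner against an explicit loop and against sweep(); C_loop and C_sweep are illustrative names:

```r
A <- matrix(1:20, 5, 4)
B <- matrix(1:10, 5, 2)

# Vectorised version relying on recycling
C <- A * B[, 2] + B[, 1]

# Explicit loop implementing C[i,j] = A[i,j] * B[i,2] + B[i,1]
C_loop <- matrix(0, nrow(A), ncol(A))
for (i in seq_len(nrow(A))) {
  for (j in seq_len(ncol(A))) {
    C_loop[i, j] <- A[i, j] * B[i, 2] + B[i, 1]
  }
}

# Same thing with sweep(), which states the row-wise intent explicitly
C_sweep <- sweep(sweep(A, 1, B[, 2], `*`), 1, B[, 1], `+`)

all.equal(C, C_loop)
all.equal(C, C_sweep)
```

The recycling shortcut depends on the vectors having length nrow(A); for a per-column operation the vector would have to be applied with sweep(A, 2, ...) or to the transpose instead.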

Resources