sum vectors in a matrix by distance from a cell (R) - r

Suppose I have a matrix A of dimensions n x m, a starting cell (i,j), and a constant k which satisfies k < n x m.
I need a way to extract the values of A that are within k steps of the starting cell, where a step is either a column move or a row move.
Then I'm looking to sum the extracted values in 2 groups: one group consists of the sums taken along the columns of the original matrix, and the other of the sums taken along its rows.
It is important for me that this addresses situations where the starting cell is within k steps of the edge of the matrix.
Example set (I'm heavily simplifying here):
> #create matrix where m = 7,n = 7
> Mat <- sample(1:49,49) %>% matrix(7,7)
>
> #declare starting cell where (i = 4, j = 2)
> i = 4
> j = 2
>
> #declare number of steps
> k = 2
>
> Mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 25 35 29 10 16 46 23
[2,] 32 43 7 5 31 1 14
[3,] 36 19 49 45 13 41 47
[4,] 17 18 48 9 3 28 12
[5,] 26 6 30 33 20 2 11
[6,] 40 24 39 21 37 38 8
[7,] 4 15 34 22 27 44 42
> Mat[i,j]
[1] 18
For this example, the output would be two vectors (one for column sums and one for row sums):
> Columnsum <- c(sum(36,17,26) , #sum(Mat[3:5,1])
+ sum(43,19,18,6,24), #sum(Mat[2:6,2])
+ sum(49,48,30), #sum(Mat[3:5,3])
+ sum(9)) #sum(Mat[4:4,4])
>
> Rowsum <- c(sum(43), #sum(Mat[2,2:2])
+ sum(36,19,49), #sum(Mat[3,1:3])
+ sum(17,18,48,9), #sum(Mat[4,1:4])
+ sum(26,6,30), #sum(Mat[5,1:3])
+ sum(24)) #sum(Mat[6,2:2])
>
> Columnsum
[1] 79 110 127 9
> Rowsum
[1] 43 104 92 62 24

You could 'remove' parts of your matrix Mat with entries more than k steps away from (i,j) by overwriting them with NA:
Mat[abs(row(Mat) - i) + abs(col(Mat) - j) > k] <- NA
Then remove the rows and columns that are entirely NA:
Mat <- Mat[rowSums(is.na(Mat)) != ncol(Mat), colSums(is.na(Mat)) != nrow(Mat)]
And finally you can compute the row and column sums:
Columnsum <- colSums(Mat, na.rm = TRUE)
Rowsum <- rowSums(Mat, na.rm = TRUE)
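The three steps above can be wrapped into one function. This is a minimal sketch: the helper name sums_within_k is hypothetical (it is not part of the original answer), and drop = FALSE is an addition so that a selection of a single row or column (for example k = 0) stays a matrix. It is run on the matrix printed in the question.

```r
# Hypothetical helper: row and column sums over cells within k steps of (i, j)
sums_within_k <- function(Mat, i, j, k) {
  # TRUE for every cell whose Manhattan distance from (i, j) exceeds k
  far <- abs(row(Mat) - i) + abs(col(Mat) - j) > k
  Mat[far] <- NA
  # keep only rows/columns that still contain at least one reachable cell
  keep_rows <- rowSums(!far) > 0
  keep_cols <- colSums(!far) > 0
  Mat <- Mat[keep_rows, keep_cols, drop = FALSE]
  list(Columnsum = colSums(Mat, na.rm = TRUE),
       Rowsum    = rowSums(Mat, na.rm = TRUE))
}

# The matrix printed in the question, entered row by row
Mat <- matrix(c(25, 35, 29, 10, 16, 46, 23,
                32, 43,  7,  5, 31,  1, 14,
                36, 19, 49, 45, 13, 41, 47,
                17, 18, 48,  9,  3, 28, 12,
                26,  6, 30, 33, 20,  2, 11,
                40, 24, 39, 21, 37, 38,  8,
                 4, 15, 34, 22, 27, 44, 42),
              nrow = 7, byrow = TRUE)

res <- sums_within_k(Mat, i = 4, j = 2, k = 2)
res$Columnsum  # 79 110 127 9
res$Rowsum     # 43 104 92 62 24
```

Because row() and col() only enumerate cells that exist, a starting cell close to the edge needs no special handling.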

Related

Vector to a matrix where the next row starts 1 observation

Suppose I have a data set with 40 observations
y <- rnorm(40,10,10)
Now I would like to transform this vector into a matrix with 4 observations in each row.
On top of that, I would like row i to start with value y[i], so that the start moves forward by one in each row, up until the 40th observation.
So for example:
r1 = y[1] y[2] y[3] y[4]
r2 = y[2] y[3] y[4] y[5]
r3 = y[3] y[4] y[5] y[6]
.
.
r40 = y[39] y[38] y[37] y[36]
Does anyone know how to do this?
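You can use matrix and let it recycle y. With 41 rows, each column starts one element later than the previous one (R warns that the data length is not a multiple of the number of rows):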
y <- 1:40
matrix(y, 41, 4)[1:37,]
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 2 3 4 5
# [3,] 3 4 5 6
#...
#[35,] 35 36 37 38
#[36,] 36 37 38 39
#[37,] 37 38 39 40
Or use seq in mapply to build an index matrix, then fill it with the values of y:
i <- 1:37
M <- t(mapply(seq, i, i+3))
M
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 2 3 4 5
# [3,] 3 4 5 6
#...
#[35,] 35 36 37 38
#[36,] 36 37 38 39
#[37,] 37 38 39 40
M[] <- y[M]
This is one way to produce the first 37 rows (it needs the purrr package, which also provides %>%). If you want to change the direction for the last 3 rows, that is easy to do with the same code:
library(purrr)
map(seq_len(37), ~y[.x:(.x+3)]) %>%
unlist() %>%
matrix(nrow = 37, byrow = TRUE)
The only difference would be to first save the values of the first 37 rows, then produce the last 3 rows, bind them, and turn that vector into a matrix.
Try embed
embed(y, 4)[, 4:1]
which could give the desired output
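As a cross-check, the embed result can be compared with an explicit index matrix. This is only a sketch; the names w, n_rows, idx, by_index and by_embed are introduced here for illustration and do not come from the answers.

```r
y <- rnorm(40, 10, 10)
w <- 4                        # window width (observations per row)
n_rows <- length(y) - w + 1   # 37 complete windows

# Index matrix: row r holds r, r + 1, ..., r + w - 1
idx <- outer(seq_len(n_rows), seq_len(w) - 1, `+`)
by_index <- matrix(y[idx], nrow = n_rows)

# embed() returns each window in reverse order, hence the column flip
by_embed <- embed(y, w)[, w:1]

all.equal(by_index, by_embed)  # TRUE
```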

Multiplication of matrix in R

I have the following problem :
w <- matrix(1:3,nrow=3,ncol=1)
mymat <- as.matrix(cbind(a = 6:15, b = 16:25, c= 26:35))
mymat
a b c
[1,] 6 16 26
[2,] 7 17 27
[3,] 8 18 28
[4,] 9 19 29
[5,] 10 20 30
[6,] 11 21 31
[7,] 12 22 32
[8,] 13 23 33
[9,] 14 24 34
[10,] 15 25 35
I want to obtain the following results in a matrix the same size as mymat:
a b c
[1,] 6*1 16*2 26*3
[2,] 7*1 17*2 27*3
[3,] 8*1 18*2 28*3
...
I've tried the lapply function but I am unable to get the results I want. Thanks!
Using sweep():
sweep(mymat, 2, w, "*")
Converting w into a matrix of the same dimensions:
mymat * t(w)[rep(1, NROW(mymat)), ]
1) diag: Post-multiply it by the appropriate diagonal matrix. We can omit c(), although it won't hurt, if w is a vector rather than a matrix.
mymat %*% diag(c(w))
2) KhatriRao: We could alternatively use the Khatri-Rao product. If w is the w defined in the question then matrix could optionally be omitted, but we included it in case w is actually a vector. Note that the Matrix package comes with R, so it does not have to be installed.
library(Matrix)
KhatriRao(mymat, t(matrix(w)))
3) mapply
mapply(`*`, as.data.frame(mymat), w)
We can also use col to replicate the values and then multiply in base R:
mymat * w[col(mymat)]
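A quick way to convince yourself that the base R variants above agree is to compare them on the data from the question. This is a sketch; the helper same and the r_* names are made up here, and only values are compared because the approaches differ in the dimnames they return.

```r
w <- matrix(1:3, nrow = 3, ncol = 1)
mymat <- cbind(a = 6:15, b = 16:25, c = 26:35)

r_sweep <- sweep(mymat, 2, w, "*")
r_diag  <- mymat %*% diag(c(w))
r_col   <- mymat * w[col(mymat)]
r_map   <- mapply(`*`, as.data.frame(mymat), w)

# Hypothetical helper: compare values only, ignoring dimnames
same <- function(a, b) isTRUE(all.equal(unname(a), unname(b)))

same(r_sweep, r_diag)  # TRUE
same(r_sweep, r_col)   # TRUE
same(r_sweep, r_map)   # TRUE
r_sweep[1, ]           # 6 32 78
```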

Sum Every N Values in Matrix

So I have taken a look at a question posted before, "sum specific columns among rows", which was about summing every 2 values in each row of a matrix. I also took a look at another question, "R Sum every k columns in matrix", which is more similar to mine. I could not get the solution from that question to work. Here is the code that I am working with:
y <- matrix(1:27, nrow = 3)
y
m1 <- as.matrix(y)
n <- 3
dim(m1) <- c(nrow(m1)/n, ncol(m1), n)
res <- matrix(rowSums(apply(m1, 1, I)), ncol=n)
identical(res[1,],rowSums(y[1:3,]))
sapply(split.default(y, 0:(length(y)-1) %/% 3), rowSums)
I just get an error message when applying this. The desired output is a matrix with the following values:
[,1] [,2] [,3]
[1,] 12 39 66
[2,] 15 42 69
[3,] 18 45 72
To sum consecutive sets of n elements from each row, you just need to write a function that does the summing and apply it to each row:
n <- 3
t(apply(y, 1, function(x) tapply(x, ceiling(seq_along(x)/n), sum)))
# 1 2 3
# [1,] 12 39 66
# [2,] 15 42 69
# [3,] 18 45 72
Transform the matrix to an array and use colSums (as suggested by @nongkrong):
y <- matrix(1:27, nrow = 3)
n <- 3
a <- y
dim(a) <- c(nrow(a), n, ncol(a)/n)  # rows x position within group x group; the original order only worked because ncol(a)/n == n here
b <- aperm(a, c(2,1,3))
colSums(b)
# [,1] [,2] [,3]
#[1,] 12 39 66
#[2,] 15 42 69
#[3,] 18 45 72
Of course this assumes that ncol(y) is divisible by n.
PS: You can of course avoid creating so many intermediate objects. They are there for didactic purposes.
I would do something similar to the OP -- apply rowSums on subsets of the matrix:
n = 3
ng = ncol(y)/n
sapply( 1:ng, function(jg) rowSums(y[, (jg-1)*n + 1:n ]))
# [,1] [,2] [,3]
# [1,] 12 39 66
# [2,] 15 42 69
# [3,] 18 45 72
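Another base R option, not mentioned in the answers, is rowsum(), which adds up rows that share a group label. The sketch below applies it to the transposed matrix and compares it with the sapply approach on a 3 x 12 matrix, so that the number of groups differs from n; the names groups, by_rowsum and by_sapply are illustrative.

```r
y <- matrix(1:36, nrow = 3)   # 3 x 12 example, so 4 groups of n = 3 columns
n <- 3
stopifnot(ncol(y) %% n == 0)  # same divisibility assumption as above
ng <- ncol(y) / n

groups <- rep(seq_len(ng), each = n)   # 1 1 1 2 2 2 3 3 3 4 4 4
by_rowsum <- t(rowsum(t(y), groups))   # sum columns of y that share a label

by_sapply <- sapply(seq_len(ng), function(jg) rowSums(y[, (jg - 1) * n + 1:n]))

all.equal(unname(by_rowsum), by_sapply)  # TRUE
```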

Keep column name when filtering matrix columns

I have a matrix, like the one generated with this code:
> m = matrix(data=c(1:50), nrow= 10, ncol = 5);
> colnames(m) = letters[1:5];
If I filter the columns, and the result have more than one column, the new matrix keeps the names. For example:
> m[, colnames(m) != "a"];
b c d e
[1,] 11 21 31 41
[2,] 12 22 32 42
[3,] 13 23 33 43
[4,] 14 24 34 44
[5,] 15 25 35 45
[6,] 16 26 36 46
[7,] 17 27 37 47
[8,] 18 28 38 48
[9,] 19 29 39 49
[10,] 20 30 40 50
Notice that here, the class is still matrix:
> class(m[, colnames(m) != "a"]);
[1] "matrix"
But when the filter leaves only one column, the result is a vector (an integer vector in this case) and the column name is lost.
> m[, colnames(m) == "a"]
[1] 1 2 3 4 5 6 7 8 9 10
> class(m[, colnames(m) == "a"]);
[1] "integer"
The name of the column is very important.
I would like to keep both, matrix structure (a one column matrix) and the column's name.
But, the column's name is more important.
I already know how to solve this by the long way (by keeping track of every case). I'm wondering if there is an elegant, enlightening solution.
You need to set drop = FALSE. This is good practice for programmatic use. From the documentation of `[`:
drop: For matrices and arrays. If TRUE the result is coerced to the lowest possible dimension (see the examples).
m[,'a',drop=FALSE]
This will retain the names as well.
You can also use subset:
m.a = subset(m, select = colnames(m) == "a")
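A short check of what drop = FALSE preserves, using the matrix from the question (the name one_col is illustrative):

```r
m <- matrix(1:50, nrow = 10, ncol = 5)
colnames(m) <- letters[1:5]

# Logical column filter that selects a single column
one_col <- m[, colnames(m) == "a", drop = FALSE]

is.matrix(one_col)  # TRUE
dim(one_col)        # 10 1
colnames(one_col)   # "a"
```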

Is there a way to calculate the following specified matrix by avoiding loops? in R or Matlab

I have an N-by-M matrix X, and I need to calculate an N-by-N matrix Y:
Y[i, j] = sum((X[i,] - X[j,])^2), for 1 <= i, j <= N
For now, I have to use nested loops to do it, which is O(n^2). I would like to know if there's a better way, like using matrix operations.
More generally, sum(...) can be any function fun(x1, x2), where x1 and x2 are M-by-1 vectors.
You can use expand.grid to get a data.frame of all pairs of row indices:
X <- matrix(sample(1:5, 50, replace=TRUE), nrow=10)
row.ind <- expand.grid(1:nrow(X), 1:nrow(X))
Then apply a function along each pair:
myfun <- function(n) {
sum((X[row.ind[n, 1],] - X[row.ind[n, 2],])^2)
}
# expand.grid varies its first column fastest, so filling column-wise puts pair (i, j) at Y[i, j]
Y <- matrix(unlist(lapply(1:nrow(row.ind), myfun)), nrow=nrow(X))
Y is then a symmetric 10 x 10 matrix with zeros on the diagonal (the exact values depend on the random X).
I bet there is a better way but it's Friday and I'm tired!
Expanding the square for each component gives
(x[i]-x[j])^2 = x[i]² - 2*x[i]*x[j] + x[j]²
so the middle part is just the matrix multiplication -2*X*tran(X) (an N-by-N matrix), and the other parts are just vectors of row-wise sums of squares that you have to add to each element.
This has O(n^2.7) or whatever the matrix multiplication complexity is.
Pseudocode:
vec = sum(X.^2, rows)   # sum of squares of each row
Y = X * tran(X) * -2
for index [i,j] in Y:
Y[i,j] = Y[i,j] + vec[i] + vec[j]
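In R, the same expansion (row-wise sums of squares plus the -2 * X * t(X) cross term) can be written without an explicit loop. This is a sketch rather than a tuned implementation; the names sq and Y_ref are illustrative, and dist() is used only as an independent check.

```r
set.seed(1)
X <- matrix(sample(1:5, 50, replace = TRUE), nrow = 10)

sq <- rowSums(X^2)                           # sum of squares of each row
Y <- outer(sq, sq, `+`) - 2 * tcrossprod(X)  # sq[i] + sq[j] - 2 * (X %*% t(X))[i, j]

# Cross-check against squared Euclidean distances from dist()
Y_ref <- as.matrix(dist(X))^2
all.equal(Y, Y_ref, check.attributes = FALSE)  # TRUE
```

With non-integer data, rounding can leave tiny negative values where the true distance is zero, so clamping with pmax(Y, 0) may be worthwhile.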
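In MATLAB, for your specific f, you could just do this (pdist returns the distances as a condensed vector, so squareform is needed to get the N-by-N matrix):
Y = squareform(pdist(X)).^2;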
For a non-"cheating" version, try something like this (MATLAB):
[N, M] = size(X);
f = @(u, v) sum((u-v).^2);            % any function of two row vectors
helpf = @(i, j) f(X(i, :), X(j, :));
[J, I] = meshgrid(1:N, 1:N);          % I(i,j) = i, J(i,j) = j
Y = arrayfun(helpf, I, J);            % Y(i,j) = f(X(i,:), X(j,:))
There are more efficient ways of doing it with the specific function sum(...) but your question said you wanted a general way for a general function f. In general this operation will be O(n^2) times the complexity of each vector pair operation because that's how many operations need to be done. If f is of a special form, some calculations' results can be reused.
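For the general case in R, a comparable sketch uses outer() with Vectorize(). The function name pairwise is hypothetical, and this still makes one call to fun per pair of rows, so it hides the loops rather than removing the O(n^2) cost.

```r
# Hypothetical helper: Y[i, j] = fun(X[i, ], X[j, ]) for all pairs of rows
pairwise <- function(X, fun) {
  idx <- seq_len(nrow(X))
  # Vectorize() lets outer() call the helper once per (i, j) pair
  outer(idx, idx, Vectorize(function(i, j) fun(X[i, ], X[j, ])))
}

X <- matrix(c(1, 2,
              3, 4,
              5, 6), nrow = 3, byrow = TRUE)
Y <- pairwise(X, function(u, v) sum((u - v)^2))
Y
#      [,1] [,2] [,3]
# [1,]    0    8   32
# [2,]    8    0    8
# [3,]   32    8    0
```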
