I'm trying to convert some code from MATLAB to R.
I'm having particular problems converting this part of a differential equation:
In MATLAB :
dA.*(A*N - N.*sum(A,2))
where dA is a scalar, A is a 10x10 matrix and N is a 10x1 matrix (see example code below)
In R so far I've got this:
dA*(A*N - N*colSums(A))
but for some reason it doesn't seem to be giving the same result. Does anyone have any ideas as to what I've done wrong?
Example of the data I'm using below:
in MATLAB:
dA = 0.1;
N = 120000*ones(1,nN);
seq = [0 1 0 0 0 1 0];
seq2 = repmat(seq,1,20);
seq100 = seq2(1:100)
A = AA-diag(diag(AA));
in R:
dA <- 0.1
N <- c(120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000)
num_zeros_int <- zeros(70, 1)
num_ones_int <- ones(30, 1)
seq <- c(0,1,0,0,0,1,0)
seq2<- rep(seq, times = 20)
seq100 <- seq2[0:100]
int_mat <- matrix(seq100, nests, nests)
Matlab expression:
dA.*(A*N - N.*sum(A,2))
where
dA: real number
A: 10 x 10 matrix
N: 10 X 1 matrix
A*N: matrix multiplication
sum(A,2): sum of rows in A (10x1 matrix)
N.*sum(A,2): element by element multiplication (10 x 1 matrix)
Let's set up the following example in R:
A = matrix(data = 1:100,nrow = 10)
N = matrix(data = 1:10)
dA = 0.1
> A
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 11 21 31 41 51 61 71 81 91
[2,] 2 12 22 32 42 52 62 72 82 92
[3,] 3 13 23 33 43 53 63 73 83 93
[4,] 4 14 24 34 44 54 64 74 84 94
[5,] 5 15 25 35 45 55 65 75 85 95
[6,] 6 16 26 36 46 56 66 76 86 96
[7,] 7 17 27 37 47 57 67 77 87 97
[8,] 8 18 28 38 48 58 68 78 88 98
[9,] 9 19 29 39 49 59 69 79 89 99
[10,] 10 20 30 40 50 60 70 80 90 100
> N
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
The first term is:
z1 = A %*% N
And the second term:
srow = rowSums(A)
z2 = srow * N
Which leads to the final result:
result = dA * (z1-z2)
Final equation
result = dA * (A %*% N - rowSums(A)*N)
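This should give you the same answer as MATLAB's dA.*(A*N - N.*sum(A,2)). The original attempt differs in two places: in R the * operator is element-wise (matrix multiplication is %*%), and sum(A,2) sums along each row, which is rowSums(A) rather than colSums(A).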
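As a sanity check, the vectorised expression can be compared against an explicit row-by-row evaluation of the MATLAB formula. This is only a sketch using the example A, N and dA from above; the names manual, N_vec and wrong are introduced here purely for illustration.

```r
A  <- matrix(1:100, nrow = 10)
N  <- matrix(1:10)
dA <- 0.1

# Vectorised translation of dA.*(A*N - N.*sum(A,2))
result <- dA * (A %*% N - rowSums(A) * N)

# Same quantity computed one row at a time:
# element k is dA * (sum_j A[k, j] * N[j] - N[k] * sum_j A[k, j])
manual <- sapply(1:10, function(k) dA * (sum(A[k, ] * N) - N[k] * sum(A[k, ])))

stopifnot(isTRUE(all.equal(as.vector(result), manual)))

# The attempt from the question, with N as a plain vector:
# the element-wise products recycle N and yield a 10 x 10 matrix
# instead of a 10 x 1 column.
N_vec <- as.vector(N)
wrong <- dA * (A * N_vec - N_vec * colSums(A))
dim(wrong)
```

If the two results agree, stopifnot() returns silently; the final dim() call shows that the original attempt does not even have the right shape.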
I am following the thread "2d matrix to 3d stacked array in r" and have a question about the aperm function.
1) I get the first part of the solution, but I do not understand the c(2,1,3) used in the function. Could you kindly clarify that?
2) Also I am trying a slight variation of the example in that thread.
My case is as follows:
For a similar matrix in example:
set.seed(1)
mat <- matrix(sample(100, 12 * 5, TRUE), ncol = 5)
[,1] [,2] [,3] [,4] [,5]
[1,] 27 69 27 80 74
[2,] 38 39 39 11 70
[3,] 58 77 2 73 48
[4,] 91 50 39 42 87
[5,] 21 72 87 83 44
[6,] 90 100 35 65 25
[7,] 95 39 49 79 8
[8,] 67 78 60 56 10
[9,] 63 94 50 53 32
[10,] 7 22 19 79 52
[11,] 21 66 83 3 67
[12,] 18 13 67 48 41
I am trying to rearrange such that I have a 3 (row) X 5 (col) x 11 (third dim) array.
So, essentially the rows would overlap and show something like:
,,1
27 69 27 80 74
38 39 39 11 70
58 77 2 73 48
,,2
38 39 39 11 70
58 77 2 73 48
91 50 39 42 87
,,3
58 77 2 73 48
91 50 39 42 87
21 72 87 83 44
and so on until we hit ,,11
Would someone have any experience with this?
Thanks!
Just stumbled over this question. Though the answer comes a little late, here are two options for you.
First, you need to extend mat in such a way that its rows overlap. We can use this vector for row indexing.
#[1] 1 2 3 2 3 4 3 4 5 4 5 6 5 6 7 6 7 8 7 8 9 8 9 10 9 10 11 10 11 12
I used rollapply from the zoo package to create it as follows:
library(zoo)
row_nums <- c(t(rollapply(1:nrow(mat), width = 3, FUN = rep, 1)))
mat <- mat[row_nums, ]
dim(mat)
#[1] 30 5
Now use the matsplitter function that @Mr.Flick provided in this answer (please consider upvoting his answer) to get the desired output:
matsplitter(mat, 3, 5)
#, , 1
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 27 69 27 80 74
#[2,] 38 39 39 11 70
#[3,] 58 77 2 73 48
#
#, , 2
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 38 39 39 11 70
#[2,] 58 77 2 73 48
#[3,] 91 50 39 42 87
#
#, , 3
#
# [,1] [,2] [,3] [,4] [,5]
#[1,] 58 77 2 73 48
#[2,] 91 50 39 42 87
#[3,] 21 72 87 83 44
#
#, , 4
# ...
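Note that you will end up with an array of dimension 3 x 5 x 10, not 11: a 12-row matrix contains 12 - 3 + 1 = 10 overlapping windows of 3 rows. For reference, matsplitter is defined as follows: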
matsplitter <- function(M, r, c) {
rg <- (row(M) - 1) %/% r + 1
cg <- (col(M) - 1) %/% c + 1
rci <- (rg - 1) * max(cg) + cg
N <- prod(dim(M)) / r / c
cv <- unlist(lapply(1:N, function(x)
M[rci == x]))
dim(cv) <- c(r, c, N)
cv
}
Here is a solution using aperm as in the linked answer (assuming that mat was extended as above and is of dimension 30 x 5).
aperm(`dim<-`(t(mat), list(5, 3, 10)), c(2, 1, 3))
t(mat): transposes mat (new dimension: 5 x 30)
`dim<-`(t(mat), list(5, 3, 10)): changes the dimension of t(mat) from 5 X 30 to 5 x 3 x 10
aperm(..., c(2, 1, 3)): permutes the dimensions of the array `dim<-`(t(mat), list(5, 3, 10)) from 5 x 3 x 10 to 3 x 5 x 10, i.e. the second dimension becomes the first, the first dimension becomes the second and the third dimension stays the same.
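To make the effect of c(2, 1, 3) concrete, the aperm result can be compared with windows cut directly out of the original matrix. The sketch below avoids the zoo dependency by building the row index with lapply; the names width, starts, ext, via_aperm and direct are illustrative assumptions, not part of the linked answer.

```r
set.seed(1)
mat <- matrix(sample(100, 12 * 5, TRUE), ncol = 5)

width  <- 3
starts <- seq_len(nrow(mat) - width + 1)   # window start rows 1..10

# Row indices of the overlapping windows: 1 2 3 2 3 4 3 4 5 ...
row_nums <- unlist(lapply(starts, function(s) s:(s + width - 1)))
ext <- mat[row_nums, ]                      # extended 30 x 5 matrix

# Reshape the transposed matrix to 5 x 3 x 10, then swap dimensions 1 and 2
via_aperm <- aperm(`dim<-`(t(ext), c(5, width, length(starts))), c(2, 1, 3))

# Direct construction: one 3 x 5 slice per window
direct <- sapply(starts, function(s) mat[s:(s + width - 1), ],
                 simplify = "array")

dim(via_aperm)   # 3 5 10
stopifnot(isTRUE(all.equal(via_aperm, direct)))
```

In other words, aperm(x, c(2, 1, 3)) transposes every slice x[, , k] while leaving the order of the slices alone.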
I have two large matrices P and Q around (10k x 50k dim in both, but to test this yourself a random 10x10 matrix for P and Q is sufficient). I have a list of indices, e.g.
i j
1 4
1 625
1 9207
2 827
... ...
etc. This means that I need to find the dot product of column 1 in P and column 4 in Q, then column 1 in P and column 625 in Q and so on. I could easily solve this with a for loop but I know they are not very efficient in R. Anyone got any ideas?
edit: asked for a reproducible example
P <- matrix(c(1,0,1,0,0,1,0,1,0), nrow = 3, ncol = 3)
Q <- matrix(c(0,0,1,0,1,0,1,0,1), nrow = 3, ncol = 3)
i <- c(1,1,2)
j <- c(2,1,3)
gives output (if in dot product form)
1: 0
2: 1
3: 1
P <- matrix(1:50, nrow = 5,ncol = 10)
Q <- matrix(1:50, nrow = 5, ncol = 10)
i <- c(1,2,4,7)
j <- c(5,3,7,2)
P
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 6 11 16 21 26 31 36 41 46
# [2,] 2 7 12 17 22 27 32 37 42 47
# [3,] 3 8 13 18 23 28 33 38 43 48
# [4,] 4 9 14 19 24 29 34 39 44 49
# [5,] 5 10 15 20 25 30 35 40 45 50
Q
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 6 11 16 21 26 31 36 41 46
# [2,] 2 7 12 17 22 27 32 37 42 47
# [3,] 3 8 13 18 23 28 33 38 43 48
# [4,] 4 9 14 19 24 29 34 39 44 49
# [5,] 5 10 15 20 25 30 35 40 45 50
P[,i] * Q[, j]
# [,1] [,2] [,3] [,4]
# [1,] 21 66 496 186
# [2,] 44 84 544 224
# [3,] 69 104 594 264
# [4,] 96 126 646 306
# [5,] 125 150 700 350
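The element-wise product above holds, in each column, the terms of one dot product, so summing each column gives the requested values. A minimal sketch on the same P, Q, i and j; the drop = FALSE arguments are an added precaution so that the code still works if i and j have length one.

```r
P <- matrix(1:50, nrow = 5, ncol = 10)
Q <- matrix(1:50, nrow = 5, ncol = 10)
i <- c(1, 2, 4, 7)
j <- c(5, 3, 7, 2)

# Column k of the product holds P[, i[k]] * Q[, j[k]] element by element;
# colSums() then reduces each column to the dot product.
dots <- colSums(P[, i, drop = FALSE] * Q[, j, drop = FALSE])
dots
# [1]  355  530 2980 1330
```

This only ever allocates a matrix with nrow(P) rows and length(i) columns, which may matter for matrices of the size mentioned in the question.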
Using matrix multiplication, you can do
diag(t(P[, i]) %*% Q[, j])
[1] 0 1 1
Here is a second solution with apply.
apply(cbind(i, j), 1, function(x) t(P[, x[1]]) %*% Q[, x[2]])
[1] 0 1 1
To verify these agree in a second example:
set.seed(1234)
A <- matrix(sample(0:10, 100, replace=TRUE), 10, 10)
B <- matrix(sample(0:10, 100, replace=TRUE), 10, 10)
inds <- matrix(sample(10, 10, replace=TRUE), 5)
matrix multiplication
diag(t(A[, inds[,1]]) %*% B[, inds[,2]])
[1] 215 260 306 237 317
and apply
apply(inds, 1, function(x) t(A[, x[1]]) %*% B[, x[2]])
[1] 215 260 306 237 317
How could I build a function that extracts the diagonal block matrices of a larger one? The problem is as follows: the function takes a centred matrix as its argument, computes the full error covariance matrix, and extracts the blocks on the leading diagonal. I tried the following, but it is not working.
err_cov <- function(x){
m <- nrow(x)
n <- ncol(x)
#compute the full error covariance matrix as the inner product
#of vec(x) and its transpose. Note that, omega is a mnxmn matrix
vec <- as.vector(x)
omega <- vec%*%t(vec)
sigmas <- list()
for(i in 0:n-1){
#here the blocks have to be m nxn matrices along the
#leading diagonal
for (j in 1:m)
sigmas[[j]] <- omega[(n*i+1):n*(i+1), (n*i+1):n*(i+1)]
}
return(sigmas)
}
So, for instance for
A
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
> B<-as.vector(A)
> B
[1] 1 2 3 4 5 6 7 8 9 10 11 12
> C<-B%*%t(B)
> C
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 1 2 3 4 5 6 7 8 9 10 11 12
[2,] 2 4 6 8 10 12 14 16 18 20 22 24
[3,] 3 6 9 12 15 18 21 24 27 30 33 36
[4,] 4 8 12 16 20 24 28 32 36 40 44 48
[5,] 5 10 15 20 25 30 35 40 45 50 55 60
[6,] 6 12 18 24 30 36 42 48 54 60 66 72
[7,] 7 14 21 28 35 42 49 56 63 70 77 84
[8,] 8 16 24 32 40 48 56 64 72 80 88 96
[9,] 9 18 27 36 45 54 63 72 81 90 99 108
[10,] 10 20 30 40 50 60 70 80 90 100 110 120
[11,] 11 22 33 44 55 66 77 88 99 110 121 132
[12,] 12 24 36 48 60 72 84 96 108 120 132 144
The function should return:
> C1
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 4 6
[3,] 3 6 9
> C2
[,1] [,2] [,3]
[1,] 16 20 24
[2,] 20 25 30
[3,] 24 30 36
> C3
[,1] [,2] [,3]
[1,] 49 56 63
[2,] 56 64 72
[3,] 63 72 81
> C4
[,1] [,2] [,3]
[1,] 100 110 120
[2,] 110 121 132
[3,] 120 132 144
Thanks for answering.
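One thing that trips up the loop in the attempt above is operator precedence: in R the colon operator binds more tightly than binary minus and than multiplication, so 0:n-1 is read as (0:n)-1 and (n*i+1):n*(i+1) as ((n*i+1):n)*(i+1). A small illustration with arbitrarily chosen values of n and i (the variable names on the left are hypothetical):

```r
n <- 3
i <- 1

loop_as_written <- 0:n - 1        # parsed as (0:n) - 1, i.e. -1 0 1 2
loop_intended   <- 0:(n - 1)      # 0 1 2

idx_as_written  <- (n * i + 1):n * (i + 1)     # parsed as (4:3) * 2, i.e. 8 6
idx_intended    <- (n * i + 1):(n * (i + 1))   # 4 5 6

loop_as_written
idx_as_written
idx_intended
```

Wrapping both ends of every range in parentheses avoids the problem.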
I think a clearer solution is to reset the dimensions and then let R do the index calculations for you:
err_cov <- function(x){
m <- nrow(x)
n <- ncol(x)
#compute the full error covariance matrix as the inner product
#of vec(x) and its transpose
vec <- as.vector(x)
omega <- tcrossprod(vec)
dim(omega) <- c(n,m,n,m)
sigmas <- list()
for (j in 1:m)
sigmas[[j]] <- omega[,j,,j]
return(sigmas)
}
Here is an example:
> x
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> tcrossprod(vec)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
[2,] 2 4 6 8 10 12
[3,] 3 6 9 12 15 18
[4,] 4 8 12 16 20 24
[5,] 5 10 15 20 25 30
[6,] 6 12 18 24 30 36
> err_cov(x)
[[1]]
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 4 6
[3,] 3 6 9
[[2]]
[,1] [,2] [,3]
[1,] 16 20 24
[2,] 20 25 30
[3,] 24 30 36
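Note that this version returns m blocks of size n x n, following the comment in the question's code. The worked example in the question (a 3 x 4 matrix giving C1 to C4) instead shows n blocks of size m x m, one per column of x. If that is what is wanted, a sketch along the same lines would swap the roles of m and n when resetting the dimensions; diag_blocks is a hypothetical name used only for this illustration.

```r
diag_blocks <- function(x) {
  m <- nrow(x)
  n <- ncol(x)
  # outer product of vec(x) with itself: an (m*n) x (m*n) matrix
  omega <- tcrossprod(as.vector(x))
  # view it as a 4-d array: (row within block, block row, col within block, block col)
  dim(omega) <- c(m, n, m, n)
  # block j on the leading diagonal has block row == block col == j
  lapply(seq_len(n), function(j) omega[, j, , j])
}

A <- matrix(1:12, nrow = 3)
blocks <- diag_blocks(A)
length(blocks)   # 4
blocks[[2]]
#      [,1] [,2] [,3]
# [1,]   16   20   24
# [2,]   20   25   30
# [3,]   24   30   36
```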
Let me rephrase my question: how can I sum the numbers in each row and append the sum as a new last column, as in the second table (sum = a + b + c + d + e)?
I would also like to know: if some of the values are NA, can I still treat them as numbers?
Sample input:
a b c d e
1 90 67 18 39 74
2 100 103 20 45 50
3 80 87 23 44 89
4 95 57 48 79 90
5 74 81 61 95 131
Desired output:
a b c d e sum
1 90 67 18 39 74 288
2 100 103 20 45 50 318
3 80 87 23 44 89 323
4 95 57 48 79 90 369
5 74 81 61 95 131 442
To add a row sum, you can use addmargins
M <- matrix(c(90,67,18,39,74), nrow=1)
addmargins(M, 2) #2 = row margin
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 90 67 18 39 74 288
If you have missing data, you'll need to change the margin function to something that will properly handle the NA values
M<-matrix(c(90,67,18,NA,74), nrow=1)
addmargins(M, 2, FUN=function(...) sum(..., na.rm=T))
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 90 67 18 NA 74 249
Consider using apply(). For example:
set.seed(10) # optional, but this command will replicate data as shown
# create some data
x <- matrix(rnorm(25), nrow = 5, ncol = 5) # 5x5 matrix of random numbers
x
[,1] [,2] [,3] [,4] [,5]
[1,] 0.01874617 0.3897943 1.1017795 0.08934727 -0.5963106
[2,] -0.18425254 -1.2080762 0.7557815 -0.95494386 -2.1852868
[3,] -1.37133055 -0.3636760 -0.2382336 -0.19515038 -0.6748659
[4,] -0.59916772 -1.6266727 0.9874447 0.92552126 -2.1190612
[5,] 0.29454513 -0.2564784 0.7413901 0.48297852 -1.2651980
x.sum <-apply(x,1,sum) # sum the rows. Note: apply(x,2,sum) sums cols
x.sum
[1] 1.003356605 -3.776777904 -2.843256446 -2.431935624 -0.002762636
# attach new column (x.sum) to matrix x
x.sum.1 <-cbind(x,x.sum)
x.sum.1
x.sum
[1,] 0.01874617 0.3897943 1.1017795 0.08934727 -0.5963106 1.003356605
[2,] -0.18425254 -1.2080762 0.7557815 -0.95494386 -2.1852868 -3.776777904
[3,] -1.37133055 -0.3636760 -0.2382336 -0.19515038 -0.6748659 -2.843256446
[4,] -0.59916772 -1.6266727 0.9874447 0.92552126 -2.1190612 -2.431935624
[5,] 0.29454513 -0.2564784 0.7413901 0.48297852 -1.2651980 -0.002762636
Let's say you have the dataframe df, then you could try something like this:
# Assuming the columns a,b,c,d,e are at indices 1:5
df$sum = rowSums(df[ , c(1:5)], na.rm = T)
Or you could also try this (note that na.rm belongs inside rowSums, not transform):
transform(df, sum = rowSums(df, na.rm = TRUE))
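Applied to the sample input from the question, this might look as follows. The data frame is typed in by hand here, and the copy df_na with one missing value is made up purely to show the effect of na.rm.

```r
df <- data.frame(
  a = c(90, 100, 80, 95, 74),
  b = c(67, 103, 87, 57, 81),
  c = c(18, 20, 23, 48, 61),
  d = c(39, 45, 44, 79, 95),
  e = c(74, 50, 89, 90, 131)
)

cols <- c("a", "b", "c", "d", "e")
df$sum <- rowSums(df[, cols], na.rm = TRUE)
df$sum
# [1] 288 318 323 369 442

# With a missing value: na.rm = TRUE skips it, the default propagates NA
df_na <- df[, cols]
df_na$d[1] <- NA
rowSums(df_na, na.rm = TRUE)[1]   # 249
rowSums(df_na)[1]                 # NA
```

Selecting the columns by name keeps the sum column itself out of the calculation if the line is run a second time.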
I started using R about six months ago and have gained a little experience with it. Recently, I ran into an issue with subsetting a matrix and would like help making my solution more efficient.
What I would like to do is the following. Suppose I have a matrix and two vectors as follows:
# matrix
a <- matrix(seq(1,100,by=1),10,10)
# vector (first column of matrix a)
b <- c(2,4,5,6,7,8)
# vector (column numbers of matrix a)
c <- c(5,3,1,4,6,2)
Just to reiterate,
Vector b refers to the first column of matrix a.
Vector c refers to column numbers of matrix a.
I would like to get tmp99 <- a[b,c:8]. However, when I do that I get the following warning message.
Warning message:
In c:8 : numerical expression has 6 elements: only the
first used (index has to be scalar and not vector)
So, I tried working around the problem using loops and a list, and I get the result I want. I assume there is a more time-efficient solution than this. The solution I have so far is the following:
a <- matrix(seq(1,100,by=1),10,10)
b <- c(2,4,5,6,7,8)
c <- c(5,3,1,4,6,2)
tmp <- list()
for (i in 1:length(b)) tmp[[i]] <- c(a[b[i],(c[i]:8)])
tmp99 <- t(sapply(tmp, '[', 1:max(sapply(tmp, length))))
tmp99[is.na(tmp99)] <- 0
What I would like to know is whether there is a way to avoid the loop. My real matrix is 200000 x 200 and I have to do this many times, so I would like to cut down the time taken. (In my problem, b and c are computed by another part of the code, so I cannot hard-code the index numbers.) Any help will be greatly appreciated. Thank you.
You might try a matrix-indexing solution like the one below. It's not clear whether it will actually be faster: in small cases I think it definitely will be, but in big cases the overhead of creating the index matrices might take longer than just running through a for loop. To get a better answer, make up a data set similar to yours that we could test against.
idx.in <- cbind(rep(b, 8-c+1), unlist(lapply(c, function(x) x:8)))
idx.out <- cbind(rep(seq_along(b), 8-c+1), unlist(lapply(c, function(x) 1:(8-x+1))))
tmp99 <- array(0, dim=apply(idx.out, 2, max))
tmp99[idx.out] <- a[idx.in]
Here's a version with matrix indexing but that does it separately for each row. This might be faster, depending on how many rows and columns are being replaced. What you want to avoid is running out of memory, which the for loop can help with, as it doesn't keep all the details for each step in memory at the same time.
out <- array(0, dim=c(length(b), 8-min(c)+1))
for(idx in seq_along(b)) {
out[cbind(idx, 1:(8-c[idx]+1))] <- a[cbind(b[idx], c[idx]:8)]
}
out
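The same matrix-indexing idea can also be written without any loop or lapply by generating all column offsets at once with outer(). This is only a sketch on the question's example data and has not been timed on a 200000 x 200 matrix; the names last, width, cols and valid are assumptions introduced for the illustration.

```r
a <- matrix(seq(1, 100, by = 1), 10, 10)
b <- c(2, 4, 5, 6, 7, 8)   # row of a for each output row
c <- c(5, 3, 1, 4, 6, 2)   # first column to take for each output row
last  <- 8                 # last column to take
width <- last - min(c) + 1 # number of output columns

# cols[r, k] is the column of a that feeds out[r, k]
cols  <- outer(c, 0:(width - 1), `+`)
valid <- cols <= last
cols[!valid] <- 1          # harmless placeholder so the lookup stays in range

# Two-column index matrix (row, column), in column-major order of 'cols'
out <- matrix(a[cbind(rep(b, times = width), as.vector(cols))],
              nrow = length(b))
out[!valid] <- 0           # zero out the padded positions
out
```

The index matrix has length(b) * width rows, so for very large inputs memory use is the thing to watch, as noted above.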
Following is one way to do it using base packages. There might be a better solution using data.table, but the following works :)
a <- matrix(seq(1, 100, by = 1), 10, 10)
b <- c(2, 4, 5, 6, 7, 8)
c <- c(5, 3, 1, 4, 6, 2)
res <- t(sapply(X = mapply(FUN = function(b, c) expand.grid(b, seq(from = c, to = 8)), b, c, SIMPLIFY = FALSE), FUN = function(x) {
c(a[as.matrix(x)], rep(0, 8 - nrow(x)))
}))
res
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 42 52 62 72 0 0 0 0
## [2,] 24 34 44 54 64 74 0 0
## [3,] 5 15 25 35 45 55 65 75
## [4,] 36 46 56 66 76 0 0 0
## [5,] 57 67 77 0 0 0 0 0
## [6,] 18 28 38 48 58 68 78 0
# Let's break it down in multiple steps.
coordinates <- mapply(FUN = function(b, c) expand.grid(b, seq(from = c, to = 8)), b, c, SIMPLIFY = FALSE)
# the sapply below subsets a using each element of coordinates and pads the result with 0s so that 8 elements are returned in total.
res <- sapply(X = coordinates, FUN = function(x) {
c(a[as.matrix(x)], rep(0, 8 - nrow(x)))
})
res
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 42 24 5 36 57 18
## [2,] 52 34 15 46 67 28
## [3,] 62 44 25 56 77 38
## [4,] 72 54 35 66 0 48
## [5,] 0 64 45 76 0 58
## [6,] 0 74 55 0 0 68
## [7,] 0 0 65 0 0 78
## [8,] 0 0 75 0 0 0
# you probably need the result transposed
res <- t(res)
res
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 42 52 62 72 0 0 0 0
## [2,] 24 34 44 54 64 74 0 0
## [3,] 5 15 25 35 45 55 65 75
## [4,] 36 46 56 66 76 0 0 0
## [5,] 57 67 77 0 0 0 0 0
## [6,] 18 28 38 48 58 68 78 0
tmp <- lapply(seq_len(length(b)),function(i) {
res <- a[b[i],c[i]:8]
res <- c(res,rep(0,c[i]-1))
res
})
tmp99 <- do.call("rbind",tmp)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 42 52 62 72 0 0 0 0
# [2,] 24 34 44 54 64 74 0 0
# [3,] 5 15 25 35 45 55 65 75
# [4,] 36 46 56 66 76 0 0 0
# [5,] 57 67 77 0 0 0 0 0
# [6,] 18 28 38 48 58 68 78 0