Interact each row of matrix with same row in another matrix - r

In R I have two matrices X and Z and I would like a
matrix W such that the row (i) of W contains row (i) of X interacted with row (i) of Z.
W(i) = X(i1)Z(i1) ... X(iJ)Z(i1) ... X(i1)Z(iK) ... X(iJ)Z(iK)
Here is an example in small scale doing what I want:
set.seed(1)
n <- 3
K <- 2
J <- 3
X <- matrix(rnorm(J*n),ncol=J)
Z <- matrix(rnorm(K*n),ncol=K)
W <- matrix(NA,nrow=n,ncol=K*J)
for (i in 1:n)
{
for (k in 1:K)
{
for (j in 1:J)
{
W[i,j + J*(k-1)] <- X[i,j] * Z[i,k]
}
}
}
Is there a clever way to do that?

I ended up doing
X[,sort(rep(1:J,K))] * Z[,rep(1:K,J)]
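Note that this index trick orders the columns with the X column varying slowest, which differs from the column order of the loop in the question. A minimal sketch of an index version that reproduces the loop's order, assuming the same X, Z, n, J and K as in the question (W_loop and W_idx are names made up for this illustration):

```r
set.seed(1)
n <- 3; K <- 2; J <- 3
X <- matrix(rnorm(J * n), ncol = J)
Z <- matrix(rnorm(K * n), ncol = K)

# Reference result from the explicit loop in the question
W_loop <- matrix(NA_real_, nrow = n, ncol = K * J)
for (i in 1:n) for (k in 1:K) for (j in 1:J) {
  W_loop[i, j + J * (k - 1)] <- X[i, j] * Z[i, k]
}

# Column j + J*(k-1) pairs X[, j] with Z[, k]:
# the X columns cycle 1..J, and each Z column is repeated J times in a row
W_idx <- X[, rep(1:J, times = K)] * Z[, rep(1:K, each = J)]

all.equal(W_idx, W_loop)
```

Both orderings contain the same products; which one is preferable depends only on how the columns of W are consumed afterwards.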

For this example, you can do
cbind(X * Z[, 1], X * Z[, 2])
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.1913117 -0.4871802 -0.1488552 0.3891785 -0.9910532 -0.3028107
#[2,] 0.2776285 0.4981436 1.1161854 -0.4067148 -0.7297608 -1.6351676
#[3,] -0.3257642 -0.3198541 0.2244645 -0.9400245 -0.9229703 0.6477142
Or more generally we can use apply for many more columns.
W[] <- apply(Z, 2, function(x) X * x)
which gives the same output as W which we get after running your loop.
W
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.1913117 -0.4871802 -0.1488552 0.3891785 -0.9910532 -0.3028107
#[2,] 0.2776285 0.4981436 1.1161854 -0.4067148 -0.7297608 -1.6351676
#[3,] -0.3257642 -0.3198541 0.2244645 -0.9400245 -0.9229703 0.6477142
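If you prefer to keep the cbind form for an arbitrary number of columns in Z, one possible sketch is to build one block per column of Z and bind the blocks side by side. It assumes the same X, Z and K as in the question; W_blocks is a made-up name.

```r
set.seed(1)
n <- 3; K <- 2; J <- 3
X <- matrix(rnorm(J * n), ncol = J)
Z <- matrix(rnorm(K * n), ncol = K)

# X * Z[, k] recycles the length-n vector down every column of X,
# so row i of X is scaled by Z[i, k]
W_blocks <- do.call(cbind, lapply(seq_len(K), function(k) X * Z[, k]))

dim(W_blocks)
```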

Related

How to compute an objective function value from a matrix?

Suppose that we have the following Beale function:
custom_fun <- function(x, y) { # Create custom function in R
z <- (1.5 - x + x*y)^2+(2.25 - x + x*y^2)^2+(2.625 - x + x*y^3)^2
return(z)
}
and we have this list of positions in a matrix m:
[1] "list of respective positions"
[,1] [,2] [,3]
x: 2.482116 -0.7845145 -3.370810
y: -1.031615 1.2035550 1.203555
How can I compute custom_fun(x, y) for each column in an elegant way?
I want to store those values in the last row of the matrix m.
Thank you for your help!
You can use apply:
custom_fun <- function(x, y) { # Create custom function in R
z <- (1.5 - x + x*y)^2+(2.25 - x + x*y^2)^2+(2.625 - x + x*y^3)^2
return(z)
}
my_mat <- matrix(rnorm(10), nrow = 2, dimnames = list(c("x", "y"), NULL))
my_mat
#> [,1] [,2] [,3] [,4] [,5]
#> x 0.5631441 0.9349816 -1.0088734 -1.364570 -1.32633896
#> y 0.4978350 -1.3265677 0.4206566 -2.532265 -0.01913554
rbind(my_mat, `fun(x, y)` = apply(my_mat, 2, function(x) custom_fun(x[1], x[2])))
#> [,1] [,2] [,3] [,4] [,5]
#> x 0.5631441 0.9349816 -1.0088734 -1.364570 -1.32633896
#> y 0.4978350 -1.3265677 0.4206566 -2.532265 -0.01913554
#> fun(x, y) 9.3600306 9.4626106 26.4985306 749.993103 36.53218207
Created on 2020-06-21 by the reprex package (v0.3.0)
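Because custom_fun only uses element-wise arithmetic, it is already vectorized, so apply is not strictly needed. A minimal sketch, assuming the matrix has rows named x and y; the values are copied from the question's printout and the row name value is made up here:

```r
custom_fun <- function(x, y) {
  (1.5 - x + x * y)^2 + (2.25 - x + x * y^2)^2 + (2.625 - x + x * y^3)^2
}

# Positions copied from the question's printout
m <- rbind(x = c(2.482116, -0.7845145, -3.370810),
           y = c(-1.031615, 1.2035550, 1.203555))

# Pass whole rows at once; the result has one value per column of m
m <- rbind(m, value = custom_fun(m["x", ], m["y", ]))
m
```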

for loops in r to create a matrix

I want to create a 10 by 8 matrix from two matrices, also of dimensions 10 by 8, using for loops.
I have matrices a and e and I want to save the results of the code below to a matrix. But when I run the code, the matrix chi is empty except for the cell in the last row of the last column. I am kind of a newbie to R, so any help is appreciated. Thanks.
chi <- matrix(nrow = 10, ncol = 8, byrow = T)
i <- nrow(a)
j <- ncol(a)
k <- nrow(e)
l <- ncol(e)
m <- nrow(chi)
n <- ncol(chi)
for (i in 1:nrow(a)) {
for (j in 1:ncol(a)) {
for (k in 1:nrow(e)) {
for (l in 1:ncol(e))
chi[m, n] <- ((a[i, j] - e[k, l]) ^ 2 / (e[k, l] * (1 - e[k, l])))
}
}
}
Reconsider using nested for loops at all: since a and e have the same dimensions, R's element-wise matrix arithmetic does the whole computation in one line:
chi <- ((a - e) ^ 2 / (e * (1 - e)))
With your nested for loop approach, the same cell chi[m, n] is assigned on every pass (m and n never change), so each pass overwrites the previous one and only the very last value is saved.
To demonstrate, consider the following random matrices (seeded for reproducibility):
set.seed(1162018)
a <- matrix(runif(800), nrow = 10, ncol = 8)
e <- matrix(runif(800), nrow = 10, ncol = 8)
With following output:
chi2 <- ((a - e) ^ 2 / (e * (1 - e)))
chi2
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 1.090516287 5.314965506 0.30221649 4.3078030566 0.08185310 0.33991625 7.475638e-01 7.136321e+01
# [2,] 0.339472596 0.037831564 1.00181544 0.0075194551 0.27228312 20.74823838 2.308509e-04 1.264312e-04
# [3,] 0.001493967 0.009102797 17.76508355 0.0318190760 0.08133848 0.90538852 1.425952e-01 3.600838e-02
# [4,] 25.941857200 2.182678801 0.52170472 0.5485710933 0.57015681 0.09332506 2.631002e-01 4.897862e-01
# [5,] 4.341993499 0.075724451 0.03409925 0.0058830640 0.15290151 0.83227284 2.982630e+02 2.615268e-01
# [6,] 0.327661207 0.058150213 0.17328257 0.3161902785 4.48620227 0.14685330 2.996204e+00 1.888419e+01
# [7,] 0.456397833 1.446942556 0.51597191 0.2051742161 0.20440765 0.58169351 5.345522e+00 1.320896e-03
# [8,] 12.844776005 0.753941152 0.36425134 0.0003481929 0.34011118 2.38649404 1.082046e-01 1.817180e-01
# [9,] 0.042779101 0.119540004 1.41313002 0.1262586599 0.36583013 1.76476721 1.353301e+00 1.670491e-01
# [10,] 4.729182008 5.257386394 0.62181731 0.0000251250 0.32324943 0.08491841 6.627723e+00 2.127289e+00
Notice that the first, second, and all the way to the last elements of chi2 are consistent with your original formula when it is evaluated on single values. The all.equal() call shows that a value printed in scientific notation is the same as its fixed-notation form.
((a[1, 1] - e[1, 1]) ^ 2 / (e[1, 1] * (1 - e[1, 1])))
# [1] 1.090516
((a[1, 2] - e[1, 2]) ^ 2 / (e[1, 2] * (1 - e[1, 2])))
# [1] 5.314966
# ...
((a[10, 8] - e[10, 8]) ^ 2 / (e[10, 8] * (1 - e[10, 8])))
# [1] 2.127289
all.equal(2.127289e+00, 2.127289)
# [1] TRUE
Incorrect For Loop Processing
However, adjusting your for loop to assign to chi[i,j] does yield values, but on closer inspection they do not match your original formula:
chi <- matrix(nrow = 10, ncol = 8, byrow = T)
i <- nrow(a)
j <- ncol(a)
k <- nrow(e)
l <- ncol(e)
m <- nrow(chi)
n <- ncol(chi)
for (i in 1:nrow(a)) {
for (j in 1:ncol(a)) {
for (k in 1:nrow(e)) {
for (l in 1:ncol(e))
chi[i,j] <- ((a[i, j] - e[k, l]) ^ 2 / (e[k, l] * (1 - e[k, l])))
}
}
}
chi
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 3.409875713 1.91797098 0.983457185 0.72023148 0.96731753 0.047236836 2.20811240 0.6073649
# [2,] 0.011756997 2.96049899 3.614632753 1.30476270 2.49116488 0.074379894 1.01941080 0.3796867
# [3,] 2.061628776 0.03227113 0.691592758 2.58226782 0.17603261 4.377353084 1.07957101 0.9584883
# [4,] 5.477395731 0.07409188 5.287871705 1.86472765 2.02597697 0.078780553 6.20319269 2.6099405
# [5,] 4.342937737 3.57579681 1.016981597 2.83351392 1.11431922 0.083484410 0.08412765 0.5525810
# [6,] 0.008175703 2.63310577 0.005053893 3.69703754 0.05993078 0.004768071 5.92075341 4.2435415
# [7,] 1.051921956 0.31217144 5.624012725 0.90161687 0.43301151 0.156739757 0.72284317 1.2243496
# [8,] 4.941310521 4.85504735 0.021515999 3.66512027 0.08358373 3.603038468 0.38618455 6.1389345
# [9,] 0.559136535 5.08204325 2.999036687 2.72726724 5.99168376 0.319859158 0.59398961 3.6221932
# [10,] 0.001668949 2.97353267 4.703763876 0.04979429 5.31715581 0.053267595 2.09966809 2.1272893
Here, each chi[i,j] is overwritten many times by the two inner loops, so only the last assignment survives. As a result, ALL elements of the chi matrix use the last element of e:
((a[1, 1] - e[10, 8]) ^ 2 / (e[10, 8] * (1 - e[10, 8])))
# [1] 3.409876
((a[1, 2] - e[10, 8]) ^ 2 / (e[10, 8] * (1 - e[10, 8])))
# [1] 1.917971
# ...
((a[10, 8] - e[10, 8]) ^ 2 / (e[10, 8] * (1 - e[10, 8])))
# [1] 2.127289
Conversely, using chi[k,l] for the assignment in the loop:
for (i in 1:nrow(a)) {
for (j in 1:ncol(a)) {
for (k in 1:nrow(e)) {
for (l in 1:ncol(e))
chi[k,l] <- ((a[i, j] - e[k, l]) ^ 2 / (e[k, l] * (1 - e[k, l])))
}
}
}
chi
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 5.649285e-01 5.813300e+00 0.035949545 10.14845208 0.002533313 0.405749651 0.711058301 2.592142e+01
# [2,] 7.481556e+00 4.531135e-05 0.455696004 0.09284383 0.192074706 4.178867177 0.105489574 3.541626e-01
# [3,] 4.953702e-04 6.703029e+00 41.109139456 0.08957573 1.511080005 0.254656165 0.004840752 2.805246e-01
# [4,] 1.152237e+01 2.556255e-02 0.018652264 0.65975403 0.515919955 0.280219679 0.124379946 7.777978e-01
# [5,] 2.126765e+00 5.356927e-01 0.251885418 0.06540162 0.008580900 0.003271672 41.259025738 2.963719e-06
# [6,] 1.401345e-01 1.603721e-02 0.334385097 0.05865054 0.622973490 0.608273911 0.888928067 1.046868e+01
# [7,] 1.018507e-01 1.756129e-01 0.005676374 0.72309875 0.011666290 0.314863595 12.420604213 7.778975e-02
# [8,] 6.082752e+00 1.250805e-01 0.287099891 0.17209992 0.050136187 1.339028574 1.059674334 2.627769e-01
# [9,] 8.005223e-02 9.260464e-02 2.823995704 0.04935770 0.020361815 0.258144647 0.275514317 9.392584e-03
# [10,] 4.952038e-01 3.870331e+00 0.089420009 1.05729955 0.002429084 0.349966871 6.702385325 2.127289e+00
As a result, ALL matrix elements use the last value of a:
((a[10, 8] - e[1, 1]) ^ 2 / (e[1, 1] * (1 - e[1, 1])))
# [1] 0.5649285
all.equal(5.649285e-01, 0.5649285)
# [1] TRUE
((a[10, 8] - e[1, 2]) ^ 2 / (e[1, 2] * (1 - e[1, 2])))
# [1] 5.8133
all.equal(5.813300e+00, 5.8133)
# [1] TRUE
# ...
((a[10, 8] - e[10, 8]) ^ 2 / (e[10, 8] * (1 - e[10, 8])))
# [1] 2.127289
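If you do want to keep a loop, two levels are enough, because cell (i, j) of the result depends only on a[i, j] and e[i, j]. A minimal sketch follows; it draws exactly 80 values per matrix, so the numbers will differ from the printouts in this answer, and chi_loop and chi_vec are names made up for the comparison.

```r
set.seed(1162018)
a <- matrix(runif(80), nrow = 10, ncol = 8)
e <- matrix(runif(80), nrow = 10, ncol = 8)

# One pass over the cells: the same index pair is used for a, e and chi
chi_loop <- matrix(NA_real_, nrow = nrow(a), ncol = ncol(a))
for (i in seq_len(nrow(a))) {
  for (j in seq_len(ncol(a))) {
    chi_loop[i, j] <- (a[i, j] - e[i, j])^2 / (e[i, j] * (1 - e[i, j]))
  }
}

# Same result as the vectorized expression
chi_vec <- (a - e)^2 / (e * (1 - e))
all.equal(chi_loop, chi_vec)
```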

How to generate symmetric random matrix?

I want to generate a random matrix which should be symmetric.
I have tried this:
matrix(sample(0:1, 25, TRUE), 5, 5)
but it is not necessarily symmetric.
How can I do that?
Another quite interesting option is based on the following mathematical fact: if A is some matrix, then A multiplied by its transpose is always symmetric.
> A <- matrix(runif(25), 5, 5)
> A %*% t(A)
[,1] [,2] [,3] [,4] [,5]
[1,] 1.727769 1.0337816 1.2195505 1.4661507 1.1041355
[2,] 1.033782 1.0037048 0.7368944 0.9073632 0.7643080
[3,] 1.219551 0.7368944 1.8383986 1.3309980 0.9867812
[4,] 1.466151 0.9073632 1.3309980 1.3845322 1.0034140
[5,] 1.104135 0.7643080 0.9867812 1.0034140 0.9376534
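A quick way to check the claim is isSymmetric(). The sketch below uses an arbitrary seed and the made-up name S; tcrossprod(A) is base R's shortcut for the same product. Keep in mind that the entries of the product are no longer independent uniform draws, and the result is always positive semi-definite, so this is not a fit if you need specific entry values such as 0/1.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
A <- matrix(runif(25), 5, 5)

S <- A %*% t(A)       # equivalently: tcrossprod(A)
isSymmetric(S)
```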
Try this from the Matrix package
library(Matrix)
x<-Matrix(rnorm(9),3)
x
3 x 3 Matrix of class "dgeMatrix"
[,1] [,2] [,3]
[1,] -0.9873338 0.8965887 -0.6041742
[2,] -0.3729662 -0.5882091 -0.2383262
[3,] 2.1263985 -0.3550972 0.1067264
X<-forceSymmetric(x)
X
3 x 3 Matrix of class "dsyMatrix"
[,1] [,2] [,3]
[1,] -0.9873338 0.8965887 -0.6041742
[2,] 0.8965887 -0.5882091 -0.2383262
[3,] -0.6041742 -0.2383262 0.1067264
If you don't want to use a package:
n=3
x <- matrix(rnorm(n*n), n)
ind <- lower.tri(x)
x[ind] <- t(x)[ind]
x
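The same mirroring idea can be applied to the 0/1 matrix from the question. A minimal sketch, with an arbitrary seed and the made-up name B:

```r
set.seed(1)  # arbitrary seed, only for reproducibility
n <- 5
B <- matrix(sample(0:1, n * n, replace = TRUE), n, n)

# Copy the upper triangle onto the lower triangle; the diagonal is left as drawn
B[lower.tri(B)] <- t(B)[lower.tri(B)]

isSymmetric(B)
```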
I like this one:
n <- 3
aux <- matrix(NA, nrow = n, ncol = n)
for(i in c(1:n)){
for(j in c(i:n)){
aux[i,j] <- sample(c(1:n), 1)
aux[j,i] <- aux[i,j]
}
}

Defining a function that calculates the covariance-matrix of a correlation-matrix

I have some problems with the transformation of a matrix and the names of the rows and columns.
My problem is as follows:
As input-matrix I have a (symmetric) correlation matrix like this one:
The correlation-vector is given by the values of the lower triangular matrix:
Now, I want to compute the variance-covariance matrix of these correlations, which are approximately normally distributed with the variance-covariance matrix:
The variances can be approximated by
-> N is the sample size (in this example N = 66)
The covariances can be approximated by
For example the covariance between r_02 and r_13 is given by
Now, I want to define a function in R which takes the correlation matrix as input and returns the variance-covariance matrix. However, I have trouble implementing the calculation of the covariances. My idea is to give names to the elements of the correlation_vector as shown above (r_01, r_02...). Then I want to create the empty variance-covariance matrix, whose number of rows and columns equals the length of the correlation_vector. The rows and the columns should have the same names as the correlation_vector, so I can call them for example by [01][03]. Then I want to implement a for-loop which sets the values of i and j as well as k and l, as shown in the formula for the covariance, to the columns and rows of the correlations that I need as input for the covariance formula. These must always be six different values (ij; ik; il; jk; jl; lk). This is my idea, but I don't know how to implement this in R.
This is my code (without the calculation of the covariances):
require(corpcor)
correlation_matrix_input <- matrix(data=c(1.00,0.561,0.393,0.561,0.561,1.00,0.286,0.549,0.393,0.286,1.00,0.286,0.561,0.549,0.286,1.00),ncol=4,byrow=T)
N <- 66 # Sample Size
vector_of_correlations <- sm2vec(correlation_matrix_input, diag=F) # lower triangular matrix of correlation_matrix_input
variance_covariance_matrix <- matrix(nrow = length(vector_of_correlations), ncol = length(vector_of_correlations)) # creates the empty variance-covariance matrix
# function to fill the matrix by calculating the variance and the covariances
variances_covariances <- function(vector_of_correlations_input, sample_size) {
for (i in (seq(along = vector_of_correlations_input))) {
for (j in (seq(along = vector_of_correlations_input))) {
# calculate the variances for the diagonale
if (i == j) {
variance_covariance_matrix[i,j] = ((1-vector_of_correlations_input[i]**2)**2)/sample_size
}
# calculate the covariances
if (i != j) {
variance_covariance_matrix[i,j] = ???
}
}
}
return(variance_covariance_matrix);
}
Does anyone have an idea, how to implement the calculation of the covariances using the formula shown above?
I would be grateful for any kind of help regarding this problem!!!
It's easier if you keep r as a matrix and use this helper function to make things clearer:
covr <- function(r, i, j, k, l, n){
if(i==k && j==l)
return((1-r[i,j]^2)^2/n)
( 0.5 * r[i,j]*r[k,l]*(r[i,k]^2 + r[i,l]^2 + r[j,k]^2 + r[j,l]^2) +
r[i,k]*r[j,l] + r[i,l]*r[j,k] - (r[i,j]*r[i,k]*r[i,l] +
r[j,i]*r[j,k]*r[j,l] + r[k,i]*r[k,j]*r[k,l] + r[l,i]*r[l,j]*r[l,k]) )/n
}
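Written out, the quantities that covr computes appear to be the following. This is reconstructed from the code itself, since the formula images of the question are not available here; n denotes the sample size.

```latex
\operatorname{Var}(r_{ij}) \approx \frac{(1 - r_{ij}^2)^2}{n}

\operatorname{Cov}(r_{ij}, r_{kl}) \approx \frac{1}{n} \Big[
    \tfrac{1}{2}\, r_{ij} r_{kl} \,(r_{ik}^2 + r_{il}^2 + r_{jk}^2 + r_{jl}^2)
    + r_{ik} r_{jl} + r_{il} r_{jk}
    - \big( r_{ij} r_{ik} r_{il} + r_{ji} r_{jk} r_{jl}
          + r_{ki} r_{kj} r_{kl} + r_{li} r_{lj} r_{lk} \big)
\Big]
```

The first branch of covr (i == k and j == l) is the variance case; every other index combination falls through to the covariance expression.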
Now define this second function:
vcovr <- function(r, n){
p <- combn(nrow(r), 2)
q <- seq(ncol(p))
outer(q, q, Vectorize(function(x,y) covr(r, p[1,x], p[2,x], p[1,y], p[2,y], n)))
}
And voila:
> vcovr(correlation_matrix_input, 66)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.007115262 0.001550264 0.002917481 0.003047666 0.003101602 0.001705781
[2,] 0.001550264 0.010832674 0.001550264 0.006109565 0.001127916 0.006109565
[3,] 0.002917481 0.001550264 0.007115262 0.001705781 0.003101602 0.003047666
[4,] 0.003047666 0.006109565 0.001705781 0.012774221 0.002036422 0.006625868
[5,] 0.003101602 0.001127916 0.003101602 0.002036422 0.007394554 0.002036422
[6,] 0.001705781 0.006109565 0.003047666 0.006625868 0.002036422 0.012774221
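Since the question asked for named rows and columns, one possible sketch is to label the result with the variable pair each entry refers to. The functions are repeated so the block runs on its own; the labels use 1-based indices (r_12, r_13, ...) rather than the 0-based names in the question, and V, p and labels are names made up here.

```r
correlation_matrix_input <- matrix(c(1.00, 0.561, 0.393, 0.561,
                                     0.561, 1.00, 0.286, 0.549,
                                     0.393, 0.286, 1.00, 0.286,
                                     0.561, 0.549, 0.286, 1.00),
                                   ncol = 4, byrow = TRUE)

# covr() and vcovr() as defined in this answer
covr <- function(r, i, j, k, l, n) {
  if (i == k && j == l) return((1 - r[i, j]^2)^2 / n)
  (0.5 * r[i, j] * r[k, l] * (r[i, k]^2 + r[i, l]^2 + r[j, k]^2 + r[j, l]^2) +
     r[i, k] * r[j, l] + r[i, l] * r[j, k] -
     (r[i, j] * r[i, k] * r[i, l] + r[j, i] * r[j, k] * r[j, l] +
        r[k, i] * r[k, j] * r[k, l] + r[l, i] * r[l, j] * r[l, k])) / n
}
vcovr <- function(r, n) {
  p <- combn(nrow(r), 2)
  q <- seq(ncol(p))
  outer(q, q, Vectorize(function(x, y) covr(r, p[1, x], p[2, x], p[1, y], p[2, y], n)))
}

V <- vcovr(correlation_matrix_input, 66)

# Column x of combn(nrow(r), 2) is the variable pair behind row/column x of V
p <- combn(nrow(correlation_matrix_input), 2)
labels <- paste0("r_", p[1, ], p[2, ])
dimnames(V) <- list(labels, labels)

V["r_13", "r_24"]  # covariance between r_13 and r_24, looked up by name
```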
EDIT:
For the transformed Z values, as in your comment, you can use this:
covrZ <- function(r, i, j, k, l, n){
if(i==k && j==l)
return(1/(n-3))
covr(r, i, j, k, l, n) / ((1-r[i,j]^2)*(1-r[k,l]^2))
}
And simply replace it in vcovr:
vcovrZ <- function(r, n){
p <- combn(nrow(r), 2)
q <- seq(ncol(p))
outer(q, q, Vectorize(function(x,y) covrZ(r, p[1,x], p[2,x], p[1,y], p[2,y], n)))
}
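The scaling in covrZ is consistent with a delta-method argument for Fisher's transform z = atanh(r), whose derivative is 1/(1 - r^2). A sketch of the resulting approximations, with r_ij the correlations and n the sample size (this derivation is an assumption about the reasoning behind the code, not something stated in the answer):

```latex
z_{ij} = \operatorname{artanh}(r_{ij}), \qquad
\frac{\partial z_{ij}}{\partial r_{ij}} = \frac{1}{1 - r_{ij}^2}

\operatorname{Cov}(z_{ij}, z_{kl}) \approx
    \frac{\operatorname{Cov}(r_{ij}, r_{kl})}{(1 - r_{ij}^2)(1 - r_{kl}^2)}, \qquad
\operatorname{Var}(z_{ij}) \approx \frac{1}{n - 3}
```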
New result:
> vcovrZ(correlation_matrix_input,66)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.015873016 0.002675460 0.006212598 0.004843517 0.006478743 0.002710920
[2,] 0.002675460 0.015873016 0.002675460 0.007869213 0.001909452 0.007869213
[3,] 0.006212598 0.002675460 0.015873016 0.002710920 0.006478743 0.004843517
[4,] 0.004843517 0.007869213 0.002710920 0.015873016 0.003174685 0.007858948
[5,] 0.006478743 0.001909452 0.006478743 0.003174685 0.015873016 0.003174685
[6,] 0.002710920 0.007869213 0.004843517 0.007858948 0.003174685 0.015873016
I wrote an approach using combn and row/column indices to generate the different combinations of p.
variances_covariances <- function(m, n) {
r <- m[lower.tri(m)]
var <- (1-r^2)^2
## row/column index of each lower-triangular element, in the same (column-major) order as r
rowIdx <- row(m)[lower.tri(m)]
colIdx <- col(m)[lower.tri(m)]
## generate combinations
cov <- combn(length(r), 2, FUN=function(i) {
## current row/column indices
cr <- rowIdx[i] ## i,k
cc <- colIdx[i] ## j,l
## define 6 cases
p.ij <- m[cr[1], cc[1]]
p.ik <- m[cr[1], cr[2]]
p.il <- m[cr[1], cc[2]]
p.jk <- m[cc[1], cr[2]]
p.jl <- m[cc[1], cc[2]]
p.kl <- m[cr[2], cc[2]]
## calculate covariance
co <- 0.5 * p.ij * p.kl * (p.ik^2 + p.il^2 + p.jk^2 + p.jl^2) +
p.ik * p.jl + p.il * p.jk -
(p.ij * p.ik * p.il + p.ij * p.jk * p.jl + p.ik * p.jk * p.kl + p.il * p.jl * p.kl)
return(co)
})
## create output matrix
com <- matrix(NA, ncol=length(r), nrow=length(r))
com[lower.tri(com)] <- cov
com[upper.tri(com)] <- t(com)[upper.tri(com)]
diag(com) <- var
return(com/n)
}
Output:
m <- matrix(data=c(1.000, 0.561, 0.393, 0.561,
0.561, 1.000, 0.286, 0.549,
0.393, 0.286, 1.000, 0.286,
0.561, 0.549, 0.286, 1.00), ncol=4, byrow=T)
variances_covariances(m, 66)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.007115262 0.001550264 0.002917481 0.003047666 0.003101602 0.001705781
#[2,] 0.001550264 0.010832674 0.001550264 0.006109565 0.001127916 0.006109565
#[3,] 0.002917481 0.001550264 0.007115262 0.001705781 0.003101602 0.003047666
#[4,] 0.003047666 0.006109565 0.001705781 0.012774221 0.002036422 0.006625868
#[5,] 0.003101602 0.001127916 0.003101602 0.002036422 0.007394554 0.002036422
#[6,] 0.001705781 0.006109565 0.003047666 0.006625868 0.002036422 0.012774221
I hope I have done everything right.
salam/hello
# variance_vector, genCorr and sig2g are assumed to be defined beforehand
variance_covariance_matrix <- diag(variance_vector, length(r), length(r))
pcomb <- combn(length(r), 2)
for (k in 1:length(r)){
i <- pcomb[1,k]
j <- pcomb[2,k]
variance_covariance_matrix[i,j] <- variance_covariance_matrix[j,i] <- genCorr[k] * sqrt(sig2g[i]) * sqrt(sig2g[j])
}

check column values and print/delete rows satisfying condition based on percent of columns

I have a matrix of values arranged in different columns per row.
What I want my code to do :
Iterate over a row -> check if value of Column < threshold (e.g. 1)
Within the row, if there are say 80% columns satisfying that condition, Keep the row ; else remove the full row.
Code so far :
myfilt <- function(t,x){
if ((length(which(t[x,] > 1)) / 60) >= 0.8){
return(1)
}else{
return(0)
}
}
y=c()
for(i in 1:length(t[,1])){
y = c(y,myfilt(t,i))
}
But when I print t[v,] all the rows have same value :(
Not sure what I am doing wrong. Also if there is a shorter way to do this, let me know.
P.S. : Here 't' is the name of matrix I am testing
Here's a way to do it :
## Parameters
threshold <- 0.8
perc.to.keep <- 0.5
## Example Matrix
set.seed(1337)
m <- matrix(rnorm(25,1,1),nrow=5,ncol=5)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1.7122837 0.8383025 -0.02718379 2.2157099 2.1291008
# [2,] 0.2462742 2.4602621 -0.04117532 -0.6214087 1.4501467
# [3,] 1.0381899 3.0094584 0.12937698 0.9314247 1.0505864
# [4,] 2.1784211 0.9220618 1.85313022 0.9370171 0.8756698
# [5,] 0.8467962 2.3543421 0.37723981 2.0757077 1.9120115
test <- m < threshold
sel <- apply(test,1,function(v) sum(v)/length(v)) < perc.to.keep
m[sel,]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1.7122837 0.8383025 -0.02718379 2.2157099 2.1291008
# [2,] 1.0381899 3.0094584 0.12937698 0.9314247 1.0505864
# [3,] 2.1784211 0.9220618 1.85313022 0.9370171 0.8756698
# [4,] 0.8467962 2.3543421 0.37723981 2.0757077 1.9120115
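For the rule as stated in the question (keep a row when at least 80% of its columns are below the threshold), rowMeans() on the logical matrix gives the share directly. A minimal sketch on the same example matrix; min_share, keep and m_kept are names made up for this illustration:

```r
set.seed(1337)
m <- matrix(rnorm(25, 1, 1), nrow = 5, ncol = 5)

threshold <- 1     # a cell "passes" when its value is below this
min_share <- 0.8   # minimum share of passing columns needed to keep a row

# rowMeans() of a logical matrix is the fraction of TRUE cells in each row
keep <- rowMeans(m < threshold) >= min_share

# drop = FALSE keeps the result a matrix even when zero or one row survives
m_kept <- m[keep, , drop = FALSE]
```

Flipping the comparison (m > threshold) or the direction of the share test adapts the same two lines to the variant used in the question's own code.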
