Filling the upper triangle of a matrix in R

I have 3 vectors as such:
p
[,1]
[1,] 0.002715955
[2,] 0.004460214
[3,] 0.006855524
[4,] 0.007438570
[5,] 0.042657555
q
[,1]
[1,] 0.9972840
[2,] 0.9955398
[3,] 0.9931445
[4,] 0.9925614
[5,] 0.9573424
f
[,1]
[1,] 0.008364157
[2,] 0.013058930
[3,] 0.019131397
[4,] 0.020559785
[5,] 0.090786054
I want to perform the following operation to obtain the variance-covariance matrix (only the upper triangle of the matrix is filled):
v_ij <- p_i*q_j/(n*f_i*f_j); 1<=i<=j<=n
where n is the number of observations (here 5). I wrote the following code to perform the calculation:
v_ij <- matrix(nrow=length(p), ncol=length(p))
for(i in 1:nrow(p)){
for(j in 1:nrow(q)){
if(i==j){
diag(v_ij) <- p[i]*q[j]/(f[i]*f[j])
}
}
}
This code only fills the diagonal; I didn't know which function could fill the upper part of the matrix for i < j.

Maybe you can use tcrossprod + lower.tri, e.g.,
v <- tcrossprod(p,q)/tcrossprod(f)/5
v[lower.tri(v)] <- 0
such that
> v
[,1] [,2] [,3] [,4] [,5]
[1,] 0.04444444 0.050 0.05333333 0.05555556 0.05714286
[2,] 0.00000000 0.075 0.08000000 0.08333333 0.08571429
[3,] 0.00000000 0.000 0.09600000 0.10000000 0.10285714
[4,] 0.00000000 0.000 0.00000000 0.11111111 0.11428571
[5,] 0.00000000 0.000 0.00000000 0.00000000 0.12244898
DATA
p <- matrix(1:5,nrow = 5,ncol = 1)
q <- matrix(2:6,nrow = 5,ncol = 1)
f <- matrix(3:7,nrow = 5,ncol = 1)
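The same matrix can also be built with outer(), which may read closer to the formula v_ij = p_i*q_j/(n*f_i*f_j). The sketch below assumes the toy data from this answer (as plain vectors rather than one-column matrices) and marks the unused lower triangle with NA instead of 0; both choices are only illustrative. An explicit double loop over i <= j is included for comparison.

```r
# Toy data from the answer above, as plain vectors (an assumption for brevity)
p <- 1:5
q <- 2:6
f <- 3:7
n <- length(p)

# outer(p, q)[i, j] is p[i] * q[j]; outer(f, f)[i, j] is f[i] * f[j]
v <- outer(p, q) / (n * outer(f, f))

# Keep only i <= j; NA makes it obvious that the lower triangle is unused
v[lower.tri(v)] <- NA

# Explicit double loop over the upper triangle, for comparison
v_loop <- matrix(NA_real_, n, n)
for (i in seq_len(n)) {
  for (j in i:n) {
    v_loop[i, j] <- p[i] * q[j] / (n * f[i] * f[j])
  }
}
```

Starting the inner loop at j = i is what restricts the loop version to the diagonal and upper triangle.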

You can use upper.tri() as suggested to produce the upper part if that is all you need. The cov() function produces the full symmetric matrix:
X <- matrix(rnorm(15), ncol=3)
vcov <- cov(X)
If you run
vcov[upper.tri(vcov, diag=TRUE)]
then you only extract the elements but return them as a vector, but
vcov*upper.tri(vcov, diag=TRUE)
will give you what you want.
> X <- matrix(rnorm(15), ncol=3)
> vcov <- cov(X)
> vcov[upper.tri(vcov, diag=TRUE)]
[1] 0.2698385 0.2844101 0.9375529 0.1457635 0.0223076 0.1626811
> vcov*upper.tri(vcov, diag=TRUE)
[,1] [,2] [,3]
[1,] 0.2698385 0.2844101 0.1457635
[2,] 0.0000000 0.9375529 0.0223076
[3,] 0.0000000 0.0000000 0.1626811
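As a small sketch of an alternative, the lower triangle can be zeroed by assignment instead of multiplying by the logical mask. The variable names below are illustrative and the seed is arbitrary. The two approaches agree for finite values, but multiplication leaves NA in place (NA * 0 is NA), whereas assignment overwrites it.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
X <- matrix(rnorm(15), ncol = 3)
vc <- cov(X)

# Approach from the answer: multiply by the logical mask
by_mask <- vc * upper.tri(vc, diag = TRUE)

# Alternative: overwrite the lower triangle directly
by_assign <- vc
by_assign[lower.tri(by_assign)] <- 0

# With a missing value in the lower triangle the two differ
vc_na <- vc
vc_na[3, 1] <- NA
masked_na <- vc_na * upper.tri(vc_na, diag = TRUE)   # [3, 1] stays NA
assigned_na <- vc_na
assigned_na[lower.tri(assigned_na)] <- 0             # [3, 1] becomes 0
```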


Replacing upper triangular matrix elements in row order

I have a square matrix as follows.
> ex_mat
[,1] [,2] [,3] [,4]
[1,] 0.4270634 2.1920890 0.5647472 1.7149861
[2,] 2.0556220 1.1157322 2.6723637 0.3155507
[3,] 1.2252602 0.1063053 0.6396099 0.7903348
[4,] 0.3614062 1.1118661 0.5000143 0.2491543
I've ranked the upper off-diagonal part of the matrix in row order (from largest to smallest) with this.
> rank(-(t(ex_mat)[lower.tri(ex_mat)]))
[1] 2 5 3 1 6 4
I want to replace the upper off-diagonal elements of "ex_mat" with the ranks obtained above. I'm using https://statisticsglobe.com/modify-diagonal-lower-upper-triangular-part-matrix-r as example code. While my ranks are correct, the code seems to be inserting the ranks in column order.
> ex_mat_new <- ex_mat
> ex_mat_new[upper.tri(ex_mat_new)] <- rank(-(t(ex_mat)[lower.tri(ex_mat)]))
> ex_mat_new
[,1] [,2] [,3] [,4]
[1,] 0.4270634 2.0000000 5.0000000 1.0000000
[2,] 2.0556220 1.1157322 3.0000000 6.0000000
[3,] 1.2252602 0.1063053 0.6396099 4.0000000
[4,] 0.3614062 1.1118661 0.5000143 0.2491543
How can I fix this? [1,4] and [2,3] are off. Thank you.
R stores matrices in column-major order, so values assigned through upper.tri() are inserted column by column. We can instead assign to the lower triangle and then take the transpose:
ex_mat_new[lower.tri(ex_mat_new)] <- rank(-(t(ex_mat)[lower.tri(ex_mat)]))
ex_mat_new <- t(ex_mat_new)
ex_mat_new[lower.tri(ex_mat_new)] <- ex_mat[lower.tri(ex_mat)]
-output
> ex_mat_new
[,1] [,2] [,3] [,4]
[1,] 0.4270634 2.0000000 5.0000000 3.0000000
[2,] 2.0556220 1.1157322 1.0000000 6.0000000
[3,] 1.2252602 0.1063053 0.6396099 4.0000000
[4,] 0.3614062 1.1118661 0.5000143 0.2491543
or this can be done in a single line with replace
t(replace(t(ex_mat), lower.tri(ex_mat), rank(-(t(ex_mat)[lower.tri(ex_mat)]))))
-output
[,1] [,2] [,3] [,4]
[1,] 0.4270634 2.0000000 5.0000000 3.0000000
[2,] 2.0556220 1.1157322 1.0000000 6.0000000
[3,] 1.2252602 0.1063053 0.6396099 4.0000000
[4,] 0.3614062 1.1118661 0.5000143 0.2491543
data
ex_mat <- structure(c(0.4270634, 2.055622, 1.2252602, 0.3614062, 2.192089,
1.1157322, 0.1063053, 1.1118661, 0.5647472, 2.6723637, 0.6396099,
0.5000143, 1.7149861, 0.3155507, 0.7903348, 0.2491543), .Dim = c(4L,
4L), .Dimnames = list(NULL, NULL))
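To see the ordering issue in isolation, here is a minimal sketch on a 4 x 4 matrix of zeros (the names m_col and m_row are made up for this illustration). Assigning 1:6 through upper.tri() fills the upper triangle column by column; assigning to the lower triangle and transposing fills it row by row.

```r
# Upper triangle filled directly: values go in column by column
m_col <- matrix(0, 4, 4)
m_col[upper.tri(m_col)] <- 1:6

# Lower triangle filled column by column, then transposed:
# the values end up in the upper triangle in row order
m_row <- matrix(0, 4, 4)
m_row[lower.tri(m_row)] <- 1:6
m_row <- t(m_row)

m_col[1, ]  # 0 1 2 4
m_row[1, ]  # 0 1 2 3
```

The two results differ exactly at [1,4] and [2,3], the same positions that were off in the question.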

I need to create an accumulation index across columns in a matrix

I need to create an accumulation index across columns in my data. I set up the problem as follows
#accumulation function
mat1 <- matrix(nrow=16, ncol =4)
mat1[1,] <- c(1,1,1,1)
mat1[2:16,] <- 1+rnorm(60,0,0.1)
[,1] [,2] [,3] [,4]
[1,] 1.0000000 1.0000000 1.0000000 1.0000000
[2,] 0.9120755 0.9345682 0.8533162 0.8737582
[3,] 0.7838427 0.9691806 0.8216284 0.9863669
[4,] 0.9095204 1.1906031 1.0253083 1.0700338
[5,] 1.0202524 0.9974672 1.1348315 1.1115018
[6,] 0.9456184 1.1250529 1.0348011 0.9323336
[7,] 1.0053195 0.9917475 1.0178855 1.0880626
[8,] 0.9550709 0.9107060 0.8876688 0.9060996
[9,] 1.0728177 1.0559643 0.9161789 0.9711522
[10,] 0.9579642 1.0082560 0.9833227 0.9306639
[11,] 1.0044883 1.1323498 1.0388025 0.8926033
[12,] 0.8777846 0.9940302 0.8314166 0.8479962
[13,] 1.1042297 0.9767410 0.9355374 0.8859680
[14,] 1.1245737 0.8291948 1.0491585 0.9887672
[15,] 0.9687700 0.9915095 0.8962534 1.0220163
[16,] 0.9432597 1.0310273 0.9288159 1.0838243
The desired output takes the product of entries in each column, up to each row number.
therefore:
mat2 <- matrix(nrow=16, ncol=4)
mat2[1,] <- c(1,1,1,1)
mat2[2,] <- mat1[1,]*mat1[2,]
mat2[3,] <-mat1[1,]*mat1[2,]*mat1[3,]
mat2[4,] <-mat1[1,]*mat1[2,]*mat1[3,]*mat1[4,]
and so on and so forth up to row 16. The idea is to accumulate (take the product) of all entries in mat1 up to a particular row number. So row1 of mat2 = row 1 of mat 1. row 2 of mat 2, is equal to row1 mat1 *row2 mat1. row3 of mat2 is equal to row1 of mat1 *row2 of mat1, *row3 of mat1. This process continues up to row 16.
I need to write a function able to do this calculation for matrices in a list all of the same size.
What you need is the cumulative product down each column, which can be computed by applying base R's cumprod function to every column:
apply(mat1, 2, cumprod)
# [,1] [,2] [,3] [,4]
# [1,] 1.0000 1.0000 1.0000 1.0000
# [2,] 0.8793 0.9890 1.1102 0.9031
# [3,] 0.9037 0.9384 1.0574 0.8031
# [4,] 1.0017 0.8529 0.9824 0.7026
# [5,] 0.7667 0.7815 0.9332 0.6658
# [6,] 0.7996 0.9703 0.7811 0.6327
# [7,] 0.8401 0.9833 0.6899 0.5184
# [8,] 0.7918 0.9351 0.5395 0.4883
# [9,] 0.7485 0.8939 0.4672 0.4341
#[10,] 0.7063 0.9350 0.4534 0.3901
#[11,] 0.6434 0.8701 0.4323 0.3837
#[12,] 0.6127 0.7441 0.4950 0.4053
#[13,] 0.5515 0.7869 0.4421 0.4721
#[14,] 0.5087 0.7063 0.4043 0.4356
#[15,] 0.5120 0.7052 0.3929 0.5056
#[16,] 0.5611 0.6392 0.3538 0.4470
data
set.seed(1234)
mat1 <- matrix(nrow=16, ncol =4)
mat1[1,] <- c(1,1,1,1)
mat1[2:16,] <- 1+rnorm(60,0,0.1)
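The question also asks for a function that works on a list of equally sized matrices. A minimal sketch, assuming a hypothetical helper name cumprod_cols and a made-up two-element list generated the same way as mat1:

```r
# Hypothetical helper: column-wise cumulative product for each matrix in a list
cumprod_cols <- function(mats) {
  lapply(mats, function(m) apply(m, 2, cumprod))
}

# Made-up example data: two 16 x 4 matrices whose first row is all ones
set.seed(1234)
make_mat <- function() rbind(1, matrix(1 + rnorm(60, 0, 0.1), nrow = 15))
mat_list <- list(a = make_mat(), b = make_mat())

res <- cumprod_cols(mat_list)

# Row 3 of each result is row1 * row2 * row3 of the corresponding input
res$a[3, ]
mat_list$a[1, ] * mat_list$a[2, ] * mat_list$a[3, ]
```

lapply keeps the list names, so each result can be matched back to its input matrix.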
We can make use of colCumprods from matrixStats, which would be efficient. (rowCumprods accumulates along each row, i.e. across columns, which is not what the question asks for.)
library(matrixStats)
colCumprods(mat1)
# same values as apply(mat1, 2, cumprod)
data
set.seed(1234)
mat1 <- matrix(nrow=16, ncol =4)
mat1[1,] <- c(1,1,1,1)
mat1[2:16,] <- 1+rnorm(60,0,0.1)

R's `chol` differs from MATLAB's `cholcov`. How to do a Cholesky-alike covariance decomposition?

I've been trying to reproduce a cholesky-like covariance decomposition in R - like it is done in Matlab using cholcov(). Example taken from https://uk.mathworks.com/help/stats/cholcov.html.
Result of the original cholcov() function as of their example:
T =
-0.2113 0.7887 -0.5774 0
0.7887 -0.2113 -0.5774 0
1.1547 1.1547 1.1547 1.7321
I am trying to replicate this T in R. I tried:
C1 <- cbind(c(2,1,1,2), c(1,2,1,2), c(1,1,2,2), c(2,2,2,3))
T1 <- chol(C1)
C2 <- t(T1) %*% T1
My result:
[,1] [,2] [,3] [,4]
[1,] 1.414214 0.7071068 0.7071068 1.414214e+00
[2,] 0.000000 1.2247449 0.4082483 8.164966e-01
[3,] 0.000000 0.0000000 1.1547005 5.773503e-01
[4,] 0.000000 0.0000000 0.0000000 1.290478e-08
C2 recovers C1, but T1 is quite different from MATLAB's solution. I then thought maybe it would be a Cholesky decomposition of the covariance matrix:
T1 <- chol(cov(C1))
but I get
[,1] [,2] [,3] [,4]
[1,] 0.5773503 0.0000000 0.0000000 2.886751e-01
[2,] 0.0000000 0.5773503 0.0000000 2.886751e-01
[3,] 0.0000000 0.0000000 0.5773503 2.886751e-01
[4,] 0.0000000 0.0000000 0.0000000 3.725290e-09
which is not right either.
Could anyone give me a hint how cholcov() in Matlab is calculated so that I could replicate it in R?
You are essentially misusing R's chol function in this case. MATLAB's cholcov is a composite function.
If the covariance matrix is positive definite, it does a Cholesky factorization, returning a full-rank upper triangular Cholesky factor;
If the covariance matrix is only positive semidefinite, it does an eigendecomposition, returning a rectangular matrix.
On the other hand, chol in R only does Cholesky factorization. The example you give, C1, falls into the second case, so we should resort to the eigen function in R.
E <- eigen(C1, symmetric = TRUE)
#$values
#[1] 7.000000e+00 1.000000e+00 1.000000e+00 2.975357e-17
#
#$vectors
# [,1] [,2] [,3] [,4]
#[1,] -0.4364358 0.000000e+00 8.164966e-01 -0.3779645
#[2,] -0.4364358 -7.071068e-01 -4.082483e-01 -0.3779645
#[3,] -0.4364358 7.071068e-01 -4.082483e-01 -0.3779645
#[4,] -0.6546537 8.967707e-16 -2.410452e-16 0.7559289
V <- E$vectors
D <- sqrt(E$values) ## root eigen values
Since numerical rank is 3, we drop the last eigen value and eigen vector:
V1 <- V[, 1:3]
D1 <- D[1:3]
Thus the factor you want is:
R <- D1 * t(V1) ## diag(D1) %*% t(V1)
# [,1] [,2] [,3] [,4]
#[1,] -1.1547005 -1.1547005 -1.1547005 -1.732051e+00
#[2,] 0.0000000 -0.7071068 0.7071068 8.967707e-16
#[3,] 0.8164966 -0.4082483 -0.4082483 -2.410452e-16
We can verify that:
crossprod(R) ## t(R) %*% R
# [,1] [,2] [,3] [,4]
#[1,] 2 1 1 2
#[2,] 1 2 1 2
#[3,] 1 1 2 2
#[4,] 2 2 2 3
The factor R above is not the same as the one returned by cholcov because different algorithms are used for the eigendecomposition. R's eigen uses the LAPACK routine DSYEVR for symmetric matrices and returns the eigenvalues in non-increasing order. MATLAB's cholcov is not open-source, so I'm not sure what algorithm it uses. But it is easy to demonstrate that it does not arrange eigenvalues in non-increasing order.
Consider the factor T returned by cholcov:
T <- structure(c(-0.2113, 0.7887, 1.1547, 0.7887, -0.2113, 1.1547,
-0.5774, -0.5774, 1.1547, 0, 0, 1.7321), .Dim = 3:4)
We can get eigen values by
rowSums(T ^ 2)
# [1] 1.000086 1.000086 7.000167
There is some round-off error because T is not precise, but we can see clearly that the eigenvalues are 1, 1, 7. On the other hand, we have 7, 1, 1 from R (recall D1).
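The two cases described above can be wrapped in one function. The sketch below is only an approximation of the described behaviour, not MATLAB's actual algorithm: the name cholcov_like and the relative tolerance tol are assumptions made for this illustration. It uses the eigenvalues to decide which case applies, because chol() does not necessarily fail on a numerically semidefinite matrix (as the question's output shows).

```r
# Hypothetical helper mimicking the two cases described above.
# 'tol' is an assumed relative tolerance for treating an eigenvalue as zero.
cholcov_like <- function(S, tol = 1e-8) {
  E <- eigen(S, symmetric = TRUE)
  lam <- E$values
  cutoff <- tol * max(abs(lam))
  if (any(lam < -cutoff)) stop("matrix is not positive semidefinite")
  # Positive definite: ordinary upper triangular Cholesky factor
  if (all(lam > cutoff)) return(chol(S))
  # Semidefinite: keep the positive eigenvalues and scale the
  # corresponding eigenvectors, giving a rectangular factor
  keep <- lam > cutoff
  sqrt(lam[keep]) * t(E$vectors[, keep, drop = FALSE])
}

C1 <- cbind(c(2, 1, 1, 2), c(1, 2, 1, 2), c(1, 1, 2, 2), c(2, 2, 2, 3))
R1 <- cholcov_like(C1)
dim(R1)          # 3 rows, 4 columns
crossprod(R1)    # recovers C1 up to rounding
```

As with the factor computed step by step above, the rows may differ in order and sign from MATLAB's T, since any factor with crossprod equal to C1 is valid.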

R Correlation significance matrix

I have a large correlation matrix (something like 50*50).
I calculated the matrix using cor(mydata) function.
Now I would like to have the corresponding significance matrix.
Using cor.test() I can get one significance level at a time, but is there an easy way to get all 1200?
The function cor_pmat from the ggcorrplot package gives you the p-values of correlations.
library(ggcorrplot)
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
cor_pmat(xmat)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.00000000 0.08034470 0.24441138 0.03293644 0.3234899
[2,] 0.08034470 0.00000000 0.08716815 0.44828479 0.4824117
[3,] 0.24441138 0.08716815 0.00000000 0.20634394 0.9504582
[4,] 0.03293644 0.44828479 0.20634394 0.00000000 0.8378530
[5,] 0.32348990 0.48241166 0.95045815 0.83785303 0.0000000
I think this should do what you want, we use expand.grid in conjunction with the apply function:
Since you didn't provide your data, I created my own set.
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
matrix(apply(expand.grid(1:ncol(xmat), 1:ncol(xmat)),
1,
function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`),
ncol = ncol(xmat), byrow = T)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.00000000 0.08034470 0.24441138 3.293644e-02 0.3234899
[2,] 0.08034470 0.00000000 0.08716815 4.482848e-01 0.4824117
[3,] 0.24441138 0.08716815 0.00000000 2.063439e-01 0.9504582
[4,] 0.03293644 0.44828479 0.20634394 1.063504e-62 0.8378530
[5,] 0.32348990 0.48241166 0.95045815 8.378530e-01 0.0000000
Note that if you don't need a square matrix and are comfortable with a long format (one row per pair of columns), we could use combn, which evaluates each pair only once and is therefore more efficient.
cbind(t(combn(1:ncol(xmat), 2)),
combn(1:ncol(xmat), 2, function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`)
)
[,1] [,2] [,3]
[1,] 1 2 0.08034470
[2,] 1 3 0.24441138
[3,] 1 4 0.03293644
[4,] 1 5 0.32348990
[5,] 2 3 0.08716815
[6,] 2 4 0.44828479
[7,] 2 5 0.48241166
[8,] 3 4 0.20634394
[9,] 3 5 0.95045815
[10,] 4 5 0.83785303
Alternatively, we can perform the same operation, but use the pipe operator %>% to make it a bit more concise:
library(magrittr)
combn(1:ncol(xmat), 2) %>%
apply(., 2, function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`) %>%
cbind(t(combn(1:ncol(xmat), 2)), .)
Here is one solution. combn() enumerates the pairs as (1,2), (1,3), ..., which is the column-major order of the lower triangle, so the p-values are assigned there and then mirrored into the upper triangle:
data <- swiss
#cor(data)
n <- ncol(data)
p.value.vec <- apply(combn(1:ncol(data), 2), 2, function(x) cor.test(data[,x[1]], data[,x[2]])$p.value)
p.value.matrix <- matrix(0, n, n)
p.value.matrix[lower.tri(p.value.matrix, diag=FALSE)] <- p.value.vec
# mirror the lower triangle into the upper triangle
p.value.matrix <- p.value.matrix + t(p.value.matrix)
p.value.matrix
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.000000e+00 1.491720e-02 9.450437e-07 3.658617e-07 1.028523e-03 3.585238e-03
[2,] 1.491720e-02 0.000000e+00 9.951515e-08 1.304590e-06 5.204434e-03 6.844724e-01
[3,] 9.450437e-07 9.951515e-08 0.000000e+00 4.811397e-08 2.588308e-05 4.453814e-01
[4,] 3.658617e-07 1.304590e-06 4.811397e-08 0.000000e+00 3.018078e-01 5.065456e-01
[5,] 1.028523e-03 5.204434e-03 2.588308e-05 3.018078e-01 0.000000e+00 2.380297e-01
[6,] 3.585238e-03 6.844724e-01 4.453814e-01 5.065456e-01 2.380297e-01 0.000000e+00
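Since upper.tri() and lower.tri() both index in column-major order, the ordering of a p-value vector matters when it is written into a triangle. A minimal base R sketch that sidesteps the ordering question by writing each p-value to both [i, j] and [j, i] is shown below; the function name cor_pvalues is made up for this illustration, and the diagonal is left at 0 as in the answers above.

```r
# Hypothetical helper: symmetric matrix of cor.test() p-values
cor_pvalues <- function(x) {
  x <- as.matrix(x)
  k <- ncol(x)
  pm <- matrix(0, k, k, dimnames = list(colnames(x), colnames(x)))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      # each pair is tested once and stored in both triangles
      pm[i, j] <- pm[j, i] <- cor.test(x[, i], x[, j])$p.value
    }
  }
  pm
}

pm <- cor_pvalues(swiss)   # swiss ships with base R's datasets package
isSymmetric(pm)
```

For a 50-column data set this runs cor.test() 1225 times, once per unordered pair.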

apply function to each value of a matrix

I have a matrix pmatrix
sigma = 0.03
alpha = 0.01
sims = 6
N = 10
pmatrix = matrix(NA, ncol=sims, nrow = N)
for (i in 1:N){
x = rnorm(sims)
pmatrix[i,] <- x
}
And I need to use this matrix in order to get the xt values from the following expression:
xt = 0
for (i in 1:10){
xt[i+1] = xt[i] * exp(-alpha*1) + sqrt(((sigma^2)/2*alpha)*(1-exp(-2*alpha*1)))*pmatrix[i]
}
However, the loop above only returns a vector xt. Ideally I would like to obtain a matrix which consists of 10 rows (N, the number of years) and 6 columns (sims, the number of simulated scenarios).
I believe it is doable via a second loop or an apply function.
Hope this suits you. Sometimes it's easier to use an apply function on column or row indices instead of on the data object itself.
sigma = 0.03
alpha = 0.01
sims = 6
N = 10
pmatrix <- matrix(rnorm(N * sims), N)
xt <- matrix(nrow=N, ncol=sims)
xt[1,] <- 0
sapply(2:N, FUN = function(x) {
xt[x,] <<- xt[x-1,] * exp(-alpha*1) + sqrt(((sigma^2)/2*alpha)*(1-exp(-2*alpha*1)))*pmatrix[x-1,]
})
> xt
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.0000000000 0.000000e+00 0.000000e+00 0.0000000000 0.0000000000 0.000000e+00
[2,] -0.0006488202 4.730257e-04 4.707051e-04 0.0002174562 0.0001655868 -2.063875e-04
[3,] -0.0007110547 3.792143e-04 3.922429e-04 0.0009465164 -0.0001667539 -1.165253e-05
[4,] -0.0003679911 -3.596447e-05 5.490986e-06 0.0013176437 -0.0006049390 2.276431e-04
[5,] -0.0007176342 1.754809e-05 3.647631e-04 0.0015136978 -0.0010303508 4.773186e-04
[6,] -0.0007918234 -3.065909e-05 -3.703564e-05 0.0015006314 -0.0005650229 7.792698e-05
[7,] -0.0008442265 -2.808698e-04 7.808261e-05 0.0015505998 -0.0005407453 -3.106797e-04
[8,] -0.0010265038 8.540579e-05 2.547632e-04 0.0017364697 -0.0007112818 -3.972706e-04
[9,] -0.0004011710 7.346707e-05 7.471667e-04 0.0014031268 -0.0008266330 -4.296555e-04
[10,] -0.0001490369 -3.189111e-04 1.133248e-03 0.0013038771 -0.0011771068 -2.719285e-04
Just as an extended comment: the issue you were facing came from the fact that you were subsetting xt wrongly. If you want to do something on a matrix rowwise, use the entire row, which you can retrieve using xt[1,] as opposed to xt[1]. See:
sigma = 0.03; alpha = 0.01; sims = 6; N = 10
pmatrix = matrix(rnorm(sims*N), ncol=sims, nrow = N)
xt <- matrix(0, ncol=sims, nrow = N)
xt[1] # just one element
[1] 0
xt[1,] # entire row
[1] 0 0 0 0 0 0
Then it works also with your approach:
for(i in 2:N){
xt[i,] <- xt[i-1,] * exp(-alpha) + sqrt(((sigma^2)/2*alpha)*(1-exp(-2*alpha)))*pmatrix[i-1,]
}
head(xt)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.0000000000 0.000000e+00 0.0000000000 0.000000e+00 0.000000e+00 0.0000000000
[2,] 0.0005102757 9.680876e-05 0.0006264992 2.147689e-05 9.432744e-05 0.0004370840
[3,] 0.0005035357 -4.704802e-04 0.0003954507 -5.083994e-04 4.237845e-04 0.0004007277
[4,] 0.0009345963 -2.699634e-04 0.0003559880 -3.877696e-04 5.337181e-04 0.0005230792
[5,] 0.0016501564 -1.947569e-04 0.0005003010 -6.680216e-05 6.098233e-04 0.0007106022
[6,] 0.0015317653 4.316999e-04 0.0011361772 -2.209149e-04 6.881100e-04 0.0005893373
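Because each row depends only on the previous row through a fixed multiplier, the recursion can also be written without an explicit loop using stats::filter() with method = "recursive", applied to each column. This is a sketch under the assumption that the update rule is exactly the one in the question; the names a, b and shocks and the seed are made up here. The noise coefficient is copied verbatim; note that R parses (sigma^2)/2*alpha as (sigma^2/2)*alpha, so extra parentheses would be needed if sigma^2/(2*alpha) was intended.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
sigma <- 0.03; alpha <- 0.01; sims <- 6; N <- 10
pmatrix <- matrix(rnorm(N * sims), nrow = N, ncol = sims)

a <- exp(-alpha)                                            # decay multiplier
b <- sqrt(((sigma^2) / 2 * alpha) * (1 - exp(-2 * alpha)))  # verbatim from the question

# Reference: the row-wise loop from the answer above
xt_loop <- matrix(0, nrow = N, ncol = sims)
for (i in 2:N) {
  xt_loop[i, ] <- a * xt_loop[i - 1, ] + b * pmatrix[i - 1, ]
}

# Recursive filter per column: y[t] = shock[t] + a * y[t - 1], with y[0] = 0
shocks <- b * pmatrix[-N, , drop = FALSE]
xt_filter <- rbind(0, apply(shocks, 2, function(z) {
  as.numeric(stats::filter(z, filter = a, method = "recursive"))
}))

all.equal(xt_loop, xt_filter)
```

The leading row of zeros corresponds to the starting value xt[1,] = 0 used in both answers.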
