Performing element-wise standard deviation in R with two matrices

As the title suggests, I am looking for a way to get the standard deviation per element from two separate matrices. However, I am quite the beginner at R and I can't seem to figure out how to do this. Below is an example of what I am trying to accomplish with a small sample of my data (first three rows).
I have two matrices with coordinates (df143 and df143_2, or matrices A and B if you will):
A:
[1,] 21.729504 -55.66055 -37.26477
[2,] 39.445610 -67.67449 -32.19464
[3,] 57.604027 -54.16734 -28.48679
B:
[1,] 21.706865 -55.50722 -37.57840
[2,] 39.553314 -67.68414 -31.95995
[3,] 57.286247 -54.13008 -28.44446
I am looking for a matrix output that shows the standard deviation per element of the two combined matrices.

Or you can do base R:
matrix(mapply(function(x,y) sd(c(x,y)),A, B), ncol=ncol(A))
# [,1] [,2] [,3]
#[1,] 0.01600819 0.10842068 0.22176990
#[2,] 0.07615823 0.00682358 0.16595089
#[3,] 0.22470439 0.02634680 0.02993183

I believe this is what you're looking to do:
library(abind)
a <- c(21.729504, -55.66055, -37.26477, 39.445610, -67.67449, -32.19464, 57.604027, -54.16734, -28.48679)
a <- matrix(a, ncol=3, byrow=TRUE)
b <- c(21.706865, -55.50722, -37.57840, 39.553314, -67.68414, -31.95995, 57.286247, -54.13008, -28.44446)
b <- matrix(b, ncol=3, byrow=TRUE)
m <- abind(a, b, along=3)
apply(m, 1:2, sd)
## [,1] [,2] [,3]
## [1,] 0.01600819 0.10842068 0.22176990
## [2,] 0.07615823 0.00682358 0.16595089
## [3,] 0.22470439 0.02634680 0.02993183
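
As a side note, with exactly two values per cell the sample standard deviation reduces to |x - y| / sqrt(2), so the whole thing can be done with plain matrix arithmetic. A minimal sketch, assuming A and B are numeric matrices of the same shape (the mapply call is only there as a cross-check):

```r
# Sample data from the question
A <- matrix(c(21.729504, -55.66055, -37.26477,
              39.445610, -67.67449, -32.19464,
              57.604027, -54.16734, -28.48679), ncol = 3, byrow = TRUE)
B <- matrix(c(21.706865, -55.50722, -37.57840,
              39.553314, -67.68414, -31.95995,
              57.286247, -54.13008, -28.44446), ncol = 3, byrow = TRUE)

# For two values x and y, sd(c(x, y)) equals abs(x - y) / sqrt(2)
sd_two <- abs(A - B) / sqrt(2)

# Cross-check against the element-wise sd() version
ref <- matrix(mapply(function(x, y) sd(c(x, y)), A, B), ncol = ncol(A))
all.equal(sd_two, ref)
```

This shortcut only applies to two matrices; with three or more, the abind/apply approach generalizes directly.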

Related

Convert list to matrix using indicator vector

I have the following list of 5x2 matrices:
l <- list(a=matrix(rnorm(10),nrow=5,ncol=2),
b=matrix(rnorm(10),nrow=5,ncol=2),
c=matrix(rnorm(10),nrow=5,ncol=2))
For example, the first element of this list looks like this:
$a
[,1] [,2]
[1,] -0.4988268 1.9881333
[2,] -0.2979064 1.5921169
[3,] -1.3783522 -1.4149601
[4,] 0.2205115 0.2029210
[5,] 1.2721645 0.2861253
I want to take this list and create a new 5x2 matrix using information from a vector v:
v <- c("a","a","b","c","b")
This vector is an indicator vector that has information on how this new matrix should be constructed. That is, take row 1 from list element a, take row 2 from list element a and so on.
One could do it with a for-loop. However, for my application this is not efficient enough, and I feel there might be a more elegant solution. My approach:
goal <- matrix(nrow=5,ncol=2)
for(i in 1:length(v)){
goal[i,] <- l[[v[i]]][i,]
}
goal
[,1] [,2]
[1,] -0.4988268 1.98813326
[2,] -0.2979064 1.59211686
[3,] 0.7715907 0.16776669
[4,] 0.2690278 0.02542766
[5,] 1.7865093 0.46361239
Thanks!
Assuming all the list matrices have the same number of rows, we could use mapply and subset the matrices by name (v) and row number.
t(mapply(function(x, y) l[[x]][y, ], v, 1:nrow(l[[1]])))
# [,1] [,2]
#a -1.2070657 0.5060559
#a 0.2774292 -0.5747400
#b -0.7762539 -0.9111954
#c 0.4595894 -0.0151383
#b 0.9594941 2.4158352
data
set.seed(1234)
l <- list(a=matrix(rnorm(10),nrow=5,ncol=2),
b=matrix(rnorm(10),nrow=5,ncol=2),
c=matrix(rnorm(10),nrow=5,ncol=2))
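
If the loop overhead matters, another option is to avoid per-row function calls altogether: stack the list into a 3-d array and pull out all the needed cells with a single matrix index. This is only a sketch, assuming every matrix in l has the same dimensions and that v has one entry per row; the names arr, k and idx are made up for the illustration.

```r
set.seed(1234)
l <- list(a = matrix(rnorm(10), nrow = 5, ncol = 2),
          b = matrix(rnorm(10), nrow = 5, ncol = 2),
          c = matrix(rnorm(10), nrow = 5, ncol = 2))
v <- c("a", "a", "b", "c", "b")

n_row <- nrow(l[[1]])
n_col <- ncol(l[[1]])

# Stack the matrices so that arr[i, j, k] == l[[k]][i, j]
arr <- array(unlist(l), dim = c(n_row, n_col, length(l)))

# Slice number to use for each row
k <- match(v, names(l))

# One (row, column, slice) triple per cell of the result, in column-major order
idx <- cbind(rep(seq_len(n_row), times = n_col),
             rep(seq_len(n_col), each = n_row),
             rep(k, times = n_col))

goal <- matrix(arr[idx], nrow = n_row)
goal
```

Row i of goal is row i of l[[v[i]]], the same as in the for-loop version, just without row names.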

SVD calculation in R

How do I reconstruct the original matrix from its singular value decomposition (SVD) efficiently in R? Computing A = svd$u %*% svd$d %*% t(svd$v) is not an efficient way to get matrix A.
Try svd(A)$u%*%diag(svd(A)$d)%*%t(svd(A)$v).
set.seed(12345)
A <- matrix(data=runif(n=9, min=1, max=9), nrow=3)
A
[,1] [,2] [,3]
[1,] 6.767231 8.088997 3.600763
[2,] 8.006186 4.651848 5.073795
[3,] 7.087859 2.330974 6.821642
s <- svd(A)
D <- diag(s$d)
s$u %*% D %*% t(s$v)
[,1] [,2] [,3]
[1,] 6.767231 8.088997 3.600763
[2,] 8.006186 4.651848 5.073795
[3,] 7.087859 2.330974 6.821642
Improving upon the answer by @MYaseen208:
(s$u) %*% (t(s$v)*s$d)
This has one fewer matrix multiplication (which is an O(n^3) operation).
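
A quick way to convince yourself that both forms give back the original matrix is to compare them with all.equal. This is a small check using the same seed as above; A_diag and A_scaled are just illustrative names.

```r
set.seed(12345)
A <- matrix(runif(9, min = 1, max = 9), nrow = 3)
s <- svd(A)  # computed once and reused

# Reconstruction with an explicit diagonal matrix (two matrix products)
A_diag <- s$u %*% diag(s$d) %*% t(s$v)

# Reconstruction by scaling the rows of t(V) by the singular values
# (one matrix product; s$d is recycled down the columns, so row i is scaled by d[i])
A_scaled <- s$u %*% (s$d * t(s$v))

all.equal(A, A_diag)
all.equal(A, A_scaled)
```

One thing to watch with the diag() form: if s$d has length 1, diag(s$d) builds an identity matrix of that size rather than a 1x1 matrix, so diag(s$d, nrow = length(s$d)) is the safer spelling.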

How to use some apply function to solve what requires two for-loops in R

I have a matrix, named "mat", and a smaller matrix, named "center".
temp = c(1.8421,5.6586,6.3526,2.904,3.232,4.6076,4.8,3.2909,4.6122,4.9399)
mat = matrix(temp, ncol=2)
[,1] [,2]
[1,] 1.8421 4.6076
[2,] 5.6586 4.8000
[3,] 6.3526 3.2909
[4,] 2.9040 4.6122
[5,] 3.2320 4.9399
center = matrix(c(3, 6, 3, 2), ncol=2)
[,1] [,2]
[1,] 3 3
[2,] 6 2
I need to compute the (squared Euclidean) distance between each row of mat and every row of center. For example, the distance between mat[1,] and center[1,] can be computed as
diff = mat[1,]-center[1,]
t(diff)%*%diff
[,1]
[1,] 3.92511
Similarly, I can find the distance of mat[1,] and center[2,]
diff = mat[1,]-center[2,]
t(diff)%*%diff
[,1]
[1,] 24.08771
Repeating this process for each row of mat, I end up with
[,1] [,2]
[1,] 3.925110 24.087710
[2,] 10.308154 7.956554
[3,] 11.324550 1.790750
[4,] 2.608405 16.408805
[5,] 3.817036 16.304836
I know how to implement it with for-loops. I was really hoping someone could tell me how to do it with some kind of an apply() function, maybe mapply() I guess.
Thanks
apply(center, 1, function(x) colSums((x - t(mat)) ^ 2))
# [,1] [,2]
# [1,] 3.925110 24.087710
# [2,] 10.308154 7.956554
# [3,] 11.324550 1.790750
# [4,] 2.608405 16.408805
# [5,] 3.817036 16.304836
If you want apply for the expressiveness of the code, that's one thing, but it's still looping, just with different syntax. This can be done without any loops, or with a very small one across center instead of mat. I'd just transpose first, because it's wise to get into the habit of moving as much as possible out of the apply call. (The BrodieG answer is pretty much identical in function.) These work because R automatically recycles the smaller vector along the matrix, which is much faster than apply or for.
tm <- t(mat)
apply(center, 1, function(m){
colSums((tm - m)^2) })
Use dist and then extract the relevant submatrix:
ix <- 1:nrow(mat)
as.matrix( dist( rbind(mat, center) )^2 )[ix, -ix]
#           6         7
# 1 3.925110 24.087710
# 2 10.308154 7.956554
# 3 11.324550 1.790750
# 4 2.608405 16.408805
# 5 3.817036 16.304836
REVISION: simplified slightly.
You could use outer as well
d <- function(i, j) sum((mat[i, ] - center[j, ])^2)
outer(1:nrow(mat), 1:nrow(center), Vectorize(d))
This will solve it
t(apply(mat,1,function(row){
d1<-sum((row-center[1,])^2)
d2<-sum((row-center[2,])^2)
return(c(d1,d2))
}))
Result:
[,1] [,2]
[1,] 3.925110 24.087710
[2,] 10.308154 7.956554
[3,] 11.324550 1.790750
[4,] 2.608405 16.408805
[5,] 3.817036 16.304836
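
For completeness, the same table can be computed with no loop or apply at all by expanding the square: |x - c|^2 = |x|^2 + |c|^2 - 2 x.c. A sketch under the assumption that mat and center have the same number of columns; d2 and ref are names chosen for the example.

```r
temp <- c(1.8421, 5.6586, 6.3526, 2.904, 3.232,
          4.6076, 4.8, 3.2909, 4.6122, 4.9399)
mat <- matrix(temp, ncol = 2)
center <- matrix(c(3, 6, 3, 2), ncol = 2)

# Squared row norms combined with outer(), minus twice the cross products
d2 <- outer(rowSums(mat^2), rowSums(center^2), "+") - 2 * tcrossprod(mat, center)
d2

# Cross-check against the apply/colSums answer
ref <- apply(center, 1, function(x) colSums((x - t(mat))^2))
all.equal(d2, ref)
```

Because this form subtracts large terms from each other, it can lose a little precision (tiny negative values are possible for near-identical rows), so the apply/colSums version may be preferable when exactness near zero matters.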

Choleski Decomposition in R to get the inverse when pivot = TRUE

I am using the Choleski decomposition to compute the inverse of a matrix that is positive semidefinite. However, when my matrix becomes extremely large and has zeros in it, it is no longer positive definite (numerically, from the computer's point of view). To get around this problem I use the pivot = TRUE option of chol in R. However, as you will see below, the two approaches return the same values but with the rows and columns of the matrix rearranged. Is there a way (or transformation) to make them the same? Here is my code:
X = matrix(rnorm(9),nrow=3)
A = X%*%t(X)
inv1 = function(A){
Q = chol(A)
L = t(Q)
inverse = solve(Q)%*%solve(L)
return(inverse)
}
inv2 = function(A){
Q = chol(A,pivot=TRUE)
L = t(Q)
inverse = solve(Q)%*%solve(L)
return(inverse)
}
Which when run results in:
> inv1(A)
[,1] [,2] [,3]
[1,] 9.956119 -8.187262 -4.320911
[2,] -8.187262 7.469862 3.756087
[3,] -4.320911 3.756087 3.813175
>
> inv2(A)
[,1] [,2] [,3]
[1,] 7.469862 3.756087 -8.187262
[2,] 3.756087 3.813175 -4.320911
[3,] -8.187262 -4.320911 9.956119
Is there a way to get the two answers to match? I want inv2() to return the answer from inv1().
That is explained in ?chol: the column permutation is returned as an attribute.
inv2 <- function(A){
Q <- chol(A,pivot=TRUE)
Q <- Q[, order(attr(Q,"pivot"))]
Qi <- solve(Q)
Qi %*% t(Qi)
}
inv2(A)
solve(A) # Identical
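
A variant of the same idea, in case it is useful: chol2inv can invert directly from the pivoted factor, and the permutation can then be undone on the rows and columns of the result. This is a sketch, not a drop-in replacement; inv3 is a made-up name, and the test matrix below adds diag(3) purely as an assumption to keep the example well conditioned.

```r
inv3 <- function(A) {
  R <- chol(A, pivot = TRUE)
  piv <- attr(R, "pivot")
  # crossprod(R) equals A[piv, piv], so chol2inv(R) is the inverse of the permuted matrix
  inv_perm <- chol2inv(R)
  # order(piv) is the inverse permutation; apply it to both rows and columns
  oo <- order(piv)
  inv_perm[oo, oo]
}

set.seed(1)
X <- matrix(rnorm(9), nrow = 3)
A <- tcrossprod(X) + diag(3)  # symmetric positive definite test matrix

all.equal(inv3(A), solve(A))
```

Note that for a matrix that is genuinely rank deficient, chol(A, pivot = TRUE) warns and reports the rank in attr(R, "rank"); no true inverse exists in that case, so this sketch only covers the full-rank situation.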
Typically
M = matrix(rnorm(9),3)
M
[,1] [,2] [,3]
[1,] 1.2109251 -0.58668426 -0.4311855
[2,] -0.8574944 0.07003322 -0.6112794
[3,] 0.4660271 -0.47364400 -1.6554356
library(Matrix)
pm1 <- as(as.integer(c(2,3,1)), "pMatrix")
M %*% pm1
[,1] [,2] [,3]
[1,] -0.4311855 1.2109251 -0.58668426
[2,] -0.6112794 -0.8574944 0.07003322
[3,] -1.6554356 0.4660271 -0.47364400

How to solve a system of linear equations with b=0 in R

In R I need to solve a system of linear equations (Ax=b), where b=0. Using solve() just returns a zero vector as the answer, but I want the non-zero solutions of the system. Is there a way to do this?
I think you are looking for the null space of the matrix A. Try:
library(MASS)
Null(t(A))
R > (A <- matrix(c(1,2,3,2,4,7), ncol = 3, byrow = T))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 4 7
R > Null(t(A))
[,1]
[1,] -8.944272e-01
[2,] 4.472136e-01
[3,] 7.771561e-16
R > (A <- matrix(c(1,2,3,2,4,6), ncol = 3, byrow = T))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 4 6
R > Null(t(A))
[,1] [,2]
[1,] -0.5345225 -0.8017837
[2,] 0.7745419 -0.3381871
[3,] -0.3381871 0.4927193
Be careful. There are some rounding errors.
Also, denote by r the rank of matrix A and by q the number of columns of A. If r = q, the zero vector is the only solution. The rank can never exceed the number of columns, so r > q cannot occur. If r < q, the null space has dimension q - r and we can use the Null function above to get a basis for it, but remember that the basis is not unique, in terms of neither magnitude nor direction.
Reference : http://stat.ethz.ch/R-manual/R-patched/library/MASS/html/Null.html
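
If you would rather not depend on MASS, a basis for the null space can also be read off the SVD: the right singular vectors that belong to (numerically) zero singular values span it. The sketch below makes that concrete; the tolerance used to decide what counts as zero is an assumption of this example, and N is just an illustrative name.

```r
A <- matrix(c(1, 2, 3,
              2, 4, 6), ncol = 3, byrow = TRUE)

# Ask for all ncol(A) right singular vectors, not just min(nrow, ncol) of them
s <- svd(A, nv = ncol(A))

# Numerical rank: singular values above a small relative tolerance (assumed choice)
tol <- max(dim(A)) * max(s$d) * .Machine$double.eps
r <- sum(s$d > tol)

# Columns of V beyond the rank form an orthonormal basis of the null space
N <- s$v[, seq_len(ncol(A)) > r, drop = FALSE]
N

# Every column x of N should satisfy A %*% x = 0 up to rounding error
max(abs(A %*% N))
```

As with Null, the signs and the particular basis vectors may differ between platforms; any non-zero linear combination of the columns of N is a non-zero solution of Ax = 0.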
