Singular Value Decomposition (SVD) with R

The SVD works well with R:
A = matrix(1:12,3,4)
A
u = svd(A)$u
v = svd(A)$v
sigma = diag(svd(A)$d)
u %*% sigma %*% t(v) # = A as desired
But unlike the usual statement of the SVD theorem, v is not a 4x4 matrix (it should be!):
dim(v) # (4,3)
Why is it so?
According to the theorem,
v should have dimension (4,4),
sigma should have dimension (3,4).
By the way, what would be the shortest way to create diag(svd(A)$d) zero-padded so that it has dimension (3,4)?

To get the full U and V matrices, use the nu= and nv= arguments to svd(). To pad a diagonal matrix with zeros, use the nrow= and ncol= arguments to diag():
A <- matrix(1:12,3,4)
D <- svd(A, nu=nrow(A), nv=ncol(A))
u <- D$u
v <- D$v
sigma <- diag(D$d, nrow=nrow(A), ncol=ncol(A))
## Check that this worked:
dim(u)
# [1] 3 3
dim(v)
# [1] 4 4
dim(sigma)
# [1] 3 4
u %*% sigma %*% t(v)
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 5 8 11
# [3,] 3 6 9 12

It's just a different convention; different systems and textbooks define the SVD one way or the other. The important thing is the unitary property U U' = I. In either convention, the singular vectors minimize the least-squares distances in projection.
Here's a development of the theory that has the dimension conventions the same as in LINPACK and R: https://www.cs.princeton.edu/courses/archive/spring12/cos598C/svdchapter.pdf

For part two:
diag(c(svd(A)$d,0),nrow=3,ncol=4)
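As a quick sanity check (a sketch reusing the full V from the first answer), the zero-padded sigma reproduces A:
A <- matrix(1:12, 3, 4)
s <- svd(A, nu = nrow(A), nv = ncol(A))
Sigma <- diag(c(s$d, 0), nrow = 3, ncol = 4)   # 3x4, singular values on the diagonal
max(abs(s$u %*% Sigma %*% t(s$v) - A))         # ~ 0, i.e. A is recovered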

Related

Generate square symmetric matrix in R

I have a matrix in R as follows:
dat <- matrix(c(2,3,5,7,8,4), ncol = 6)
colnames(dat) <- c("A(1,1)", "A(1,2)", "A(1,3)", "A(2,2)", "A(2,3)", "A(3,3)")
How can I create a square symmetric matrix based on an apply function that has the following form:
A(1,1) A(1,2) A(1,3)
A(2,1) A(2,2) A(2,3)
A(3,1) A(3,2) A(3,3)
Note that A(1,2)=A(2,1)
This isn't based on apply but it shows how you can use lower.tri and upper.tri to create a symmetric matrix given a vector of elements. It shows the 3x3 case, but will easily generalize to larger n:
dat <- matrix(rep(0,9),nrow = 3)
dat[lower.tri(dat,diag = TRUE)] <- c(2,3,5,7,8,4)
dat[upper.tri(dat)] <- t(dat)[upper.tri(dat)]
print(dat)
Result:
[,1] [,2] [,3]
[1,] 2 3 5
[2,] 3 7 8
[3,] 5 8 4
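To make the "generalizes to larger n" claim concrete, here is a minimal sketch wrapping the same idea in a function. The name make_symmetric is hypothetical, and the input vector is assumed to hold the lower triangle column by column (diagonal included), which by symmetry matches the row-by-row upper triangle used in the question:
## Sketch: build an n x n symmetric matrix from its n*(n+1)/2 lower-triangular elements
make_symmetric <- function(vals, n) {
  m <- matrix(0, nrow = n, ncol = n)
  m[lower.tri(m, diag = TRUE)] <- vals
  m[upper.tri(m)] <- t(m)[upper.tri(m)]
  m
}
make_symmetric(c(2, 3, 5, 7, 8, 4), 3)  # reproduces the 3x3 result above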

How to create random vectors from another vector?

I am performing calculations with constants and vectors (approximate length = 100) for which I need to simulate normal distributions N (with rnorm). For constants (K, with standard deviation = KU) I use rnorm() in the standard way:
K <- 2
KU <- 0.2
set.seed(123)
KN <- rnorm(n = 3, mean = K, sd = KU)
which produces a vector of length 3 (KN):
[1] 1.887905 1.953965 2.311742
Now, I need to do the same thing with a vector (V, standard deviation VU). My first guess is to use:
V <- c(1, 2, 3)
VU <- 0.1 * V
set.seed(123)
VN <- rnorm(3, V, VU)
but only a vector of 3 elements is produced, one for each vector element:
[1] 0.9439524 1.9539645 3.4676125
This is actually just the first simulation of the vector, but I need 3 such vectors. One option is to generate 9 numbers, but then VN is a single vector of 9 elements:
[1] 0.9439524 1.9539645 3.4676125 1.0070508 2.0258575 3.5145195 1.0460916 1.7469878 2.7939441
not 3 vectors of 3 elements. What I want is VN =
[1] 0.9439524 1.0070508 1.0460916
[2] 1.9539645 2.0258575 1.7469878
[3] 3.4676125 3.5145195 2.7939441
so VN contains 3 vectors which I can subsequently use in other calculations, such as KN * VN. The solution that I have found is:
set.seed(123)
VN <- as.data.frame(t(matrix(rnorm(3 * length(V), V, VU), nrow = length(V))))
but in my opinion this is a rather cumbersome expression (which I need to repeat several times in different places with rather long variable names). Is there a simpler way in base R to produce random vectors? I would like to see something like:
VN <- rnorm.vector(3, V, VU)
We can use replicate
set.seed(123)
replicate(3, rnorm(3, V, VU))
# [,1] [,2] [,3]
#[1,] 0.9439524 1.007051 1.046092
#[2,] 1.9539645 2.025858 1.746988
#[3,] 3.4676125 3.514519 2.793944
Or it could be
mapply(rnorm, n = 3, mean = V, sd = VU)
In addition to @akrun's great options, you may also use something slightly simpler than your approach (here n is the number of simulated vectors, i.e. 3):
matrix(rnorm(n * length(V), V, VU), nrow = n, byrow = TRUE)
# [,1] [,2] [,3]
# [1,] 0.9439524 1.953965 3.467612
# [2,] 1.0070508 2.025858 3.514519
# [3,] 1.0460916 1.746988 2.793944
or the MASS package with mvrnorm, which lets you sample from a multivariate normal distribution:
library(MASS)
mvrnorm(n, VU, diag(VU))
# [,1] [,2] [,3]
# [1,] 0.6650715 0.37923044 0.05590089
# [2,] 0.2574341 0.24949882 0.97045721
# [3,] -0.5218990 -0.04857971 0.49707815
where
diag(VU)
# [,1] [,2] [,3]
# [1,] 0.1 0.0 0.0
# [2,] 0.0 0.2 0.0
# [3,] 0.0 0.0 0.3
The latter option is the way to go in case you want the variance-covariance matrix not to be diagonal.
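If you really want the rnorm.vector(3, V, VU) call sketched in the question, a minimal wrapper is easy to write; this is just a convenience around the matrix(..., byrow = TRUE) idea above (one simulated copy of the mean vector per row), not an existing base R function:
## Hypothetical wrapper matching the rnorm.vector() call from the question
rnorm.vector <- function(n, mean, sd) {
  matrix(rnorm(n * length(mean), mean, sd), nrow = n, byrow = TRUE)
}
set.seed(123)
rnorm.vector(3, V, VU)  # same values as the byrow = TRUE call above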

Find rotation matrix of one vector to another using R

I would like to calculate one (any) rotation matrix between two n-dimensional vectors using R.
https://math.stackexchange.com/questions/598750/finding-the-rotation-matrix-in-n-dimensions includes a Matlab algorithm for finding one rotation matrix that transforms a vector x into another vector y. The code can be translated to R:
# Function returns a rotation matrix transforming x into y
rotation <- function(x, y) {
  u <- x / sqrt(sum(x^2))                                # unit vector along x
  v <- y - sum(u * y) * u                                # component of y orthogonal to u
  v <- v / sqrt(sum(v^2))
  cost <- sum(x * y) / sqrt(sum(x^2)) / sqrt(sum(y^2))   # cosine of the angle between x and y
  sint <- sqrt(1 - cost^2)
  diag(length(x)) - u %*% t(u) - v %*% t(v) +
    cbind(u, v) %*% matrix(c(cost, -sint, sint, cost), 2) %*% t(cbind(u, v))
}
x=c(2,4,5,3,6)
y=c(6,2,0,1,7)
# Check that x and y have the same norm (a rotation preserves length)
sqrt(sum(x^2))
sqrt(sum(y^2))
Rx2y = rotation(x,y)
x %*% Rx2y
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 6 2 -8.881784e-16 1 7
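A couple of quick checks (a sketch) confirm that the returned matrix behaves like a rotation:
max(abs(Rx2y %*% t(Rx2y) - diag(length(x))))  # ~ 0, so Rx2y is orthogonal
max(abs(x %*% Rx2y - y))                      # ~ 0, so x is mapped onto y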

Writing a Householder QR factorization function in R code

I am working on a piece of code to find the QR factorization of a matrix in R.
X <- structure(c(0.8147, 0.9058, 0.127, 0.9134, 0.6324, 0.0975, 0.2785,
0.5469, 0.9575, 0.9649, 0.1576, 0.9706, 0.9572, 0.4854, 0.8003
), .Dim = c(5L, 3L))
myqr <- function(A) {
  n <- nrow(A)
  p <- ncol(A)
  Q <- diag(n)
  Inp <- diag(nrow = n, ncol = p)
  for (k in c(1:ncol(A))) {
    # extract the kth column of the matrix
    col <- A[k:n, k]
    # calculate the norm of the column in order to create the vector
    norm1 <- sqrt(sum(col^2))
    # choose the sign: -1 if col[1] >= 0, +1 otherwise
    sign <- ifelse(col[1] >= 0, -1, +1)
    # calculate the reflection vector a_r
    a_r <- col - sign * Inp[k:n, k] * norm1
    # beta = 2 / ||a_r||^2
    beta <- 2 / sum(t(a_r) %*% a_r)
    # the next line calculates the matrix Q at every step
    Q <- Q - beta * Q %*% c(rep(0, k - 1), a_r) %*% t(c(rep(0, k - 1), a_r))
    # calculate the matrix R at each step
    A[k:n, k:p] <- A[k:n, k:p] - beta * a_r %*% t(a_r) %*% A[k:n, k:p]
  }
  list(Q = Q, R = A)
}
But here I have not computed the matrix H that represents the Householder reflection at every step, nor have I formed the matrix A explicitly at every step.
As H = I - 2 v v', if I multiply by Q I obtain
QH = Q - 2 (Qv) v' // multiplication on the right
HQ = Q - 2 v (Q'v)' // multiplication on the left
Now, these operations should work at every step. However, if I consider the first matrix H and then the second matrix H1, ..., these matrices become smaller than the first one. To avoid that, I have used the following line of code:
Q <- Q - beta * Q %*% c(rep(0,k-1),a_r) %*% t(c(rep(0,k-1),a_r))
but I am not sure why the code works when, at every step, I pad the new vector a_r with k-1 leading zeros.
I thought you wanted exactly the same output as returned by qr.default, which uses compact QR storage. But then I realized that you are storing the Q and R factors separately.
Normally, QR factorization only forms R, not Q. In the following, I will describe QR factorization where both are formed. For those who lack a basic understanding of QR factorization, please read this first: lm(): What is qraux returned by QR decomposition in LINPACK / LAPACK, where there are neat math formulae arranged in LaTeX. I will assume that you know what a Householder reflection is and how it is computed.
QR factorization procedure
First of all, a Householder reflection is H = I - beta * v v' (where beta is computed as in your code), not H = I - 2 * v v'.
Then, QR factorization A = Q R proceeds as (Hp ... H2 H1) A = R, where Q = H1 H2 ... Hp. To compute Q, we initialize Q = I (identity matrix), then multiply Hk on the right iteratively in the loop. To compute R, we initialize R = A and multiply Hk on the left iteratively in the loop.
Now, at the k-th iteration, we have a rank-1 matrix update on Q and A:
Q := Q Hk = Q (I - beta v * v') = Q - (Q v) (beta v)'
A := Hk A = (I - beta v * v') A = A - (beta v) (A' v)'
v = c(rep(0, k-1), a_r), where a_r is the reduced, non-zero part of a full reflection vector.
The code you have does this update by brute force:
Q <- Q - beta * Q %*% c(rep(0,k-1), a_r) %*% t(c(rep(0,k-1),a_r))
It first pads a_r to get the full reflection vector and performs the rank-1 update on the whole matrix. But actually we can drop those zeros and write (do some matrix algebra if this is unclear):
Q[,k:n] <- Q[,k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
A[k:n,k:p] <- A[k:n,k:p] - tcrossprod(beta * a_r, crossprod(A[k:n,k:p], a_r))
so that only a fraction of Q and A are updated.
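As a small numeric sketch of that algebra (using an arbitrary reflector rather than one from an actual factorization), the padded full-matrix update and the trailing-column update give the same Q:
## Sketch: padded full update vs. trailing-column update for one reflector
set.seed(1)
n <- 5; k <- 2
Q <- matrix(rnorm(n * n), n)       # any n x n matrix works for this identity
a_r <- rnorm(n - k + 1)            # reduced (non-zero) part of the reflection vector
beta <- 2 / sum(a_r^2)
v <- c(rep(0, k - 1), a_r)         # zero-padded full reflection vector
Q_full <- Q - beta * Q %*% v %*% t(v)
Q_part <- Q
Q_part[, k:n] <- Q[, k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
max(abs(Q_full - Q_part))          # ~ 0: both updates agree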
Several other comments on your code
You have used t() and "%*%" a lot! But almost all of them can be replaced by crossprod() or tcrossprod(). This eliminates the explicit transpose t() and is more memory efficient;
You initialize another diagonal matrix Inp which is not necessary. To get the Householder reflection vector a_r, you can replace
sign <- ifelse(col[1] >= 0, -1, +1)
a_r <- col - sign * Inp[k:n,k] * norm1
by
a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * norm1
where sign is an R base function.
R code for QR factorization
## QR factorization: A = Q %*% R
## if `complete = FALSE` (default), return thin `Q`, `R` factor
## if `complete = TRUE`, return full `Q`, `R` factor
myqr <- function (A, complete = FALSE) {
  n <- nrow(A)
  p <- ncol(A)
  Q <- diag(n)
  for (k in 1:p) {
    # extract the kth column of the matrix
    col <- A[k:n, k]
    # calculate the norm of the column in order to create the reflection vector
    norm1 <- sqrt(drop(crossprod(col)))
    # calculate the reflection vector a_r
    a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * norm1
    # beta = 2 / ||a_r||^2
    beta <- 2 / drop(crossprod(a_r))
    # update matrix Q (trailing columns only) by the Householder reflection
    Q[, k:n] <- Q[, k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
    # update matrix A (trailing submatrix only) by the Householder reflection
    A[k:n, k:p] <- A[k:n, k:p] - tcrossprod(beta * a_r, crossprod(A[k:n, k:p], a_r))
  }
  if (complete) {
    A[lower.tri(A)] <- 0
    return(list(Q = Q, R = A))
  } else {
    R <- A[1:p, ]; R[lower.tri(R)] <- 0
    return(list(Q = Q[, 1:p], R = R))
  }
}
Now let's have a test:
X <- structure(c(0.8147, 0.9058, 0.127, 0.9134, 0.6324, 0.0975, 0.2785,
0.5469, 0.9575, 0.9649, 0.1576, 0.9706, 0.9572, 0.4854, 0.8003
), .Dim = c(5L, 3L))
# [,1] [,2] [,3]
#[1,] 0.8147 0.0975 0.1576
#[2,] 0.9058 0.2785 0.9706
#[3,] 0.1270 0.5469 0.9572
#[4,] 0.9134 0.9575 0.4854
#[5,] 0.6324 0.9649 0.8003
First for thin-QR version:
## thin QR factorization
myqr(X)
#$Q
# [,1] [,2] [,3]
#[1,] -0.49266686 -0.4806678 0.17795345
#[2,] -0.54775702 -0.3583492 -0.57774357
#[3,] -0.07679967 0.4754320 -0.63432053
#[4,] -0.55235290 0.3390549 0.48084552
#[5,] -0.38242607 0.5473120 0.03114461
#
#$R
# [,1] [,2] [,3]
#[1,] -1.653653 -1.1404679 -1.2569776
#[2,] 0.000000 0.9660949 0.6341076
#[3,] 0.000000 0.0000000 -0.8815566
Now the full QR version:
## full QR factorization
myqr(X, complete = TRUE)
#$Q
# [,1] [,2] [,3] [,4] [,5]
#[1,] -0.49266686 -0.4806678 0.17795345 -0.6014653 -0.3644308
#[2,] -0.54775702 -0.3583492 -0.57774357 0.3760348 0.3104164
#[3,] -0.07679967 0.4754320 -0.63432053 -0.1497075 -0.5859107
#[4,] -0.55235290 0.3390549 0.48084552 0.5071050 -0.3026221
#[5,] -0.38242607 0.5473120 0.03114461 -0.4661217 0.5796209
#
#$R
# [,1] [,2] [,3]
#[1,] -1.653653 -1.1404679 -1.2569776
#[2,] 0.000000 0.9660949 0.6341076
#[3,] 0.000000 0.0000000 -0.8815566
#[4,] 0.000000 0.0000000 0.0000000
#[5,] 0.000000 0.0000000 0.0000000
Now let's check standard result returned by qr.default:
QR <- qr.default(X)
## thin R factor
qr.R(QR)
# [,1] [,2] [,3]
#[1,] -1.653653 -1.1404679 -1.2569776
#[2,] 0.000000 0.9660949 0.6341076
#[3,] 0.000000 0.0000000 -0.8815566
## thin Q factor
qr.Q(QR)
# [,1] [,2] [,3]
#[1,] -0.49266686 -0.4806678 0.17795345
#[2,] -0.54775702 -0.3583492 -0.57774357
#[3,] -0.07679967 0.4754320 -0.63432053
#[4,] -0.55235290 0.3390549 0.48084552
#[5,] -0.38242607 0.5473120 0.03114461
## full Q factor
qr.Q(QR, complete = TRUE)
# [,1] [,2] [,3] [,4] [,5]
#[1,] -0.49266686 -0.4806678 0.17795345 -0.6014653 -0.3644308
#[2,] -0.54775702 -0.3583492 -0.57774357 0.3760348 0.3104164
#[3,] -0.07679967 0.4754320 -0.63432053 -0.1497075 -0.5859107
#[4,] -0.55235290 0.3390549 0.48084552 0.5071050 -0.3026221
#[5,] -0.38242607 0.5473120 0.03114461 -0.4661217 0.5796209
So our results are correct!
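As one more sanity check (a sketch), Q should be orthogonal and Q %*% R should reconstruct X:
out <- myqr(X, complete = TRUE)
max(abs(crossprod(out$Q) - diag(nrow(X))))  # ~ 0, so Q'Q = I
max(abs(out$Q %*% out$R - X))               # ~ 0, so Q R = X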

Sum and product over interval in R

I am trying to implement the following simple formulas in R:
Formula 1 (shown as an image in the original post; judging from the answers below, it is a product over i = 1, ..., n of terms (1+x)^i):
I have no idea how to implement the product operator in R when the limits of the interval are very large (e.g. an upper limit of 10,000 instead of 5).
Formula 2 (also shown as an image in the original post)
Example input for the second formula (in reality, the dimension of the interval S is much, much bigger):
S = list(c(1,0,0), c(0,1,0), c(0,0,1))
X = c(1,2,3)
Any help would be appreciated!
For the first, take the log: log((1+x)^i) = i * log(1+x), so you can sum these terms and exponentiate at the end.
For the second formula (it is not clear what the expected output should be):
ss<-matrix(unlist(S), ncol = 3, byrow = TRUE)
X<-as.matrix(X)
crossprod(crossprod(X,ss),t(X))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 4 6
[3,] 3 6 9
Maybe more compactly:
First formula:
function(n, x) exp(sum(seq_len(n)*log(1+x)))
Second formula:
function(X, S) rowSums(sapply(S, function(y) sum(X*y)*X ))
For the first formula, it has been mentioned that it is better to work on the log scale; if your true values of x are near 0, then the log1p function may be of help.
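As a sketch of that suggestion (prod_formula is a hypothetical name, and the product is assumed to run over i = 1, ..., n of (1+x)^i, as in the answer above):
## Sketch: same as exp(sum(seq_len(n) * log(1 + x))), but using log1p for accuracy near x = 0
prod_formula <- function(n, x) exp(sum(seq_len(n) * log1p(x)))
prod_formula(5, 0.01)  # should agree with prod((1 + 0.01)^(1:5))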
In general for these types of problems you can use lapply or sapply to compute the pieces that need to be multiplied or summed (or whatever), then use sum or prod to sum, multiply. If you want to collapse/combine the values with an operator that does not have a nice function like sum or prod then use the Reduce function.
S = list(c(1,0,0), c(0,1,0), c(0,0,1))
X = c(1,2,3)
lapply( lapply(S, function(x) X %*% x %*% t(X) ) , sum)
[[1]]
[1] 6
[[2]]
[1] 12
[[3]]
[1] 18
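If you prefer to collapse the list with an operator, as mentioned above, a small sketch using Reduce (equivalent to the rowSums approach given earlier):
## Sketch: combine the per-element contributions with Reduce instead of rowSums
Reduce(`+`, lapply(S, function(s) sum(X * s) * X))
# [1]  6 12 18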
