So I have an n x p matrix X, and I want to scale it so that X times its transpose has only ones on its diagonal. That is, I want
diag(X %*% t(X)) # n x p matrix X
to equal
rep(1,n)
I know it can be done with the help of the scale function in R, but I am not sure about the specifics.
You can divide each row of X by the square root of its sum of squares (its Euclidean norm).
set.seed(123)
X <- matrix(rnorm(15), 3, 5)
divisor <- sqrt(rowSums(X^2))
Y <- X / divisor
Y %*% t(Y)
# [,1] [,2] [,3]
# [1,] 1.0000000 -0.5620878 -0.6140127
# [2,] -0.5620878 1.0000000 0.2485122
# [3,] -0.6140127 0.2485122 1.0000000
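Why does plain division work here? X / divisor recycles divisor down the columns, and since divisor has exactly nrow(X) elements, row i gets divided by divisor[i]. An equivalent, more explicit form uses sweep(); a quick check, reusing X and Y from above:
Y2 <- sweep(X, 1, sqrt(rowSums(X^2)), "/")
all.equal(Y, Y2)
# [1] TRUE
all.equal(diag(Y %*% t(Y)), rep(1, nrow(X)))
# [1] TRUE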
I am working on a piece of code to find the QR factorization of a matrix in R.
X <- structure(c(0.8147, 0.9058, 0.127, 0.9134, 0.6324, 0.0975, 0.2785,
0.5469, 0.9575, 0.9649, 0.1576, 0.9706, 0.9572, 0.4854, 0.8003
), .Dim = c(5L, 3L))
myqr <- function(A) {
  n <- nrow(A)
  p <- ncol(A)
  Q <- diag(n)
  Inp <- diag(nrow = n, ncol = p)
  for (k in 1:ncol(A)) {
    # extract the kth column of the matrix
    col <- A[k:n, k]
    # norm of the column, needed to build the reflection vector
    norm1 <- sqrt(sum(col^2))
    # choose the sign opposite to that of col[1] to avoid cancellation
    sign <- ifelse(col[1] >= 0, -1, +1)
    # compute the reflection vector a_r
    a_r <- col - sign * Inp[k:n, k] * norm1
    # beta = 2 / ||a_r||^2
    beta <- 2 / sum(t(a_r) %*% a_r)
    # update the matrix Q at every step
    Q <- Q - beta * Q %*% c(rep(0, k-1), a_r) %*% t(c(rep(0, k-1), a_r))
    # update the matrix R at every step
    A[k:n, k:p] <- A[k:n, k:p] - beta * a_r %*% t(a_r) %*% A[k:n, k:p]
  }
  list(Q = Q, R = A)
}
But here I have not computed, at every step, the matrix H that represents the Householder reflection, and I have not rebuilt the full matrix A at every step either.
As H = I - 2 v v', multiplying by Q gives
QH = Q - 2 (Qv) v'   // H applied on the right
HQ = Q - 2 v (Q'v)'  // H applied on the left
Now, these operations should work at every step. However, if I call the first reflection matrix H, the second one H1, and so on, each successive matrix is smaller than the one before it. To avoid that, I have used the following line of code:
Q <- Q - beta * Q %*% c(rep(0,k-1),a_r) %*% t(c(rep(0,k-1),a_r))
but I am not sure why the code works when, at every step, I pad the vector a_r with k - 1 leading zeros.
At first I thought you wanted exactly the same output as returned by qr.default, which uses compact QR storage. But then I realized that you are storing the Q and R factors separately.
Normally, QR factorization forms only R, not Q. In the following, I will describe QR factorization where both are formed. For those who lack a basic understanding of QR factorization, please read this first: lm(): What is qraux returned by QR decomposition in LINPACK / LAPACK, where the math is laid out in neat LaTeX formulae. In the following, I will assume that you know what a Householder reflection is and how it is computed.
QR factorization procedure
First of all, a Householder reflection matrix is H = I - beta * v v' (where beta is computed as in your code), not H = I - 2 * v v'.
Then, QR factorization A = Q R proceeds as (Hp ... H2 H1) A = R, where Q = H1 H2 ... Hp. To compute Q, we initialize Q = I (identity matrix), then multiply Hk on the right iteratively in the loop. To compute R, we initialize R = A and multiply Hk on the left iteratively in the loop.
Now, at the k-th iteration, we have a rank-1 update on Q and A:
Q := Q Hk = Q (I - beta v * v') = Q - (Q v) (beta v)'
A := Hk A = (I - beta v * v') A = A - (beta v) (A' v)'
v = c(rep(0, k-1), a_r), where a_r is the reduced, non-zero part of a full reflection vector.
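As a quick side check (a minimal sketch, not from the original answer; the 4 x 3 matrix below is made up), we can form one reflection H = I - beta * v v' explicitly and verify that it zeroes out the first column below the diagonal:
set.seed(1)
B <- matrix(rnorm(12), 4, 3)
col <- B[, 1]
a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * sqrt(sum(col^2))
beta <- 2 / sum(a_r^2)
H <- diag(4) - beta * tcrossprod(a_r)
round(H %*% B[, 1], 10)  # only the first entry is non-zero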
The code you have does this update by brute force:
Q <- Q - beta * Q %*% c(rep(0,k-1), a_r) %*% t(c(rep(0,k-1),a_r))
It first pads a_r to get the full reflection vector and performs the rank-1 update on the whole matrix. But actually we can drop those zeros and write (do some matrix algebra if unclear):
Q[,k:n] <- Q[,k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
A[k:n,k:p] <- A[k:n,k:p] - tcrossprod(beta * a_r, crossprod(A[k:n,k:p], a_r))
so that only a fraction of Q and A are updated.
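If in doubt, here is a quick numerical check (a standalone sketch with made-up dimensions) that the trimmed update agrees with the zero-padded one:
set.seed(42)
n <- 5; k <- 3
Q <- qr.Q(qr(matrix(rnorm(n * n), n)))   # some orthogonal matrix
a_r <- rnorm(n - k + 1)
beta <- 2 / sum(a_r^2)
v <- c(rep(0, k - 1), a_r)
Q_pad <- Q - beta * Q %*% v %*% t(v)
Q_trim <- Q
Q_trim[, k:n] <- Q_trim[, k:n] - tcrossprod(Q_trim[, k:n] %*% a_r, beta * a_r)
all.equal(Q_pad, Q_trim)
# [1] TRUE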
Several other comments on your code
You have used t() and "%*%" a lot! But almost all of them can be replaced by crossprod() or tcrossprod(). This eliminates the explicit transpose t() and is more memory-efficient (see the short sketch after these comments);
You initialize another diagonal matrix Inp, which is not necessary. To get the Householder reflection vector a_r, you can replace
sign <- ifelse(col[1] >= 0, -1, +1)
a_r <- col - sign * Inp[k:n,k] * norm1
by
a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * norm1
where sign is an R base function.
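For example, a small sketch of the crossprod() / tcrossprod() equivalences used above:
a_r <- c(3, 1, 2)
all.equal(drop(crossprod(a_r)), sum(a_r^2))   # crossprod(x) == t(x) %*% x
# [1] TRUE
M <- matrix(1:9, 3)
all.equal(crossprod(M, a_r), t(M) %*% a_r)    # crossprod(M, x) == t(M) %*% x
# [1] TRUE
all.equal(tcrossprod(a_r), a_r %*% t(a_r))    # tcrossprod(x) == x %*% t(x)
# [1] TRUE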
R code for QR factorization
## QR factorization: A = Q %*% R
## if `complete = FALSE` (default), return thin `Q`, `R` factor
## if `complete = TRUE`, return full `Q`, `R` factor
myqr <- function (A, complete = FALSE) {
  n <- nrow(A)
  p <- ncol(A)
  Q <- diag(n)
  for (k in 1:p) {
    # extract the kth column of the matrix
    col <- A[k:n, k]
    # norm of the column, needed to build the reflection vector
    norm1 <- sqrt(drop(crossprod(col)))
    # compute the reflection vector a_r
    a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * norm1
    # beta = 2 / ||a_r||^2
    beta <- 2 / drop(crossprod(a_r))
    # update matrix Q (trailing columns only) by the Householder reflection
    Q[, k:n] <- Q[, k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
    # update matrix A (trailing block only) by the Householder reflection
    A[k:n, k:p] <- A[k:n, k:p] - tcrossprod(beta * a_r, crossprod(A[k:n, k:p], a_r))
  }
  if (complete) {
    A[lower.tri(A)] <- 0
    return(list(Q = Q, R = A))
  } else {
    R <- A[1:p, ]; R[lower.tri(R)] <- 0
    return(list(Q = Q[, 1:p], R = R))
  }
}
Now let's have a test:
X <- structure(c(0.8147, 0.9058, 0.127, 0.9134, 0.6324, 0.0975, 0.2785,
0.5469, 0.9575, 0.9649, 0.1576, 0.9706, 0.9572, 0.4854, 0.8003
), .Dim = c(5L, 3L))
# [,1] [,2] [,3]
#[1,] 0.8147 0.0975 0.1576
#[2,] 0.9058 0.2785 0.9706
#[3,] 0.1270 0.5469 0.9572
#[4,] 0.9134 0.9575 0.4854
#[5,] 0.6324 0.9649 0.8003
First, the thin QR version:
## thin QR factorization
myqr(X)
#$Q
# [,1] [,2] [,3]
#[1,] -0.49266686 -0.4806678 0.17795345
#[2,] -0.54775702 -0.3583492 -0.57774357
#[3,] -0.07679967 0.4754320 -0.63432053
#[4,] -0.55235290 0.3390549 0.48084552
#[5,] -0.38242607 0.5473120 0.03114461
#
#$R
# [,1] [,2] [,3]
#[1,] -1.653653 -1.1404679 -1.2569776
#[2,] 0.000000 0.9660949 0.6341076
#[3,] 0.000000 0.0000000 -0.8815566
Now the full QR version:
## full QR factorization
myqr(X, complete = TRUE)
#$Q
# [,1] [,2] [,3] [,4] [,5]
#[1,] -0.49266686 -0.4806678 0.17795345 -0.6014653 -0.3644308
#[2,] -0.54775702 -0.3583492 -0.57774357 0.3760348 0.3104164
#[3,] -0.07679967 0.4754320 -0.63432053 -0.1497075 -0.5859107
#[4,] -0.55235290 0.3390549 0.48084552 0.5071050 -0.3026221
#[5,] -0.38242607 0.5473120 0.03114461 -0.4661217 0.5796209
#
#$R
# [,1] [,2] [,3]
#[1,] -1.653653 -1.1404679 -1.2569776
#[2,] 0.000000 0.9660949 0.6341076
#[3,] 0.000000 0.0000000 -0.8815566
#[4,] 0.000000 0.0000000 0.0000000
#[5,] 0.000000 0.0000000 0.0000000
Now let's check the standard result returned by qr.default:
QR <- qr.default(X)
## thin R factor
qr.R(QR)
# [,1] [,2] [,3]
#[1,] -1.653653 -1.1404679 -1.2569776
#[2,] 0.000000 0.9660949 0.6341076
#[3,] 0.000000 0.0000000 -0.8815566
## thin Q factor
qr.Q(QR)
# [,1] [,2] [,3]
#[1,] -0.49266686 -0.4806678 0.17795345
#[2,] -0.54775702 -0.3583492 -0.57774357
#[3,] -0.07679967 0.4754320 -0.63432053
#[4,] -0.55235290 0.3390549 0.48084552
#[5,] -0.38242607 0.5473120 0.03114461
## full Q factor
qr.Q(QR, complete = TRUE)
# [,1] [,2] [,3] [,4] [,5]
#[1,] -0.49266686 -0.4806678 0.17795345 -0.6014653 -0.3644308
#[2,] -0.54775702 -0.3583492 -0.57774357 0.3760348 0.3104164
#[3,] -0.07679967 0.4754320 -0.63432053 -0.1497075 -0.5859107
#[4,] -0.55235290 0.3390549 0.48084552 0.5071050 -0.3026221
#[5,] -0.38242607 0.5473120 0.03114461 -0.4661217 0.5796209
So our results are correct!
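Beyond eyeballing the output, we can also verify the factorization programmatically (a quick sanity check, reusing the X and myqr defined above):
out <- myqr(X)
all.equal(out$Q %*% out$R, X)        # reconstruction
# [1] TRUE
all.equal(crossprod(out$Q), diag(3)) # thin Q has orthonormal columns
# [1] TRUE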
The SVD works well with R:
A = matrix(1:12,3,4)
A
u = svd(A)$u
v = svd(A)$v
sigma = diag(svd(A)$d)
u %*% sigma %*% t(v) # = A as desired
But unlike the usual statement of the SVD theorem, v is not a 4x4 matrix (it should be!):
dim(v) # (4,3)
Why is it so?
According to the theorem,
v should be of format (4,4),
sigma should be of format (3,4).
By the way what would be shortest way to create a diag(svd(A)$d) zero-padded in order to be of format (3,4)?
To get the full U and V matrices, use the nu= and nv= arguments to svd(). To pad a diagonal matrix with zeros, use the nrow= and ncol= arguments to diag():
A <- matrix(1:12,3,4)
D <- svd(A, nu=nrow(A), nv=ncol(A))
u <- D$u
v <- D$v
sigma <- diag(D$d, nrow=nrow(A), ncol=ncol(A))
## Check that that worked:
dim(u)
# [1] 3 3
dim(v)
# [1] 4 4
dim(sigma)
# [1] 3 4
u %*% sigma %*% t(v)
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 5 8 11
# [3,] 3 6 9 12
It's just a different convention; different systems and textbooks define the SVD one way or the other. The important thing is the orthogonality property U'U = I. In either convention the singular vectors minimize the least-squares distances in projection.
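For instance, with the full factors u and v computed in the answer above, the orthogonality can be checked directly:
all.equal(crossprod(u), diag(3))   # t(u) %*% u = I
# [1] TRUE
all.equal(crossprod(v), diag(4))   # t(v) %*% v = I
# [1] TRUE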
Here's a development of the theory that has the dimension conventions the same as in LINPACK and R: https://www.cs.princeton.edu/courses/archive/spring12/cos598C/svdchapter.pdf
For part two:
diag(c(svd(A)$d,0),nrow=3,ncol=4)
I need help understanding "true matrix multiplication": x %*% y.
What is it, and when is it useful? I've done some testing, but I don't get it.
I've tried this:
1) Created these matrices:
x <- matrix(1:4, 2,2); y <- matrix(rep(10,4), 2,2)
2) Print x:
     [,1] [,2]
[1,]    1    3
[2,]    2    4
3) Print y:
     [,1] [,2]
[1,]   10   10
[2,]   10   10
4) Print x * y:
     [,1] [,2]
[1,]   10   30
[2,]   20   40
5) Print x %*% y:
     [,1] [,2]
[1,]   40   40
[2,]   60   60
Help.
z <- x * y performs element-by-element multiplication. That is, z[i,j] == x[i,j] * y[i,j] for each i and j. In this case, x and y must have the same dimensions.
Matrix multiplication takes the inner products of the rows of one matrix with the columns of the other. For z <- x %*% y, x must have as many columns as y has rows. In that case, z[i,j] is the inner product of the ith row of x with the jth column of y.
One use is in linear algebra. If x represents a linear transformation F and y represents a linear transformation G, then x %*% y represents the composite linear transformation F o G.
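To make the composition point concrete (a small sketch using the x and y above): applying y to a vector and then x gives the same result as applying x %*% y once.
v <- c(1, 2)
x %*% (y %*% v)   # apply G first, then F
#      [,1]
# [1,]  120
# [2,]  180
(x %*% y) %*% v   # one application of the composite F o G
#      [,1]
# [1,]  120
# [2,]  180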
I have a data matrix X of dimensions 5000 x 250, plus an extra parameter Y (dimensions 5000 x 1). The following loop gives me the desired results, but it takes forever to compute.
for (i in 1:ncol(X)) {
  for (j in 1:nrow(X)) {
    X[j, i] <- Y[j, 1] ^ X[j, i]
  }
}
Is there any way to optimize this? If I didn't require the exponentiation, I'd use matrix multiplication. Thanks!
Turn your column vector y into a matrix and use elementwise ^.
matrix(y, nrow=nrow(X), ncol=ncol(X)) ^ X
or
rep(y, times=ncol(X)) ^ X
You can use the vectorised ^ if you construct a matrix of y's of the correct size:
x <- matrix(1:9,3)
y <- matrix(1:3,ncol=1)
do.call(cbind,replicate(ncol(x),list(y)))^x
     [,1] [,2]  [,3]
[1,]    1    1     1
[2,]    4   32   256
[3,]   27  729 19683
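As a side note (worth verifying on your own data): since y has exactly nrow(x) entries, R's column-major recycling lets you skip building the full matrix altogether:
drop(y) ^ x   # y is recycled down each column of x
#      [,1] [,2]  [,3]
# [1,]    1    1     1
# [2,]    4   32   256
# [3,]   27  729 19683
all.equal(drop(y) ^ x, do.call(cbind, replicate(ncol(x), list(y))) ^ x)
# [1] TRUE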