Calculating the 'hat' matrix in R

In calculating the 'hat' matrix in weighted least squares, part of the calculation is
X^T W X
However, I am unsure how one would do this in R.
See the following example:
x <- matrix(c(1,2,3,4,5,6),nrow=3,ncol=2,byrow=T)
xt <- t(x)
w <- as.vector(c(7,8,9))
xt*w%*%x
Which gives the error:
Error in xt * w %*% x : non-conformable arrays
Is there anything basic I have misunderstood?
EDIT
xt%*%w%*%x
gives the error:
Error in xt %*% w %*% x : non-conformable arguments

w needs to be 3x3, so use diag to construct w as a matrix with those values on the diagonal instead of using a vector:
x <- matrix(c(1,2,3,4,5,6),nrow=3,ncol=2,byrow=T)
xt <- t(x)
w <- diag(c(7,8,9))
xt %*% w %*% x
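For completeness, a hedged sketch of the full weighted hat matrix assembled from these pieces, using the standard WLS formula H = X (X^T W X)^{-1} X^T W (this goes beyond what the question asked):
h <- x %*% solve(t(x) %*% w %*% x) %*% t(x) %*% w  # 3x3 weighted hat matrix
diag(h)  # leverages of the three observations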

I am a little rusty on regressions, but I think the hatvalues function is what you are looking for. ?hatvalues provides a useful list of other diagnostics.
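As a rough sketch of how that might look here (y is a hypothetical response, not part of the question):
y <- c(1, 0, 2)                           # hypothetical response values
fit <- lm(y ~ x - 1, weights = c(7,8,9))  # columns of x as predictors, no extra intercept
hatvalues(fit)                            # leverages; should agree with diag(h) above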

In your R code, w is a vector. It should be a diagonal matrix:
Replace this line:
w <- as.vector(c(7,8,9))
by this:
w <- as.vector(c(7,8,9))*diag(3)

Related

Why do we add noise to matrix cross product?

I'm looking at code for an R package where the cross product of a matrix is computed to calculate the Lipschitz constant. Theoretically, the constant is calculated as the L2 norm of the cross product of X, so it should be
x <- t(X) %*% (X)
L <- sqrt(sum(x^2))
However, in the code implementation, it is calculated as
x <- matrix(rnorm(n),c(n,1))
x <- x / sqrt(sum(x^2))
x <- t(X) %*% (X %*% x)
L <- sqrt(sum(x^2))
So random noise is being added to X. Why is this being done?
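One reading, offered here as an interpretation rather than something stated in the thread: no noise is added to X itself. The code draws a random unit vector and pushes it once through t(X) %*% X, which is the classic power-method probe for the largest eigenvalue of t(X) %*% X, i.e. the squared spectral norm of X. Iterating the step sharpens the estimate, as in this sketch (X here is made-up example data):
set.seed(1)
X <- matrix(rnorm(50 * 5), 50, 5)   # example data, assumed for illustration
x <- rnorm(ncol(X))
x <- x / sqrt(sum(x^2))             # random unit starting vector
for (i in 1:100) {                  # power iteration on t(X) %*% X
  x <- t(X) %*% (X %*% x)
  x <- x / sqrt(sum(x^2))
}
sqrt(sum((t(X) %*% (X %*% x))^2))   # ~ largest eigenvalue of t(X) %*% X
max(eigen(t(X) %*% X)$values)       # reference value for comparison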

calculate density of multivariate normal distribution manually

I want to calculate the density of a multivariate normal distribution manually. As inputs to my function, I have x, an n*p matrix of data points, a vector mu of p means, and a p*p covariance matrix sigma.
I wrote the following function for this:
dmnorm <- function(mu, sigma, x){
  k <- ncol(sigma)
  x <- t(x)
  dmn <- exp((-1/2)*t(x-mu)%*%solve(sigma)%*%(x-mu))/sqrt(((2*pi)^k)*det(sigma))
  return(dmn)
}
My own function gives me a matrix of n*n. However, I should get a vector of length n.
In the end, I want the same results as I get from using the dmvnorm() function from the mvtnorm package. What's wrong with my code?
The expression t(x-mu)%*%solve(sigma)%*%(x-mu) is n x n (after the transposition inside the function, x-mu is p x n), so that's why your result is that size. You want the diagonal of that matrix, which you can get using
diag(t(x-mu)%*%solve(sigma)%*%(x-mu))
so the full function should be
dmnorm <- function(mu, sigma, x){
  k <- ncol(sigma)
  x <- t(x)
  dmn <- exp((-1/2)*diag(t(x-mu)%*%solve(sigma)%*%(x-mu)))/sqrt(((2*pi)^k)*det(sigma))
  dmn
}
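A quick sanity check of the fixed function against dmvnorm() from the mvtnorm package (assuming mvtnorm is installed; the test inputs are made up):
library(mvtnorm)
set.seed(1)
x <- matrix(rnorm(10), nrow = 5, ncol = 2)  # n = 5 points in p = 2 dimensions
mu <- c(0, 0)
sigma <- diag(2)
all.equal(as.vector(dmnorm(mu, sigma, x)),
          dmvnorm(x, mean = mu, sigma = sigma))  # should be TRUE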

How to code quadratic form both naively and efficiently

I'm trying to code a quadratic form Z'(S)^{-1} Z
The code is as below
z <- matrix(rnorm(200 * 100), 200, 100)
S <- cov(z)
quad.naive <- function(z, S) {
  Sinv <- solve(S)
  rowSums((z %*% Sinv) * z)
}
However, I'm not sure I understand thoroughly the last line of the function
rowSums((z %*% Sinv) * z)
Because naively, we should just type exactly the same as the mathematical formula which is
t(Z) %*% Sinv %*% Z
So, can anyone explain why the row-sums form is the same as the naive mathematical form, and especially why, after the matrix multiplication of z and Sinv, it uses the element-wise multiplication symbol * with z rather than %*%?
(z %*% Sinv) * z
The following is a bit too long for a comment.
"I'm trying to code a quadratic form Z'(S)^{-1} Z" I don't think the quadratic form is correct.
Assume Z is an m x n matrix. Then:
S = cov(Z) is an n x n matrix
S^-1 is an n x n matrix
t(Z) is an n x m matrix
So Z' S^-1 Z (in R: t(Z) %*% solve(S) %*% Z) would mean multiplying matrices with the following dimensions
(n x m) (n x n) (m x n)
which obviously won't work.
Perhaps you meant Z %*% solve(S) %*% t(Z) which returns a m x m matrix, the diagonal of which is the same as rowSums(Z %*% Sinv * Z).
More fundamentally: Shouldn't the quadratic form be a scalar? Or are you talking about a different quadratic form?
Ok, following our exchange in the comments and the link you gave to the relevant section in the book Advanced Statistical Computing I think I understand what the issue is.
I post this as a separate (and real) answer, to avoid confusing future readers who may want to read through the train of thought in the comments.
Let's return to the code given in your post (which is copied from section 1.3.3 Multivariate Normal Distribution)
set.seed(2017-07-13)
z <- matrix(rnorm(200 * 100), 200, 100)
S <- cov(z)
quad.naive <- function(z, S) {
  Sinv <- solve(S)
  rowSums((z %*% Sinv) * z)
}
Considering that the quadratic form is defined as the scalar quantity z' Sigma^-1 z (or in R language t(z) %*% solve(Sigma) %*% z) for a random p × 1 column vector, two questions may arise:
Why is z given as a matrix (instead of a p-dimensional column vector, as stated in the book), and
what is the reason for using rowSums in quad.naive?
First off, keep in mind that the quadratic form is a scalar quantity for a single random multivariate sample. What quad.naive is actually returning is the distribution of the quadratic form in multivariate samples (plural!). z here contains 200 samples from a p = 100-dimensional normal.
Then S is the 100 x 100 covariance matrix, and solve(S) returns the inverse of S. The quantity z %*% Sinv * z (the additional brackets are unnecessary because %*% has higher precedence than *) is a 200 x 100 matrix whose i-th row is the element-wise product of the i-th row of z %*% Sinv with the i-th row of z. Taking the rowSums therefore gives z_i' Sinv z_i, the scalar quadratic form, for every sample. Note that you get the same result with diag(z %*% Sinv %*% t(z)), but quad.naive avoids forming the full 200 x 200 matrix product and the additional transposition.
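A quick numerical check of that equivalence (a small sketch with arbitrary dimensions, so as not to clobber the variables above):
set.seed(1)
zz <- matrix(rnorm(10 * 3), 10, 3)    # 10 samples in 3 dimensions
Si <- solve(cov(zz))
all.equal(rowSums(zz %*% Si * zz),
          diag(zz %*% Si %*% t(zz)))  # TRUE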
A more fundamental question remains: Why look at the distribution of quadratic forms? It can be shown that the distribution of certain quadratic forms in standard normal variables follows a chi-square distribution (see e.g. Mathai and Provost, Quadratic Forms in Random Variables: Theory and Applications and Normal distribution - Quadratic forms)
Specifically, we can show that the quadratic form (x - μ)' Σ^-1 (x - μ) for a p × 1 column vector is chi-square distributed with p degrees of freedom.
To illustrate this, let's draw 100 samples from a bivariate standard normal, and calculate the quadratic forms for every sample.
set.seed(2020)
nSamples <- 100
z <- matrix(rnorm(nSamples * 2), nSamples, 2)
S <- cov(z)
Sinv <- solve(S)
dquadform <- rowSums(z %*% Sinv * z)
We can visualise the distribution as a histogram and overlay the theoretical chi-square density for 2 degrees of freedom.
library(ggplot2)
bw = 0.2
ggplot(data.frame(x = dquadform), aes(x)) +
  geom_histogram(binwidth = bw) +
  stat_function(fun = function(x) dchisq(x, df = 2) * nSamples * bw)
Finally, results from a Kolmogorov-Smirnov test comparing the distribution of the quadratic forms with the cumulative chi-square distribution with 2 degrees of freedom lead us to fail to reject the null hypothesis (of the equality of both distributions).
ks.test(dquadform, pchisq, df = 2)
#
# One-sample Kolmogorov-Smirnov test
#
#data: dquadform
#D = 0.063395, p-value = 0.8164
#alternative hypothesis: two-sided

R error QP:not a positive definite matrix?

I'm trying to solve the following problem in R, using the quadprog package:
min: vec %*% p + t(p) %*% mat %*% p
st: p >= 0
where
mat <- matrix(c(1162296,0,0,0,0,1,0,0,951.7089,0,1,0,-951.7089,0,0,1),4)
vec <- c(6341934541.1,175800.1,-356401.7,14398073047.1)
I've used
library(quadprog)
solve.QP(2*mat,-vec, diag(4), integer(4))
but I keep getting the following error:
Error in solve.QP(2*mat, -vec, diag(4), integer(4)) :
matrix D in quadratic function is not positive definite!
However, clearly
> eigen(mat)$values > 0
[1] TRUE TRUE TRUE TRUE
What am I doing wrong? How come this error keeps showing up?
Your matrix mat is not symmetric. The quadprog package is designed to solve quadratic programs, which, by definition, require a symmetric matrix in the highest-order term.
One way to solve the problem as written is a general constrained optimization algorithm. For example, you can try constrOptim like so:
# system matrices
mat <- matrix(c(1162296,0,0,0,0,1,0,0,951.7089,0,1,0,-951.7089,0,0,1),4)
vec <- c(6341934541.1,175800.1,-356401.7,14398073047.1)
# an initial value
p0 <- c(1,1,1,1)
# the objective function
objective <- function(p) {
  vec %*% p + t(p) %*% mat %*% p
}
# solve -- warning! without additional work you won't know if this is a global minimum solution.
solution <- constrOptim(p0, objective, NULL, diag(4), c(0,0,0,0))
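As an aside not raised in the answer: the quadratic form depends only on the symmetric part of mat, since t(p) %*% mat %*% p equals t(p) %*% ((mat + t(mat))/2) %*% p. So, as a sketch, the problem can also be handed to quadprog after symmetrizing. solve.QP minimizes (1/2) t(p) %*% D %*% p - t(d) %*% p, so D = mat + t(mat) with d = -vec reproduces the stated objective, and this particular D happens to be positive definite:
library(quadprog)
D <- mat + t(mat)                      # symmetric, same quadratic form as mat
solve.QP(D, -vec, diag(4), rep(0, 4))  # constraints: p >= 0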

Solve indeterminate equation system in R

I have an equation system that I want to solve using numerical methods: given a starting seed, I want to find a nearby solution. Let me explain.
I have a vector of constants, X, with values:
X <- c(1,-2,3,4)
and a vector W of weights:
W <- c(0.25,0.25,0.25,0.25)
I want the sum of the components of W to be 1 (sum(W) = 1), and the sum of the element-wise products of X and W to be a given number N (sum(W*X) = N).
Is there an easy way to do this in R? I have it in Excel, using Solver, but I need to automate it.
Here are your constants and your target value:
x <- c(1, -2, 3, 4)
n <- 10
You need a function to minimize. The first line contains each of your conditions, and the second line provides a measure of how to combine the errors into a single score. You may want to change the second line; for example, you could weight one error term more heavily than the other using sum(c(1, 5) * errs ^ 2).
fn <- function(w)
{
  errs <- c(sum(w) - 1, sum(x * w) - n)
  sum(errs ^ 2)
}
The simplest thing is to start with all the weights equal.
init_w <- rep.int(1 / length(x), length(x))
Use optim to optimize.
optim(init_w, fn)
## $par
## [1] 0.1204827 -1.2438883 1.1023338 1.0212406
##
## $value
## [1] 7.807847e-08
##
## $counts
## function gradient
## 111 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
The par element contains your weights.
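As a quick check that the optimized weights approximately satisfy both conditions (a sketch; optim stops at a tolerance rather than at an exact solution):
sol <- optim(init_w, fn)
w <- sol$par
sum(w)      # ~ 1
sum(w * x)  # ~ 10, the target n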
There is no unique solution for this problem. If you try other initial values for w you will most likely get different results from optim.
The problem can be formulated as solving an underdetermined system of linear equations.
A <- matrix(c(rep(1,4),x), nrow=2,byrow=TRUE)
b <- matrix(c(1,n), nrow=2)
We seek a solution that satisfies A %*% w = b, but which one? The minimum-norm solution? Or maybe some other one? There are infinitely many solutions, and they can be expressed using the pseudo-inverse of the matrix A. Use the package MASS for this.
library(MASS)
Ag <- ginv(A)
The minimum norm solution is
wmnorm <- Ag %*% b
And check with A %*% wmnorm - b and fn(wmnorm).
See the section Matrix solutions of the Wikipedia page System of linear equations.
The solutions are given by
Az <- diag(nrow=nrow(Ag)) - Ag %*% A
w <- wmnorm + Az %*% z
where z is an arbitrary vector of ncol(Az) elements.
And now generate some solutions and check
xb <- wmnorm
z <- runif(4)
wsol.2 <- xb + Az %*% z
wsol.2
A %*% wsol.2 - b
fn(wsol.2)
z <- runif(4)
wsol.3 <- xb + Az %*% z
wsol.3
A %*% wsol.3 - b
fn(wsol.3)
And you'll see that these two solutions are valid when given as arguments to fn, and are quite different from the solution found by optim. You could test this by choosing a different starting point for optim, for example init_w1 <- runif(4)/4.
