Solve indeterminate equation system in R - r

I have a equation system and I want to solve it using numerical methods. I want to get a close solution given a starting seed. Let me explain.
I have a vector of constants ,X, of values:
X <- (c(1,-2,3,4))
and a vector W of weights:
W <- (c(0.25,0.25,0.25,0.25))
I want that the sum of the components of W will be (sum(W)=1), and the sum of the multiplication of X and W element by element will be a given number N (sum(W*X)=N).
Is there a easy way to do this in R? I have it in Excel, using Solver, but I need to automatize it.

Here is your constant and your target value:
x <- c(1, -2, 3, 4)
n <- 10
You need a function to minimize. The first line contains each of your conditions, and the second line provides a measure of how to combine the errors into a single score. You may want to change the second line. For example, you could make one error term be more heavily weighted than the other using sum(c(1, 5) * errs ^ 2).
fn <- function(w)
{
errs <- c(sum(w) - 1, sum(x * w) - n)
sum(errs ^ 2)
}
The simplest thing is to start with all the weights the same value.
init_w <- rep.int(1 / length(x), length(x))
Use optim to optimize.
optim(init_w, fn)
## $par
## [1] 0.1204827 -1.2438883 1.1023338 1.0212406
##
## $value
## [1] 7.807847e-08
##
## $counts
## function gradient
## 111 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
The par element contains your weights.

There is no unique solution for this problem. If you try other initial values for w you will most likely get different results from optim.
The problem can be formulated as solving an underdetermined system of linear equations.
A <- matrix(c(rep(1,4),x), nrow=2,byrow=TRUE)
b <- matrix(c(1,n), nrow=2)
We seek a solution that satisfies A %*% w = b but which one? Minimum norm solution? Or maybe some other one? There are infinitely many solutions. Solutions can be given using the pseudo-inverse of the matrix A. Use package MASS for this.
library(MASS)
Ag <- ginv(A)
The minimum norm solution is
wmnorm <- Ag %*% b
And check with A %*% wmnorm - b and fn(wmnorm).
See the Wikipedia page System of linear equations
the section Matrix solutions.
The solutions are given by
Az <- diag(nrow=nrow(Ag)) - Ag %*% A
w <- wmnorm + Az %*% z
where z is an arbitrary vector of ncol(Az) elements.
And now generate some solutions and check
xb <- wmnorm
z <- runif(4)
wsol.2 <- xb + Az %*% z
wsol.2
A %*% wsol.2 - b
fn(wsol.2)
z <- runif(4)
wsol.3 <- xb + Az %*% z
wsol.3
A %*% wsol.2 - b
fn(wsol.3)
And you'll see that these two solutions are valid solutions when given as argument to fn. And are quite different from the solution found by optim. You could test this by choosing a different starting point init_w for example by init_w1 <- runif(4)/4.

Related

chol2inv(chol(x)) and solve(x)

I assumed that chol2inv(chol(x)) and solve(x) are two different methods that arrive at the same conclusion in all cases. Consider for instance a matrix S
A <- matrix(rnorm(3*3), 3, 3)
S <- t(A) %*% A
where the following two commands will give equivalent results:
solve(S)
chol2inv(chol(S))
Now consider the transpose of the Cholesky decomposition of S:
L <- t(chol(S))
where now the results of the following two commands do not give equivalent results anymore:
solve(L)
chol2inv(chol(L))
This surprised me a bit. Is this expected behavior?
chol expects (without checking) that its first argument x is a symmetric positive definite matrix, and it operates only on the upper triangular part of x. Thus, if L is a lower triangular matrix and D = diag(diag(L)) is its diagonal part, then chol(L) is actually equivalent to chol(D), and chol2inv(chol(L)) is actually equivalent to solve(D).
set.seed(141339L)
n <- 3L
S <- crossprod(matrix(rnorm(n * n), n, n))
L <- t(chol(S))
D <- diag(diag(L))
all.equal(chol(L), chol(D)) # TRUE
all.equal(chol2inv(chol(L)), solve(D)) # TRUE

R Derivatives of an Inverse

I have an expression that contains several parts. However, for simplicity, consider only the following part as MWE:
Let's assume we have the inverse of a matrix Y that I want to differentiate w.r.t. x.
Y is given as I - (x * b * t(b)), where I is the identity matrix, x is a scalar, and b is a vector.
According to The Matrix Cookbook Equ. 59, the partial derivative of an inverse is:
Normally I would use the function D from the package stats to calculate the derivatives. But that is not possible in this case, because e.g. solve to specify Y as inverse and t() is not in the table of derivatives.
What is the best workaround to circumvent this problem? Are there any other recommended packages that can handle such input?
Example that doesn't work:
f0 <- expression(solve(I - (x * b %*% t(b))))
D(f0, "x")
Example that works:
f0 <- expression(x^3)
D(f0, "x")
3 * x^2
I assume that the question is how to get an explicit expression for the derivative of the inverse of Y with respect to x. In the first section we compute it and in the second section we double check it by computing it numerically and show that the two approaches give the same result.
b and the null space of b are both eigenspaces of Y which we can readily verify by noting that Yb = (1-(b'b)x)b and if z belongs to the nullspace of b then Yz = z. This also shows that the corresponding eigenvalues are 1 - x(b'b) with multiplicity 1 and 1 with multiplicity n-1 (since the nullspace of b has that dimension).
As a result of the fact that we can expand such a matrix into the sum of each eigenvalue times the projection onto its eigenspace we can express Y as the following where bb'/b'b is the projection onto the eigenspace spanned by b and the part pre-multiplying it is the eigenvalue. The remaining terms do not involve x because they involve an eigenvalue of 1 independently of x and the nullspace of b is independent of x as well.
Y = (1-x(b'b))(bb')/(b'b) + terms not involving x
The inverse of Y is formed by taking the reciprocals of the eigenvalues so:
Yinv = 1/(1-x(b'b)) * (bb')/(b'b) + terms not involving x
and the derivative of that wrt x is:
(b'b) / (1 - x(b'b))^2 * (bb')/(b'b)
Cancelling the b'b and writing the derivative in terms of R code:
1/(1 - x*sum(b*b))^2*outer(b, b)
Double check
Using specific values for b and x we can verify it against the numeric derivative as follows:
library(numDeriv)
x <- 1
b <- 1:3
# Y inverse as a function of x
Yinv <- function(x) solve(diag(3) - x * outer(b, b))
all.equal(matrix(jacobian(Yinv, x = 1), 3),
1/(1 - x*sum(b*b))^2*outer(b, b))
## [1] TRUE

How to generate samples from MVN model?

I am trying to run some code on R based on this paper here through example 5.1. I want to simulate the following:
My background on R isn't great so I have the following code below, how can I generate a histogram and samples from this?
xseq<-seq(0, 100, 1)
n<-100
Z<- pnorm(xseq,0,1)
U<- pbern(xseq, 0.4, lower.tail = TRUE, log.p = FALSE)
Beta <- (-1)^U*(4*log(n)/(sqrt(n)) + abs(Z))
Some demonstrations of tools that will be of use:
rnorm(1) # generates one standard normal variable
rnorm(10) # generates 10 standard normal variables
rnorm(1, 5, 6) # generates 1 normal variable with mu = 5, sigma = 6
# not needed for this problem, but perhaps worth saying anyway
rbinom(5, 1, 0.4) # generates 5 Bernoulli variables that are 1 w/ prob. 0.4
So, to generate one instance of a beta:
n <- 100 # using the value you gave; I have no idea what n means here
u <- rbinom(1, 1, 0.4) # make one Bernoulli variable
z <- rnorm(1) # make one standard normal variable
beta <- (-1)^u * (4 * log(n) / sqrt(n) + abs(z))
But now, you'd like to do this many times for a Monte Carlo simulation. One way you might do this is by building a function, having beta be its output, and using the replicate() function, like this:
n <- 100 # putting this here because I assume it doesn't change
genbeta <- function(){ # output of this function will be one copy of beta
u <- rbinom(1, 1, 0.4)
z <- rnorm(1)
return((-1)^u * (4 * log(n) / sqrt(n) + abs(z)))
}
# note that we don't need to store beta anywhere directly;
# rather, it is just the return()ed value of the function we defined
betadraws <- replicate(5000, genbeta())
hist(betadraws)
This will have the effect of making 5000 copies of your beta variable and putting them in a histogram.
There are other ways to do this -- for instance, one might just make a big matrix of the random variables and work directly with it -- but I thought this would be the clearest approach for starting out.
EDIT: I realized that I ignored the second equation entirely, which you probably didn't want.
We've now made a vector of beta values, and you can control the length of the vector in the first parameter of the replicate() function above. I'll leave it as 5000 in my continued example below.
To get random samples of the Y vector, you could use something like:
x <- replicate(5000, rnorm(17))
# makes a 17 x 5000 matrix of independent standard normal variables
epsilon <- rnorm(17)
# vector of 17 standard normals
y <- x %*% betadraws + epsilon
# y is now a 17 x 1 matrix (morally equivalent to a vector of length 17)
and if you wanted to get many of these, you could wrap that inside another function and replicate() it.
Alternatively, if you didn't want the Y vector, but just a single Y_i component:
x <- rnorm(5000)
# x is a vector of 5000 iid standard normal variables
epsilon <- rnorm(1)
# epsilon_i is a single standard normal variable
y <- t(x) %*% betadraws + epsilon
# t() is the transpose function; y is now a 1 x 1 matrix

R error QP:not a positive definite matrix?

Im trying to solve the following problem in R, using the quadprog package:
min: vec %*% p + t(p) %*% mat %*% p
st: p >= 0
where
mat <- matrix(c(1162296,0,0,0,0,1,0,0,951.7089,0,1,0,-951.7089,0,0,1),4)
vec <- c(6341934541.1,175800.1,-356401.7,14398073047.1)
I've used
libary(quadprog)
solve.QP(2*mat,-vec, diag(4), integer(4))
but I keep getting the following error:
Error in solve.QP(2*mat, -vec, diag(4), integer(4)) :
matrix D in quadratic function is not positive definite!
However, cleary
> eigen(mat)$values > 0
[1] TRUE TRUE TRUE TRUE
What am I doing wrong? How come this error keeps showing up?
Your matrix mat is not symmetric. The quadprog package is designed to solve quadratic programs, which by definition, require a symmetric matrix in the highest order term. See here, for example.
To solve this problem as written, you will need to use a general constrained optimization algorithm. For example, you can try constrOptim as so:
# system matrices
mat <- matrix(c(1162296,0,0,0,0,1,0,0,951.7089,0,1,0,-951.7089,0,0,1),4)
vec <- c(6341934541.1,175800.1,-356401.7,14398073047.1)
# an initial value
p0 <- c(1,1,1,1)
# the objective function
objective <- function(p) {
vec %*% p + t(p) %*% mat %*% p
}
# solve -- warning! without additional work you won't know if this is a global minimum solution.
solution <- constrOptim(p0, objective, NULL, diag(4), c(0,0,0,0))

R optimize not giving the finite minimum but Inf when the search interval is wider

I have a problem with optimize().
When I limit the search in a small interval around zero, e.g., (-1, 1), the optimize algorithm gives a finite minimum with a finite objective function value.
But when I make the interval wider to (-10, 10), then the minimum is on the boundary of the interval and the objective is Inf, which is really puzzling for me.
How can this happen and how to fix this? Thanks a lot in advance.
The following is my code.
set.seed(123)
n <- 120
c <- rnorm(n,mean=1,sd=.3);
eps <- rnorm(n,mean=0,sd=5)
tet <- 32
r <- eps * c^tet
x <- matrix(c(c,r), ncol=2)
g <- function(tet, x){
matrix((x[,1]^(-tet))*x[,2],ncol=1)
}
theta <- 37
g_t <- g(theta,x)
f.tau <- function(tau){
exp.tau.g <- exp(g_t %*% tau)
g.exp <- NULL; i <- 1:n
g.exp <- matrix(exp.tau.g[i,] * g_t[i,], ncol=1)
sum.g.exp <- apply(g.exp,2,sum)
v <- t(sum.g.exp) %*% sum.g.exp
return(v)
}
band.tau <- 1;
f <- optimize(f.tau, c(-band.tau, band.tau), tol=1e-20)
print("interval=(-1, 1)"); print(f);
band.tau <- 10;
f <- optimize(f.tau, c(-band.tau, band.tau), tol=1e-20)
print("interval=(-10, 10)"); print(f);
The problem is that your function f.tau(x) is not well behaved. You can see that here:
vect.f <- Vectorize(f.tau)
z1 <- seq(-1,1,by=0.01)
z10 <- seq(-10,10,by=0.01)
par(mfrow=c(2,1), mar=c(2,2,1,1))
plot(z1, log(vect.f(z1)), type="l")
plot(z10,log(vect.f(z10)),type="l")
Note that these are plots of log(f.tau). So there are two problems: f.tau(...) has an extremely large slope on either side of the minimum, and f.tau = Inf for x<-0.6 and x>1.0, where Inf means that f.tau(...) is greater than the largest number that can be represented on this system. When you set the range to (-1,1) your starting point is close enough to the minimum that optimize(...) manages to converge. When you set the limits to (-10,10) the starting point is too far away. There are examples in the documentation which show a similar problem with functions that are not nearly as ill-behaved as f.tau.
EDIT (Response to OP's comment)
The main problem is that you are trying to optimize a function which has computational infinities in the interval of interest. Here's a way around that.
band.tau <- 10
z <- seq(-band.tau,band.tau,length=1000)
vect.f <- Vectorize(f.tau)
interval <- range(z[is.finite(vect.f(z))])
f <- optimize(f.tau, interval, tol=1e-20)
f
# $minimum
# [1] 0.001615433
#
# $objective
# [,1]
# [1,] 7.157212e-12
This evaluates f.tau(x) at 1000 equally spaced points on (-band.tau,+band.tau), identifies all the values of x where f.tau is finite, and uses the range as the increment in optimize(...). This works in your case because f.tau(x) does not (appear to...) have asymptotes.

Resources