Two different results with two methods of numerical integration?

I calculated the integral of the product of a Gaussian density and another function in two ways.
First, I did it with the function int2() from the rmutil package.
Then, I did it with Gauss-Hermite quadrature points.
The two results I obtained are different.
Should I consider the Gauss-Hermite result to be the correct one and the int2() result only an approximation?
I provide an example below:
1. rmutil::int2()
library(rmutil)

Sig <- matrix(c(0.2^2, 0, 0, 0.8^2), ncol = 2)
Mu  <- c(2, 0)

to.integrate <- function(B0, B1) {
  first.int <- 1/0.8 * (1.2 * exp(B0 + B1 * 0.5))^(-1/0.8) * gamma(1/0.8)
  B <- matrix(c(B0, B1), ncol = 1)
  multi.norm <- 1 / (2 * pi * det(Sig)^(1/2)) *
    exp(-0.5 * t(B - Mu) %*% solve(Sig) %*% (B - Mu))
  return(first.int %*% multi.norm)
}

result_int2 <- int2(to.integrate, a = c(-Inf, -Inf), b = c(Inf, Inf),
                    eps = 1.0e-6, max = 16, d = 5)
2. Compute multivariate Gaussian quadrature points:
library(statmod)

mgauss.hermite <- function(n, mu, sigma) {
  dm  <- length(mu)
  gh  <- gauss.quad(n, 'hermite')
  gh  <- cbind(gh$nodes, gh$weights)
  idx <- as.matrix(expand.grid(rep(list(1:n), dm)))
  pts <- matrix(gh[idx, 1], nrow(idx), dm)
  wts <- apply(matrix(gh[idx, 2], nrow(idx), dm), 1, prod)
  eig <- eigen(sigma)
  rot <- eig$vectors %*% diag(sqrt(eig$values))
  pts <- t(rot %*% t(pts) + mu)
  return(list(points = pts, weights = wts))
}

nod_wei <- mgauss.hermite(10, mu = Mu, sigma = Sig)

gfun <- function(B0, B1) {
  first.int <- 1/0.8 * (1.2 * exp(B0 + B1 * 0.5))^(-1/0.8) * gamma(1/0.8)
  return(first.int)
}

result_GH <- sum(gfun(nod_wei$points[, 1], nod_wei$points[, 2]) * nod_wei$weights) / pi
result_int2
result_GH

The mistake came from the way the points were calculated in the mgauss.hermite function.
I replaced the eigendecomposition of the Sigma matrix with a Cholesky decomposition multiplied by the square root of 2.
The results of the two methods then became very similar.
Below is the corrected mgauss.hermite function:
mgauss.hermite <- function(n, mu, sigma) {
  dm  <- length(mu)
  gh  <- gauss.quad(n, 'hermite')
  gh  <- cbind(gh$nodes, gh$weights)
  idx <- as.matrix(expand.grid(rep(list(1:n), dm)))
  pts <- matrix(gh[idx, 1], nrow(idx), dm)
  wts <- apply(matrix(gh[idx, 2], nrow(idx), dm), 1, prod)
  rot <- sqrt(2) * t(chol(sigma))
  pts <- t(rot %*% t(pts) + mu)
  return(list(points = pts, weights = wts))
}
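For context, the sqrt(2) factor comes from the change of variables that matches the Gauss-Hermite weight function exp(-z'z) to the Gaussian density: with B = Mu + sqrt(2) L z, where L L' = Sig, the expectation becomes pi^(-dm/2) times the weighted sum of g at the transformed nodes, which is why result_GH is divided by pi for dm = 2. A quick sketch of re-running the comparison with the corrected nodes (it reuses Mu, Sig, gfun and result_int2 defined above):

# recompute the quadrature points with the corrected function and compare
nod_wei   <- mgauss.hermite(10, mu = Mu, sigma = Sig)
result_GH <- sum(gfun(nod_wei$points[, 1], nod_wei$points[, 2]) * nod_wei$weights) / pi
c(int2 = result_int2, gauss_hermite = result_GH)  # the two values should now agree closely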

Related

How to fix the `Error in hist.default(res) : 'x' must be numeric`?

Following this question: How to get the value of `t` so that my function `h(t)=epsilon` for a fixed `epsilon`?
I first compute the eigenvectors v of a 500 x 500 random matrix G and then generate 1000 different random initial vectors of dimension 500. I normalize them and store them in xmats.
# make this example reproducible
set.seed(100001)
n <- 500

# sample a GOE random matrix
A <- matrix(rnorm(n * n, mean = 0, sd = 1), n, n)
G <- (A + t(A)) / sqrt(2 * n)
ev <- eigen(G)
l <- ev$values
v <- ev$vectors

# parameters of the multivariate normal distribution
mean <- rep(0, n)
var <- diag(n)

# simulate 1000 draws from the n-dimensional standard normal distribution
initial <- MASS::mvrnorm(n = 1000, mu = mean, Sigma = var)

# normalize each draw so the initial data are uniformly distributed on the sphere
xmats <- lapply(1:1000, function(i) initial[i, ] / norm(initial[i, ], type = "2"))
Then I define the functions used to compute res:
h1t <- function(t, x_0) {
  h10 <- c(x_0 %*% v[, n])
  denom <- vapply(t, function(.t) {
    sum((x_0 %*% v)^2 * exp(-4 * (l - l[n]) * .t))
  }, numeric(1L))
  abs(h10) / sqrt(denom)
}

find_t <- function(x, epsilon = 0.01, range = c(-50, 50)) {
  uniroot(function(t) h1t(t, x) - epsilon, range,
          tol = .Machine$double.eps)$root
}
I want to get res:
res <- lapply(xmats, find_t)
However, it throws this error: Error in uniroot(function(t) h1t(t, x) - epsilon, range, tol = .Machine$double.eps) : f() values at end points not of opposite sign
res is a list, which is what causes the hist() error in the title; running hist(unlist(res)) instead worked well.
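For the uniroot error itself, one possible workaround (a sketch only, assuming the objects defined above are in the workspace) is to skip the starting vectors for which h1t(t) - epsilon never changes sign on the search interval, then plot the remaining roots:

res <- lapply(xmats, function(x) {
  tryCatch(find_t(x), error = function(e) NA_real_)  # NA when there is no sign change in the range
})
res_num <- unlist(res)            # hist() needs a numeric vector, not a list
hist(res_num[!is.na(res_num)])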

In CVXR, how to use an external C++ function?

I am using CVXR to code a penalized linear regression. My global loss is composed of 4 elements: two different SSE losses, loss_u and loss_b, on two different data sets, a ridge penalty, and a specific distance D. The code works if I use distance == "MM". However, there is an error for distance == "MMD", where I call an external Rcpp function from kernlab, kernlab::kmmd. The problem is that Xb %*% beta is a MulExpression. I don't know whether I should convert it into a numeric (and if so, how?) or whether it is simply impossible to use an Rcpp function here.
deb_reg <- function(Xu, Yu, Xb, Yb, beta, lambda = 0, theta = 0.5, alpha = 0, distance = "MM") {
  n <- nrow(Xu)
  m <- nrow(Xb)
  ridge  <- lambda * sum(beta^2)
  loss_u <- sum((Yu - Xu %*% beta)^2) * (theta / n)
  loss_b <- sum((Yb - Xb %*% beta)^2) * ((1 - theta) / m)
  if (distance == "MM") {
    D <- alpha * (mean(Yu) - mean(Xb %*% beta))^2
  } else if (distance == "MMD") {
    y <- as.numeric(Yu)
    # print(beta)
    x <- Xb %*% beta
    # D <- alpha * EasyMMD::MMD(y, x)
    MMD <- kernlab::kmmd(as.matrix(y), as.matrix(x))
    D <- alpha * sum(MMD@mmdstats)
  } else {
    D <- 0
  }
  obj <- loss_u + loss_b + ridge + D
  return(obj)
}
p    <- ncol(X_unbiased)
beta <- Variable(p)
obj  <- deb_reg(Xu = X_unbiased, Yu = Y_unbiased, Xb = X_biased, Yb = Y_biased, beta,
                lambda = 0.1, theta = 0.5, alpha = 10, distance = "MMD")
prob   <- Problem(Minimize(obj))
result <- solve(prob)
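To illustrate the underlying issue (a minimal sketch with made-up data, not the data sets above): CVXR builds the objective symbolically, so Xb %*% beta has no numeric value while the problem is being constructed, and an external function such as kernlab::kmmd(), which expects numeric matrices, cannot be evaluated inside the objective. Numeric values only exist after the problem has been solved:

library(CVXR)
set.seed(1)
Xb   <- matrix(rnorm(20), 10, 2)   # made-up stand-in for X_biased
beta <- Variable(2)
class(Xb %*% beta)                 # "MulExpression": a symbolic node, not numbers
fit <- solve(Problem(Minimize(sum_squares(Xb %*% beta))))
fit$getValue(Xb %*% beta)          # numeric values are only available after solve()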

Calculating the OLS coefficient beta without the function lm

How can I calculate the OLS coefficient in R without the function lm?
Formula: β = (X'X)^(-1) X'y
X <- cbind(runif(1000000), rnorm(1000000), rchisq(1000000, 50))
y <- 100 * X[,1] + 200 * X[,2] + rnorm(nrow(X), 0, 10)
I would be really grateful for help, since I have no idea how to do this.
It's basic linear algebra for OLS. You might want to have a look at https://en.wikipedia.org/wiki/Linear_regression
set.seed(123)
X <- cbind(runif(1000000), rnorm(1000000), rchisq(1000000, 50))
y <- 100 * X[,1] + 200 * X[,2] + rnorm(nrow(X), 0, 10)

# (X'X)^-1 X'y
# basic matrix algebra
solve(t(X) %*% X) %*% (t(X) %*% y)

# the same using crossprod()
crossprod(solve(crossprod(X)), crossprod(X, y))

# same in lm()
lm(y ~ 0 + X)
If your linear model has an intercept, add a column of ones:
x <- cbind(1, X)
# (X'X)^-1 X'y
solve(t(x) %*% x) %*% (t(x) %*% y)
crossprod(solve(crossprod(x)), crossprod(x, y))
lm(y ~ X)
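A numerically more stable route (a small sketch, for comparison) avoids forming (X'X)^(-1) explicitly and solves the least-squares problem via the QR decomposition, which is what lm() uses internally:

# QR-based least squares; should match the matrix-algebra results above
qr.solve(X, y)            # no intercept
qr.solve(cbind(1, X), y)  # with an intercept column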
Here is my version, including gradient descent. Kudos also to this post.
x0 <- c(1, 1, 1, 1, 1) # intercept column
x1 <- c(1, 2, 3, 4, 5)
x2 <- c(8, 4, 3, 1, 8)
x  <- as.matrix(cbind(x0, x1, x2))
y  <- as.matrix(c(3, 7, 5, 11, 14))

# (X'X)^-1 X'y
beta1 <- solve(t(x) %*% x) %*% t(x) %*% y

# R's regression command
beta2 <- summary(lm(y ~ x[, 2:3]))

# Gradient descent
m <- nrow(y)
grad <- function(x, y, theta) {
  gradient <- (1/m) * (t(x) %*% ((x %*% t(theta)) - y))
  return(t(gradient))
}

# Define the gradient descent update algorithm
grad.descent <- function(x, maxit) {
  theta <- matrix(c(0, 0, 0), nrow = 1) # initialize the parameters
  alpha <- 0.05                         # set the learning rate
  for (i in 1:maxit) {
    theta <- theta - alpha * grad(x, y, theta)
  }
  return(theta)
}

# Results without feature scaling
print(grad.descent(x, 2000))
beta1
beta2

Iterative optimization of alternative glm family

I'm setting up an alternative response function to the exponential function commonly used in Poisson GLMs. It is called softplus and is defined as $\frac{1}{c} \log(1+\exp(c \eta))$, where $\eta$ corresponds to the linear predictor $X\beta$.
I already managed the optimization by setting the parameter $c$ to arbitrary fixed values and searching only for $\hat{\beta}$.
But now, for the next step, I have to optimize the parameter $c$ as well, iteratively alternating between an updated $\beta$ and the current $c$.
I wrote a log-likelihood function and a score function and then set up a Newton-Raphson optimization (using a while loop), but I don't know how to separate the update of $c$ into an outer step and the update of $\beta$ into an inner step.
Are there any suggestions?
# Response function
sp <- function(eta, c = 1) {
  return(log(1 + exp(abs(c * eta))) / c)
}

# Log-likelihood
l.lpois <- function(par, y, X) {
  beta <- par[1:(length(par) - 1)]
  c <- par[length(par)]
  l <- rep(NA, times = length(y))
  for (i in 1:length(l)) {
    l[i] <- y[i] * log(sp(X[i, ] %*% beta, c)) - sp(X[i, ] %*% beta, c)
  }
  l <- sum(l)
  return(l)
}
# Score function
score <- function(y, X, par) {
  beta <- par[1:(length(par) - 1)]
  c <- par[length(par)]
  s <- matrix(rep(NA, times = length(y) * length(par)), ncol = length(y))
  for (i in 1:length(y)) {
    s[, i] <- c(X[i, ], 1) * (y[i] * plogis(c * X[i, ] %*% beta) / sp(X[i, ] %*% beta, c) - plogis(c * X[i, ] %*% beta))
  }
  score <- rep(NA, times = nrow(s))
  for (j in 1:length(score)) {
    score[j] <- sum(s[j, ])
  }
  return(score)
}
# Optimization function
opt <- function(y, X, b.start, eps = 0.0001, maxiter = 1e5) {
  beta <- b.start[1:(length(b.start) - 1)]
  c <- b.start[length(b.start)]
  b.old <- b.start
  i <- 0
  conv <- FALSE
  while (conv == FALSE) {
    eta <- X %*% b.old[1:(length(b.old) - 1)]
    s <- score(y, X, b.old)
    h <- numDeriv::hessian(l.lpois, b.old, y = y, X = X)
    invh <- solve(h)
    # update
    b.new <- b.old + invh %*% s
    i <- i + 1
    # test
    if (any(is.nan(b.new))) {
      b.new <- b.old
      warning("convergence failed")
      break
    }
    # convergence reached?
    if (sqrt(sum((b.new - b.old)^2)) / sqrt(sum(b.old^2)) < eps | i >= maxiter) {
      conv <- TRUE
    }
    b.old <- b.new
  }
  eta <- X %*% b.new[1:(length(b.new) - 1)]
  # covariance
  invh <- solve(numDeriv::hessian(l.lpois, b.new, y = y, X = X))
  fitted <- sp(eta, b.new[length(b.new)])
  result <- list("coefficients" = c(beta = b.new),
                 "fitted.values" = fitted,
                 "covariance" = invh)
}

# Running it fails ...
n <- 100
x <- runif(n, 0, 1)
Xdes <- cbind(1, x)
eta <- 1 + 2 * x
y <- rpois(n, sp(eta, c = 1))
opt(y, Xdes, c(0, 1, 1))
You have two bugs.
First, in the score() function, the expression
(y[i] * plogis(c * X[i, ] %*% beta) / sp(X[i, ] %*% beta, c) - plogis(c * X[i, ] %*% beta))
returns a matrix, so you must convert it to numeric:
as.numeric(y[i] * plogis(c * X[i, ] %*% beta) / sp(X[i, ] %*% beta, c) - plogis(c * X[i, ] %*% beta))
Second, in the definition of s, a closing ) is misplaced. You have:
s <- matrix(rep(NA, times = length(y) * length(par), ncol = length(y))
while it should be:
s <- matrix(rep(NA, times = length(y) * length(par)), ncol = length(y))
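As for separating the updates of c and beta, one possible structure (a rough sketch only, using the l.lpois() defined above; optim() and optimize() stand in for the hand-written Newton-Raphson, and the search interval for c is arbitrary) is to alternate an inner update of beta at fixed c with an outer update of c at fixed beta:

fit_alternating <- function(y, X, beta0, c0 = 1, tol = 1e-6, maxit = 100) {
  beta <- beta0
  cc <- c0
  for (it in seq_len(maxit)) {
    # inner step: update beta with c held fixed
    beta_new <- optim(beta, function(b) -l.lpois(c(b, cc), y, X), method = "BFGS")$par
    # outer step: update c with beta held fixed
    cc_new <- optimize(function(cv) -l.lpois(c(beta_new, cv), y, X),
                       interval = c(1e-3, 10))$minimum
    converged <- max(abs(c(beta_new - beta, cc_new - cc))) < tol
    beta <- beta_new
    cc <- cc_new
    if (converged) break
  }
  list(beta = beta, c = cc, iterations = it)
}
fit_alternating(y, Xdes, beta0 = c(0, 1))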

How do I find the maximum likelihood of a specific multivariate normal log likelihood in R?

I'm having trouble optimizing a multivariate normal log-likelihood in R. If anyone has a good solution for that, please let me know. Specifically, I cannot seem to keep the variance-covariance matrix positive-definite and the parameters in a reasonable range.
Let me introduce the problem more completely. I am essentially trying to simultaneously solve these two regression equations using MLE:
$$
y_1 = \beta_1 + \beta_2 x_1 + \beta_3 x_2 \\
y_2 = \beta_4 + \beta_3 x_1 + \beta_5 x_2
$$
The fact that $\beta_3$ is in both equations is not a mistake. I try to solve this using MLE by maximizing the likelihood of the multivariate normal distribution for $Y = (y_1, y_2)^\top$ where the mean is parameterized as above in the regression equations.
I've attached the log-likelihood function as I believe it should be, where I constrain the variance-covariance matrix to be positive-definite by rebuilding it from necessarily positive eigenvalues and a Cholesky factor.
mvrestricted_ll <- function(par, Y, X) {
  # Indices
  n <- nrow(X)
  nbetas <- (2 + 3 * (ncol(Y) - 1))
  # Extract parameters
  beta <- par[1:nbetas]
  eigvals <- exp(par[(nbetas + 1):(nbetas + ncol(Y))]) # constrain to be positive
  chole <- par[(nbetas + ncol(Y) + 1):(nbetas + ncol(Y) + ncol(Y) * (ncol(Y) + 1) / 2)]
  # Build Sigma from positive eigenvalues and a Cholesky factor (should be pos. def.)
  L <- diag(ncol(Y))
  L[lower.tri(L, diag = TRUE)] <- chole
  Sigma <- diag(eigvals) + tcrossprod(L)
  # Linear predictor
  # Hard-coded for the 2x2 example for now
  mu <- cbind(beta[1] + beta[2] * X[, 1] + beta[3] * X[, 2],
              beta[4] + beta[3] * X[, 1] + beta[5] * X[, 2])
  yminmu <- Y - mu
  nlogs <- n * log(det(Sigma))
  invSigma <- solve(Sigma)
  meat <- yminmu %*% tcrossprod(invSigma, yminmu)
  return(-nlogs - sum(diag(meat)))
}
# Create fake data
n <- 1000
p <- 2
set.seed(20160201)
X <- matrix(rnorm(n * p), nrow = n)
set.seed(20160201)
Y <- matrix(rnorm(n * p), nrow = n)

# Initialize parameters
initpars <- c(rep(0, (2 + 3 * (ncol(Y) - 1)) + ncol(Y) + ncol(Y) * (ncol(Y) + 1) / 2))

# optim() fails with BFGS
optim(par = initpars, fn = mvrestricted_ll, X = X, Y = Y, method = "BFGS")

# optim() does not converge with Nelder-Mead; if you increase maxit it also fails
optim(par = initpars, fn = mvrestricted_ll, X = X, Y = Y)
Any help would be greatly appreciated.
EDIT: I should note that just letting Sigma be a vector in the parameters and then returning a very large value whenever it is not positive definite does not work either.
I have no idea if the code/answer is correct, but
invSigma <- try(solve(Sigma))
if (inherits(invSigma, "try-error")) return(NA)
and running
optim(par = initpars, fn = mvrestricted_ll, X = X, Y = Y,
      control = list(maxit = 1e5))
gets me a little farther to a convergence code of 10 (degenerate Nelder-Mead simplex).
$par
[1] 1.361612e+01 4.674349e+01 -3.050170e+01 3.305013e+01 6.731194e+01
[6] -3.117192e+01 -5.408598e+00 -6.326897e-07 -1.987449e+01 -1.795924e+01
$value
[1] -1.529013e+19
$counts
function gradient
1219 NA
$convergence
[1] 10
I suspect that a real solution will involve looking more carefully at the code to see if it's really doing what you think it's doing (sorry); understanding why solve() errors occur might be a good first step. You can work on troubleshooting this by putting a cat(par, "\n") as the first line of the function and running it without the try/NA-return code. That will allow you to isolate an example data set that throws the error — then you can work your way through your code a line at a time (with debug() or by hand) to see what's happening.
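A small sketch of that debugging approach (the wrapper name is made up; it assumes the objects defined in the question are available):

# print the parameter vector on every call so the values that break solve()
# can be copied from the console and replayed by hand
mvrestricted_ll_debug <- function(par, Y, X) {
  cat(par, "\n")
  mvrestricted_ll(par, Y, X)
}
optim(par = initpars, fn = mvrestricted_ll_debug, X = X, Y = Y,
      control = list(maxit = 1e5))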
You can consider using the following approach:
library(DEoptim)

fn <- function(par, mat_X, mat_Y) {
  X <- mat_X
  Y <- mat_Y
  n <- nrow(X)
  nbetas <- (2 + 3 * (ncol(Y) - 1))
  beta <- par[1:nbetas]
  eigvals <- exp(par[(nbetas + 1):(nbetas + ncol(Y))])
  chole <- par[(nbetas + ncol(Y) + 1):(nbetas + ncol(Y) + ncol(Y) * (ncol(Y) + 1) / 2)]
  L <- diag(ncol(Y))
  L[lower.tri(L, diag = TRUE)] <- chole
  Sigma <- tryCatch(diag(eigvals) + tcrossprod(L), error = function(e) NA)
  if (is.null(dim(Sigma))) {
    return(10^30)
  } else {
    mu <- cbind(beta[1] + beta[2] * X[, 1] + beta[3] * X[, 2],
                beta[4] + beta[3] * X[, 1] + beta[5] * X[, 2])
    yminmu <- Y - mu
    nlogs <- n * log(det(Sigma))
    invSigma <- tryCatch(solve(Sigma), error = function(e) NA)
    if (is.null(dim(invSigma))) {
      return(10^30)
    } else {
      meat <- yminmu %*% tcrossprod(invSigma, yminmu)
      log_Lik <- -nlogs - sum(diag(meat))
      if (is.na(log_Lik) | is.nan(log_Lik) | is.infinite(log_Lik)) {
        return(10^30)
      } else {
        return(-log_Lik)
      }
    }
  }
}

n <- 1000
p <- 2
set.seed(20160201)
mat_X <- matrix(rnorm(n * p), nrow = n)
set.seed(2436537)
mat_Y <- matrix(rnorm(n * p), nrow = n)

lower <- rep(-10, 10)
upper <- rep(10, 10)

DEoptim(fn = fn, lower = lower, upper = upper,
        control = list(itermax = 10000, parallelType = 1), mat_X = mat_X, mat_Y = mat_Y)
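If you store the fit, the best parameter vector and objective value can be read from the returned object (a usage note; the name de_fit is arbitrary):

de_fit <- DEoptim(fn = fn, lower = lower, upper = upper,
                  control = list(itermax = 10000, parallelType = 1),
                  mat_X = mat_X, mat_Y = mat_Y)
de_fit$optim$bestmem   # best parameter vector found
de_fit$optim$bestval   # corresponding (minimized) negative log-likelihood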
