Numerical Hessian using a gradient function in R

I need to calculate the Hessian of my function numerically, using my own gradient function (programmed analytically from a formula, not numerically). Packages like numDeriv or rootSolve compute the Hessian from a numerical gradient, which does not satisfy my needs. What I need is done internally by optim (there is no separate method I can call), but the only method that handles my optimization task well is implemented in the nlopt package, and passing its optimal value to optim just to get the Hessian is too expensive for my program.
So I need a function that computes the Hessian from an analytical gradient (see for example these formulas: https://neos-guide.org/content/difference-approximations). I can't write such a function myself because I don't understand how to choose the increment parameter h, to which my function is very sensitive. Is there such a function in R, or can I somehow extract it from the optim package? Or can someone at least explain how to choose an error-minimizing value for h, so that I can post such a function myself?

If I understand you correctly, you should use numDeriv::jacobian(), which takes a vector-valued function and computes a matrix (the derivative of each output element with respect to each input), and apply it to your analytical gradient function. jacobian() does use a numerical approximation (Richardson extrapolation, to be precise), but I don't see any other way to get from a black-box gradient function to a Hessian.
You do need to specify a numerical step size, or use the default value (1e-4). On the other hand, the internal code that optim() uses to compute the Hessian also uses finite differences: see here and here ...
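For reference, the step size that jacobian() uses can be tuned through method.args; a minimal sketch, assuming g is your analytical gradient function and x0 the point of interest (the value d = 1e-5 is only illustrative, not a recommendation):

library(numDeriv)
H <- jacobian(func = g, x = x0, method = "Richardson",
              method.args = list(d = 1e-5))  # d is the (relative) step size
H <- (H + t(H))/2  # symmetrize: finite differencing breaks exact symmetry slightly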
Below I define a function, its gradient, and its Hessian; the code shows that jacobian() applied to the gradient function agrees with the Hessian (for one particular test case).
library(numDeriv)
Test functions: the function itself, its analytical gradient, and its analytical Hessian:
f <- function(x) {
  sin(x[1])*exp(x[2])*x[3]^2
}
g <- function(x) {
  ## analytical gradient of f
  c(cos(x[1])*exp(x[2])*x[3]^2,
    sin(x[1])*exp(x[2])*x[3]^2,
    sin(x[1])*exp(x[2])*2*x[3])
}
h <- function(x) {
  ## analytical Hessian of f, lower triangle filled column-wise
  m <- matrix(NA, 3, 3)
  m[lower.tri(m, diag = TRUE)] <-
    c(# col 1
      -sin(x[1])*exp(x[2])*x[3]^2,
      cos(x[1])*exp(x[2])*x[3]^2,
      cos(x[1])*exp(x[2])*2*x[3],
      # col 2
      sin(x[1])*exp(x[2])*x[3]^2,
      sin(x[1])*exp(x[2])*2*x[3],
      # col 3
      sin(x[1])*exp(x[2])*2)
  m <- Matrix::forceSymmetric(m, "L")
  m <- unname(as.matrix(m))
  return(m)
}
x1 <- c(1,1,1)
Test that I didn't screw up the gradient and Hessian functions:
all.equal(grad(f, x1), g(x1))      ## TRUE
all.equal(h(x1), hessian(f, x1))   ## TRUE
Numerical equivalence of my Hessian and the Jacobian of the gradient:
all.equal(h(x1), jacobian(g, x1))  ## TRUE
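To address the step-size question directly, here is a minimal sketch of a central-difference Hessian built from an analytical gradient, using the common rule of thumb h ~ .Machine$double.eps^(1/3) * max(|x|, 1) for differencing a first derivative (the helper name hessian_from_grad is mine, not from any package):

hessian_from_grad <- function(gr, x,
                              h = .Machine$double.eps^(1/3) * pmax(abs(x), 1)) {
  ## column j holds (gr(x + h_j e_j) - gr(x - h_j e_j)) / (2 h_j)
  n <- length(x)
  H <- matrix(NA_real_, n, n)
  for (j in seq_len(n)) {
    xp <- x; xp[j] <- xp[j] + h[j]
    xm <- x; xm[j] <- xm[j] - h[j]
    H[, j] <- (gr(xp) - gr(xm)) / (2 * h[j])
  }
  (H + t(H)) / 2  # symmetrize
}
all.equal(hessian_from_grad(g, x1), h(x1))  ## should agree to numerical tolerance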

Related

optim in R when the function to minimize depends implicitly on the parameters to adjust

I am trying to use the optim function in R to match theoretical data with experimental data.
Basically, I have a function f which computes an output (say a matrix), depending on some parameters defined outside that function. For example, simplifying for clarity:
a = 0.2
b = -5
c = 9
f = function(dx, Xmax){
  Nx = round(Xmax/dx)      # number of columns
  out = matrix(0, 2, Nx)   # matrix initialization
  for (i in 1:Nx){
    out[1,i] = (a^2*dx*i)/b         # random example
    out[2,i] = (c*(dx*i)^2 - b)/a   # random example
  }
  return(out)
}
result = f(0.1, 10)
(The actual function f calls lots of functions outside, which use the parameters say a,b,c).
I have some experimental data for some values of x, say (random again)
expr=data.frame(x=c(0,2,5,7),y=c(1,8,14,23))
I would like to use optim to adjust parameters a,b,c such that the function
WLS = function(theo, expr){
  # theo is extracted from the output of f
  out = sum(((theo - expr$y)/theo)^2)   # weighted least squares
}
is minimized, where theo is given by the function f evaluated at the x values that appear in expr (the experimental data exist only for some values of x).
The issue here is that, in all the examples of optim I have seen, the function to minimize (here WLS) must take the parameters to vary as arguments. Is my only option to add a, b, c to the arguments of f (so we would have f = function(dx, Xmax, a, b, c)), and then write WLS as something like
WLS = function(expr, param){
  theo = f(dx, Xmax, param[1], param[2], param[3])
  # then extract the values of theo for only the x I am interested in (not written here)
  # and plug them into theo again
  out = sum(((theo - expr$y)/theo)^2)   # weighted least squares
}
Or are there better ways to deal with that problem?
Thanks in advance
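A minimal sketch of the usual pattern, packing a, b, c into one parameter vector so the objective takes exactly the signature optim() expects (the helper name objective, the single-row stand-in for f's output, and the interpolation onto expr$x are all assumptions, not from the original post):

objective <- function(par, expr, dx = 0.1, Xmax = 10){
  a <- par[1]; b <- par[2]; c <- par[3]
  x    <- seq(0, Xmax, by = dx)
  theo <- (c*x^2 - b)/a                          # stand-in for the real model output
  theo_at_x <- approx(x, theo, xout = expr$x)$y  # keep only the experimental x values
  sum(((theo_at_x - expr$y)/theo_at_x)^2)        # weighted least squares
}
expr <- data.frame(x = c(0, 2, 5, 7), y = c(1, 8, 14, 23))
fit  <- optim(par = c(0.2, -5, 9), fn = objective, expr = expr)
fit$par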

R optimization function with a formula for constraints

I need to optimize a function, say g(x), where x is vector and g is an analytic function.
The problem is that I need to optimize with a constraint (analytic) function c(x) which gives a scalar as an output, i.e the constraint is c(x) > k for some k > 0.
constrOptim only lets me give a constraint on each field separately.
Any advice?
Found the right tool: the nloptr package. A very robust package where I can define functions for the objective (g) and the constraint (c). I can also define upper and lower bounds for each of the variables separately, and use different kinds of optimization methods.
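A minimal sketch of that approach with a toy objective and constraint (nloptr expects inequality constraints in the form g(x) <= 0, so c(x) > k is rewritten as k - c(x) <= 0; all function names and values below are illustrative):

library(nloptr)
g_obj  <- function(x) sum((x - 1)^2)   # toy objective
c_cons <- function(x) sum(x)           # toy constraint function
k <- 0.5
res <- nloptr(x0          = c(0, 0),
              eval_f      = g_obj,
              eval_g_ineq = function(x) k - c_cons(x),   # enforces c(x) >= k
              lb          = c(-5, -5),
              ub          = c( 5,  5),
              opts        = list(algorithm = "NLOPT_LN_COBYLA",
                                 xtol_rel = 1e-6, maxeval = 1000))
res$solution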

Numerical Hessian for n-Dimensional arguments in R

The example is based on an example in Shumway and Stoffer's "Time Series Analysis and Its Applications: With R Examples". In the original example phi, cq, and cr were scalars, so the authors could use fdHess without any issues (see the commented-out version of the code).
para = list(phi, cq, cr)
Linn = function(para){  # to evaluate the likelihood at the estimates
  # kf = Kfilter0(num, y, 1, mu0, Sigma0, para[1], para[2], para[3])
  kf = Kfilter0(num, y, A = h, mu0, Sigma0, para[[1]], para[[2]], para[[3]])
  return(kf$like)
}
emhess = fdHess(para, function(para) Linn(para))
SE = sqrt(diag(solve(emhess$Hessian)))
I would like to generalize the code so that it can be applied to multivariate time series models. So in the code shown phi, cq, and cr are n*n arrays.
Is there a package that can calculate the Hessian for a scalar valued function with matrix arguments?
The closest match I can find is this (I also looked at nlme and numDeriv):
calculating the Gradient and the Hessian in R
In this case all the arguments are passed as a single vector, so the function being called has to be modified to take that vector and reconstruct the required matrices.
Is there a method that would allow me to calculate the Hessian of a scalar-valued function with matrix arguments without changing the function being called? It seems this would be such a common problem that there would be an off-the-shelf answer, but I haven't been able to find one.
Baz
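One hedged sketch of a workaround: leave Linn exactly as written (taking a list of matrices) and give numDeriv::hessian() a thin wrapper that works on a flat numeric vector, with relist()/unlist() doing the reshaping. Here para, phi, cq, cr, and Linn are assumed to come from the setup above; Linn_vec is my name for the wrapper:

library(numDeriv)
para <- list(phi = phi, cq = cq, cr = cr)                 # n x n matrices
Linn_vec <- function(v) Linn(relist(v, skeleton = para))  # vector -> list of matrices
emhess <- hessian(Linn_vec, unlist(para))
SE <- sqrt(diag(solve(emhess)))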

R: Pass a function of many parameters as one of one parameter to another function by setting its other parameters

General problem:
I have a function that takes another function as an argument, of form:
F <- function(x, FUN){FUN(x)}
I could easily pass a simple function to it:
f1 <- function(x){plot(x, 1/x)}
F(-5:5, f1)
would display a plot of 1/x.
Supposing I had another, more complex function:
f2 <- function(x, a){plot(x, 1/x^a)}
f2 has 2 arguments, so can't be passed directly to F. But I might want to retain the flexibility in a so that, without defining lots of different functions, I can quickly plot 1/x^a for whatever value of a I fancy. I've tried, for a = 2:
F(-5:5, f2(, 2))
F(-5:5, f2(, a=2))
F(-5:5, f2(x, 2))
F(-5:5, f2(a=2))
But none of these work. Does anyone have a solution? (I could set a default for a in f2, but then I could not run it with a different value of a).
Specific context:
I have a function that will find the inverse Laplace transform of a function; it takes as its argument a function that is expected to have one argument (the Laplace variable, p). I can invert a function like f1 above. But I am trying to invert a function for contaminant transport in groundwater. This process depends on lots of other parameters, such as the water velocity and the distance being travelled. So I would like to be able to pass a multi-parameter function for Laplace inversion in such a way that all parameters apart from the Laplace parameter p are fixed. Ultimately I would like to do this process many times with different values for velocity etc., so I need a fluid way to change the "fixed" parameters being used.
Thanks in advance for any help,
Christopher
Just define a function generator (a closure):
genFunc = function(a)
{
  function(x) plot(x, 1/x^a)
}
F(-5:5, genFunc(2))
Or use Curry from the functional package to fix the parameters you want and spice your meals:
library(functional)
F(-5:5, Curry(f2, a=2))
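A third option that assumes nothing beyond base R: pass an anonymous function that fixes a in place:

F(-5:5, function(x) f2(x, a = 2))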

Likelihood maximization in R

In R, I wrote a log-likelihood function containing two recursive calculations. The log-likelihood function works properly (it gives an answer for known values of the parameters), but when I try to maximize it using optim(), it takes too much time. How can I optimize the code? Thanks in advance for ideas.
This is the log-likelihood function for a Markov regime-switching model with a dependence structure based on copula functions.
The formulas for g and p (used in the for loop) and for f were given as equations in the original post; they correspond to the quantities computed in the code below.
Some data:
u <- cbind(rt(100,10),rt(100,13))
f function:
f = function(u, p, e1, e2){
  s = diag(2); s[1,2] = p
  ff = dcopula.gauss(cbind(pt(u[,1], e1), pt(u[,2], e2)), Sigma = s) * dt(u[,1], e1) * dt(u[,2], e2)
  return(ff)
}
log-likelihood function:
loglik = function(x){
  p11 <- x[1];  p12 <- x[2];  p21 <- x[3];  p22 <- x[4];  p31 <- x[5];  p32 <- x[6]
  r  <- x[7];  a1 <- x[8];  a2 <- x[9]
  s  <- x[10]; b1 <- x[11]; b2 <- x[12]
  t  <- x[13]; c1 <- x[14]; c2 <- x[15]
  p1 <- numeric(nrow(u)); p2 <- numeric(nrow(u)); p3 <- numeric(nrow(u))
  g  <- numeric(nrow(u))
  p1_0 <- .3
  p2_0 <- .3
  g[1]  <- (p1_0*f(u,r,a1,a2)[1]) + (p2_0*f(u,s,b1,b2)[1]) + ((1-(p1_0+p2_0))*f(u,t,c1,c2)[1])
  p1[1] <- ((p1_0*p11*f(u,r,a1,a2)[1]) + (p2_0*p21*f(u,r,a1,a2)[1]) + ((1-(p1_0+p2_0))*p31*f(u,r,a1,a2)[1]))/g[1]
  p2[1] <- ((p1_0*p12*f(u,s,b1,b2)[1]) + (p2_0*p22*f(u,s,b1,b2)[1]) + ((1-(p1_0+p2_0))*p32*f(u,s,b1,b2)[1]))/g[1]
  p3[1] <- ((p1_0*(1-(p11+p12))*f(u,t,c1,c2)[1]) + (p2_0*(1-(p21+p22))*f(u,t,c1,c2)[1]) + ((1-(p1_0+p2_0))*(1-(p31+p32))*f(u,t,c1,c2)[1]))/g[1]
  for(i in 2:nrow(u)){
    g[i] <- (p1[i-1]*p11*f(u,r,a1,a2)[i]) + (p1[i-1]*p12*f(u,s,b1,b2)[i]) + (p1[i-1]*(1-(p11+p12))*f(u,t,c1,c2)[i]) +
            (p2[i-1]*p21*f(u,r,a1,a2)[i]) + (p2[i-1]*p22*f(u,s,b1,b2)[i]) + (p2[i-1]*(1-(p21+p22))*f(u,t,c1,c2)[i]) +
            (p3[i-1]*p31*f(u,r,a1,a2)[i]) + (p3[i-1]*p32*f(u,s,b1,b2)[i]) + (p3[i-1]*(1-(p31+p32))*f(u,t,c1,c2)[i])
    p1[i] <- ((p1[i-1]*p11*f(u,r,a1,a2)[i]) + (p1[i-1]*p12*f(u,s,b1,b2)[i]) + (p1[i-1]*(1-(p11+p12))*f(u,t,c1,c2)[i]))/g[i]
    p2[i] <- ((p2[i-1]*p21*f(u,r,a1,a2)[i]) + (p2[i-1]*p22*f(u,s,b1,b2)[i]) + (p2[i-1]*(1-(p21+p22))*f(u,t,c1,c2)[i]))/g[i]
    p3[i] <- ((p3[i-1]*p31*f(u,r,a1,a2)[i]) + (p3[i-1]*p32*f(u,s,b1,b2)[i]) + (p3[i-1]*(1-(p31+p32))*f(u,t,c1,c2)[i]))/g[i]
  }
  return(-sum(log(g)))
}
Optimization:
library(QRM)
library(copula)
start = c(0, 1, 0, 0, 0, 0, 1, 9, 7, -1, 10, 13, 1, 6, 4)  # optim() expects a numeric vector of starting values
##
optim(start, loglik,
      lower = c(rep(0,6), -1, 1, 1, -1, 1, 1, -1, 1, 1),
      upper = c(rep(1,6), 1, Inf, Inf, 1, Inf, Inf, 1, Inf, Inf),
      method = "L-BFGS-B") -> fit
This looks like a question for Stack-Overflow.
Something that springs to my mind is:
Precompute vectors containing the values of f(.,.,.,.) so that you avoid k*nrow(u) evaluations of the same function, and then simply index the entries of interest (see the sketch below).
It seems like the loop could be replaced by matrix and/or vector products. However, without further information it is unclear what the code is doing and it would take eons to extract this information from the code.
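A minimal sketch of that first suggestion: an equivalent likelihood in which each f(...) vector is computed once per call rather than repeatedly inside the loop (the names loglik2, fr, fs, and ft are mine; the recursions are otherwise just a regrouping of the original terms, so it should return the same value as loglik):

loglik2 = function(x){
  p11 <- x[1];  p12 <- x[2];  p21 <- x[3];  p22 <- x[4];  p31 <- x[5];  p32 <- x[6]
  r  <- x[7];  a1 <- x[8];  a2 <- x[9]
  s  <- x[10]; b1 <- x[11]; b2 <- x[12]
  t  <- x[13]; c1 <- x[14]; c2 <- x[15]
  n  <- nrow(u)
  fr <- f(u, r, a1, a2)   # length-n vectors, computed once per likelihood evaluation
  fs <- f(u, s, b1, b2)
  ft <- f(u, t, c1, c2)
  g  <- numeric(n); p1 <- numeric(n); p2 <- numeric(n); p3 <- numeric(n)
  p1_0 <- .3; p2_0 <- .3; p3_0 <- 1 - (p1_0 + p2_0)
  g[1]  <- p1_0*fr[1] + p2_0*fs[1] + p3_0*ft[1]
  p1[1] <- fr[1]*(p1_0*p11 + p2_0*p21 + p3_0*p31)/g[1]
  p2[1] <- fs[1]*(p1_0*p12 + p2_0*p22 + p3_0*p32)/g[1]
  p3[1] <- ft[1]*(p1_0*(1-(p11+p12)) + p2_0*(1-(p21+p22)) + p3_0*(1-(p31+p32)))/g[1]
  for(i in 2:n){
    g[i]  <- p1[i-1]*(p11*fr[i] + p12*fs[i] + (1-(p11+p12))*ft[i]) +
             p2[i-1]*(p21*fr[i] + p22*fs[i] + (1-(p21+p22))*ft[i]) +
             p3[i-1]*(p31*fr[i] + p32*fs[i] + (1-(p31+p32))*ft[i])
    p1[i] <- p1[i-1]*(p11*fr[i] + p12*fs[i] + (1-(p11+p12))*ft[i])/g[i]
    p2[i] <- p2[i-1]*(p21*fr[i] + p22*fs[i] + (1-(p21+p22))*ft[i])/g[i]
    p3[i] <- p3[i-1]*(p31*fr[i] + p32*fs[i] + (1-(p31+p32))*ft[i])/g[i]
  }
  -sum(log(g))
}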
