R Optimization: Pass value from function to gradient with each iteration

I have a function that I am optimizing using the optimx function in R (I'm also open to using optim, since I'm not sure it will make a difference for what I'm trying to do). I have a gradient that I am passing to optimx for (hopefully) faster convergence compared to not using a gradient. Both the function and the gradient use many of the same quantities that are computed from each new parameter set. One of these quantities in particular is very computationally costly, and it's redundant to have to compute this quantity twice for each iteration - once for the function, and again for the gradient. I'm trying to find a way to compute this quantity once, then pass it to the function and the gradient.
So here is what I am doing. So far this works, but it is inefficient:
optfunc <- function(paramvec){
  quant1 <- costlyfunction(paramvec)
  # costlyfunction is a separate function that takes a while to run
  loglikelihood <- sum(quant1)**2
  # not really squared, but the log likelihood uses quant1 in its calculation
  return(loglikelihood)
}
optgr <- function(paramvec){
  quant1 <- costlyfunction(paramvec)
  mygrad <- sum(quant1)  # again not the real formula, just for illustration
  return(mygrad)
}
optimx(par=paramvec,fn=optfunc,gr=optgr,method="BFGS")
I am trying to find a way to calculate quant1 only once with each iteration of optimx. It seems the first step would be to combine fn and gr into a single function. I thought the answer to this question may help me, and so I recoded the optimization as:
optfngr <- function(){
  quant1 <- costlyfunction(paramvec)
  optfunc <- function(paramvec){
    loglikelihood <- sum(quant1)**2
    return(loglikelihood)
  }
  optgr <- function(paramvec){
    mygrad <- sum(quant1)
    return(mygrad)
  }
  return(list(fn = optfunc, gr = optgr))
}
do.call(optimx, c(list(par=paramvec,method="BFGS",optfngr() )))
Here, I receive the error: "Error in optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower, : Cannot evaluate function at initial parameters." Of course, there are obvious problems with my code here. So, I'm thinking answering any or all of the following questions may shed some light:
I passed paramvec as the only argument to optfunc and optgr so that optimx knows that paramvec is what needs to be iterated over. However, I don't know how to pass quant1 to optfunc and optgr. Is it true that if I try to pass quant1, then optimx will not properly identify the parameter vector?
I wrapped optfunc and optgr into one function, so that the quantity quant1 will exist in the same function space as both functions. Perhaps I can avoid this if I can find a way to return quant1 from optfunc, and then pass it to optgr. Is this possible? I'm thinking it's not, since the documentation for optimx is pretty clear that the function needs to return a scalar.
I'm aware that I might be able to use the dots arguments to optimx as extra parameter arguments, but I understand that these are for fixed parameters, and not arguments that will change with each iteration. Unless there is also a way to manipulate this?
Thanks in advance!

Your approach is close to what you want, but not quite right. You want to call costlyfunction(paramvec) from within optfunc(paramvec) or optgr(paramvec), but only when paramvec has changed. Then you want to save its value in the enclosing frame, along with the value of paramvec that was used to compute it. That is, something like this:
optfngr <- function(){
  quant1 <- NULL
  prevparam <- NULL
  # recompute the costly quantity only when paramvec has actually changed
  updatecostly <- function(paramvec) {
    if (!identical(paramvec, prevparam)) {
      quant1 <<- costlyfunction(paramvec)
      prevparam <<- paramvec
    }
  }
  optfunc <- function(paramvec){
    updatecostly(paramvec)
    loglikelihood <- sum(quant1)**2
    return(loglikelihood)
  }
  optgr <- function(paramvec){
    updatecostly(paramvec)
    mygrad <- sum(quant1)
    return(mygrad)
  }
  return(list(fn = optfunc, gr = optgr))
}
do.call(optimx, c(list(par=paramvec,method="BFGS"),optfngr() ))
I used <<- to make assignments to the enclosing frame, and fixed up your do.call second argument.
Doing this is called "memoization" (or "memoisation" in some locales; see http://en.wikipedia.org/wiki/Memoization), and there's a package called memoise that does it. It keeps track of lots of (or all of?) the previous results of calls to costlyfunction, so would be especially good if paramvec only takes on a small number of values. But I think it won't be so good in your situation because you'll likely only make a small number of repeated calls to costlyfunction and then never use the same paramvec again.
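For reference, a minimal sketch of how memoise could be dropped in here, assuming (as in the code above) that costlyfunction takes only paramvec:
library(memoise)
costly_memo <- memoise(costlyfunction)  # caches results keyed on the arguments

optfunc <- function(paramvec) sum(costly_memo(paramvec))**2  # second call with the
optgr   <- function(paramvec) sum(costly_memo(paramvec))     # same paramvec hits the cache
optimx(par = paramvec, fn = optfunc, gr = optgr, method = "BFGS")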

Related

Function doesn't change value (R)

I have written a function that takes two arguments, a number in 0:16 and a vector containing four parameter values.
The output of the function changes if I change the parameters in the vector, but it does not change if I change the number in 0:16.
I should add that the function I'm having trouble with includes another function (called 'pi') which takes the same arguments.
I have checked that the 'pi' function does change value if I change the number in 0:16 (and it also changes if I change the values of the parameters).
Firstly, here is my code;
pterm_ny <- function(x, theta){
  (1 - sum(theta[1:2]))*(theta[4]^(x))*exp((-1)*theta[4])/pi(x, theta)
}
pi <- function(x, theta){
  theta[1]*1*(x==0) + theta[2]*(theta[3]^(x))*exp((-1)*(theta[3])) +
    (1 - sum(theta[1:2]))*(theta[4]^(x))*exp((-1)*(theta[4]))
}
This returns 0.75 for pterm_ny(i, c(0.2,0.2,2,2)), where i = 1,...,16, and 0.2634 for i = 0, which tells me that the indicator-function part of 'pi' does work.
With respect to raising a number to a power, I have been told that one should wrap the exponent in I(); as an example it would look like:
x^I(2)
I have tried to do that in my code, but that didn't help either.
I can't remember the reason for doing it, but I expect it's to ensure that the number in parentheses is interpreted as an integer.
My end goal is to get 17 different values of 'pterm', and to accomplish that I was thinking of using the sapply function like this:
sapply(c(0:16),pterm_ny,theta = c(0.2,0.2,2,2))
I really hope that someone can point out what I'm missing here.
In advance, thank you!
You have a theta[4]^x term both in your main expression and in your pi() function; these are cancelling out, leaving the result invariant to changes in x ...
Also:
you might want to avoid using pi as your function name, as it's also a built-in variable (3.14159...) - this can sometimes cause confusion
the advice about using the "as is" function I() to protect powers is only relevant within formulas, e.g. as used in lm() (linear regression). (It would be used as I(x^2), not x^I(2).)
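For illustration, this is the kind of place where I() actually matters (the data frame dat and the variables y and x here are hypothetical):
# inside a formula, ^ has a special meaning, so the "as is" wrapper
# is needed to get a literal squared term as a regressor:
fit <- lm(y ~ x + I(x^2), data = dat)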

Error while optimizing in R

I am creating two functions f(i) and f(j), and I want to find the values of i and j simultaneously such that the difference f(i) - f(j) is minimized.
However, on running the code below I am getting an error.
I have two functions, with parameters j and i, as below:
bu1 <- function(j){
  sum(linkinc_lev1$gdp*(1/(1 + (linkinc_lev1$use_gro*(1 + j/100)))))
}
td1 <- function(i){
  sum(linkinc_lev2$gdp*(1/(1 + (linkinc_lev2$use_gro*(1 + i/100)))))
}
Now I need to find the values of i and j simultaneously such that the difference of the above functions is minimized.
I was trying something like
f1 <- function(j,i) abs(bu1(j)-td1(i))
ans_lev1 <- optimize(f1, lower=-100, upper=100)
but I get the error: Error in td1(i) : argument "i" is missing, with no default
Is there any way in R to minimize functions based on two parameters?
Yes, there is, almost all optimizers work on parameter vectors. You should modify your function to something like
f1<- function(param) abs(bu1(param[2])-td1(param[1]))
i.e. the function takes a single argument "param", and inside the function you fetch the values of interest out of it.
A note: if you use abs() you end up with a non-differentiable objective function. You have to select an optimizer that can handle it (SANN and Nelder-Mead can, for instance). I would rather do
f1 <- function(param) (bu1(param[2])-td1(param[1]))^2
still the same solution but now differentiable, and you can feed it to most optimizers.
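A minimal sketch of what the call could then look like, assuming bu1() and td1() and their data are defined as in the question (param[1] plays the role of i and param[2] the role of j):
f1  <- function(param) (bu1(param[2]) - td1(param[1]))^2
res <- optim(par = c(0, 0), fn = f1, method = "Nelder-Mead")
res$par  # estimated c(i, j)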

R optim same function for fn and gr

I would like to use optim() to optimize a cost function (fn argument), and I will be providing a gradient (gr argument). I can write separate functions for fn and gr. However, they have a lot of code in common and I don't want the optimizer to waste time repeating those calculations. So is it possible to provide one function that computes both the cost and the gradient? If so, what would be the calling syntax to optim()?
As an example, suppose the function I want to minimize is
cost <- function(x) {
  x*exp(x)
}
Obviously, this is not the function I'm trying to minimize. That's too complicated to list here, but the example serves to illustrate the issue. Now, the gradient would be
grad <- function(x) {
  (x+1)*exp(x)
}
So as you can see, the two functions, if called separately, would repeat some of the work (in this case, the exponential function). However, since optim() takes two separate arguments (fn and gr), it appears there is no way to avoid this inefficiency, unless there is a way to define a function like
costAndGrad <- function(x) {
  ex <- exp(x)
  list(cost=x*ex, grad=(x+1)*ex)
}
and then pass that function to optim(), which would need to know how to extract the cost and gradient.
Hope that explains the problem. Like I said my function is much more complicated, but the idea is the same: there is considerable code that goes into both calculations (cost and gradient), which I don't want to repeat unnecessarily.
By the way, I am an R novice, so there might be something simple that I'm missing!
Thanks very much
The nlm function does optimization, and it expects the gradient information to be returned as an attribute of the value returned by the objective function. That is similar to what you show above. See the examples in the help for nlm.
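A minimal sketch of that mechanism, using the toy cost from the question rather than the real function:
costAndGrad <- function(x) {
  ex  <- exp(x)
  val <- x * ex                          # the cost itself
  attr(val, "gradient") <- (x + 1) * ex  # gradient attached as an attribute
  val
}
nlm(costAndGrad, p = 0)  # converges to the minimum at x = -1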

Using outer() with a multivariable function

Suppose you have a function f<- function(x,y,z) { ... }. How would you go about passing a constant to one argument, but letting the other ones vary? In other words, I would like to do something like this:
output <- outer(x,y,f(x,y,z=2))
This code doesn't evaluate, but is there a way to do this?
outer(x, y, f, z=2)
The arguments after the function are additional arguments to it, see ... in ?outer. This syntax is very common in R, the whole apply family works the same for instance.
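A small self-contained illustration (f, x and y here are toy stand-ins, not objects from the question):
f <- function(x, y, z) x + y * z  # vectorized, as outer() requires
x <- 1:3
y <- 1:2
outer(x, y, f, z = 2)  # 3 x 2 matrix with entries x[i] + 2*y[j]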
Update:
I can't tell exactly what you want to accomplish in your follow up question, but think a solution on this form is probably what you should use.
outer(sigma_int, theta_int, function(s, t)
  dmvnorm(y, rep(0, n), y_mat(n, lambda, t, s)))
This calculates a variance matrix for each combination of the values in sigma_int and theta_int, uses that matrix to define a density, and evaluates it at the point(s) defined in y. I haven't been able to test it, though, since I don't know the types and dimensions of the variables involved.
outer (along with the apply family of functions and others) will pass along extra arguments to the functions which they call. However, if you are dealing with a case where this is not supported (optim being one example), then you can use the more general approach of currying. To curry a function is to create a new function which has (some of) the variables fixed and therefore has fewer parameters.
library("functional")
output <- outer(x,y,Curry(f,z=2))

Passing a function argument through R-function nlm

I am probably being unreasonable asking for help debugging a program, but I have spent a day and a half on this pretty simple bit of code and have run out of ideas. I am trying to optimize a function called "log.pr.data" with respect to its first argument.
Because the function optimize requires you to set bounds on the argument I decided to use nlm which only requires a starting point. I have checked with simple examples that nlm is indeed able to pass functions as arguments. My problem is that I am unable to pass a function as an argument in this particular case.
So here is the objective function (with two print diagnostics). I want to maximize it with respect to the argument lambda.s. (As a matter of interest, I am not maximizing a likelihood here. I am trying to optimize an importance sampler.)
log.pr.data <- function(lambda.s, n1, n0, lambda.star, psi0, tobs, g=T.chan){
  print("Function log.pr.data")
  print(g)
  psi.s <- boundary(lambda.s, g, psi0, tobs, n1, n0)
  -my.dbinom(n0*lambda.s, n0, lambda.star, log=TRUE)
}
I have no problems with the command:
nlm(log.pr.data,p=0.6,n1=n1,n0=n0,lambda.star=lambda.star,psi0=psi0,tobs=tobs)
It works fine. But I want to be able to change the function g=T.chan. So I redefined the function leaving g unspecified in log.pr.data. In other words, I just removed the "=T.chan" in the argument list. I checked that the function works OK. For instance with the command
log.pr.data(l,n1,n0,lambda.star,psi0,tobs,T.chan)
for a range of values of "l" and it works fine and gives the same values as the previous function where g=T.chan is specified in the argument list. So the function T.chan is being passed properly it appears.
I then try to optimize
nlm(log.pr.data,p=0.6,n1=n1,n0=n0,lambda.star=lambda.star,psi0=psi0,tobs=tobs,g=T.chan)
and I get the error
Error in nlm(function(x) f(x, ...), p, hessian, typsize, fscale, msg,  :
  invalid NA value in parameter
It is also interesting that there does not seem to be a single call to log.pr.data because "Function log.pr.data" is not printed. In earlier attempts to troubleshoot this problem, I realized that I was using the symbol "f" for the function being passed and that this might cause problems because nlm called its objective function "f". So I changed it to "g" throughout.
First, I agree with Eduardo Leoni's comment that we need a reproducible example, so that we have “real” code to work with.
My blind guess would be that, because in R you can abbreviate parameters, g is not correctly resolved between “your” g and an abbreviated gradtol from the nlm function.
On the other hand, if I try your code fragments, nlm proceeds with calling log.pr.data and fails only at the second print statement because T.chan isn't known.
So sadly, without a working (i.e. failing reproducibly) example, it's difficult to find out what's wrong.
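If the partial-matching guess is the culprit, one way to sidestep it entirely is to fix g inside an anonymous wrapper, so that nlm never sees g as a named argument at all (a sketch reusing the objects from the question):
nlm(function(p) log.pr.data(p, n1 = n1, n0 = n0, lambda.star = lambda.star,
                            psi0 = psi0, tobs = tobs, g = T.chan),
    p = 0.6)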
