R optim same function for fn and gr

I would like to use optim() to optimize a cost function (fn argument), and I will be providing a gradient (gr argument). I can write separate functions for fn and gr. However, they have a lot of code in common and I don't want the optimizer to waste time repeating those calculations. So is it possible to provide one function that computes both the cost and the gradient? If so, what would be the calling syntax to optim()?
As an example, suppose the function I want to minimize is
cost <- function(x) {
x*exp(x)
}
Obviously, this is not the function I'm trying to minimize. That's too complicated to list here, but the example serves to illustrate the issue. Now, the gradient would be
grad <- function(x) {
(x+1)*exp(x)
}
So as you can see, the two functions, if called separately, would repeat some of the work (in this case, the exponential function). However, since optim() takes two separate arguments (fn and gr), it appears there is no way to avoid this inefficiency, unless there is a way to define a function like
costAndGrad <- function(x) {
ex <- exp(x)
list(cost=x*ex, grad=(x+1)*ex)
}
and then pass that function to optim(), which would need to know how to extract the cost and gradient.
Hope that explains the problem. Like I said my function is much more complicated, but the idea is the same: there is considerable code that goes into both calculations (cost and gradient), which I don't want to repeat unnecessarily.
By the way, I am an R novice, so there might be something simple that I'm missing!
Thanks very much

The nlm function does optimization and it expects the gradient information to be returned as an attribute to the value returned as the original function value. That is similar to what you show above. See the examples in the help for nlm.
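As a minimal sketch of that approach, applied to the toy cost function from the question (the name cost_and_grad is made up for illustration), the gradient is attached as a "gradient" attribute on the value the function returns, so the shared exp(x) is computed only once per call:

```r
# Toy objective from the question: cost = x * exp(x), gradient = (x + 1) * exp(x).
# cost_and_grad is an illustrative name, not part of any package.
cost_and_grad <- function(x) {
  ex <- exp(x)                           # shared work, done once
  value <- x * ex                        # the cost
  attr(value, "gradient") <- (x + 1) * ex  # nlm looks for this attribute
  value
}

# nlm picks up the analytic gradient from the attribute automatically.
fit <- nlm(cost_and_grad, p = 0)
fit$estimate   # should end up near -1, the minimizer of x * exp(x)
```

This only helps if nlm is an acceptable substitute for optim in your problem; optim itself still takes fn and gr as separate arguments.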

Related

Which trigonometric function is most appropriate?

I'm trying to figure out which trigonometric function is most appropriate for the image below.
What I have tried:
function(x,theta){
theta*acos(theta+1)*x
}
However, I'm not sure if this takes it to the 'power of' like in the image:
This is just an ordinary cosine being raised to a power. It is written that way by convention to save parentheses and to make it clear that it is the result of applying the cosine which is raised to the power (rather than raising the argument to a power prior to taking its cosine). Use:
f <- function(x,theta){
theta*cos(x)^(theta + 1)
}
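To see the difference between the two readings, here is a small check. The values x = pi/3 and theta = 1 are arbitrary and chosen only to make the arithmetic easy to follow:

```r
f <- function(x, theta) {
  theta * cos(x)^(theta + 1)   # ^ binds tighter than *, so this is theta * (cos(x)^(theta + 1))
}

x <- pi / 3
theta <- 1

power_of_cosine <- cos(x)^(theta + 1)   # (cos x)^2, what the notation means
cosine_of_power <- cos(x^(theta + 1))   # cos(x^2), a different quantity

f(x, theta)        # equals power_of_cosine here because theta is 1
power_of_cosine
cosine_of_power
```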

Taking advantage of Julia's integration abilities

One of the main reasons I wanted to use Julia for my project is because of its speed, especially for calculating integrals.
I would like to integrate a 1-d function f(x) over some interval [a,b]. In general Julia's quadgk function would be a fast and accurate solution. However, I do not have the function f(x), but only its values f(xi) for a discrete set of points xi in [a,b], stored in an array. The xi's are regularly spaced, and I can get the spacing to be however small I like.
Naively, I could simply define a function f which interpolates using the values f(xi) and feed this to quadgk, (and make the spacing as small as possible), however then I won't know what my error is, which is a shame because QuadGK tells you the error in its estimation.
Another solution is to write a function myself to integrate the array (with trapezoid rule for example), but that would defeat the purpose of using Julia...
What is the easiest way to accurately integrate a function only given discrete values using Julia?
Since you only have values, not the function itself, trapezoid will be your best bet probably. The package Trapz provides this (https://github.com/francescoalemanno/Trapz.jl). However, I think it is worth seeing how easy writing a pretty good implementation yourself would be.
function trap(A)
return sum(A) - (A[begin] + A[end])/2
end
This assumes the points are one unit apart; for a regular spacing h, multiply the result by h. It takes 2.9ms for an array of 10 million floats. If they're Int, then 2.9ms. If they were complex numbers, it would still work (and take 8.9 ms).
A method like this is a good example to show how simple it can be to write pretty fast code in Julia that is still fully generic.

R Optimization: Pass value from function to gradient with each iteration

I have a function that I am optimizing using the optimx function in R (I'm also open to using optim, since I'm not sure it will make a difference for what I'm trying to do). I have a gradient that I am passing to optimx for (hopefully) faster convergence compared to not using a gradient. Both the function and the gradient use many of the same quantities that are computed from each new parameter set. One of these quantities in particular is very computationally costly, and it's redundant to have to compute this quantity twice for each iteration - once for the function, and again for the gradient. I'm trying to find a way to compute this quantity once, then pass it to the function and the gradient.
So here is what I am doing. So far this works, but it is inefficient:
optfunc<-function(paramvec){
quant1<-costlyfunction(paramvec)
#costlyfunction is a separate function that takes a while to run
loglikelihood<-sum(quant1)**2
#not really squared, but the log likelihood uses quant1 in its calculation
return(loglikelihood)
}
optgr<-function(paramvec){
quant1<-costlyfunction(paramvec)
mygrad<-sum(quant1) #again not the real formula, just for illustration
return(mygrad)
}
optimx(par=paramvec,fn=optfunc,gr=optgr,method="BFGS")
I am trying to find a way to calculate quant1 only once with each iteration of optimx. It seems the first step would be to combine fn and gr into a single function. I thought the answer to this question may help me, and so I recoded the optimization as:
optfngr<-function(){
quant1<-costlyfunction(paramvec)
optfunc<-function(paramvec){
loglikelihood<-sum(quant1)**2
return(loglikelihood)
}
optgr<-function(paramvec){
mygrad<-sum(quant1)
return(mygrad)
}
return(list(fn = optfunc, gr = optgr))
}
do.call(optimx, c(list(par=paramvec,method="BFGS",optfngr() )))
Here, I receive the error: "Error in optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower, : Cannot evaluate function at initial parameters." Of course, there are obvious problems with my code here. So, I'm thinking answering any or all of the following questions may shed some light:
I passed paramvec as the only arguments to optfunc and optgr so that optimx knows that paramvec is what needs to be iterated over. However, I don't know how to pass quant1 to optfunc and optgr. Is it true that if I try to pass quant1, then optimx will not properly identify the parameter vector?
I wrapped optfunc and optgr into one function, so that the quantity quant1 will exist in the same function space as both functions. Perhaps I can avoid this if I can find a way to return quant1 from optfunc, and then pass it to optgr. Is this possible? I'm thinking it's not, since the documentation for optimx is pretty clear that the function needs to return a scalar.
I'm aware that I might be able to use the dots arguments to optimx as extra parameter arguments, but I understand that these are for fixed parameters, and not arguments that will change with each iteration. Unless there is also a way to manipulate this?
Thanks in advance!
Your approach is close to what you want, but not quite right. You want to call costlyfunction(paramvec) from within optfunc(paramvec) or optgr(paramvec), but only when paramvec has changed. Then you want to save its value in the enclosing frame, as well as the value of paramvec that was used to compute it. That is, something like this:
optfngr<-function(){
quant1 <- NULL
prevparam <- NULL
updatecostly <- function(paramvec) {
if (!identical(paramvec, prevparam)) {
quant1 <<- costlyfunction(paramvec)
prevparam <<- paramvec
}
}
optfunc<-function(paramvec){
updatecostly(paramvec)
loglikelihood<-sum(quant1)**2
return(loglikelihood)
}
optgr<-function(paramvec){
updatecostly(paramvec)
mygrad<-sum(quant1)
return(mygrad)
}
return(list(fn = optfunc, gr = optgr))
}
do.call(optimx, c(list(par=paramvec,method="BFGS"),optfngr() ))
I used <<- to make assignments to the enclosing frame, and fixed up your do.call second argument.
Doing this is called "memoization" (or "memoisation" in some locales; see http://en.wikipedia.org/wiki/Memoization), and there's a package called memoise that does it. It keeps track of lots of (or all of?) the previous results of calls to costlyfunction, so would be especially good if paramvec only takes on a small number of values. But I think it won't be so good in your situation because you'll likely only make a small number of repeated calls to costlyfunction and then never use the same paramvec again.
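To check that the caching actually saves work, here is a self-contained sketch using base optim instead of optimx. Everything in it is made up for illustration: make_fn_gr, the call counter, and the stand-in for costlyfunction, which is just par - c(1, 2) so that the objective is a simple quadratic with a known gradient.

```r
# Build fn and gr that share one cached evaluation of an expensive quantity.
# make_fn_gr and its internals are hypothetical names for this sketch.
make_fn_gr <- function(costly) {
  cached_par <- NULL
  cached_val <- NULL
  n_calls <- 0
  update <- function(par) {
    # Recompute only when the optimizer moves to a new parameter vector.
    if (!identical(par, cached_par)) {
      cached_val <<- costly(par)
      cached_par <<- par
      n_calls <<- n_calls + 1
    }
    cached_val
  }
  list(
    fn = function(par) sum(update(par)^2),  # objective: sum of squared residuals
    gr = function(par) 2 * update(par),     # its gradient with respect to par
    calls = function() n_calls              # how often costly() really ran
  )
}

# Stand-in for the expensive computation: residuals from the point (1, 2).
fg <- make_fn_gr(function(par) par - c(1, 2))
fit <- optim(c(0, 0), fg$fn, fg$gr, method = "BFGS")

fit$par      # should be close to c(1, 2)
fit$counts   # number of fn and gr evaluations optim requested
fg$calls()   # number of times the costly function was actually evaluated
```

If the cache is working, fg$calls() comes out smaller than the sum of fit$counts, because the gradient is typically requested at a point where the function was just evaluated.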

R: Creating a function with an arbitrarily long expression

Although my original question is more general, to keep things comprehensible I'm formulating only a special case of it below. I expect that an answer to it will also answer the more general question.
Question:
How can I integrate the function f(x)=(...(((x^x)^x)^x)...^x)^x (raised to the power x, n times) on the interval (0,1)?
Thanks a lot for any ideas!
P.S.: please, do not try to solve the problem mathematically or to simplify an expression (e.g., to approximate the result with a Taylor expansion, whatever), since it's not the main topic (however, I've tried to choose such an example, which should not have any simple transformations)
P.S.2: Original question (which does not require an answer here, since it's expected that an answer for posted question is valid for original one):
Is it possible in R to create a function with an arbitrarily long expression (avoiding defining it "manually")? For example, it's easy to set up a given function manually for n=5:
f<-function(x) {
((((x^x)^x)^x)^x)^x
}
But what if n=1'000, or 1'000'000 ?
It seems that simple looping is not appropriate here...
Copied from Rhelp: You should look at:
# ?funprog Should have worked but didn't. Try instead ...
?Reduce
There are several examples of repeated applications of a functional argument. Also composition of list of functions.
One instance:
Funcall <- function(f, ...) f(...) # sort of like `do.call`
Iterate <- function(f, n = 1)
function(x) Reduce(Funcall, rep.int(list(f), n), x, right = TRUE)
Iterate(function(x) x^1.1, 30)(1.01)
#[1] 1.189612
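Applied to the function in the question, a sketch along the same lines could look like this. The name power_tower is made up here; it uses Reduce to apply "raise to the power x" n times, starting from x, and the result is vectorized so it can be handed to integrate:

```r
# Returns a function computing (...((x^x)^x)...)^x with n exponentiations.
# power_tower is an illustrative name, not an existing R function.
power_tower <- function(n) {
  function(x) {
    # The second argument of the reducer (the loop index) is ignored;
    # seq_len(n) only controls how many times "^ x" is applied.
    Reduce(function(acc, i) acc^x, seq_len(n), x)
  }
}

f5 <- power_tower(5)
manual5 <- function(x) ((((x^x)^x)^x)^x)^x   # the hand-written n = 5 version

f5(0.5)
manual5(0.5)                                 # same value as f5(0.5)

res <- integrate(f5, lower = 0, upper = 1)   # numerical integral on (0, 1)
res$value
```

For very large n the loop inside Reduce still runs n times per call, so this avoids writing the expression by hand but not the cost of evaluating it.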

Using outer() with a multivariable function

Suppose you have a function f<- function(x,y,z) { ... }. How would you go about passing a constant to one argument, but letting the other ones vary? In other words, I would like to do something like this:
output <- outer(x,y,f(x,y,z=2))
This code doesn't evaluate, but is there a way to do this?
outer(x, y, f, z=2)
The arguments after the function are additional arguments to it, see ... in ?outer. This syntax is very common in R, the whole apply family works the same for instance.
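A small illustration, with a made-up f that is vectorized in x and y (outer requires the function to be vectorized):

```r
# Hypothetical example function; any vectorized f(x, y, z) would do.
f <- function(x, y, z) x + y * z

x <- 1:3
y <- 1:2

# z = 2 is forwarded to f for every (x, y) combination.
out <- outer(x, y, f, z = 2)
out   # 3 x 2 matrix with entries x[i] + 2 * y[j]

# The same forwarding of extra arguments works in the apply family.
sapply(x, f, y = 1, z = 2)
```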
Update:
I can't tell exactly what you want to accomplish in your follow-up question, but I think a solution of this form is probably what you should use.
outer(sigma_int, theta_int, function(s,t)
dmvnorm(y, rep(0, n), y_mat(n, lambda, t, s)))
This calculates a variance matrix for each combination of the values in sigma_int and theta_int, uses that matrix to define a density and evaluates it at the point(s) defined in y. I haven't been able to test it though, since I don't know the types and dimensions of the variables involved.
outer (along with the apply family of functions and others) will pass along extra arguments to the functions which they call. However, if you are dealing with a function that does not forward extra arguments, then you can use the more general approach of currying. To curry a function is to create a new function which has (some of) the variables fixed and therefore has fewer parameters.
library("functional")
output <- outer(x,y,Curry(f,z=2))
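If you would rather not depend on an extra package, a plain wrapper function does the same job in base R. This is only a sketch with a made-up f; the wrapper name f_z2 is also just for illustration:

```r
# Hypothetical three-argument function, vectorized in x and y.
f <- function(x, y, z) x * y + z

# Fix z = 2 by wrapping f in a new two-argument function.
f_z2 <- function(x, y) f(x, y, z = 2)

x <- 1:3
y <- 1:4

out <- outer(x, y, f_z2)

# Same result as forwarding z through outer's extra arguments.
all.equal(out, outer(x, y, f, z = 2))
```

An anonymous function works just as well, for example outer(x, y, function(x, y) f(x, y, z = 2)).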
