Stopping criteria for optim/SANN in R not working - r

Issue 1
I have an objective function, gFun(modelOutput,l,u), which returns 0 if the simulated output lies in the interval [l,u]; otherwise it returns a positive (!) number.
OFfun <- function(params) {
out <- simulate(params)
OF <- gFun(out,0,5)
return(OF)
}
The objective function is called from the optim function with some tolerance settings.
fitval=optim(par=parms,fn=OFfun,method="SANN",control = list(abstol = 1e-2))
summary(fitval)
My issue is that the optimization doesn't stop when OFfun returns 0.
I have tried with the condition below:
if (OF == 0){
opt <- options(show.error.messages=FALSE)
on.exit(options(opt))
stop()
}
It works, but it doesn't return OF to optim, so I don't get the fitval result with the estimated parameters.
Issue 2
Another issue is that the solver sometimes crashes and aborts the entire optimisation. I would like to harvest many solution sets for different initial guesses, so I need to handle failed simulations. This is probably related to issue 1.
Any advice would be much appreciated.
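A note on the stopping behaviour: the R documentation for optim states that for method "SANN" maxit gives the total number of function evaluations and that there is no other stopping criterion, so abstol has no effect here. One possible workaround, sketched below under stated assumptions, is to signal a custom condition that carries the current parameters out of optim, and to turn failed simulations into a large penalty so that a crash does not abort the run. The stand-in objective, the helper names early_stop_fn and run_sann, and the penalty of 1e10 are hypothetical choices for illustration; they are not part of optim.

```r
# Hypothetical stand-in for simulate() + gFun(out, 0, 5):
# returns 0 when every parameter lies in [0, 5], a positive penalty otherwise.
OFfun <- function(params) {
  sum(pmax(0 - params, 0) + pmax(params - 5, 0))
}

# Wrap an objective so that (a) errors become a large finite penalty and
# (b) reaching the target value signals a condition carrying the parameters.
early_stop_fn <- function(fn, target = 0, penalty = 1e10) {
  function(params) {
    value <- tryCatch(fn(params), error = function(e) penalty)
    if (is.finite(value) && value <= target) {
      stop(structure(
        list(message = "target reached", call = NULL,
             par = params, value = value),
        class = c("early_stop", "error", "condition")
      ))
    }
    value
  }
}

# Run SANN and convert the early-stop condition into an optim-like result.
run_sann <- function(start, fn, ...) {
  tryCatch(
    optim(par = start, fn = early_stop_fn(fn), method = "SANN", ...),
    early_stop = function(cond) {
      list(par = cond$par, value = cond$value, convergence = 0L)
    }
  )
}

set.seed(1)
fitval <- run_sann(c(8, -3), OFfun, control = list(maxit = 5000))
str(fitval)
```

Calling run_sann in a loop over different starting vectors would then collect one result per initial guess. If the target is never reached, the ordinary optim result is returned after maxit evaluations.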

Related

R code runs when required parameter is not specified

I am assisting a colleague with adding functionality to one of his R packages.
I have implemented nonparametric bootstrapping using a for loop construct in R.
# perform resampling
# resample `subsample_size` values with or without replacement replicate_size times
for (i in 1:replicate_size) {
if (replacement == TRUE) { # bootstrapping
z <- sample(x, size = subsample_size, replace = TRUE)
zz <- sample(x, size = subsample_size, replace = TRUE)
} else { # subsampling
z <- sample(x, size = subsample_size, replace = FALSE)
zz <- sample(x, size = subsample_size, replace = FALSE)
}
# calculate statistic
boot_samples[i] <- min(zz) - max(z)
}
The above loop is nested within another for loop, which itself is nested within a function (details not shown). The code I'm dealing with is messy, and there are most certainly more efficient ways of coding things up, but I've had to leave it be since my colleague is only familiar with very basic and rudimentary coding constructs.
Upon running said function, I specified all required arguments (replicate_size, replacement) except subsample_size. subsample_size is needed to carry out the resampling. This mistake on my part was revealing because, for some strange reason, the code still runs without throwing an error regarding missing a value for subsample_size.
Question: Does anyone have any idea why this happens?
I'd include more code, but it is very verbose and unwieldy (his code, not mine). Running the for loop outside the function does indeed raise the error regarding the missing value as expected.
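One likely explanation, assuming the function forwards subsample_size straight to sample() as in the loop above: sample() checks missing(size) internally and falls back to the length of x, and R still reports an argument as missing when it is a missing argument of the calling function passed along unchanged. The function names resample_once and resample_strict are hypothetical. A minimal sketch:

```r
# Hypothetical wrapper mimicking the structure described above:
# subsample_size has no default and is forwarded to sample().
resample_once <- function(x, subsample_size, replacement = TRUE) {
  sample(x, size = subsample_size, replace = replacement)
}

x <- c(2.1, 3.4, 5.6, 7.8, 9.0)

# No error: inside sample(), missing(size) is TRUE, so size becomes length(x).
z <- resample_once(x)
length(z)

# Forcing the argument at the top of the function restores the expected error.
resample_strict <- function(x, subsample_size, replacement = TRUE) {
  force(subsample_size)
  sample(x, size = subsample_size, replace = replacement)
}
```

Running the loop at top level raises the error because there subsample_size is an undefined object rather than a missing function argument.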

Basic questions about scilab

I am taking a numerical calculus class in which we are not required to know any Scilab programming beyond the very basics, which are taught through a booklet, since the class is mostly theoretical. While reading the booklet I found this Scilab code, which is meant to find a root of a function using the bisection method.
The problem is, I can't find a way to make it work. I tried to call it with bissecao(x,-1,1,0.1,40) however it didn't work.
The error I got was:
at line 3 of function bissecao ( E:\Downloads\bisseccao3.sce line 3 )
Invalid index.
Since I highly doubt that the code itself is broken, and I couldn't spot anything wrong in it, I am probably calling it incorrectly somehow.
The code is the following:
function p = bissecao(f, a, b, TOL, N)
i = 1
fa = f(a)
while (i <= N)
//bisection iteration
p = a + (b-a)/2
fp = f(p)
//stop condition
if ((fp == 0) | ((b-a)/2 < TOL)) then
return p
end
//bisect the interval
i = i+1
if (fa * fp > 0) then
a = p
fa = fp
else
b = p
end
end
error ('Max number iter. exceeded!')
endfunction
Here f is a function (I guess), a and b are the limits of the interval over which we iterate, TOL is the tolerance at which the program terminates close to a zero, and N is the maximum number of iterations.
Any help on how to make this run is greatly appreciated.
Error in bissecao
The only error in your bissecao function is the call to return:
In a function return stops the execution of the function,
[x1,..,xn]=return(a1,..,an) stops the execution of the function and
put the local variables ai in calling environment under names xi.
So you should either call it without any arguments (input or output), and the function will exit and return p.
Or you could call y1 = return(p), and the function will exit and p will be stored in y1.
It is better to use the argument-free form of return in functions, to avoid changing the values of variables in the parent/calling script or function (a possible side effect).
The argument form is more useful when interactively debugging with pause:
In pause mode, it allows to return to lower level.
[x1,..,xn]=return(a1,..,an) returns to lower level and put the local
variables ai in calling environment under names xi.
Error in calling bissecao
The problem may also come from your call, bissecao(x,-1,1,0.1,40), because you didn't define x. Defining x as a function solves the problem:
function y=x(t)
y=t+0.3
endfunction
x0=bissecao(x,-1,1,0.1,40) // changed 'return p' to 'return'
disp(x0) // gives approximately -0.3 (within TOL = 0.1), as expected
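For readers coming from the R threads on this page, the same loop can be sketched in R, where return(p) does hand the value back directly. This is only an illustrative port; the name bisection and its argument names are assumptions, not part of the booklet's code.

```r
# R sketch of the bisection loop above; the root estimate is returned directly.
bisection <- function(f, a, b, tol, max_iter) {
  fa <- f(a)
  for (i in seq_len(max_iter)) {
    p <- a + (b - a) / 2            # midpoint of the current interval
    fp <- f(p)
    # Stop when the midpoint is an exact root or the half-width is below tol.
    if (fp == 0 || (b - a) / 2 < tol) {
      return(p)
    }
    # Keep the half of the interval in which the sign change occurs.
    if (fa * fp > 0) {
      a <- p
      fa <- fp
    } else {
      b <- p
    }
  }
  stop("Maximum number of iterations exceeded")
}

root <- bisection(function(t) t + 0.3, -1, 1, 0.1, 40)
root  # -0.3125, within tol = 0.1 of the true root -0.3
```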

Simulated Annealing in R: GenSA running time

I am using simulated annealing, as implemented in the R package GenSA (function GenSA), to search for values of input variables that result in "good values" (compared to some baseline) of a high-dimensional function. I noticed that setting the maximum number of calls of the objective function has no effect on the running time. Am I doing something wrong, or is this a bug?
Here is a modification of the example given in GenSA help file.
library(GenSA)
Rastrigin <- local({
index <- 0
function(x){
index <<- index + 1
if(index%%1000 == 0){
cat(index, " ")
}
sum(x^2 - 10*cos(2*pi*x)) + 10*length(x)
}
})
set.seed(1234)
dimension <- 1000
lower <- rep(-5.12, dimension)
upper <- rep(5.12, dimension)
out <- GenSA(lower = lower, upper = upper, fn = Rastrigin, control = list(max.call = 10^4))
Even though max.call is specified to be 10,000, GenSA calls the objective function more than 46,000 times (note that the objective is defined within a local environment in order to track the number of calls). The same problem arises when trying to specify the maximum running time via max.time.
This is the answer from the package maintainer:
max.call and max.time are soft limits that do not include local
searches that are performed before reaching these limits. The
algorithm does not stop the local search strategy loop before its end
and this may exceed the limitation that you have set but will stop
after that last search. We have designed the algorithm that way to
make sure that the algorithm isn't stopped in the middle of searching
valley. Such an option to stop anywhere will be implemented in the
next release of the package.

Behavior of optim() function in R

I'm doing maximum likelihood estimation using the R optim function.
The command I used is
optim(3, func, lower=1.0001, method="L-BFGS-B")$par
The function func has infinite value if the parameter is 1.
Thus I set the lower value to be 1.0001.
But sometimes an error occurs.
Error in optim(3, func, lower = 1.0001, method = "L-BFGS-B", sx = sx, :
L-BFGS-B needs finite values of 'fn'
What happened next is hard to understand.
If I run the same command again, then it gives the result 1.0001 which is lower limit.
It seems that the optim function 'learns' that 1 is not the proper answer.
How can I get the optim function to give the answer 1.0001 on the first run?
P.S.
I just found that this problem occurs only in stand-alone R-console. If I run the same code in R Studio, it does not occur. Very strange.
The method "L-BFGS-B" requires all computed values of the function to be finite.
It seems that, for some reason, optim is evaluating your function at the value 1.0, getting Inf, and then throwing an error.
If you want a quick hack, try defining a new function that gives a very high value (or a very low one if you're trying to maximize) for inputs of 1.
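func2 <- function(x) {
  if (x == 1) {
    # large finite penalty; optim minimizes by default
    # (use a large negative value instead when maximizing)
    return(1e10)
  } else {
    return(func(x))
  }
}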
optim(3, func2, lower=1.0001, method="L-BFGS-B")$par
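A slightly more general variant of the same hack, sketched under assumptions: rather than testing for the exact input 1, wrap the objective so that any non-finite result is replaced by a large finite penalty. The names make_finite and penalty are hypothetical, and the example objective x^2/(x - 1) is only a stand-in for a function with a singularity at 1 (for x > 1 it has its minimum at x = 2).

```r
# Return a version of fn that never hands a non-finite value to optim.
# For a maximization problem the penalty would need the opposite sign.
make_finite <- function(fn, penalty = 1e10) {
  function(x, ...) {
    value <- fn(x, ...)
    if (is.finite(value)) value else penalty
  }
}

# Stand-in objective with a singularity at x = 1.
func <- function(x) x^2 / (x - 1)

fit <- optim(par = 3, fn = make_finite(func),
             method = "L-BFGS-B", lower = 1 + 1e-4)
fit$par
```

This does not explain why optim would step onto the singularity in the first place; it only keeps L-BFGS-B from aborting when it does.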
(Posted as answer rather than comment for now; will delete later if appropriate.)
For what it's worth, I can't get this example (with a singularity at 1) to fail, even using the default control parameters (e.g. ndeps=1e-3):
func <- function(x) 1/(x-1)*x^2
library(numDeriv)
grad(func,x=2) ## critical point at x=2
optim(par=1+1e-4,fn=func,method="L-BFGS-B",lower=1+1e-4)
Try a wide range of starting values:
svec <- 1+10^(seq(-4,2,by=0.5))
sapply(svec,optim,fn=func,method="L-BFGS-B",lower=1+1e-4)
These all work.

R script question - is.na telling me the condition has length > 1

In my R script, I run nls to get a fit:
fit <- nls(...)
and then after that, I test if the nls succeeded by doing this:
if(is.na(fit)) {
print("succeeded")
}
but I get warnings:
the condition has length > 1 and only the first element will be used
Am I doing this wrong? If so, what should I do? If not, how do I remove the warning? Thanks!
nls throws an error if the fit fails, so checking is.null after try(nls(...)) is the correct approach.
Here is a piece of code I used when fitting uncertain data with nls:
fit <- NULL
while (TRUE) {
start <- list(...) # try somewhat randomized initial parameter
try(fit <- nls(..., start = start)) # perform nls
if (!is.null(fit)) break;
}
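The loop above never terminates if no starting values work. A bounded variant can be sketched as follows; the synthetic data, the exponential model, the starting ranges, and the limit of 50 attempts are all assumptions made for illustration only.

```r
# Synthetic data for illustration: y = 3 * exp(0.2 * x) plus noise.
set.seed(42)
dat <- data.frame(x = 1:20)
dat$y <- 3 * exp(0.2 * dat$x) + rnorm(20, sd = 0.5)

fit <- NULL
max_attempts <- 50
for (attempt in seq_len(max_attempts)) {
  # Randomized starting values for each attempt.
  start <- list(a = runif(1, 1, 5), b = runif(1, 0.1, 0.3))
  # tryCatch returns NULL instead of stopping when nls fails.
  fit <- tryCatch(
    nls(y ~ a * exp(b * x), data = dat, start = start),
    error = function(e) NULL
  )
  if (!is.null(fit)) break
}

if (is.null(fit)) {
  message("nls did not converge in ", max_attempts, " attempts")
} else {
  print(coef(fit))
}
```

Testing with is.null(fit) also avoids the original warning, because is.na applied to a fitted model object returns one value per component rather than a single TRUE or FALSE.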
