optim function in R: L-BFGS-B needs finite values of 'fn'

I am trying to use the optim function in R for an MLE of three variables, but I keep getting the error: Error in optim(fn = logL_geotest5_test, par = c(0.2, 1.5, 0.3), I = I, :
L-BFGS-B needs finite values of 'fn'
I am trying to understand the reasons behind this error, and it seems to be related to the maximal value of the log-likelihood being beyond .Machine$double.xmax.
This code is part of the geometric VaR backtesting method provided by Pelletier & Wei, and I can provide you with the log-likelihood. However, the optimization worked (and occasionally didn't) before, so I assume that this is not the problem. If you wish, I can provide the formula for the LL, but it is long code (and I wanted to keep this post as short as possible).
I am thankful for any suggestions and ideas.
V is a vector of 250 values.
N <- 100
iTest <- mat.or.vec(250, N)
iTest <- replicate(n = N, rbinom(n = 250, size = 1, prob = 0.01))
LL_H0 <- mat.or.vec(1, N)
for (i in 1:N) {
  I <- iTest[, i]
  logL_gtest <- function(Omega, I, VaR) {
    a = Omega[1]; b = Omega[2]; z = Omega[3]
    logL(I, a, b, z, VaR)
  }
  lower_boundary <- c(1e-8, 0, 0); upper_boundary <- rep(1, 2, 2)
  LL_H0help <- optim(fn = logL_gtest, par = c(0.2, 1.5, 0.3), I = I, VaR = VaR,
                     lower = lower_boundary, upper = upper_boundary,
                     method = "L-BFGS-B")
  LL_H0[, i] <- LL_H0help$value
}
Edit 1:
Thank you for your advice so far. I am still looking for the right place to insert the browser() function. Meanwhile, here is the LL function:
logL <- function(I, a, b, z, VaR) {
  m <- sum(I)
  v <- which(I == 1)
  v <- c(0, v, 250)
  d <- c(diff(v))
  if (a < 0 | a >= 1 | b < 0 | b > 1 | z < 0 | (m - 1) < 3) {
    logL <- NA
  } else {
    s <- rep(0, length(d))
    f <- rep(0, length(d))
    for (i in 1:length(d)) {
      lambda <- function(a, b, z, ds, VaR1) {
        a * ds^(b - 1) * exp(-z * VaR1)
      }
      VaR1 <- VaR[(v[i] + 1):v[i + 1]]
      ds <- seq_len(d[i])
      lhelp <- lambda(a, b, z, ds, VaR1)
      lhelp <- na.omit(lhelp)
      lf <- c(1 - lhelp[1:(length(lhelp) - 1)], lhelp[length(lhelp)])
      f[i] <- prod(lf)
      ls <- c(1 - lhelp[1:(length(lhelp) - 1)])
      s[i] <- prod(ls)
    }
    part1 <- ifelse(d[1] > 0, log(s[1]), log(f[1]))
    part2 <- sum(log(f[2:(length(d) - 1)]))
    part3 <- ifelse(d[length(d)] < 250, log(s[length(d)]), log(f[length(d)]))
    logL <- part1 + part2 + part3
    return(-logL)
  }
}
Edit 2: I forgot to mention that V is a vector of Value-at-Risk computations, so its values are small, around -0.02.
Edit 3: Thank you for your suggestions so far. I replaced every V by VaR and c by z. VaR is a vector of computed Value-at-Risk values of length 250, all roughly between -0.018 and -0.024.

I doubt anybody can guess what the issue is, but I can tell you how to debug it yourself.
Use something like:
browser(expr = yourVariable == Inf)
in your likelihood code so that you can explore the variables' values and understand where the non-finite value comes from. Check the help page of this function; it is, as usual in R, very helpful. Feel free to edit this answer if there is a typo; I cannot check it in R right now.
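For illustration, here is a minimal sketch of that idea (debug_logL is a hypothetical wrapper name; logL_gtest, I and VaR are taken from the question):
debug_logL <- function(Omega, I, VaR) {
  val <- logL_gtest(Omega, I, VaR)
  browser(expr = !is.finite(val))  # drop into the debugger only when val is NA/NaN/Inf
  val
}
Passing fn = debug_logL to the optim() call above should then stop at the first parameter combination that produces a non-finite likelihood, so you can inspect a, b and z right there.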

Related

Integration of a function with while loop in R

I want to integrate a function involving a while loop in R. I have pasted an MWE below. Could anyone please advise how to get rid of the warning messages when integrating such a function?
Thank you.
myfun <- function(X, a, b, kmin, kmax){
  term <- 0
  k <- 1
  while(k < kmax | term < 10000){
    term <- term + a * b * X^k
    k <- k + 1
  }
  fx <- exp(X) * term
  return(fx)
}
a <- 5
b <- 4
kmax <- 20
integrate(myfun, lower = 0, upper = 10, a = a, b = b, kmax = kmax)
Produces a warning, accessed via warnings():
In while (k < kmax | term < 10000) { ... :
the condition has length > 1 and only the first element will be used
From the integrate() documentation:
f must accept a vector of inputs and produce a vector of function evaluations at those points.
This is the crux of the problem here, which you can see by running myfun(c(1, 2), a, b, kmin, kmax) and reproducing a similar warning. What's happening is that integrate() wants to pass a vector of inputs to myfun in X; this means that inside your while loop, term will become a vector as well. This creates a problem when the while loop kicks back to the evaluation stage, because now the condition k < kmax | term < 10000 has a vector structure as well (since term does), which while doesn't like.
This warning is very good in this case, because it strongly suggests that integrate() isn't doing what you want it to do. Your goal here isn't to get rid of the warning messages; the function as written simply won't work with integrate() due to the while loop structure.
Your choices for how to proceed are to either (1) rewrite the function in a way that doesn't use a while loop, or (2) just hard-code some numeric integration yourself, perhaps with a for loop. The best way to use R is to vectorize everything and to avoid things like while and for whenever possible.
Finally, I'll note that there seems to be some problem with the underlying function, since myfun(0.5, a, b, kmin, kmax) does not converge (note the problem with the mathematics when the supplied X term is less than 1), so you won't be able to integrate it on the interval [0, 10] no matter what you do.
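As a sketch of option (1) — not a fix for the convergence problem just mentioned, which is why the interval below deliberately stays above 1 — you can keep the scalar loop and wrap it with Vectorize() so that integrate() receives a function that maps a vector of X values to a vector of evaluations:
myfun_vec <- Vectorize(myfun, vectorize.args = "X")
integrate(myfun_vec, lower = 2, upper = 3, a = 5, b = 4, kmin = 1, kmax = 20)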

Issue with the dimension of matrix being optimised in R

I am attempting to calculate some weights in order to perform an indirect treatment comparison using R. I have altered some code slightly, in order to reflect that I am only centring the mean. However, this code will not run.
I believe this is due to the a1 matrix having two columns instead of one, but I really can't work out how to change this. I have tried adding a column of zeros and ones to the matrix, but I'm not sure if this will give me a correct result.
Of course, this may not be the issue at all, but I fail to see what else could be causing this. I have included the code and any advice would be appreciated.
# Objective function
objfn <- function(a1, X){
  sum(exp(X %*% a1))
}
# Gradient function
gradfn <- function(a1, X){
  colSums(sweep(X, 1, exp(X %*% a1), "*"))
}
X.EM.0 <- data$A - age.mean
# Estimate weights
print(opt1 <- optim(par = c(0, 0), fn = objfn, gr = gradfn, X = X.EM.0, method = "BFGS"))
a1 <- opt1$par
Such a simple solution; I'm slightly embarrassed to have posted this.
par = c(0, 0) should be altered to match the number of columns of the data. Here it should have been changed to a single value.
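A sketch of the fix (the as.matrix() coercion is my addition, assuming the data and age.mean objects from the question; it keeps %*% and sweep() happy when there is a single centred covariate):
X.EM.0 <- as.matrix(data$A - age.mean)  # one-column matrix
opt1 <- optim(par = 0, fn = objfn, gr = gradfn, X = X.EM.0, method = "BFGS")
a1 <- opt1$par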

Two variable function maximization - R code

So I'm trying to maximize the likelihood function for a gamma-Poisson model, and I've programmed it into R as follows:
lik <- function(x, t, a, b){
  n <- length(x)
  like <- numeric(n)
  for(i in 1:n){
    like[i] <- log(gamma(a + x[i])) - log(gamma(a)) - log(gamma(1 + x[i])) +
      x[i]*log(t[i]/b) - (a + x[i])*log(1 + t[i]/b)
  }
  return(sum(like))
}
where x and t are the data, and I have n data rows.
I need a and b to be solved for simultaneously. Does a built-in function exist in R, or do I need to hard-code an algorithm to solve the system of equations? [I'd rather not.] I know optimize() solves for one variable, and so does fminbnd(). I'm trying to copy the behavior of FindMaximum() in Mathematica. In a perfect world I'd like the code to work something like this:
optimize(f=lik, a>0, b>0, x=x, t=t, maximum=TRUE, iteration=5000)
$maximum
a 150
b 6
Thanks.
optim's first argument can be a vector of parameters. So you could try something like this:
lik <- function(p = c(1, 1), x, t){
  # In the body of the function, replace a by p[1] and b by p[2]
}
optim(c(1, 1), lik, method = "L-BFGS-B", x = x, t = t, control = list(fnscale = -1))
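A possible filled-in version of that skeleton, assuming the gamma-Poisson log-likelihood from the question (x and t are the data vectors; note the switch to lgamma(), which avoids the overflow that log(gamma()) hits for large arguments):
lik <- function(p = c(1, 1), x, t){
  a <- p[1]; b <- p[2]
  sum(lgamma(a + x) - lgamma(a) - lgamma(1 + x) +
        x*log(t/b) - (a + x)*log(1 + t/b))
}
optim(c(1, 1), lik, method = "L-BFGS-B", x = x, t = t,
      lower = c(1e-6, 1e-6), control = list(fnscale = -1))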
So the solution that ended up working out is:
attempt2d <- optim(
par = c(sumx/sumt, 1), fn = lik, data = data11,
method = "L-BFGS-B", control = list(fnscale = -1, trace=TRUE),
lower=0.1, upper = 170
)
However, my parameters run up to 170, essentially meaning that my gamma parameters are Inf, because gamma() hits infinity relatively quickly. In Mathematica the solutions are a = 169 and b = 16505, and R gets nowhere near that, maxing out at 170. The known solutions are beyond 170 in some cases; is there any fix for this anomaly?

Find minimums with R (1 Variable X, n times a fixed parameter U)

I'm trying to minimize the function f(X,U) = (X*log(X)-1/(1-U))^2,
where U = (U_1,...,U_n) ~ U(0,1). That means I have n fixed U's and want to find the minimum of each of:
(x_1*ln(x_1)-1/(1-u_1))^2
(x_2*ln(x_2)-1/(1-u_2))^2
......
(x_n*ln(x_n)-1/(1-u_n))^2
For that, I wanted to use the optim function.
I have defined:
n <- 10^3
U <- sort(runif(n, min = 0, max = 1))
X <- c()
Xsolution <- c()
f <- function(X, U){
  return(-(X*log(X) - (1/(1-U)))^2)  # negated, because min(f) = max(-f)
}
Now I have no idea how to do this with optim(). I always get the following error for the code below:
for(i in 1:n){
  Xsolution[i] <- optim(f(X, U[i]))
}
Error in log(X) : non-numeric argument to mathematical function
Side note: I would welcome a method without a for-loop, since for large n it will take too long. Maybe you can help me get it to work with sapply? Or an alternative way?
Alternatively, I thought I had it working with optimize(..., maximum = FALSE, ...):
f <- function (X, a) ((X*log(X)-(1/(1-a)))^2)
for (i in 1:n){
  xmin[i] <- optimize(f, c(0, 10000), tol = 0.0001, a = U[i])
}
This doesn't work properly either...
Also, the problem may be that it takes too long: I want to do it with n = 10^6, and I'm quite sure there has to be a way to do it without a for-loop. I think the for-loop is what makes this take ages. Please help me; I've been sitting on this problem for ages and it's quite frustrating.
Since X * log(X) = 1 / (1 - U[i]) can be solved numerically for any U[i], each term (X*log(X)-1/(1-U[i]))^2 can be driven to zero, and therefore there is a solution for each distinct U[i]. If the U[i] are all distinct, that means there are length(U) solutions. The solutions are given by the following (the unique can be omitted if the U[i] are all distinct):
f <- function (X, a) ((X*log(X)-(1/(1-a)))^2)
unique(sapply(U, function(a) optimize(f, c(0, 1000000), a = a)$minimum))
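An equivalent sketch via root finding (my variation, not from the answer above): the minimum of each squared term is attained where X*log(X) equals 1/(1-U[i]), so uniroot() locates it directly. The bracketing interval c(1, 1e7) is an assumption and may need widening when U[i] is extremely close to 1:
g <- function(X, a) X*log(X) - 1/(1 - a)
sols <- vapply(U, function(a) uniroot(g, c(1, 1e7), a = a)$root, numeric(1))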

Why is the error count divided by 2 in the nn.test function in deepnet R package?

I do not understand why there is a '/2' (divided by 2) in the calculation of error_count in the nn.test function in the deepnet R package, as shown below.
function (nn, x, y, t = 0.5)
{
  y_p <- nn.predict(nn, x)
  m <- nrow(x)
  y_p[y_p >= t] <- 1
  y_p[y_p < t] <- 0
  error_count <- sum(abs(y_p - y))/2
  error_count/m
}
To me, "error_count <- sum(abs(y_p - y))" is correct. Can anybody explain why they put the '/2' there?
Thank you for your time!
Just from looking at it, my best guess would be that error_count <- sum(abs(y_p - y)) counts each error twice (once for the deviation on each side of the prediction), so you divide by two.
Did you ever get to the bottom of this one? As I see it, you are right: it should not be divided by 2, unless there is some error-rate definition here that I am not aware of and that does not correspond to the actual error count. If I run nn.test on X, Y (0/1 classification) I get an "error rate" of 20%, but if I simply use nn.predict on X with the same network and then compare the predictions to Y, the error count is actually 40%.
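A small demonstration of one plausible reading (my speculation, not from the package documentation): if y is one-hot encoded with one column per class, a single misclassified row flips a 1 to 0 and a 0 to 1, so sum(abs(y_p - y)) counts it twice and the /2 restores the per-row error count. With a single 0/1 column, each error contributes only 1, which would explain the halved error rate observed above:
y <- matrix(c(1, 0,
              0, 1), nrow = 2, byrow = TRUE)    # two one-hot labels
y_p <- matrix(c(0, 1,                           # row 1 misclassified
                0, 1), nrow = 2, byrow = TRUE)  # row 2 correct
sum(abs(y_p - y))      # 2: the single error is counted twice
sum(abs(y_p - y)) / 2  # 1: the actual number of misclassified rows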
