(If anyone has a suggestion for a better title, please let me know.)
I am trying to solve a backward-induction optimization problem in R. (That context might not be important, but if it helps, great.)
I have a utility function of two variables, x and y.
I have a matrix for which I know only the terminal column; every earlier column needs to be solved backwards from the column after it by optimizing over x and y.
For example
m.state=matrix(1:16,16,1)
m.valuemat=matrix(0,16,5)
# 5 is the number of periods
# 16 is the number of states (rows)
## Suppose I want to make optim avoid choosing a configuration that lands us in states 1-5 at the end
m.valuemat[1:5,5]=-Inf
f.foo0 = function(x, y){
  util = 2*x^2 - y^1.5
  return(util)
}
foo = function(x, y, a, t){
  # continuation value: snap tomorrow's state to the nearest grid point,
  # then look up its value in the next period's column (t + 1)
  footomorrow = function(x, y, a){
    at1 = -x + 2*y + a
    atround = abs(m.state - at1)
    round2 = m.state[which(min(atround) == atround)][1]  # break ties by taking the first match
    Vtp1 = m.valuemat[which(m.state == round2), t + 1]
    return(Vtp1)
  }
  valuetoday = f.foo0(x, y) + .9*footomorrow(x, y, a)
  return(valuetoday)
}
# I know the final column should be all 0's
for(i in 4:1){   # solve backwards from the next-to-last period
  print(i)
  for(j in 1:16){
    tempfunction = function(x){
      foo(x[1], x[2], m.state[j], i)
    }
    result = optim(c(.001, 1), tempfunction, gr = NULL, method = "L-BFGS-B",
                   lower = c(0.001, 0.001), upper = c(5, 1),
                   control = list(fnscale = -1, maxit = 50000), hessian = FALSE)
    m.valuemat[j, i] = result$value
    print(m.valuemat)
  }
}
The error I get is: Error in optim(c(0.001, 1), f.Vt.ext, gr = NULL, method = "L-BFGS-B", :
L-BFGS-B needs finite values of 'fn'.
Is there a way to make optim smarter about this? Or is there a condition I can add? This is obviously a simplified version of my real code.
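One possible workaround, shown only as a sketch: wrap the objective so that optim() never sees a non-finite value, replacing -Inf with a large negative penalty (the wrapper name safe_foo and the cutoff -1e10 are my own choices, not part of the original setup).
# Sketch: clamp non-finite objective values so L-BFGS-B always receives a
# finite number; since control = list(fnscale = -1) we are maximizing, so the
# penalty must be very negative.
safe_foo = function(x, y, a, t){
  v = foo(x, y, a, t)
  if (!is.finite(v)) v = -1e10   # arbitrary large penalty instead of -Inf
  v
}
# inside the loop, optimize safe_foo instead of foo:
# tempfunction = function(x) safe_foo(x[1], x[2], m.state[j], i)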
Roughly speaking, I am trying to solve a maximization problem in R where the objective function has the structure log[f(theta)*g(theta)]. Thus I want to solve
Max_theta log[f(theta)*g(theta)]
The problem comes from the fact that g(theta) is itself obtained from another (constrained minimization) problem that depends on theta. g(theta) is defined as
g(theta) = argmin_g { h(x,theta)*g }   (subject to the restrictions)
Since h(x,theta) depends on theta, the optimal g that minimizes h(x,theta)*g must be a function of theta.
My approach so far has been to first call constrOptim(.) to minimize h(x,theta)*g subject to the restrictions, and then to plug that optimal g(.) into the outer function of theta. With this, I use optim(.) to try to maximize log[f(theta)*g(theta)].
Here is my code:
First, the function problem_min takes theta as input and returns the optimal g(theta), obtained as the solution (sol_min) of the constrained minimization problem.
problem_min <- function(theta_est){
  reg_theta_int <- reg_matrix %*% theta_est
  for(i in 1:(n*T)){
    for (j in 1:length(alpha)){
      L_1[i,j] = exp(alpha[j] + reg_theta_int[i]) / (1 + exp(alpha[j] + reg_theta_int[i]))
    }
  }
  y_tilde <- full_data$y - L_1[,1]
  L_1_tilde <- L_1[,-1] - L_1[,1]
  eval_funct_g <- function(g){
    return((sum((y_tilde - L_1%*%g)^2)) * 0.5 * (1/n) * (1/T))
  }
  sol_min <- constrOptim(theta = g_int, f = eval_funct_g, grad = NULL, ui = R, ci = r, mu = 1e-04,
                         method = "Nelder-Mead",
                         outer.iterations = 100, outer.eps = 1e-05,
                         hessian = FALSE)
  g_theta = c(1 - sum(sol_min$par), sol_min$par)
  return(g_theta)
}
Once I have the optimal g(theta), which is a vector of numbers, I plug it into log[f(theta)*g(theta)] and maximize the whole expression using optim(.):
funct_f_g <- function(theta_est){
  full_data$reg_theta_est <- reg_matrix %*% theta_est
  for(i in 1:n){
    for (j in 1:length(alpha)){
      for(t in 1:T){
        product[t] = exp(full_data$y[full_data$t==t & full_data$id==i] * (alpha[j] + full_data$reg_theta_est[full_data$t==t & full_data$id==i])) /
          (1 + exp(alpha[j] + full_data$reg_theta_est[full_data$t==t & full_data$id==i]))
      }
      L_ml[i,j] = prod(product)
    }
  }
  return(sum(log(L_ml %*% problem_min(theta_est))))
}
sol_ml <- optim(par = theta_int, fn = funct_f_g, method = "Nelder-Mead",
                lower = -Inf, upper = Inf,
                control = list(fnscale = -1),
                hessian = FALSE)
theta_opt <- sol_ml$par
sol_ml is intended to solve Max_theta log[f(theta)*g(theta)] while incorporating the fact that g(theta) must itself have been chosen optimally.
A variable ending in _int denotes an initial value.
When I run the previous code, R tells me that the objective function in optim(.) cannot be evaluated. Nevertheless, when I evaluate funct_f_g at some given theta_est it runs perfectly. Thus I think something is wrong with how I am telling optim(.) that the problem has this nested structure.
If you have a different way to approach my problem, or an explanation of what I am doing incorrectly, that would be great!
I know that I am not describing all the matrices and operations involved in the problem. I skip this for simplicity, hoping that the general structure can still be understood.
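One way to diagnose this kind of failure, sketched below with a wrapper name and penalty value of my own choosing: wrap the objective in tryCatch() so that a failing evaluation reports the offending theta_est instead of aborting optim().
# Sketch: report where the evaluation breaks down, then return a large penalty
# (with control = list(fnscale = -1) this point is strongly disfavoured).
funct_f_g_safe <- function(theta_est){
  tryCatch(
    funct_f_g(theta_est),
    error = function(e){
      message("funct_f_g failed at theta_est = ",
              paste(round(theta_est, 4), collapse = ", "),
              ": ", conditionMessage(e))
      -1e10
    }
  )
}
# sol_ml <- optim(par = theta_int, fn = funct_f_g_safe, method = "Nelder-Mead",
#                 control = list(fnscale = -1))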
I was given the following assignment: Write a function, named pdice, to simulate a weighted die that has probability of landing on 1:6 of (p1,p2,p3,p4,p5,p6) respectively. You can simulate this by using the parameter prob = p in the sample function, where p is a vector of non-negative numbers of length 6 (at least one needs to be > 0). It will use p/sum(p) as the probabilities. This function has arguments p, n. Check ?sample to find out what conditions you need to check for p. Your function should generate an error message if these conditions are not met. Below is my code so far, which runs, though it gives back a warning about coercing an argument of type 'double' to logical.
pdice <- function(n, p){
  weightofDies <- c(1/40, rep(4/40, 4), 23/40)
  roll <- sample(1:6, size = n, replace = TRUE, prob = weightofDies)
  if(n > 0 && all(p=1)) {
    return(roll)
  }
  else {
    print("Error, Conditions Not Met")
  }
}
I'm confused that the question says to use the parameter prob = p but then defines p as a vector of non-negative numbers of length 6. How can a probability be a vector? So for the conditions, I only checked that the number of rolls (n) is greater than zero, and for p I tried to make sure the probabilities in this "vector" added up to one. However, I'm not sure my process so far is correct. I created my biased probabilities via weightofDies.
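For what it's worth, the warning most likely comes from all(p=1): that passes the number 1 (rather than the logical comparison p == 1) to all(), which then coerces a double to logical. Below is a sketch of how the checks suggested by ?sample could look; the error messages are my own wording.
pdice <- function(n, p){
  # conditions from ?sample: prob must be the right length, non-negative,
  # and contain at least one positive entry; n must be a positive count
  if (length(p) != 6) stop("p must be a vector of length 6")
  if (any(p < 0) || !any(p > 0)) stop("p must be non-negative with at least one entry > 0")
  if (!is.numeric(n) || n <= 0) stop("n must be a positive number of rolls")
  sample(1:6, size = n, replace = TRUE, prob = p)   # sample() rescales p to p/sum(p)
}
# example call with the biased weights from above
pdice(10, c(1/40, rep(4/40, 4), 23/40))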
I am calibrating a model, and for that I have to estimate one parameter for each input combination I feed to the objective function. I have a bit more than 10k input combinations and want to find the parameter value that minimizes the error for each combination. All other variables in the model are known. I managed to estimate a single minimal value for the whole set, but that doesn't help me, and when I tried my approach for each combination separately I get the error: Error in mP[, logik] <- mPv[, logik, drop = FALSE] : NAs are not allowed in subscripted assignments.
My objective function looks like this
x_vol <- vector(mode = "double", length = 10776)
objective_function_vol <- function(x_vol){
  S <- calibration_set$index_level
  K <- calibration_set$strike
  tau <- calibration_set$tau
  r <- calibration_set$riskfree_rate
  q <- calibration_set$q
  model_prices_vol <- vector(mode = "double", length = 10776)
  for (i in 1:10776){
    model_prices_vol[i] <- hestonCallcf(S = S[i], K = K[i], t = tau[i],
                                        r = r[i], q = 0,
                                        v0 = x_vol[i],
                                        vbar = 0.1064688, rho = -0.9914710,
                                        a = 1.6240300, vvol = 0.98839192)
    print(i)
  }
  diff_sq <- (market_price - model_prices_vol)^2
  wdiff <- diff_sq/market_price
  error <- sum(wdiff)/10776
  return(error)
}
I am using NMOF::DEopt for the optimization. Would it be possible to write a second loop that stores the optimal values of x_vol? I think using the subscript i for both the known input values and the unknown parameter is somehow wrong.
The error means that some objective-function calls resulted in NA.
If you are only optimizing over a single parameter (i.e. a scalar) at a time, Differential Evolution is probably not the method you want. A grid search along one dimension, possibly with refinements, would likely work better; a per-combination one-dimensional search is sketched below.
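A sketch of that idea, solving one small one-dimensional problem per row instead of a single 10776-dimensional one. It uses stats::optimize() rather than a literal grid search, re-uses hestonCallcf, calibration_set and market_price from the question (market_price is assumed to be a vector of the 10776 observed prices), and the search interval c(1e-6, 2) for v0 is my own guess and may need adjusting.
x_vol_opt <- vector(mode = "double", length = 10776)
S   <- calibration_set$index_level
K   <- calibration_set$strike
tau <- calibration_set$tau
r   <- calibration_set$riskfree_rate
for (i in 1:10776){
  # weighted squared pricing error for row i only, as a function of v0 alone
  obj_i <- function(v0){
    model_price <- hestonCallcf(S = S[i], K = K[i], t = tau[i], r = r[i], q = 0,
                                v0 = v0,
                                vbar = 0.1064688, rho = -0.9914710,
                                a = 1.6240300, vvol = 0.98839192)
    (market_price[i] - model_price)^2 / market_price[i]
  }
  x_vol_opt[i] <- optimize(obj_i, interval = c(1e-6, 2))$minimum
}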
I am simulating an SIR model in R. I have a data set that I am trying to fit the model to. Right now I am using the particle filter function, and then I would like to use the corresponding logLik method on the result. When I do this, I get "[1] -Inf" as a result. I can't find in the documentation why this happens or how I can avoid it. Are my parameters for the model not accurate enough? Is something else wrong?
My function looks like this:
SIRsim %>%
pfilter(Np=5000) -> pf
logLik(pf)
The R script below comes from an online course lesson entitled "Likelihood for POMPs" (https://kingaa.github.io/sbied/pfilter/). However, the code works here... I'm not sure how to reproduce my specific problem with it, and unfortunately I cannot share the dataset or code I am using because it is for academic research.
library(tidyverse)
library(pomp)
options(stringsAsFactors=FALSE)
stopifnot(packageVersion("pomp")>="3.0")
set.seed(1350254336)
sir_step <- Csnippet("
double dN_SI = rbinom(S,1-exp(-Beta*I/N*dt));
double dN_IR = rbinom(I,1-exp(-mu_IR*dt));
S -= dN_SI;
I += dN_SI - dN_IR;
R += dN_IR;
H += dN_IR;
")
sir_init <- Csnippet("
S = nearbyint(eta*N);
I = 1;
R = nearbyint((1-eta)*N);
H = 0;
")
dmeas <- Csnippet("
lik = dbinom(reports,H,rho,give_log);
")
rmeas <- Csnippet("
reports = rbinom(H,rho);
")
read_csv("https://kingaa.github.io/sbied/pfilter/Measles_Consett_1948.csv") %>%
select(week,reports=cases) %>%
filter(week<=42) %>%
pomp(
times="week",t0=0,
rprocess=euler(sir_step,delta.t=1/7),
rinit=sir_init,
rmeasure=rmeas,
dmeasure=dmeas,
accumvars="H",
statenames=c("S","I","R","H"),
paramnames=c("Beta","mu_IR","eta","rho","N"),
params=c(Beta=15,mu_IR=0.5,rho=0.5,eta=0.06,N=38000)
) -> measSIR
measSIR %>%
pfilter(Np=5000) -> pf
logLik(pf)
library(doParallel)
library(doRNG)
registerDoParallel()
registerDoRNG(652643293)
foreach (i=1:10, .combine=c) %dopar% {
measSIR %>% pfilter(Np=5000)
} -> pf
logLik(pf) -> ll
logmeanexp(ll,se=TRUE)
If I set Beta=100 in the code above I can get a negative-infinite log-likelihood.
Replacing the measurement-error snippet with this:
dmeas <- Csnippet("
double ll = dbinom(reports,H,rho,give_log);
lik = (!isfinite(ll) ? -1000 : ll );
")
appears to 'solve' the problem, although you should be a little bit careful; papering over numerical cracks like this is sometimes OK, but could conceivably come back to bite you in some way later on. If you just need to avoid non-finite values long enough to get into a reasonable parameter range this might be OK ...
Some guesses as to why this is happening:
you are somehow getting an "impossible" situation like a positive number of reported cases when the underlying true number of infections is zero.
Sometimes non-finite log-likelihoods occur when a very small positive probability underflows to zero. The equivalent here is likely that the probability of infection 1-exp(-Beta*I/N*dt) goes to 1.0; then any observed outcome where less than 100% of the population is infected is impossible.
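As a rough illustration, with numbers of my own choosing rather than values taken from the filter output:
1 - exp(-100 * 20000/38000 * (1/7))   # ~0.9995: with Beta = 100 and I around 20000 of N = 38000,
                                      # essentially every remaining susceptible is infected in one step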
You can try to diagnose the situation by seeing what the filtered trajectory actually looks like and comparing it with the data, or by adding debugging statements to the code. If there's a way to run just the deterministic simulation with your parameter values that might tell you pretty quickly what's going wrong.
An easier/more direct way to debug would be to replace the Csnippet you're using for dmeas with an R function: this will be slower but easier to work with (especially if you're not familiar with C coding). If you uncomment the browser() statement below, the code will drop into debug mode when you encounter the bad situation ...
dmeas <- function(reports,H,rho,log, ...) {
lik <- dbinom(reports,size=H,prob=rho,log=log)
if (!is.finite(lik)) {
lik <- -1000
## browser()
}
return(lik)
}
For example, here are the argument values at the point where the non-finite likelihood occurs:
(t = 3, reports = 2, S = 2280, I = 0, R = 35721, H = 0, Beta = 100,
mu_IR = 0.5, rho = 0.5, eta = 0.06, N = 38000, log = TRUE)
Browse[1]> debug at /tmp/SO65554258.R!ZlSILG#7: return(lik)
Browse[2]> reports
[1] 2
Browse[2]> H
[1] 0
Browse[2]> rho
[1] 0.5
This shows that the problem is indeed that you have a positive number of reported cases when there have been zero infections ... R is trying to compute the binomial probability of observing reports cases out of H potentially reportable infections, each reported with probability rho. When the number of trials N in a binomial probability Binom(N,p) is zero, the only possible outcome is zero 'successes' (reported cases), with probability 1. All other outcomes have probability 0 (and log-probability -Inf).
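You can see this directly in R:
dbinom(2, size = 0, prob = 0.5, log = TRUE)   # -Inf: 2 reported cases out of 0 infections is impossible
dbinom(0, size = 0, prob = 0.5, log = TRUE)   # 0: zero reports is the only outcome with positive probability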
The following code was presented by our teacher in a practical lesson. I have questions about the last line of the code, the optimization step using the function optim(), but I include the other lines to give context.
population <- read.csv("data.csv", header=FALSE, sep=';', dec=',')
and the data looks like:
1 8.29
2 5.37
3 10.61
4 5.92
5 14.99
6 9.74
7 15.47
.
.
.
We sample 100 elements from the population as
Sampled_Data <- sample(population$V1, 100)
then we write a function to calculate the likelihood "of the log-normal distribution" (in fact, its log-likelihood) for the sampled elements as
MyFunc <- function(Myparameters, data){
  Firstparameter <- Myparameters[1]
  Secondparameter <- Myparameters[2]
  n <- length(data)
  Mydistribution <- -n/2*log(2*pi*(Secondparameter^2)) - sum(log(data)) - (1/(2*Secondparameter^2))*sum((log(data)-Firstparameter)^2)
  return(Mydistribution)
}
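As a quick sanity check (my own addition, not part of the lesson): the expression above is just the log-normal log-likelihood written out by hand, so it should match R's built-in density exactly.
set.seed(1)
x <- rlnorm(100, meanlog = 2, sdlog = 0.5)
MyFunc(c(2, 0.5), data = x)                            # hand-written log-likelihood
sum(dlnorm(x, meanlog = 2, sdlog = 0.5, log = TRUE))   # same value from dlnorm()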
Finally, we use the function optim() to estimate the two parameters of the distribution by maximum likelihood:
optimisation <- optim(c(1,1),MyFunc,data=Sampled_Data)
In the call to optim(), I don't understand why he added the vector c(1,1): from the documentation, that slot should be filled with the initial values of the parameters. Does he assume that the initial values are 1 and 1? If so, on what basis do we choose the initial values?
Also, why did he add data=Sampled_Data when there is no such argument in the documentation? According to the documentation, after the function we should add things like the gradient, the method and the bounds, but not our data!
Finally, if I want to specify lower and upper bounds, it is not clear to me which values to use in my case with the log-normal distribution.
I wasn't sure whether to post this here or on Cross Validated, but I saw similar questions here. If this is not a suitable place, I will delete the question.
Unnamed arguments to functions in R are assigned in the order that they appear in the definition.
If we look at help with help(optim):
Usage
optim(par, fn, gr = NULL, ...,
method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN",
"Brent"),
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE)
We will see that the first argument by default is par. Therefore, you are correct that your instructor has set par = c(1,1) and fn = MyFunc. You will also see in help(optim) that you can set the upper = and lower = arguments as well.
If you look back at the definition of MyFunc, you will see that its first argument is Myparameters. Therefore, when optim is called, c(1,1) is passed to MyFunc as its first argument. Thus, on the initial step, both Firstparameter and Secondparameter will be set to 1.
Finally, if you look carefully at help(optim):
Usage
optim(par, fn, gr = NULL, ...,
method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN",
"Brent"),
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE)
Arguments
par Initial values for the parameters to be optimized over.
fn A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.
gr A function to return the gradient for the "BFGS", "CG" and "L-BFGS-B" methods. If it is NULL, a finite-difference approximation will be used.
... Further arguments to be passed to fn and gr.
You will see that ... stands for further arguments to be passed to fn, and thus to MyFunc. In this case, data=Sampled_Data is passed through ... to MyFunc's data argument.
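To make the argument matching explicit, and to show how bounds could be specified: everything beyond par, fn and data below is my own addition, not part of the instructor's call. Since MyFunc returns a log-likelihood, maximizing it requires control = list(fnscale = -1), box constraints require method = "L-BFGS-B", and the only natural bound for the log-normal is that the second parameter (sigma) stays positive.
optimisation <- optim(par     = c(1, 1),            # initial guesses for (mu, sigma); fairly arbitrary
                      fn      = MyFunc,
                      data    = Sampled_Data,       # passed through ... to MyFunc's data argument
                      method  = "L-BFGS-B",         # needed when lower/upper bounds are supplied
                      lower   = c(-Inf, 1e-6),      # mu unbounded, sigma bounded away from zero
                      upper   = c(Inf, Inf),
                      control = list(fnscale = -1)) # maximize the log-likelihood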