Constrained optimization in R

I am trying to use constrOptim (http://rss.acs.unt.edu/Rdoc/library/stats/html/constrOptim.html) to do optimization in R with some given linear constraints, but I am not able to figure out how to set up the problem.
For example, I need to maximize $f(x,y) = \log(x) + \frac{x^2}{y^2}$ subject to the constraints $g_1(x,y) = x+y < 1$, $g_2(x,y) = x > 0$ and $g_3(x,y) = y > 0$. How do I do this in R? This is just a hypothetical example; do not worry about its structure. I am only interested in how to set this up in R.
Thanks!

Setting up the function was trivial:
fr <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  -(log(x1) + x1^2/x2^2)  # negated, since constrOptim is a minimization routine
}
Setting up the constraint matrix was problematic due to sparse documentation, so I resorted to experimentation. The help page says "The feasible region is defined by ui %*% theta - ci >= 0". So I tested, and this seemed to "work":
> rbind(c(-1, -1), c(1, 0), c(0, 1)) %*% c(0.99, 0.001) - c(-1, 0, 0)
[,1]
[1,] 0.009
[2,] 0.990
[3,] 0.001
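A point outside the region, e.g. c(0.7, 0.5) where x + y = 1.2 violates x + y < 1, correspondingly yields a negative first entry:
> rbind(c(-1, -1), c(1, 0), c(0, 1)) %*% c(0.7, 0.5) - c(-1, 0, 0)
     [,1]
[1,] -0.2
[2,]  0.7
[3,]  0.5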
So I put in a row for each constraint/boundary:
constrOptim(c(0.99, 0.001), fr, NULL,
            ui = rbind(c(-1, -1),   # the -x - y > -1
                       c( 1,  0),   # the x > 0
                       c( 0,  1)),  # the y > 0
            ci = c(-1, 0, 0))       # the thresholds
For this problem there is a potential difficulty: for all feasible values of x, the function goes to Inf as y -> 0. I do get a max around x = 0.95 and y = 0 even when I push the starting values out to the "corner", but I'm somewhat suspicious that this is not the true maximum, which I would have guessed was in the "corner".
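A quick check with fr confirms the suspicion: as y shrinks toward 0 with x fixed, fr (the negated objective) decreases without bound, so log(x) + x^2/y^2 -> Inf and the corner at y = 0 is a pole rather than a finite maximum:
> sapply(10^-(1:5), function(y) fr(c(0.99, y)))  # roughly -98, -9801, -980100, ... increasingly negative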
EDIT:
Pursuing this I reasoned that the gradient might provide additional "direction" and added a gradient function:
grr <- function(x) {  ## Gradient of 'fr'
  x1 <- x[1]
  x2 <- x[2]
  c(-(1/x1 + 2 * x1/x2^2),
    2 * x1^2 / x2^3)
}
This did "steer" the optimization a bit closer to the c(.999..., 0) corner, instead of moving away from it, as it did for some starting values. I remain somewhat disappointed that the process seems to "head for the cliff" when the starting values are close to the center of the feasible region:
constrOptim(c(0.99, 0.001), fr, grr,
            ui = rbind(c(-1, -1),   # the -x - y > -1
                       c( 1,  0),   # the x > 0
                       c( 0,  1)),  # the y > 0
            ci = c(-1, 0, 0))
$par
[1] 9.900007e-01 -3.542673e-16
$value
[1] -7.80924e+30
$counts
function gradient
2001 37
$convergence
[1] 11
$message
[1] "Objective function increased at outer iteration 2"
$outer.iterations
[1] 2
$barrier.value
[1] NaN
Note: Hans Werner Borchers posted a better example on R-Help that succeeded in getting the corner values by setting the constraint slightly away from the edge:
> constrOptim(c(0.25,0.25), fr, NULL,
ui=rbind( c(-1,-1), c(1,0), c(0,1) ),
ci=c(-1, 0.0001, 0.0001))
$par
[1] 0.9999 0.0001

Related

Minimising area under the ROC curve to optimise the parameters of a polynomial predictor with optim

My predictor (x) has a U-shaped distribution in relation to the binary outcome (y), with positive outcomes at both low and high values of x, leading to a biconcave ROC curve with a poor area under the curve (AUC).
To maximise its ability to discriminate the outcome, I am trying to optimise the parameters of a second-degree polynomial of x, using optim with 1 - AUC as the cost function to minimise.
library(pROC)  # provides roc()

x = c(13,7,7,7,1,100,3,4,4,2,2,7,14,8,3,14,5,12,8,
      13,9,4,9,4,8,3,13,9,4,4,5,9,10,10,7,6,12,7,2,
      6,6,4,3,2,3,10,5,2,5,8,3,5,4,2,7,5,7,6,79,9)
y = c(0,0,1,0,0,1,0,0,1,1,0,1,0,0,0,0,0,0,0,0,
      0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
      0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0)
theta = c(0, 0, 0)
min_auc <- function(theta, x, y) {
  (1 - roc(y, (theta[1] + theta[2]*x + theta[3]*x^2))$auc)
}
optim(theta, min_auc, x = x, y = y)
The results are as follows:
$par
[1] 0.0 0.1 0.0
$value
[1] 0.4380054
$counts
function gradient
8 NA
$convergence
[1] 0
$message
NULL
However, from a manual definition of the parameters, I know that min_auc can be further minimised.
theta = c(0, -40, 1)
(1 - roc(y, (theta[1] + theta[2]*x + theta[3]*(x^2)))$auc)
[1] 0.2762803
Could anyone explain to me what I am doing wrong, please? Is it possibly due to a non-convex cost function?
One possibility is it's a collinearity problem. Scaling the inputs helps:
min_auc <- function(theta, x, y) {
  (1 - roc(y, (theta[1] + theta[2]*scale(x) + theta[3]*scale(x)^2))$auc)
}
optim(theta, min_auc, x = x, y = y)
# $par
# [1] -0.02469136 -0.03117284 0.11049383
#
# $value
# [1] 0.2762803
#
# $counts
# function gradient
# 30 NA
#
# $convergence
# [1] 0
#
# $message
# NULL
#
Another potential problem is that the surface over which you're optimizing has some flat spots. Let's say, for example, that we fix the intercept in this equation to -2, which is about what you get from qlogis(mean(y)). Then you're only optimizing over 2 parameters, so the surface is easier to see. Here's what it looks like with the two remaining theta terms on the two horizontal axes and the 1 - AUC value on the vertical axis:
min_auc <- function(theta, x, y) {
  (1 - roc(y, (-2 + theta[1]*scale(x) + theta[2]*scale(x)^2))$auc)
}
s <- seq(-.25, .25, length=50)
o <- outer(s, s, Vectorize(function(z,w)min_auc(c(z,w),x,y)))
library(plotly)
plot_ly(x = ~s, y = ~s, z = ~o) %>% add_surface()
As you may have noticed above, there is no unique solution to the problem. There are lots of solutions that seem to get to the minimum value.
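One way to see the non-uniqueness: the AUC depends only on the ordering of the fitted scores, so rescaling the coefficients by any positive constant leaves the objective unchanged. A quick check with the scaled min_auc from above (roc() from pROC, as before):

# with a zero intercept, doubling the coefficients doubles every score --
# a monotone transformation that leaves the ranking, and hence the AUC, intact
min_auc(c(0, -40, 1), x, y)
min_auc(c(0, -80, 2), x, y)   # identical value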

Minimizing an objective function and identifying vector values

I have to minimize the following formula in R:
$\min Z = \sum_i x_i \cdot \left[ C \cdot \frac{\sum_i x_i B_i}{\sum_i x_i} \right]$
$B_i$ and $C$ are given constants, and $x$ is the vector that needs to be identified.
Each $x_i$ also has its own box constraints, for example:
10 < x1 < 1500
12 < x2 < 1502
...
I have tried using optim; however, I am having trouble with how to approach this problem in R. Can anybody give me any ideas to solve this? Thanks.
EDIT: I have tried the following toy function (a minimal reproducible example):
f <- function(x) {
  (x[1] + x[2]) * 12 * (50 - (x[1]*10 + x[2]*15) / (x[1] + x[2]))
}
p <- array(c(1, 0), dim = c(2, 1))
nlm(f, p)
I haven't added the constraints yet, as I am not sure how to do so at the moment. But overall, I am guessing there could be a better way to do this; the real x vector has roughly 50 or more values.
optim with L-BFGS-B supports bound constraints:
optim(c(1, 0), f, lower = c(10, 12), upper = c(1500, 1502), method = "L-BFGS-B")
giving:
$par
[1] 10 12
$value
[1] 9840
$counts
function gradient
1 1
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

R optimize multiple parameters

I am using the R optim() function to estimate a set of parameters which optimize the user-defined function shown below, but the optim() output is:
Error in optim(pstart, llAgedepfn, method = "L-BFGS-B", upper = up, lower = lo) :
L-BFGS-B needs finite values of 'fn'
Please help. The complete script is shown below:
dataM<-cbind(c(1.91,0.29,0.08,0.02,0.01,0.28,0.45,0.36,0.42,0.17,0.16,0.06,0.17,0.17,0.12),
c(0.27,4.54,0.59,0.05,0.04,0.13,0.48,0.68,0.66,0.18,0.11,0.06,0.08,0.08,0.08),
c(0.07,0.57,4.48,0.48,0.02,0.05,0.09,0.43,0.78,0.52,0.17,0.10,0.05,0.05,0.14),
c(0.02,0.04,0.44,4.34,0.36,0.09,0.07,0.11,0.41,0.77,0.43,0.10,0.03,0.04,0.14),
c(0.01,0.04,0.01,0.36,2.20,0.46,0.19,0.15,0.19,0.34,0.62,0.30,0.09,0.03,0.22),
c(0.22,0.11,0.05,0.09,0.45,0.91,0.61,0.43,0.37,0.26,0.41,0.63,0.29,0.16,0.15),
c(0.31,0.35,0.07,0.05,0.16,0.54,0.81,0.59,0.48,0.36,0.33,0.43,0.47,0.26,0.20),
c(0.22,0.45,0.29,0.08,0.11,0.34,0.53,0.85,0.71,0.39,0.27,0.26,0.26,0.28,0.38),
c(0.22,0.36,0.44,0.26,0.12,0.24,0.36,0.59,0.91,0.61,0.35,0.28,0.20,0.22,0.29),
c(0.09,0.10,0.30,0.49,0.22,0.17,0.28,0.33,0.62,0.80,0.52,0.29,0.20,0.11,0.46),
c(0.10,0.07,0.12,0.32,0.48,0.32,0.30,0.27,0.42,0.61,0.78,0.47,0.33,0.23,0.49),
c(0.04,0.04,0.06,0.08,0.24,0.53,0.41,0.28,0.36,0.36,0.50,0.67,0.51,0.19,0.47),
c(0.10,0.05,0.04,0.02,0.07,0.23,0.43,0.26,0.23,0.23,0.33,0.48,0.75,0.51,0.49),
c(0.05,0.04,0.03,0.05,0.02,0.10,0.19,0.22,0.21,0.10,0.18,0.14,0.40,0.79,0.82),
c(0.03,0.02,0.03,0.03,0.06,0.04,0.06,0.12,0.11,0.18,0.16,0.14,0.16,0.34,1.26)
)
NormCM <- dataM/eigen(dataM)$values[1] # normalize the contact matrix: divide by its largest eigenvalue
w <- c(495,528,548,603,617,634,720,801,957,937,798,755,795,1016,2469)
g2 <- c(770,622,726,559,410,547,564,472,399,397,340,308,337,91,84)
h2 <- c(269,426,556,430,271,284,303,207,194,181,126,106,74,24,23)
z2 <- h2/g2
g1 <- c(774,527,665,508,459,539,543,492,402,412,365,342,213,146,152)
h1 <- c(56,31,84,173,103,85,123,70,71,80,55,25,18,12,26)
z1 <- h1/g1
#### Normal log-likelihood #########
llnormfn <- function(q) {
  tol <- 1e-9
  final.size.start <- 0.8
  zeta <- rep(final.size.start, nrow(NormCM))
  last.zeta <- rep(0, nrow(NormCM))
  current.diff <- tol + 1
  while (current.diff > tol) {  # fixed-point iteration for the final sizes
    zeta <- 1 - exp(-(q * (zeta %*% NormCM)))
    current.diff <- sum(abs(last.zeta - zeta))
    last.zeta <- zeta
  }
  mu <- c(zeta)
  zigma <- z1*(1 - z1)/g1 + (z1 + mu)*(1 - (z1 + mu))/g2
  # NB: '/2*zigma' divides by 2 and then multiplies by zigma;
  # a normal density term would usually read /(2*zigma)
  logliknorm <- -sum((((z2 - z1) - mu)^2)/2*zigma + 0.5*log(2*pi*zigma))
  return(logliknorm)
}
pstart <- rep(1, 15)
up <- rep(5, 15)
lo <- rep(0.1, 15)
estm <- optim(pstart, llnormfn, method = "L-BFGS-B", upper = up, lower = lo )
Your llnormfn doesn't return a finite value for all values of its parameters within the range. For example at the upper limit:
> llnormfn(up)
[1] NaN
Warning message:
In log(2 * pi * zigma) : NaNs produced
That's because zigma is negative here, so log(2*pi*zigma) is NaN.
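You can see this directly by running the fixed-point iteration from llnormfn by hand (a quick diagnostic reusing the objects defined in the question):

q <- up                                   # the offending upper bound
zeta <- rep(0.8, nrow(NormCM))
for (k in 1:200) zeta <- 1 - exp(-(q * (zeta %*% NormCM)))
mu <- c(zeta)                             # close to 1 when q is large
zigma <- z1*(1 - z1)/g1 + (z1 + mu)*(1 - (z1 + mu))/g2
any(zigma < 0)                            # expect TRUE: z1 + mu exceeds 1, so the
                                          # second term is negative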
If you restrict the range a bit you can eventually find a spot where it does work...
> llnormfn(up-2)
[1] NaN
Warning message:
In log(2 * pi * zigma) : NaNs produced
> llnormfn(up-3)
[1] 42.96818
Let's check it works at the lower range:
> llnormfn(lo)
[1] 41.92578
That looks fine. So either you've set that upper limit outside the computationally valid range of your function, or you've got a bug in your llnormfn function, or both, or something else.
If you do run the optimisation with a reduced upper bound you do get convergence:
> estm <- optim(pstart, llnormfn, method = "L-BFGS-B", upper = up-3, lower = lo )
> estm
$par
[1] 1.9042672 1.0891264 0.9916916 0.6208685 1.2413983 1.4822433 1.1243878
[8] 1.5224263 1.3686933 1.4876350 1.6231518 2.0000000 2.0000000 2.0000000
[15] 2.0000000
$value
[1] 38.32182
$counts
function gradient
23 23
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
Although you might notice some of those parameters are at the upper value (2.0) which is an alarm bell.
Check that your function behaves sensibly for its input values - try fixing all but one parameter and plotting how llnormfn behaves while varying the remaining one. I just had a quick look and the function does not look smooth at all, with lots of discontinuities, so I doubt BFGS is a good method for optimising it.
e.g. varying the fifth parameter between 0.1 and 2:
> s <- seq(0.1, 2, len = 300)
> ss <- sapply(seq_along(s), function(i) { ll <- lo; ll[5] <- s[i]; llnormfn(ll) })
> plot(s, ss)
which gives a scatterplot of llnormfn along that slice (image omitted); the discontinuities are plainly visible.

Constraint Optimization with one parameter included in the constraint of the other

I want to find the Theta and Sigma that maximize the function implemented as EPPF in the code below.
The constraints are:
Theta > -Sigma
-1 < Sigma < 1
One of my problems is that I don't know how to deal with the fact that one parameter appears in the constraint of the other parameter that I want to optimize over.
I tried with optim(), constrOptim and dfoptim!
Using optim():
k <- 8
i <- 1:(k - 1)
x <- c(5, 0.2)
n <- 24
nj <- c(3, 4, 8, 1, 1, 4, 2, 1)
EPPF <- function(x, n, nj) {
  y <- (x[1] + 1):(x[1] + 1 + (n - 1) - 1)
  z <- -(prod(x[1] + i*x[2]) / prod(y) * prod(sapply(nj, hfun)))
  return(z)
}
hfun <- function(p) {
  # note: this reads the global x[2], not the parameter vector passed to EPPF
  h <- (1 - x[2]):((1 - x[2]) + p - 1)
  prod(h)
}
> optim(c(6,0.3), fn=EPPF,method = "L-BFGS-B", n=n,nj=nj, lower = c(-x[1],-1), upper = c(Inf,1))
$par
[1] 6.0 0.3
$value
[1] -1.258458e-15
$counts
function gradient
2 2
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
I also tried using a constructor function:
make.EPPF <- function(n, nj, fixed = c(FALSE, FALSE)) {
  params <- fixed
  function(p) {
    hfun <- function(y) {
      h <- (1 - sigma):((1 - sigma) + y - 1)
      prod(h)
    }
    params[!fixed] <- p
    theta <- params[1]
    sigma <- params[2]
    y <- (theta + 1):(theta + 1 + (n - 1) - 1)
    z <- prod(theta + i*sigma) / prod(y) * prod(sapply(nj, hfun))
    z
  }
}
EPPF <- make.EPPF(n, nj)
> optim(c(theta=6, sigma=0.5), fn=EPPF,method = "L-BFGS-B",lower = c(-sigma,-1), upper = c(Inf,1))
Error in optim(c(theta = 6, sigma = 0.5), fn = EPPF, method = "L-BFGS-B", :
object 'sigma' not found
Using constrOptim():
> A <- matrix(c(1,1,0,1,0,-1),3,2,byrow=T)
> b <- c(0,-1,-1)
>
> constrOptim(c(3,0.3),EPPF,NULL,A,b, control=list(fnscale=-1))
$par
[1] 3.0 0.3
$value
[1] 9.712117e-16
$counts
[1] 0
$convergence
[1] 0
$message
NULL
$outer.iterations
[1] 1
$barrier.value
[1] 7.313452e-05
Using Package dfoptim:
> library(dfoptim)
> nmkb(x=c(6,0.3), EPPF, lower=c(-x[2],-1), upper=c(Inf, 1 ))
Error in par < lower :
comparison (3) is possible only for atomic and list types
Either there is for some reason no convergence, or there are other errors.
I am relatively new to programming and R and would really appreciate it if someone could help me. Thanks!
These are 3 linear inequality constraints:
sigma + theta > 0
sigma + 1 > 0
-sigma + 1 > 0
You can do this in maxLik. But note that maxLik maximizes the function, hence remove the '-' in front of z. Here is the code that works for me (using Rscript):
k <- 8
i <- 1:(k - 1)
x <- c(5, 0.2)
n <- 24
nj <- c(3, 4, 8, 1, 1, 4, 2, 1)
EPPF <- function(x, n, nj) {
  theta <- x[1]
  sigma <- x[2]
  y <- (theta + 1):(theta + 1 + (n - 1) - 1)
  z <- prod(theta + i*sigma) / prod(y) * prod(sapply(nj, hfun))
  z <- log(z)  # log for better-scaled numbers (see note below)
  return(z)
}
hfun <- function(p) {
  h <- (1 - x[2]):((1 - x[2]) + p - 1)
  prod(h)
}
library(maxLik)
constraints <- list(ineqA=matrix(c(1,0,0,1,1,-1),3,2), ineqB=c(0,1,1))
m <- maxBFGS(EPPF, start=c(6,0.3), constraints=constraints, n=n, nj=nj)
print(summary(m))
I also took the logarithm of the result, as this leads to more "human" numbers; otherwise you have to re-tune the stopping conditions. The answer seems to be -1, 1.
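For reference, maxLik encodes inequality constraints as ineqA %*% theta + ineqB >= 0; a quick sanity check of the matrix above at the start value:

A <- matrix(c(1, 0, 0, 1, 1, -1), 3, 2)  # rows encode theta+sigma, sigma+1, 1-sigma
A %*% c(6, 0.3) + c(0, 1, 1)             # 6.3, 1.3, 0.7 -- all positive, so the
                                         # starting point c(6, 0.3) is feasible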

Use of constraints in R

Simple one here; I once knew this but it has been lost over the years.
A simple equation, easy to code in R:
$f(x,y) = 2x^2 + 4y^2 + 6x - 8y + 15$
And I have the constraints x > 1 and y > -1.
I can't for the life of me remember how to write the constraints properly in R, and the book I have is no use.
Cheers for any help.
Looking for the minimum and maximum
Define your function that takes a single vector of arguments:
myfun <- function(xy) {
  x <- xy[1]
  y <- xy[2]
  2*x^2 + 4*y^2 + 6*x - 8*y + 15
}
Supply starting values to optim and specify your lower bounds for x and y:
starting_values <- c(0, 0)
optim(starting_values, myfun, lower=c(1, -1), method='L-BFGS-B')
optim output:
$par
[1] 1 1
$value
[1] 19
$counts
function gradient
2 2
$convergence
[1] 0
$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
