Linear optimization with R

I'm trying to maximize the following function over x[1] and x[2]:
(a - 1) * x[1] * c + (a - 1) * x[2] * b * d
Where a, b, c, d are known positive constants and
0 < x[1] < 1, 0 < x[2] < 1,
x[1] + x[2] = 1
Using NlcOptim, I did:
library(NlcOptim)
solver_1 <- function(a, b, c, d){
  obj = function(x){
    return((a - 1) * x[1] * c + (a - 1) * x[2] * b * d)
  }
  con = function(x){
    f = NULL
    f = rbind(f, x[1] + x[2] - 1)
    return(list(ceq = f, c = NULL))
  }
  x0 = c(1, 0)
  solnl(x0, objfun = obj, confun = con)
}
solver_1(1.2, 5.2, 0.8, 0.1)
But it gives me:
Error in if (norm(H, "I") == 0) { : missing value where TRUE/FALSE needed
Does anyone know what I'm doing wrong?

Using an optimization package is massive overkill.
Either by following up on @Roland's comment and doing some elementary calculus, or by using the corner point theorem of linear programming, your objective function is optimized when x[1] = 0 and x[2] = 1, or vice versa. Thus you only have to evaluate two constant expressions in a, b, c, d and pick the larger of the two:
max(c*(a-1), b*d*(a-1))
For a > 1, which corner wins is determined by whether or not c > b*d; for a < 1 the comparison reverses, since a - 1 is then negative.
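For instance, a version of solver_1 that skips the optimizer entirely and just evaluates the two corners might look like this (a minimal sketch; the corner labels are only illustrative):
solver_1 <- function(a, b, c, d) {
  corners <- c("x[1] = 1" = (a - 1) * c,      # all weight on x[1]
               "x[2] = 1" = (a - 1) * b * d)  # all weight on x[2]
  corners[which.max(corners)]
}
solver_1(1.2, 5.2, 0.8, 0.1)
# x[1] = 1
#     0.16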

Related

MathOptInterface.OTHER_ERROR when trying to use ISRES of NLopt through JuMP

I am trying to minimize a nonlinear function with nonlinear inequality constraints with NLopt and JuMP.
In my test code below, I am minimizing a function with a known global minimum.
Local optimizers such as LD_MMA fail to find this global minimum, so I am trying to use the global optimizers of NLopt that allow nonlinear inequality constraints.
However, when I check my termination status, it says termination_status(model) = MathOptInterface.OTHER_ERROR. I am not sure which part of my code to check for this error.
What could be the cause?
I am using JuMP since in the future I plan to use other solvers such as KNITRO as well, but should I rather use the NLopt syntax?
Below is my code:
# THIS IS A CODE TO SOLVE FOR THE TOY MODEL
# THE EQUILIBRIUM IS CHARACTERIZED BY A NONLINEAR SYSTEM OF ODEs OF INCREASING FUNCTIONS B(x) and S(y)
# THE GOAL IS TO APPROXIMATE B(x) and S(y) WITH POLYNOMIALS
# FIND THE POLYNOMIAL COEFFICIENTS THAT MINIMIZE THE LEAST SQUARES OF THE EQUILIBRIUM EQUATIONS

# load packages
using Roots, NLopt, JuMP

# model primitives and other parameters
k = .5           # equal split
d = 1            # degree of polynomial
nparam = 2*d+2   # number of parameters to estimate
m = 10           # number of grids
m -= 1
vGrid = range(0,1,m) # discretize values
c1 = 0 # lower bound for B'() and S'()
c2 = 2 # lower and upper bounds for offers
c3 = 1 # lower and upper bounds for the parameters to be estimated

# objective function to be minimized
function obj(α::T...) where {T<:Real}
    # split parameters
    αb = α[1:d+1]   # coefficients for B(x)
    αs = α[d+2:end] # coefficients for S(y)
    # define B(x), B'(x), S(y), and S'(y)
    B(v) = sum([αb[i] * v .^ (i-1) for i in 1:d+1])
    B1(v) = sum([αb[i] * (i-1) * v ^ (i-2) for i in 2:d+1])
    S(v) = sum([αs[i] * v .^ (i-1) for i in 1:d+1])
    S1(v) = sum([αs[i] * (i-1) * v ^ (i-2) for i in 2:d+1])
    # the equilibrium is characterized by the following first order conditions
    # FOCb(y) = B(k * y * S1(y) + S(y)) - S(y)
    # FOCs(x) = S(- (1-k) * (1-x) * B1(x) + B(x)) - B(x)
    function FOCb(y)
        sy = S(y)
        binv = find_zero(q -> B(q) - sy, (-c2, c2))
        return k * y * S1(y) + sy - binv
    end
    function FOCs(x)
        bx = B(x)
        sinv = find_zero(q -> S(q) - bx, (-c2, c2))
        return (1-k) * (1-x) * B1(x) - B(x) + sinv
    end
    # evaluate the FOCs at each grid point and return the sum of squares
    Eb = [FOCb(y) for y in vGrid]
    Es = [FOCs(x) for x in vGrid]
    E = [Eb; Es]
    return E' * E
end

# this is the actual global minimum
αa = [1/12, 2/3, 1/4, 2/3]
obj(αa...)

# do optimization
model = Model(NLopt.Optimizer)
set_optimizer_attribute(model, "algorithm", :GN_ISRES)
@variable(model, -c3 <= α[1:nparam] <= c3)
@NLconstraint(model, [j = 1:m], sum(α[i] * (i-1) * vGrid[j] ^ (i-2) for i in 2:d+1) >= c1)     # B should be increasing
@NLconstraint(model, [j = 1:m], sum(α[d+1+i] * (i-1) * vGrid[j] ^ (i-2) for i in 2:d+1) >= c1) # S should be increasing
register(model, :obj, nparam, obj, autodiff=true)
@NLobjective(model, Min, obj(α...))
println("")
println("Initial values:")
for i in 1:nparam
    set_start_value(α[i], αa[i]+rand()*.1)
    println(start_value(α[i]))
end
JuMP.optimize!(model)
println("")
@show termination_status(model)
@show objective_value(model)
println("")
println("Solution:")
sol = [value(α[i]) for i in 1:nparam]
My output:
Initial values:
0.11233072522513032
0.7631843020124309
0.3331559403539963
0.7161240026812674
termination_status(model) = MathOptInterface.OTHER_ERROR
objective_value(model) = 0.19116585196576466
Solution:
4-element Vector{Float64}:
0.11233072522513032
0.7631843020124309
0.3331559403539963
0.7161240026812674
I answered on the Julia forum: https://discourse.julialang.org/t/mathoptinterface-other-error-when-trying-to-use-isres-of-nlopt-through-jump/87420/2.
Posting my answer for posterity:
You have multiple issues:
1. range(0,1,m) should be range(0,1; length = m) (how did this work otherwise?). This applies to Julia 1.6; the range(start, stop, length) method was only added in Julia v1.8.
2. Sometimes your objective function errors because the root doesn't exist. If I run with Ipopt, I get:
ERROR: ArgumentError: The interval [a,b] is not a bracketing interval.
You need f(a) and f(b) to have different signs (f(a) * f(b) < 0).
Consider a different bracket or try fzero(f, c) with an initial guess c.
Here's what I would do:
using JuMP
import Ipopt
import Roots

function main()
    k, d, c1, c2, c3, m = 0.5, 1, 0, 2, 1, 10
    nparam = 2 * d + 2
    m -= 1
    vGrid = range(0, 1; length = m)
    function obj(α::T...) where {T<:Real}
        αb, αs = α[1:d+1], α[d+2:end]
        B(v) = sum(αb[i] * v^(i-1) for i in 1:d+1)
        B1(v) = sum(αb[i] * (i-1) * v^(i-2) for i in 2:d+1)
        S(v) = sum(αs[i] * v^(i-1) for i in 1:d+1)
        S1(v) = sum(αs[i] * (i-1) * v^(i-2) for i in 2:d+1)
        function FOCb(y)
            sy = S(y)
            binv = Roots.fzero(q -> B(q) - sy, zero(T))
            return k * y * S1(y) + sy - binv
        end
        function FOCs(x)
            bx = B(x)
            sinv = Roots.fzero(q -> S(q) - bx, zero(T))
            return (1-k) * (1-x) * B1(x) - B(x) + sinv
        end
        return sum(FOCb(x)^2 + FOCs(x)^2 for x in vGrid)
    end
    αa = [1/12, 2/3, 1/4, 2/3]
    model = Model(Ipopt.Optimizer)
    @variable(model, -c3 <= α[i=1:nparam] <= c3, start = αa[i] + 0.1 * rand())
    @constraints(model, begin
        [j = 1:m], sum(α[i] * (i-1) * vGrid[j]^(i-2) for i in 2:d+1) >= c1
        [j = 1:m], sum(α[d+1+i] * (i-1) * vGrid[j]^(i-2) for i in 2:d+1) >= c1
    end)
    register(model, :obj, nparam, obj; autodiff = true)
    @NLobjective(model, Min, obj(α...))
    optimize!(model)
    print(solution_summary(model))
    return value.(α)
end

main()

NaN values while computing the probability in binomial distribution

I would like to integrate the following function
riskFunction <- function(theta, n, r, s)
{
  risk <- 0
  for (j in 1:n)
  {
    risk <- risk + abs(theta - r * j - s) * dbinom(j, n, theta)
  }
  return(risk)
}
using the trapezoidal method on the interval [0, 1]. Here's my code:
trapeizodalMethod <- function(a, b, m, n, r, s)
{
  intValue <- 0
  h <- (b - a)/m
  for (i in 0:m-1)
  {
    intValue <- intValue + 0.5 * (riskFunction(a + i * h, n=n, r=r, s=s) + riskFunction(a + (i + 1) * h, n=n, r=r, s=s)) * h
  }
  return(intValue)
}
After calling trapeizodalMethod,
trapeizodalMethod(a=0, b=1, m=100, n=100, r=0.01, s=0)
more than 50 warnings occur: In dbinom(j, 100, theta) : NaN produced.
I have no idea what might have gone wrong. I would appreciate any hints or tips.
That warning arises when dbinom(x, size, prob, log = FALSE) has prob outside [0, 1]. In your case, theta = -0.01 occurs because the loop is not running as you expected.
The binary operator : has higher precedence than binary -. So for example 1:5-1 is evaluated as (1:5) - 1, not 1:(5 - 1). You want
trapeizodalMethod <- function(a, b, m, n, r, s)
{
  intValue <- 0
  h <- (b - a)/m
  for (i in 0:(m-1)) {
    #          ^^^^^
    intValue <- intValue + 0.5 * (riskFunction(a + i * h, n=n, r=r, s=s) + riskFunction(a + (i + 1) * h, n=n, r=r, s=s)) * h
  }
  return(intValue)
}
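A quick way to see the difference at the prompt (with m = 100 as in the call above):
m <- 100
head(0:m - 1)    # parsed as (0:m) - 1, so the index starts at -1
# [1] -1  0  1  2  3  4
head(0:(m - 1))  # starts at 0 as intended
# [1] 0 1 2 3 4 5
With h = 0.01, the stray i = -1 gives theta = a + (-1) * h = -0.01, which is exactly the invalid prob that dbinom warns about.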

Create a new probability distribution (relying on the previous r.v.) in R

I want to write this pdf and generate random numbers using it.
Suppose that X is a random variable that takes the values 0 or 1 only; the pdf is as follows:
P(X_t) = a^[(1-X_(t-1))*X_t] * (1-a)^[(1-X_(t-1))*(1-X_t)] * b^[X_(t-1)*(1-X_t)] * (1-b)^[X_(t-1)*X_t]
X_t: the current r.v., X_(t-1): the previous r.v., where t=1,2,...,T and the initial value at t=0 is given. Finally, a and b are two known probabilities.
Not sure if I completely understood, but you could possibly do as follows.
If we say 'y' is the previous realisation, and 'x' is the current one, then we have:
P(x=0|y=0) = 1-a
P(x=1|y=0) = a
P(x=0|y=1) = b
P(x=1|y=1) = 1-b
So then we can generate uniform variates U in [0, 1]: if y = 0, set x = 0 when U <= 1-a and x = 1 otherwise; if y = 1, set x = 0 when U <= b and x = 1 otherwise.
The following function may do the trick, where x0 is the initial value of x:
rhany <- function(n, a, b, x0 = 0) {
  sim <- c(x0, runif(n-1))
  for (i in 2:n){
    # x = 1 when U > 1-a given y = 0, and when U > b given y = 1
    sim[i] = (sim[i-1] == 0) * ((sim[i] <= 1-a) * 0 + (sim[i] > 1-a) * 1) +
             (sim[i-1] == 1) * ((sim[i] <= b) * 0 + (sim[i] > b) * 1)
  }
  sim
}
So then if you run the function:
rhany(10, 0.1, 0.7)
[1] 0 1 0 1 1 1 0 1 1 0
Admittedly, the for-loop slows down the function; it takes about 9 seconds on my machine to generate 1e7 variates. You could reimplement using the Rcpp package:
library(Rcpp)
cppFunction('NumericVector rhanya(double a, double b, NumericVector zs) {
  int n = zs.size();
  NumericVector sim = zs;
  for (int i = 1; i < n; i++) {
    // same update rule as rhany: x = 1 when U > 1-a given y = 0, U > b given y = 1
    sim[i] = (sim[i-1] == 0) * ((sim[i] <= 1-a) * 0 + (sim[i] > 1-a) * 1) +
             (sim[i-1] == 1) * ((sim[i] <= b) * 0 + (sim[i] > b) * 1);
  }
  return(sim);
}')
rhany1 <- function(n, a, b, x0 = 0) {
  sim <- c(x0, runif(n-1))
  rhanya(a, b, sim)
}
This function rhany1 takes less than 0.5 sec to generate 1e7 variates.
You can test that the two functions rhany and rhany1 give the same results when the same seed is set:
set.seed(123); bc <- rhany(1e7, 0.1, 0.7)
set.seed(123); ac <- rhany1(1e7, 0.1, 0.7)
all.equal(ac, bc)
[1] TRUE
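As an aside, since X_t depends only on X_(t-1), this is a two-state Markov chain, so the same simulation can also be written by drawing each step from the appropriate conditional probability directly. A sketch under that reading (rhany2 is a hypothetical name, not part of the code above):
rhany2 <- function(n, a, b, x0 = 0) {
  p1 <- c(a, 1 - b)  # P(x = 1 | y = 0) = a and P(x = 1 | y = 1) = 1 - b
  sim <- integer(n)
  sim[1] <- x0
  for (i in 2:n) {
    sim[i] <- rbinom(1, 1, p1[sim[i - 1] + 1])  # pick the row for y = sim[i-1]
  }
  sim
}
This won't reproduce rhany's draws for a given seed (it consumes the random numbers differently), but the chain has the same distribution.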

Mixture modeling - trouble with infinite values from exp() and log()

I'm writing a function for Gaussian mixture models with spherical covariance structures, i.e. $\Sigma_k = \sigma_k^2 I$. This particular function is similar to the mclust package with identifier VII.
http://en.wikipedia.org/wiki/Mixture_model
Anyways, the problem I'm having is running into infinite values for the weight matrix. Definition: let W be an n x m matrix, where the rows run over i = 1, ..., n (the number of obs) and the columns over m = 1, ..., m (the number of mixtures). Each element of W (i.e. w_im) can essentially be defined as a specific form of:
w_im = \frac{a / b * exp(c)}{\sum_{i=1}^m [a_i / b_i * exp(c_i)]}
Computing this numerically is giving me infinite values. So I'm trying to use the log identity log(x+y) = log(x) + log(1 + y/x). But the issue is that it's not as simple as log(x+y) but rather log(\sum_{i=1}^m [a_i / b_i * exp(c_i)]).
Here's some code; define:
n_im = a / b * exp(c);
d_m = \sum_{i=1}^m [a_i / b_i * exp(c_i)]; and
c_mat[i,j] as the value of the exponent for the [i,j]th term.
n_mat[, i] <- log(a[i]) - log(b[i]) - c[,i] # numerator of w_im
internal_vec1[i] <- (a[i] * b[1]) / (a[1] * b[i]) # an internal for the step below
c_mat2 <- cbind(rep(1, n), c_mat[,1] - c_mat[,-1]) # since e^a / e^b = e^(a-b)
for (i in 1:n) {
  d_vec[i] <- n_mat[i,1] + log(sum(internal_vec1 * exp(c_mat2[i,])))
} ## still getting infinite values
I'm trying to define the problem as briefly as possible. The entire function is obviously much larger than this, but since the problem I'm running into deals specifically with infinite (and 1/infinity) values, I'm hoping this snippet is sufficient. Anyone with a coding trick here?
Here is the solution!! (I've spent way too damn long on this)
The first function, log_plus(), solves the simple problem where you want log(\sum_{i=1}^n x_i).
The second function, log_plus2(), solves the more complicated problem described above where you want log(\sum_{i=1}^n [a_i / b_i * exp(c_i)]).
log_plus <- function(xvec) {
  m <- length(xvec)
  x <- log(xvec[1])
  for (j in 2:m) {
    sum_j <- sum(xvec[1:(j-1)])  # running sum of the first j-1 terms
    x <- x + log(1 + xvec[j]/sum_j)
  }
  return(x)
}
log_plus2 <- function(a, b, c) {
  # assumes intended input of form sum(a/b * e^c)
  if ((length(a) != length(b)) || (length(a) != length(c))) {
    stop("Input equal length vectors")
  }
  if (!(all(c > 0) || all(c < 0))) {
    stop("All values of c must be either > 0 or < 0.")
  }
  m <- length(a)
  # initialize log sum
  x <- log(a[1]) - log(b[1]) + c[1]
  # aggregate / loop log sum
  for (j in 2:m) {
    # build denominator
    b2 <- b[1:(j-1)]
    d1 <- 0  # initialized outside the inner loop so all j-1 terms accumulate
    for (i in 1:(j-1)) {
      c2 <- c[1:i]
      if (all(c2 > 0)) {
        c_min <- min(c2[1:(j-1)])
        c2 <- c2 - c_min
      } else if (all(c2 < 0)) {
        c_min <- max(c2[1:(j-1)])
        c2 <- c2 - c_min
      }
      d1 <- d1 + a[i] * prod(b2[-i]) * exp(c2[i])
    }
    den <- b[j] * d1
    num <- a[j] * prod(b[1:(j-1)]) * exp(c[j] - c_min)
    x <- x + log(1 + num / den)
  }
  return(x)
}
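For comparison, the more common way to stabilise log(\sum_i exp(v_i)) is to factor the largest exponent out of the sum before exponentiating. Writing v_i = log(a_i) - log(b_i) + c_i (with a and b positive, as they are here), the quantity above is exactly logsumexp(v). A minimal sketch (logsumexp is a hypothetical helper, not from the code above):
logsumexp <- function(v) {
  mv <- max(v)
  mv + log(sum(exp(v - mv)))  # every exp() argument is <= 0, so no overflow
}
# sanity check against log_plus on values that don't overflow
xvec <- runif(5)
all.equal(log_plus(xvec), logsumexp(log(xvec)))
# [1] TRUE
# and it survives exponents that would overflow exp() directly
a <- c(1, 2, 3); b <- c(2, 4, 6); cc <- c(1000, 1001, 999)  # cc avoids masking c()
logsumexp(log(a) - log(b) + cc)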

Constrained optimization of a complicated function

I need to optimize the following to find the maximum value for r1:
ad = 0.95*M_D + 0.28*G_D + 0.43*S_D + 2.25*Q_D
as = 0.017*M_A + 0.0064*G_A + 0.0076*S_A + 0.034*Q_A
ccb = 0.0093*M_CC+ 0.0028*G_CC + 0.0042*S_CC + 0.0186*Q_CC
ccd = 0.0223*M_CD + 0.0056*G_CD + 0.0078*S_CD + 0.0446*Q_CD
apb = 1.28*M_P + 2.56*Q_P
r1=(1+ccb*(1+ccd))*ad*as*100/(130-apb)
subject to the following constraints:
0 <= M_CD <= M_CC <= M_A <= M_D <= M_P <= 9
0 <= G_CD <= G_CC <= G_A <= G_D <= 9
0 <= S_CD <= S_CC <= S_A <= S_D <= 9
0 <= Q_CD <= Q_CC <= Q_A <= Q_D <= Q_P <= 3
The approach I've tried before doesn't work very well and I'm hoping to find a better solution.
Once the problem is stated correctly, you may be able to first map the parameters to lower and upper bounds of [0, 1]. You can then implement the inequalities inside your function and optimise using an algorithm which accepts basic lower and upper bound constraints. nlminb could be used, but the vignette suggests the algorithm used may not be the best.
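To see the mapping idea on a single chain before writing the full function, here is a minimal sketch (p stands for a hypothetical vector of five optimizer parameters in [0, 1]):
p <- runif(5)
M_P  <- 9 * p[5]     # 0 <= M_P <= 9
M_D  <- M_P * p[4]   # 0 <= M_D <= M_P
M_A  <- M_D * p[3]   # 0 <= M_A <= M_D
M_CC <- M_A * p[2]   # 0 <= M_CC <= M_A
M_CD <- M_CC * p[1]  # 0 <= M_CD <= M_CC
stopifnot(0 <= M_CD, M_CD <= M_CC, M_CC <= M_A, M_A <= M_D, M_D <= M_P, M_P <= 9)
Each variable is a fraction of the one above it, so the chained inequality holds by construction for any p in [0, 1]^5.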
UPDATE:
With the OP's revised function:
dumFun <- function(p){
  p[1] -> M_CD; p[2] -> M_CC; p[3] -> M_A; p[4] -> M_D; p[5] -> M_P;
  M_P <- 9*M_P; M_D <- M_P*M_D; M_A <- M_D*M_A; M_CC <- M_A*M_CC; M_CD <- M_CC*M_CD;
  p[6] -> G_CD; p[7] -> G_CC; p[8] -> G_A; p[9] -> G_D;
  G_D <- 9*G_D; G_A <- G_D*G_A; G_CC <- G_A*G_CC; G_CD <- G_CC*G_CD;
  p[10] -> S_CD; p[11] -> S_CC; p[12] -> S_A; p[13] -> S_D;
  S_D <- 9*S_D; S_A <- S_D*S_A; S_CC <- S_A*S_CC; S_CD <- S_CC*S_CD;
  p[14] -> Q_CD; p[15] -> Q_CC; p[16] -> Q_A; p[17] -> Q_D; p[18] -> Q_P;
  Q_P <- 3*Q_P; Q_D <- Q_P*Q_D; Q_A <- Q_D*Q_A; Q_CC <- Q_A*Q_CC; Q_CD <- Q_CC*Q_CD;
  ad = 0.95*M_D + 0.28*G_D + 0.43*S_D + 2.25*Q_D
  as = 0.017*M_A + 0.0064*G_A + 0.0076*S_A + 0.034*Q_A
  ccb = 0.0093*M_CC + 0.0028*G_CC + 0.0042*S_CC + 0.0186*Q_CC
  ccd = 0.0223*M_CD + 0.0056*G_CD + 0.0078*S_CD + 0.0446*Q_CD
  apb = 1.28*M_P + 2.56*Q_P
  r1 = (1 + ccb*(1 + ccd))*ad*as*100/(130 - apb)
  -r1  # bobyqa minimises, so return -r1 to maximise r1
}
require(minqa)
p <- rep(.1, 18)
bobyqa(p, dumFun, lower = rep(0, length(p)), upper = rep(1, length(p)))
parameter estimates: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
objective: -9.65605526502482
number of function evaluations: 97
I finally solved my problem not with vectorization but with C. My program, containing 14 nested loops, executes 100 to 1000 times faster in C than in R! Which is sad, because I didn't learn anything new from that, and it proves that R can be pretty useless on some problems, but what can we do.
