I'm trying to optimize a function of two variables in R. My concern is that these two variables can only take specific values. I found a solution with lower/upper limits using nloptr, but I was not able to force the values taken by both variables. An example will make this easier to understand, using the constrOptim function:
g <- function(x,y) 100*x+150*y
gb <- function(x) g(x[1], x[2])
A <- matrix(c(1,0,0,1,100,150),3,2,byrow=T)
b <- c(0,0,350)
constrOptim(theta=c(2,2), f=gb, grad=NULL, ui=A, ci=b)
Thus, I want x and y to take only the values 0, 1, or 2. In my example, the constraints are currently written as x >= 0, y >= 0 and 100x + 150y >= 350.
My goal is to minimize 100*x + 150*y subject to 100x + 150y >= 350, where x and y take values in c(0, 1, 2) only!
Depending on which features of the example apply to your actual problem, you may be able to use brute force (if the problem is not too large), integer linear programming (if the objective and constraints are linear) or integer convex programming (if the objective and constraints are convex). All of these hold for the example in the question.
# brute force
list(grid = expand.grid(x = 0:2, y = 0:2)) |>
with(cbind(grid, g = apply(grid, 1, gb))) |>
subset(g >= 350) |>
subset(g == min(g))
## x y g
## 6 2 1 350
# integer linear programming
library(lpSolve)
res <- lp("min", c(100, 150), A, c("<=", "<=", ">="), c(2, 2, 350), all.int = TRUE)
res
## Success: the objective function is 350
res$solution
## [1] 2 1
# integer convex programming
library(CVXR)
X <- Variable(2, integer = TRUE)
v <- c(100, 150)
objective <- Minimize(sum(v * X))
constraints <- list(X >= 0, X <= 2, sum(v * X) >= 350)
prob <- Problem(objective, constraints)
CVXR_result <- solve(prob)
CVXR_result$status
## [1] "optimal"
CVXR_result$getValue(X)
## [,1]
## [1,] 2.0000228
## [2,] 0.9999743
Since your objective function and constraint are both linear, your problem is a standard mixed-integer linear programming (MILP) problem. There is a wide range of solvers for such problems. Here is a formulation using the ompr package as the model manager and the glpk solver:
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)
library(magrittr)  # provides %>%
g <- c(100, 150)
rhs <- 350
model <- MIPModel() %>%
add_variable(x[i], i = 1:2, type = "integer", lb = 0, ub = 2) %>%
set_objective(sum_over(g[i] * x[i], i = 1:2), "min") %>%
add_constraint(sum_over(g[i] * x[i], i = 1:2) >= rhs)
result <- solve_model(model, with_ROI(solver = "glpk"))
result
Status: success
Objective value: 350
solution <- get_solution(result, x[i])
solution
variable i value
1 x 1 2
2 x 2 1
ompr uses simple algebraic notation and is easy to learn:
https://dirkschumacher.github.io/ompr/index.html
I would like to compute the area under the curve (AUC) defined by a set of experimental values. I created a function to calculate an approximation of the AUC using Simpson's rule, as I saw in this post. However, the function only works when it receives a vector of odd length. How can I modify the code to add the area of the last trapezoid when the input vector has an even length?
AUC <- function(x, h=1){
# AUC computes the area under the curve of a time series using
# Simpson's rule (numerical method).
# https://link.springer.com/chapter/10.1007/978-1-4612-4974-0_26
# Arguments
# x: (vector) time series values
# h: (int) temporal resolution of the time series. default h=1
n = length(x)-1
xValues = seq(from=1, to=n, by=2)
sum <- list()
for(i in 1:length(xValues)){
n_sub <- xValues[[i]]-1
n <- xValues[[i]]
n_add <- xValues[[i]]+1
v1 <- x[[n_sub+1]]
v2 <- x[[n+1]]
v3 <- x[[n_add+1]]
s <- (h/3)*(v1+4*v2+v3)
sum <- append(sum, s)
}
sum <- unlist(sum)
auc <- sum(sum)
return(auc)
}
Here is a data example:
smoothed = c(0.3,0.317,0.379,0.452,0.519,0.573,0.61,0.629,0.628,0.613,0.587,0.556,0.521,
0.485,0.448,0.411,0.363,0.317,0.273,0.227,0.185,0.148,0.12,0.103,0.093,0.086,
0.082,0.079,0.076,0.071,0.066,0.059,0.053,0.051,0.052,0.057,0.067,0.081,0.103,
0.129,0.165,0.209,0.252,0.292,0.328,0.363,0.398,0.431,0.459,0.479,0.491,0.494,
0.488,0.475,0.457,0.43,0.397,0.357,0.316,0.285,0.254,0.227,0.206,0.189,0.181,
0.171,0.157,0.151,0.162,0.192,0.239)
One recommended way to handle an even number of points while keeping precision is to combine Simpson's 1/3 rule with Simpson's 3/8 rule: the 3/8 rule covers four points, so the pair together can handle an even number of points. Such approaches can be found in engineering textbooks on numerical methods; a sketch follows.
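For concreteness, here is a minimal sketch of that combination (simpson_mixed is my own helper name; it assumes a constant spacing h and at least three points): the 1/3 rule covers all points when their number is odd; otherwise the 3/8 rule takes the last four points and the 1/3 rule covers the rest.
simpson_mixed <- function(x, h = 1) {
  # composite Simpson's 1/3 rule; needs an odd number of points (>= 3)
  simp13 <- function(x, h) {
    n <- length(x)
    i4 <- seq(2, n - 1, by = 2)                              # weight-4 points
    i2 <- if (n >= 5) seq(3, n - 2, by = 2) else integer(0)  # weight-2 points
    (h / 3) * (x[1] + 4 * sum(x[i4]) + 2 * sum(x[i2]) + x[n])
  }
  n <- length(x)
  if (n %% 2 == 1) return(simp13(x, h))
  # even number of points: Simpson's 3/8 rule on the last four points ...
  s38 <- 3 * h * (x[n - 3] + 3 * x[n - 2] + 3 * x[n - 1] + x[n]) / 8
  if (n == 4) return(s38)
  # ... plus the 1/3 rule on the first n - 3 points (an odd count)
  simp13(x[1:(n - 3)], h) + s38
}
For the even-length example later in this answer, simpson_mixed(sm) should be directly comparable to the trapezoid-patched AUC(sm).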
However, as a practical matter, you can write a code chunk that checks the data length and adds a single trapezoid at the end, as suggested in the last comment of the post you linked to. I wouldn't assume it is necessarily as precise as combining the 1/3 and 3/8 rules, but it is probably reasonable for many applications.
I would double-check my code edits below, but this is the basic idea.
AUC <- function(x, h=1){
# AUC computes the area under the curve of a time series using
# Simpson's rule (numerical method).
# https://link.springer.com/chapter/10.1007/978-1-4612-4974-0_26
# Arguments
# x: (vector) time series values
# h: (int) temporal resolution of the time series. default h=1
#jh edit: check for even data length
#and chop off last data point if even
nn = length(x)
if(length(x) %% 2 == 0){
xlast = x[length(x)]
x = x[-length(x)]
}
n = length(x)-1
xValues = seq(from=1, to=n, by=2)
sum <- list()
for(i in 1:length(xValues)){
n_sub <- xValues[[i]]-1
n <- xValues[[i]]
n_add <- xValues[[i]]+1
v1 <- x[[n_sub+1]]
v2 <- x[[n+1]]
v3 <- x[[n_add+1]]
s <- (h/3)*(v1+4*v2+v3)
sum <- append(sum, s)
}
sum <- unlist(sum)
auc <- sum(sum)
##jh edit: add trapezoid for last two data points to result
if(nn %% 2 == 0){
auc <- auc + (x[length(x)] + xlast)/2 * h
}
return(auc)
}
sm = smoothed[-length(smoothed)]
length(sm)
[1] 70
#even data as an example
AUC(sm)
[1] 20.17633
#original odd data
AUC(smoothed)
[1] 20.389
There may be a good reason for you to prefer Simpson's rule, but if you're just looking for a quick and efficient estimate of the AUC, the trapezoidal rule is far easier to implement and does not require an odd number of points:
AUC <- function(x, h = 1) sum((x[-1] + x[-length(x)]) / 2 * h)
AUC(smoothed)
#> [1] 20.3945
Here I show example code that uses Simpson's 1/3 and 3/8 rules in tandem for the numerical integration of data. As always, the usual caveats about the possibility of coding errors or compatibility issues apply.
The output at the end compares the numerical estimates from this algorithm and from the trapezoidal rule against a reference value computed with R's integrate function.
#Algorithm adapted from:
#Numerical Methods for Engineers, Seventh Edition,
#By Chapra and Canale, page 623
#Modified to accept data instead of functional values
#Modified by: Jeffrey Harkness, M.S.
##Begin Simpson's rule function code
simp13 <- function(dat, h = 1){
ans = 2*h*(dat[1] + 4*dat[2] + dat[3])/6
return(ans)}
simp13m <- function(dat, h = 1){
summ <- dat[1]
n <- length(dat)
nseq <- seq(2,(n-2),2)
for(i in nseq){
summ <- summ + 4*dat[i] + 2*dat[i+1]}
summ <- summ + 4*dat[n-1] + dat[n]
result <- (h*summ)/3
return(result)}
simp38 <- function(dat, h = 1){
ans <- 3*h*(dat[1] + 3*sum(dat[2:3]) + dat[4])/8
return(ans)}
simpson = function(dat, h = 1){
hin = h
len = length(dat)
comp <- len %% 2
##number of segments
if(len == 2){
ans = sum(dat)/2*h} ##n = 2 is the trapezoidal rule
if(len == 3){
ans = simp13(dat, h = hin)}
if(len == 4){
ans = simp38(dat,h = hin)}
if(len == 6){
ans <- simp38(dat[1:4],h = hin) + simp13(dat[4:len],h = hin)}
if(len > 6 & comp == 0){
ans = simp38(dat[1:4],h = hin) + simp13m(dat[4:len],h = hin)}
if(len >= 5 & comp == 1){
ans = simp13m(dat,h = hin)}
return(ans)}
##End Simpson's rule function code
This next section of code shows the performance comparison. This code can easily be altered for different test functions and cases.
The precision difference tends to change with the sample size and test function used; this example is not intended to imply that the difference is always this pronounced.
#other algorithm for comparison purposes, from Allan Cameron above
oa <- function(x, h = 1) sum((x[-1] + x[-length(x)]) / 2 * h)
#Testing and algorithm comparison code
simans = NULL; oaans = NULL; simerr = NULL; oaerr = NULL; mp = NULL
for( j in 1:10){
n = j
#f = function(x) cos(x) + 2 ##Test functions
f = function(x) 0.2 + 25*x - 200*x^2 + 675*x^3 - 900*x^4 + 400*x^5
a = 0;b = 10
h = (b-a)/n
datain = seq(a,b,by = h)
preans = integrate(f,a,b)$value #precise numerical estimate of test function
simans[j] = simpson(f(datain), h = h)
oaans[j] = oa(f(datain), h = h)
(simerr[j] = abs(simans[j] - preans)/preans * 100)
(oaerr[j] = abs(oaans[j] - preans)/preans * 100)
mp[j] = simerr[j] < oaerr[j]
}
(outframe = data.frame("simpsons percent diff" = simerr,"trapezoidal percent diff" = oaerr, "more precise?" = mp, check.names = F))
simpsons percent diff trapezoidal percent diff more precise?
1 214.73489738 214.734897 FALSE
2 15.07958148 64.993410 TRUE
3 6.70203621 29.816799 TRUE
4 0.94247384 16.955208 TRUE
5 0.54830021 10.905620 TRUE
6 0.18616767 7.593825 TRUE
7 0.12051767 5.588209 TRUE
8 0.05890462 4.282980 TRUE
9 0.04087107 3.386525 TRUE
10 0.02412733 2.744500 TRUE
I have a large matrix with this structure:
A B C
A' 9 2 0
B' 2 8 0
C' 0 1 7
The diagonal terms represent the interaction of an individual (A) with his/her brother/sister (A'). Off-diagonal elements represent the interaction of an individual with individuals who are not part of the family.
From a large set of individuals (say a few hundred), I would like to find subsets (of say 10) of individuals having minimal interactions with individuals not part of the family.
I was thinking of using a genetic algorithm (to optimize a parameter that I could calculate from the matrix) but could not find any algorithm that deals with subsets.
Is there a package in R (preferably) that does this?
Thanks
I'll outline a binary genetic algorithm approach using the GA package interface, which you can use as a baseline for your own implementation. Because you have the constraint that the subset must have a specific length, I will create mutation and crossover operators that do not break that specification (otherwise those operators would mostly create infeasible individuals).
mutation
mutation <- function(obj, parent){
vec <- as.vector(obj@population[parent,])
ind1 <- which(vec == 1)
ind2 <- setdiff(seq_along(vec), ind1)
ind1 <- sample(ind1, 1)
ind2 <- sample(ind2, 1)
vec[c(ind1, ind2)] <- c(0, 1)
vec
}
Mutation picks one position holding a 1 and one position holding a 0 and changes their values to 0 and 1, respectively.
crossover
sizeCrossover <- function(size){
function(obj, parents){
vec1 <- obj@population[parents[1],] == 1
vec2 <- obj@population[parents[2],] == 1
c1 <- vec1 & vec2
c1[sample(which(!c1), size - sum(c1))] <- 1
c2 <- vec1 | vec2
c2[sample(which(c2), sum(c2) - size)] <- 0
list(children = rbind(c1, c2), fitness = c(NA, NA))
}
}
Crossover is a variation of arithmetic crossover. The & child needs some additional 0s changed to 1s, and the | child needs some additional 1s changed to 0s, to restore the required subset size.
initial population
initialPopulation <- function(popSize, N, size){
indices <- do.call(
rbind,
mapply(
function(x, y) cbind(x, y),
seq_len(popSize),
replicate(popSize, sample(seq_len(N), size), simplify = FALSE),
SIMPLIFY = FALSE
)
)
mm <- matrix(0, nrow = popSize, ncol = N)
mm[indices] <- 1
mm
}
If you create a completely random initial population, most (if not all) individuals will be infeasible, so you need to construct a feasible initial population.
fitness
fitness <- function(vec, m, size){
indices <- which(vec == 1)
-sum(m[indices,indices])
}
GA::ga performs maximization, hence the minus sign.
random data and parameters
N <- 200 # matrix dimensions
size <- 10 # subset length
popSize <- 100 # population size for genetic algorithm
m <- matrix(sample(0:10, N^2, TRUE), nrow = N)
diag(m) <- 0
ga
obj <- GA::ga(
type = "binary",
nBits = N,
run = 500,
maxiter = 10000,
popSize = popSize,
fitness = fitness,
m = m,
size = 10,
suggestions = initialPopulation(popSize, N, size),
mutation = mutation,
crossover = sizeCrossover(size)
)
subset <- which(obj@solution[1,] == 1)
note
I'm using sample as modified in Advanced R (the %||% helper, also from there, falls back to the full length when size is NULL):
`%||%` <- function(a, b) if (is.null(a)) b else a
sample <- function(x, size = NULL, replace = FALSE, prob = NULL) {
size <- size %||% length(x)
x[sample.int(length(x), size, replace = replace, prob = prob)]
}
The problem in the question is not fully specified, so assume it is as follows. Let v be the vector of column sums of m - diag(diag(m)) and let k be an input specifying the required number of individuals in the subset. Then we wish to find the column names corresponding to the k smallest values in v or, in the language of integer linear programming, to find the 0/1 vector x which minimizes v'x such that sum(x) = k.
We use the inputs and v defined in the Note at the end.
1) sort To minimize this we simply take the column names corresponding to the k smallest values of v.
names(head(sort(v), k))
## [1] "C" "A"
2) knapsack This can also be expressed as a knapsack problem. knapsack maximizes, so we use max(v) - v to get the effect of minimization.
library(adagio)
res.knap <- knapsack(rep(1, length(v)), max(v) - v, k)
names(v)[res.knap$indices]
## [1] "A" "C"
3) linear programming We can also use integer linear programming with the following solution.
library(lpSolve)
res.lp <- lp("min", v, t(rep(1, length(v))), "=", k, all.bin = TRUE)
res.lp
## Success: the objective function is 2
names(v)[res.lp$solution == 1]
## [1] "A" "C"
4) genalg We use the rbga.bin genetic algorithm.
library(genalg)
set.seed(13)
f <- function(x) if (sum(x) == k) sum(x * v) else Inf
res.ga <- rbga.bin(size = length(v), evalFunc = f, popSize = 200,
mutation = .01, iters = 400)
cat(summary(res.ga))
## ...snip...
## GA Results
## Best Solution : 1 0 1
Note
The inputs are assumed to be:
k <- 2
m <- structure(c(9L, 2L, 0L, 2L, 8L, 1L, 0L, 0L, 7L), dim = c(3L,
3L), dimnames = list(c("A'", "B'", "C'"), c("A", "B", "C")))
v <- colSums(m - diag(diag(m)))
I have to choose 10 elements of a vector to maximize a function. Since the vector is pretty long, there are too many possibilities (~1000 choose 10) to compute them all, so I started looking into the GA package to use a genetic algorithm.
I came up with this MWE:
values <- 1:1000
# Fitness function which I want to maximise
f <- function(x){
# Choose values
y <- values[x]
# From the first 10 sum up the odd values.
y <- ifelse(y %% 2 != 0, y, 0)
y <- y[1:10]
return(sum(y))
}
# Maximum value of f for this example
y <- ifelse(values %% 2 != 0, values, 0)
sum(sort(y, decreasing = TRUE)[1:10])
# [1] 9900
# Genetic algorithm
library(GA)
GA <- ga(type = "permutation", fitness = f, lower = rep(1, 10), upper = rep(1000, 10), maxiter = 100)
summary(GA)
The results are a bit underwhelming. From summary(GA), I get the feeling that the algorithm always permutes all 1000 values (the solution goes from x1 to x1000), which leads to inefficient optimization. How can I tell the algorithm to use only 10 values (so that the solution is x1 .. x10)?
You should read https://www.jstatsoft.org/article/view/v053i04. You don't have a permutation problem but a selection one, hence you should use the binary type of genetic algorithm. Because you want to select exactly 10 values (10 ones and 990 zeroes), you should probably write your own genetic operators: that constraint will hardly ever be satisfied by the default operators, even if you return -Inf from the fitness function whenever a solution does not have exactly 10 ones. One approach:
Population (k tells how many ones you want):
myInit <- function(k){
function(GA){
m <- matrix(0, ncol = GA@nBits, nrow = GA@popSize)
for(i in seq_len(GA@popSize))
m[i, sample(GA@nBits, k)] <- 1
m
}
}
Crossover
library(magrittr)  # provides %>%
myCrossover <- function(GA, parents){
parents <- GA@population[parents,] %>%
apply(1, function(x) which(x == 1)) %>%
t()
parents_diff <- vector("list", 2)
parents_diff[[1]] <- setdiff(parents[2,], parents[1,])
parents_diff[[2]] <- setdiff(parents[1,], parents[2,])
children_ind <- vector("list", 2)
for(i in 1:2){
k <- length(parents_diff[[i]])
change_k <- sample(k, sample(ceiling(k/2), 1))
children_ind[[i]] <- if(length(change_k) > 0){
c(parents[i, -change_k], parents_diff[[i]][change_k])
} else {
parents[i,]
}
}
children <- matrix(0, nrow = 2, ncol = GA@nBits)
for(i in 1:2)
children[i, children_ind[[i]]] <- 1
list(children = children, fitness = c(NA, NA))
}
Mutation
myMutation <- function(GA, parent){
ind <- which(GA@population[parent,] == 1)
n_change <- sample(3, 1)
ind[sample(length(ind), n_change)] <- sample(setdiff(seq_len(GA@nBits), ind), n_change)
parent <- integer(GA@nBits)
parent[ind] <- 1
parent
}
Fitness (your function adapted for binary GA):
f <- function(x, values){
ind <- which(x == 1)
y <- values[ind]
y <- ifelse(y %% 2 != 0, y, 0)
y <- y[1:10]
return(sum(y))
}
GA:
library(GA)
GA <- ga(
type = "binary",
fitness = f,
values = values,
nBits = length(values),
population = myInit(10),
crossover = myCrossover,
mutation = myMutation,
run = 300,
pmutation = 0.3,
maxiter = 10000,
popSize = 100
)
Chosen values
values[which(GA@solution[1,] == 1)]
Can anyone suggest packages to solve a non-linear optimisation problem that can return integer variables in the optimum solution? The problem is to minimise a function with an equality constraint, subject to some lower and upper bound constraints.
I've used the nloptr package in R for a non-linear optimisation problem, which worked nicely, but I would now like to require some of the variables to be integers. From my use and understanding of nloptr so far, it can only return continuous values, not integers, in the optimum solution.
I believe this sort of problem needs to be solved using mixed-integer non-linear programming.
One example of the problem in a form for nloptr:
min f(x, y, p, q) = (x - y)^2/y + (p - q)^2/q
subject to (x - y)^2/y + (p - q)^2/q = 10.2
where x and p are positive integers not equal to 0, and y and q may or may not be positive integers not equal to 0.
The nloptr code for this in R would look like this:
library('nloptr')
x1 <- c(50,25,20,15)
fn <- function(x) {
(((x[1] - x[2])^2)/x[2]) + (((x[3] - x[4])^2)/x[4])
}
heq <- function(x) {
fn(x)-10.2
}
lower_limit <- c(0,0,0,0)
upper_limit <- c(67.314, 78, 76.11, 86)
slsqp(x1, fn, lower = lower_limit, upper = upper_limit, hin = NULL, heq = heq, control = list(xtol_rel = 1e-8, check_derivatives = FALSE))
This would output the following:
$par
[1] 46.74823 29.72770 18.93794 16.22137
$value
[1] 10.2
$iter
[1] 6
$convergence
[1] 4
$message
[1] "NLOPT_XTOL_REACHED: Optimization stopped because xtol_rel or xtol_abs (above) was reached."
This is the sort of result I am looking for, but as stated above, I need x and p to be integers.
I've had a look at https://cran.r-project.org/web/views/Optimization.html, which has a really good list of packages for mixed-integer non-linear programming, but I wondered whether anyone has experience with any of them and which might be most appropriate for the problem stated above.
There was a similar question posted here around 7 years ago, but it ended up with a link to the CRAN page, so I thought it would be worth asking again.
Thanks very much for your input.
Cheers,
Andrew
Here's an example of how it doesn't work with CVXR unless the objective function is simplified. I suspect the problem is not convex, even with the constraints, so an alternative approach is required.
#base example from https://cvxr.rbind.io/cvxr_examples/cvxr_gentle-intro/
#install.packages("CVXR")
library(CVXR)
#modified for Stackoverflow integer MIQP ####
#Solves, but terms not normalised by y and q respectively
# Variables minimized over
x <- Variable(1, integer=TRUE)
y <- Variable(1)
p <- Variable(1, integer=TRUE)
q <- Variable(1)
# Problem definition (terms not normalised by y and q respectively)
objective <- Minimize((x - y)^2 + (p - q)^2)
constraints <- list(x >= 0, y >= 0, p >= 0, q >= 0,
x <= 67.314, y <= 78, p <= 76.11, q <= 86)
prob2.1 <- Problem(objective, constraints)
# Problem solution
solution2.1 <- solve(prob2.1)
solution2.1$status
solution2.1$value
solution2.1$getValue(x)
solution2.1$getValue(y)
solution2.1$getValue(p)
solution2.1$getValue(q)
#modified for Stackoverflow integer NLP (not integer) ####
#Does not solve, not convex?
# Variables minimized over
x <- Variable(1)
y <- Variable(1)
p <- Variable(1)
q <- Variable(1)
# Problem definition
objective <- Minimize((x - y)^2/y + (p - q)^2/q)
constraints <- list(x >= 0, y >= 0, p >= 0, q >= 0,
x <= 67.314, y <= 78, p <= 76.11, q <= 86)
prob2.1 <- Problem(objective, constraints)
# Problem solution
solution2.1 <- solve(prob2.1, gp = TRUE)
solution2.1 <- solve(prob2.1, gp = FALSE)
# solution2.1$status
# solution2.1$value
# solution2.1$getValue(x)
# solution2.1$getValue(y)
# solution2.1$getValue(p)
# solution2.1$getValue(q)
#modified for Stackoverflow integer MINLP ####
#Does not solve
# Variables minimized over
x <- Variable(1, integer=TRUE)
y <- Variable(1)
p <- Variable(1, integer=TRUE)
q <- Variable(1)
# Problem definition
objective <- Minimize((x - y)^2/y + (p - q)^2/q)
constraints <- list(x >= 0, y >= 0, p >= 0, q >= 0,
x <= 67.314, y <= 78, p <= 76.11, q <= 86)
prob2.1 <- Problem(objective, constraints)
# Problem solution
solution2.1 <- solve(prob2.1, gp = TRUE)
solution2.1 <- solve(prob2.1, gp = FALSE)
# solution2.1$status
# solution2.1$value
# solution2.1$getValue(x)
# solution2.1$getValue(y)
# solution2.1$getValue(p)
# solution2.1$getValue(q)
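#As an aside: the normalised terms (x - y)^2/y are convex for y > 0
#(they are quad-over-lin functions), so CVXR should accept the objective
#in DCP form via its quad_over_lin atom. The real obstacle is the equality
#constraint f = 10.2, a level set of a convex function, which is non-convex.
#Sketch of the relaxed problem without that constraint: the objective is
#trivially driven to 0, so this only demonstrates DCP acceptance. The small
#lower bounds on y and q are my choice, to keep the denominators positive.
x <- Variable(1, integer=TRUE)
y <- Variable(1)
p <- Variable(1, integer=TRUE)
q <- Variable(1)
objective <- Minimize(quad_over_lin(x - y, y) + quad_over_lin(p - q, q))
constraints <- list(x >= 1, y >= 0.01, p >= 1, q >= 0.01,
x <= 67.314, y <= 78, p <= 76.11, q <= 86)
prob2.2 <- Problem(objective, constraints)
solution2.2 <- solve(prob2.2)
solution2.2$status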
#modified for Stackoverflow integer NLP (not integer) ####
#objective multiplied by y*q, Does not solve, not convex?
# Variables minimized over
x <- Variable(1)
y <- Variable(1)
p <- Variable(1)
q <- Variable(1)
# Problem definition
objective <- Minimize((x - y)^2*q + (p - q)^2*y)
constraints <- list(x >= 0, y >= 0, p >= 0, q >= 0,
x <= 67.314, y <= 78, p <= 76.11, q <= 86)
prob2.1 <- Problem(objective, constraints)
# Problem solution
solution2.1 <- solve(prob2.1, gp = TRUE)
solution2.1 <- solve(prob2.1, gp = FALSE)
# solution2.1$status
# solution2.1$value
# solution2.1$getValue(x)
# solution2.1$getValue(y)
# solution2.1$getValue(p)
# solution2.1$getValue(q)
ROI is an option for MINLP problems. I believe it has access to some appropriate solvers, and it allows access to neos (described in another answer to your question).
If you are interested in seeing what an ROI optimisation looks like, here is an LP (linear programming) example:
#ROI example https://epub.wu.ac.at/5858/1/ROI_StatReport.pdf
#install.packages("ROI")
library(ROI)
ROI_available_solvers()
ROI_registered_solvers() #has one solver loaded by default
## Get and load "lpsolve" solver
#install.packages("ROI.plugin.lpsolve", repos=c("https://r-forge.r-project.org/src/contrib",
# "http://cran.at.r-project.org"),dependencies=TRUE)
library(ROI.plugin.lpsolve)
ROI_registered_solvers() #Now it is available to use
## Describe model
A <- rbind(c(5, 7, 2), c(3, 2, -9), c(1, 3, 1))
dir <- c("<=", "<=", "<=")
rhs <- c(61, 35, 31)
lp <- OP(objective = L_objective(c(3, 7, -12)),
constraints = L_constraint(A, dir = dir, rhs = rhs),
bounds = V_bound(li = 3, ui = 3, lb = -10, ub = 10, nobj = 3),
maximum = TRUE)
## When you have a model, you can find out which solvers can solve it
ROI_available_solvers(lp)[, c("Package", "Repository")]
## Solve model
lp_sol <- ROI_solve(lp, solver = "lpsolve")
I've tried the following code, using your example, to try to replicate the nloptr example in the original question:
#base example from https://cvxr.rbind.io/cvxr_examples/cvxr_gentle-intro/
#install.packages("CVXR")
library(CVXR)
#modified for Stackoverflow integer MINLP (MIQP) ####
#Solves
# Variables minimized over
x <- Variable(1, integer=TRUE)
y <- Variable(1)
p <- Variable(1, integer=TRUE)
q <- Variable(1)
z <- Variable(1)
# Problem definition (terms not normalised by y and q respectively)
objective <- Minimize((x - y)^2 + (p - q)^2 -z)
constraints <- list(x <= 67.314, y <= 78, p <= 76.11, q <= 86, z == 10.2)
prob2.1 <- Problem(objective, constraints)
# Problem solution
solution2.1 <- solve(prob2.1)
solution2.1$status
solution2.1$value
solution2.1$getValue(x)
solution2.1$getValue(y)
solution2.1$getValue(p)
solution2.1$getValue(q)
solution2.1$getValue(z)
However, I get a value of -10.19989 when it should be 0: the solver drives the quadratic terms to (nearly) zero and simply subtracts z = 10.2, rather than forcing the quadratic expression itself to equal 10.2.
> solution2.1$status
[1] "optimal"
> solution2.1$value
[1] -10.19989
> solution2.1$getValue(x)
[1] -1060371
> solution2.1$getValue(y)
[1] -1060371
> solution2.1$getValue(p)
[1] -1517
> solution2.1$getValue(q)
[1] -1517.002
> solution2.1$getValue(z)
[1] 10.2
I can't work out what I need to do to get the above working like the nloptr example while ensuring x and p are integer values!
Cheers, Andrew
Because this problem is of a type that is difficult to solve, no general algorithm is guaranteed to work well on this exact problem (see the no free lunch theorem). Indeed, many algorithms are not even likely to converge on the global optimum of a difficult problem. Interestingly, a random search of the problem space will at least converge eventually, because eventually it searches the whole space!
tl;dr: try enumerating the problem space. For example, if your four variables were integers between 0 and 80, there would be only ~80^4 ≈ 41 million combinations, which you could loop through. An intermediate option, since only two of your variables are integers, is to loop through the integer pairs and, for each pair, solve for the two remaining continuous variables by ordinary optimisation methods (maybe it is now a convex problem?); a rough sketch of that two-level idea follows.
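Here is a rough sketch of that two-level approach for your example (fn_inner and pen are my own helper names; the bounds come from your nloptr code). For each integer pair (x, p), optim searches for continuous (y, q) that satisfy the equality constraint via a squared penalty; a penalty near zero means the pair is feasible, and since the objective equals the constrained expression, every feasible point has objective value 10.2.
# the normalised objective from the question, with (y, q) as free variables
fn_inner <- function(yq, xp)
  (xp[1] - yq[1])^2 / yq[1] + (xp[2] - yq[2])^2 / yq[2]
# squared deviation from the equality constraint f = 10.2
pen <- function(yq, xp) (fn_inner(yq, xp) - 10.2)^2

best <- list(val = Inf)
for (xx in 1:67) for (pp in 1:76) {   # integer grids from the upper bounds
  o <- optim(c(xx, pp), pen, xp = c(xx, pp), method = "L-BFGS-B",
             lower = c(1e-3, 1e-3), upper = c(78, 86))
  if (o$value < best$val)
    best <- list(val = o$value, x = xx, p = pp, yq = o$par)
}
best  # best$val ~ 0 means the equality constraint is met at this (x, p)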
rneos is a package that lets you access neos, a free solving service offering numerous algorithms, including some suitable for MINLP problems (e.g. BONMIN and Couenne; see the list here). Unfortunately, the problem needs to be formatted as a GAMS or AMPL model. For you, this might mean learning some basic GAMS, and in that scenario maybe you could just use the GAMS software directly (see here; the free version may be sufficient for your purposes). It can be run from the command line, so you could call it from R if you needed to, as sketched below.
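For completeness, calling a locally installed GAMS from R is just a system call; a sketch (the executable path and model file name here are hypothetical):
gams_exe   <- "C:/GAMS/win64/30.3/gams.exe"           # hypothetical install path
model_file <- "my_minlp.gms"                          # your GAMS model file
system2(gams_exe, args = model_file)                  # run the model
lst <- readLines(sub("\\.gms$", ".lst", model_file))  # GAMS writes a .lst listing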
If you are interested in seeing what a neos optimisation looks like, here is an LP (linear programming) example:
#rneos example
#from p11 of https://www.pfaffikus.de/talks/rif/files/rif2011.pdf
#install.packages("rneos")
library(rneos)
#library(devtools)
#install_github("duncantl/XMLRPC")
library(XMLRPC)
## NEOS: ping
Nping()
## NEOS: listCategories
NlistCategories()
## NEOS: listSolversInCategory
NlistSolversInCategory(category = "lp")
## NEOS: getSolverTemplate
template <- NgetSolverTemplate(category = "lp", solvername = "MOSEK", inputMethod = "GAMS")
template
#gams file below sourced from https://github.com/cran/rneos/blob/master/inst/ExGAMS/TwoStageStochastic.gms
modc <- paste(paste(readLines("TwoStageStochastic.gms"), collapse = "\n"), "\n")
cat(modc)
argslist <- list(model = modc, options = "", wantlog = "", comments = "")
xmls <- CreateXmlString(neosxml = template, cdatalist = argslist)
## NEOS: printQueue
NprintQueue()
## NEOS: submitJob
(test <- NsubmitJob(xmlstring = xmls, user = "rneos", interface = "", id = 0))
## NEOS: getJobStatus
NgetJobStatus(obj = test)
## NEOS: getFinalResults
NgetFinalResults(obj = test)
R is not the right tool for this problem: it requires advanced functionality for non-linear programming. I turned my attention to the Julia language and used the JuMP package with the Juniper and Ipopt solvers. That worked well for my MINLP problem.
I am taking baby steps in using metaheuristics for solving constrained optimization problems. I am trying to solve a basic Markowitz mean-variance optimization model (given below) using the NMOF package in R.
min   lambda * [ sum_{i=1}^{N} sum_{j=1}^{N} w_i * w_j * Sigma_ij ] - (1 - lambda) * [ sum_{i=1}^{N} w_i * mu_i ]
subject to
sum_{i=1}^{N} w_i = 1
0 <= w_i <= 1,  i = 1, ..., N
where lambda takes values between 0 and 1 and N is the number of assets.
Following is my code (based on the book Numerical Methods and Optimization in Finance):
library(NMOF)
na <- dim(fundData)[2L]
ns <- dim(fundData)[1L]
Sigma <- cov(fundData)
winf <- 0.0
wsup <- 1.0
m <- colMeans(fundData)
resample <- function(x,...) x[sample.int(length(x),...)]
data <- list(R = t(fundData),
m = m,
na = dim(fundData)[2L],
ns = dim(fundData)[1L],
Sigma = Sigma,
eps = 0.5/100,
winf = winf,
wsup = wsup,
nFP = 100)
w0 <- runif(data$na); w0 <- w0/sum(w0)
OF <- function(w, data){
wmu <- crossprod(w, data$m)   # use data$m rather than the global m
res <- crossprod(w, data$Sigma)
res <- tcrossprod(w, res)
res - wmu
}
neighbour <- function(w, data){
toSell <- w > data$winf
toBuy <- w < data$wsup
i <- resample(which(toSell), size = 1L)
j <- resample(which(toBuy), size = 1L)
eps <- runif(1) * data$eps
eps <- min(w[i] - data$winf, data$wsup - w[j], eps)
w[i] <- w[i] - eps
w[j] <- w[j] + eps
w
}
algo <- list(x0 = w0, neighbour = neighbour, nS = 5000L)
system.time(sol1 <- LSopt(OF, algo, data))
I am not sure how to include lambda in the objective function (OF); the code above does not include it. I tried using a for loop, but it resulted in the following error:
OF <- function(w,data){
lambdaSeq <- seq(.001,0.999, length = data$nFP)
for(lambda in lambdaSeq){
wmu <- crossprod(w,m)
res <- crossprod(w, data$Sigma)
res <- tcrossprod(w,res)
result <- lambda*res - (1-lambda)*wmu
}
}
Error (a for loop returns NULL in R, so this version of OF returns no value, which triggers the zero-length error below):
Local Search.
Initial solution:
| | 0%
Error in if (xnF <= xcF) { : argument is of length zero
Timing stopped at: 0.01 0 0.03
It would be nice if someone could help me in this regard.
P.S.: I am also aware that this can be solved using quadratic programming. This is just a first step before including other constraints.
If I understand correctly, you want to trace the mean-variance efficient frontier by Local Search? Then you need to run a Local Search for every value of lambda that you want to include in the frontier.
The following example should help you get going. I start by attaching the package and setting up the list data.
require("NMOF")
data <- list(m = colMeans(fundData), ## expected returns
Sigma = cov(fundData), ## expected var of returns
na = dim(fundData)[2L], ## number of assets
eps = 0.2/100, ## stepsize for LS
winf = 0, ## minimum weight
wsup = 1, ## maximum weight
lambda = 1)
Next I compute a benchmark for the minimum-variance case (i.e. lambda equals one).
## benchmark: the QP solution
## ==> this will only work with a recent version of NMOF,
## which you can get by saying:
## install.packages('NMOF', type = 'source',
## repos = c('http://enricoschumann.net/R',
## getOption('repos')))
##
require("quadprog")
sol <- NMOF:::minvar(data$Sigma, 0, 1)
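For reference, a direct quadprog formulation of the same benchmark might look like the following sketch (my own construction, not from the book). solve.QP minimises (1/2) w'Dw - d'w, so for the objective lambda * w'Sigma w - (1 - lambda) * w'm we set D = 2 * lambda * Sigma and d = (1 - lambda) * m, with an equality constraint for the budget and bounds on the weights (this assumes Sigma is positive definite):
N <- data$na
qp <- solve.QP(Dmat = 2 * data$lambda * data$Sigma,
               dvec = (1 - data$lambda) * data$m,
               Amat = cbind(rep(1, N), diag(N), -diag(N)),
               bvec = c(1, rep(0, N), rep(-1, N)),
               meq  = 1)   # the first constraint, sum(w) = 1, is an equality
w_qp <- qp$solution        # should be close to sol from minvar above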
Objective function and neighbourhood function. I have slightly simplified both functions (for clarity; using crossprod in the objective function would probably be more efficient).
OF <- function(w, data){
data$lambda * (w %*% data$Sigma %*% w) -
(1 - data$lambda) * sum(w * data$m)
}
neighbour <- function(w, data){
toSell <- which(w > data$winf)
toBuy <- which(w < data$wsup)
i <- toSell[sample.int(length(toSell), size = 1L)]
j <- toBuy[sample.int(length(toBuy), size = 1L)]
eps <- runif(1) * data$eps
eps <- min(w[i] - data$winf, data$wsup - w[j], eps)
w[i] <- w[i] - eps
w[j] <- w[j] + eps
w
}
Now we can run Local Search. Since it is a fairly large dataset (200 assets),
you will need a relatively large number of steps to reproduce the QP solution.
w0 <- runif(data$na) ## a random initial solution
w0 <- w0/sum(w0)
algo <- list(x0 = w0, neighbour = neighbour, nS = 50000L)
sol1 <- LSopt(OF, algo, data)
You can compare the weights you get from Local Search with the QP solution.
par(mfrow = c(3,1), mar = c(2,4,1,1), las = 1)
barplot(sol, main = "QP solution")
barplot(sol1$xbest, main = "LS solution")
barplot(sol - sol1$xbest,
ylim = c(-0.001,0.001)) ## +/-0.1%
Finally, if you want to compute the whole frontier, you need to rerun this code for different levels of data$lambda; a minimal sketch of such a loop follows.
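A sketch using the objects defined above (the grid of lambda values and the risk/return summary are my choices):
lambdas <- seq(0.05, 1, by = 0.05)
frontier <- t(sapply(lambdas, function(l) {
  data$lambda <- l            # local copy of data with the new lambda
  sol.l <- LSopt(OF, algo, data)
  w <- sol.l$xbest
  c(lambda = l,
    risk   = sqrt(drop(w %*% data$Sigma %*% w)),  # portfolio standard deviation
    ret    = sum(w * data$m))                     # portfolio expected return
}))
plot(frontier[, "risk"], frontier[, "ret"], type = "b",
     xlab = "risk", ylab = "expected return")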