Solving an equation in R with given constraints

I want to maximize the following function subject to the given constraints.
-p1*log(p1) - p3*log(p3) - p5*log(p5)
subject to p1 + p3 + p5 = 1
and p1 + 3*p3 + 5*p5 = 3.5
where p1, p3 and p5 all lie between 0 and 1 (they are probabilities).
My question is: how do I solve this in R? From what I have seen, constrOptim() is one of the functions commonly used for this type of problem, but I could not figure it out.
Any help is appreciated.

Package Rsolnp uses Lagrange multipliers to solve non-linear problems with equality constraints. Here is how the problem would be set up; eps serves as a lower bound so that the logarithms do not produce NaN values.
library(Rsolnp)
f <- function(X) {
  x <- X[1]
  y <- X[2]
  z <- X[3]
  res <- -x*log(x) - y*log(y) - z*log(z)
  -res
}
eq_f <- function(X){
  x <- X[1]
  y <- X[2]
  z <- X[3]
  c(
    x + y + z,
    x + 3*y + 5*z
  )
}
eps <- .Machine$double.eps*10^2
X0 <- c(0.1, 0.1, 0.1)
sol <- solnp(
  pars = X0,
  fun = f,
  eqfun = eq_f,
  eqB = c(1, 3.5),
  LB = rep(eps, 3),
  UB = rep(1, 3)
)
#
#Iter: 1 fn: -1.0512 Pars: 0.21624 0.31752 0.46624
#Iter: 2 fn: -1.0512 Pars: 0.21624 0.31752 0.46624
#solnp--> Completed in 2 iterations
sol$convergence
#[1] 0
sol$pars
#[1] 0.2162396 0.3175208 0.4662396
sol$values
#[1] 0.000000 -1.051173 -1.051173
The last element of sol$values is the minimized objective at the optimal parameters, i.e. the negative of the maximized entropy (about 1.051).
We can check that the constraints are met.
sum(sol$pars)
#[1] 1
sum(sol$pars*c(1, 3, 5))
#[1] 3.5
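As a quick cross-check (not part of the answer above), the two equality constraints can be eliminated by hand: subtracting the first from the second gives p3 = 1.25 - 2*p5 and then p1 = p5 - 0.25, with p5 restricted to [0.25, 0.625] so that p1 and p3 stay non-negative. The reduced one-dimensional problem can then be handed to optimize(), reusing eps from above to keep the logarithms finite.
# one-dimensional reformulation after substituting out p1 and p3
entropy_1d <- function(p5) {
  p1 <- p5 - 0.25
  p3 <- 1.25 - 2*p5
  -p1*log(p1) - p3*log(p3) - p5*log(p5)
}
optimize(entropy_1d, interval = c(0.25 + eps, 0.625 - eps), maximum = TRUE)
# the maximum is close to 0.4662 (= sol$pars[3]) with objective close to 1.0512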

Related

How to perform bivariate optimization

Is it possible to solve the following problem in R?
In particular, I want to find the values of a1 and a2 minimizing the loss below:
> n <- 1000
> x <- rnorm(n, 1, 1)
> e <- rnorm(n, 0, 1)
> d <- a1+a2*x+e < 0
> loss <- (mean(d) - 0.5) + (mean((a1 + a2*x + e)[d=0]) - 2)
That is, I want to find the values of a1 and a2 that make mean(d) and mean((a1+a2*x+e)[d=0]) as close as possible to 0.5 and 2, respectively.
(the chosen values 0.5 and 2 are just placeholders)
Use optim with a function f that computes the defined loss. p is the parameter vector: p[1] is your a1 and p[2] is your a2. Use reasonable starting values when calling optim with your function.
f <- \(p) {
  d <- p[1] + p[2]*x + e < 0
  (mean(d) - 0.5) + (mean((p[1] + p[2]*x + e)[d]) - 2)
}
res <- optim(c(0, 0), f)
res$par
# [1] 4.393432e+53 1.010012e+55 ## a1 and a2
Note that d is already logical (boolean), so it can be used directly for subsetting. If you get different results with different starting values, the loss surface might be multi-modal.
Data:
n <- 1e3; set.seed(42); x <- rnorm(n, 1, 1); e <- rnorm(n, 0, 1)
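One hedged variant (my addition, not part of the answer above): the two raw differences can be driven arbitrarily far below zero, so the loss as written is unbounded below, which is why optim runs off to astronomically large parameters. Squaring each deviation, and reading the question's [d=0] as the observations where d is FALSE, gives a bounded loss:
f_sq <- function(p) {
  y <- p[1] + p[2]*x + e
  d <- y < 0
  if (all(d)) return(Inf)  # guard: nothing left to average in the second term
  (mean(d) - 0.5)^2 + (mean(y[!d]) - 2)^2
}
res_sq <- optim(c(0, 0), f_sq)
res_sq$par  # finite estimates of a1 and a2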

How to set NonConvex = 2 in Gurobi in R?

I get this error when I run the MWE code below. Does anyone know how to resolve it? Thanks!
Error: Error 10020: Q matrix is not positive semi-definite (PSD). Set NonConvex parameter to 2 to solve model.
MWE:
library(gurobi)
library(Matrix)
model <- list()
#optimization problem:
# max x + y
# s.t.
# -x + y <= 0
# x^2 - y^2 <= 10
# 0 <= x <= 20
# 0 <= y <= 20
model$obj <- c(1,1)
model$A <- matrix(c(-1,1), nrow=1, byrow=T) # for LHS of linear constraint: -x + y <= 0
model$rhs <- c(0) # for RHS of linear constraint: -x + y <= 0
model$ub[1] = 20 # x <= 20
model$ub[2] = 20 # y <= 20
model$sense <- c('<')
# non-convex quadratic constraint: x^2 - y^2 <= 10
qc1 <- list()
qc1$Qc <- spMatrix(2, 2, c(1, 2), c(1, 2), c(1.0, -1.0))
qc1$rhs <- 10
model$quadcon <- list(qc1)
#the QC constraint is a non-convex quadratic constraint, so set NonConvex = 2
model$params <- list(NonConvex=2)
gurobi_write(model, 'quadtest.lp')
result <- gurobi(model) # THIS IS WHERE I GET THE ERROR ABOVE
print(result$objval)
print(result$x)
Never mind, I see that I can fix this by not putting params in the model list, and instead passing it as the second argument to the gurobi() call:
params <- list(NonConvex=2)
result <- gurobi(model, params)
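Putting the fix together (a sketch based on the self-answer above; the rest of model is unchanged):
model$params <- NULL           # parameters set inside the model list are not picked up
params <- list(NonConvex = 2)  # allow the non-convex quadratic constraint
result <- gurobi(model, params)
print(result$objval)
print(result$x)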

Finding the largest argument of a function subject to a bound

As the title suggests, I would like to solve the following problem. Let f denote a certain function and let f0 denote a given constant. Is there an economical way of finding max{x : f(x) <= f0}?
Here is what an example would look like:
f = function(x) (x-2)^2
f0 = 0.4
and in that case the correct answer would be about 2.63 (that is, 2 + sqrt(0.4)). Thank you in advance.
One possibility is to define a version of your function f which returns -Inf whenever the constraint is not met, so that the maximization never prefers an infeasible point:
f <- function(x) (x-2)^2
f0 <- 0.4
f_optim <- function(x, a = f0) ifelse(f(x) <= a, f(x), -Inf)
optimize(f_optim, c(-10, 10), a = f0, maximum = TRUE, tol = .Machine$double.eps)
$maximum
[1] 2.632456
$objective
[1] 0.4
Another approach is to minimize the absolute difference |f0 - f(x)|, which locates a point on the level set f(x) = f0:
f0 = 0.04
f = function(x) (x - 2)^2
g = function(x, f0) {
  delta = f0 - f(x)
  abs(delta)
}
optimize(g, c(0, 10), f0 = f0, maximum = FALSE, tol = .Machine$double.eps)
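A further alternative (my addition, using the question's f0 = 0.4): since the largest x with f(x) <= f0 sits where f crosses f0 from below, uniroot() on a bracket to the right of the minimum finds it directly.
f <- function(x) (x - 2)^2
f0 <- 0.4
uniroot(function(x) f(x) - f0, interval = c(2, 10), tol = 1e-9)$root
# [1] 2.632456, i.e. 2 + sqrt(0.4), agreeing with the optimize() result above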

Maximum Product of Spacings using R

I would like to estimate the parameter of an exponential distribution using the Maximum Product of Spacings (MPS) method. I will have to minimize:
-(1/(n + 1))*(sum of log D[i] from i = 1 to n + 1),
where D[i] = F(x[i]) - F(x[i - 1])
And the following is my R code:
n<- 10
mydata<- rexp(n, rate=2)
x<- sort(mydata)
fnn <- function(lambda, x){
  for (i in 2:n){
    c <- 1 - exp(-lambda*x[i])
    d <- 1 - exp(-lambda*x[i-1])
  }
  s <- (1/(n-1))*sum(log(c-d))
  return(-s)
}
optim(0.8, fnn, x=x)
Can someone please verify if I am doing the right thing here?
The output I obtained is far from the true value of lambda = 2.
$`par`
[1] 0.92375
$value
[1] 0.1847188
$counts
function gradient
18 NA
$convergence
[1] 0
$message
NULL
What modifications should I include?
The problem with your code is that it overwrites c and d on every pass through the for loop, so only the last spacing survives. It also computes the multiplicative constant as 1/(n - 1) instead of 1/(n + 1).
Here is a corrected version. The key is to reserve memory before the loop with numeric(n - 1).
I also include a simpler version, fnn2, that takes advantage of R's built-in pexp.
fnn <- function(lambda, x){
  n <- length(x)
  c <- numeric(n - 1)
  d <- numeric(n - 1)
  for (i in 2:n){
    c[i - 1] <- 1 - exp(-lambda*x[i])
    d[i - 1] <- 1 - exp(-lambda*x[i-1])
  }
  s <- (1/(n + 1))*sum(log(c - d))
  return(-s)
}
fnn2 <- function(lambda, x){
  n <- length(x)
  D <- log(pexp(x[-1], rate = lambda) - pexp(x[-n], rate = lambda))
  s <- sum(D)/(n + 1)
  -s
}
set.seed(1234)
n <- 10
mydata <- rexp(n, rate = 2)
x <- sort(mydata)
opt <- optim(0.8, fnn, x = x)
opt2 <- optim(0.8, fnn2, x = x)
opt$par
#[1] 2.9225
opt2$par
#[1] 2.9225
identical(opt$par, opt2$par)
#[1] TRUE
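A hedged extension (my addition, following the question's formula rather than the answer's code): the MPS sum runs from i = 1 to n + 1, with the conventions F(x[0]) = 0 and F(x[n + 1]) = 1, so the two boundary spacings can be included as well.
fnn_full <- function(lambda, x){
  n <- length(x)
  Fx <- pexp(x, rate = lambda)
  D <- diff(c(0, Fx, 1))  # all n + 1 spacings, including the two boundary ones
  -sum(log(D))/(n + 1)
}
opt_full <- optim(0.8, fnn_full, x = x, method = "Brent", lower = 1e-6, upper = 100)
opt_full$par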

Function for polynomials of arbitrary order (symbolic method preferred)

I've found polynomial coefficients from my data:
R <- c(0.256,0.512,0.768,1.024,1.28,1.437,1.594,1.72,1.846,1.972,2.098,2.4029)
Ic <- c(1.78,1.71,1.57,1.44,1.25,1.02,0.87,0.68,0.54,0.38,0.26,0.17)
NN <- 3
ft <- lm(Ic ~ poly(R, NN, raw = TRUE))
pc <- coef(ft)
So I can create a polynomial function:
f1 <- function(x) pc[1] + pc[2] * x + pc[3] * x ^ 2 + pc[4] * x ^ 3
And for example, take a derivative:
g1 <- Deriv(f1)
How can I create a universal function so that it doesn't have to be rewritten for every new polynomial degree NN?
My original answer may not be what you really want, as it was numerical rather than symbolic. Here is the symbolic solution.
## use `"x"` as variable name
## taking polynomial coefficient vector `pc`
## can return a string, or an expression by further parsing (mandatory for `D`)
f <- function (pc, expr = TRUE) {
  stringexpr <- paste("x", seq_along(pc) - 1, sep = " ^ ")
  stringexpr <- paste(stringexpr, pc, sep = " * ")
  stringexpr <- paste(stringexpr, collapse = " + ")
  if (expr) return(parse(text = stringexpr))
  else return(stringexpr)
}
## an example cubic polynomial with coefficients 0.1, 0.2, 0.3, 0.4
cubic <- f(pc = 1:4 / 10, TRUE)
## using R base's `D` (requiring expression)
dcubic <- D(cubic, name = "x")
# 0.2 + 2 * x * 0.3 + 3 * x^2 * 0.4
## using `Deriv::Deriv`
library(Deriv)
dcubic <- Deriv(cubic, x = "x", nderiv = 1L)
# expression(0.2 + x * (0.6 + 1.2 * x))
Deriv(f(1:4 / 10, FALSE), x = "x", nderiv = 1L) ## use string, get string
# [1] "0.2 + x * (0.6 + 1.2 * x)"
Of course, Deriv makes higher order derivatives easier to get. We can simply set nderiv. For D however, we have to use recursion (see examples of ?D).
Deriv(cubic, x = "x", nderiv = 2L)
# expression(0.6 + 2.4 * x)
Deriv(cubic, x = "x", nderiv = 3L)
# expression(2.4)
Deriv(cubic, x = "x", nderiv = 4L)
# expression(0)
If we use expression, we will be able to evaluate the result later. For example,
eval(cubic, envir = list(x = 1:4)) ## cubic polynomial
# [1] 1.0 4.9 14.2 31.3
eval(dcubic, envir = list(x = 1:4)) ## its first derivative
# [1] 2.0 6.2 12.8 21.8
The above implies that we can wrap an expression into a function. Using a function has several advantages, one being that we are able to plot it with curve or plot.function.
fun <- function(x, expr) eval.parent(expr, n = 0L)
Note that the success of fun requires expr to be an expression in terms of the symbol x. If expr were defined in terms of y, for example, we would need to define fun as function (y, expr). Now let's use curve to plot cubic and dcubic on the range 0 < x < 5:
curve(fun(x, cubic), from = 0, to = 5) ## colour "black"
curve(fun(x, dcubic), add = TRUE, col = 2) ## colour "red"
The most convenient way is of course to define a single function FUN rather than combining f and fun. That way, we also don't need to worry about keeping the variable name consistent between f and fun.
FUN <- function (x, pc, nderiv = 0L) {
  ## check missing arguments
  if (missing(x) || missing(pc)) stop ("arguments missing with no default!")
  ## expression of polynomial
  stringexpr <- paste("x", seq_along(pc) - 1, sep = " ^ ")
  stringexpr <- paste(stringexpr, pc, sep = " * ")
  stringexpr <- paste(stringexpr, collapse = " + ")
  expr <- parse(text = stringexpr)
  ## taking derivatives
  dexpr <- Deriv::Deriv(expr, x = "x", nderiv = nderiv)
  ## evaluation
  val <- eval.parent(dexpr, n = 0L)
  ## note: if we take too many derivatives, `dexpr` becomes a constant;
  ## `val` is then free of `x`, so it has length 1 and
  ## we need to repeat this constant to match `length(x)`
  if (length(val) == 1L) val <- rep.int(val, length(x))
  ## now we return
  val
}
Suppose we want to evaluate a cubic polynomial with coefficients pc <- c(0.1, 0.2, 0.3, 0.4) and its derivatives on x <- seq(0, 1, 0.2), we can simply do:
FUN(x, pc)
# [1] 0.1000 0.1552 0.2536 0.4144 0.6568 1.0000
FUN(x, pc, nderiv = 1L)
# [1] 0.200 0.368 0.632 0.992 1.448 2.000
FUN(x, pc, nderiv = 2L)
# [1] 0.60 1.08 1.56 2.04 2.52 3.00
FUN(x, pc, nderiv = 3L)
# [1] 2.4 2.4 2.4 2.4 2.4 2.4
FUN(x, pc, nderiv = 4L)
# [1] 0 0 0 0 0 0
Now plotting is also easy:
curve(FUN(x, pc), from = 0, to = 5)
curve(FUN(x, pc, 1), from = 0, to = 5, add = TRUE, col = 2)
curve(FUN(x, pc, 2), from = 0, to = 5, add = TRUE, col = 3)
curve(FUN(x, pc, 3), from = 0, to = 5, add = TRUE, col = 4)
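As a usage sketch (my addition), FUN ties straight back to the question's own data, assuming R, Ic and NN from the question are still in the session:
ft <- lm(Ic ~ poly(R, NN, raw = TRUE))
FUN(R, as.numeric(coef(ft)))               # the fitted cubic evaluated at the data points
FUN(R, as.numeric(coef(ft)), nderiv = 1L)  # its first derivative at the same points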
Since the symbolic solution above has already grown long, here is a separate, purely numerical approach. For polynomials the derivatives are known explicitly, so we can code them directly; no R expressions are involved, everything is done with ordinary functions.
We first generate the polynomial basis from degree 0 to degree p - n, then multiply by the coefficients and the factorial multipliers. It is more convenient to use outer than poly here.
## use `outer`
g <- function (x, pc, nderiv = 0L) {
  ## check missing arguments
  if (missing(x) || missing(pc)) stop ("arguments missing with no default!")
  ## polynomial order p
  p <- length(pc) - 1L
  ## number of derivatives
  n <- nderiv
  ## early return?
  if (n > p) return(rep.int(0, length(x)))
  ## polynomial basis from degree 0 to degree `(p - n)`
  X <- outer(x, 0:(p - n), FUN = "^")
  ## initial coefficients
  ## the additional `+ 1L` is because R vectors start from index 1, not 0
  beta <- pc[n:p + 1L]
  ## factorial multiplier
  beta <- beta * factorial(n:p) / factorial(0:(p - n))
  ## matrix-vector multiplication
  drop(X %*% beta)
}
We still use the example x and pc defined in the symbolic solution:
x <- seq(0, 1, by = 0.2)
pc <- 1:4 / 10
g(x, pc, 0)
# [1] 0.1000 0.1552 0.2536 0.4144 0.6568 1.0000
g(x, pc, 1)
# [1] 0.200 0.368 0.632 0.992 1.448 2.000
g(x, pc, 2)
# [1] 0.60 1.08 1.56 2.04 2.52 3.00
g(x, pc, 3)
# [1] 2.4 2.4 2.4 2.4 2.4 2.4
g(x, pc, 4)
# [1] 0 0 0 0 0 0
The result is consistent with what we have from FUN in the symbolic solution.
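A quick programmatic check (my addition, assuming FUN from the symbolic solution is still defined):
all.equal(g(x, pc, 1), FUN(x, pc, nderiv = 1L))
# TRUE, up to numerical tolerance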
Similarly, we can plot g using curve:
curve(g(x, pc), from = 0, to = 5)
curve(g(x, pc, 1), from = 0, to = 5, col = 2, add = TRUE)
curve(g(x, pc, 2), from = 0, to = 5, col = 3, add = TRUE)
curve(g(x, pc, 3), from = 0, to = 5, col = 4, add = TRUE)
After quite a lot of effort demonstrating how we can work this out ourselves, consider using the R package polynom. It is a small package implementing construction, derivatives, integration, arithmetic and root-finding for univariate polynomials, written entirely in R without any compiled code.
## install.packages("polynom")
library(polynom)
We still consider the cubic polynomial example used before.
pc <- 1:4 / 10
## step 1: making a "polynomial" object as preparation
pcpoly <- polynomial(pc)
pcpoly
#0.1 + 0.2*x + 0.3*x^2 + 0.4*x^3
## step 2: compute derivative
expr <- deriv(pcpoly)
## step 3: convert to function
g1 <- as.function(expr)
#function (x)
#{
# w <- 0
# w <- 1.2 + x * w
# w <- 0.6 + x * w
# w <- 0.2 + x * w
# w
#}
#<environment: 0x9f4867c>
Note that with this step-by-step construction, the resulting function has all parameters baked in; it requires only a single argument, the x value. In contrast, the functions in the other two answers also take the coefficients and the derivative order as mandatory arguments. We can call this function:
g1(seq(0, 1, 0.2))
# [1] 0.200 0.368 0.632 0.992 1.448 2.000
To produce the same graph we see in the other two answers, we compute the other derivatives as well:
g0 <- as.function(pcpoly) ## original polynomial
## second derivative
expr <- deriv(expr)
g2 <- as.function(expr)
#function (x)
#{
# w <- 0
# w <- 2.4 + x * w
# w <- 0.6 + x * w
# w
#}
#<environment: 0x9f07c68>
## third derivative
expr <- deriv(expr)
g3 <- as.function(expr)
#function (x)
#{
# w <- 0
# w <- 2.4 + x * w
# w
#}
#<environment: 0x9efd740>
Perhaps you have already noticed that I did not specify nderiv but instead took one derivative at a time, recursively. This may be a disadvantage of the package: it does not directly support higher-order derivatives.
Now we can make a plot
## As mentioned, `g0` to `g3` are parameter-free
curve(g0(x), from = 0, to = 5)
curve(g1(x), add = TRUE, col = 2)
curve(g2(x), add = TRUE, col = 3)
curve(g3(x), add = TRUE, col = 4)
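One small usage note (my addition): a "polynomial" object can also be evaluated directly at numeric points via its predict() method, so as.function() is only needed when a plain function is required.
predict(pcpoly, seq(0, 1, 0.2))
# [1] 0.1000 0.1552 0.2536 0.4144 0.6568 1.0000  (same values as FUN and g gave above)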
