I am working with the output from a model whose parameter estimates may not follow a priori expectations. I would like to write a function that forces these utility estimates back in line with those expectations. To do this, the function should minimize the sum of the squared deviance between the starting values and the new estimates. Since we have a priori expectations, the optimization should be subject to the following constraints:
B0 < B1
B1 < B2
...
Bj < Bj+1
For example, the raw parameter estimates below are flip-flopped for B2 and B3. The columns Delta and Delta^2 show the deviance between the original parameter estimate and the new coefficient. I am trying to minimize the column Delta^2. I've coded this up in Excel and shown how Excel's Solver would optimize this problem given the set of constraints:
Beta BetaRaw Delta Delta^2 BetaNew
B0 1.2 0 0 1.2
B1 1.3 0 0 1.3
B2 1.6 -0.2 0.04 1.4
B3 1.4 0 0 1.4
B4 2.2 0 0 2.2
After reading through ?optim and ?constrOptim, I'm not able to grok how to set this up in R. I'm sure I'm just being a bit dense, but could use some pointers in the right direction!
3/24/2012 - Added bounty since I'm not smart enough to translate the first answer.
Here's some R code that should be on the right path. Assuming that the betas start with:
betas <- c(1.2,1.3,1.6,1.4,2.2)
I want to minimize the following function such that b0 <= b1 <= b2 <= b3 <= b4
f <- function(x) {
x1 <- x[1]
x2 <- x[2]
x3 <- x[3]
x4 <- x[4]
x5 <- x[5]
loss <- (x1 - betas[1]) ^ 2 +
(x2 - betas[2]) ^ 2 +
(x3 - betas[3]) ^ 2 +
(x4 - betas[4]) ^ 2 +
(x5 - betas[5]) ^ 2
return(loss)
}
To show that the function works, the loss should be zero if we pass the original betas in:
> f(betas)
[1] 0
And relatively large with some random inputs:
> set.seed(42)
> f(rnorm(5))
[1] 8.849329
And smaller still at the values I was able to calculate in Excel:
> f(c(1.2,1.3,1.4,1.4,2.2))
[1] 0.04
1.
Since the objective is quadratic and the constraints linear,
you can use solve.QP.
It finds the b that minimizes
(1/2) * t(b) %*% Dmat %*% b - t(dvec) %*% b
under the constraints
t(Amat) %*% b >= bvec.
Here, we want b that minimizes
sum( (b-betas)^2 ) = sum(b^2) - 2 * sum(b*betas) + sum(betas^2)
= t(b) %*% b - 2 * t(b) %*% betas + sum(betas^2).
Since the last term, sum(betas^2), is constant, we can drop it,
and we can set
Dmat = diag(n)
dvec = betas.
The constraints are
b[1] <= b[2]
b[2] <= b[3]
...
b[n-1] <= b[n]
i.e.,
-b[1] + b[2] >= 0
- b[2] + b[3] >= 0
...
- b[n-1] + b[n] >= 0
so that t(Amat) is (shown here for n = 5)
[ -1  1  0  0  0 ]
[  0 -1  1  0  0 ]
[  0  0 -1  1  0 ]
[  0  0  0 -1  1 ]
and bvec is zero.
This leads to the following code.
# Sample data
betas <- c(1.2, 1.3, 1.6, 1.4, 2.2)
# Optimization
n <- length(betas)
Dmat <- diag(n)
dvec <- betas
Amat <- matrix(0, nrow = n, ncol = n - 1)
Amat[cbind(1:(n-1), 1:(n-1))] <- -1
Amat[cbind(2:n, 1:(n-1))] <- 1
t(Amat) # Check that it looks as it should
bvec <- rep(0,n-1)
library(quadprog)
r <- solve.QP(Dmat, dvec, Amat, bvec)
# Check the result, graphically
plot(betas)
points(r$solution, pch=16)
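As a cross-check (not part of the original answer): base R's isoreg() solves exactly this monotone least-squares problem via the pool-adjacent-violators algorithm, so both it and solve.QP should pool the out-of-order pair to its mean of 1.5, giving loss 0.02, slightly better than the Excel values in the question.
# Cross-check with base R's isotonic regression (PAVA); no extra package needed
fit <- isoreg(betas)
fit$yf                   # fitted values: 1.2 1.3 1.5 1.5 2.2
sum((fit$yf - betas)^2)  # loss: 0.02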
2.
You can use constrOptim in the same way (the objective function can be arbitrary, but the constraints have to be linear).
3.
More generally, you can use optim if you reparametrize the problem
into a non-constrained optimization problem,
for instance
b[1] = exp(x[1])
b[2] = b[1] + exp(x[2])
...
b[n] = b[n-1] + exp(x[n-1]).
There are a few examples of this approach here and there.
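Here is a minimal sketch of the reparametrization in point 3 (my code, not from the original answer; note that b = cumsum(exp(x)) also forces b[1] > 0, which is harmless for these betas):
# Unconstrained reparametrization: b = cumsum(exp(x)) is strictly
# increasing by construction, so plain optim() applies.
betas <- c(1.2, 1.3, 1.6, 1.4, 2.2)
b_of_x <- function(x) cumsum(exp(x))
loss <- function(x) sum((b_of_x(x) - betas)^2)
# Start from the sorted betas, mapped back to x-space
sb <- sort(betas)
x0 <- log(c(sb[1], diff(sb)))
res <- optim(x0, loss, method = "BFGS")
b_of_x(res$par)  # close to 1.2 1.3 1.5 1.5 2.2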
Alright, this is starting to take form, but still has some bugs. Based on the conversation in chat with @Joran, it seems I can include a conditional that will set the loss function to an arbitrarily large value if the values are not in order. This seems to work IF the discrepancy occurs between the first two coefficients, but not thereafter. I'm having a hard time parsing out why that would be the case.
Function to minimize:
f <- function(x, x0) {
x1 <- x[1]
x2 <- x[2]
x3 <- x[3]
x4 <- x[4]
x5 <- x[5]
loss <- (x1 - x0[1]) ^ 2 +
(x2 - x0[2]) ^ 2 +
(x3 - x0[3]) ^ 2 +
(x4 - x0[4]) ^ 2 +
(x5 - x0[5]) ^ 2
#Make sure the coefficients are in order
if (any(diff(c(x1, x2, x3, x4, x5)) > 0)) loss <- 10000000
return(loss)
}
Working example (sort of, it seems the loss would be minimized if b0 = 1.24?):
> betas <- c(1.22, 1.24, 1.18, 1.12, 1.10)
> optim(betas, f, x0 = betas)$par
[1] 1.282 1.240 1.180 1.120 1.100
Non-working example (note that the third element is still larger than the second):
> betas <- c(1.20, 1.15, 1.18, 1.12, 1.10)
> optim(betas, f, x0 = betas)$par
[1] 1.20 1.15 1.18 1.12 1.10
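One likely explanation: a constant penalty makes the loss completely flat over the infeasible region, so Nelder-Mead gets no signal about which direction restores the ordering; whether it escapes depends on whether its initial simplex happens to contain a feasible vertex, which is why the position of the violation matters. A common fix (my sketch, not from the chat) is to penalize the amount of violation, which gives the search a slope to follow:
f2 <- function(x, x0) {
  # Soft penalty: grows with the amount by which the (decreasing) ordering
  # is violated, so the optimizer sees a slope instead of a plateau
  sum((x - x0)^2) + 1e4 * sum(pmax(diff(x), 0)^2)
}
betas <- c(1.20, 1.15, 1.18, 1.12, 1.10)
optim(betas, f2, x0 = betas)$par  # elements 2 and 3 pulled together near 1.165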
Related
I have an equation which goes like this,
2 * (1-x-a-b)^2 * x * theta + 2 * (1-a-b-x) * x^2 * theta - 2 * b * x^2 + 2 * a * (1-a-b-x)^2 = 0
I want to create a function in R that selects a and b, with the restriction (a + b < 1 - a + b), from a uniform distribution. After selecting, I want it to find the solutions for x (both negative and positive).
I want to repeat this process t times in a for loop, where I will give the theta value as an input.
After that I want it to create a 3D density plot where solutions are shown with respect to the values of a, b on two axes and x on one axis.
So far I have tried to use the polynom package and the solve function. But I am having a hard time with R when it comes to mathematics.
You need to rewrite the polynomial in standard form a0 + a1*x + a2*x^2 + a3*x^3, then you can use the base function polyroot() to find the roots. For example,
a0 <- 2 * a * (1 - a - b)^2
a1 <- 2 * (1 - a - b)^2 * theta - 4 * a * (1 - a - b)
a2 <- -4 * (1 - a - b) * theta + 2 * (1 - a - b) * theta - 2 * b + 2 * a
a3 <- 0
So this is a quadratic equation, not a cubic as it appears at first glance.
Then use
polyroot(c(a0, a1, a2))
to find the roots. Select the real roots, and put them together into a matrix roots with columns a, b, root, then use rgl::plot3d(roots) to display them.
I think you have a typo in your restriction, so I'll ignore it, and this is the plot I get for theta == 1:
theta <- 1
a <- runif(1000)
b <- runif(1000)
a0 <- 2*a*(1-a-b)^2
a1 <- 2*(1-a-b)^2*theta -4*a*(1-a-b)
a2 <- -4*(1-a-b)*theta + 2*(1-a-b)*theta-2*b+2*a
result <- matrix(numeric(), ncol = 3, dimnames = list(NULL, c("a", "b", "root")))
for (i in seq_along(a)) {
root <- polyroot(c(a0[i], a1[i], a2[i]))
if (max(Im(root)) < 1.e-8)
result <- rbind(result, cbind(a[i], b[i], Re(root)))
}
library(rgl)
plot3d(result)
Created on 2022-06-14 by the reprex package (v2.0.1)
Most of the roots are really small, but for some of them a2 is nearly zero, and then they can be very large.
You can create a table with a column for each variable and filter the rows not satisfying your equation:
library(tidyverse)
set.seed(1337)
n <- 1000
tibble(
a = runif(n),
b = runif(n)
) |>
filter(a + b < 1 - a + b) |>
expand_grid(
theta = seq(0, 1, by = 1),
x = seq(0, 1, by = 1)
) |>
filter(
2 * (1 - x - a - b)^2 * x * theta + 2 * (1 - a - b - x) * x^2 * theta - 2 *
b * x^2 + 2 * a * (1 - a - b - x)^2 == 0
)
#> # A tibble: 0 × 4
#> # … with 4 variables: a <dbl>, b <dbl>, theta <dbl>, x <dbl>
Created on 2022-06-13 by the reprex package (v2.0.0)
Unfortunately, there is no point in the sampled space satisfying your equation. This is probably due to testing == 0 instead of < e, where e is a very small tolerance: one needs to allow small errors when sampling numerical solutions.
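A sketch of that fix (my adaptation, not part of the original answer): test the left-hand side against a small tolerance eps on a finer grid.
library(tidyverse)
eps <- 1e-3  # tolerance; exact == 0 almost never holds for doubles
lhs <- function(a, b, x, theta) {
  2 * (1 - x - a - b)^2 * x * theta + 2 * (1 - a - b - x) * x^2 * theta -
    2 * b * x^2 + 2 * a * (1 - a - b - x)^2
}
tibble(a = runif(1000), b = runif(1000)) |>
  filter(a + b < 1 - a + b) |>
  expand_grid(theta = seq(0, 1, by = 0.1), x = seq(0, 1, by = 0.01)) |>
  filter(abs(lhs(a, b, x, theta)) < eps)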
Why not just solve for the roots of the equation analytically?
I hope this message finds you well.
I am trying to solve an optimization problem formulated as a Mixed Integer Program with the lpSolveAPI R-package. However, there are indicator functions in the objective function and in some constraints. To be more specific, consider the following optimization problem:
min{ 2.8 * x1 + 3.2 * x2 + 3.5 * x3 +
17.5 * delta(x1) + 2.3 * delta(x2) + 5.5 * delta(x3) }
subject to:
0.4 * x1 + 8.7 * x2 + 4.5 * x3 <=
387 - 3 * delta(x1) - 1 * delta(x2) - 3 * delta(x3)
x1 <= 93 * delta(x1)
x2 <= 94 * delta(x2),
x3 <= 100 * delta(x3), and
x1, x2, and x3 are non-negative integers.
In this problem, for all i in {1, 2, 3}, delta(xi) = 1 if xi > 0, whereas delta(xi) = 0 otherwise.
The R-code I have so far is:
install.packages("lpSolveAPI")
library(lpSolveAPI)
a <- c(3, 1, 3)
b <- c(0.4, 8.7, 4.5)
q <- 387
M <- c(93, 94, 100)
A <- c(17.5, 2.3, 5.5)
h <- c(2.8, 3.2, 3.5)
Fn <- function(u1, u2, u3, u4){
lprec <- make.lp(0, 3)
lp.control(lprec, sense = "min")
set.objfn(lprec, u1)
add.constraint(lprec, u2, "<=", u3)
set.bounds(lprec, lower = rep(0, 3), upper = u4)
set.type(lprec, columns = 1:3, type = "integer")
solve(lprec)
return(list(Soln = get.variables(lprec), MinObj = get.objective(lprec)))
}
TheTest <- Fn(u1 = h, u2 = b, u3 = q, u4 = M)
Please, I was wondering if someone could tell me how to put delta functions into this R-code to solve the aforementioned optimization problem.
Rodrigo.
A constraint like x1 <= 93 * delta(x1) looks very strange to me. I think this is just x1 <= 93. For a MIP solver replace the function delta(x) by a binary variable d. Then add the constraint d <= x <= M*d where M is an upper bound on x. To be explicit, for your model we have:
min 2.8*x1 + 3.2*x2 + 3.5*x3 + 17.5*d1 + 2.3*d2 + 5.5*d3
0.4*x1 + 8.7*x2 + 4.5*x3 <= 387 - 3*d1 - d2 - 3*d3
d1 <= x1 <= 93*d1
d2 <= x2 <= 94*d2
d3 <= x3 <= 100*d3
x1 integer in [0,93]
x2 integer in [0,94]
x3 integer in [0,100]
d1,d2,d3 binary
This is now trivial to solve with any MIP solver. Note that a double inequality like d1 <= x1 <= 93*d1 can be written as two inequalities: d1<=x1 and x1<=93*d1.
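Translated into the question's lpSolveAPI code, this might look like the sketch below (my translation of the formulation above, with columns 1-3 for x1..x3 and columns 4-6 for the binary d1..d3). Note that with these particular data the all-zero point is feasible and every objective coefficient is positive, so the optimum is simply 0.
library(lpSolveAPI)
lprec <- make.lp(0, 6)  # columns: x1..x3 then d1..d3
lp.control(lprec, sense = "min")
set.objfn(lprec, c(2.8, 3.2, 3.5, 17.5, 2.3, 5.5))
# 0.4*x1 + 8.7*x2 + 4.5*x3 + 3*d1 + d2 + 3*d3 <= 387
add.constraint(lprec, c(0.4, 8.7, 4.5, 3, 1, 3), "<=", 387)
M <- c(93, 94, 100)
for (i in 1:3) {
  # x_i - M_i*d_i <= 0: forces d_i = 1 whenever x_i > 0
  add.constraint(lprec, c(1, -M[i]), "<=", 0, indices = c(i, i + 3))
  # -x_i + d_i <= 0: forces d_i = 0 whenever x_i = 0
  add.constraint(lprec, c(-1, 1), "<=", 0, indices = c(i, i + 3))
}
set.type(lprec, columns = 1:3, type = "integer")
set.type(lprec, columns = 4:6, type = "binary")
set.bounds(lprec, upper = M, columns = 1:3)
solve(lprec)
get.variables(lprec)  # x1..x3 followed by d1..d3
get.objective(lprec)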
I am trying to execute the following code to solve a system of nonlinear equations using R:
library(pracma)
t <- read.csv("values-try.csv", header=F, sep=",")
x0 <- as.matrix( c(0, 1, 0, 1, 0, 1))
Gr <- 9.807
F <- function (x) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]
as.matrix( c( (t[1,1] - x1)^2/x2^2 + (t[1,2] - x3)^2/x4^2 + (t[1,3] - x5)^2/x6^2 - Gr^2,
(t[2,1] - x1)^2/x2^2 + (t[2,2] - x3)^2/x4^2 + (t[2,3] - x5)^2/x6^2 - Gr^2,
(t[3,1] - x1)^2/x2^2 + (t[3,2] - x3)^2/x4^2 + (t[3,3] - x5)^2/x6^2 - Gr^2,
(t[4,1] - x1)^2/x2^2 + (t[4,2] - x3)^2/x4^2 + (t[4,3] - x5)^2/x6^2 - Gr^2,
(t[5,1] - x1)^2/x2^2 + (t[5,2] - x3)^2/x4^2 + (t[5,3] - x5)^2/x6^2 - Gr^2,
(t[6,1] - x1)^2/x2^2 + (t[6,2] - x3)^2/x4^2 + (t[6,3] - x5)^2/x6^2 - Gr^2), ncol = 1)
}
fsolve(F, x0)
I keep getting the following error:
Error in if (norm(s, "F") < tol || norm(as.matrix(ynew), "F") < tol) break :
missing value where TRUE/FALSE needed
Calls: fsolve -> broyden
Execution halted
Any hints on resolving the error?
The values-try.csv looks like this:
0.1191419256974832, -0.2806359683994824, -9.755712465258934
0.3194200198415491, 0.05681698915395282, -9.711375649078391
0.05320046522270569, 0.21071993729858585, -9.711942750423542
0.056291795600583824, 0.20746318577998762, -9.697096562782926
-0.18870002789891743, -0.03873042128470452, -9.70831243701548
0.13239301222057243, -9.790554976542873, -0.9744148062871234
Finding the common zeroes of a set of polynomials is always tricky business. I somehow doubt the polynomials in your example do have such an exact common zero. Anyway, implementations like the one in fsolve will have problems with too small gradients or step lengths.
A better idea might be to apply a least-squares solver, i.e., minimize the sum of squares of the components of F. Function pracma::lsqnonlin will do this, squaring and summing the components of F automatically.
library(pracma)
x0 <- as.matrix( c(0, 1, 0, 1, 0, 1))
sol = lsqnonlin(F, x0, options=list(tolx=1e-12, tolg=1e-12))
sol$x
## [1] 0.1061871 32.9875053 -0.5361180
## [4] 59.1224428 68975.6833271 7034.3066917
F(sol$x)
## [,1]
## [1,] 1.838934e-07
## [2,] 9.420962e-08
## [3,] 2.146091e-05
## [4,] -2.161610e-05
## [5,] -1.225254e-07
## [6,] -3.836504e-10
Please note that you will find other minima with different starting points. You didn’t say whether you want to restrict the problem domain; I am quite certain there are no ‘near’ zeros in [-10, 10]^6.
I am trying to optimize layout of a set of boxes w.r.t. their hanger locations s.t. the boxes are most aligned with their hangers and do not crowd out each other. Using quadprog.
Givens:
1. box hanger x-locations: P = (710, 850, 990, 1130)
2. box sizes: W = (690, 550, 690, 130)
3. usable x-spread tuple: S = (-150, 2090)
4. number of boxes: K = 4
5. minimum inter-box spread: G = 50
6. box x-locations: X (the objective)
We can see that the total required x-spread is sum(W) + 3G = 2060 + 150 = 2210, whereas the available x-spread is S[2] - S[1] = 2240. So, a solution should exist.
Min:
sumof (P[i] – X[i])^2
s.t.:
(1) X[i+1] - X[i] >= G + ½ (W[i+1] + W[i]); i = 1..(K-1), i.e., the boxes do not crowd each other out:
-X[i] + X[i+1] >= -( -G - ½ (W[i+1] + W[i]) )
(2) X[1] >= S[left] + ½ W[1], and (3) X[K] <= S[right] - ½ W[K], i.e., the boxes are within the given x-spread:
X[1] >= -( S[left] + ½ W[1] )
-X[K] >= -( S[right] - ½ W[K] )
for a total of 5 constraints - 3 for the inter-box spread, and 2 for extremities.
in R:
> Dmat = matrix(0,4,4)
> diag(Dmat) = 1
> (dvec <- P)  # the hanger locations
[1] 710 850 990 1130
> bvec
[1] -670 -670 -460 -195 2025
> t(Amat)
[,1] [,2] [,3] [,4]
[1,] -1 1 0 0
[2,] 0 -1 1 0
[3,] 0 0 -1 1
[4,] 1 0 0 0
[5,] 0 0 0 -1
> solve.QP(Dmat, dvec, Amat, bvec)
Error in solve.QP(Dmat, dvec, Amat, bvec) :
constraints are inconsistent, no solution!
Quite obviously I have missed or mis-specified the problem (Package 'quadprog')! I am using quadprog as I found a JavaScript port of it.
Thanks a lot.
I'm not sure that this solves your physical problem, but the code below seems to solve the optimization problem as you stated it. I've generalized it to a variable number of boxes and included a plot to check the solution.
library(quadprog)
p <- c(710, 850, 990, 1130)  # hanger positions
w <- c(690, 550, 690, 130)   # box widths
g <- 50                      # min box separation
s <- c(-150, 2090)           # min and max positions of box edges (S from the question)
k <- length(w)               # number of boxes
# sum((x - p)^2) = x'x - 2*p'x + const, which matches solve.QP's
# (1/2) x' Dmat x - dvec' x with Dmat = 2*diag and dvec = 2*p
Dmat <- 2*diag(nrow=k)
dvec <- 2*p
# separation constraints
# each column j of Amat encodes -x[j] + x[j+1] >= g + (w[j+1] + w[j])/2
Amat <- -diag(nrow=k, ncol=(k-1))
Amat[lower.tri(Amat)] <- unlist(lapply((k-1):1, function(n) c(1, numeric(n-1))))
bvec <- sapply(1:(k-1), function(n) g + (w[n+1] + w[n])/2)
# x-spread constraints
Amat <- cbind(Amat, c(1,numeric(k-1)), c(numeric(k-1),-1))
bvec <- c(bvec, s[1] + w[1]/2, -(s[2] - w[k]/2))
sol <- solve.QP(Dmat, dvec, Amat, bvec)
plot(x=s, y=c(0,0), type="l", ylim=c(-2.5,0))
points(x=p, y=numeric(k), pch=19)
segments(x0=sol$solution, y0=-1, x1=p, y1=0)
rect(xleft=sol$solution-w/2, xright=sol$solution+w/2, ytop=-1.0, ybottom=-2, density=8)
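For reference, with these inputs all three separation constraints and the left-edge constraint end up active, so the boxes pack from the left edge (this follows from the KKT conditions worked by hand, so treat it as expected rather than verified output):
sol$solution
## [1]  195  865 1535 1995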
The problem lies with the setup of Amat, bvec or both. solve.QP tries to find a solution, b, of the quadratic programming problem subject to the constraint that
t(Amat)*b >= bvec
Expanding out this constraint in your example, we want to find a vector b := c(b[1], b[2], b[3], b[4]) that satisfies the conditions:
-b[1] + b[2] >= -670,
-b[2] + b[3] >= -670,
-b[3] + b[4] >= -460,
b[1] >= -195
and -b[4] >= 2025 (i.e., b[4] <= -2025).
However, by adding the first four inequalities together, we have b[4] >= -670 - 670 - 460 - 195 = -1995. In other words, b[4] must be at least -1995 and at most -2025. This is a contradiction, and therefore solve.QP fails to find a solution.
Trying this example with the constraint -b[4] >= -2025, by setting bvec = c(-670, -670, -460, -195, -2025) yields a solution. Without going too much into your formulation above, perhaps this was intended (or another one of these values should have been positive)?
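A quick check of that suggestion (my sketch, reconstructing Amat from the t(Amat) printed in the question):
library(quadprog)
Dmat <- diag(4)
dvec <- c(710, 850, 990, 1130)
# the rows below are the rows of t(Amat) as printed in the question
Amat <- t(matrix(c(-1,  1,  0,  0,
                    0, -1,  1,  0,
                    0,  0, -1,  1,
                    1,  0,  0,  0,
                    0,  0,  0, -1), nrow = 5, byrow = TRUE))
bvec <- c(-670, -670, -460, -195, -2025)  # last entry sign-flipped
solve.QP(Dmat, dvec, Amat, bvec)$solution
## [1]  710  850  990 1130  (the hanger positions already satisfy these constraints)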
I have been trying to solve a constrained optimization problem in R using constrOptim() (my first time) but am struggling to set up the constraints for my problem.
The problem is pretty straightforward and I can set up the function OK, but I am a bit at a loss about passing the constraints in.
e.g., the problem I've defined is as follows (I am going to start with N fixed at, say, 1000, so I just want to solve for X; ultimately I'd like to choose both N and X to maximize profit):
So I can set up the function as:
fun <- function(x, N, a, c, s) { ## a profit function
x1 <- x[1]
x2 <- x[2]
x3 <- x[3]
a1 <- a[1]
a2 <- a[2]
a3 <- a[3]
c1 <- c[1]
c2 <- c[2]
c3 <- c[3]
s1 <- s[1]
s2 <- s[2]
s3 <- s[3]
((N*x1*a1*s1)-(N*x1*c1))+((N*x2*a2*s2)-(N*x2*c2))+((N*x3*a3*s3)-(N*x3*c3))
}
The constraints i need to implement are that:
x1>=0.03
x1<=0.7
x2>=0.03
x2<=0.7
x3>=0.03
x3<=0.7
x1+x2+x3=1
The X here represents buckets into which I need to optimally allocate N, so x1 = percent of N to place in bucket 1, etc., with each bucket having at least 3% but no more than 70%.
Any help much appreciated...
e.g., here is an example I used to test that the function does what I want (using fun as defined above):
x <- matrix(c(0.5, 0.25, 0.25))
a <- matrix(c(0.2, 0.15, 0.1))
s <- matrix(c(100, 75, 50))
c <- matrix(c(10, 8, 7))
N <- 1000
fun(x, N, a, c, s)
You can use the lpSolveAPI package.
## problem constants
a <- c(0.2, 0.15, 0.1)
s <- c(100, 75, 50)
c <- c(10, 8, 7)
N <- 1000
## Problem formulation
# x1 >= 0.03, x1 <= 0.7
# x2 >= 0.03, x2 <= 0.7
# x3 >= 0.03, x3 <= 0.7
# x1 + x2 + x3 = 1
# max N*(a1*s1 - c1)*x1 + N*(a2*s2 - c2)*x2 + N*(a3*s3 - c3)*x3
library(lpSolveAPI)
my.lp <- make.lp(7, 3)
The best way to build a model in lp_solve is column-wise:
# constraints by columns
set.column(my.lp, 1, c(1, 1, 0, 0, 0, 0, 1))
set.column(my.lp, 2, c(0, 0, 1, 1, 0, 0, 1))
set.column(my.lp, 3, c(0, 0, 0, 0, 1, 1, 1))
# the objective function; since we need to maximize, I negate it: max(f) = -min(-f)
set.objfn(my.lp, -N*c(a[1]*s[1] - c[1], a[2]*s[2] - c[2], a[3]*s[3] - c[3]))
set.rhs(my.lp, c(rep(c(0.03, 0.7), 3), 1))
# constraint types
set.constr.type(my.lp, c(rep(c(">=", "<="), 3), "="))
Take a look at my model:
my.lp
Model name:
             C1     C2     C3
Minimize -10000  -3250   2000
R1            1      0      0  >=  0.03
R2            1      0      0  <=   0.7
R3            0      1      0  >=  0.03
R4            0      1      0  <=   0.7
R5            0      0      1  >=  0.03
R6            0      0      1  <=   0.7
R7            1      1      1   =     1
Kind        Std    Std    Std
Type       Real   Real   Real
Upper       Inf    Inf    Inf
Lower         0      0      0
solve(my.lp)
[1] 0 ## success :)
get.objective(my.lp)
[1] -7817.5
get.constraints(my.lp)
[1] 0.70 0.70 0.27 0.27 0.03 0.03 1.00
## the decision variables
get.variables(my.lp)
[1] 0.70 0.27 0.03
Hi, just in case it's of use to anyone, I also found an answer, as below:
First of all, your objective function can be written a lot more concisely using vector operations:
> my_obj_coeffs <- function(N,a,c,s) N*(a*s-c)
> fun <- function(x,N,a,c,s) sum(my_obj_coeffs(N,a,c,s) * x)
You're trying to solve a linear program, so you can solve it using the simplex algorithm. There's a lightweight implementation of it in the 'boot' package.
> library(boot)
> solution <- function(obj) simplex(obj, diag(3), rep(0.7,3), diag(3), rep(0.03,3), matrix(1,1,3), 1, maxi=TRUE)
Then for the example parameters you used, you can call that solution function:
> a <- c(0.2,0.15,0.1)
> s <- c(100,75,50)
> c <- c(10,8,7)
> N <- 1000
> solution(my_obj_coeffs(N,a,c,s))
Linear Programming Results
Call : simplex(a = obj, A1 = diag(3), b1 = rep(0.7, 3), A2 = diag(3),
    b2 = rep(0.03, 3), A3 = matrix(1, 1, 3), b3 = 1, maxi = TRUE)
Maximization Problem with Objective Function Coefficients
[,1]
[1,] 10000
[2,] 3250
[3,] -2000
attr(,"names")
[1] "x1" "x2" "x3"
Optimal solution has the following values
x1 x2 x3
0.70 0.27 0.03
The optimal value of the objective function is 7817.5.