Constrained optimization in R: another example

I am attempting to perform constrained optimization in R. I have looked at these posts and a couple of others:
constrained optimization in R
function constrained optimization in R
The first post above is quite helpful, but I am still not obtaining the correct answer to my problem.
My function is:
Fd <- 224 * d1 + 84 * d2 + d1 * d2 - 2 * d1^2 - d2^2
and my constraint is: 3 * d1 + d2 = 280
First I find the correct answer using an unconstrained exhaustive search followed by a constrained exhaustive search:
my.data <- expand.grid(x1 = seq(0, 200, 1), x2 = seq(0, 200, 1))
head(my.data)
dim(my.data)
d1 <- my.data[,1]
d2 <- my.data[,2]
Fd <- 224 * d1 + 84 * d2 + d1 * d2 - 2 * d1^2 - d2^2
new.data <- data.frame(Fd = Fd, d1 = d1, d2 = d2)
head(new.data)
# identify values of d1 and d2 that maximize Fd without the constraint
new.data[new.data$Fd == max(new.data$Fd),]
# **This is the correct answer**
# Fd d1 d2
# 16157 11872 76 80
# Impose constraint
new.data <- new.data[(3 * new.data$d1 + new.data$d2) == 280, ]
# identify values of d1 and d2 that maximize Fd with the constraint
new.data[new.data$Fd == max(new.data$Fd),]
# **This is the correct answer**
# Fd d1 d2
# 14743 11774 69 73
Now find unconstrained maxima using optim. This works.
Fd <- function(betas) {
b1 = betas[1]
b2 = betas[2]
(224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2)
}
# unconstrained
optim(c(60, 100), Fd, control=list(fnscale=-1), method = "BFGS", hessian = TRUE)
# $par
# [1] 75.99999 79.99995
Now find constrained maxima using constrOptim. This does not work.
b1.lower.bound <- c(0, 280)
b1.upper.bound <- c(93.33333, 0)
b2.lower.bound <- c(93.33333, 0)
b2.upper.bound <- c(0, 280)
theta = c(60,100) # starting values
ui = rbind(c(280,0), c(0,93.33333)) # range of allowable values
theta %*% ui # obtain ci as -1 * theta %*% ui
# [,1] [,2]
# [1,] 16800 9333.333
constrOptim(c(60,100), Fd, NULL, ui = rbind(c(280,0), c(0,93.33333)), ci = c(-16800, -9333.333), control=list(fnscale=-1))
# $par
# [1] 75.99951 80.00798
I have tried playing around with ui and ci, but it seems like no matter what values I use for them I always get the same answer as with unconstrained optim.
Thank you for any advice.

constrOptim() uses linear inequality constraints and defines the feasible region by ui %*% param - ci >= 0. If the constraint is 3 * d1 + d2 <= 280, ui is c(-3, -1) and ci is -280.
constrOptim(); inequality constraint is: 3 * d1 + d2 <= 280
Fd <- function(betas) {
b1 = betas[1]
b2 = betas[2]
(224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2)
}
theta = c(59.999,100) # the start must lie strictly inside the feasible region: " ui %*% initial_par - ci > 0 "
ui = c(-3, -1)
ci = -280 # those ui & ci mean " -3*par[1] + -1*par[2] + 280 >= 0 "
constrOptim(theta, Fd, NULL, ui = ui, ci = ci, control=list(fnscale=-1))
# $par
# [1] 69.00002 72.99993
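One way to see why the ui and ci in the question had no effect is to evaluate the slack ui %*% par - ci at the unconstrained optimum (76, 80). The sketch below does that for both setups; the helper name slack is made up for this illustration and is not part of constrOptim().

```r
# slack() is a hypothetical helper: constrOptim() treats a point as feasible
# when every element of ui %*% par - ci is >= 0.
slack <- function(ui, ci, par) drop(ui %*% par - ci)

unconstrained_opt <- c(76, 80)

# ui and ci from the question: these encode 280 * d1 >= -16800 and
# 93.33333 * d2 >= -9333.333, i.e. d1 >= -60 and d2 >= -100.
ui_q <- rbind(c(280, 0), c(0, 93.33333))
ci_q <- c(-16800, -9333.333)
slack_q <- slack(ui_q, ci_q, unconstrained_opt)

# ui and ci from the answer: -3 * d1 - d2 >= -280, i.e. 3 * d1 + d2 <= 280.
ui_a <- rbind(c(-3, -1))
ci_a <- -280
slack_a <- slack(ui_a, ci_a, unconstrained_opt)

all(slack_q >= 0)  # TRUE: (76, 80) is feasible, so these constraints never bind
slack_a            # -28: (76, 80) violates 3 * d1 + d2 <= 280
```

Because the unconstrained maximum has positive slack under the question's ui and ci, constrOptim() simply returns it.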
[Edited]
If you want an equality constraint rather than an inequality, it would be better to use the Rsolnp or alabama package. Both accept inequality and/or equality constraints (see Constrained Optimization library for equality and inequality constraints).
solnp(); auglag(); equality constraint is: 3 * d1 + d2 = 280
library(Rsolnp); library(alabama);
Fd2 <- function(betas) { # -1 * Fd
b1 = betas[1]
b2 = betas[2]
-1 * (224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2)
}
eqFd <- function(betas) { # the equality constraint
b1 = betas[1]
b2 = betas[2]
(3 * b1 + b2 -280)
}
solnp(pars = c(60, 100), fun = Fd2, eqfun = eqFd, eqB = 0)
auglag(par = c(60, 100), fn = Fd2, heq = eqFd)
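Because the objective is quadratic and the constraint is linear, the equality-constrained maximum can also be cross-checked without an optimizer by solving the Lagrange (first-order) conditions as a linear system. This is only a sanity check in base R, not something the packages above require; lambda denotes the Lagrange multiplier.

```r
# First-order conditions for maximizing
#   224*d1 + 84*d2 + d1*d2 - 2*d1^2 - d2^2  subject to  3*d1 + d2 = 280:
#   224 + d2 - 4*d1 = 3*lambda
#    84 + d1 - 2*d2 =   lambda
#   3*d1 + d2       = 280
# Rearranged into A %*% c(d1, d2, lambda) = rhs.
A <- rbind(c(-4,  1, -3),
           c( 1, -2, -1),
           c( 3,  1,  0))
rhs <- c(-224, -84, 280)

kkt <- solve(A, rhs)
names(kkt) <- c("d1", "d2", "lambda")
kkt  # d1 = 69, d2 = 73, lambda = 7
```

The solution agrees with the constrained exhaustive search in the question.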

Here I have implemented G. Grothendieck's suggestion and it seems to return the correct answer, although ideally I would still like to learn how to obtain it using constrained optimization. I used the Brent method here because there is only one variable. Note that I had to provide upper and lower limits in the optim statement.
# Find maxima using optim and substitution. First remove b2
#
# 3 * b1 + b2 = 280
#
# b2 = (280 - 3 * b1)
Fd <- function(betas) {
b1 = betas[1]
(224 * b1 + 84 * (280 - 3 * b1) + b1 * (280 - 3 * b1) - 2 * b1^2 - (280 - 3 * b1)^2)
}
optim(c(60), Fd, method = "Brent", lower = 0, upper = 93.33333, control=list(fnscale=-1))
# $par
# [1] 69
# Now remove b1
#
# 3 * b1 + b2 = 280
#
# b1 = ((280 - b2) / 3)
Fd <- function(betas) {
b2 = betas[1]
(224 * ((280 - b2) / 3) + 84 * b2 + ((280 - b2) / 3) * b2 - 2 * ((280 - b2) / 3)^2 - b2^2)
}
optim(c(100), Fd, method = "Brent", lower = 0, upper = 280, control=list(fnscale=-1))
# $par
# [1] 73
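The one-variable objective passed to Brent can also be maximized by hand, which gives a check on the 69 and 73 reported above. Substituting b_2 = 280 - 3 b_1 and expanding:

```latex
\begin{aligned}
F(b_1) &= 224 b_1 + 84(280 - 3 b_1) + b_1(280 - 3 b_1) - 2 b_1^2 - (280 - 3 b_1)^2 \\
       &= -14 b_1^2 + 1932 b_1 - 54880, \\
F'(b_1) &= -28 b_1 + 1932 = 0 \;\Rightarrow\; b_1 = 69, \qquad b_2 = 280 - 3 \cdot 69 = 73, \\
F(69) &= 11774.
\end{aligned}
```

The value 11774 matches the constrained exhaustive search in the question.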

Related

Creating a 3D Plot of a Polynomial Function with Uniform Distributed Values

I have an equation which goes like this,
2 * (1-x-a-b)^2 * x * theta + 2 * (1-a-b-x) * x^2 * theta - 2 * b * x^2 + 2 * a * (1-a-b-x)^2 = 0
I want to create a function in R that selects a and b, subject to the restriction (a + b < 1 - a + b), from a uniform distribution. After selecting them, I want it to find the solutions for x (both negative and positive).
I want to repeat this process t times in a for loop, with the theta value given as an input.
After that I want it to create a 3D density plot where the solutions are shown with a and b on two axes and x on the third.
So far I have tried the polynom package and the solve function, but I am having a hard time with R when it comes to mathematics.
You need to rewrite the polynomial in standard form a0 + a1*x + a2*x^2 + a3*x^3, then you can use the base function polyroot() to find the roots. For example,
a0 <- 2 * a * (1 - a - b)^2
a1 <- 2 * (1 - a - b)^2 * theta - 4 * a * (1 - a - b)
a2 <- -4 * (1 - a - b) * theta + 2 * (1 - a - b) * theta - 2 * b + 2 * a
a3 <- 0
So this is a quadratic equation, not a cubic as it appears at first glance.
Then use
polyroot(c(a0, a1, a2))
to find the roots. Select the real roots, and put them together into a matrix roots with columns a, b, root, then use rgl::plot3d(roots) to display them.
I think you have a typo in your restriction, so I'll ignore it, and this is the plot I get for theta == 1:
theta <- 1
a <- runif(1000)
b <- runif(1000)
a0 <- 2*a*(1-a-b)^2
a1 <- 2*(1-a-b)^2*theta -4*a*(1-a-b)
a2 <- -4*(1-a-b)*theta + 2*(1-a-b)*theta-2*b+2*a
result <- matrix(numeric(), ncol = 3, dimnames = list(NULL, c("a", "b", "root")))
for (i in seq_along(a)) {
root <- polyroot(c(a0[i], a1[i], a2[i]))
if (max(abs(Im(root))) < 1.e-8) # keep only roots whose imaginary part is numerically zero
result <- rbind(result, cbind(a[i], b[i], Re(root)))
}
library(rgl)
plot3d(result)
Created on 2022-06-14 by the reprex package (v2.0.1)
Most of the roots are really small, but for some of them a2 is nearly zero, and then they can be very large.
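As a small check of the expanded coefficients, the sketch below picks one illustrative pair (a = 0.2, b = 0.3, theta = 1; these values are assumptions, not from the question), finds the roots with polyroot(), and plugs them back into the original unexpanded equation.

```r
# Illustrative values (not from the question).
a <- 0.2; b <- 0.3; theta <- 1
s <- 1 - a - b

# Coefficients in increasing order of power, as in the answer above.
a0 <- 2 * a * s^2
a1 <- 2 * s^2 * theta - 4 * a * s
a2 <- -4 * s * theta + 2 * s * theta - 2 * b + 2 * a

roots <- sort(Re(polyroot(c(a0, a1, a2))))

# Left-hand side of the original equation, before expansion.
lhs <- function(x) {
  2 * (1 - x - a - b)^2 * x * theta + 2 * (1 - a - b - x) * x^2 * theta -
    2 * b * x^2 + 2 * a * (1 - a - b - x)^2
}

roots       # -0.25 and 1/3
lhs(roots)  # both approximately 0
```

For these values the quadratic reduces to 12 x^2 - x - 1 = 0, so both roots are real, one negative and one positive.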
You can create a table with a column for each variable and filter out the rows that do not satisfy your equation:
library(tidyverse)
set.seed(1337)
n <- 1000
tibble(
a = runif(n),
b = runif(n)
) |>
filter(a + b < 1 - a + b) |>
expand_grid(
theta = seq(0, 1, by = 1),
x = seq(0, 1, by = 1)
) |>
filter(
2 * (1 - x - a - b)^2 * x * theta + 2 * (1 - a - b - x) * x^2 * theta - 2 *
b * x^2 + 2 * a * (1 - a - b - x)^2 == 0
)
#> # A tibble: 0 × 4
#> # … with 4 variables: a <dbl>, b <dbl>, theta <dbl>, x <dbl>
Created on 2022-06-13 by the reprex package (v2.0.0)
Unfortunately, there is no point in the sampled space satisfying your equation. This is probably due to testing == 0 instead of abs(...) < e, where e is a very small tolerance. One needs to allow small errors in numerical sampling solutions.
Why not just solve for the roots of the equation analytically?
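The point about exact comparison can be illustrated in base R. This is a minimal sketch; the tolerance value is an assumption and would need tuning for a real problem.

```r
# Floating-point arithmetic rarely lands exactly on zero.
residual <- sqrt(2)^2 - 2

exact_match <- residual == 0          # FALSE with IEEE-754 doubles
tol <- 1e-8                           # assumed tolerance; tune for your problem
close_enough <- abs(residual) < tol   # TRUE
```

In the pipeline above, the final filter would then test abs(expression) < tol rather than expression == 0, although a coarse grid for x can still miss every root.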

R: Optimization problem with equal coefficients

I would like to optimize a simple function such as:
max = a1 * x1 + a2 * x2 + a3 * x3
where the x's are known in advance and a1 + a2 + a3 = limit.
Furthermore, I need to add a constraint where a1 = a2 = a3. Would someone know how this can be implemented using lpSolveAPI? This is what I already have:
library(lpSolveAPI)
limit = 50
a <- c(1.5, 1.6, 2.5)
my.lp <- make.lp(0,3)
set.objfn(my.lp, a)
add.constraint(my.lp, 1:3, "=", limit)
lp.control(my.lp,sense='max')
my.lp
solve(my.lp)
Currently, I cannot seem to find a way to add the constraint a1 = a2 = a3 (or C1 = C2 = C3).
You cannot add a constraint with two equals signs. You can split it into
a1 - a2 = 0
a1 - a3 = 0
That can be done using
add.constraint(my.lp, c(1, -1, 0), "=", 0)
add.constraint(my.lp, c(1, 0, -1), "=", 0)
But this optimization doesn't make sense to me as you have only one feasible solution so there is nothing to optimize.
Edit:
I've just noticed that your constraint isn't correct. What you have done is basically
x1 + 2 * x2 + 3 * x3 = limit
You need to change it to
add.constraint(my.lp, rep(1, 3), "=", limit)
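To illustrate the remark that only one feasible solution exists, the three equality constraints can be solved directly as a linear system in base R. This is a sketch that bypasses lpSolveAPI entirely; the variable names are chosen for the illustration.

```r
limit <- 50
a <- c(1.5, 1.6, 2.5)   # objective coefficients from the question

# Rows: x1 + x2 + x3 = limit, x1 - x2 = 0, x1 - x3 = 0
constraints <- rbind(c(1,  1,  1),
                     c(1, -1,  0),
                     c(1,  0, -1))
rhs <- c(limit, 0, 0)

x <- solve(constraints, rhs)   # the single feasible point
objective <- sum(a * x)
x          # limit / 3 for each variable
objective  # sum(a) * limit / 3
```

Since the constraint matrix is invertible, the feasible set is this one point, and the solver has nothing left to choose.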

How to perform nonlinear least squares with shared parameters in R?

I would like to perform nonlinear least squares regression in R where I simultaneously minimize the squared residuals of three models (see below). Now, the three models share some of the parameters, in my example, parameters b and d.
Is there a way of doing this with either nls(), or, either packages minpack.lm or nlsr?
So, ideally, I would like to generate the objective function (the sum of least squares of all models together) and regress all parameters at once: a1, a2, a3, b, c1, c2, c3 and d.
(I am trying to avoid running three independent regressions and then perform some averaging on b and d.)
my_model <- function(x, a, b, c, d) {
a * b ^ (x - c) + d
}
# x values
x <- seq(0, 10, 0.2)
# Shared parameters
b <- 2
d <- 10
a1 <- 1
c1 <- 1
y1 <- my_model(x,
a = a1,
b = b,
c = c1,
d = d) + rnorm(length(x))
a2 <- 2
c2 <- 5
y2 <- my_model(x,
a = a2,
b = b,
c = c2,
d = d) + rnorm(length(x))
a3 <- -2
c3 <- 3
y3 <- my_model(x,
a = a3,
b = b,
c = c3,
d = d) + rnorm(length(x))
plot(
y1 ~ x,
xlim = range(x),
ylim = d + c(-50, 50),
type = 'b',
col = 'red',
ylab = 'y'
)
lines(y2 ~ x, type = 'b', col = 'green')
lines(y3 ~ x, type = 'b', col = 'blue')
Below we run nls (using a slightly modified model) and nlxb (from nlsr), but nlxb stops before convergence. Despite these problems, both nevertheless give results which visually fit the data well. These problems suggest that there are problems with the model itself, so in the Other section, guided by the nlxb output, we show how to fix the model, giving a submodel of the original model which fits the data easily with both nls and nlxb and also gives a good fit. At the end, in the Note section, we provide the data in reproducible form.
nls
Assuming the setup shown reproducibly in the Note at the end, reformulate the problem for the nls plinear algorithm by defining a right hand side matrix whose columns multiply each of the linear parameters, a1, a2, a3 and d, respectively. plinear does not require starting values for those, which simplifies the setup. It will report them as .lin1, .lin2, .lin3 and .lin4 respectively.
To get starting values we used a simpler model with no grouping and a grid search over b from 1 to 10 and c also from 1 to 10 using nls2 in the package of the same name. We also found that nls still produced errors but by using abs in the formula, as shown, it ran to completion.
The problems with the model suggest that there is a fundamental problem with it and in the Other section we discuss how to fix it up.
xx <- c(x, x, x)
yy <- c(y1, y2, y3)
# starting values using nls2
library(nls2)
fo0 <- yy ~ cbind(b ^ abs(xx - c), 1)
st0 <- data.frame(b = c(1, 10), c = c(1, 10))
fm0 <- nls2(fo0, start = st0, alg = "plinear-brute")
# run nls using starting values from above
g <- rep(1:3, each = length(x))
fo <- yy ~ cbind((g==1) * b ^ abs(xx - c[g]),
(g==2) * b ^ abs(xx - c[g]),
(g==3) * b ^ abs(xx - c[g]),
1)
st <- with(as.list(coef(fm0)), list(b = b, c = c(c, c, c)))
fm <- nls(fo, start = st, alg = "plinear")
plot(yy ~ xx, col = g)
for(i in unique(g)) lines(predict(fm) ~ xx, col = i, subset = g == i)
fm
giving:
Nonlinear regression model
model: yy ~ cbind((g == 1) * b^abs(xx - c[g]), (g == 2) * b^abs(xx - c[g]), (g == 3) * b^abs(xx - c[g]), 1)
data: parent.frame()
      b     c1     c2     c3  .lin1  .lin2  .lin3  .lin4
  1.997  0.424  1.622  1.074  0.680  0.196 -0.532  9.922
residual sum-of-squares: 133
Number of iterations to convergence: 5
Achieved convergence tolerance: 5.47e-06
(continued after plot)
nlsr
With nlsr it would be done like this. No grid search for starting values was needed and adding abs was not needed either. The b and d values seem similar to the nls solution but the other coefficients differ. Visually both solutions seem to fit the data.
On the other hand from the JSingval column we see that the jacobian is rank deficient which caused it to stop and not produce SE values and the convergence is in doubt (although it may be sufficient given that visually the plot, not shown, seems like a good fit). We discuss how to fix this up in the Other section.
library(nlsr)
g1 <- g == 1; g2 <- g == 2; g3 <- g == 3
fo2 <- yy ~ g1 * (a1 * b ^ (xx - c1) + d) +
g2 * (a2 * b ^ (xx - c2) + d) +
g3 * (a3 * b ^ (xx - c3) + d)
st2 <- list(a1 = 1, a2 = 1, a3 = 1, b = 1, c1 = 1, c2 = 1, c3 = 1, d = 1)
fm2 <- nlxb(fo2, start = st2)
fm2
giving:
vn: [1] "yy" "g1" "a1" "b" "xx" "c1" "d" "g2" "a2" "c2" "g3" "a3" "c3"
no weights
nlsr object: x
residual sumsquares = 133.45 on 153 observations
after 16 Jacobian and 22 function evaluations
name        coeff   SE  tstat  pval    gradient    JSingval
a1        3.19575   NA     NA    NA    9.68e-10        4097
a2        0.64157   NA     NA    NA   8.914e-11       662.5
a3       -1.03096   NA     NA    NA  -1.002e-09       234.9
b         1.99713   NA     NA    NA   -2.28e-08       72.57
c1        2.66146   NA     NA    NA   -2.14e-09       10.25
c2        3.33564   NA     NA    NA  -3.955e-11   1.585e-13
c3         2.0297   NA     NA    NA  -7.144e-10   1.292e-13
d         9.92363   NA     NA    NA  -2.603e-12   3.271e-14
We can calculate SE's using nls2 as a second stage, but this still does not address the underlying problem that the singular values suggest.
summary(nls2(fo2, start = coef(fm2), algorithm = "brute-force"))
giving:
Formula: yy ~ g1 * (a1 * b^(xx - c1) + d) + g2 * (a2 * b^(xx - c2) + d) +
g3 * (a3 * b^(xx - c3) + d)
Parameters:
    Estimate Std. Error t value Pr(>|t|)
a1  3.20e+00   5.38e+05     0.0        1
a2  6.42e-01   3.55e+05     0.0        1
a3 -1.03e+00   3.16e+05     0.0        1
b   2.00e+00   2.49e-03   803.4   <2e-16 ***
c1  2.66e+00   9.42e-02    28.2   <2e-16 ***
c2  3.34e+00   2.43e+05     0.0        1
c3  2.03e+00   8.00e+05     0.0        1
d   9.92e+00   4.42e+05     0.0        1
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.959 on 145 degrees of freedom
Number of iterations to convergence: 8
Achieved convergence tolerance: NA
Other
When nls has trouble fitting a model it often suggests that there is something wrong with the model itself. Playing around with it a bit, guided by the JSingval column in nlsr output above which suggests that c parameters or d might be the problem, we find that if we fix all c parameter values to 0 then the model is easy to fit given sufficiently good starting values and it still gives a low residual sum of squares.
library(nls2)
fo3 <- yy ~ cbind((g==1) * b ^ xx, (g==2) * b ^ xx, (g==3) * b ^ xx, 1)
st3 <- coef(fm0)["b"]
fm3 <- nls(fo3, start = st3, alg = "plinear")
giving:
Nonlinear regression model
model: yy ~ cbind((g == 1) * b^xx, (g == 2) * b^xx, (g == 3) * b^xx, 1)
data: parent.frame()
       b   .lin1   .lin2   .lin3   .lin4
  1.9971  0.5071  0.0639 -0.2532  9.9236
residual sum-of-squares: 133
Number of iterations to convergence: 4
Achieved convergence tolerance: 1.67e-09
which the following anova indicates is comparable to fm from above despite having 3 fewer parameters:
anova(fm3, fm)
giving:
Analysis of Variance Table
Model 1: yy ~ cbind((g == 1) * b^xx, (g == 2) * b^xx, (g == 3) * b^xx, 1)
Model 2: yy ~ cbind((g == 1) * b^abs(xx - c[g]), (g == 2) * b^abs(xx - c[g]), (g == 3) * b^abs(xx - c[g]), 1)
  Res.Df Res.Sum Sq Df Sum Sq F value Pr(>F)
1    148        134
2    145        133  3  0.385    0.14   0.94
We can redo fm3 using nlxb like this:
fo4 <- yy ~ g1 * (a1 * b ^ xx + d) +
g2 * (a2 * b ^ xx + d) +
g3 * (a3 * b ^ xx + d)
st4 <- list(a1 = 1, a2 = 1, a3 = 1, b = 1, d = 1)
fm4 <- nlxb(fo4, start = st4)
fm4
giving:
nlsr object: x
residual sumsquares = 133.45 on 153 observations
after 24 Jacobian and 33 function evaluations
name       coeff         SE   tstat        pval    gradient  JSingval
a1      0.507053   0.005515   91.94   1.83e-132   8.274e-08      5880
a2     0.0638554  0.0008735   73.11  4.774e-118    1.26e-08      2053
a3     -0.253225   0.002737  -92.54  7.154e-133  -4.181e-08      2053
b        1.99713   0.002294   870.6  2.073e-276   -2.55e-07     147.5
d        9.92363    0.09256   107.2  3.367e-142  -1.219e-11     10.26
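One reading of the near-zero singular values, offered here as an interpretation rather than something the answer states: since a * b^(x - c) = (a * b^(-c)) * b^x, each pair a_i, c_i enters the model only through the product a_i * b^(-c_i), so the two cannot be estimated separately. The sketch below plugs the fm2 coefficients printed above into that product and compares the result with the fm4 estimates; the variable names are made up for this check.

```r
# Coefficients copied from the fm2 (full model) output above.
b_full <- 1.99713
a_full <- c(a1 = 3.19575, a2 = 0.64157, a3 = -1.03096)
c_full <- c(2.66146, 3.33564, 2.0297)

# Coefficients copied from the fm4 (all c fixed at 0) output above.
a_reduced <- c(a1 = 0.507053, a2 = 0.0638554, a3 = -0.253225)

# a * b^(x - c) = (a * b^(-c)) * b^x, so these should agree with a_reduced.
a_implied <- a_full * b_full^(-c_full)
round(cbind(a_implied, a_reduced), 4)
```

The two columns agree to the printed precision, which is consistent with fm2 and fm4 having the same residual sum of squares.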
Note
The assumed input below is the same as in the question except we additionally
set the seed to make it reproducible.
set.seed(123)
my_model <- function(x, a, b, c, d) a * b ^ (x - c) + d
x <- seq(0, 10, 0.2)
b <- 2; d <- 10 # shared
a1 <- 1; c1 <- 1
y1 <- my_model(x, a = a1, b = b, c = c1, d = d) + rnorm(length(x))
a2 <- 2; c2 <- 5
y2 <- my_model(x, a = a2, b = b, c = c2, d = d) + rnorm(length(x))
a3 <- -2; c3 <- 3
y3 <- my_model(x, a = a3, b = b, c = c3, d = d) + rnorm(length(x))
I'm not sure this is really the best way, but you could minimize the sum of the squared residuals using optim().
#start values
params <- c(a1=1, a2=1, a3=1, b=1, c1=1, c2=1, c3=1,d=1)
# minimize total sum of squares of residuals
fun <- function(p) {
sum(
(y1-my_model(x, p["a1"], p["b"], p["c1"], p["d"]))^2 +
(y2-my_model(x, p["a2"], p["b"], p["c2"], p["d"]))^2 +
(y3-my_model(x, p["a3"], p["b"], p["c3"], p["d"]))^2
)
}
out <- optim(params, fun, method="BFGS")
out$par
# a1 a2 a3 b c1 c2 c3
# 0.8807542 1.0241804 -2.8805848 1.9974615 0.7998103 4.0030597 3.5184600
# d
# 9.8764917
And we can add the plots on top of the image
curve(my_model(x, out$par["a1"], out$par["b"], out$par["c1"], out$par["d"]), col="red", add=T)
curve(my_model(x, out$par["a2"], out$par["b"], out$par["c2"], out$par["d"]), col="green", add=T)
curve(my_model(x, out$par["a3"], out$par["b"], out$par["c3"], out$par["d"]), col="blue", add=T)

Mixed Integer Programming in R - Indicator functions

I hope this message finds you well.
I am trying to solve an optimization problem formulated as a Mixed Integer Program with the lpSolveAPI R-package. However, there are indicator functions in the objective function and in some constraints. To be more specific, consider the following optimization problem:
min{ 2.8 * x1 + 3.2 * x2 + 3.5 * x3 +
17.5 * delta(x1) + 2.3 * delta(x2) + 5.5 * delta(x3) }
subject to:
0.4 * x1 + 8.7 * x2 + 4.5 * x3 <=
387 - 3 * delta(x1) - 1 * delta(x2) - 3 * delta(x3)
x1 <= 93 * delta(x1)
x2 <= 94 * delta(x2),
x3 <= 100 * delta(x3), and
x1, x2, and x3 are non-negative integers.
In this problem, for all i in {1, 2, 3}, delta(xi) = 1 if xi > 0, whereas delta(xi) = 0 otherwise.
The R-code I have so far is:
install.packages("lpSolveAPI")
library(lpSolveAPI)
a <- c(3, 1, 3)
b <- c(0.4, 8.7, 4.5)
q <- 387
M <- c(93, 94, 100)
A <- c(17.5, 2.3, 5.5)
h <- c(2.8, 3.2, 3.5)
Fn <- function(u1, u2, u3, u4){
lprec <- make.lp(0, 3)
lp.control(lprec, "min")
set.objfn(lprec, u1)
add.constraint(lprec, u2, "<=", u3)
set.bounds(lprec, lower = rep(0, 3), upper = u4)
set.type(lprec, columns = 1:3, type = "integer")
solve(lprec)
return(list(Soln = get.variables(lprec), MinObj = get.objective(lprec)))
}
TheTest <- Fn(u1 = h, u2 = b, u3 = q, u4 = M)
Please, I was wondering if someone could tell me how to put delta functions into this R-code to solve the aforementioned optimization problem.
Rodrigo.
A constraint like x1 <= 93 * delta(x1) looks very strange to me. I think this is just x1 <= 93. For a MIP solver replace the function delta(x) by a binary variable d. Then add the constraint d <= x <= M*d where M is an upper bound on x. To be explicit, for your model we have:
min 2.8*x1 + 3.2*x2 + 3.5*x3 + 17.5*d1 + 2.3*d2 + 5.5*d3
0.4*x1 + 8.7*x2 + 4.5*x3 <= 387 - 3*d1 - d2 - 3*d3
d1 <= x1 <= 93*d1
d2 <= x2 <= 94*d2
d3 <= x3 <= 100*d3
x1 integer in [0,93]
x2 integer in [0,94]
x3 integer in [0,100]
d1,d2,d3 binary
This is now trivial to solve with any MIP solver. Note that a double inequality like d1 <= x1 <= 93*d1 can be written as two inequalities: d1<=x1 and x1<=93*d1.
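The formulation above could be set up in lpSolveAPI roughly as follows. This is a sketch under the assumption that the six columns are ordered x1, x2, x3, d1, d2, d3; the vector names follow the question's code.

```r
library(lpSolveAPI)

h <- c(2.8, 3.2, 3.5)    # per-unit costs on x1..x3
A <- c(17.5, 2.3, 5.5)   # fixed costs on d1..d3
b <- c(0.4, 8.7, 4.5)    # resource use per unit of x
a <- c(3, 1, 3)          # resource use per active indicator
q <- 387
M <- c(93, 94, 100)      # upper bounds on x1..x3

lprec <- make.lp(0, 6)   # assumed column order: x1, x2, x3, d1, d2, d3
invisible(lp.control(lprec, sense = "min"))
set.objfn(lprec, c(h, A))

# 0.4*x1 + 8.7*x2 + 4.5*x3 + 3*d1 + d2 + 3*d3 <= 387
add.constraint(lprec, c(b, a), "<=", q)
for (i in 1:3) {
  # x_i - M_i * d_i <= 0, i.e. x_i <= M_i * d_i
  add.constraint(lprec, c(1, -M[i]), "<=", 0, indices = c(i, i + 3))
  # d_i - x_i <= 0, i.e. d_i <= x_i
  add.constraint(lprec, c(-1, 1), "<=", 0, indices = c(i, i + 3))
}

set.type(lprec, columns = 1:3, type = "integer")
set.type(lprec, columns = 4:6, type = "binary")
set.bounds(lprec, lower = rep(0, 3), upper = M, columns = 1:3)

status <- solve(lprec)            # 0 means an optimal solution was found
solution <- get.variables(lprec)
min_obj <- get.objective(lprec)
```

As stated, every cost is non-negative and the all-zero point is feasible, so the minimum is 0 with every x and d equal to 0. A non-trivial answer would need an extra requirement, such as a lower bound on total output, which the question does not give.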

Hessian is -1 * Hessian when finding function maxima with optim

I would like to find the maxima of the function:
Fd <- 224 * d1 + 84 * d2 + d1 * d2 - 2 * d1^2 - d2^2
I can do that using an 'exhaustive' search using the following code:
my.data <- expand.grid(x1 = seq(0, 200, 1), x2 = seq(0, 200, 1))
d1 <- my.data[,1]
d2 <- my.data[,2]
Fd <- 224 * d1 + 84 * d2 + d1 * d2 - 2 * d1^2 - d2^2
new.data <- data.frame(Fd = Fd, d1 = d1, d2 = d2)
# identify values of d1 and d2 that maximize Fd
new.data[new.data$Fd == max(new.data$Fd),]
# Fd d1 d2
# 16157 11872 76 80
The function has a maximum value of 11872 when d1 = 76 and d2 = 80.
I can also locate the maxima using optim with the code below:
Fd <- function(betas) {
b1 = betas[1]
b2 = betas[2]
-1 * (224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2)
}
optim(c(1,1), Fd, hessian = TRUE)
I multiply the function above by -1 to obtain the maxima and noticed the maximum value returned by this code is the negative of the true maximum value. Also, the Hessian returned by this code is the true Hessian * -1:
true.hessian <- matrix(c(-4, 1, 1, -2), nrow = 2, byrow = TRUE)
true.hessian
estimated.hessian <- -1 * true.hessian
estimated.hessian
I never realized this until now and did not see it mentioned on the optim page. Should I be concerned? If so, in what circumstances?
Is there an option in the optim statement to return the true Hessian of the original function rather than -1 * the original function when finding the maxima? Or should I just be aware that the Hessian is multiplied by -1 when I search for the maxima (rather than search for the minima) and correct the Hessian myself?
This will return the correct Hessian when maximizing the function:
Fd <- function(betas) {
b1 = betas[1]
b2 = betas[2]
(224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2)
}
optim(c(1,1), Fd, control=list(fnscale=-1), hessian = TRUE)
# $par
# [1] 76.00046 79.99999
#
# $value
# [1] 11872
#
# $counts
# function gradient
# 129 NA
#
# $convergence
# [1] 0
#
# $message
# NULL
#
# $hessian
# [,1] [,2]
# [1,] -3.999998 0.999999
# [2,] 0.999999 -2.000000
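The sign flip can be reproduced independently of the search itself with optimHess(), which differentiates whatever function it is handed. A minimal sketch, where neg_Fd is a name introduced here for the negated objective:

```r
Fd <- function(betas) {
  b1 <- betas[1]
  b2 <- betas[2]
  224 * b1 + 84 * b2 + b1 * b2 - 2 * b1^2 - b2^2
}
neg_Fd <- function(betas) -Fd(betas)

true_hessian <- matrix(c(-4, 1, 1, -2), nrow = 2, byrow = TRUE)

# Finite-difference Hessians evaluated at the known maximum (76, 80).
h_max <- optimHess(c(76, 80), Fd)       # Hessian of Fd itself
h_min <- optimHess(c(76, 80), neg_Fd)   # Hessian of -Fd, i.e. -1 * h_max

# Both eigenvalues are negative, so the stationary point of Fd is a maximum.
hessian_eigenvalues <- eigen(true_hessian)$values
```

So minimizing -Fd by hand reports the Hessian of -Fd, while passing fnscale = -1 keeps the function, and therefore the Hessian, in its original orientation.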
