Is it possible to solve the following problem in R?
In particular, I want to find the values of a1 and a2 minimizing the loss below:
> n <- 1000
> x <- rnorm(n, 1, 1)
> e <- rnorm(n, 0, 1)
> d <- a1+a2*x+e < 0
> loss <- (mean(d) - 0.5) + (mean((a1 + a2*x + e)[d == 0]) - 2)
That is, I want to find the values of a1 and a2 that make mean(d) and mean((a1+a2*x+e)[d == 0]) as close as possible to 0.5 and 2, respectively.
(the chosen values 0.5 and 2 are just placeholder values)
Use optim with a function f that computes the loss you defined. p is the parameter vector, i.e. p[1] is your a1 and p[2] is your a2. Use reasonable starting values when calling optim with your function.
f <- \(p) {
  d <- p[1] + p[2]*x + e < 0
  (mean(d) - 0.5) + (mean((p[1] + p[2]*x + e)[d]) - 2)
}
res <- optim(c(0, 0), f)
res$par
# [1] 4.393432e+53 1.010012e+55 ## a1 and a2
Note that d is already logical, so it can be used for indexing directly: [d] picks the observations where a1 + a2*x + e is negative, and [!d] the rest (the d == 0 subset from the question).
If you get different results with different starting values, the loss surface might be multi-modal.
Data:
n <- 1e3; set.seed(42); x <- rnorm(n, 1, 1); e <- rnorm(n, 0, 1)
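The huge numbers in res$par suggest that this particular loss is unbounded below: both terms are signed differences, so optim can push the second term towards -Inf. If the intent is to get both quantities as close as possible to their targets, one option (an assumption on my part, not something stated in the question) is a squared loss on the d == 0 subset:
f2 <- \(p) {
  d <- p[1] + p[2]*x + e < 0
  ## squared discrepancies keep the loss bounded below at 0
  (mean(d) - 0.5)^2 + (mean((p[1] + p[2]*x + e)[!d]) - 2)^2
}
res2 <- optim(c(0, 0), f2)
res2$par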
I get the error below when I run the MWE code that follows. Does anyone know how to resolve it? Thanks!
Error: Error 10020: Q matrix is not positive semi-definite (PSD). Set NonConvex parameter to 2 to solve model.
MWE:
library(gurobi)
library(Matrix)
model <- list()
#optimization problem:
# max x + y
# s.t.
# -x + y <= 0
# x^2 - y^2 <= 10
# 0 <= x <= 20
# 0 <= y <= 20
model$obj <- c(1,1)
model$A <- matrix(c(-1,1), nrow=1, byrow=T) # for LHS of linear constraint: -x + y <= 0
model$rhs <- c(0) # for RHS of linear constraint: -x + y <= 0
model$ub[1] = 20 # x <= 20
model$ub[2] = 20 # y <= 20
model$sense <- c('<')
# non-convex quadratic constraint: x^2 - y^2 <= 10
qc1 <- list()
qc1$Qc <- spMatrix(2, 2, c(1, 2), c(1, 2), c(1.0, -1.0))
qc1$rhs <- 10
model$quadcon <- list(qc1)
#the QC constraint is a non-convex quadratic constraint, so set NonConvex = 2
model$params <- list(NonConvex=2)
gurobi_write(model, 'quadtest.lp')
result <- gurobi(model) # THIS IS WHERE I GET THE ERROR ABOVE
print(result$objval)
print(result$x)
Never mind, I see that I can fix this by not putting params in the model list, and instead passing them as the second argument to the gurobi() call:
params <- list(NonConvex=2)
result <- gurobi(model, params)
As the title suggests, I would like to solve the following problem. Let f denote a certain function and let f0 denote a given constant. Is there an economical way of finding max{x : f(x) <= f0}?
Here is what an example would look like:
f = function(x) (x-2)^2
f0 = 0.4
and in that case the correct answer would be about 2.63 (i.e. 2 + sqrt(0.4)). Thank you in advance.
One possibility for optimizing with constraints is to define a version of your function f that returns -Inf whenever the constraint is not met, so an infeasible point can never be the maximum:
f <- function(x) (x-2)^2
f0 <- 0.4
f_optim <- function(x, a = f0) ifelse(f(x) <= a, f(x), -Inf)
optimize(f_optim, c(-10, 10), a = f0, maximum = T, tol = .Machine$double.eps)
$maximum
[1] 2.632456
$objective
[1] 0.4
Another option is to minimize the absolute difference between f(x) and f0:
f0 = 0.04
f = function(x) (x - 2)^2
g = function(x, f0) {
  delta = f0 - f(x)
  abs(delta)
}
optimize(g, c(0, 10), f0 = f0, maximum = F, tol = .Machine$double.eps)
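A note on the last approach, and a possible alternative not used in either answer: minimizing |f(x) - f0| finds a solution of f(x) = f0, which may be either the smaller or the larger one depending on the search interval. If f is known to be increasing to the right of its minimizer (as (x - 2)^2 is for x > 2), the boundary point max{x : f(x) <= f0} can be found directly with base R's uniroot; a minimal sketch, with the bracketing interval c(2, 10) chosen by assumption:
f <- function(x) (x - 2)^2
f0 <- 0.4
## root of f(x) - f0 on the increasing branch of f
uniroot(function(x) f(x) - f0, interval = c(2, 10), tol = .Machine$double.eps)$root
# about 2.6325, i.e. 2 + sqrt(0.4)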
I would like to estimate the parameter of the exponential distribution using the Maximum Product of Spacings (MPS) method. I have to minimize:
-(1/(n + 1))*(sum of log D[i] from i = 1 to n + 1),
where D[i] = F(x[i]) - F(x[i - 1])
And the following is my R code:
n<- 10
mydata<- rexp(n, rate=2)
x<- sort(mydata)
fnn<- function(lambda,x){
  for (i in 2:n){
    c<- 1-exp(-lambda*x[i])
    d<- 1-exp(-lambda*x[i-1])
  }
  s<- (1/(n-1))*sum(log(c-d))
  return(-s)
}
optim(0.8, fnn, x=x)
Can someone please verify if I am doing the right thing here?
The output I obtained is far from the true value of lambda = 2.
$`par`
[1] 0.92375
$value
[1] 0.1847188
$counts
function gradient
18 NA
$convergence
[1] 0
$message
NULL
What modifications should I include?
The problem with your code is that it overwrites c and d on every pass through the for loop, so only the last spacing survives. It also has a bug in the multiplicative constant: the code uses 1/(n - 1) where the formula calls for 1/(n + 1).
Here is a corrected version. The key is to allocate the result vectors before the loop with numeric(n - 1) and fill one element per iteration.
I also include a simpler version that takes advantage of R's built-in pexp.
fnn <- function(lambda, x){
  n <- length(x)
  c <- numeric(n - 1)
  d <- numeric(n - 1)
  for (i in 2:n){
    c[i - 1] <- 1 - exp(-lambda*x[i])
    d[i - 1] <- 1 - exp(-lambda*x[i-1])
  }
  s <- (1/(n + 1))*sum(log(c - d))
  return(-s)
}
fnn2 <- function(lambda, x){
  n <- length(x)
  D <- log(pexp(x[-1], rate = lambda) - pexp(x[-n], rate = lambda))
  s <- sum(D)/(n + 1)
  -s
}
set.seed(1234)
n <- 10
mydata <- rexp(n, rate = 2)
x <- sort(mydata)
opt <- optim(0.8, fnn, x = x)
opt2 <- optim(0.8, fnn2, x = x)
opt$par
#[1] 2.9225
opt2$par
#[1] 2.9225
identical(opt$par, opt2$par)
#[1] TRUE
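A side note that is not part of the answer above: lambda is a single parameter, and optim's default Nelder-Mead method warns that one-dimensional optimization is unreliable. A minimal sketch using method = "Brent" (which requires finite bounds) or optimize instead; the bounds c(1e-3, 100) are an arbitrary illustrative choice:
## one-dimensional minimization of the same objective
optim(0.8, fnn2, x = x, method = "Brent", lower = 1e-3, upper = 100)$par
optimize(fnn2, interval = c(1e-3, 100), x = x)$minimum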
I've found polynomial coefficients from my data:
R <- c(0.256,0.512,0.768,1.024,1.28,1.437,1.594,1.72,1.846,1.972,2.098,2.4029)
Ic <- c(1.78,1.71,1.57,1.44,1.25,1.02,0.87,0.68,0.54,0.38,0.26,0.17)
NN <- 3
ft <- lm(Ic ~ poly(R, NN, raw = TRUE))
pc <- coef(ft)
So I can create a polynomial function:
f1 <- function(x) pc[1] + pc[2] * x + pc[3] * x ^ 2 + pc[4] * x ^ 3
And, for example, take a derivative (this uses the Deriv package):
library(Deriv)
g1 <- Deriv(f1)
How can I create a universal function that doesn't have to be rewritten for every new polynomial degree NN?
My original answer may not be what you really want, as it was numerical rather than symbolic. Here is the symbolic solution.
## use `"x"` as variable name
## taking polynomial coefficient vector `pc`
## can return a string, or an expression by further parsing (mandatory for `D`)
f <- function (pc, expr = TRUE) {
  stringexpr <- paste("x", seq_along(pc) - 1, sep = " ^ ")
  stringexpr <- paste(stringexpr, pc, sep = " * ")
  stringexpr <- paste(stringexpr, collapse = " + ")
  if (expr) return(parse(text = stringexpr))
  else return(stringexpr)
}
## an example cubic polynomial with coefficients 0.1, 0.2, 0.3, 0.4
cubic <- f(pc = 1:4 / 10, TRUE)
## using R base's `D` (requiring expression)
dcubic <- D(cubic, name = "x")
# 0.2 + 2 * x * 0.3 + 3 * x^2 * 0.4
## using `Deriv::Deriv`
library(Deriv)
dcubic <- Deriv(cubic, x = "x", nderiv = 1L)
# expression(0.2 + x * (0.6 + 1.2 * x))
Deriv(f(1:4 / 10, FALSE), x = "x", nderiv = 1L) ## use string, get string
# [1] "0.2 + x * (0.6 + 1.2 * x)"
Of course, Deriv makes higher-order derivatives easier to get: we can simply set nderiv. With D, however, we have to apply it recursively (see the examples in ?D).
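For illustration (this is not part of the original answer), a minimal sketch of that recursion, wrapped in a small helper whose name Dk is made up here:
## apply D repeatedly to get a k-th derivative with respect to x
Dk <- function(expr, name = "x", k = 1L) {
  for (i in seq_len(k)) expr <- D(expr, name)
  expr
}
Dk(cubic, "x", 2)  # the (unsimplified) second derivative of the cubic
With Deriv, by contrast, we just set nderiv: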
Deriv(cubic, x = "x", nderiv = 2L)
# expression(0.6 + 2.4 * x)
Deriv(cubic, x = "x", nderiv = 3L)
# expression(2.4)
Deriv(cubic, x = "x", nderiv = 4L)
# expression(0)
If we use expression, we will be able to evaluate the result later. For example,
eval(cubic, envir = list(x = 1:4)) ## cubic polynomial
# [1] 1.0 4.9 14.2 31.3
eval(dcubic, envir = list(x = 1:4)) ## its first derivative
# [1] 2.0 6.2 12.8 21.8
The above implies that we can wrap an expression up into a function. Using a function has several advantages, one being that we can plot it with curve or plot.function.
fun <- function(x, expr) eval.parent(expr, n = 0L)
Note, the success of fun requires expr to be an expression in terms of the symbol x. If expr were defined in terms of y, for example, we would need to define fun as function (y, expr). Now let's use curve to plot cubic and dcubic on the range 0 < x < 5:
curve(fun(x, cubic), from = 0, to = 5) ## colour "black"
curve(fun(x, dcubic), add = TRUE, col = 2) ## colour "red"
The most convenient way is, of course, to define a single function FUN rather than using the f + fun combination. That way we also don't need to worry about keeping the variable name consistent between f and fun.
FUN <- function (x, pc, nderiv = 0L) {
  ## check missing arguments
  if (missing(x) || missing(pc)) stop ("arguments missing with no default!")
  ## expression of the polynomial
  stringexpr <- paste("x", seq_along(pc) - 1, sep = " ^ ")
  stringexpr <- paste(stringexpr, pc, sep = " * ")
  stringexpr <- paste(stringexpr, collapse = " + ")
  expr <- parse(text = stringexpr)
  ## taking derivatives
  dexpr <- Deriv::Deriv(expr, x = "x", nderiv = nderiv)
  ## evaluation
  val <- eval.parent(dexpr, n = 0L)
  ## note, if we take too many derivatives so that `dexpr` becomes a constant,
  ## `val` is free of `x` and will only have length 1;
  ## we need to repeat this constant to match `length(x)`
  if (length(val) == 1L) val <- rep.int(val, length(x))
  ## now we return
  val
}
Suppose we want to evaluate a cubic polynomial with coefficients pc <- c(0.1, 0.2, 0.3, 0.4) and its derivatives on x <- seq(0, 1, 0.2), we can simply do:
FUN(x, pc)
# [1] 0.1000 0.1552 0.2536 0.4144 0.6568 1.0000
FUN(x, pc, nderiv = 1L)
# [1] 0.200 0.368 0.632 0.992 1.448 2.000
FUN(x, pc, nderiv = 2L)
# [1] 0.60 1.08 1.56 2.04 2.52 3.00
FUN(x, pc, nderiv = 3L)
# [1] 2.4 2.4 2.4 2.4 2.4 2.4
FUN(x, pc, nderiv = 4L)
# [1] 0 0 0 0 0 0
Now plotting is also easy:
curve(FUN(x, pc), from = 0, to = 5)
curve(FUN(x, pc, 1), from = 0, to = 5, add = TRUE, col = 2)
curve(FUN(x, pc, 2), from = 0, to = 5, add = TRUE, col = 3)
curve(FUN(x, pc, 3), from = 0, to = 5, add = TRUE, col = 4)
Since my solution with symbolic derivatives has already grown quite long, I use a separate section for the numerical calculations. We can do this because, for polynomials, the derivatives are known explicitly, so we can code them directly. Note that there is no use of R expressions here; everything is done with ordinary functions.
So we first generate the polynomial basis from degree 0 to degree p - n, then multiply by the coefficients and the factorial multipliers (the n-th derivative of beta * x^k is beta * k! / (k - n)! * x^(k - n) for k >= n, and 0 otherwise). It is more convenient to use outer than poly here.
## use `outer`
g <- function (x, pc, nderiv = 0L) {
  ## check missing arguments
  if (missing(x) || missing(pc)) stop ("arguments missing with no default!")
  ## polynomial order p
  p <- length(pc) - 1L
  ## number of derivatives
  n <- nderiv
  ## early return?
  if (n > p) return(rep.int(0, length(x)))
  ## polynomial basis from degree 0 to degree `(p - n)`
  X <- outer(x, 0:(p - n), FUN = "^")
  ## initial coefficients
  ## the additional `+ 1L` is because R vectors start from index 1, not 0
  beta <- pc[n:p + 1L]
  ## factorial multiplier
  beta <- beta * factorial(n:p) / factorial(0:(p - n))
  ## matrix-vector multiplication
  drop(X %*% beta)
}
We still use the example x and pc defined in the symbolic solution:
x <- seq(0, 1, by = 0.2)
pc <- 1:4 / 10
g(x, pc, 0)
# [1] 0.1000 0.1552 0.2536 0.4144 0.6568 1.0000
g(x, pc, 1)
# [1] 0.200 0.368 0.632 0.992 1.448 2.000
g(x, pc, 2)
# [1] 0.60 1.08 1.56 2.04 2.52 3.00
g(x, pc, 3)
# [1] 2.4 2.4 2.4 2.4 2.4 2.4
g(x, pc, 4)
# [1] 0 0 0 0 0 0
The result is consistent with what we got from FUN in the symbolic solution.
Similarly, we can plot g using curve:
curve(g(x, pc), from = 0, to = 5)
curve(g(x, pc, 1), from = 0, to = 5, col = 2, add = TRUE)
curve(g(x, pc, 2), from = 0, to = 5, col = 3, add = TRUE)
curve(g(x, pc, 3), from = 0, to = 5, col = 4, add = TRUE)
After quite a lot of effort demonstrating how we can work this out ourselves, consider using the R package polynom. It is a small package that implements construction, derivatives, integration, arithmetic and root-finding for univariate polynomials, and it is written entirely in R, without any compiled code.
## install.packages("polynom")
library(polynom)
We still consider the cubic polynomial example used before.
pc <- 1:4 / 10
## step 1: making a "polynomial" object as preparation
pcpoly <- polynomial(pc)
#0.1 + 0.2*x + 0.3*x^2 + 0.4*x^3
## step 2: compute derivative
expr <- deriv(pcpoly)
## step 3: convert to function
g1 <- as.function(expr)
#function (x)
#{
# w <- 0
# w <- 1.2 + x * w
# w <- 0.6 + x * w
# w <- 0.2 + x * w
# w
#}
#<environment: 0x9f4867c>
Note that, with this step-by-step construction, the resulting function has all parameters built in; it requires only a single argument, the x value. In contrast, the functions in the other two answers also take the coefficients and the derivative order as mandatory arguments. We can call this function:
g1(seq(0, 1, 0.2))
# [1] 0.200 0.368 0.632 0.992 1.448 2.000
To produce the same graph we saw in the other two answers, we compute the remaining derivatives as well:
g0 <- as.function(pcpoly) ## original polynomial
## second derivative
expr <- deriv(expr)
g2 <- as.function(expr)
#function (x)
#{
# w <- 0
# w <- 2.4 + x * w
# w <- 0.6 + x * w
# w
#}
#<environment: 0x9f07c68>
## third derivative
expr <- deriv(expr)
g3 <- as.function(expr)
#function (x)
#{
# w <- 0
# w <- 2.4 + x * w
# w
#}
#<environment: 0x9efd740>
Perhaps you have already noticed that I did not specify nderiv, but recursively took one derivative at a time. This may be a disadvantage of the package: it does not directly support higher-order derivatives.
Now we can make a plot
## As mentioned, `g0` to `g3` are parameter-free
curve(g0(x), from = 0, to = 5)
curve(g1(x), add = TRUE, col = 2)
curve(g2(x), add = TRUE, col = 3)
curve(g3(x), add = TRUE, col = 4)
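Finally, tying this back to the data in the question, here is a minimal sketch (assuming the lm fit from the question) that builds the fitted cubic and its first derivative with polynom; the names ft, pc, f1 and g1 mirror the question:
library(polynom)
R  <- c(0.256,0.512,0.768,1.024,1.28,1.437,1.594,1.72,1.846,1.972,2.098,2.4029)
Ic <- c(1.78,1.71,1.57,1.44,1.25,1.02,0.87,0.68,0.54,0.38,0.26,0.17)
NN <- 3
ft <- lm(Ic ~ poly(R, NN, raw = TRUE))
pc <- unname(coef(ft))             # raw coefficients, lowest degree first
fitpoly <- polynomial(pc)          # the fitted polynomial of degree NN
f1 <- as.function(fitpoly)         # parameter-free polynomial function
g1 <- as.function(deriv(fitpoly))  # its first derivative
f1(1); g1(1)                       # fitted value and slope at R = 1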