I am using the R programming language. Using the optim() function (from the stats package) with the "BFGS" method, I am interested in minimizing the following function (also called the "Rosenbrock function"):
If you define this function and its gradient, it is pretty straightforward to optimize with optim() and the BFGS algorithm (note: the BFGS algorithm requires knowledge of the function's derivative):
fr <- function(x) { ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) { ## Gradient of 'fr'
x1 <- x[1]
x2 <- x[2]
c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
200 * (x2 - x1 * x1))
}
res <- optim(c(-1.2,1), fr, grr, method = "BFGS")
> res
$par
[1] 1 1
$value
[1] 9.594956e-18
$counts
function gradient
110 43
$convergence
[1] 0
$message
NULL
Suppose you are working with a complicated, high-dimensional function. Its derivative will be difficult to work out by hand and then code as a separate function (i.e. an additional place where you can make a mistake). Are there any "automatic" ways in R such that, if you write a mathematical function, R can "infer" its derivative?
For instance, in a new R session - would there have been some way to run the BFGS algorithm without explicitly defining the derivative?
fr <- function(x) { ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
#pseudo code
res <- optim(c(-1.2,1), fr, ??? , method = "BFGS")
Does anyone know if something like this is possible? Can R automatically infer the derivative?
I thought of an approach where you could use a preexisting "numerical differentiation" function in R to approximate the derivative at each iteration, and then feed this approximation into the BFGS algorithm, but that sounds very complicated and unnecessary.
It would have been nice if R could somehow automatically infer the derivative of a function.
References:
https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/optim
This is already built into the optim() function. If you don't supply the gradient, optim() approximates it numerically with finite differences, e.g.
fr <- function(x) { ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
optim(c(-1.2,1), fr, method = "BFGS")
# $par
# [1] 0.9998044 0.9996084
#
# $value
# [1] 3.827383e-08
#
# $counts
# function gradient
# 118 38
#
# $convergence
# [1] 0
#
# $message
# NULL
Note that (at least for this case) the solution is very close to that found using the analytical derivatives.
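If an exact gradient is still wanted without deriving it by hand, base R's deriv() (in the stats package) can differentiate simple expressions symbolically. The following is a minimal sketch, assuming the objective can be written as a single expression in named scalar variables; rosen, fr_sym and gr_sym are illustrative names that do not appear in the original code.

```r
# Symbolic differentiation of the Rosenbrock function with stats::deriv().
# function.arg = TRUE returns a function of (x1, x2) whose result carries
# the gradient as an attribute.
rosen <- deriv(~ 100 * (x2 - x1^2)^2 + (1 - x1)^2,
               c("x1", "x2"), function.arg = TRUE)

# Wrappers (illustrative names) that adapt the result to optim()'s interface
fr_sym <- function(x) as.numeric(rosen(x[1], x[2]))
gr_sym <- function(x) as.numeric(attr(rosen(x[1], x[2]), "gradient"))

res_sym <- optim(c(-1.2, 1), fr_sym, gr_sym, method = "BFGS")
res_sym$par
res_sym$convergence
```

deriv() only knows a limited table of functions (arithmetic, exp, log, sin, cos and a few others), so for objectives it cannot handle, the built-in finite-difference gradient shown above remains the simplest option.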
#generating 100 uniformly distributed numbers
u1 <- runif(100,0,1)
u2 <- runif(100,0,1)
x1 <- function(x, y) {
return(sqrt(-2 * log(x) * cos(2 * pi * y)))
}
x2 <- function(x, y) {
return(sqrt(-2 * log(x) * sin(2 * pi * y)))
}
#applying x1
x1_vals <- mapply(x1, u1, u2)
#applying x2
x2_vals <- mapply(x2, u1, u2)
Hi, I want to write a Box-Muller function; part of my attempt is shown above. I'm trying to avoid loops (for loops especially) as much as possible.
However, I keep getting NaN values from my x1/x2 functions. I can't figure out what's wrong.
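Here is the Box-Muller formula corrected. In the question, the cos/sin factor was placed inside sqrt(); the square root should apply only to -2 * log(x).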
x1 <- function(x, y) sqrt(-2 * log(x)) * cos(2 * pi * y)
x2 <- function(x, y) sqrt(-2 * log(x)) * sin(2 * pi * y)
u1 <- runif(1000,0,1)
u2 <- runif(1000,0,1)
# applying x1
x1_vals <- x1(u1, u2)
all(is.finite(x1_vals))
#> [1] TRUE
# applying x2
x2_vals <- x2(u1, u2)
all(is.finite(x2_vals))
#> [1] TRUE
old_par <- par(mfrow = c(1, 2))
hist(x1_vals, freq = FALSE)
curve(dnorm, from = -4, to = 4, add = TRUE)
hist(x2_vals, freq = FALSE)
curve(dnorm, from = -4, to = 4, add = TRUE)
par(old_par)
Created on 2022-09-28 with reprex v2.0.2
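The two corrected functions can also be folded into a single generator. This is only a sketch under that assumption; box_muller, z1 and z2 are hypothetical names, not part of the answer above.

```r
# Box-Muller transform: turns pairs of independent Uniform(0, 1) draws into
# pairs of independent standard normal draws. Fully vectorized, no loops.
box_muller <- function(n) {
  u1 <- runif(n)
  u2 <- runif(n)
  r <- sqrt(-2 * log(u1))   # radius: sqrt() applies to the log term only
  theta <- 2 * pi * u2      # angle
  cbind(z1 = r * cos(theta), z2 = r * sin(theta))
}

set.seed(42)                # arbitrary seed, for reproducibility only
z <- box_muller(1000)
all(is.finite(z))
colMeans(z)
```

Returning both columns uses each pair (u1, u2) once, which is how the transform is normally applied.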
Those functions are already vectorized, so you don't need to wrap mapply around them. Just pass the two vectors to x1 and x2:
x1(u1,u2)
[1] NaN NaN NaN NaN 0.3088330
[6] NaN 0.7866889 NaN NaN 1.7102801
[11] 2.1886770 NaN 1.5473627 1.0644378 0.8499208
(remaining values snipped)
Vectorization is a feature of the R language. If the expressions in the function body can all take vectors and return vectors of equal length then no loop wrapper is needed.
You are getting NaNs because sin and cos take both positive and negative values. With the parentheses placed as in the question, the product passed to sqrt is negative for some inputs, and sqrt of a negative number returns NaN.
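A quick way to confirm this explanation is to look at the value that reaches sqrt() in the question's version of x1. The sketch below uses a small sample and an arbitrary seed; inside and vals are illustrative names.

```r
set.seed(123)                               # arbitrary seed, for reproducibility only
u1 <- runif(10)
u2 <- runif(10)

# This is the argument of sqrt() in the question's x1: the cosine is inside it
inside <- -2 * log(u1) * cos(2 * pi * u2)
vals <- suppressWarnings(sqrt(inside))      # silence the "NaNs produced" warning

data.frame(inside, vals)
all(is.nan(vals) == (inside < 0))           # NaN exactly where the argument is negative
```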
I need to evaluate an integral in the following form:
\int_a^b f(x) \int_0^x g(t)(x-t)dtdx
Can you please suggest a way? I assume that this integral can't be done in the standard approach suggested in the following answer:
Standard approach
Update: Functions are added in the following image. f(x) basically represents a pdf of a uniform distribution but the g(t) is a bit more complicated. a and b can be any positive real numbers.
The domain of integration is a simplex (triangle) with vertices (a,a), (a,b) and (b,b). Use the SimplicialCubature package:
library(SimplicialCubature)
alpha <- 3
beta <- 4
g <- function(t){
((beta/t)^(1/2) + (beta/t)^(3/2)) * exp(-(t/beta + beta/t - 2)/(2*alpha^2)) /
(2*alpha*beta*sqrt(2*pi))
}
a <- 1
b <- 2
h <- function(tx){
t <- tx[1]
x <- tx[2]
g(t) * (x-t)
}
S <- cbind(c(a, a), c(a ,b), c(b, b))
adaptIntegrateSimplex(h, S)
# $integral
# [1] 0.01962547
#
# $estAbsError
# [1] 3.523222e-08
Another way, less efficient and less reliable, is:
InnerFunc <- function(t, x) { g(t) * (x - t) }
InnerIntegral <- Vectorize(function(x) { integrate(InnerFunc, a, x, x = x)$value})
integrate(InnerIntegral, a, b)
# 0.01962547 with absolute error < 2.2e-16
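Before trusting a nested integrate() call on a complicated integrand, it can help to run the same construction on a case with a known answer. The sketch below is a sanity check under an assumption: g is replaced by the constant function 1 (g_const, a hypothetical stand-in), for which the integral over the triangle equals (b - a)^3 / 6.

```r
lo <- 1
hi <- 2
g_const <- function(t) rep(1, length(t))   # hypothetical test integrand, g(t) = 1

# Inner integral over t from lo to x, vectorized over x so that the outer
# integrate() can evaluate it at many points in one call
inner <- Vectorize(function(x) {
  integrate(function(t) g_const(t) * (x - t), lower = lo, upper = x)$value
})

numeric_val <- integrate(inner, lower = lo, upper = hi)$value
exact_val <- (hi - lo)^3 / 6               # closed form for g(t) = 1
c(numeric = numeric_val, exact = exact_val)
```

Agreement on a test case like this does not guarantee accuracy for the real g, but it does catch mistakes in the limits or in the argument order.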
I'm trying to solve a simple system of non-linear equations described in this post.
The system is two equations with two unknowns p and q and a free parameter lambda:
When lambda = 1 the system looks like this:
There is a unique solution and it's in the vicinity of p = 0.3, q = 0.1.
I'm trying to solve it with nleqslv. My objective function is:
library(nleqslv)
fn = function(x, lambda = 1){
# p = x[1]
# q = x[2]
pstar = exp(lambda * (1*x[2])) / (exp(lambda * (1*x[2])) + exp(lambda * (1 - x[2])))
qstar = exp(lambda * (1 - x[1])) / (exp(lambda * ((1 - x[1]))) + exp(lambda * (9*x[1])))
return(c(pstar,qstar))
}
but the results don't match what the plot shows:
> xstart = c(0.1, 0.3)
> nleqslv(xstart, fn)$x
[1] 1.994155 -8.921285
My first question is: am I using nleqslv correctly? I thought so after looking at other examples. But now I'm not sure.
My second question: is this a good problem for nleqslv? Or am I barking up the wrong tree?
Your function does not reflect properly what you want.
You can see this by evaluating fn(c(0.3,0.1)) as follows.
fn(c(0.3,0.1))
[1] 0.3100255 0.1192029
So the output is very close to the input. You wanted (almost) zero as output.
So you want to solve the system for p and q.
What you need to do is to make your function return the difference between the input p and the expression for pstar and the difference between the input q and the expression for qstar.
So rewrite your function as follows
fn <- function(x, lambda = 1){
p <- x[1]
q <- x[2]
pstar <- exp(lambda * (1*x[2])) / (exp(lambda * (1*x[2])) + exp(lambda * (1 - x[2])))
qstar <- exp(lambda * (1 - x[1])) / (exp(lambda * ((1 - x[1]))) + exp(lambda * (9*x[1])))
return(c(pstar-p,qstar-q))
}
and then call nleqslv as follows (PLEASE always show all the code you are using. You left out the library(nleqslv)).
library(nleqslv)
xstart <- c(0.1, 0.3)
nleqslv(xstart, fn)
This will display the full output of the function. It is always a good idea to check for success: always check $termcd.
$x
[1] 0.3127804 0.1064237
$fvec
[1] 5.070055e-11 6.547240e-09
$termcd
[1] 1
$message
[1] "Function criterion near zero"
$scalex
[1] 1 1
$nfcnt
[1] 7
$njcnt
[1] 1
$iter
[1] 7
The result for $x is more what you expect.
Finally please use <- for assignment. If you don't there will come the day that you will be bitten by R and its magic.
There is nothing wrong with using nleqslv for this problem. You only made a small mistake.
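The residual form can be checked in base R, without nleqslv, by plugging the reported solution back into the function. This is a minimal sketch; resid_fn, sol and max_resid are illustrative names, and sol is simply the $x value printed above, rounded to seven decimals.

```r
# Residual form of the system: each component is zero at a solution
resid_fn <- function(x, lambda = 1) {
  p <- x[1]
  q <- x[2]
  pstar <- exp(lambda * q) / (exp(lambda * q) + exp(lambda * (1 - q)))
  qstar <- exp(lambda * (1 - p)) / (exp(lambda * (1 - p)) + exp(lambda * 9 * p))
  c(pstar - p, qstar - q)
}

sol <- c(0.3127804, 0.1064237)        # $x as reported by nleqslv above
max_resid <- max(abs(resid_fn(sol)))  # should be close to zero
max_resid
```

Both residuals are tiny at sol, whereas the question's original function returns values close to its input there, which is why the solver wandered off.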
I am solving a nonlinear equation using the package nleqslv, but I keep getting the error: Length of fn result <> length of x!
I can't spot where the vector length could be a problem. Does anyone know what mistake I made in my code?
library(nleqslv)
d_plus <- function(x) (log(55.75/x[1])+(0.026 + x[2]^2 / 2) * 0.25) / (x[2]*0.5) + 0 * x[3]
d_minus <- function(x) (log(55.75/x[1])+(0.026 - x[2]^2 / 2) * 0.25) / (x[2]*0.5) + 0 * x[3]
F_C0 <- function(x) 55.75 * pnorm(d_plus(x)) - x[1] * exp(-0.026 * 0.25) * pnorm(d_minus(x)) + 0 * x[3]
eqn <- function(x) F_C0(x) - x[3]
nleqslv( c(40, 1, 17.35), eqn)
I am trying to solve for x[2] and I have the input of x[1] and x[3]. It looks like I should get the numeric solution of x[2].
Your example is not reproducible since you have not shown all your code: library(nleqslv) is missing. Please show all your code.
As the first comment on your question stated, you are providing a vector to eqn, but the functions d_plus, d_minus and thus F_C0 return a scalar.
That implies that the length of the function result is not the same as the length of the input.
From your explanation you want to solve for x[2]. So the function presented to nleqslv must take a scalar as input and return a scalar.
This can be achieved as follows:
library(nleqslv)
d_plus <- function(x) (log(55.75/x[1])+(0.026 + x[2]^2 / 2) * 0.25) / (x[2]*0.5) + 0 * x[3]
d_minus <- function(x) (log(55.75/x[1])+(0.026 - x[2]^2 / 2) * 0.25) / (x[2]*0.5) + 0 * x[3]
F_C0 <- function(x) 55.75 * pnorm(d_plus(x)) - x[1] * exp(-0.026 * 0.25) * pnorm(d_minus(x)) + 0 * x[3]
eqn <- function(xpar) { x <- c(40,xpar,17.35);F_C0(x) - x[3] }
Insert the scalar argument of eqn, which is your x[2], into a vector x where the first and third entries are what you provided as starting values in your code.
Then running this
xstart <- 1
nleqslv( xstart, eqn)
results in this
$x
[1] 0.6815036
$fvec
[1] 6.18563e-11
$termcd
[1] 1
$message
[1] "Function criterion near zero"
$scalex
[1] 1
$nfcnt
[1] 5
$njcnt
[1] 1
$iter
[1] 5
Read the documentation of nleqslv to see what these items mean.
As you can see nleqslv found a solution to your problem.
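Since only one unknown remains, base R's uniroot() is another option for a scalar equation like this. The sketch below assumes the root lies in the bracket [0.1, 2]; that interval is a choice made here, not something given in the question, and uniroot() needs the function to change sign across it.

```r
d_plus  <- function(x) (log(55.75/x[1]) + (0.026 + x[2]^2 / 2) * 0.25) / (x[2] * 0.5)
d_minus <- function(x) (log(55.75/x[1]) + (0.026 - x[2]^2 / 2) * 0.25) / (x[2] * 0.5)
F_C0 <- function(x) 55.75 * pnorm(d_plus(x)) - x[1] * exp(-0.026 * 0.25) * pnorm(d_minus(x))

# Scalar function of the single unknown x[2]; x[1] and x[3] are held fixed
eqn <- function(xpar) {
  x <- c(40, xpar, 17.35)
  F_C0(x) - x[3]
}

# Assumed bracket: eqn must have opposite signs at the two ends
root <- uniroot(eqn, interval = c(0.1, 2), tol = 1e-10)
root$root
```

The value found this way should agree with the $x returned by nleqslv above.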
Suppose I have the following system of equations:
a * b = 5
sqrt(a * b^2) = 10
How can I solve these equations for a and b in R?
I guess this problem can be stated as an optimisation problem, with the following function... ?
fn <- function(a, b) {
rate <- a * b
shape <- sqrt(a * b^2)
return(c(rate, shape) )
}
In a comment the poster specifically asks about using solve and optim so we show how to solve this (1) by hand, (2) using solve, (3) using optim and (4) a fixed point iteration.
1) by hand First note that if we write a = 5/b based on the first equation and substitute that into the second equation we get sqrt(5/b * b^2) = sqrt(5 * b) = 10 so b = 20 and a = 0.25.
2) solve Regarding the use of solve these equations can be transformed into linear form by taking the log of both sides giving:
log(a) + log(b) = log(5)
0.5 * (log(a) + 2 * log(b)) = log(10)
which can be expressed as:
m <- matrix(c(1, .5, 1, 1), 2)
exp(solve(m, log(c(5, 10))))
## [1] 0.25 20.00
3) optim Using optim we can write this where fn is from the question. fn2 is formed by subtracting off the RHS of the equations and using crossprod to form the sum of squares.
fn2 <- function(x) crossprod( fn(x[1], x[2]) - c(5, 10))
optim(c(1, 1), fn2)
giving:
$par
[1] 0.2500805 19.9958117
$value
[1] 5.51508e-07
$counts
function gradient
97 NA
$convergence
[1] 0
$message
NULL
4) fixed point For this one, rewrite the equations in fixed point form, i.e. in the form c(a, b) = f(c(a, b)), and then iterate. In general there will be several ways to do this and not all of them will converge, but in this case it seems to work. We use starting values of 1 for both a and b. We divide both sides of the first equation by b to get the first equation in fixed point form, and we divide both sides of the second equation by sqrt(a) to get the second equation in fixed point form:
a <- b <- 1 # starting values
for(i in 1:100) {
a = 5 / b
b = 10 / sqrt(a)
}
data.frame(a, b)
## a b
## 1 0.25 20
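The loop above runs a fixed 100 iterations. A variant that stops once the iterates settle might look like the following sketch; tol and max_iter are hypothetical settings chosen here, not values from the answer.

```r
a <- b <- 1          # same starting values as above
tol <- 1e-10         # hypothetical convergence tolerance
max_iter <- 1000     # hypothetical safety cap on the number of iterations

for (i in seq_len(max_iter)) {
  a_new <- 5 / b                 # first equation in fixed point form
  b_new <- 10 / sqrt(a_new)      # second equation in fixed point form
  converged <- abs(a_new - a) < tol && abs(b_new - b) < tol
  a <- a_new
  b <- b_new
  if (converged) break           # stop once both updates are below tol
}

data.frame(a, b, iterations = i, converged)
```

Checking converged afterwards distinguishes a genuine fixed point from a loop that merely hit the iteration cap.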
Use this library.
library("nleqslv")
You need to define the multivariate function you want to solve for.
fn <- function(x) {
rate <- x[1] * x[2] - 5
shape <- sqrt(x[1] * x[2]^2) - 10
return(c(rate, shape))
}
Then you're good to go.
nleqslv(c(1,5), fn)
Always look at the detailed results. Numerical calculations can be tricky. In this case I got this:
Warning message:
In sqrt(x[1] * x[2]^2) : NaNs produced
That just means the procedure searched a region that included x[1] < 0 and then presumably noped the heck back to the right hand side of the plane.