Error in uniroot while calculating XIRR in R - r

Reproducible example:
v <- c(-400000.0,-200000.0, 660636.7)
d <- c("2021-10-27","2022-12-23","2023-01-04")
d1 <- as.Date(d, format="%Y-%m-%d")
tvm::xirr(v, d1) # gives the error below
Error in uniroot(xnpv, interval = interval, cf = cf, d = d, tau = tau, :
f.lower = f(lower) is NA
Excel XIRR returns 0.125 which seems correct.
The uniroot documentation says "Either interval or both lower and upper must be specified", and I'm not sure if tvm::xirr does so. I guess it does because it works well for many other sets of data.
Anyway, I could get it to work correctly in this case by providing a lower and upper (now that I know the answer via Excel) with some trial and error as below. But I'm not sure if my bounds will always hold.
> tvm::xirr(v, d1, f.lower = -0.2, f.upper=0.5)
[1] 10
> tvm::xirr(v, d1, f.lower = -0.2, f.upper=5)
[1] -1
> tvm::xirr(v, d1, lower = -0.99, upper=0.99)
[1] 0.1244512
Is this a bug or limitation of tvm::xirr or am I missing something?

Let us go down the rabbit hole. Firstly, let us read the source code for tvm::xirr:
xirr = function (cf, d, tau = NULL, comp_freq = 1, interval = c(-1, 10), ...)
{
    uniroot(xnpv, interval = interval, cf = cf, d = d, tau = tau,
            comp_freq = comp_freq, extendInt = "yes", ...)$root
}
xirr calls uniroot to find the rate i at which the function xnpv equals zero in the interval c(-1, 10). Default parameter values are tau = NULL and comp_freq = 1. Secondly, let us see the source code for xnpv:
xnpv = function (i, cf, d, tau = NULL, comp_freq = 1)
{
    if (is.null(tau))
        tau <- as.integer(d - d[1])/365
    delta <- if (comp_freq == 0) {
        1/(1 + i * tau)
    } else if (comp_freq == Inf) {
        exp(-tau * i)
    } else {
        1/((1 + i/comp_freq)^(tau * comp_freq))
    }
    sum(cf * delta)
}
We can visualize xnpv and its root as follows:
library(tvm)
v = c(-400000.0,-200000.0, 660636.7)
d = c("2021-10-27","2022-12-23","2023-01-04")
d1 = as.Date(d, format="%Y-%m-%d")
x = seq(-0.8, 10, 0.01)
y = sapply(x, function(x) xnpv(i = x, cf = v, d = d1, tau = as.integer(d1 - d1[1])/365))
plot(x, y, type = 'l', ylab = "xnpv", xlab = "i"); abline(h = 0, lty = 2); abline(v = 0.1244512, lty = 2)
As you can see, for comp_freq = 1 the factor 1/(1 + i/comp_freq)^(tau * comp_freq) in the definition of delta has a vertical asymptote at i = -1 whenever the exponent is non-zero (0^0 = 1 in R). Moreover, for i < -1 the expression is undefined in R: a negative number raised to a non-integer power yields NaN.
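A quick check of this behaviour with the example data (comp_freq = 1, using the xnpv shown above); the NaN at i = -1 is consistent with the "f.lower = f(lower) is NA" error for the default interval c(-1, 10):
xnpv(i = -1, cf = v, d = d1)    # Inf terms of opposite sign -> NaN
xnpv(i = -1.5, cf = v, d = d1)  # negative base, fractional exponent -> NaN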
To solve this issue, assuming comp_freq is neither 0 nor +Inf, you can call xirr as follows:
offset = 0.001; comp_freq = 1
tvm::xirr(v, d1, lower = -comp_freq+offset, upper = 10, comp_freq = comp_freq, tol = 1e-7) # I also changed the numerical tolerance for increased accuracy.
This assumes that the rate is no larger than 10 (the default upper bound). Finally, given that comp_freq = 1 is the default, xirr can fail under its default settings for perfectly valid inputs such as this one, which suggests this edge case was not covered by the package's tests.

I'm the package creator.
In this case, the uniroot algorithm tries to move past the i = -1 point, and fails. You can easily guide it with the lower bound as the OP has done. I could have set up a default lower bound >= 0 to deal with this, but due to the existence of negative interest rates, I decided not to. A possible solution would be to set a lower bound > -1 in the case that the compounding frequency is not 0 (simple interest) or Inf (continuous compounding) and the function call doesn't include explicit bounds (see the sketch below).
Thanks for the report.
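A possible wrapper along the lines the author describes (a rough sketch only, not the package's code; the name xirr_safe and the 1e-6 offset are arbitrary choices):
xirr_safe <- function(cf, d, tau = NULL, comp_freq = 1,
                      interval = c(-1, 10), ...) {
    extra <- list(...)
    if (!any(c("lower", "upper") %in% names(extra)) &&
        !comp_freq %in% c(0, Inf)) {
        # keep the search strictly above the asymptote at i = -comp_freq
        interval[1] <- -comp_freq + 1e-6
    }
    tvm::xirr(cf, d, tau = tau, comp_freq = comp_freq, interval = interval, ...)
}
xirr_safe(v, d1, tol = 1e-7)  # about 0.1244512, matching the workaround above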

Related

Solving (determining) a function at a point in R

In my R code below, how can I find out what rh1 is when y == 0.5?
Note that y uses atanh(rh1), which can be converted back to rh1 using tanh().
rh1 <- seq(-1, 0.1, by = 0.001)
y <- pnorm(-0.13, atanh(rh1), 0.2)
plot(rh1, y, type = "l")
Analytical solution
Consider a normal distribution X ~ N(mu, sd = 0.2). We want to find mu such that Pr(X < -0.13) = y.
Recall your previous question and my answer over there: Determine a normal distribution given its quantile information. Here we have something simpler, as there is only one unknown parameter and one piece of quantile information.
Again, we start with standardization:
Pr { X < -0.13 } = y
=> Pr { (X - mu) / 0.2 < (-0.13 - mu) / 0.2 } = y
=> Pr { Z < (-0.13 - mu) / 0.2 } = y    # Z ~ N(0, 1)
=> (-0.13 - mu) / 0.2 = qnorm(y)
=> mu = -0.13 - 0.2 * qnorm(y)
Now, let atanh(rh1) = mu, i.e., rh1 = tanh(mu). In short, the analytical solution is:
tanh(-0.13 - 0.2 * qnorm(y))
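Plugging y = 0.5 into this closed form (a quick sanity check; qnorm(0.5) is 0):
tanh(-0.13 - 0.2 * qnorm(0.5))
# [1] -0.1292726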
Numerical solution
It is a root finding problem. We first build the following function f, and we aim to find its root, i.e., the rh1 so that f(rh1) = 0.
f <- function (rh1, y) pnorm(-0.13, atanh(rh1), 0.2) - y
The simplest root-finding method is bisection; R's uniroot implements a refinement of it (Brent's method). I recommend reading Uniroot solution in R for how to work with it in general.
curve(f(x, 0.5), from = -1, to = 0.1); abline (h = 0, lty = 2)
We see there is a root between (-0.2, 0), so:
uniroot(f, c(-0.2, 0), y = 0.5)$root
# [1] -0.129243
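The small gap between this value and the analytical tanh(-0.13) = -0.1292726 is just uniroot's default tolerance (about 1e-4); tightening tol closes it:
uniroot(f, c(-0.2, 0), y = 0.5, tol = 1e-9)$root
# [1] -0.1292726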
Your function is monotonic so you can just create the inverse function.
rh1 <- seq(-1,.1,by=.001)
y <- pnorm(-.13,atanh(rh1),.2)
InverseFun = approxfun(y, rh1)
InverseFun(0.5)
[1] -0.1292726

Maximum likelihood estimation with uniform distribution in R leads to absurd result

I want to use the mle function to get estimates of a and b in a Unif(a,b) distribution. But I get absurd estimates nowhere close to 1 and 3.
library(stats4)
set.seed(20161208)
N <- 100
c <- runif(N, 1, 3)
LL <- function(min, max) {
    R <- runif(100, min, max)
    suppressWarnings((-sum(log(R))))
}
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS",
    lower = c(-Inf, 0), upper = c(Inf, Inf))
I got:
Call:
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS")
Coefficients:
min max
150.8114 503.6586
Any ideas of what's going on? Thank you in advance!
I would first point out where your code is wrong.
You need dunif not runif. You may define:
LL <- function (a, b) -sum(dunif(x, a, b, log = TRUE))
In my code below I did not use dunif, as the density is just 1 / (b - a) so I wrote it directly.
You are generating samples inside the objective function. For U[a, b] this happens to be OK because its density is free of x, but for other distributions the objective function changes at each iteration (see the illustration just below).
With box constraints, you need method = "L-BFGS-B", not the ordinary "BFGS". And you are not using the right constraints.
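A quick illustration that the objective in the question is random (it takes the log of the random draws themselves), which finite-difference-based optimizers cannot handle well; this reuses the LL from the question:
LL <- function(min, max) {
    R <- runif(100, min, max)
    suppressWarnings(-sum(log(R)))
}
LL(1, 3); LL(1, 3)  # same arguments, two different "likelihoods"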
Now in more depth...
For a length-n sample vector x from U[a, b], the likelihood is (b - a) ^ (-n), and the negative log-likelihood is n * log(b - a). Obviously the MLEs are a = min(x) and b = max(x).
Numerical optimization is completely unnecessary, and is in fact impossible without constraints. Look at the gradient vector:
( n / (a - b), n / (b - a) )
The partial derivative w.r.t. a is always negative and the one w.r.t. b is always positive; neither can be 0.
The numerical approach becomes feasible when we impose box constraints: -Inf < a <= min(x) and max(x) <= b < Inf. We then know for sure that the iteration terminates at the boundary.
My code below uses both optim and mle. Note that mle will fail when it inverts the Hessian matrix, which for the objective log(b - a) is
    [ -1/(b - a)^2    1/(b - a)^2 ]
    [  1/(b - a)^2   -1/(b - a)^2 ]
and hence exactly singular:
Code:
## 100 samples
set.seed(20161208); x <- runif(100, 1, 3)
# range(x)
# [1] 1.026776 2.984544
## using `optim`
nll <- function (par) log(par[2] - par[1]) ## objective function
gr_nll <- function (par) c(-1, 1) / diff(par) ## gradient function
optim(par = c(0, 4), fn = nll, gr = gr_nll, method = "L-BFGS-B",
      lower = c(-Inf, max(x)), upper = c(min(x), Inf), hessian = TRUE)
#$par
#[1] 1.026776 2.984544 ## <- reaches boundary!
#
# ...
#
#$hessian ## <- indeed singular!!
# [,1] [,2]
#[1,] -0.2609022 0.2609022
#[2,] 0.2609022 -0.2609022
## using `stats4::mle`
library(stats4)
nll. <- function (a, b) log(b - a)
mle(minuslogl = nll., start = list(a = 0, b = 4), method = "L-BFGS-B",
    lower = c(-Inf, max(x)), upper = c(min(x), Inf))
#Error in solve.default(oout$hessian) :
# Lapack routine dgesv: system is exactly singular: U[2,2] = 0

optim in R: non-finite finite-difference error

I have a simple likelihood function (from a normal dist with mean=0) that I want to maximize. optim keeps giving me this error:
Error in optim(par = phi, fn = loglike, estimates = estimates, NULL, hessian = TRUE, : non-finite finite-difference value [1]
Here is my data and likelihood function:
y = [ -0.01472 0.03942 0.03592 0.02776 -0.00090 ]
C = a varcov matrix:
1.66e-03 -0.000120 -6.780e-06 0.000102 -4.000e-05
-1.20e-04 0.001387 7.900e-05 -0.000140 -8.000e-05
-6.78e-06 0.000079 1.416e-03 -0.000070 8.761e-06
1.02e-04 -0.000140 -7.000e-05 0.001339 -6.000e-05
-4.00e-05 -0.000080 8.761e-06 -0.000060 1.291e-03
my log-likelihood function is:
lglkl = -.5*(log(det(v)) + (t(y) %*% vi %*% y)), where v = phi*I + C, vi = inverse(v), and I is the 5x5 identity matrix.
I am trying to get the mle estimate for "phi". I thought this would be a simple optimization problem but am struggling. Would really appreciate any help. Thanks in advance. My code is below:
loglike <- function(phi, y) {
    v = phi*I + C
    vi = solve(v)
    loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
    return(-loglike)
}
phi = 0
parm <- optim(par = phi, fn = loglike, y = y, NULL, hessian = TRUE,
              method = "L-BFGS-B", lower = 0, upper = 1000)
The error you ran into occurs because ϕ becomes negative beyond a certain number of iterations, which indicates that the constraints are not being applied correctly by the algorithm. Also, the solution does not converge to a single value but jumps between a few small values before reaching a point where the updated covariance matrix is no longer positive definite. At that stage det(v) < 0, log(det(v)) is undefined, and optim bails out.
To see what's happening, play with the maxit and ndeps parameters in the code below.
require("matrixcalc")
#-------------------------------------------------
# Log-likelihood function
#-------------------------------------------------
loglike <- function(phi, y) {
# Shift the covariance matrix
print(paste("phi = ", phi))
#v = phi*I + (1 - phi)*C
v = phi*I + C
stopifnot(is.positive.definite(v))
# Invert shifted matrix
vi = solve(v)
# Compute log likelihood
loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
print(paste("L = ", loglike))
return(-loglike)
}
#-------------------------------------------------
# Data
#-------------------------------------------------
y = c(-0.01472, 0.03942, 0.03592, 0.02776, -9e-04)
C = structure(c(0.00166, -0.00012, -6.78e-06, 0.000102, -4e-05, -0.00012,
0.001387, 7.9e-05, -0.00014, -8e-05, -6.78e-06, 7.9e-05,
0.001416, -7e-05, 8.761e-06, 0.000102, -0.00014, -7e-05,
0.001339, -6e-05, -4e-05, -8e-05, 8.761e-06, -6e-05, 0.001291),
.Dim = c(5L, 5L ))
#--------
# Initial parameter
#--------
I = diag(5)
phi = 50
#--------
# Minimize
#--------
parm <- optim(par = phi, fn = loglike, y = y, NULL, hessian = TRUE,
method = "L-BFGS-B", lower = 0.0001, upper = 1000,
control = list(trace = 3,
maxit = 1000,
ndeps = 1e-4) )
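An alternative worth noting (my own suggestion, not part of the answer above): reparameterize with phi = exp(theta), so the optimizer can never propose a negative phi and no box constraints are needed at all:
loglike_theta <- function(theta, y) {
    phi <- exp(theta)                      # guaranteed > 0
    v   <- phi * I + C
    vi  <- solve(v)
    ll  <- -0.5 * (log(det(v)) + t(y) %*% vi %*% y)
    return(-ll)
}
fit <- optim(par = log(50), fn = loglike_theta, y = y, hessian = TRUE,
             method = "BFGS")
exp(fit$par)  # estimate of phi on the original scale (may be near 0 if the boundary is optimal)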

Integrating beta-function-like function over compact support [0,alpha], alpha < 1

I would like to integrate the following function, named betalog:
g <- function(x, a, b) {
    if (a < 0 | b < 0) stop()
    temp <- (a-1)*log(x) + (b-1)*log(1-x)
    return( exp(temp) )
}
betalog <- function(x, a, b) {
    temp <- g(x = x, a = a, b = b) * log(x/(1-x))
    return( temp )
}
The function g is the integrand of the beta function. In theory, betalog should be integrable over any interval [0, alpha] with 0 < alpha < 1, provided a > 0 and b > 0.
However, I cannot numerically integrate betalog with very small a:
a <- 0.00001
b <- 1
alpha <- 0.5
integrate(betalog,a=a,b=b,lower=0,upper=alpha,subdivisions=1000000L)
Error in integrate(betalog, a = a, b = b, lower = 0, upper = alpha, subdivisions =
1000000L) :
non-finite function value
In fact, I cannot even compute the incomplete beta function using R integrate function when a is very small:
integrate(g,a=a,b=b,lower=0,upper=alpha,subdivisions=1000000L)
Error in integrate(g, a = a, b = b, lower = 0, upper = alpha, subdivisions = 1000000L) :
roundoff error is detected in the extrapolation table
Can anyone give me a tip for integrating such an incomplete-beta-like function in R?
> betalog(0, a, b)
[1] -Inf
Your function is singular at the lower bound. Recall that to compute an improper integral you must replace the singular bounds with dummy variables and take the limit from the correct side towards that bound. In particular,
> integrate(betalog,a=a,b=b,lower=0.000001,upper=alpha,subdivisions=10000000L)
-94.60292 with absolute error < 0.00014
> integrate(betalog,a=a,b=b,lower=.Machine$double.xmin * 1000,upper=alpha,subdivisions=10000
-244894.7 with absolute error < 10
> integrate(betalog,a=a,b=b,lower=.Machine$double.xmin,upper=alpha,subdivisions=10000000L)
Error in integrate(betalog, a = a, b = b, lower = .Machine$double.xmin, :
non-finite function value
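One way to automate this limiting procedure is to scan a sequence of shrinking lower bounds and see whether the values settle down or keep growing (a sketch using the a, b and alpha from the question; any integration errors are returned as NA):
eps <- 10^-(6:10)
sapply(eps, function(e)
    tryCatch(integrate(betalog, a = a, b = b, lower = e, upper = alpha,
                       subdivisions = 1000000L)$value,
             error = function(err) NA_real_))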
I suspect that your integral diverges, but this might be tricky since even state-of-the-art symbolic algebra systems can't prove that:
http://www.wolframalpha.com/input/?i=Integral%28x%5E%280.00001+-1%29+ln%28x%2F%281-x%29%29%2C+x%2C0%2C+0.5%29
Whatever the case, R is not the correct tool for this problem.

R Optimization Returning Incorrect Values

My objective function:
helper.post<-function(monthly.mean.return,
start.capital, #initial nest egg
target.legacy,
monthly.inflation.post,
monthly.withdrawals,
n.obs){
req = matrix(start.capital, n.obs+1, 1) #matrix for storing target weight
for (a in 1:n.obs) {
#cat("a: ",a,"\n")
req[a + 1, ] = req[a, ] * (1 + monthly.mean.return - monthly.inflation.post) - monthly.withdrawals[a,]
}
ending.value=req[nrow(req),]
#ending.value
value=target.legacy - ending.value
return(abs(value))
}
With the following optimization structure, changing n.obs between two values gives the same output:
e.g., n.obs = 288 and n.obs = 336 give the same value.
optimize(f = helper.post,
         start.capital = 1000000,
         target.legacy = 1000000,
         monthly.inflation.post = 0.002083333,
         monthly.withdrawals = matrix(rep(10000, n.obs)),
         n.obs = n.obs,
         lower = 0,
         upper = 1,
         tol = 0.00000000000000000000000000000000001)$minimum
The value it returns seems to be an approximation rather than the exact value. Any idea what I may be doing incorrectly? Would a different optimization tool work better for such precise optimization efforts? I tried uniroot, but it doesn't seem to work because the endpoints do not have opposite signs.
uniroot(helper.post,
        c(0, 1),
        start.capital = start.capital,
        target.legacy = target.legacy,
        monthly.inflation.post = monthly.inflation.post,
        monthly.withdrawals = monthly.withdrawals,
        n.obs = n.obs)$root
Let's start with a slight rewrite of your code. I replaced one-column matrices with vectors and added an option for returning either the error itself or its absolute value. You'll want the absolute value when minimizing the error with optimize, and the signed value when finding its root with uniroot.
helper.post <- function(monthly.mean.return,
                        start.capital,
                        target.legacy,
                        monthly.inflation.post,
                        monthly.withdrawals,
                        n.obs,
                        return.abs = TRUE) {
    req <- numeric(n.obs + 1)
    req[1] <- start.capital
    for (month in 1:n.obs) {
        req[month + 1] <- req[month] *
            (1 + monthly.mean.return - monthly.inflation.post) -
            monthly.withdrawals[month]
    }
    ending.value <- req[n.obs + 1]
    error <- target.legacy - ending.value
    return(ifelse(return.abs, abs(error), error))
}
Now let's optimize it:
n.obs <- 288
optimize(f = helper.post,
         start.capital = 1000000,
         target.legacy = 1000000,
         monthly.inflation.post = 0.002083333,
         monthly.withdrawals = matrix(rep(10000, n.obs)),
         n.obs = n.obs,
         lower = 0,
         upper = 1,
         tol = 1e-20)$minimum
# [1] 0.01208333
And let's check the solution with uniroot:
uniroot(helper.post,
        c(0, 1),
        start.capital = 1000000,
        target.legacy = 1000000,
        monthly.inflation.post = 0.002083333,
        monthly.withdrawals = matrix(rep(10000, n.obs)),
        n.obs = n.obs,
        return.abs = FALSE,
        tol = 1e-20)$root
# [1] 0.01208333
They match. There is nothing wrong with one or the other tool...
If you run again with a different value for n.obs, you will get the exact same result. Why? Because you have picked constant withdrawals and equal values for the start and target capitals: the output you get is the rate needed to keep that balance constant from one month to the next, regardless of the total number of months (indeed, 10000/1000000 + 0.002083333 = 0.01208333).
In fact, this is the kind of thing you would want to put in a unit test, because it is an expected and easily interpretable result. Another one that comes to mind: make the withdrawals zero everywhere; then you would expect the answer to match the inflation rate. Give it a try and see that it does indeed (a sketch of this check follows).
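A sketch of that second check, using the rewritten helper.post above with zero withdrawals; the expected minimum is the monthly inflation rate itself:
n.obs <- 288
optimize(f = helper.post,
         start.capital = 1000000,
         target.legacy = 1000000,
         monthly.inflation.post = 0.002083333,
         monthly.withdrawals = rep(0, n.obs),
         n.obs = n.obs,
         lower = 0,
         upper = 1,
         tol = 1e-20)$minimum
# approximately 0.002083333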
