How do I get R to calculate a negative IRR scenario?

I have been doing stochastic cash flow modeling. In some of these scenarios, IRR is negative (cash flows out exceed cash flows in over time). R seems to hate this. I get a uniroot error. I have used the FinCal package irr function, and I even tried to write my own uniroot IRR formula. It's important that any formula solves for both positive and negative IRR scenarios.
Any suggestions or ideas? Is there an R package that handles this, or a simple uniroot formula?
Thank you!

I ended up writing my own code (based on the irr function from FinCal) where errors are ignored. I also changed the range so that uniroot searches over negative rates as well when looking for a solution. FYI: errors happen because a value is out of range or because there could be two solutions. Use irr4 to solve.
library(FinCal)  # provides pv.uneven()

# IRR via uniroot, searching over negative as well as positive rates
irr3 <- function(cf) {
  n <- length(cf)
  subcf <- cf[2:n]
  uniroot(function(r) -1 * pv.uneven(r, subcf) + cf[1],
          lower = -0.2, upper = 0.2, tol = 1e-6)$root
}

# Wrapper that returns NULL instead of stopping when uniroot fails
irr4 <- function(x) {
  tryCatch(irr3(x), error = function(e) NULL)
}
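A quick illustration (my own cash-flow numbers, not from the original post): a stream whose inflows never recover the initial outlay should come back with a negative rate, and any uniroot failure is swallowed and returned as NULL.
irr4(c(-1000, 300, 300, 300))  # inflows total 900 < 1000 outlay; root is about -0.05
irr4(c(-1000, 400, 400, 400))  # positive-IRR case for comparison, about 0.10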

Related

optim (method=Brent) and optimize not giving correct minimum for binomial distribution (N > 1000)

I'm using optim() (and optimize()) to try to find the quantiles of a binomial distribution; however, for N ~ 2000 (N = 2135), the functions do not give the correct value.
optim(21, function(x) abs(1 - pbinom(x, 2135, 21/2135) - 0.1),
      method = "Brent", lower = 1, upper = 2135)
optimize(function(x) abs(1 - pbinom(x, 2135, 21/2135) - 0.1), c(1,2135))
P.S.: I also tried setting the min argument equal to the probability, but I still get incorrect answers.
The problem is that optimize() assumes that small changes in the parameter will give reliable information about whether the minimum has been attained (and which direction to go if not). (I initially said that the function needed to be differentiable, which might not be true: see the Wikipedia article on Brent's method.) In other words, most of the easily available optimization algorithms can fail on an objective function that is piecewise constant, as this one is ...
IMO the accepted answer to this nearly identical question is simply wrong. (It states that "the gradient at your starting point is almost 0", whereas in fact it's exactly zero; using optimize() doesn't help, as you found out, and picking a different starting point is more or less a matter of luck ...)
I made up a smaller example to illustrate: find the 0.6 quantile of the binomial distribution with N=10, prob=0.2. R can do this directly, very easily: qbinom(0.6, size=10, prob=0.2) ! But assuming that you want to solve some other problem of a similar form, and this is just an example, or that the constraints are given by a homework problem, or ...
Slightly simplified objective function (using the squared difference rather than the absolute value):
fx <- function(x) (pbinom(x, size=10, prob=0.2) - 0.6)^2
What does this look like?
curve(fx, from = 0, to = 10, n = 501)
So the correct answer is any value between 2 and 3. In this particular case optimize(fx, interval=c(1,10)) happens to work OK (returns 2.313, you could use floor() to convert it to 2), but it will fail if I use a wider interval (optimize(fx, interval=c(1,100)) returns 99.99996), or if I did a problem with a larger size. Let me try
fx2 <- function(x) pbinom(x, size=1000, prob=0.2) - 0.6
qbinom(0.6, size=1000, prob=0.2) ## answer: 203
optimize(fx2, interval=c(1,1000)) ## 999.9999
The problem is that if the initial step of the optimization method jumps less than one unit, the algorithm will conclude that it has found the minimum.
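To see the flatness concretely (my own quick check, not part of the original answer): pbinom() only changes value at integers, so nearby trial points look identical to the optimizer.
fx2(500)
fx2(500.4)
fx2(500.9)   # all three identical, because pbinom() steps only at integer values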
One potential solution is to look for a root rather than a minimum:
fx3 <- function(x) pbinom(x, size=1000, prob=0.2) - 0.6
uniroot(fx3, interval=c(1,1000)) ## 203
I don't know a good way to solve this as an optimization problem. A stochastic global optimizer would work, but would in general be very inefficient. See here for one particular problem involving nonlinear discrete optimization in R. You can also look at the optimization task view, although I didn't find it useful ...
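For what it's worth (my own addition, not from the answer): when the parameter is a bounded integer, exhaustive evaluation is simple, exact, and sidesteps the piecewise-constant issue entirely.
xs <- 0:1000
xs[which.min((pbinom(xs, size = 1000, prob = 0.2) - 0.6)^2)]   # 203 here, matching qbinom(0.6, 1000, 0.2)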

MLE using nlminb in R - understand/debug certain errors

This is my first question here, so I will try to make it as well written as possible. Please be forbearing should I make a silly mistake.
Briefly, I am trying to do a maximum likelihood estimation where I need to estimate 5 parameters. The general form of the problem I want to solve is as follows: A weighted average of three copulas, each with one parameter to be estimated, where the weights are nonnegative and sum to 1 and also need to be estimated.
There are packages in R for doing MLE on single copulas or on a weighted average of copulas with fixed weights. However, to the best of my knowledge, no packages exist to directly solve the problem I outlined above. Therefore I am trying to code the problem myself. There is one particular type of error I am having trouble tracing to its source. Below I have tried to give a minimal reproducible example where only one parameter needs to be estimated.
library(copula)
set.seed(150)
x <- rCopula(100, claytonCopula(250))
# Copula density
clayton_density <- function(x, theta) {
  dCopula(x, claytonCopula(theta))
}
# Negative log-likelihood function
nll.clayton <- function(theta) {
  theta_trans <- -1 + exp(theta) # admissible theta values for Clayton copula
  nll <- -sum(log(clayton_density(x, theta_trans)))
  return(nll)
}
# Initial guess for optimization
guess <- function(x) {
  init <- rep(NA, 1)
  tau.n <- cor(x[, 1], x[, 2], method = "kendall")
  # Guess using method of moments
  itau <- iTau(claytonCopula(), tau = tau.n)
  # In case itau is negative, we need a conditional statement
  # Use log because it is (almost) the inverse of the theta transformation above
  if (itau <= 0) {
    init[1] <- log(0.1) # Ensures positive initial guess
  } else {
    init[1] <- log(itau)
  }
  init # return the initial guess explicitly
}
estimate <- nlminb(guess(x), nll.clayton)
(parameter <- -1 + exp(estimate$par)) # Retrieve estimated parameter
fitCopula(claytonCopula(), x) # Compare with fitCopula function
This works great when simulating data with small values of the copula parameter, and gives almost exactly the same answer as fitCopula() every time.
For large values of the copula parameter, such as 250, the following error shows up when I run the line with nlminb():"Error in .local(u, copula, log, ...) : parameter is NA
Called from: .local(u, copula, log, ...)
Error during wrapup: unimplemented type (29) in 'eval'"
When I run fitCopula(), the optimization is finished, but this message pops up: "Warning message:
In dlogcdtheta(copula, u) :
dlogcdtheta() returned NaN in column(s) 1 for this explicit copula; falling back to numeric derivative for those columns"
I have been able to find out using debug() that somewhere in the optimization process of nlminb, the parameter of interest is assigned the value NaN, which then yields this error when dCopula() is called. However, I do not know at which iteration it happens, and what nlminb() is doing when it happens. I suspect that perhaps at some iteration, the objective function is evaluated at Inf/-Inf, but I do not know what nlminb() does next. Also, something similar seems to happen with fitCopula(), but the optimization is still carried out to the end, only with the abovementioned warning.
I would really appreciate any help in understanding what is going on, how I might debug it myself and/or how I can deal with the problem. As might be evident from the question, I do not have a strong background in coding. Thank you so much in advance to anyone that takes the time to consider this problem.
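One way to see where the parameter goes wrong (my own suggestion, not part of the original post) is to print every trial value that nlminb() passes to the objective, so the last line printed before the error shows what the optimizer was doing:
# illustrative tracing wrapper around the nll.clayton defined above
nll.traced <- function(theta) {
  cat("trial theta =", theta, " transformed =", -1 + exp(theta), "\n")
  nll.clayton(theta)
}
nlminb(guess(x), nll.traced)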
Update:
When I run dCopula(x, claytonCopula(-1+exp(guess(x)))) or, equivalently, clayton_density(x, -1+exp(guess(x))), it becomes apparent that the density evaluates to 0 at several data points. Unfortunately, creating pseudo-observations with x <- pobs(x) does not solve the problem, which can be seen by repeating dCopula(x, claytonCopula(-1+exp(guess(x)))). The result is that when applying the logarithm, we get several -Inf evaluations, which of course implies that the whole negative log-likelihood function evaluates to Inf, as can be seen by running nll.clayton(guess(x)). Hence, in addition to the above queries, any tips on handling log(0) when doing MLE numerically are welcome and appreciated.
Second update
Editing the second line in nll.clayton as follows seems to work okay:
nll <- -sum(log(clayton_density(x, theta_trans) + 1e-8))
However, I do not know if this is a "good" way to circumvent the problem, in the sense that it does not introduce potential for large errors (though it would surprise me if it did).
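Not part of the original post: another common way to handle the log(0) / Inf problem, instead of adding a small constant to the density, is to hand nlminb() a large finite penalty whenever the likelihood cannot be evaluated or is not finite, so the trial point is simply rejected. A minimal sketch, reusing the nll.clayton defined above (the 1e10 penalty is arbitrary and can distort behaviour near the boundary):
nll.clayton.safe <- function(theta) {
  nll <- tryCatch(nll.clayton(theta), error = function(e) Inf)
  if (!is.finite(nll)) nll <- 1e10  # large finite penalty: treat the trial point as bad
  nll
}
estimate <- nlminb(guess(x), nll.clayton.safe)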

Simplify the division of Normals cumulatives functions

I'm struggling with how to simplify the quotient of two normal probability functions in R. Actually, I'm calculating a conditional skew-normal density, so I have the division between these two functions:
pnorm(alpha0+t(alpha2)%*%chol2inv(chol(omega2))%*%t(y2-xi2.1))/pnorm(tau2.1)
where alpha0+t(alpha2)%*%chol2inv(chol(omega2))%*%t(y2-xi2.1) and tau2.1 evaluate to real numbers. For example, sometimes I have pnorm(-50)/pnorm(-40), i.e. the indeterminate form 0/0. But these values are not really zero; R is just rounding them to zero (underflow). I tried to use the erf function, but I got the same problem (0/0).
Any hint on how can I overcome this issue?
pnorm has a log.p parameter, which makes it return log(p). Change your expression to exp(log(p1) - log(p2)):
exp(pnorm(-50, log.p = TRUE) - pnorm(-40, log.p = TRUE))
#[1] 2.95577e-196
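Applied to the expression in the question (a sketch only; alpha0, alpha2, omega2, y2, xi2.1 and tau2.1 are the objects from the question, not defined here), the same trick reads:
num <- alpha0 + t(alpha2) %*% chol2inv(chol(omega2)) %*% t(y2 - xi2.1)
exp(pnorm(num, log.p = TRUE) - pnorm(tau2.1, log.p = TRUE))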

How can I resolve an exponential function for x in R?

I want to analyse a logarithmic growth curve in more detail. In particular, I would like to know the time point when the slope becomes >0 (which is the starting point of growth after a lag phase).
Therefore I fitted a logarithmic growth function to my growth data with the grofit package in R. I got values for the three parameters (lambda, mu, maximal asymptote).
Now I thought I could use the first derivative of the logarithmic growth function, set mu=0 (the slope at any time point during growth), and this way solve the equation for the time (x). I'm not sure if this is possible, since mu=0 will hold over a longer timespan at the beginning of the curve (so there is no unique time point). But maybe I could approximate that point by setting mu=0.01, which should be more specific.
Anyway I used the Deriv package to find the first derivative of my logarithmic function:
Deriv(a/(1+exp(((4*b)/a)*(c-x)+2)), "x")
where a = asymptote, b = maximal slope, c = lambda.
As a result I got:
{
    .e2 <- exp(2 + 4 * (b * (c - x)/a))
    4 * (.e2 * b/(.e2 + 1)^2)
}
Or in normal writing:
f'(x) = (4*b*exp(2 + (4*b*(c-x))/a)) / ((exp(2 + (4*b*(c-x))/a) + 1)^2)
Now I would like to solve this function for x with f'(x)=0.01. Can anyone tell me how best to do it?
Also, do you have comments on my way of thinking or the R functions I used?
Thank you.
Anne
Using a root solving function is more appropriate than using an optimization function.
I'll give an example with two packages, using the objective function fn as defined in the optim answer further below.
It would also be a good idea to plot the function for a range of values. Like this:
curve(fn, -.1, .1)
You can see that using the base R function uniroot will present problems since it needs function values at the endpoints of the interval to be of opposite sign.
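Not part of the original answer: a quick sketch showing that if you drop the abs() and solve f'(x) - 0.01 = 0 directly, the sign change comes back and uniroot() works as well (a, b, c = 1 as in the optim answer below; fn0 is my own illustrative name):
fn0 <- function(x) {
  a <- 1; b <- 1; c <- 1
  (4 * b * exp(2 + (4 * b * (c - x)) / a)) / ((exp(2 + (4 * b * (c - x)) / a) + 1)^2) - 0.01
}
uniroot(fn0, c(-0.1, 0.1))$root  # about 0.00339, consistent with the results below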
Using package nleqslv like this
library(nleqslv)
nleqslv(1,fn)
gives
$x
[1] 0.003388598
$fvec
[1] 8.293101e-10
$termcd
[1] 1
$message
[1] "Function criterion near zero"
<more info> ......
Using function fsolve from package pracma
library(pracma)
fsolve(fn,1)
gives
$x
[1] 0.003388585
$fval
[1] 3.136539e-10
The solutions given by both packages are very close to each other.
This might not be the best approach, but you can use the optim function to find the solution. Check the code below; I am basically trying to find the value of x which minimizes abs(f(x) - 0.01).
The starting seed value for x may be important; the optim function might not converge for some seeds.
fn <- function(x) {
  a <- 1
  b <- 1
  c <- 1
  return(abs((4 * exp(2 + (4 * b * (c - x)) / a) * b) / ((exp(2 + (4 * b * (c - x)) / a) + 1)^2) - 0.01))
}
x <- optim(10,fn)
x$par
Thank you very much for your efforts. Unfortunately, none of the above solutions worked for me :-(
I figured the problem out the old-fashioned way (pencil + paper + mathematics book).
Have a good day
Anne

numerical integration of a tricky function

The prob package numerically evaluates characteristic functions for base R distributions. For almost all distributions there are existing formulas. For a few cases, though, no closed-form solution is known. Case in point: the Weibull distribution (but see below).
For the Weibull characteristic function I essentially compute two integrals and put them together:
fr <- function(x) cos(t * x) * dweibull(x, shape, scale)
fi <- function(x) sin(t * x) * dweibull(x, shape, scale)
Rp <- integrate(fr, lower = 0, upper = Inf)$value
Ip <- integrate(fi, lower = 0, upper = Inf)$value
Rp + (0+1i) * Ip
Yes, it's clumsy, but it works surprisingly well! ...ahem, most of the time. A user reported recently that the following breaks:
cfweibull(56, shape = 0.5, scale = 1)
Error in integrate(fr, lower = 0, upper = Inf) :
the integral is probably divergent
Now, we know that the integral isn't divergent, so it must be a numerical problem. With some fiddling I could get the following to work:
fr <- function(x) cos(56 * x) * dweibull(x, 0.5, 1)
integrate(fr, lower = 0.00001, upper = Inf, subdivisions=1e7)$value
[1] 0.08024055
That's OK, but it isn't quite right, plus it takes a fair bit of fiddling which doesn't scale well. I've been investigating this for a better solution. I found a recently published "closed-form" for the characteristic function with shape > 1 (see here), but it involves Wright's generalized confluent hypergeometric function which isn't implemented in R (yet). I looked into the archives for integrate alternatives, and there's a ton of stuff out there which doesn't seem very well organized.
As part of that searching it occurred to me to translate the region of integration to a finite interval via the inverse tangent, and voila! Check it out:
cfweibull3 <- function(t, shape, scale = 1) {
  if (shape <= 0 || scale <= 0)
    stop("shape and scale must be positive")
  fr <- function(x) cos(t * tan(x)) * dweibull(tan(x), shape, scale)/(cos(x))^2
  fi <- function(x) sin(t * tan(x)) * dweibull(tan(x), shape, scale)/(cos(x))^2
  Rp <- integrate(fr, lower = 0, upper = pi/2, stop.on.error = FALSE)$value
  Ip <- integrate(fi, lower = 0, upper = pi/2, stop.on.error = FALSE)$value
  Rp + (0+1i) * Ip
}
> cfweibull3(56, shape=0.5, scale = 1)
[1] 0.08297194+0.07528834i
Questions:
Can you do better than this?
Can people who are expert in numerical integration routines shed some light on what's happening here? I have a sneaking suspicion that for large t the cosine fluctuates rapidly, which causes problems...?
Are there existing R routines/packages which are better suited for this type of problem, and could somebody point me to a well-placed position (on the mountain) to start the climb?
Comments:
Yes, it is bad practice to use t as a function argument.
I calculated the exact answer for shape > 1 using the published result with Maple, and the brute-force-integrate-by-the-definition-with-R kicked Maple's ass. That is, I get the same answer (up to numerical precision) in a small fraction of a second and an even smaller fraction of the price.
Edit:
I was going to write down the exact integrals I'm looking for but it seems this particular site doesn't support MathJAX so I'll give links instead. I'm looking to numerically evaluate the characteristic function of the Weibull distribution for reasonable inputs t (whatever that means). The value is a complex number but we can split it into its real and imaginary parts and that's what I was calling Rp and Ip above.
One final comment: Wikipedia has a formula listed (an infinite series) for the Weibull c.f. and that formula matches the one proved in the paper I referenced above, however, that series has only been proved to hold for shape > 1. The case 0 < shape < 1 is still an open problem; see the paper for details.
You may be interested in this paper, which discusses different integration methods for highly oscillatory integrals -- which is essentially what you are trying to compute:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.6944
Another suggestion: instead of an infinite upper limit you may want to specify a finite one, because if you specify the precision you want, then based on the cdf of the Weibull you can easily estimate how much of the tail you can safely truncate. And with a fixed limit you can specify the number of subdivisions exactly (or almost), e.g. so as to have a few (4-8) points per period.
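As a sketch of that suggestion (my own numbers and helper names, not the answerer's code): pick the truncation point from the Weibull quantile function and size the subdivisions to the oscillation period of cos(t * x). Whether integrate() then converges cleanly still depends on its adaptive error control, and for shape < 1 the small positive lower bound used in the question may still be needed near 0.
t <- 56; shape <- 0.5; scale <- 1
upper  <- qweibull(1 - 1e-10, shape, scale)   # all but ~1e-10 of the mass lies below this
period <- 2 * pi / t                          # one full oscillation of cos(t * x)
nsub   <- ceiling(8 * upper / period)         # roughly 8 subintervals per oscillation
fr <- function(x) cos(t * x) * dweibull(x, shape, scale)
integrate(fr, lower = 1e-5, upper = upper, subdivisions = nsub)$value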
I had the same problem as Jay - not with the Weibull distribution, but with the integrate function. I found my answer to Jay's question 3 in a comment on this question:
Divergent Integral in R is solvable in Wolfram
The R package pracma contains several functions for solving integrals numerically. In the package, one finds some R functions for integrating certain mathematical functions. And there is a more general function integral. That helped in my case. Example code is given below.
To question 2: The first answer to the linked question (above) states that R does not print the complete error message from the underlying C source (the function may just converge too slowly). Therefore, I would agree with Jay that the fast fluctuation of the cosine may be a problem. In my case, and in the example below, it was the problem.
Example Code
# load Practical Numerical Math Functions package
library(pracma)
# define function
testfun <- function(r) cos(r*10^6)*exp(-r)
# Integrate it numerically with the basic 'integrate'.
out1 = integrate(testfun, 0, 100)
# "Error in integrate(testfun, 0, 100) : the integral is probably divergent"
# Integrate it numerically with 'integral' from 'pracma' package
# using 'Gauss-Kronrod' method and 10^-8 as relative tolerance. I
# did not try the other ones.
out2 = integral(testfun, 0, 100, method = 'Kronrod', reltol = 1e-8)
Two remarks
The integral function does not break as the integrate function does but it may take quite a long time to run. I do not know (and I did not try) whether the user can limit the number of iterations (?).
Even if the integral function finishes without errors, I am not sure how correct the result is. Numerically integrating a function which fluctuates rapidly around zero seems to be quite tricky, since one does not know exactly where on the fluctuating function the values are evaluated (twice as many positive as negative values; positive values close to local maxima and negative values far off). I am not an expert on numerical integration; I only learned some basic fixed-step integration methods in my numerics lectures. So maybe the adaptive methods used in integral deal with this problem in some way.
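One way to check the result for this particular test function (my own note, not the answerer's): the integral of cos(a*r)*exp(-r) from 0 to infinity is 1/(1+a^2), and the contribution beyond r = 100 is negligible, so there is an essentially exact reference value to compare against.
a <- 10^6
1 / (1 + a^2)   # reference value, about 1e-12; compare with out2 above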
I'm attempting to answer questions 1 & 3. That being said, I am not contributing any original code; I did a Google search and hopefully this is helpful. Good luck!
Source: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf (p. 6)
#Script
library(ggplot2)
## sampling from a Weibull distribution with parameters shape=2.1 and scale=1.1
x.wei <- rweibull(n=200, shape=2.1, scale=1.1)
## theoretical quantiles from a Weibull population with known parameters shape=2 and scale=1
x.teo <- rweibull(n=200, shape=2, scale=1)
#Figure
qqplot(x.teo, x.wei, main="QQ-plot distr. Weibull") ## QQ-plot
abline(0,1) ## a 45-degree reference line is plotted
Is this of any use?
http://www.sciencedirect.com/science/article/pii/S0378383907000452
Muraleedharan et al. (2007) Modified Weibull distribution for maximum and significant wave height simulation and prediction, Coastal Engineering, Volume 54, Issue 8, August 2007, Pages 630–638
From the abstract: "The characteristic function of the Weibull distribution is derived."
