I want to build a function that takes E[x] and Var[x] and gives me the mean and standard error of a univariate lognormal variable.
E[x] = exp(mu + theta)
Var[x] = exp(2*mu + theta)*(exp(theta) - 1)
The function would take E[x] and Var[x] as input and return theta and mu as output.
There are several packages that provide ways and means to solve a system of nonlinear equations. One of these is nleqslv.
You need to provide a function that returns the differences between the actual values of the equations and the desired values.
Load package nleqslv and define the following function
library(nleqslv)
f <- function(x,Ex,Varx) {
    y <- numeric(length(x))
    mu <- x[1]
    theta <- x[2]
    y[1] <- exp(mu+theta) - Ex
    y[2] <- exp(2*mu+theta)*(exp(theta)-1) - Varx
    y
}
The vector x in the function contains the values of mu and theta.
An example with Ex=2 and Varx=3 and some arbitrary starting values:
xstart <- c(1,1)
nleqslv(xstart,f,Ex=2,Varx=3)
gives the following
$x
[1] -0.6931472 1.3862944
$fvec
[1] -8.095125e-11 -8.111645e-11
$termcd
[1] 1
$message
[1] "Function criterion near zero"
$scalex
[1] 1 1
$nfcnt
[1] 31
$njcnt
[1] 2
$iter
[1] 22
See the manual of nleqslv for the meaning of the different elements of the return value of nleqslv.
If you want to investigate the effect of the different solving methods try this
testnslv(xstart,f,Ex=2,Varx=3)
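For the two equations exactly as written in the question, the system can also be solved by hand, which gives a way to cross-check the numerical result. Dividing the second equation by the square of the first gives Var[x]/E[x]^2 = 1 - exp(-theta), so theta and then mu follow directly. The sketch below assumes Var[x] < E[x]^2 (otherwise these particular equations have no real solution); the function name lognormal_params is made up for illustration.

```r
# Closed-form solution of
#   Ex   = exp(mu + theta)
#   Varx = exp(2*mu + theta) * (exp(theta) - 1)
# (hypothetical helper, not part of nleqslv)
lognormal_params <- function(Ex, Varx) {
  stopifnot(Ex > 0, Varx > 0, Varx < Ex^2)  # assumption needed for a real solution
  theta <- -log(1 - Varx / Ex^2)            # from Varx / Ex^2 = 1 - exp(-theta)
  mu <- log(Ex) - theta                     # from log(Ex) = mu + theta
  c(mu = mu, theta = theta)
}

sol <- lognormal_params(Ex = 2, Varx = 3)
print(sol)
```

For Ex=2 and Varx=3 this gives mu = -log(2) and theta = log(4), which agrees with the $x reported by nleqslv.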
I need to demonstrate that the probability density function is the derivative of the CDF. Any distribution will do, but I have been trying with the normal. I have got as far as:
set.seed(53)
b <- rnorm(500)
db <- density(b)
plot(db)
Then I can calculate the cumulative probabilities using pnorm(b), but then I don't know how to differentiate, because D() requires an expression rather than pnorm(). Could anyone help, please?
The Fundamental Theorem of Calculus says that if the integral of a function f(x) is g(x), then f(x) is the derivative of g(x) (or words to that effect). Here's the console scrape where I demonstrated the near equality (to 5 or 7 decimal places) between the integral of dnorm from -Inf to selected values of "x" and pnorm at those values:
> sapply(c(0,Inf), function(x) integrate(dnorm, lower=-Inf, upper=x))
[,1] [,2]
value 0.5 1
abs.error 4.680562e-05 9.361124e-05
subdivisions 3 3
message "OK" "OK"
call expression expression
> sapply(c(0,Inf), function(x) integrate(dnorm, lower=-Inf, upper=x)$value)
[1] 0.5 1.0
> sapply(seq(-3,3, by=0.5), function(x) integrate(dnorm, lower=-Inf, upper=x)$value)
[1] 0.001349899 0.006209665 0.022750132 0.066807201 0.158655254 0.308537539
[7] 0.500000000 0.691462461 0.841344751 0.933192799 0.977249868 0.993790335
[13] 0.998650102
> pnorm(seq(-3,3, by=0.5))
[1] 0.001349898 0.006209665 0.022750132 0.066807201 0.158655254 0.308537539
[7] 0.500000000 0.691462461 0.841344746 0.933192799 0.977249868 0.993790335
[13] 0.998650102
I wasn't sure that D() was "smart" enough to do the symbolic differentiation, but I shouldn't have been so skeptical. This bit of console interaction was done following the examples on the ?deriv help page:
> D(quote(pnorm(x)), "x")
dnorm(x)
Also ... here's something you can get with deriv:
> norm.expr <- expression(pnorm(x))
> deriv(norm.expr, "x")
expression({
.value <- pnorm(x)
.grad <- array(0, c(length(.value), 1L), list(NULL, c("x")))
.grad[, "x"] <- dnorm(x)
attr(.value, "gradient") <- .grad
.value
})
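The same relationship can also be checked numerically, going in the direction the question asks about (differentiating the CDF rather than integrating the density). The sketch below uses a central finite difference on pnorm; the step size h and the grid x are arbitrary choices made for illustration.

```r
# Numerical derivative of the normal CDF, compared with the density
x <- seq(-3, 3, by = 0.5)
h <- 1e-5                                         # illustrative step size
num_deriv <- (pnorm(x + h) - pnorm(x - h)) / (2 * h)

max_abs_diff <- max(abs(num_deriv - dnorm(x)))    # should be very small
print(max_abs_diff)
print(all.equal(num_deriv, dnorm(x), tolerance = 1e-6))
```

With the simulated data from the question, the analogous comparison would be between a smoothed derivative of ecdf(b) and density(b), which agree only approximately because both are estimates.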
The code below estimates pi in R. Now I am trying to find the minimum number of terms N_Min you would have to include in your estimate of pi to make it accurate to three decimal places.
pi_Est<- function(NTerms){
NTerms = 5 # start with an estimate of just five terms
pi_Est = 0 # initialise the value of pi to zero
Sum_i = NA # initialise the summation variable to null
for(ii in 1:NTerms)
{
Sum_i[ii] = (-1)^(ii+1)/(2*ii - 1) # this is the series equation for calculating pi
}
Sum_i = 4*Sum_i # multiply by four as required in the formula (see lecture notes)
pi_Est = sum(Sum_i)
cat('\nThe estimate of pi with terms = ', NTerms ,' is ',pi_Est)
}
First of all, I would change some things about your function. Instead of getting it to print out a message, get it to return a value. Otherwise it becomes very difficult to do anything with its output, including testing it for convergence to pi.
Also, no matter what the value of NTerms is you feed this function, you are immediately over-writing NTerms inside the function.
You could rewrite the function like this:
pi_Est <- function(NTerms) {
  Sum_i <- numeric(NTerms)                      # one slot per term
  for (ii in seq_len(NTerms)) {
    Sum_i[ii] <- (-1)^(ii + 1) / (2 * ii - 1)   # ii-th term of the series for pi/4
  }
  sum(4 * Sum_i)                                # return the estimate instead of printing it
}
And to show it converges to pi, let's test it with 50,000 terms:
pi_Est(50000)
#> [1] 3.141573
Now, if we want to find the first value of NTerms that is correct to 3 decimal places, we are going to need to be able to call this function on a vector of NTerms - at the moment it is only working on a single number. So let's define the function f that vectorizes pi_Est:
f <- Vectorize(pi_Est)
Now, let's create the estimate for all values of NTerms between 1 and 2,000 and store them in a vector:
estimates <- f(1:2000)
We can see that the values of estimates seem to oscillate round and converge to pi if we plot the first 100 values:
plot(estimates[1:100], type = 'l')
abline(h = pi)
Our answer is just the first value which, when rounded to three decimal places, is the same as pi rounded to three decimal places:
result <- which(round(estimates, 3) == round(pi, 3))[1]
result
#> [1] 1103
And we can check this is correct by feeding 1103 into our original function:
pi_Est(result)
#> [1] 3.142499
You will see that this gives us 3.142, which is the same as pi rounded to 3 decimal places.
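As a cross-check, the vector of partial sums can also be built in one step with cumsum, without calling the function once per value of NTerms. This is only a sketch of an alternative route to the same numbers; the names n, estimates_cs and first_hit are made up here.

```r
# Partial sums of 4 * sum((-1)^(k+1) / (2k - 1)) for k = 1..2000
n <- 1:2000
estimates_cs <- cumsum(4 * (-1)^(n + 1) / (2 * n - 1))

# First number of terms whose estimate rounds to the same 3 decimals as pi
first_hit <- which(round(estimates_cs, 3) == round(pi, 3))[1]
print(first_hit)
```

This returns 1103, matching the result above.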
Created on 2022-01-31 by the reprex package (v2.0.1)
1000 terms are required to make the estimate accurate to within 0.001:
pi_Est1 <- function(n) {
if (n == 0) return(0)
neg <- 1/seq(3, 2*n + 1, 4)
if (n%%2) neg[length(neg)] <- 0
4*sum(1/seq(1, 2*n, 4) - neg)
}
pi_Est2 <- function(tol) {
for (i in ceiling(1/tol + 0.5):0) {
est <- pi_Est1(i)
if (abs(est - pi) > tol) break
est1 <- est
}
list(NTerms = i + 1, Estimate = est1)
}
tol <- 1e-3
pi_Est2(tol)
#> $NTerms
#> [1] 1000
#>
#> $Estimate
#> [1] 3.140593
tol - abs(pi - pi_Est2(tol)$Estimate)
#> [1] 2.500001e-10
tol - abs(pi - pi_Est1(pi_Est2(tol)$NTerms - 1))
#> [1] -1.00075e-06
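The 1000 found here and the 1103 found by the rounding approach answer two different questions: "absolute error below 0.001" versus "same value after rounding to three decimals". A small sketch of the difference, using a hypothetical helper leibniz_partial that just sums the first n terms:

```r
# Hypothetical helper: partial sum of the series with n terms
leibniz_partial <- function(n) {
  k <- seq_len(n)
  4 * sum((-1)^(k + 1) / (2 * k - 1))
}

est <- leibniz_partial(1000)
within_tol  <- abs(est - pi) < 1e-3             # criterion used in this answer
same_round3 <- round(est, 3) == round(pi, 3)    # criterion used with round()
print(c(within_tol = within_tol, same_round3 = same_round3))
```

With 1000 terms the estimate is within 0.001 of pi, but it rounds to 3.141 rather than 3.142, so which N_Min is "right" depends on how "accurate to three decimal places" is defined.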
Perhaps we can try the code below
pi_Est <- function(digits = 3) {
s <- 0
ii <- 1
repeat {
s <- s + 4 * (-1)^(ii + 1) / (2 * ii - 1)
if (round(s, digits) == round(pi, digits)) break
ii <- ii + 1
}
list(est = s, iter = ii)
}
and you will see
> pi_Est()
$est
[1] 3.142499
$iter
[1] 1103
> pi_Est(5)
$est
[1] 3.141585
$iter
[1] 130658
Why not use a single line of code for the calculation?
Pi <- tail(cumsum(4*(1/seq(1,4*50000000,2))*rep(c(1,-1), 50000000)),1)
I need to compute a ratio of integrals, where the function q_alpha(z) is the quantile function of a standard normal distribution.
I have a question regarding the denominator. Since the standard normal distribution is homoscedastic, symmetric, continuous, etc., is the integration of the denominator term simple? Do I just need to square each quantile of this function and proceed with the calculation?
This is my code in R:
library(Bolstad)
thau=1:99/100
z.standard.quantile=qnorm(thau,0,1)
z.standard.quantile.square=qnorm(thau,0,1)^2
sintegral(thau[1:50],z.standard.quantile[1:50])$value/sintegral(thau[1:50], z.standard.quantile.square[1:50])$value
The result is: -0.8676396
There is no problem in taking the square of qnorm. Note, however, that qnorm is unbounded on [0, 0.5] (qnorm(0) is -Inf), so both integrals are improper and their behaviour near 0 has to be checked.
My second thought is that there is actually no need to use Bolstad::sintegral (Simpson's rule); the R base function integrate is sufficient. Or, we can discretize qnorm and use Trapezoidal rule because qnorm is a smooth function which can be well approximated by linear interpolation.
I will write a function evaluating the ratio of integral in your question, but lower bounded on l:
## using `integrate`
f1 <- function (l) {
a <- integrate(qnorm, lower = l, upper = 0.5)$value
b <- integrate(function (x) qnorm(x) ^ 2, lower = l, upper = 0.5)$value
a / b
}
## using Trapezoidal rule, with `n` division on interval `[l, 0.5]`
f2 <- function (l, n) {
x <- seq(l, 0.5, length = n)
delta <- x[2] - x[1]
y1 <- qnorm(x)
y2 <- y1 ^ 2
a <- sum(y1[-1] + y1[-n]) / 2 * delta
b <- sum(y2[-1] + y2[-n]) / 2 * delta
a / b
}
Those two functions return very similar results, as we can test:
f1 (0.1)
# [1] -1.276167
f2 (0.1, 1000)
# [1] -1.276166
Now, the only thing of interest is the limiting behaviour when l -> 0 (in a numerical sense). Let's try
l <- 10 ^ (- (1:16))
# [1] 1e-01 1e-02 1e-03 1e-04 1e-05 1e-06 1e-07 1e-08 1e-09 1e-10 1e-11 1e-12
# [13] 1e-13 1e-14 1e-15 1e-16
y1 <- sapply(l, f1)
# [1] -1.2761674 -0.8698411 -0.8096179 -0.7996069 -0.7981338 -0.7979341
# [7] -0.7978877 -0.7978848 -0.7978846 -0.7978846 -0.7978846 -0.7978846
# [13] -0.7978846 -0.7978846 -0.7978846 -0.7978846
## quite a dense grid; takes some time to compute
y2 <- sapply(l, f2, n = 1e+6)
# [1] -1.2761674 -0.8698411 -0.8096179 -0.7996071 -0.7981158 -0.7979137
# [7] -0.7978877 -0.7978834 -0.7978816 -0.7978799 -0.7978783 -0.7978767
# [13] -0.7978750 -0.7978734 -0.7978717 -0.7978700
Now, it looks like the ratio converges to about -0.7978 as l -> 0.
Note, the -0.8676396 you got is actually about f1(0.01) or f2(0.01, 1e+6).
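The limit can be cross-checked with a change of variables. Substituting z = qnorm(p) turns the integrals over p in (0, 0.5) into integrals over z in (-Inf, 0) weighted by dnorm(z), which are well behaved. The numerator becomes -dnorm(0) and the denominator 1/2, so the ratio should be -sqrt(2/pi). A sketch of that check using integrate; the names num, den and ratio are illustrative only:

```r
# After substituting z = qnorm(p):
#   integral of qnorm(p)   over (0, 0.5) = integral of z   * dnorm(z) over (-Inf, 0)
#   integral of qnorm(p)^2 over (0, 0.5) = integral of z^2 * dnorm(z) over (-Inf, 0)
num <- integrate(function(z) z * dnorm(z),   lower = -Inf, upper = 0)$value
den <- integrate(function(z) z^2 * dnorm(z), lower = -Inf, upper = 0)$value
ratio <- num / den

print(c(numerator = num, denominator = den,
        ratio = ratio, closed_form = -sqrt(2 / pi)))
```

The closed form -sqrt(2/pi) is approximately -0.7978846, which matches the values that f1 settles on for small l.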
I am writing my Master's final project, in which I am deriving the probability of default using the Black-Scholes-Merton model. I have got stuck with the R code. Mathematically, I want to solve this system of nonlinear equations with the package nleqslv:
library(nleqslv)
T <- 1
D1 <- 20010.75
R <- 0.8516
sigmaS <- .11
SO1 <- 1311.74
fnewton <- function(x){
y <- numeric(2)
d1 <- (log(x[1]/D1)+(R+x[2]^2/2)*T)/(x[2]*sqrt(T)) # divide by sigma*sqrt(T); same value as before when T = 1
d2 <- d1 - x[2]*sqrt(T)
y[1] <- SO1 - (x[1]*pnorm(d1) - exp(-R*T)*D1*pnorm(d2))
y[2] <- sigmaS*SO1 - pnorm(d1)*x[2]*x[1]
y
}
xstart <- c(1311.74,0.11)
nleqslv(xstart, fnewton, method="Broyden")
# $x
# [1] 1311.74 0.11
# $fvec
# [1] 1311.7400 144.2914
# $termcd
# [1] 6
# $message
# [1] "Jacobian is singular (see allowSingular option)"
# $scalex
# [1] 1 1
# $nfcnt
# [1] 0
# $njcnt
# [1] 1
# $iter
# [1] 1
I have tried this with many values of the 5 inputs (stated above, which I have computed for 2 companies for different years), but I am not getting the final values of S0 and sigma V.
I get the message "Jacobian is singular (see allowSingular option)". If I allow a singular Jacobian using "control=list(trace=1,allowSingular=TRUE)", no answer is displayed either. I do not know how to obtain the solution for these 2 variables.
I really don't know what I am doing wrong, as I based my model on Tetereva's slides (her model code is on slide no. 5); her presentation is the first result when googling
https://www.google.de/search?q=moodys+KMV+in+R&rlz=1C1SVED_enDE401DE401&aq=f&oq=moodys+KMV+in+R&aqs=chrome.0.57.13309j0&sourceid=chrome&ie=UTF-8#q=distance+to+default+in+R
Like me, but more successfully, she calculates the Distance to Default risk measure via the Black-Scholes-Merton approach. In this model, the value of equity (usually represented by the market capitalization, SO1) can be written as a European call option.
The other variables are:
x[1]: the variable I want to derive, value of total assets
x[2]: the variable I want to derive, volatility of total assets
D1: the book value of debt (19982009)
R: a riskfree interest rate
T: is set to 1 year (time)
sigmaS: estimated (historical) equity volatility
You should be able to use the initial values of SO1 and sigmaS as starting values for nleqslv.
First of all the R code given by Tetereva doesn't seem quite correct (the variable Z should be D1 as you have named it; similar changes for her S0 and D).
I have modified Tetereva's into this:
library(nleqslv)
T <- 1
D1 <- 33404048
R <- 2.32
sigmaS <- .02396919
SO1 <- 4740291 # Ve?
fnewton <- function(x){
y <- numeric(2)
d1 <- (log(x[1]/D1)+(R+x[2]^2/2)*T)/(x[2]*sqrt(T)) # divide by sigma*sqrt(T); same value as before when T = 1
d2 <- d1 - x[2]*sqrt(T)
y[1] <- SO1 - (x[1]*pnorm(d1) - exp(-R*T)*D1*pnorm(d2))
y[2] <- sigmaS*SO1 - pnorm(d1)*x[2]*x[1]
y
}
xstart <- c(SO1,sigmaS)
nleqslv(xstart, fnewton, method="Broyden",control=list(trace=1))
nleqslv(xstart, fnewton, method="Newton",control=list(trace=1))
which will give the solution given by Tetereva. (I use trace=1 here just to check the iteration steps.)
I believe the value you give for R should be 8.516 and not 0.8516. Using your values for the other parameters
T <- 1
D1 <- 20010.75
R <- 8.516 # modified
sigmaS <- .11
SO1 <- 1311.74
like this
xstart <- c(1311.74,0.11)
nleqslv(xstart, fnewton, method="Broyden")
nleqslv(xstart, fnewton, method="Newton")
Then running nleqslv with these values converges very quickly.
If one uses R <- 2.32 (like Tetereva) nleqslv will also converge albeit with more iterations.
I cannot help you with what R should actually be but from Tetereva's presentation I assume R is in percentages. Since I don't have enough knowledge on the Black-Scholes model I can't be of any help for finding out what the correct values are for the various parameters. It's up to you.
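One plausible reading of the "Jacobian is singular" message with R = 0.8516 is that, at the starting values, d1 is so negative that pnorm(d1) and pnorm(d2) are numerically zero, so small changes in x do not change the residuals at all. The sketch below checks this in base R only (nleqslv is not needed). The function name merton_residuals and its argument names are made up for illustration; the equations are the ones from fnewton with the parameters passed as arguments instead of globals.

```r
# Residuals of the two equations, parameters passed explicitly (hypothetical helper)
merton_residuals <- function(x, S, sigmaS, D, r, Tm) {
  d1 <- (log(x[1] / D) + (r + x[2]^2 / 2) * Tm) / (x[2] * sqrt(Tm))
  d2 <- d1 - x[2] * sqrt(Tm)
  c(S - (x[1] * pnorm(d1) - exp(-r * Tm) * D * pnorm(d2)),
    sigmaS * S - pnorm(d1) * x[2] * x[1])
}

xstart <- c(1311.74, 0.11)
pars <- list(S = 1311.74, sigmaS = 0.11, D = 20010.75, r = 0.8516, Tm = 1)

res0 <- do.call(merton_residuals, c(list(x = xstart), pars))
res1 <- do.call(merton_residuals, c(list(x = xstart * 1.01), pars))  # perturb both unknowns by 1%

d1_start <- (log(xstart[1] / pars$D) + (pars$r + xstart[2]^2 / 2)) / xstart[2]
print(res0)               # same values as $fvec in the failed run
print(pnorm(d1_start))    # numerically zero
print(all(res0 == res1))  # residuals do not react to the perturbation
```

If the residuals do not move when x moves, a finite-difference Jacobian is all zeros, which is consistent with the solver stopping at iteration 1. The extra arguments could be passed through nleqslv in the same way as Ex and Varx are passed in nleqslv(xstart,f,Ex=2,Varx=3).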
I am trying to use constrOptim (http://rss.acs.unt.edu/Rdoc/library/stats/html/constrOptim.html) in R to do optimization with some given linear constraints, but I am not able to figure out how to set up the problem.
For example, I need to maximize $f(x,y) = log(x) + \frac{x^2}{y^2}$ subject to constraints $g_1(x,y) = x+y < 1$, $g_2(x,y) = x > 0$ and $g_3(x,y) = y > 0$. How do I do this in R? This is just a hypothetical example. Do not worry about its structure, instead I am interested to know how to set this up in R.
thanks!
Setting up the function was trivial:
fr <- function(x) { x1 <- x[1]
x2 <- x[2]
-(log(x1) + x1^2/x2^2) # need negative since constrOptim is a minimization routine
}
Setting up the constraint matrix was problematic due to a lack of much documentation, and I resorted to experimentation. The help page says "The feasible region is defined by ui %*% theta - ci >= 0". So I tested and this seemed to "work":
> rbind(c(-1,-1),c(1,0), c(0,1) ) %*% c(0.99,0.001) -c(-1,0, 0)
[,1]
[1,] 0.009
[2,] 0.990
[3,] 0.001
So I put in a row for each constraint/boundary:
constrOptim(c(0.99,0.001), fr, NULL, ui=rbind(c(-1,-1), # the -x-y > -1
c(1,0), # the x > 0
c(0,1) ), # the y > 0
ci=c(-1,0, 0)) # the thresholds
For this problem there is a potential difficulty in that for all values of x the function goes to Inf as y -> 0. I do get a max around x=.95 and y=0 even when I push the starting values out to the "corner", but I'm somewhat suspicious that this is not the true maximum which I would have guessed was in the "corner".
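The unboundedness can be seen directly by evaluating the (negated) objective along a path with y shrinking toward 0. A small sketch, with x fixed at an arbitrary feasible value of 0.5; fr is redefined here so the snippet runs on its own:

```r
# Negated objective, as passed to constrOptim (a minimizer)
fr <- function(x) -(log(x[1]) + x[1]^2 / x[2]^2)

y_values <- 10^(-(1:6))                              # y -> 0
obj <- sapply(y_values, function(y) fr(c(0.5, y)))   # x fixed at 0.5 (arbitrary)
print(data.frame(y = y_values, objective = obj))
```

The values decrease without bound as y shrinks, so on the open feasible region there is no finite maximum of the original function, and any reported optimum depends on how close to y = 0 the optimizer is allowed to get.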
EDIT:
Pursuing this I reasoned that the gradient might provide additional "direction" and added a gradient function:
grr <- function(x) { ## Gradient of 'fr'
x1 <- x[1]
x2 <- x[2]
c(-(1/x[1] + 2 * x[1]/x[2]^2),
2 * x[1]^2 /x[2]^3 )
}
This did "steer" the optimization a bit closer to the c(.999..., 0) corner, instead of moving away from it, as it did for some starting values. I remain somewhat disappointed that the process seems to "head for the cliff" when the starting values are close to the center of the feasible region:
constrOptim(c(0.99,0.001), fr, grr, ui=rbind(c(-1,-1), # the -x-y > -1
c(1,0), # the x > 0
c(0,1) ), # the y > 0
ci=c(-1,0, 0) )
$par
[1] 9.900007e-01 -3.542673e-16
$value
[1] -7.80924e+30
$counts
function gradient
2001 37
$convergence
[1] 11
$message
[1] "Objective function increased at outer iteration 2"
$outer.iterations
[1] 2
$barrier.value
[1] NaN
Note: Hans Werner Borchers posted a better example on R-Help that succeeded in getting the corner values by setting the constraint slightly away from the edge:
> constrOptim(c(0.25,0.25), fr, NULL,
ui=rbind( c(-1,-1), c(1,0), c(0,1) ),
ci=c(-1, 0.0001, 0.0001))
$par
[1] 0.9999 0.0001