Simple approximation of the inverse incomplete gamma function

How could one approximate the inverse of the incomplete gamma function Γ(s,x) by some simple analytical function x = f(s,Γ)?
That means writing something like x = f(s,Γ) = 12*log(123.45*Γ) + Γ + 123.4^s.
(I need at least ideas or references.)

You can look at the code in Boost: http://www.boost.org/doc/libs/1_35_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/sf_gamma/igamma.html and see what they're using.
EDIT: They also have inverses: http://www.boost.org/doc/libs/1_35_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/sf_gamma/igamma_inv.html
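For reference, a small sketch of my own (not from the Boost docs): if R is at hand and Γ denotes the regularized upper incomplete gamma function (so 0 < Γ < 1), qgamma already computes this inverse and can serve as ground truth:
# x such that Q(s, x) = G, with Q the regularized upper incomplete gamma
qgamma(0.5, shape = 3, lower.tail = FALSE)    # ~2.674
pgamma(2.674, shape = 3, lower.tail = FALSE)  # ~0.5 (round trip)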

I've found that x = f(s,Γ) with a given s can be nicely approximated by x = p0*(1-Γ)^p1*ln(Γ*p2). At least it worked for me with s <= 15 in the region 0.001 < Γ < 0.999.
Here p0, p1, p2 are constants, chosen by fitting f(s,Γ) once s has been fixed.
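A minimal sketch of such a fit (my illustration, not the answerer's code), assuming Γ is the regularized upper incomplete gamma so that R's qgamma supplies the reference inverse; the starting values are guesses:
s <- 5
G <- seq(0.001, 0.999, length.out = 500)       # the region quoted above
x <- qgamma(G, shape = s, lower.tail = FALSE)  # reference inverse for this s
sse <- function(p) {
  if (p[3] <= 0) return(Inf)                   # keep log(G * p2) defined
  sum((p[1] * (1 - G)^p[2] * log(G * p[3]) - x)^2)
}
fit <- optim(c(-1, 1, 1), sse)                 # Nelder-Mead least squares
fit$par                                        # fitted p0, p1, p2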

There's a pretty good implementation in Cephes. There's also a D translation that I think fixes a few bugs in the Cephes version.

Logsoftmax stability

I know how to make softmax stable by subtracting max_i x_i from every element. This avoids overflow and underflow.
Now, taking the log of this can still cause underflow: softmax(x) can evaluate to zero, and its log is then -infinity.
I am not sure how to fix it. I know this is a common problem. I read several answers on it, which I didn't understand. But I am still confused about how to solve this problem.
PS: If you provide a simple example, it would be awesome.
In order to stabilize logsoftmax, most implementations, such as TensorFlow and Theano, use a trick which takes out the largest component b = max(x_i). This trick is often used for stably computing softmax. For logsoftmax, we begin with:
log softmax(x_i) = log( exp(x_i) / sum_j exp(x_j) ) = x_i - log( sum_j exp(x_j) )
After writing exp(x_j) = exp(x_j - b) * exp(b), extracting the exp(b) out of the sum, and using the fact that log(exp(x)) = x, we have:
log softmax(x_i) = x_i - b - log( sum_j exp(x_j - b) )
If we set b = max_i x_i, this new equation has both overflow and underflow stability: the largest exponent inside the sum is 0, so the sum lies between 1 and n and its log is well behaved.
In terms of code, if x is a vector:
import numpy as np

def log_softmax(x):
    x_off = x - np.max(x)
    return x_off - np.log(np.sum(np.exp(x_off)))
See also: https://timvieira.github.io/blog/post/2014/02/11/exp-normalize-trick/
logsoftmax = logits - log(reduce_sum(exp(logits), dim))
refer: https://www.tensorflow.org/api_docs/python/tf/nn/log_softmax
Just use tf.nn.softmax_cross_entropy_with_logits, as it takes care of the NaN issue:
tf.nn.softmax_cross_entropy_with_logits(
    labels, logits, axis=-1, name=None
)
logits = tf.constant([[4, 5, 1000]], dtype = tf.float32)
labels = tf.constant([[1,0,1]], dtype = tf.float32)
# Case-1
output = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
print(output)
>>> tf.Tensor([996.], shape=(1,), dtype=float32)
#Case-2
a = tf.nn.softmax(logits)
output = tf.reduce_sum(-(labels * tf.math.log(a)))
print(output)
>>> tf.Tensor(nan, shape=(), dtype=float32)
# this happens because the softmax values truncate to zero
print(a)
>>> <tf.Tensor: shape=(1, 3), dtype=float32, numpy=array([[0., 0., 1.]], dtype=float32)>
Mathematical tricks cannot make log(0) be something other than -inf.
If you think it through, the only way out is to normalize the data so that you don't end up there.

R: How to add additional constraints to DEoptim

I am trying to minimize an objective function using DEoptim, subject to a simple constraint. It is not clear to me how to add the simple constraint to the call to DEoptim. Here is the objective function:
obj_min <- function(n, in_data) {
  gamma <- in_data$Gamma
  delta <- in_data$Delta
  theta <- in_data$Theta
  gammaSum <- sum(n * gamma)
  deltaSum <- sum(n * delta)
  thetaSum <- sum(n * theta)
  # EPC is a constant defined elsewhere in my script
  abs((EPC * gammaSum - 2 * abs(deltaSum)) / thetaSum)
}
My mapping function (to impose integer constraints) is as follows:
mappingFun <- function(x) round(x)
My call to DEoptim is:
out <- DEoptim(obj_min, lower = rep(-5, nrow(in_data)),
               upper = rep(5, nrow(in_data)),
               fnMap = mappingFun, control = DEoptim.control(trace = FALSE),
               in_data = in_data)
My in_data object (data frame) is:
Underlying.Price Delta Gamma Theta Vega Rho Implied.Volatility
1 40.69 0.9237 3.2188 -0.7111 2.0493 0.0033 0.3119
2 40.69 0.7713 6.2267 -1.6352 4.3240 0.0032 0.3402
3 40.69 0.5822 8.4631 -2.0019 5.5782 0.0338 0.3229
4 40.69 0.3642 8.5186 -1.8403 5.3661 0.0210 0.3086
5 40.69 0.1802 6.1968 -1.2366 3.7517 0.0093 0.2966
I would like to add a simple constraint that:
sum(n * delta) = target
In other words, the sum of the optimized parameters, n, multiplied by the deltas in my in_data data frame should equal some target. For simplicity, let's just say 0.5. How do I impose
sum(n * delta) = 0.5
as a constraint? Thank you for your help!
OK, thank you for all of your suggestions. I have researched and worked through my problem from many angles, and I wanted to share my thoughts with everyone, in case they can be helpful to some of you.
Most obviously, in my particular objective function, deltaSum is a variable, and I am attempting to constrain it to a particular value. Simple substitution of this constrained value into the objective function solves that case (trivial). However, if I were to introduce a constraint on a variable which is not already in the objective function, I can simply add a check which returns Inf whenever a constraint I wish to impose is violated, e.g.:
obj_func_sum_RRRs <- function(n, in_data) {
  # declare deltaSum, gammaSum, thetaSum, vegaSum, and rhoSum from in_data
  # impose constraints
  # no dividing by 0:
  if (thetaSum == 0) {
    return(Inf)
  }
  # regardless of the length of the vector of variables being optimized,
  # only accept final solutions with either 4 or 6 nonzero n's
  if (sum(n != 0) != 4 && sum(n != 0) != 6) {
    return(Inf)
  }
  (deltaSum + gammaSum) / thetaSum
}
The first check (thetaSum == 0, return Inf) works because Inf is a value the optimizer understands (and will never select as optimal), whereas division by 0 in R returns NaN, which "breaks" the optimization process. This is a bit hacky, in that it is likely not the most computationally efficient way to approach the problem, but to be honest, with the infrastructure that I am developing with a close friend and software-architect guru (which utilizes microservices deployed through Microsoft Service Fabric), our long-range backtesting is still lightning quick. This methodology actually allows you to impose any number of constraints on your problem, although further testing would be needed to see how burdensome the computational complexity can become with this technique...
The Lagrange technique above can be viable, but only if you derive an analytical form of lambda on paper and then implement it in code. It is not always practical in application, and while you may be able to code up an algorithm to optimize the parameter, it sounds like a bad idea to paint yourself into a corner where you have to optimize a parameter which is, in turn, necessary to optimizing the original objective function. Simply adding a check as advised above seems the better way to go.
Food for thought....
The DEoptim package description says:
Implements the differential evolution algorithm for global
optimization of a real-valued function of a real-valued parameter
vector.
This notion of global optimization has no place for constraints; it is also known as unconstrained optimization. So, sorry, it's not possible directly. Having said that, you can always use the "Lagrange multiplier" hack if you must. To do it you need to do something like:
abs((EPC * gammaSum - 2 * abs(deltaSum))/thetaSum) - lambda * (sum(n * delta) - 0.5)
where you penalize the slack of your constraint.
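A minimal sketch of such a penalized call (my illustration, using a quadratic penalty rather than the linear term above; lambda and the 0.5 target are assumed values, and obj_min, mappingFun, and in_data are as defined in the question):
library(DEoptim)
penalized <- function(n, in_data, target = 0.5, lambda = 1e3) {
  # a large lambda pushes sum(n * Delta) toward the target at the optimum
  obj_min(n, in_data) + lambda * (sum(n * in_data$Delta) - target)^2
}
out <- DEoptim(penalized, lower = rep(-5, nrow(in_data)),
               upper = rep(5, nrow(in_data)),
               fnMap = mappingFun, control = DEoptim.control(trace = FALSE),
               in_data = in_data)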
I am using a wrapper which customises the call to DEoptim based on external constraints. Not very elegant, I admit, but it works to some extent.
My objective function - a Monte Carlo simulation - is quite time consuming,
so constraints are really helpful...
Chris
Due to the very specific character of what I am doing (Monte Carlo ray tracing for the optimisation of neutron beam optics), I did not see any reason to add code. I think it is really the concept that matters here. I'll gladly share what I have with anybody interested. Just let me know... Chris

How can extreme values of a functional be found using R?

I have a functional like this:
$v[y] = \int_0^2 (y'^2 + 23yy' + 12y^2 + 3ye^{2t})\,dt$
with given start and end conditions y(0)=-1, y(2)=18.
How can I find the extreme values of this functional in R? I realize how it can be done, for example, in Excel, but I didn't find an appropriate solution in R.
Before trying to solve such a task in a numerical setting, it might be better to lean back and think about it for a moment.
This is a problem typically treated in the mathematical discipline of "variational calculus". A necessary condition for a function y(t) to be an extremum of the functional (i.e. the integral) is the so-called Euler-Lagrange equation; see
Calculus of Variations at Wolfram Mathworld.
Applying it to f(t, y, y') as the integrand in your request, I get (please check, I can easily have made a mistake)
y'' - 12*y - 3/2*exp(2*t) = 0
You can go now and find a symbolic solution for this differential equation (with the help of a textbook, or some CAS), or solve it numerically with the help of an R package such as 'deSolve'.
PS: Solving this as an optimization problem based on discretization is possible, but may lead you on a long and stony road. I remember solving the "brachistochrone problem" to a satisfactory accuracy only by applying several hundred variables (not in R).
Here is a numerical solution in R. First the functional, discretized on a uniform grid over [0, 2]:
f <- function(y, t = head(seq(0, 2, len = length(y)), -1)) {
  len <- length(y) - 1                     # number of subintervals (dt = 2/len)
  dy <- diff(y) * len / 2                  # y' on each subinterval
  y0 <- (head(y, -1) + y[-1]) / 2          # y at the midpoints
  2 * sum(dy^2 + 23 * y0 * dy + 12 * y0^2 + 3 * y0 * exp(2 * t)) / len
}
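As a quick illustration (my addition), the value of the functional on the straight line joining the boundary values gives a baseline that the optimizer should beat:
f(seq(-1, 18, len = 500))   # v[y] for the straight line y(t) = -1 + 9.5*t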
Now the function that does the actual optimization. The best results I got were using the BFGS optimization method and parametrizing by dy rather than y:
findMinY <- function(points = 100,         ## number of points of evaluation
                     boundary = c(-1, 18), ## boundary values
                     y0 = NULL,            ## optional initial value
                     method = "Nelder-Mead", ## optimization method
                     dff = TRUE)           ## if TRUE, optimizes based on dy rather than y
{
  t <- head(seq(0, 2, len = points), -1)
  if (is.null(y0) || length(y0) != points)
    y0 <- seq(boundary[1], boundary[2], len = points)
  if (dff)
    y0 <- diff(y0)
  else
    y0 <- y0[-1]
  y0 <- head(y0, -1)
  ff <- function(z) {
    if (dff)
      y <- c(cumsum(c(boundary[1], z)), boundary[2])
    else
      y <- c(boundary[1], z, boundary[2])
    f(y, t)
  }
  res <- optim(y0, ff, control = list(maxit = 1e9), method = method)
  cat("Iterations:", res$counts, "\n")
  ymin <- res$par
  if (dff)
    c(cumsum(c(boundary[1], ymin)), boundary[2])
  else
    c(boundary[1], ymin, boundary[2])
}
With 500 points of evaluation, it only takes a few seconds with BFGS:
> system.time(yy<-findMinY(500,method="BFGS"))
Iterations: 90 18
user system elapsed
2.696 0.000 2.703
The resulting function looks like this:
plot(seq(0,2,len=length(yy)),yy,type='l')
And now a solution that numerically integrates the Euler equation.
As @HansWerner pointed out, this problem boils down to applying the Euler-Lagrange equation to the integrand in the OP's question, and then solving that differential equation, either analytically or numerically. In this case the relevant ODE is
y'' - 12*y = 3/2*exp(2*t)
subject to:
y(0) = -1
y(2) = 18
So this is a boundary value problem, best approached using bvpcol(...) in package bvpSolve.
library(bvpSolve)
F <- function(t, y.in, pars) {
  dy <- y.in[2]
  d2y <- 12 * y.in[1] + 1.5 * exp(2 * t)
  return(list(c(dy, d2y)))
}
init <- c(-1, NA)
end <- c(18, NA)
t <- seq(0, 2, by = 0.01)
sol <- bvpcol(yini = init, yend = end, x = t, func = F)
y <- function(t) { # analytic solution...
  b <- sqrt(12)
  a <- 1.5 / (4 - b * b)
  u <- exp(2 * b)
  C1 <- ((18 * u + 1) - a * (exp(4) * u - 1)) / (u * u - 1)
  C2 <- -1 - a - C1
  return(a * exp(2 * t) + C1 * exp(b * t) + C2 * exp(-b * t))
}
par(mfrow = c(1, 2))
plot(t, y(t), type = "l", xlim = c(0, 2), ylim = c(-1, 18), col = "red", main = "Analytical Solution")
plot(sol[, 1], sol[, 2], type = "l", xlim = c(0, 2), ylim = c(-1, 18), xlab = "t", ylab = "y(t)", main = "Numerical Solution")
It turns out that in this very simple example there is an analytical solution:
y(t) = a * exp(2*t) + C1 * exp(sqrt(12)*t) + C2 * exp(-sqrt(12)*t)
where a = -3/16 and C1 and C2 are determined so as to satisfy the boundary conditions. As the plots show, the numerical and analytic solutions agree completely, and they also agree with the solution provided by @mrip.
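A one-line numerical check of that agreement (my addition, reusing sol and y from above):
max(abs(sol[, 2] - y(sol[, 1])))   # should be tiny, near the solver tolerance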

Why is nlogn so hard to invert?

Let's say I have a function that is n log n in space requirements, and I want to work out the maximum size of input for that function given the available space, i.e. I want to find n where n log n = c.
I followed an approach to calculate n that looks like this in R:
# solve n*log(n) = R via the fixed point of z = log(log(R) - z),
# where z = log(log(n)), so that n = exp(exp(z))
step = function(R, z) { log(log(R) - z) }
guess = function(R) log(log(R))
inverse_nlogn = function(R, accuracy = 1e-10) {
  zi_1 = 0
  z = guess(R)
  while (abs(z - zi_1) > accuracy) {
    zi_1 = z
    z = step(R, z)
  }
  exp(exp(z))
}
But I can't understand why it must be solved iteratively. For the range we are interested in (n > 1), the function is non-singular.
There's nothing special about n log n: nearly all elementary functions fail to have elementary inverses, and so have to be inverted by some other means: bisection, Newton's method, the Lagrange inversion theorem, series reversion, the Lambert W function...
As Gareth hinted, the Lambert W function (e.g. here) gets you almost there; indeed, n = c/W(c).
A wee google found this, which might be helpful.
Following up (being completely explicit):
library(emdbook)
n <- 2.5
c <- 2.5*log(2.5)
exp(lambertW(c)) ## 2.5
library(gsl)
exp(lambert_W0(c)) ## 2.5
There are probably minor differences in speed, accuracy, etc. between the two implementations. I haven't tested/benchmarked them extensively. (Now that I have tried
library(sos)
findFn("lambert W")
I discover that it's implemented all over the place: the games package, and a whole package that's called LambertW ...)

Computation of numerical integral involving convolution

I have to solve the following convolution-related numerical integration problem in R, or perhaps in a computer algebra system like Maxima.
Integral[({k(y)-l(y)}^2)dy]
where
k(.) is the pdf of a standard normal distribution
l(y)=integral[k(z)*k(z+y)dz] (standard convolution)
z and y are scalars
The domain of y is -inf to +inf.
The integral in the function l(.) runs over all z from -inf to +inf (an improper integral). Do I need to add any additional assumption on z to obtain this?
Thank you.
Here is a symbolic solution from Mathematica (the output was posted as an image; the Maxima answer below works the same problem).
R does not do symbolic integration, just numerical integration. There is the Ryacas package, which interfaces with Yacas, a symbolic math program that may help.
See the distr package for possible help with the convolution part (it will do the convolutions; I just don't know if the result will be symbolically integrable).
You can numerically integrate the convolutions from distr using the integrate function, but all the parameters need to be specified as numbers, not variables.
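A minimal numerical sketch along those lines (my illustration; it leans on distr's exact convolution of normals, and l could equivalently be written directly as dnorm(y, sd = sqrt(2))):
library(distr)
l <- d(Norm() + Norm())   # density of the sum of two standard normals, i.e. k convolved with k
integrate(function(y) (dnorm(y) - l(y))^2, -Inf, Inf)$value   # ~0.0209, matching the Maxima value below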
For the record, here is the same problem solved with Maxima 5.26.0.
(%i2) k(u):=exp(-(1/2)*u^2)/sqrt(2*%pi) $
(%i3) integrate (k(x) * k(y + x), x, minf, inf);
(%o3) %e^-(y^2/4)/(2*sqrt(%pi))
(%i4) l(y) := ''%;
(%o4) l(y):=%e^-(y^2/4)/(2*sqrt(%pi))
(%i5) integrate ((k(y) - l(y))^2, y, minf, inf);
(%o5) ((sqrt(2)+2)*sqrt(3)-2^(5/2))/(4*sqrt(3)*sqrt(%pi))
(%i6) float (%);
(%o6) .02090706601281356
Sorry for the late reply. Leaving this here in case someone finds it by searching.
I am trying to do something similar in Matlab, where I convolve two random (Rayleigh-distributed) variables. The result fz_fun comes out equal to fy_fun, and I don't know why. Maybe someone here knows?
sigma1 = 0.45;
sigma2 = 0.29;
% Rayleigh densities of the random variables X, Y:
fx_fun = @(x) [0*x(x<0) , (x(x>=0)./sigma1^2).*exp(-0.5*(x(x>=0)./sigma1).^2)];
fy_fun = @(y) [0*y(y<0) , (y(y>=0)./sigma2^2).*exp(-0.5*(y(y>=0)./sigma2).^2)];
step = 0.1;
x = -2:step:3;
y = -2:step:3;
%% Convolution:
z = y;
fz = zeros(size(y));
for i = 1:length(y)
    % probability density of the random variable z = x + y
    fz_fun(i) = integral(@(z) fy_fun(y(i)).*fx_fun(z-y(i)), 0, Inf);
end
