Difference between Error functions - r

In Mathematica there is an option (see this question) to calculate the difference between two error functions. However, I have not yet found anything similar in R.
I need to calculate things like Erf(1604.041) - Erf(3117.127) and get a non-zero value...

You can come close to the result of 4e-1117421 given in the comment by @James.
First, the error function can be computed like this in R:
1 - 2 * pnorm(-sqrt(2) * x)
However, this will give you numerical zeros due to floating point precision. Fortunately, pnorm can return the log of the p-values. You can then exponentiate it using arbitrary precision numbers:
library(Rmpfr)
2 * exp(mpfr(pnorm(-sqrt(2) * 1604.041, log.p = TRUE), precBits = 32)) -
2 * exp(mpfr(pnorm(-sqrt(2) * 3117.127, log.p = TRUE), precBits = 32))
#1 'mpfr' number of precision 32 bits
#[1] 4.2826176801e-1117421
(Note that you get only floating point precision for the log-p-values.)
However, I wonder in what kind of application such precision is necessary. It's essentially a zero value.
Edit: And I've just found out that Rmpfr offers an implementation of the complementary error function. You can simply do this:
erfc(mpfr(3117.127, precBits = 32)) - erfc(mpfr(1604.041, precBits = 32))
#1 'mpfr' number of precision 32 bits
#[1] -4.2854514871e-1117421
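The two approaches agree up to sign, since erf(a) - erf(b) = erfc(b) - erfc(a). A minimal sketch of a wrapper (my own helper, assuming Rmpfr is installed) that returns the difference asked for in the question:
library(Rmpfr)
# erf(a) - erf(b), computed as erfc(b) - erfc(a) to avoid subtracting
# two values that are both exactly 1 in double precision
erf_diff <- function(a, b, prec = 64) {
  erfc(mpfr(b, precBits = prec)) - erfc(mpfr(a, precBits = prec))
}
erf_diff(1604.041, 3117.127)   # about -4.285e-1117421, matching the erfc result above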


How many Binary Digits needed for X Decimal Digits of Square Root of 2

I've been playing around with calculating the square root of 2 and the like. It's easy to come up with an algorithm that will produce n correct binary digits. What I'd like help with is determining how many binary digits I need to get m correct decimal digits. m binary digits will get me m decimal digits, but those m decimal digits may not all be correct yet.
EDIT:
I've determined that the lower bound on the binary precision = ceil(log2(10^m)).
Thinking about it, there might not be a strict upper bound, since a carry from any lower power of 2 (when converting to base 10) could potentially affect any higher base-10 digit.
This may thus be a dynamic problem that requires evaluating the fractional expansion at m binary digits and determining which additional binary digits could potentially cause a carry in base 10.
Edit 2: I was probably overthinking this. After the initial calculation I can keep adding (1x10^(-precision)) and squaring the result until I exceed 2 - and then subtract (1x10^(-precision)) and I'll have my answer. Nevertheless I am still interested in finding/developing such an algorithm :)
Let x be a real and y be its approximation.
Let RE be the relative error of y with respect to x:
RE(x, y) = abs(x - y) / abs(x)
Let b be an integer greater than 1. The Log-Relative Error in base b is defined as:
LREb(x, y) = -logb(RE(x, y))
where logb is the base-b logarithm:
logb(z) = log(z) / log(b)
for any positive z.
The LRE in base b represents the number of common digits between x and y. Here, the "number of correct digits" is not an integer but a real number: this simplifies the subsequent calculations by avoiding ceil and floor functions, provided that we accept statements such as "y has 2.3 correct digits with respect to x". More precisely, if x and y have q common base-b digits, then:
LREb(x, y) >= q - 1
With this relation, if the relative error has an upper bound, then LREb has a lower bound. More precisely, if:
RE(x, y) <= epsilon
then:
LREb(x, y) >= -logb(epsilon)
Also, if the number of correct digits in base 10 is LRE10 = p, then RE = 10^-p, which implies that the number of correct digits in base 2 is:
LRE2 = -log2(10^-p)
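A short base-R sketch of these definitions (helper names are my own), using sqrt(2) from the question above as the example:
# log-relative error of an approximation y to x, in base b
lre <- function(x, y, b = 10) -log(abs(x - y) / abs(x)) / log(b)
x <- sqrt(2)
y <- 1.41421                 # an approximation good to about 6 decimal digits
lre(x, y, b = 10)            # roughly 5.6 "correct" decimal digits
lre(x, y, b = 2)             # the same error expressed in binary digits
# converting a target of p correct decimal digits into binary digits:
p <- 10
-log2(10^-p)                 # = p * log2(10), about 33.2 bits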
What method are you using?
I am assuming a binary search for x in y = x^2.
The integer part is determined by the result sqrt(y) and cannot be cut, otherwise the result would be wrong. However, x is limited to roughly half the bits of y, so:
ni2 = log2(|y|)
The fractional part is trickier; see:
the relation between binary and decimal digits
After the nonlinear behaviour of the first few digits, the dependence stabilizes. Here is the reversed formula from the linked answer:
nf2 = (((nf10-7.810)/9.6366363636363636363636)+1.0)<<5;
ni2 is the number of integer-part binary digits (bits),
nf2 is the number of fractional-part binary digits (bits),
nf10 is the number of fractional-part decimal digits.
By the way, I used 32-bit-aligned values, as that is what I use for my arithmetic, so:
9.6366363636363636363636 = 32/0.30102999566398119521373889472449
0.30102999566398119521373889472449 = log10(2)
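For convenience, here is a direct R transcription of that formula (the << 5 is just a multiplication by 32, matching the 32-bit alignment mentioned above; the constants are copied verbatim, so treat it as the answer's empirical rule rather than an exact bound):
# fractional binary digits needed for nf10 fractional decimal digits
nf2_from_nf10 <- function(nf10) {
  (((nf10 - 7.810) / 9.6366363636363636363636) + 1.0) * 32
}
nf2_from_nf10(100)   # around 338 bits for 100 fractional decimal digits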

Getting 0 in R instead of a precise result

How can I get the actual precise result instead of the rounded ones?
result = ((0.61)**(10435)*(0.39)**(6565))/((0.63)**(5023)*(0.60)**(5412)*(0.37)**(2977)*(0.40)**(3588))
Output:
NaN
because both the numerator and the denominator underflow to 0, and 0/0 is NaN.
I think the logarithm is a powerful tool for dealing with exponentials raised to large powers (see its properties at https://mathworld.wolfram.com/Logarithm.html).
You can take the log of your expression first and then apply exp at the end, i.e.,
result <- exp((10435*log(0.61)+6565*log(0.39)) - (5023*log(0.63)+5412*log(0.60)+ 2977*log(0.37)+3588*log(0.40)))
which gives
> result
[1] 0.001219116
R cannot handle such large exponents in double precision: the intermediate results underflow to 0. Precision (and range) is not infinite. For what you want, you need an arbitrary-precision package, such as Rmpfr.
library(Rmpfr)
precision <- 120
result <- (mpfr(0.61, precision)**10435 * mpfr(0.39, precision)**6565) /
(mpfr(0.63, precision)**5023 * mpfr(0.60, precision)**5412 * mpfr(0.37, precision)**2977 * mpfr(0.40, precision)**3588)
print(result)
Output:
1 'mpfr' number of precision 120 bits
[1] 0.0012191160601483692718001967190171336975

Numerical blowup problem in a fractional function in R

Hi,
I am working with this function in R:
betaFun = function(x){
  if(x == 0){
    return(0.5)
  }
  return( ( 1+exp(x)*(x-1) )/( x*(exp(x)-1) ) )
}
The function is smooth and well defined for every x (at least from a theoretical point of view), and as x approaches 0 the limit is 0.5 (you can convince yourself of this using L'Hôpital's rule).
I have the following problem: due to the limiting behaviour, R computes the values incorrectly and I get a blow-up at 0.
Here I report the numerical issue:
x = c(1e-4, 1e-6, 1e-8, 1e-10, 1e-12, 1e-13)
sapply(x, betaFun)
[1] 5.000083e-01 5.000442e-01 2.220446e+00 0.000000e+00 0.000000e+00 1.111111e+10
As you can see the evaluation is pretty weird, in particular the last one.
I thought that I could solve this problem by defining the value at 0 explicitly (as you can see from the code), but that does not help.
Do you know how I can solve this numerical blow-up problem?
I need high precision for this function since I have to invert it around 0. I will do that using the nleqslv function from the nleqslv package. Of course the inversion will return wrong solutions if the function has numerical problems.
I think that you are losing accuracy in the evaluation of exp(x)-1 for x close to 0. In C if I evaluate your function as
#include <math.h>   /* for exp() and expm1() */

double f2(double x)
{
    return (x == 0) ? 0.5
                    : (x * exp(x) - expm1(x)) / (x * expm1(x));
}
The problem goes away. Here expm1 is a math library function that computes exp(x) - 1, without losing accuracy for small x. I'm afraid I don't know if R has this, but you'd hope it would.
I think, though, that you would do better to test whether |x| is sufficiently small, rather than comparing with 0.0. The point is that for small enough x both x*exp(x) and expm1(x) will be, as doubles, equal to x, so their difference will be 0. To keep maximum accuracy you may need to add a linear term to the 0.5 you return. I've not worked out precisely what 'sufficiently small' should be, but it's somewhere around 1e-16 I think.
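For what it's worth, base R does provide expm1() (and log1p()), so the same rewrite works directly in R. A minimal sketch (my own function name, not from the original post):
# same algebraic rewrite as the C version above:
# 1 + exp(x)*(x-1) == x*exp(x) - expm1(x)
betaFun_expm1 <- function(x) {
  ifelse(x == 0, 0.5, (x * exp(x) - expm1(x)) / (x * expm1(x)))
}
x <- c(1e-4, 1e-6, 1e-8, 1e-10, 1e-12, 1e-13)
betaFun_expm1(x)   # stays close to 0.5 instead of blowing up
As the answer notes, for the very smallest |x| the numerator still cancels, so a series or constant branch remains advisable there.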
Your problem is that you take the quotient of two numbers with very small absolute values. Such numbers are only represented to floating point precision.
You don't specify why you need these function values for x values close to zero. One easy option would be coercion to high precision numbers:
library(Rmpfr)
betaFun = function(x){
  x <- mpfr(as.character(x), precBits = 256)
  # if x is calculated, you should switch to high-precision numbers for its calculation;
  # this step could then be removed
  # do the calculation with high precision,
  # then coerce to normal precision (assuming that is necessary)
  ifelse(x == 0, 0.5, as((1 + exp(x) * (x - 1)) / (x * (exp(x) - 1)), "numeric"))
}
x = c(1e-4, 1e-6, 1e-8, 1e-10, 1e-12, 1e-13, 0)
betaFun(x)
#[1] 0.5000083 0.5000001 0.5000000 0.5000000 0.5000000 0.5000000 0.5000000
As you noticed, you are encountering the problem near zero. Both the numerator and the denominator have a root at zero, and as the OP mentioned, applying L'Hôpital's rule shows that the limit of f(x) there is 1/2.
From a numerical point of view, things go slightly differently. Floating-point numbers always carry an error, as not every real number can be represented as a floating-point number. For example:
exp(1E-3) - 1   = 0.0010005001667083845973138522822409868                  # numeric
exp(1/1000) - 1 = 0.001000500166708341668055753993058311563076200580...    # true
(the two values diverge after about 14 significant digits)
The problem in numerically evaluating exp(1E-3) - 1 already starts at the beginning, i.e. with 1E-3 itself:
1E-3 = x = 0.0010000000000000000208166817117216851
exp(x) = 1.0010005001667083845973138522822409868
exp(x) - 1 = 0.0010005001667083845973138522822409868
1E-3 cannot be represented exactly as a floating-point number; it is accurate only up to about 17 digits.
IEEE arithmetic gives the closest possible floating-point value to the true value of x, so exp(x) starts from a slightly wrong input; the computed exp(x) is again only accurate up to about 17 digits.
By subtracting 1 we get a bunch of zeros at the beginning, and now our result is only accurate up to about 14 digits.
So, now that we know that we cannot represent everything exactly in floating point, you should realize that things become awkward near zero: both the numerator and the denominator become less and less accurate, especially near 1E-13.
numerator_numeric(1E-13) = 1.1102230246251565E-16
numerator_true(1E-13) = 5.00000000000033333333333...E-27
Generally, what you do near such a point is use a Taylor expansion around zero, and the normal function everywhere else:
betaFun = function(x){
  if(-1E-1 < x && x < 1E-1){
    return(0.5 + x/12. - x^3/720. + x^5/30240.)
  }
  return( ( 1+exp(x)*(x-1) )/( x*(exp(x)-1) ) )
}
The above expansion is accurate up to about 13 digits for x in the small region |x| < 0.1.
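A quick sanity check (my own snippet) that the series branch and the direct formula agree near the hand-over point x = 0.1:
x <- 0.1
series <- 0.5 + x/12 - x^3/720 + x^5/30240
direct <- (1 + exp(x) * (x - 1)) / (x * (exp(x) - 1))
abs(series - direct)   # a very small difference, so the two branches join smoothly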

R: approximating `e = exp(1)` using `(1 + 1 / n) ^ n` gives absurd result when `n` is large

So, I was just playing around with manually calculating the value of e in R and I noticed something that was a bit disturbing to me.
The value of e using R's exp() command...
exp(1)
#[1] 2.718282
Now, I'll try to manually calculate it using x = 10000
x <- 10000
y <- (1 + (1 / x)) ^ x
y
#[1] 2.718146
Not quite but we'll try to get closer using x = 100000
x <- 100000
y <- (1 + (1 / x)) ^ x
y
#[1] 2.718268
Warmer but still a bit off...
x <- 1000000
y <- (1 + (1 / x)) ^ x
y
#[1] 2.71828
Now, let's try it with a huge one
x <- 5000000000000000
y <- (1 + (1 / x)) ^ x
y
#[1] 3.035035
Well, that's not right. What's going on here? Am I overflowing the data type and need to use a certain package instead? If so, are there no warnings when you overflow a data type?
You've got a problem with machine precision. As soon as (1 / x) < 2.22e-16, 1 + (1 / x) is just 1. The mathematical limit breaks down in finite-precision numerical computations. Your final x in the question is already 5e+15, very close to this brink. Try x <- x * 10, and your y would be 1.
This is neither "overflow" nor "underflow" as there is no difficulty in representing a number as small as 1e-308. It is the problem of the loss of significant digits during floating-point arithmetic. When you do 1 + (1 / x), the bigger x is, the fewer significant digits in the (1 / x) part can be preserved when you add it to 1, and eventually you lose that (1 / x) term altogether.
## valid 16 significant digits
1 + 1.23e-01 = 1.123000000000000|
1 + 1.23e-02 = 1.012300000000000|
... ...
1 + 1.23e-15 = 1.000000000000001|
1 + 1.23e-16 = 1.000000000000000|
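For illustration, a few quick base-R checks of that threshold:
.Machine$double.eps    # about 2.22e-16
1 + 1e-15 > 1          # TRUE:  the small term still survives the addition
1 + 1e-17 > 1          # FALSE: the small term is lost entirely
x <- 5e16
(1 + 1 / x)^x          # 1, because 1 + 1/x is already exactly 1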
Any numerical analysis book would tell you the following.
Avoid adding a large number and a small number. In floating-point addition a + b = a * (1 + b / a); if b / a < 2.22e-16, then a + b = a. This implies that when adding up a number of positive numbers, it is more stable to accumulate them from the smallest to the largest.
Avoid subtracting one number from another of the same magnitude, or you may get cancellation error. A classic example is rearranging the quadratic formula to avoid cancellation.
You are also advised to have a read of Approximation to constant "pi" does not get any better after 50 iterations, a question asked a few days after yours. Using a series to approximate an irrational number is numerically stable, as you won't get the absurd behavior seen in your question. But the finite number of valid significant digits imposes a different problem: numerical convergence, that is, you can only approximate the target value up to a certain number of significant digits. MichaelChirico's answer using a Taylor series would converge after 19 terms, since 1 / factorial(19) is already numerically 0 when added to 1.
Multiplication / division between floating-point numbers doesn't cause problems with significant digits; it may cause "overflow" or "underflow". However, given the wide range of representable floating-point values (1e-308 ~ 1e+307), "overflow" and "underflow" should be rare. The real difficulty is with addition / subtraction, where significant digits can easily be lost. See Can I stably invert a Vandermonde matrix with many small values in R? for an example on matrix computations. It is not impossible to get higher precision, but the work is probably more involved. For example, the OP of the matrix example eventually used GMP (the GNU Multiple Precision Arithmetic Library) and associated R packages to proceed: How to put Rmpfr values into a function in R?
You might also try the Taylor series approximation to exp(1), namely
e^x = \sum_{k=0}^{\infty} x^k / k!
Thus we can approximate e = e^1 by truncating this sum; in R:
sprintf('%.20f', exp(1))
# [1] "2.71828182845904509080"
sprintf('%.20f', sum(1/factorial(0:10)))
# [1] "2.71828180114638451315"
sprintf('%.20f', sum(1/factorial(0:100)))
# [1] "2.71828182845904509080"

Dealing with very small numbers in R

I need to calculate a list of very small numbers such as
(0.1)^1000, 0.2^(1200),
and then normalize them so they will sum up to one
i.e.
a1 = 0.1^1000,
a2 = 0.2^1200
And I want to calculate
a1' = a1/(a1+a2),
a2' = a2/(a1+a2).
I'm running into underflow problems, as I get a1=0. How can I get around this?
Theoretically I could deal with logs, and then log(a1) = 1000*log(0.1) would be a way to represent a1 without underflow problems, but in order to normalize I would need to get
log(a1+a2) - which I can't compute since I can't represent a1 directly.
I'm programming in R; as far as I can tell there is no data type (like Decimal in C#) that allows you to get better than double-precision values.
Any suggestions will be appreciated, thanks
Mathematically speaking, one of those normalized numbers will be approximately zero and the other approximately one. The difference between your numbers is huge, so I'm even wondering whether this makes sense.
But to do that in general, you can use the idea from the logspace_add C function that is under the hood of R. One can define logxpy ( = log(x+y) ), where lx = log(x) and ly = log(y), as:
logxpy <- function(lx,ly) max(lx,ly) + log1p(exp(-abs(lx-ly)))
Which means that we can use :
> la1 <- 1000*log(0.1)
> la2 <- 1200*log(0.2)
> exp(la1 - logxpy(la1,la2))
[1] 5.807714e-162
> exp(la2 - logxpy(la1,la2))
[1] 1
This function can be called recursively as well if you have more numbers. Mind you, 1 is still 1, and not 1 minus 5.807...e-162. If you really need more precision and your platform supports long double types, you could code everything in e.g. C or C++ and return the results later on. But if I'm right, R can, for the moment, only deal with normal doubles, so ultimately you'll lose the precision again when the result is shown.
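A vectorised sketch of that idea (my own helper name, same max-then-log trick) for normalizing any number of log-values at once:
# log(sum(exp(lx))) for a whole vector of log-values
logsumexp <- function(lx) {
  m <- max(lx)
  m + log(sum(exp(lx - m)))
}
la <- c(1000 * log(0.1), 1200 * log(0.2))
exp(la - logsumexp(la))   # normalized weights, roughly c(5.8e-162, 1)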
EDIT:
To do the math for you:
log(x+y) = log(exp(lx) + exp(ly))
         = log( exp(lx) * (1 + exp(ly - lx)) )
         = lx + log( 1 + exp(ly - lx) )
Now you just take the larger of the two as lx, and you arrive at the expression in logxpy().
EDIT 2: Why take the maximum then? Easy: to make sure the argument passed to exp() is negative. If lx - ly gets too big, exp(lx - ly) returns Inf, which is not a correct result; exp(ly - lx) instead returns 0, which allows for a far better result:
Say lx=1 and ly=1000, then :
> 1+log1p(exp(1000-1))
[1] Inf
> 1000+log1p(exp(1-1000))
[1] 1000
The Brobdingnag package deals with very large or small numbers, essentially wrapping Joris's answer into a convenient form.
library(Brobdingnag)
a1 <- as.brob(0.1)^1000
a2 <- as.brob(0.2)^1200
a1_dash <- a1 / (a1 + a2)
a2_dash <- a2 / (a1 + a2)
as.numeric(a1_dash)
as.numeric(a2_dash)
Try the arbitrary precision packages:
Rmpfr "R MPFR - Multiple Precision Floating-Point Reliable"
Ryacas "R Interface to the 'Yacas' Computer Algebra System" - may also be able to do arbitrary precision.
Maybe you can treat a1 and a2 as fractions. In your example, with
a1 = (a1num/a1denom)^1000 # 1/10
a2 = (a2num/a2denom)^1200 # 1/5
you would arrive at
a1' = (a1num^1000 * a2denom^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
a2' = (a1denom^1000 * a2num^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
which can be computed using the gmp package:
library(gmp)
a1 <- as.double(pow.bigz(5,1200) / (pow.bigz(5,1200)+ pow.bigz(10,1000)))
