Ciao,
I am working with this function in R:
betaFun = function(x){
if(x == 0){
return(0.5)
}
return( ( 1+exp(x)*(x-1) )/( x*(exp(x)-1) ) )
}
The function is smooth and well defined for every x (at least theoretically), and as x approaches 0 its limit is 0.5 (you can convince yourself of this using L'Hôpital's rule).
My problem is that, near 0, R computes the values incorrectly and the result blows up.
Here is the numerical issue:
x = c(1e-4, 1e-6, 1e-8, 1e-10, 1e-12, 1e-13)
sapply(x, betaFun)
[1] 5.000083e-01 5.000442e-01 2.220446e+00 0.000000e+00 0.000000e+00 1.111111e+10
As you can see, the evaluation is pretty erratic, especially the last value.
I thought I could solve this by defining the value at 0 explicitly (as you can see in the code), but that does not help.
Do you know how I can solve this numerical blow-up problem?
I need high precision for this function because I have to invert it around 0. I will do that with the nleqslv function from the nleqslv package. Of course, the inversion will return wrong solutions if the function has numerical problems.
I think that you are losing accuracy in the evaluation of exp(x)-1 for x close to 0. In C if I evaluate your function as
double f2( double x)
{ return (x==0) ? 0.5
: (x*exp(x) - expm1(x))/( x*expm1(x));
}
The problem goes away. Here expm1 is a math library function that computes exp(x) - 1, without losing accuracy for small x. I'm afraid I don't know if R has this, but you'd hope it would.
I think, though, that you would be better off testing whether |x| is sufficiently small, rather than exactly 0.0. The point is that for small enough x both x*exp(x) and expm1(x) will be, as doubles, equal to x, so their difference will be 0. To keep maximum accuracy you may need to add a linear term to the 0.5 you return. I've not worked out precisely what 'sufficiently small' should be, but it's somewhere around 1e-16, I think.
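For what it's worth, base R does provide expm1(), so the C rearrangement above carries over directly. The following is only a minimal sketch; betaFun_expm1 is a hypothetical name, not something from the original posts. The subtraction in the numerator still loses digits as |x| shrinks further, so a small-|x| branch as suggested above remains advisable.

```r
# Sketch: same rearrangement as the C version, using base R's expm1().
# betaFun_expm1 is a hypothetical name chosen for this illustration.
betaFun_expm1 <- function(x) {
  # ifelse() keeps the function vectorized; the x == 0 case returns the limit
  ifelse(x == 0, 0.5, (x * exp(x) - expm1(x)) / (x * expm1(x)))
}

x <- c(1e-4, 1e-6, 1e-8, 0)
print(betaFun_expm1(x))
```

For the moderately small inputs shown here the values stay close to 0.5 instead of jumping around.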
Your problem is that you take the quotient of two numbers with very small absolute values. Such numbers are only represented to floating point precision.
You don't specify why you need these function values for x values close to zero. One easy option would be coercion to high precision numbers:
library(Rmpfr)
betaFun = function(x){
x <- mpfr(as.character(x), precBits = 256)
#if x is calculated, you should switch to high precision numbers for its calculation
#this step could be removed then
#do calculation with high precision,
#then coerce to normal precision (assuming that is necessary)
ifelse(x == 0, 0.5, as((1 + exp(x) * (x - 1)) / (x * (exp(x) - 1)), "numeric"))
}
x = c(1e-4, 1e-6, 1e-8, 1e-10, 1e-12, 1e-13, 0)
betaFun(x)
#[1] 0.5000083 0.5000001 0.5000000 0.5000000 0.5000000 0.5000000 0.5000000
As you noticed, you run into the problem near zero. Both the numerator and the denominator have a root at zero. And, as the OP mentioned, L'Hôpital's rule shows that the limit of f(x) as x approaches 0 is 1/2.
From a numerical point of view, things go slightly differently. Floating-point numbers always carry an error, because not every real number can be represented as a floating-point number. For example:
exp(1E-3) - 1   = 0.0010005001667083845973138522822409868             # numeric
exp(1/1000) - 1 = 0.001000500166708341668055753993058311563076200580... # true
                                    ^
The problem in evaluating numerically exp(1E-3)-1 already starts at the beginning, i.e. 1E-3
1E-3 = x = 0.0010000000000000000208166817117216851
exp(x) = 1.0010005001667083845973138522822409868
exp(x) - 1 = 0.0010005001667083845973138522822409868
1. 1E-3 cannot be represented exactly as a floating-point number; the stored value is accurate to about 17 digits.
2. IEEE arithmetic returns the floating-point value closest to the true value of exp(x), but x already carries the error from (1). So exp(x) is still only accurate to about 17 digits.
3. Subtracting 1 produces a run of leading zeros, so the result is now only accurate to about 14 digits.
So, now that we know that not everything can be represented exactly as a floating-point number, you should realize that near zero things become awkward: both the numerator and the denominator become less and less accurate, especially near 1E-13.
numerator_numeric(1E-13) = 1.1102230246251565E-16
numerator_true(1E-13) = 5.00000000000033333333333...E-27
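The loss of accuracy in the numerator can be reproduced in R. This is a small sketch, not part of the original answer: the series coefficients come from expanding 1 + exp(x)*(x - 1) = x^2/2 + x^3/3 + x^4/8 + ..., and the variable names are made up for the illustration.

```r
x <- 1e-13

# Direct evaluation: 1 and exp(x)*(x - 1) nearly cancel, so almost all
# significant digits are lost.
num_naive <- 1 + exp(x) * (x - 1)

# Leading Taylor terms of the same numerator; no cancellation involved.
num_series <- x^2 / 2 + x^3 / 3 + x^4 / 8

print(c(naive = num_naive, series = num_series))
```

The series value agrees with the "true" figure quoted above, while the direct evaluation is off by many orders of magnitude.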
Generally, what you do near such a point is use a Taylor expansion around zero, and the normal function everywhere else:
betaFun = function(x){
if(-1E-1 < x && x < 1E-1){
return(0.5 + x/12. - x^3/720. + x^5/30240.)
}
return( ( 1+exp(x)*(x-1) )/( x*(exp(x)-1) ) )
}
The above expansion is accurate up to 13 digits for x in this small region.
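Since if() with && only handles a single value, the function above has to be called through sapply() for vectors. A vectorized variant of the same idea might look like the sketch below; betaFun_vec is a hypothetical name, and the 0.1 cutoff is simply the one used above.

```r
# Sketch: series near zero, original formula elsewhere, vectorized over x.
# betaFun_vec is a hypothetical name; the cutoff 0.1 follows the answer above.
betaFun_vec <- function(x) {
  small <- abs(x) < 1e-1
  out <- numeric(length(x))

  xs <- x[small]
  out[small] <- 0.5 + xs / 12 - xs^3 / 720 + xs^5 / 30240

  xl <- x[!small]
  out[!small] <- (1 + exp(xl) * (xl - 1)) / (xl * (exp(xl) - 1))

  out
}

x <- c(1e-4, 1e-6, 1e-8, 1e-10, 1e-12, 1e-13, 0)
print(betaFun_vec(x))
```

At the switch point x = 0.1 the two branches agree to within the truncation error of the series, so the function stays smooth enough for a root finder.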
I've been playing around with calculating the square root of 2 and the like. It's easy to come up with an algorithm that will produce n correct binary digits. What I'd like help with is determining how many binary digits I need to get m correct decimal digits. m binary digits will give me m decimal digits, but those m decimal digits may not all be correct yet.
EDIT:
I've determined that the lower bound on the binary precision = ceil(log2(10^m)).
Thinking about it, there might not be a strict upper bound, since a carry from any lower power of 2 (when converting to base 10) could potentially affect any higher base-10 digit.
This may thus be a dynamic problem that requires evaluating the fractional expansion at m binary digits and determining which additional binary digits could potentially cause a carry in base 10.
Edit 2: I was probably overthinking this. After the initial calculation I can keep adding (1x10^(-precision)) and squaring the result until I exceed 2 - and then subtract (1x10^(-precision)) and I'll have my answer. Nevertheless I am still interested in finding/developing such an algorithm :)
Let x be a real and y be its approximation.
Let RE be the relative error of y with respect to x:
RE(x, y) = abs(x - y) / abs(x)
Let b be an integer greater than 1 (the base). The Log-Relative Error in base b is defined as:
LREb(x, y) = -logb(RE(x, y))
where logb is the base-b logarithm:
logb(z) = log(z) / log(b)
for any positive z.
The LRE in base b represents the number of common digits between x and y. Here, the "number of correct digits" is not an integer, but a real number: this will simplify the next calculations avoiding the need for ceil and floor functions, provided that we accept statements such as : "y has 2.3 correct digits with respect to x". More precisely, if x and y have q common base b digits, then:
LREb(x, y) >= q - 1
With these equations, if the relative error has an upper bound, then the LREb has a lower bound. More precisely, if:
RE(x, y) <= epsilon
then:
LREb(x, y) >= -logb(epsilon)
Also, if the number of correct digits in base 10 is LRE10 = p, then RE = 10^-p, which implies that the number of correct digits in base 2 is:
LRE2 = -log2(10^-p) = p * log2(10) ≈ 3.32 * p
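The definitions above translate into a few lines of R. This is only a sketch under the definitions given in this answer; lre and bits_for_decimal_digits are hypothetical names.

```r
# Log-relative error of approximation y with respect to x, in base b.
lre <- function(x, y, b = 10) {
  -log(abs(x - y) / abs(x)) / log(b)
}

# Binary digits corresponding to p correct decimal digits: p * log2(10).
bits_for_decimal_digits <- function(p) {
  p * log2(10)
}

print(lre(sqrt(2), 1.4142))                   # correct decimal digits (a real number)
print(lre(sqrt(2), 1.4142, b = 2))            # the same error measured in bits
print(ceiling(bits_for_decimal_digits(10)))   # whole bits for 10 decimal digits
```

The base-2 value is just the base-10 value multiplied by log2(10), which is the conversion stated above.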
What method are you using?
I am assuming a binary search for x in y = x^2.
The integer part is limited by the result sqrt(y) and cannot be cut, otherwise the result would be wrong. However, x is limited to half the bits of y, so:
ni2 = log2(|y|)
The fractional part is tricky, see:
the relation between binary and decimal digits
After the nonlinear start for the first few digits, the dependence stabilizes. Here is the reversed formula from the linked answer:
nf2 = (((nf10-7.810)/9.6366363636363636363636)+1.0)<<5;
ni2 is the number of binary digits (bits) in the integer part
nf2 is the number of binary digits (bits) in the fractional part
nf10 is the number of decimal digits in the fractional part
By the way, I used 32-bit aligned values, as that is what I use for my arithmetic, so:
9.6366363636363636363636 = 32/0.30102999566398119521373889472449
0.30102999566398119521373889472449 = log10(2)
I'm reading Deep Learning by Goodfellow et al. and am trying to implement gradient descent as shown in Section 4.5 Example: Linear Least Squares. This is page 92 in the hard copy of the book.
The algorithm can be viewed in detail at https://www.deeplearningbook.org/contents/numerical.html with R implementation of linear least squares on page 94.
I've tried implementing in R, and the algorithm as implemented converges on a vector, but this vector does not seem to minimize the least squares function as required. Adding epsilon to the vector in question frequently produces a "minimum" less than the minimum outputted by my program.
options(digits = 15)
dim_square = 2 ### set dimension of square matrix
# Generate random vector, random matrix, and
set.seed(1234)
A = matrix(nrow = dim_square, ncol = dim_square, byrow = T, rlnorm(dim_square ^ 2)/10)
b = rep(rnorm(1), dim_square)
# having fixed A & B, select X randomly
x = rnorm(dim_square) # vector length of dim_square--supposed to be arbitrary
f = function(x, A, b){
total_vector = A %*% x + b # this is the function that we want to minimize
total = 0.5 * sum(abs(total_vector) ^ 2) # L2 norm squared
return(total)
}
f(x,A,b)
# how close do we want to get?
epsilon = 0.1
delta = 0.01
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
steps = vector()
while(L2_norm > delta){
x = x - epsilon * value
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
print(L2_norm)
}
minimum = f(x, A, b)
minimum
minimum_minus = f(x - 0.5*epsilon, A, b)
minimum_minus # less than the minimum found by gradient descent! Why?
(See page 94 of the PDF at https://www.deeplearningbook.org/contents/numerical.html.)
I am trying to find the vector x that minimizes f(x). However, as minimum and minimum_minus in my code show, minimum is not the actual minimum, since it exceeds minimum_minus.
Any idea what the problem might be?
Original Problem
Finding the value of x such that the norm of Ax - b is minimized is equivalent (when A is invertible) to finding the value of x such that Ax - b = 0, or x = (A^-1)*b. This is because the L2 norm is the Euclidean norm, more commonly known as the distance formula. By definition, distance cannot be negative, making its minimum identically zero.
This algorithm, as implemented, actually comes quite close to estimating x. However, because of recursive subtraction and rounding one quickly runs into the problem of underflow, resulting in massive oscillation, below:
[Figure: value of the L2 norm as a function of step size]
[Figure: above algorithm vs. the solve function in R]
Above we have the results of A %*% x followed by A %*% min_x, with x estimated by the implemented algorithm and min_x estimated by the solve function in R.
The problem of underflow, well known to those familiar with numerical analysis, is probably best tackled by the programmers of lower-level libraries best equipped to tackle it.
To summarize, the algorithm appears to work as implemented. Important to note, however, is that not every function will have a minimum (think of a straight line), and also be aware that this algorithm should only be able to find a local, as opposed to a global minimum.
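When comparing against solve, it is worth making sure the objective and the gradient use the same sign for b: the question's f uses A %*% x + b, while its update uses - t(A) %*% b. The sketch below assumes the book's objective 0.5 * ||Ax - b||^2 and a small fixed A and b (an assumption made here for reproducibility; the question draws them at random).

```r
# Fixed, well-conditioned example so the run is reproducible (an assumption;
# the question generates A and b randomly).
A <- matrix(c(2, 0, 0, 1), nrow = 2)
b <- c(2, 3)

f    <- function(x) 0.5 * sum((A %*% x - b)^2)   # objective
grad <- function(x) t(A) %*% (A %*% x - b)       # its gradient, same sign for b

epsilon <- 0.1    # step size
delta   <- 1e-8   # tolerance on the gradient norm
x <- c(0, 0)
g <- grad(x)
iter <- 0
while (sqrt(sum(g^2)) > delta && iter < 10000) {
  x <- x - epsilon * g
  g <- grad(x)
  iter <- iter + 1
}

x_gd    <- as.vector(x)
x_exact <- solve(A, b)   # direct solution for comparison
print(rbind(gradient_descent = x_gd, solve = x_exact))
```

With a consistent objective and gradient, the iterate and the solve result agree to within the chosen tolerance on this example.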
So, I was just playing around with manually calculating the value of e in R and I noticed something that was a bit disturbing to me.
The value of e using R's exp() command...
exp(1)
#[1] 2.718282
Now, I'll try to manually calculate it using x = 10000
x <- 10000
y <- (1 + (1 / x)) ^ x
y
#[1] 2.718146
Not quite but we'll try to get closer using x = 100000
x <- 100000
y <- (1 + (1 / x)) ^ x
y
#[1] 2.718268
Warmer but still a bit off...
x <- 1000000
y <- (1 + (1 / x)) ^ x
y
#[1] 2.71828
Now, let's try it with a huge one
x <- 5000000000000000
y <- (1 + (1 / x)) ^ x
y
#[1] 3.035035
Well, that's not right. What's going on here? Am I overflowing the data type and need to use a certain package instead? If so, are there no warnings when you overflow a data type?
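You've got a problem with machine precision. As soon as (1 / x) drops below about 1.11e-16 (half of the machine epsilon, 2.22e-16), 1 + (1 / x) is just 1. The mathematical limit breaks down in finite-precision numerical computations. Your final x in the question is already 5e+15, very close to this brink: 1 / x = 2e-16 is rounded to 2.22e-16 when added to 1, which is why y comes out as exp(5e15 * 2.22e-16), about 3.035, instead of e. Try x <- x * 10, and your y would be 1.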
This is neither "overflow" nor "underflow" as there is no difficulty in representing a number as small as 1e-308. It is the problem of the loss of significant digits during floating-point arithmetic. When you do 1 + (1 / x), the bigger x is, the fewer significant digits in the (1 / x) part can be preserved when you add it to 1, and eventually you lose that (1 / x) term altogether.
## valid 16 significant digits
1 + 1.23e-01 = 1.123000000000000|
1 + 1.23e-02 = 1.012300000000000|
... ...
1 + 1.23e-15 = 1.000000000000001|
1 + 1.23e-16 = 1.000000000000000|
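The effect can be checked directly in R, together with one common way around it. This is a sketch, not part of the original answer: it uses base R's .Machine$double.eps and log1p(), and rewrites (1 + 1/x)^x as exp(x * log1p(1/x)) so that 1 + 1/x is never formed explicitly.

```r
eps <- .Machine$double.eps   # 2^-52, about 2.22e-16
print((1 + eps / 4) == 1)    # the small term is lost entirely

x <- 5e15
naive  <- (1 + 1 / x)^x           # 1 + 1/x is rounded before the power is taken
stable <- exp(x * log1p(1 / x))   # log1p(u) evaluates log(1 + u) accurately for tiny u

print(c(naive = naive, stable = stable, exp1 = exp(1)), digits = 15)
```

The rewritten expression stays close to exp(1) even for the huge x that breaks the naive formula.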
Any numerical analysis book would tell you the following.
Avoid adding a large number and a small number. In floating-point addition a + b = a * (1 + b / a), so if b / a is below machine precision (around 1e-16), we get a + b = a. This implies that when adding up a number of positive numbers, it is more stable to accumulate them from the smallest to the largest.
Avoid subtracting one number from another of the same magnitude, or you may get cancellation error. The web page has a classic example of using the quadratic formula.
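The first point can be illustrated with a short R sketch (the variable names are made up for the illustration). Explicit loops are used on purpose, since the built-in sum() may accumulate in higher precision on some platforms and hide the effect.

```r
small <- rep(1e-17, 1000)   # each term is far below machine precision relative to 1

# Large number first: every addition of 1e-17 to 1 is rounded away.
acc_large_first <- 1
for (s in small) acc_large_first <- acc_large_first + s

# Small numbers first: they add up to about 1e-14, which survives next to 1.
acc_small_first <- 0
for (s in small) acc_small_first <- acc_small_first + s
acc_small_first <- acc_small_first + 1

print(c(large_first = acc_large_first, small_first = acc_small_first), digits = 17)
```

Accumulating from smallest to largest keeps the contribution of the small terms, while the other order drops it completely.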
You are also advised to have a read on Approximation to constant "pi" does not get any better after 50 iterations, a question asked a few days after your question. Using a series to approximate an irrational number is numerically stable as you won't get the absurd behavior seen in your question. But the finite number of valid significant digits imposes a different problem: numerical convergence, that is, you can only approximate the target value up to a certain number of significant digits. MichaelChirico's answer using Taylor series would converge after 19 terms, since 1 / factorial(19) is already numerically 0 when added to 1.
Multiplication / division between floating-point numbers doesn't cause problems with significant digits; it may cause "overflow" or "underflow". However, given the wide range of representable floating-point values (roughly 1e-308 to 1e+308), "overflow" and "underflow" should be rare. The real difficulty is with addition / subtraction where significant digits can be easily lost. See Can I stably invert a Vandermonde matrix with many small values in R? for an example on matrix computations. It is not impossible to get higher precision, but the work is probably more involved. For example, OP of the matrix example eventually used the GMP (GNU Multiple Precision Arithmetic Library) and associated R packages to proceed: How to put Rmpfr values into a function in R?
You might also try the Taylor series approximation to exp(1), namely
e^x = \sum_{k = 0}^{\infty} x^k / k!
Thus we can approximate e = e^1 by truncating this sum; in R:
sprintf('%.20f', exp(1))
# [1] "2.71828182845904509080"
sprintf('%.20f', sum(1/factorial(0:10)))
# [1] "2.71828180114638451315"
sprintf('%.20f', sum(1/factorial(0:100)))
# [1] "2.71828182845904509080"
I have a problem with the following function in R:
test <- function(alpha, beta, n){
result <- exp(lgamma(alpha) + lgamma(n + beta) - lgamma(alpha + beta + n) - (lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)))
return(result)
}
Now if you insert the following values:
test(-0.03292708, -0.3336882, 10)
It should fail and result in NaN. That is because if we implement the same function in Excel, we get a result that is not a number. The implementation in Excel is simple: J32 is the cell for alpha, K32 for beta and L32 for n. The formula for the result cell is given below:
=EXP(GAMMALN(J32)+GAMMALN(L32+K32)-GAMMALN(J32+K32+L32)-(GAMMALN(J32)+GAMMALN(K32)-GAMMALN(J32+K32)))
This seems to be the correct answer, because the function is only defined for alpha and beta greater than zero and n greater than or equal to zero. So what is happening here? I have also tried the Rmpfr package to increase the numerical accuracy, but that does not seem to change anything.
Thanks
tl;dr log(gamma(x)) is defined more generally than you think, or than Excel thinks. If you want your function not to accept negative values of alpha and beta, or to return NaN, just test manually and return the appropriate values (if (alpha<0 || beta<0) return(NaN)).
It's not a numerical accuracy problem, it's a definition issue. The Gamma function is defined for negative real values: ?lgamma says:
The gamma function is defined by (Abramowitz and Stegun section 6.1.1, page 255)
Gamma(x) = integral_0^Inf t^(x-1) exp(-t) dt
for all real ‘x’ except zero and negative integers (when ‘NaN’ is returned).
Furthermore, referring to lgamma ...
... and the natural logarithm of the absolute value of the gamma function ...
(emphasis in original)
curve(lgamma(x),-1,1)
gamma(-0.1) ## -10.68629
log(gamma(-0.1)+0i) ## 2.368961+3.141593i
log(abs(gamma(-0.1))) ## 2.368961
lgamma(-0.1) ## 2.368961
Wolfram Alpha agrees with the second calculation.
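The manual check mentioned in the tl;dr could be sketched as follows. beta_ratio_checked is a hypothetical name, and the exact domain test (alpha > 0, beta > 0, n >= 0) is taken from the question's description rather than from any library, so adjust it if your definition differs.

```r
# Sketch: the question's function with an explicit domain check.
# beta_ratio_checked is a hypothetical name used only for this illustration.
beta_ratio_checked <- function(alpha, beta, n) {
  # Outside the stated domain, return NaN instead of relying on lgamma()
  if (alpha <= 0 || beta <= 0 || n < 0) {
    return(NaN)
  }
  exp(lgamma(alpha) + lgamma(n + beta) - lgamma(alpha + beta + n) -
        (lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)))
}

print(beta_ratio_checked(-0.03292708, -0.3336882, 10))   # outside the domain
print(beta_ratio_checked(1, 1, 1))                       # inside the domain
```

This matches the Excel behaviour for the values in the question while leaving valid inputs unchanged.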
I have a problem with what I guess is a floating-point rounding error in OpenEdge ABL / Progress 4GL.
display truncate(log(4) / log(2) , 0) .
This returns 1.0 but should give me 2.0.
If I use this workaround, it gives me the right answer in most cases, which hints at a floating-point issue.
display truncate(log(4) / log(2) + 0.00000001, 0) .
What I am after is this
Find the largest x where
p^x < n, where p is prime and n and x are natural numbers.
=>
x = log(n) / log(p)
Any takes on this one?
No numerical arithmetic system is exact. The natural logarithms of 4 and 2 cannot be represented exactly. Since the log function can only return a representable value, it returns an approximation of the exact mathematical result.
Sometimes this approximation will be slightly higher than the mathematical result. Sometimes it will be slightly lower. Therefore, you cannot generally expect that log(x*x) will be exactly twice log(x).
Ideally, a high-quality log implementation would return the representable value that is closest to the exact mathematical value. (This is called a “correctly rounded” result.) In that case, and if you are using binary floating-point (which is common), then log(4) would always be exactly twice log(2). Since this does not happen for you, it seems the log implementation you are using does not provide correctly rounded results.
However, for this problem, you also need log(8) to be exactly three times log(2), and so on for additional powers. Even if the log implementation did return correctly rounded results, this would not necessarily be true for all the values you need. For some y = x^5, log(y) might not be exactly five times log(x), because rounding log(y) to the closest representable value might round down while rounding log(x) rounds up, just because of where the exact values happen to lie relative to the nearest representable values.
Therefore, you cannot rely on even a best-possible log implementation to tell you exactly how many powers of x divide some number y. You can get close, and then you can test the result by confirming or denying it with integer arithmetic. There are likely other approaches depending upon the needs specific to your situation.
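As an illustration of settling the question with whole-number arithmetic only, here is a minimal sketch in R (not ABL). largest_exponent_below is a hypothetical name, and the loop relies on doubles representing integers exactly, which holds up to 2^53.

```r
# Sketch: largest x with p^x < n, using repeated multiplication instead of log().
# largest_exponent_below is a hypothetical name. A result of 0 means that
# only p^0 = 1 is below n, i.e. there is no positive solution.
largest_exponent_below <- function(p, n) {
  if (n <= 1) return(NA_integer_)   # not even p^0 = 1 is below n
  x <- 0L
  power <- 1                        # holds p^x, starting with p^0
  while (power * p < n) {
    power <- power * p
    x <- x + 1L
  }
  x
}

print(largest_exponent_below(2, 4))   # 2^1 < 4, but 2^2 is not
print(largest_exponent_below(3, 28))  # 3^3 = 27 < 28
```

Because no logarithm is involved, exact powers such as n = p^k are handled without any rounding concerns.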
I think you want:
/* find the largest x where p^x < n, p is prime, n and x is natural numbers.
*/
define variable p as integer no-undo format ">,>>>,>>>,>>9".
define variable x as integer no-undo format ">>9".
define variable n as integer no-undo format ">,>>>,>>>,>>9".
define variable i as integer no-undo format "->>9".
define variable z as decimal no-undo format ">>9.9999999999".
update p n with side-labels.
/* approximate x
*/
z = log( n ) / log( p ).
display z.
x = integer( truncate( z, 0 )). /* estimate x */
/* is p^x < n ?
*/
if exp( p, x ) >= n then
do while exp( p, x ) >= n: /* was the estimate too high? */
assign
i = i - 1
x = x - 1
.
end.
else
do while exp( p, x + 1 ) < n: /* was the estimate too low? */
assign
i = i + 1
x = x + 1
.
end.
display
x skip
exp( p, x ) label "p^x" format ">,>>>,>>>,>>9" skip
i skip
log( n ) skip
log( p ) skip
z skip
with
side-labels
.
The root of the problem is that the log function, susceptible to floating point truncation error, is being used to address a question in the realm of natural numbers. First, I should point out that actually, in the example given, 1 really is the correct answer. We are looking for the largest x such that p^x < n; not p^x <= n. 2^1 < 4, but 2^2 is not. That said, we still have a problem, because when p^x = n for some x, log(n) divided by log(p) could probably just as well land slightly above the whole number rather than below, unless there is some systemic bias in the implementation of the log function. So in this case where there is some x for which p^x=n, we actually want to be sure to round down to the next lower whole value for x.
So even a solution like this will not correct this problem:
display truncate(round(log(4) / log(2), 10) , 0) .
I see two ways to deal with this. One is similar to what you already tried, except that because we actually want to round down to the next lower natural number, we would subtract rather than add:
display truncate(log(4) / log(2) - 0.00000001, 0) .
This will work as long as n is less than 10^16, but a tidier solution would be to settle the boundary conditions with actual integer math. Of course, this will fail too if you get to numbers that are higher than the maximum integer value. But if this is not a concern, you can just use your first solution to get the approximate answer:
display truncate(log(4) / log(2) , 0) .
And then test whether the result works in the equation p^x < n. If it isn't less than n, subtract one and try again.
On a side note, by the way, the definition of natural numbers does not include zero, so if the lowest possible value for x is 1, then the lowest possible value for p^x is p, so if n is less than or equal to p, there is no natural number solution.
Most calculators cannot calculate sqrt{2}*sqrt{2} exactly either. The problem is that we usually do not have that many decimals.
Workaround: avoid TRUNCATE and use ROUND instead, like
ROUND(log(4) / log(2), 0).
ROUND(a, b) rounds the decimal a to the closest number having b decimals.