Computing harmonic series for very large N (arbitrary precision problems) - julia

This is a followup question to a previous one I made.
I'm trying to compute partial sums of the harmonic series for very large n; however, when comparing with log(n)+γ, I'm not getting the expected error.
I suspect the main problem is with the BigFloat julia type.
harmonic_bf = function(n::Int64)
x=BigFloat(0)
for i in n:-1:1
x += BigFloat(1/i)
end
x
end
For example, it is well known that H_n - log(n) - γ is bounded below by 1/(2(n+1)).
However, this holds for n=10^7 and then fails for n=10^8.
n=10^8
γ = big"0.57721566490153286060651209008240243104215933593992"
lower_bound(n) = 1/2/(n+1)
julia> harmonic_bf(n)-log(n)-γ > lower_bound(BigFloat(n))
false
It's driving me crazy, I can't seem to understand what is missing... BigFloat supposedly should get arithmetic precision problems out of the way, but that does not seem to be the case.
Note: I tried with BigFloat with unset precision and with 256 bits of precision.

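You have to make sure that you use BigFloat everywhere. First in your function (notice that BigFloat(1/i) is not the same as 1/BigFloat(i): the former divides in Float64 and only then converts, so the Float64 rounding error is already part of the value):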
function harmonic_bf(n::Int64)
x=BigFloat(0)
for i in n:-1:1
x += 1/BigFloat(i)
end
x
end
and then in the test (notice BigFloat under log):
julia> harmonic_bf(n)-log(BigFloat(n))-γ > lower_bound(BigFloat(n))
true
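The difference between the two conversions is not specific to Julia. Here is a minimal Python sketch, assuming the standard-library decimal module as a stand-in for BigFloat (the function names are illustrative and not from the answer above). Dividing in 64-bit floats and widening afterwards keeps the rounding error of every term, whereas widening first does not.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # 60 significant digits, standing in for BigFloat

def harmonic_rounded_first(n):
    """Divide in 64-bit floats, then widen: every term keeps its float rounding error."""
    total = Decimal(0)
    for i in range(n, 0, -1):
        total += Decimal(1 / i)  # 1 / i is already rounded to a 64-bit float here
    return total

def harmonic_wide(n):
    """Widen first, then divide: every term is accurate to the full 60 digits."""
    total = Decimal(0)
    for i in range(n, 0, -1):
        total += Decimal(1) / Decimal(i)
    return total

n = 1000
diff = abs(harmonic_rounded_first(n) - harmonic_wide(n))
print(diff)  # small but nonzero: the float rounding errors survived the widening
```

The gap is tiny, but the inequality being tested has very little slack: H_n - log(n) - γ exceeds 1/(2(n+1)) by only about 0.4/n^2, so errors at the Float64 level (from the terms or from a Float64 log) can be enough to flip the comparison for large n.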

Related

Mean function incorrect value

I have an 80 element array with the same entries: 176.01977965813853
If I use the mean function I will get the value 176.01977965813842
Why is that?
Here is a minimal working example:
using Statistics
arr = fill(176.01977965813853, 80)
julia> mean(arr)
176.01977965813842
I expected this to return 176.01977965813853.
These are just expected floating point errors. But if you need very precise summations, you can use a somewhat more elaborate (and costly) summation scheme:
julia> using KahanSummation
[ Info: Precompiling KahanSummation [8e2b3108-d4c1-50be-a7a2-16352aec75c3]
julia> sum_kbn(fill(176.01977965813853, 80))/80
176.01977965813853
Ref: Wikipedia
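For readers who want to see what a compensated scheme does, here is a rough Python sketch of Kahan-Babuska-Neumaier summation. It is an illustration of the general technique, not the source of the KahanSummation package, and the function names are made up for this example. An explicit loop is used for the naive sum so that the comparison does not depend on how a built-in sum is implemented.

```python
import math

def naive_sum(values):
    """Plain left-to-right accumulation; rounding errors pile up."""
    total = 0.0
    for v in values:
        total += v
    return total

def kbn_sum(values):
    """Kahan-Babuska-Neumaier summation: track the rounding error of each addition."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            comp += (total - t) + v   # low-order bits of v were lost
        else:
            comp += (v - t) + total   # low-order bits of total were lost
        total = t
    return total + comp

x = 176.01977965813853
arr = [x] * 80
naive_mean = naive_sum(arr) / len(arr)
kbn_mean = kbn_sum(arr) / len(arr)
print(naive_mean, kbn_mean)
```

The compensated version costs a few extra floating point operations per element, which is the "costly" part mentioned above.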
The problem as I understand it can be reproduced as follows:
using Statistics
arr = fill(176.01977965813853, 80)
julia> mean(arr)
176.01977965813842
The reason for this is that julia does all floating point arithmetic with 64 bits of precision by default (i.e. the Float64 type). Float64s cannot represent every real number. There is a finite step between each floating point number and the next, and rounding errors are incurred when you do arithmetic on them. These rounding errors are usually fine, but if you're not careful, they can be catastrophic. For instance:
julia> 1e100 + 1.0 - 1e100
0.0
That says that if I do 10^100 + 1 - 10^100 I get zero! If you want to get an upper bound on the errors caused by floating point arithmetic, you can use IntervalArithmetic.jl:
using IntervalArithmetic
julia> 1e100 + interval(1.0) - 1e100
[0, 1.94267e+84]
That says that the operation 1e100 + 1.0 - 1e100 is at least equal to 0.0 and at most 1.94*10^84, so the error bounds are huge!
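To make the idea of a guaranteed enclosure concrete, here is a crude Python sketch. It is not how IntervalArithmetic.jl works internally (that package uses directed rounding); this version simply pushes each endpoint outward by one float after every operation, which is looser but still safe. It assumes Python 3.9 or later for math.nextafter, and the helper names are hypothetical.

```python
import math
from fractions import Fraction

def widen(lo, hi):
    """Move both endpoints outward by one float so rounding cannot exclude the true value."""
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

def iv_add(a, b):
    """Interval addition on (lo, hi) tuples."""
    return widen(a[0] + b[0], a[1] + b[1])

def iv_sub(a, b):
    """Interval subtraction: smallest result is a.lo - b.hi, largest is a.hi - b.lo."""
    return widen(a[0] - b[1], a[1] - b[0])

big = (1e100, 1e100)
one = (1.0, 1.0)
lo, hi = iv_sub(iv_add(big, one), big)

# Exact rational arithmetic on the same float inputs, for comparison.
exact = Fraction(1e100) + 1 - Fraction(1e100)

print(1e100 + 1.0 - 1e100)  # plain floats lose the 1.0 entirely
print(lo, hi)               # a wide interval that still contains the exact answer
print(exact)
```

The enclosure here is wider than the one printed by IntervalArithmetic.jl, but both tell the same story: the width is on the order of the spacing between floats near 1e100.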
We can do the same for the operation you were interested in,
arr = fill(interval(176.01977965813853), 80);
julia> mean(arr)
[176.019, 176.02]
julia> mean(arr).lo
176.019779658138
julia> mean(arr).hi
176.0197796581391
which says that the actual mean could be at least 176.019779658138 or at most 176.0197796581391, but one can't be any more certain due to floating point error! So here, Float64 gave the answer with at most 10^-13 percent error, which is actually quite small.
What if those are unacceptable error bounds? Use more precision! You can use the big string macro to get arbitrary precision number literals:
arr = fill(interval(big"176.01977965813853"), 80);
julia> mean(arr).lo
176.0197796581385299999999999999999999999999999999999999999999999999999999999546
julia> mean(arr).hi
176.019779658138530000000000000000000000000000000000000000000000000000000000043
That calculation was done using 256 bits of precision, but you can get even more precision using the setprecision function:
setprecision(1000)
arr = fill(interval(big"176.01977965813853"), 80);
julia> mean(arr).lo
176.019779658138529999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999599
julia> mean(arr).hi
176.019779658138530000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000579
Note that arbitrary precision arithmetic is sloooow compared to Float64s, so it's usually best to just use arbitrary precision arithmetic to validate your results to make sure you're converging to a good result within your desired accuracy.
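The same "validate with something slower but exact" workflow can be sketched in Python with the standard-library fractions module, which does exact rational arithmetic on the float inputs. This is only an illustration of the validation idea, not a replacement for the interval approach above.

```python
from fractions import Fraction

x = 176.01977965813853
arr = [x] * 80

# Fast path: ordinary 64-bit float accumulation.
float_total = 0.0
for v in arr:
    float_total += v
float_mean = float_total / len(arr)

# Slow path: exact rational arithmetic on the very same inputs.
exact_mean = sum(Fraction(v) for v in arr) / len(arr)

# Exact size of the error made by the fast path.
error = abs(Fraction(float_mean) - exact_mean)
print(float(error))
```

Since every element is identical, the exact mean is exactly the stored value of x, so any nonzero error printed here is due to float accumulation alone.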

How to increase precision of solution of nlm-solver

Given is a function F1:
F1 <- function(C1,C2,C3,...,x,u_target) {
# a lot of equations follow
...
u_actual - u_target
}
F1 returns the result of the very last equation
u_actual - u_target
I want to determine the value for the parameter x in a way that the result of the last equation converges to zero. With
nlm(f=F1,p=c(0),C1=C1,C2=C2,...,stepmax=0.001,ndigit=8)
I get a result, but not a satisfying one:
u_actual = 0.1316566
u_target = 0.1
I played a lot with the arguments of the nlm command (gradtol, stepmax, iterlim etc.), but I was not able to get a better result. I also tried optim, optimize and uniroot, but was not able to get them to run at all.
u and x show a negative exponential relationship: with decreasing x, u increases exponentially. If x is zero, u takes a finite value. x also has an upper boundary, which is unknown. So I guessed it would be promising if the iteration starts at the lower boundary (zero) and increases step by step. However, whether I decrease or increase the value of stepmax, the result does not get better.
I would appreciate any hint from the r-community.
Thank you very much.
PS: in matlab a colleague uses fsolve(@(x) F1(x,u_target,C1,C2,...),0), and it works fine.

exp function in Julia evaluating to 0

I want to calculate and plot the probability density of a wave function in Julia. I wrote a small snippet of Julia code for evaluating the following function:
The Julia (incomplete) code is:
set_bigfloat_precision(100)
A = 10
C = 5
m = BigFloat(9.10938356e-31)
ℏ = BigFloat(1.054571800e-34)
t = exp(-(sqrt(C * m) / ℏ))
The last line where I evaluate t gives 0.000000000000.... I tried to set the precision of the BigFloat as well. No luck! What am I doing wrong? Help appreciated.
In the comments, Chris Rackauckas has pointed out that you entered the formula wrong, but I figured it was interesting enough to answer the question anyway.
Let's break it down so we can see what we are raising:
A = 10
C = 5
m = BigFloat(9.10938356e-31)
h = BigFloat(1.054571800e-34)
z = -sqrt(C * m)/h
t = exp(z)
So
z = -2.0237336022083455711032042949257e+19
so very roughly z = -2e19,
and so roughly t = exp(-2e19), i.e. t = 1/(e^(2*10^19)).
That is a very small number.
Consider that
exp(big"-1e+10") = 9.278...e-4342944820
and
exp(big"-1e+18") = 2.233...e-434294481903251828
and yes, julia says:
exp(big"-2e+19") = 0.0000
exp(big"-2e+19") is a very small number.
That puts us in context I hope. Very small number.
So julia depends on MPFR for BigFloats
You can try MPFR online. At precision 8192, exp(-2e19)=0
So same result.
Now, it is not the precision that we care about,
but rather the range of the exponent.
MPFR uses something like IEEE-style floats, where the precision is the length of the mantissa, and then you have an exponent: 2^exponent * mantissa
So there is a limit on the range of the exponent.
See: MPFR docs:
Function: mpfr_exp_t mpfr_get_emin (void)
Function: mpfr_exp_t mpfr_get_emax (void)
Return the (current) smallest and largest exponents allowed for a floating-point variable. The smallest positive value of a floating-point variable is one half times 2 raised to the smallest exponent and the largest value has the form (1 - epsilon) times 2 raised to the largest exponent, where epsilon depends on the precision of the considered variable.
Now julia does set these to the maximum range that a fairly default MPFR build will allow. I've been digging around the MPFR source trying to find where this is set, but can't find it. I believe it is related to the maximum value an Int64 can hold.
Base.MPFR.get_emin() = -4611686018427387903 =typemin(Int64)>>1 + 1
You can adjust this but only up.
So anyway
0.5*big"2.0"^(Base.MPFR.get_emin()) = 8.5096913117408361391297879096205e-1388255822130839284
but
0.5*big"2.0"^(Base.MPFR.get_emin()-1) = 0.00000000000...
Now we know that
exp(x) = 2^(log(2,e)*x)
So we can write exp(z) = 2^(log(2,e)*z)
log(2,e)*z = -29196304319863382016
Base.MPFR.get_emin() = -4611686018427387903
Since the exponent (roughly -2.9e19) is less than the minimum allowed exponent (roughly -4.6e18), an underflow occurs.
That is why you get zero.
It may (or may not) be possible to recompile MPFR with Int128 exponents, but julia hasn't done so.
Perhaps julia should throw an underflow exception.
Feel encouraged to report that as an issue on the Julia bug tracker.
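A common way around exponent-range underflow is to carry the logarithm of the quantity instead of the quantity itself. The following Python sketch illustrates that with the constants from the question; it uses ordinary 64-bit floats, so it underflows far earlier than MPFR would, and the variable names are made up for this example. The emin value is the one quoted above, not something queried from MPFR here.

```python
import math

C = 5
m = 9.10938356e-31       # electron mass, as entered in the question
hbar = 1.054571800e-34   # reduced Planck constant, as entered in the question

z = -math.sqrt(C * m) / hbar   # the argument of exp, roughly -2e19

t_direct = math.exp(z)         # underflows to 0.0 in 64-bit floats

# Work with logarithms of t instead: log10(t) = z / ln(10), log2(t) = z / ln(2).
log10_t = z / math.log(10)
log2_t = z / math.log(2)

MPFR_EMIN = -4611686018427387903  # value quoted in the answer above

print(t_direct)            # 0.0
print(log10_t)             # decimal exponent of t, a perfectly ordinary float
print(log2_t < MPFR_EMIN)  # True: t is below even MPFR's exponent range
```

Whether a log-space representation is usable depends on what is done with t afterwards; products and ratios are easy in log space, sums are not.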

Why (e^x-1)/x does not work properly, but (e^x-1)/log(e^x) does?

I would like to ask why computing the value of (e^x-1)/x for numbers very close to zero does not work properly (for example, if x=10^-15, the result is 1.1102230), but when I use the formula (e^x-1)/log(e^x), which is mathematically equivalent, it gives me the correct result of 1.000000. Thanks.
The problem is that the first function exhibits what is known as catastrophic cancellation: for x near 0, e^x is very close to 1 + x. As floating point numbers are less dense near 1 than near 0, the result of the expression e^x − 1 will be very close to x, but it loses accuracy due to intermediate rounding.
The second exploits a neat trick of "cancelling out" the rounding error. In fact, this particular example is covered in detail in section 1.14.1 of Nicholas J. Higham's excellent book Accuracy and Stability of Numerical Algorithms. The crux of his explanation is
The expression (e^x − 1) / x cannot be accurately evaluated for a given x ≈ 0 in floating point arithmetic, while the expression (y − 1) / log y can be accurately evaluated for a given y ≈ 1. Since these functions are slowly varying near x = 0 (y = 1), evaluating (y − 1) / log y with an accurate, if inexact, approximation to y = e^x ≈ 1 produces an accurate result.
Since this is an R question, how about computing exp(x)-1 by calling expm1(x)? expm1() is an R function designed to return accurate values of exp(x)-1 even for values of x close to 0. expm1(x)/x gives you the right answer.
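The three variants discussed above can be compared side by side. This is a small Python sketch (Python's math module has an expm1 with the same purpose as R's); the function names are invented for the example, and the guard for y == 1 is added so that the log-based version does not divide by zero when exp(x) rounds to exactly 1.

```python
import math

def f_naive(x):
    """(e^x - 1) / x evaluated directly; suffers cancellation near 0."""
    return (math.exp(x) - 1.0) / x

def f_log_trick(x):
    """(y - 1) / log(y) with y = e^x; the rounding error in y largely cancels."""
    y = math.exp(x)
    if y == 1.0:
        return 1.0  # limit value; avoids 0/0 when e^x rounds to exactly 1
    return (y - 1.0) / math.log(y)

def f_expm1(x):
    """Uses the library routine designed for e^x - 1 near 0."""
    return math.expm1(x) / x

x = 1e-15
print(f_naive(x), f_log_trick(x), f_expm1(x))
```

For x = 1e-15 the first value is visibly off from 1, while the other two agree with 1 to many digits.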

differentiation in matlab

I need to find the acceleration of an object. The formula given in the text is a = d^2(L)/d(T)^2, where L = length and T = time.
I calculated this in matlab using this equation
a = (1/(T3-T1))*(((L3-L2)/(T3-T2))-((L2-L1)/(T2-T1)))
or
a = (v2-v1)/(T2-T1)
but I'm not getting the right answers. Can anybody tell me how to find a by any other method in matlab?
This has nothing to do with matlab, you are just trying to numerically differentiate a function twice. Depending on the behaviour of the higher (3rd, 4th) derivatives of the function this will or will not yield reasonable results. You will also have to expect an error of order |T3 - T1|^2 with a formula like the one you are using, assuming L is four times differentiable. Instead of using intervals of different size you may try to use symmetric approximations like
v(x) = (L(x+h) - L(x-h)) / (2h)
a(x) = (L(x-h) - 2 L(x) + L(x+h)) / h^2
From what I recall from my numerical math lectures this is better suited for numerical calculation of higher order derivatives. You will still get an error of order
C |h|^2, with C = O( ||d^4 L / dt^4 || )
with ||.|| denoting the supremum norm of a function (that is, the fourth derivative of L needs to be bounded). In case that's true you can use that formula to calculate how small h has to be chosen in order to produce a result you are willing to accept. Note, though, that this is just the theoretical error which is a consequence of an analysis of the Taylor approximation of L, see [1] or [2] -- this is where I got it from a moment ago -- or any other introductory book on numerical mathematics. You may get additional errors depending on the quality of the evaluation of L; also, if |L(x-h) - L(x)| is very small, numerical subtraction may be ill-conditioned.
[1] Knabner, Angermann; Numerik partieller Differentialgleichungen; Springer
[2] http://math.fullerton.edu/mathews/n2003/numericaldiffmod.html
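The symmetric formulas and their second-order error can be checked numerically. Here is a short Python sketch (Python rather than matlab, and with sin standing in for L as an assumed test function whose derivatives are known exactly); the function names are illustrative only.

```python
import math

def first_derivative(L, t, h):
    """Central difference for the velocity: (L(t+h) - L(t-h)) / (2h)."""
    return (L(t + h) - L(t - h)) / (2.0 * h)

def second_derivative(L, t, h):
    """Central difference for the acceleration: (L(t-h) - 2 L(t) + L(t+h)) / h^2."""
    return (L(t - h) - 2.0 * L(t) + L(t + h)) / (h * h)

t0 = 1.0
for h in (1e-2, 5e-3):
    v = first_derivative(math.sin, t0, h)
    a = second_derivative(math.sin, t0, h)
    # Exact values for L = sin: v = cos(t0), a = -sin(t0).
    print(h, abs(v - math.cos(t0)), abs(a + math.sin(t0)))
```

Halving h should cut both errors by roughly a factor of four, as expected from an error term proportional to h^2. Making h much smaller eventually makes things worse, because the subtraction in the numerator loses digits, as noted above.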
