Factorial(x) for x > 170 using the Rmpfr/gmp libraries in R

The problem that I would like to solve is an infinite sum over the following function:
For the sum I use an FTOL termination criterion. The term doesn't create any problems until z becomes very large; I expect the maximum value of z to be around 220. As you can see, the first term peaks around Factorial(221), so the sum has to run up to roughly Factorial(500) before the termination criterion is reached. After spotting this problem I didn't want to change the whole code (this is only one small part of it), so I tried library('Rmpfr') and library('gmp'). The problem is that I don't get what I expect: while multiplication normally works, subtraction fails for higher values:
This works
> factorialZ(22)-factorial(22)
Big Integer ('bigz') :
[1] 0
but this fails:
> factorialZ(50)-factorial(50)
Big Integer ('bigz') :
[1] 359073645150499628823711419759505502520867983196160
another way I tried:
> gamma(as(10,"mpfr"))-factorial(9)
1 'mpfr' number of precision 128 bits
[1] 0
> gamma(as(40,"mpfr"))-factorial(39)
1 'mpfr' number of precision 128 bits
[1] 1770811808798664813196481658880
There has to be something that I don't really understand. Does someone have an even better solution to the problem, or can someone help me out with the issue above?

I think you have misunderstood the order of evaluation in factorialZ(x) - factorial(x). The second term, factorial(x), is calculated in ordinary double precision before it is converted to a bigz to be combined with the first term.
You must create any integer outside the range that a double can represent exactly (2^53, roughly 9e15) using a bigz-compatible function.
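For illustration, here is a minimal sketch (mine, not part of the original answer) showing the subtraction done entirely with big integers:
# Keeping both operands as exact big integers makes the subtraction exact:
factorialZ(50) - factorialZ(50)    # Big Integer ('bigz') : [1] 0
By contrast, in factorialZ(50) - factorial(50) the second operand has already been rounded to a double before gmp ever sees it, which is where the huge nonzero difference in your question comes from.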

50! is between 2^214 and 2^215, so the closest representable doubles in that range are 2^(214-52) apart. factorial in R is based on a Lanczos approximation of the gamma function, whereas factorialZ calculates the result exactly. The two answers agree to within machine precision:
> all.equal(as.numeric(factorialZ(50)), factorial(50))
[1] TRUE
The part that you're missing is floating point and its limitations. You only get ~15 significant digits of precision in a double. factorialZ(50) has a LOT more precision than that, so you shouldn't expect the two results to be identical.
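To make that concrete, here is a small sketch of mine (not from the original answer) comparing the absolute and relative error:
exact  <- factorialZ(50)           # exact 50! as a big integer
approx <- as.bigz(factorial(50))   # the double-precision value, converted afterwards
exact - approx                                        # absolute error: a huge-looking integer
abs(as.numeric(exact - approx)) / as.numeric(exact)   # relative error: roughly 1e-14
The absolute difference looks enormous, but relative to 50! (about 3.04e64) it corresponds to agreement in the first ~14 significant digits, which is about all a double can offer here.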

Related

How can I improve the lindep function's applicability in Pari/GP for integral approximations?

While doing certain computations involving the Rogers L-function, the following result was generated by Wolfram Alpha:
[formula image omitted; numerically the quoted identity amounts to 4*zeta(2) + 4*zeta(3) = 2*pi^2/3 + 4*zeta(3) ~ 11.38796388..., consistent with the expected lindep vector below]
I wanted to verify this result in Pari/GP by means of the lindep function, so I calculated the integral to 20 digits in WA, yielding:
11.3879638800312828875
Then, I used the following code in Pari/GP:
lindep([zeta(2), zeta(3), 11.3879638800312828875])
As pi^2 = 6*zeta(2), one would expect the output to be a vector along the lines of:
[12,12,-3]
because that's the linear dependency suggested by WA's result. However, I got a very elaborate vector from Pari/GP:
[35237276454, -996904369, -4984618961]
I think the first vector should be the "right" output of the Pari code sample.
Questions:
Why is the lindep function in Pari/GP not yielding the output one would expect in this case?
What can I do to make it give the vector that would be more appropriate in this situation?
It comes down to Pari treating your rounded values as exact. Since the input is necessarily rounded, lindep can lock onto a spurious integer relation among the rounded values rather than the true one.
You can control the accuracy lindep works at via its second argument. The manual states that you should choose it to be smaller than the number of correct decimal digits in the input; doing so should resolve the issue (see the sketch after the manual excerpt below).
lindep(v, {flag = 0}) finds a small nontrivial integral linear
combination between components of v. If none can be found return an
empty vector.
If v is a vector with real/complex entries we use a floating point
(variable precision) LLL algorithm. If flag = 0 the accuracy is chosen
internally using a crude heuristic. If flag > 0 the computation is
done with an accuracy of flag decimal digits. To get meaningful
results in the latter case, the parameter flag should be smaller than
the number of correct decimal digits in the input.
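For example, a minimal sketch (mine, not from the original answer), assuming all 20 supplied digits of the integral are correct so that 18 is a safe choice for the second argument:
\\ work at about 18 decimal digits, below the 20 correct digits supplied
lindep([zeta(2), zeta(3), 11.3879638800312828875], 18)
With the accuracy capped this way, lindep should return a relation proportional to [4, 4, -1], which is the same dependency as [12, 12, -3] divided by the common factor 3, i.e. 4*zeta(2) + 4*zeta(3) equals the integral.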

How to increase $double.xmax in .Machine?

I want to calculate a big number.
My problem is that there is a limit.
So for example, if you run factorial(170) it returns: [1] 7.257416e+306.
But as soon as you want to calculate factorial(171) (or anything bigger) it returns [1] Inf.
That is because when you run .Machine you will see that
$double.xmax
[1] 1.797693e+308
So my question is, how can one make it bigger? For instance, can it be raised to something like 1.797693e+500?
You can't, in base R; R can only do computations with 32-bit integers and 64-bit floating point values. You can use the Rmpfr package:
library(Rmpfr)
factorialMpfr(200)
## 1 'mpfr' number of precision 1246 bits
## [1] 788657867364790503552363213932185062295135977687173263294742533244359449963403342920304284011984623904177212138919638830257642790242637105061926624952829931113462857270763317237396988943922445621451664240254033291864131227428294853277524242407573903240321257405579568660226031904170324062351700858796178922222789623703897374720000000000000000000000000000000000000000000000000
This value is "only" about 1e374, but we can easily go larger than that, e.g.
round(log10(factorialMpfr(400)))
869
However, there are some drawbacks: (1) computation is much slower; (2) it can be complicated to fit these results into an existing R workflow. People often find ways to do their computations on the log scale instead (you can compute the log-factorial directly with the lfactorial() function).
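As a small illustration of the log-scale idea (my own sketch, not part of the original answer):
lfactorial(500)             # log(500!), about 2611.33, fits comfortably in a double
lfactorial(500) / log(10)   # about 1134, i.e. 500! is on the order of 1e1134
So even though factorial(500) overflows to Inf, its logarithm is a perfectly ordinary number that plugs into sums and ratios without any special machinery.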

Managing floating point accuracy

I'm struggling with issues re. floating point accuracy, and could not find a solution.
Here is a short example:
aa<-c(99.93029, 0.0697122)
aa
[1] 99.9302900 0.0697122
aa[1]
[1] 99.93029
print(aa[1], digits=20)
[1] 99.930289999999999
It would appear that, upon storing the vector, R converted the numbers to something with a slightly different internal representation (yes, I have read circle 1 of the "R inferno" and similar material).
How can I force R to store the input values exactly "as is", with no modification?
In my case, my problem is that the values are processed in such a way that the small errors very quickly grow:
aa[2]/(100-aa[1])*100
[1] 100.0032 ## Should be 100, of course !
print(aa[2]/(100-aa[1])*100,digits=20)
[1] 100.00315593171625
So I need to find a way to get my normalization right.
Thanks
PS- There are many questions on this site and elsewhere, discussing the issue of apparent loss of precision, i.e. numbers displayed incorrectly (but stored right). Here, for instance:
How to stop read.table from rounding numbers with different degrees of precision in R?
This is a distinct issue, as the number is stored incorrectly (but displayed right).
(R version 3.2.1 (2015-06-18), win 7 x64)
Floating point precision has always generated lots of confusion. The crucial idea to remember is: when you work with doubles, there is no way to store every real number "as is", or "exactly right" -- the best you can do is store the closest available approximation. So when you type (in R or any other modern language) something like x = 99.93029, what gets stored is the nearest double, which prints as 99.930289999999999.
When you then expect a + b to be "exactly 100", you are asking for more than the representation can promise. The best you can get is "100 up to N digits after the decimal point", and you hope N is big enough. In your case it would be correct to say that 99.9302900 + 0.0697122 is 100 with five decimal places of accuracy. Naturally, multiplying that equality by 10^k costs you a further k digits of accuracy.
So, there are two solutions here:
a. To get more accurate output, provide input that is consistent to the accuracy you need: here the second value must be (essentially) 100 minus the first, which 0.06971 is.
bb <- c(99.93029, 0.06971)
print(bb[2]/(100-bb[1])*100, digits = 20)
[1] 99.999999999999119
b. If double precision is not enough (which can happen in complex algorithms), use packages that provide extended-precision arithmetic, for instance gmp (exact big integers and rationals) or Rmpfr (arbitrary-precision floating point).
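As a rough sketch of option (b) (my illustration, using the Rmpfr package that appears elsewhere on this page), note that extra precision does not make this particular ratio equal 100; it only confirms that the inputs themselves do not sum to 100:
library(Rmpfr)
# parse the decimal strings directly at 120-bit precision
aa <- mpfr(c("99.93029", "0.0697122"), precBits = 120)
aa[2] / (100 - aa[1]) * 100
# still about 100.00316, because 99.93029 + 0.0697122 = 100.0000022, not 100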
I think there is some misunderstanding here. This is the usual case where R stores the closest representable value, and what you see depends on the digits option chosen when printing it.
For example:
# the output below will be:
> print(99.930289999999999,digits=20)
[1] 99.930289999999999395
But
# the output of:
> print(1,digits=20)
[1] 1
Also
> print(1.1,digits=20)
[1] 1.1000000000000000888
In addition to the previous answers, a good read on the subject is The R Inferno by P. Burns:
http://www.burns-stat.com/documents/books/the-r-inferno/

Understanding Floating point precision analysis for Parallel Reduction

I am trying to analyze how a (parallel) reduction can be used to add a large array of floating point numbers, and the precision loss involved in doing so. A reduction will definitely help to get more precision compared to plain serial addition. I'd be really thankful if you could direct me to a detailed source or provide some insight for this analysis. Thanks.
Every primitive floating point operation has a rounding error; if the result is x then the rounding error is <= c * abs(x) for some rather small constant c > 0.
If you add 1000 numbers, that takes 999 additions. Each addition has a result and a rounding error, and the rounding error is small when the result is small. So you want to arrange the order of the additions so that the average absolute value of the intermediate results is as small as possible. A binary tree is one method. Sorting the values, then repeatedly adding the two smallest numbers and putting the result back into the sorted list, is also quite reasonable. Both methods keep the average intermediate result small, and therefore keep the accumulated rounding error small.
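To make the numerical point concrete (this is my own R sketch and says nothing about the parallel implementation itself), here is a comparison of a strict left-to-right sum with a recursive pairwise (binary tree) reduction; the Rmpfr sum is used only as a near-exact reference:
library(Rmpfr)   # only for the high-precision reference value

# recursive pairwise ("binary tree") reduction
pairwise_sum <- function(x) {
  n <- length(x)
  if (n == 1L) return(x[1L])
  mid <- n %/% 2L
  pairwise_sum(x[seq_len(mid)]) + pairwise_sum(x[(mid + 1L):n])
}

set.seed(1)
x   <- runif(2^15)                               # 32768 random doubles
ref <- as.numeric(sum(mpfr(x, precBits = 200)))  # effectively exact sum

abs(Reduce(`+`, x) - ref)     # error of a strict left-to-right serial sum
abs(pairwise_sum(x) - ref)    # error of the tree reduction, typically much smaller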

R: Strange trig function behavior

As a Matlab user transitioning to R, I have run across the problem of applying trigonometric functions to degrees. In Matlab there are trig functions for both radians and degrees (e.g. cos and cosd, respectively). R seems to include only functions for radians, thus requiring me to create my own (see below):
cosd <- function(degrees) {
  radians <- degrees * pi / 180   # convert degrees to radians
  cos(radians)
}
Unfortunately this function does not work properly all of the time. Some results are shown below.
> cosd(90)
[1] 6.123234e-17
> cosd(180)
[1] -1
> cosd(270)
[1] -1.836970e-16
> cosd(360)
[1] 1
I'd like to understand what is causing this and how to fix this. Thanks!
This is floating point arithmetic:
> all.equal(cosd(90), 0)
[1] TRUE
> all.equal(cosd(270), 0)
[1] TRUE
If that is what you meant by "does not work properly"?
This is also a FAQ: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
Looks like it's working fine to me. Neither pi nor 90*pi/180 can be represented exactly, so you get a very close estimate rather than an exact zero. If you think about it, 6.123234e-17 and -1.836970e-16 are both extremely close to 0, which is what the answer should be.
Your problem lies in the fact that, while 90*pi/180 equals pi/2 on paper, computers use floating point: both R and Matlab store these values as 64-bit doubles. You can only fit so much information in that limited number of bits, so you cannot store every real number exactly.
You could modify your function so that it returns exactly 0 when given 90 or 270; one way to do that is sketched below.
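A minimal sketch of one way to do that (my suggestion, not from the answer above): base R (3.1.0 and later) provides cospi(x), which computes cos(pi*x) and handles these special angles exactly, so the inexact multiplication by pi never happens:
# cospi(x) computes cos(pi * x); degrees/180 is exact for multiples of 90,
# so no rounded multiple of pi is ever formed
cosd <- function(degrees) {
  cospi(degrees / 180)
}
cosd(90)    # 0
cosd(180)   # -1
cosd(270)   # 0
cosd(360)   # 1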
This is a floating point representation error. See Chapter 1 of http://lib.stat.cmu.edu/s/Spoetry/Tutor/R_inferno.pdf
The same reason that
1-(1/3)-(1/3)-(1/3)
doesn't equal 0. It comes down to how floating point numbers are represented; I'm sure other answers will elaborate.
You may also be interested in the zapsmall function for another way of showing numbers that are close to 0 as 0.
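For instance (a quick illustration of mine, not from the original answer):
# zapsmall() rounds components that are within rounding noise of zero to exactly 0
zapsmall(cos(c(90, 180, 270, 360) * pi / 180))
# [1]  0 -1  0  1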

Resources