Calculate the exponential of a big negative value in R

I would like to know how I can get the exponential of a big negative number in R. For example, when I try:
> exp(-6400)
[1] 0
> exp(-1200)
[1] 0
> exp(-2000)
[1] 0
But I need the value of the above expressions, even though it is so small. How can I get it in R?

Those numbers are too small. To see the smallest value your computer can handle, try:
> .Machine$double.xmin
[1] 2.225074e-308
This gives you (from ?.Machine):
the smallest non-zero normalized floating-point number, a power of the radix, i.e., double.base ^ double.min.exp. Normally 2.225074e-308.
In my case
> .Machine$double.base
[1] 2
> .Machine$double.min.exp
[1] -1022
In practice I can compute exponentials down to
> exp(-745)
[1] 4.940656e-324
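That value is smaller than double.xmin because results below it fall into the subnormal (denormalized) range; the smallest positive subnormal double is 2^-1074. A quick check:
# Subnormal doubles extend below double.xmin, down to 2^-1074:
> 2^-1074
[1] 4.940656e-324
> 2^-1075
[1] 0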
To get around this issue you need arbitrary-precision arithmetic.
In R you can achieve that using the package Rmpfr (PDF vignette):
library(Rmpfr)
# Calculate exp(-100)
> a <- mpfr(exp(-100), precBits=64)
# exp(-1000)
> a^10
1 'mpfr' number of precision 64 bits
[1] 5.07595889754945890823e-435
# exp(-6400)
> a^64
1 'mpfr' number of precision 64 bits
[1] 3.27578823787094497049e-2780
# use an array of powers
> ex <- c(10, 20, 50, 100, 500, 1000, 1e5)
> a ^ ex
7 'mpfr' numbers of precision 64 bits
[1] 5.07595889754945890823e-435 2.57653587296115182772e-869
[3] 3.36969414830892462745e-2172 1.13548386531474089222e-4343
[5] 1.88757769782054893243e-21715 3.56294956530952353784e-43430
[7] 1.51693678090513840149e-4342945
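As a side note, exp() should also work directly on mpfr numbers (it belongs to the Math group generics that Rmpfr implements), which avoids carrying over the double-precision rounding of exp(-100); a minimal sketch:
# Sketch: compute exp(-6400) directly in 128-bit precision,
# rather than via powers of a double-precision exp(-100)
> exp(mpfr(-6400, precBits = 128))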
Note that Rmpfr is based on GNU MPFR and requires GNU GMP. Under Linux you will need gmp, gmp-devel, mpfr, and mpfr-devel installed on your system in order to install these packages; I am not sure how that works under Windows.

Related

How can I overcome large integer limitations in R?

In Python 3 I can easily represent and use fairly large integers such as 2**128. However, in R I run into problems at much smaller integer values, with 2^53 being the upper limit (and why that limit?). For example, the following problem can occur.
x11 <- 2^54 - 11
x12 <- 2^54 - 12
print(x11, digits = 22)
# [1] 18014398509481972
print(x12, digits = 22)
# [1] 18014398509481972
x11 == x12
# [1] TRUE
I know that I could scale values or use floating point and then deal with machine error. But I'm wondering if there is a library or some other work around for using integers directly. Note that the L designation does not solve this problem.
In this case I know that versions and hardware matter, so this is R 4.0.5 on macOS 11.5.1.
You may use the package gmp (see https://www.r-bloggers.com/2019/08/really-large-numbers-in-r/). Then:
library(gmp)
num <- as.bigz(2)
x11 <- num^54 - 11
x12 <- num^54 - 12
print(x11, digits = 22)
Big Integer ('bigz') :
[1] 18014398509481973
print(x12, digits = 22)
Big Integer ('bigz') :
[1] 18014398509481972
x11 == x12
[1] FALSE
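On the "why that limit?" aside: a double has a 53-bit significand, so every integer up to 2^53 is exactly representable, but beyond that the gap between consecutive doubles is at least 2. A quick check on a standard IEEE-754 build of R:
.Machine$double.digits  # 53 significand bits in a double
# [1] 53
2^53 == 2^53 + 1        # TRUE, because 2^53 + 1 is not representable
# [1] TRUE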

R: Number precision, how to prevent rounding?

In R, I have the following vector of numbers:
numbers <- c(0.0193738397702257, 0.0206218006695066, 0.021931558829559,
0.023301378178208, 0.024728095594751, 0.0262069239112787, 0.0277310799996657,
0.0292913948762414, 0.0308758879014822, 0.0324693108459748, 0.0340526658271053,
0.03560271425176, 0.0370915716288017, 0.0384863653635563, 0.0397490272396821,
0.0408363289939899, 0.0417002577578561, 0.0422890917131629, 0.0425479537267193,
0.0424213884467212, 0.0418571402964338, 0.0408094991140723, 0.039243951482081,
0.0371450856007627, 0.0345208537496488, 0.0314091884865658, 0.0278854381969885,
0.0240607638577763, 0.0200808932436969, 0.0161193801903312, 0.0123615428382314,
0.00920410652651576, 0.00628125319205829, 0.0038816517651031,
0.00214210795679701, 0.00103919307280354, 0.000435532895812429,
0.000154730641092234, 4.56593150728962e-05, 1.09540661898799e-05,
2.08952167815574e-06, 3.10045314287095e-07, 3.51923218134997e-08,
3.02121734299694e-09, 1.95269500257237e-10, 9.54697530552714e-12,
3.5914029230041e-13, 1.07379981978647e-14, 2.68543048763588e-16,
6.03891613157815e-18, 1.33875697089866e-19, 3.73885699170518e-21,
1.30142752487978e-22, 5.58607581840324e-24, 2.92551478380617e-25,
1.85002124085815e-26, 1.39826890505611e-27, 1.25058972437096e-28,
1.31082961467944e-29, 1.59522437605631e-30, 2.23371981458205e-31,
3.5678974253211e-32, 6.44735482309705e-33, 1.30771083084868e-33,
2.95492180915218e-34, 7.3857554006177e-35, 2.02831084124162e-35,
6.08139499028838e-36, 1.97878175996974e-36, 6.94814886769478e-37,
2.61888070029751e-37, 1.05433608968287e-37, 4.51270543356897e-38,
2.04454840598946e-38, 9.76544451781597e-39, 4.90105271869773e-39,
2.5743371658684e-39, 1.41165292292001e-39, 8.06250933233367e-40,
4.78746160076622e-40, 2.94835809615626e-40, 1.87667170875529e-40,
1.22833908072915e-40, 8.21091993733535e-41, 5.53869254991177e-41,
3.74485710867631e-41, 2.52485401054841e-41, 1.69027430542613e-41,
1.12176290106797e-41, 7.38294520887852e-42, 4.8381070000246e-42,
3.20123319815522e-42, 2.16493953538386e-42, 1.50891804884267e-42,
1.09057070511506e-42, 8.1903023226717e-43, 6.3480235351625e-43,
5.13533594742621e-43, 4.25591269645348e-43, 3.57422485839717e-43,
3.0293235331048e-43, 2.58514651313175e-43, 2.21952686649801e-43,
1.91634521841049e-43, 1.66319240529025e-43, 1.45043336371471e-43,
1.27052593975384e-43, 1.11752052211757e-43, 9.86689196888877e-44,
8.74248543892126e-44)
I use cumsum to get the cumulative sum. Due to R's numerical precision, many of the cumulative values towards the end of the vector are now equal to 1 (even though technically they're not exactly 1, just very close to it).
So when I try to recover my original numbers using diff(cumulative), I get a lot of 0s instead of the very small numbers. How can I prevent R from "rounding"?
cumulative <- cumsum(numbers)
diff(cumulative)
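To see why the plain double version collapses: the running total is around 1, while the trailing terms are around 1e-43, far below the roughly 16 significant digits a double can carry, so adding them changes nothing. A minimal illustration:
# An increment far smaller than ~2.2e-16 relative to the total is simply lost:
1 + 1e-20 == 1
# [1] TRUE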
I think the Rmpfr package does what you want:
library(Rmpfr)
x <- mpfr(numbers,200) # set arbitrary precision that's greater than R default
cumulative <- cumsum(x)
diff(cumulative)
Here's the top and bottom of the output:
> diff(cumulative)
109 'mpfr' numbers of precision 200 bits
[1] 0.02062180066950659862445860426305443979799747467041015625
[2] 0.021931558829559001655429284483034280128777027130126953125
[3] 0.02330137817820800150148130569505156017839908599853515625
[4] 0.0247280955947510004688805196337852976284921169281005859375
...
[107] 1.117520522117570086014450710640040701536080790307716261438975e-43
[108] 9.866891968888769759087690539062888824928577731689952701181586e-44
[109] 8.742485438921260418707338389502002282130643811990663213422948e-44
You can adjust the precision as you like by changing the second argument to mpfr.
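If you need plain doubles again afterwards, the high-precision result can be converted back; a small follow-up sketch using Rmpfr's asNumeric():
# Drop back to ordinary doubles once the exact differencing is done;
# values around 1e-43 are well within double range, so they survive:
asNumeric(diff(cumulative))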

Approximation to constant "pi" does not get any better after 50 iterations

In R I have written this function
ifun <- function(m) {
  o = c()
  for (k in 1:m) {
    o[k] = prod(1:k) / prod(2 * (1:k) + 1)
  }
  o_sum = 2 * (1 + sum(o)) # Final result
  print(o_sum)
}
This function approximates the constant pi; however, for m > 50 the approximation gets stuck, i.e. it keeps returning the same value and doesn't get any better. How can I fix this? Thanks.
Let's go inside:
o <- numeric(100)
for (k in 1:length(o)) {
  o[k] = prod(1:k) / prod(2 * (1:k) + 1)
}
o
# [1] 3.333333e-01 1.333333e-01 5.714286e-02 2.539683e-02 1.154401e-02
# [6] 5.328005e-03 2.486402e-03 1.170072e-03 5.542445e-04 2.639260e-04
# [11] 1.262255e-04 6.058822e-05 2.917211e-05 1.408309e-05 6.814396e-06
# [16] 3.303950e-06 1.604776e-06 7.807016e-07 3.803418e-07 1.855326e-07
# [21] 9.060894e-08 4.429771e-08 2.167760e-08 1.061760e-08 5.204706e-09
# [26] 2.553252e-09 1.253415e-09 6.157124e-10 3.026383e-10 1.488385e-10
# [31] 7.323800e-11 3.605563e-11 1.775874e-11 8.750685e-12 4.313718e-12
# [36] 2.127313e-12 1.049474e-12 5.179224e-13 2.556832e-13 1.262633e-13
# [41] 6.237104e-14 3.081863e-14 1.523220e-14 7.530524e-15 3.723886e-15
# [46] 1.841922e-15 9.112667e-16 4.509361e-16 2.231906e-16 1.104904e-16
# [51] 5.470883e-17 2.709390e-17 1.342034e-17 6.648610e-18 3.294356e-18
# [56] 1.632601e-18 8.092024e-19 4.011431e-19 1.988861e-19 9.862119e-20
# [61] 4.890969e-20 2.425921e-20 1.203410e-20 5.970404e-21 2.962414e-21
# [66] 1.470070e-21 7.295904e-22 3.621325e-22 1.797636e-22 8.924434e-23
# [71] 4.431013e-23 2.200227e-23 1.092630e-23 5.426483e-24 2.695273e-24
# [76] 1.338828e-24 6.650954e-25 3.304296e-25 1.641757e-25 8.157799e-26
# [81] 4.053875e-26 2.014653e-26 1.001295e-26 4.976849e-27 2.473873e-27
# [86] 1.229786e-27 6.113795e-28 3.039627e-28 1.511323e-28 7.514865e-29
# [91] 3.736900e-29 1.858350e-29 9.242063e-30 4.596582e-30 2.286258e-30
# [96] 1.137206e-30 5.656871e-31 2.814078e-31 1.399968e-31 6.965017e-32
print(sum(o[1:49]), digits = 22)
#[1] 0.5707963267948963359544
print(sum(o[1:50]), digits = 22)
#[1] 0.5707963267948964469767
print(sum(o[1:51]), digits = 22)
#[1] 0.570796326794896557999
print(sum(o[1:52]), digits = 22)
#[1] 0.570796326794896557999
There is no further improvement after 51, because:
o[51] / o[1]
#[1] 1.641265e-16
o[52] / o[1]
#[1] 8.128169e-17
Further terms are too small compared with the 1st term, already below what machine precision can resolve.
.Machine$double.eps
#[1] 2.220446e-16
So eventually you are just adding zeros. At that point the summation over o has numerically converged, and so has your approximation to pi.
More thoughts
The IEEE 754 double-precision floating-point format uses, on a 64-bit machine, 1 bit for the sign, 11 bits for the exponent, and 52 explicit bits for the significand (53 effective bits counting the implicit leading bit). This gives the machine precision: 1 / (2 ^ 52) = 2.2204e-16. In other words, a double-precision floating-point number has at most about 16 valid significant digits. The R function print can display up to 22 digits, and sprintf can display more, but remember: any digits beyond roughly the 16th are not reliable.
Have a look at the constant pi in R:
sprintf("%.53f", pi)
#[1] "3.14159265358979311599796346854418516159057617187500000"
If you compare it with How to print 1000 decimals places of pi value?, you will see that only the first 16 digits are truly correct:
3.141592653589793
What could alternatively be done, so that I can calculate more digits using my approach?
No. There are many clever algorithms around for computing a staggeringly large number of digits of pi, but you cannot modify your approach, while staying in double precision, to get more valid significant digits.
At first I thought about computing sum(o[1:51]) and sum(o[52:100]) separately, as both of them give 16 valid significant digits. But we can't just concatenate them to get 32 digits, because for sum(o[1:51]) the true digits beyond the 16th are not zeros, so the 16 digits of sum(o[52:100]) are not the 17th to 32nd digits of sum(o[1:100]).
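That said, if more digits of this particular series were genuinely wanted, the arbitrary-precision route used elsewhere on this page applies; a rough sketch with Rmpfr (200 bits and the variable names are my own choices):
library(Rmpfr)
# Accumulate the series 2 * (1 + sum_k k! / (3*5*...*(2k+1))) with 200-bit terms,
# updating the k-th term incrementally instead of calling prod():
term <- mpfr(1, 200)
s <- mpfr(0, 200)
for (k in 1:200) {
  term <- term * k / (2 * k + 1)  # k-th term of the series
  s <- s + term
}
2 * (1 + s)  # agrees with pi to far more than 16 significant digits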

R rhyper() fails to give correct hypergeometric random number

I am trying to generate some random numbers from the hypergeometric distribution using R. However, rhyper() behaves very strangely when I have a very small number of white balls and a large number of black balls. Here is what I got on my computer:
> sum(rhyper(100,1000,1e9-1000,1e6))
[1] 91
> sum(rhyper(100,2000,1e9-2000,1e6))
[1] 204
> sum(rhyper(100,10000,1e9-10000,1e6))
[1] 1016
> sum(rhyper(100,20000,1e9-20000,1e6))
[1] 1909
> sum(rhyper(100,50000,1e9-50000,1e6))
[1] 4968
> sum(rhyper(100,5000,1e9-5000,1e6))
[1] 60
> sum(rhyper(100,6000,1e9-6000,1e6))
[1] 164
> sum(rhyper(100,8000,1e9-8000,1e6))
[1] 0
> sum(rhyper(100,9000,1e9-9000,1e6))
[1] 45
The first five work fine, but for the 6th I expected to get a number around 500, not something like 60; the same goes for the 7th, 8th, and 9th.
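To spell out that expectation: each draw has mean k * m / (m + n), and summing 100 draws multiplies that by 100. A quick check for the 6th case:
# mean of one draw: 1e6 * 5000 / 1e9 = 5, so the sum of 100 draws should be ~500
100 * 1e6 * 5000 / 1e9
# [1] 500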
Is something wrong with the rhyper() function, or with my computer?

conversion bigq to mpfr with Rmpfr package

The help documentation of the Rmpfr R package claims that the .bigq2mpfr() function uses the minimal precision necessary for correct representation when the precB argument is NULL:
Description:
Coerce from and to big integers (‘bigz’) and ‘mpfr’ numbers.
Further, coerce from big rationals (‘bigq’) to ‘mpfr’ numbers.
Usage:
.bigz2mpfr(x, precB = NULL)
.bigq2mpfr(x, precB = NULL)
.mpfr2bigz(x, mod = NA)
Arguments:
x: an R object of class ‘bigz’, ‘bigq’ or ‘mpfr’ respectively.
precB: precision in bits for the result. The default, ‘NULL’, means
to use the _minimal_ precision necessary for correct
representation.
However when converting 31/3 one gets a bad approximation:
> x <- as.bigq(31,3)
> .bigq2mpfr(x)
1 'mpfr' number of precision 8 bits
[1] 10.31
By looking inside the .bigq2mpfr() function we see the detailed procedure:
N <- numerator(x)
D <- denominator(x)
if (is.null(precB)) {
  eN <- frexpZ(N)$exp
  eD <- frexpZ(D)$exp
  precB <- eN + eD + 1L
}
.bigz2mpfr(N, precB)/.bigz2mpfr(D, precB)
Firstly, I do not understand why precB is chosen this way. The exp output of frexpZ() is the exponent in the binary decomposition:
> frexpZ(N)
$d
[1] 0.96875
$exp
[1] 5
> 0.96875*2^5
[1] 31
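The same check for the denominator (my own verification) gives an exponent of 2, which is where precB = 5 + 2 + 1 = 8 comes from:
# 3 = 0.75 * 2^2, so frexpZ reports exp = 2 for the denominator:
> frexpZ(as.bigz(3))$exp
[1] 2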
Here we get precB=8 and the result is then identical to:
> mpfr(31, precBits=8)/mpfr(3, precBits=8)
1 'mpfr' number of precision 8 bits
[1] 10.31
I am under the impression that one should rather replace precB with 2^precB, but I'd like to get some advice about that:
> mpfr(31, precBits=8)/mpfr(3, precBits=2^8)
1 'mpfr' number of precision 256 bits
[1] 10.33333333333333333333333333333333333333333333333333333333333333333333333333329
> mpfr(31, precBits=8)/mpfr(3, precBits=2^9)
1 'mpfr' number of precision 512 bits
[1] 10.3333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333329
> mpfr(31, precBits=8)/mpfr(3, precBits=2^7)
1 'mpfr' number of precision 128 bits
[1] 10.33333333333333333333333333333333333332
I get (note the difference in my initial creation):
Rgames> fooq<-as.bigq(31/3)
Rgames> fooq
Big Rational ('bigq') :
[1] 5817149518686891/562949953421312
Rgames> .bigq2mpfr(fooq)
1 'mpfr' number of precision 104 bits
[1] 10.3333333333333339254522798000835
All this strongly suggests to me that the precision in your bigq number is in fact zero decimal places, i.e. each of "31" and "3" has that precision. As such, your mpfr conversion is quite correct in giving you a result with one decimal place of precision.
This has been corrected in a newer version of the package:
> x <- as.bigq(31,3)
> .bigq2mpfr(x)
1 'mpfr' number of precision 128 bits
[1] 10.33333333333333333333333333333333333332
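On versions that still show the 8-bit behaviour, a practical workaround is to pass precB explicitly rather than relying on the default; a sketch (128 bits chosen arbitrarily; the output should match the corrected default shown above):
> x <- as.bigq(31, 3)
> .bigq2mpfr(x, precB = 128)
1 'mpfr' number of precision 128 bits
[1] 10.33333333333333333333333333333333333332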

Resources