Large exponent values in R with 'Rmpfr'

I am trying to calculate the value of exp(1000) or more in R. I looked at the solution provided in Calculate the exponentials of big negative value and followed it, replicating the code for negative values first, but I get the following output.
a <- mpfr(exp(-1000),precBits = 64)
a
1 'mpfr' number of precision 64 bits
[1] 0
I do not understand why my output differs from the provided solution. I understand this particular solution is for negative values and that I am looking for positive exponents; regardless, it should work both ways.

You need to convert to extended precision before you exponentiate; otherwise R first evaluates exp(-1000) in ordinary double precision, where it underflows to 0 (as you found out), and only then wraps that 0 in an 'mpfr' number.
> a <- exp(mpfr(-1000, precBits = 64))
> a
1 'mpfr' number of precision 64 bits
[1] 5.07595889754945676548e-435
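The positive exponent from the original question works the same way: convert first, then exponentiate. A minimal sketch along the same lines (assuming Rmpfr is loaded, as above); the printed value is shown approximately:
b <- exp(mpfr(1000, precBits = 64))
b
# 1 'mpfr' number of precision 64 bits
# [1] 1.9700711e434 (approximately)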

Related

R random number generator faulty?

I was looking into the RNG of base R and was curious whether the 32-bit implementation of the Mersenne-Twister might be a limitation when large numbers of random draws are needed, so I did a simple test:
set.seed(8)
length(unique(runif(1e8)))
# [1] 98845641
1e8 - 98845641
# 1154359
So it turns out that there are indeed numerous duplicates among the 100 million draws.
When I switch to the 64-bit MT RNG implemented by the dqrng package, the problem does not appear.
Question 1:
Does the "64 bit" referenced refer to the type of floating point numbers used?
Question 2:
Am I right to conclude that, because of the much larger set of representable values (64-bit vs 32-bit), duplicates are less likely when using the 64-bit MT?
from ?Random:
Do not rely on randomness of low-order bits from RNGs. Most of the supplied uniform generators return 32-bit integer values that are converted to doubles, so they take at most 2^32 distinct values and long runs will return duplicated values.
Indeed, when we calculate the expected number of draws that have a duplicate, we get
M <- 2^32
n <- 1e8
(n * (1 - (1 - 1 / M)^(n - 1))) / 2
# [1] 1150705
which is very close to the result that you have.
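As a cross-check of the dqrng comparison mentioned in the question, here is a sketch of the kind of test one might run (this assumes the dqrng package is installed; dqset.seed and dqrunif are its seeding and uniform-draw functions):
library(dqrng)
dqset.seed(8)
length(unique(dqrunif(1e8)))
# essentially 1e8: the 64-bit generators draw from far more than 2^32 values,
# so duplicates become vanishingly rare at this sample size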

R addition of large and small numbers is ignoring the smaller values

I'm encountering a problem when adding large numbers in R: the smaller values get ignored and the result is incorrect.
For example, I've been using a binary to decimal converter found here: Convert binary string to binary or decimal value. The penultimate step looks like this:
2^(which(rev(unlist(strsplit(as.character(MyData$Index[1]), "")) == 1))-1)
[1] 1 2 32 64 256 2048 ...
I didn't include all the numbers for brevity, but when they are summed they should yield the integer value of the binary number. The correct result should be 4,919,768,674,277,575,011, but R gives me 4,919,768,674,277,574,656. Notice that this is off by 355, which is the sum of the first five listed numbers.
I thought it might have to do with an integer limit, but I tested it and R can handle larger numbers than I need. Here's an example of something I tried, which again yielded an incorrect result:
2^64
[1] 18446744073709551616 #Correct Value
2^65
[1] 36893488147419103232 #Correct Value
2^64 + 2^65
[1] 55340232221128654848 #Correct Value
2^64 + 2^65 + 1
[1] 55340232221128654848 #Incorrect Value
It seems like there's some sort of problem with the precision of large-number addition, but I don't know how to fix it so that I can get the desired result.
Any help would be greatly appreciated. And I apologize if anything is formatted improperly, this is my first post on the site.
For large integers, we could use as.bigz from gmp
library(gmp)
as.bigz(2^64) + as.bigz(2^65) + 1
# Big Integer ('bigz') :
#[1] 55340232221128654849
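Applied to the original binary-string conversion, the whole sum can be kept in bigz so nothing is lost. A sketch with a made-up 65-bit string standing in for as.character(MyData$Index[1]):
library(gmp)
bin  <- paste0("1", strrep("0", 62), "11")   # hypothetical input, equal to 2^64 + 3
bits <- unlist(strsplit(bin, ""))
sum(as.bigz(2)^(which(rev(bits) == "1") - 1))
# Big Integer ('bigz') :
# [1] 18446744073709551619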

Do not want large numbers to be rounded off in R

options(scipen=999)
625075741017804800
625075741017804806
When I type the above into the R console, I get the same output for both numbers listed above: 625075741017804800.
How do I avoid that?
Numbers greater than 2^53 are not going to be unambiguously stored in R's numeric-classed vectors: a double carries a 53-bit significand, so integers are represented exactly only up to 2^53, and your number is larger than that capacity for precision:
625075741017804806 > 2^53
[1] TRUE
R's integer class is even more limited: it only reaches .Machine$integer.max == 2147483647, and larger values are silently stored as 'numeric' (double) class. You will either need to work with the numbers as character values or install a package capable of arbitrary precision. Rmpfr and gmp are two that come to mind.
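The gmp route mentioned above looks like this; passing the number as a character string keeps it from ever being squeezed through a double (a sketch, assuming gmp is installed):
library(gmp)
as.bigz("625075741017804806")
# Big Integer ('bigz') :
# [1] 625075741017804806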
You can use the package Rmpfr for arbitrary precision:
library(Rmpfr)
dig <- mpfr("625075741017804806")
print(dig, 18)
# 1 'mpfr' number of precision 60 bits
# [1] 6.25075741017804806e17

How to work with large numbers in R?

I would like to change the precision of a calculation in R. For example, I would like to calculate x^6 with x = c(-2.5e+59, -5.6e+60). To calculate it I need to change the precision in R, otherwise the result is Inf, and I don't know how to do that.
As Livius points out in his comment, this is an issue with R (and, in fact, most programming languages) and how numbers are represented in binary.
To work with extremely large/small floating point numbers, you can use the Rmpfr library:
install.packages("Rmpfr")
library("Rmpfr")
x <- c(-2.5e+59, -5.6e+60)
y <- mpfr(x, 6) # the second argument is how many precision bits you want - NB: bits, not decimal places!
y^6
# 2 'mpfr' numbers of precision 6 bits
# [1] 2.50e356 3.14e364
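For comparison, a higher precBits setting gives results much closer to the true values (a sketch; 120 bits is an arbitrary choice, and the printed values are rounded here):
y2 <- mpfr(x, 120)
y2^6
# 2 'mpfr' numbers of precision 120 bits
# approximately 2.4414e356 and 3.0841e364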
To work with numbers that are even larger than R can handle (e.g. exp(1800)) you can use the "Brobdingnag" package:
install.packages("Brobdingnag")
library("Brobdingnag")
## An example of a single number too large for R:
10^1000.7
# [1] Inf
## Now using the Brobdingnag package:
10^as.brob(1000.7)
# [1] +exp(2304.2)
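Brobdingnag can also be pointed at the original x^6 example; a sketch (the printed magnitudes are approximate):
x <- c(-2.5e+59, -5.6e+60)
as.brob(x)^6
# roughly +exp(820.61) and +exp(839.27)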

Calculations precision level in R

I am working in R with very small numbers that reflect probabilities in a Maximum Likelihood Estimation algorithm. Some of these numbers are as small as 1e-155 (or smaller). However, when something as simple as a summation takes place, the precision level appears to get truncated to the least precise term, ruining the precision of my calculations and producing meaningless results.
Example:
> sum(c(7.831908e-70,6.002923e-26,6.372573e-36,5.025015e-38,5.603268e-38,1.118121e-14, 4.512098e-07,4.400717e-05,2.300423e-26,1.317602e-58))
[1] 4.445838e-05
As seen from the example, the result is reported on the order of 1e-5, which appears to crudely round away the more sensitive terms.
Is there a way around this? Why is R choosing such strange automatic behaviour? Or is it perhaps not really doing this, and I just see the result in truncated form? In that case, is the actual number stored in the variable at its correct precision?
There is no real precision loss in your sum beyond ordinary double-precision rounding. But if you're worried about it, you can use a multiple-precision library:
library("Rmpfr")
x <- c(7.831908e-70,6.002923e-26,6.372573e-36,5.025015e-38,5.603268e-38,1.118121e-14, 4.512098e-07,4.400717e-05,2.300423e-26,1.317602e-58)
sum(mpfr(x, 1024))
# 1 'mpfr' number of precision 1024 bits
# [1] 4.445837981118120898327314579322617633703674840117902103769961398533293289165193843930280422747754618577451267010103975610356319174778512980120125435961577770470993217990999166176083700886405875414277348471907198346293122011042229843450802884152750493740313686430454254150390625000000000000000000000000000000000e-5
Your results are only truncated in the display.
Try:
x <- sum(c(7.831908e-70,6.002923e-26,6.372573e-36,5.025015e-38,5.603268e-38,1.118121e-14, 4.512098e-07,4.400717e-05,2.300423e-26,1.317602e-58))
print(x, digits=22)
[1] 4.445837981118121081878e-05
You can read more about the behaviour of print at ?print.default
You can also set an option - this will affect all calls to print:
options(digits=22)
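If you only want the extra digits for a single value, format() offers a per-call alternative to the global option (a sketch using the x computed above):
format(x, digits = 22)
# [1] "4.445837981118121081878e-05"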
Have you ever heard of floating point numbers?
There is essentially no loss of precision (significant figures) in multiplication or division, as long as the result stays between about 4.9·10^−324 and 1.7976931348623157·10^308 (see the link below for details).
So if you do 1.0e-30 * 1.0e-10, the result will be 1.0e-40,
but if you do 1.0e-30 + 1.0e-10, the result will be 1.0e-10.
Why? A computer can only represent a finite set of numbers: with 64 bits there are at most 2^64 distinct bit patterns. A direct integer encoding would cover roughly -2^63 to +2^63 (about ±9·10^18), but only whole numbers. Floating point is the cleverer alternative: it spreads those same bit patterns over a far wider range, roughly ±4.9·10^−324 to ±1.7976931348623157·10^308, and uses them to approximate real numbers.
The price of that wider range is precision in sums: additions and subtractions lose precision because the (52-bit) fraction part of a 64-bit floating point number can only hold about log10(2^52) ≈ 16 significant decimal figures.
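A quick demonstration of both claims in plain R:
1.0e-30 * 1.0e-10
# [1] 1e-40
1.0e-30 + 1.0e-10 == 1.0e-10
# [1] TRUE  (the small addend is entirely absorbed by rounding)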
If you want a basic everyday example, look at summary() of an lm fit: when a parameter's p-value is near zero, summary() prints < 2.2e-16 (not a coincidence: 2.2e-16 is the machine epsilon of a double).
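You can check that threshold directly:
.Machine$double.eps
# [1] 2.220446e-16  (the smallest positive x with 1 + x != 1 in double precision)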
Why limit ourselves to 64 bits? CPUs have execution units built specifically for 64-bit floating point arithmetic (the 64-bit IEEE 754 standard); if you use higher precision, such as 128-bit floating point, performance drops by a factor of 10 or more, because the CPU has to split the data and operations into multiple 64-bit pieces.
https://en.wikipedia.org/wiki/Double-precision_floating-point_format
