R: Exponent returning infinity

I need to undo the logarithm applied to my data, so I am raising e to the power of the logged values.
My issue is that when the exponent is larger than about 709, R returns infinity. How can I get around this?
exp(710)
[1] Inf
Thanks :)

If you really want to work with numbers that big you can use the Rmpfr package.
library('Rmpfr')
x <- mpfr(710, precBits = 106)
exp(x)
1 'mpfr' number of precision 106 bits
[1] 2.233994766161711031253644458116e308
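One caveat (my addition, not part of the original answer): the result is only representable while it stays an mpfr object. Converting it back to an ordinary double overflows again, because the value exceeds .Machine$double.xmax (about 1.8e308), so downstream calculations also need to stay in mpfr (or on the log scale).
big <- exp(mpfr(710, precBits = 106))
asNumeric(big)  # back to double precision: overflows again
#> [1] Inf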

Related

High precision sum with R involving binomial coefficients and logs

I am trying to compute the sum of terms obtained as products of binomial coefficients (very large integers) and logarithms (small reals), each term having an alternating sign.
For example:
library(Rmpfr)
binom <- function(n, i) {factorial(n) / (factorial(n-i) * factorial(i))}
i <- 30
n <- 60
Ui <- rep(0, i)
for (k in 0:(i-1)) {
  Ui[k+1] <- (-1)^(i-1-k) * binom(i-1, k) / (n-k) * log(n-k)
}
U <- sum(mpfr(Ui, 1024))
returns 7.2395....e-10, which is very far from the actual value returned by Mathematica, namely -5.11...e-20.
How can I make the sum accurate? I checked the Ui manually and each one individually seems accurate to many digits.
Edit
The Mathematica code computing the same sum works on integers and only converts to reals once the sum is done. I increased the number of reported decimals.
Why do I need this?
In the end, I need the ratio of two numbers obtained from similar computations. When the two numbers are off by a couple of orders of magnitude, the resulting ratio is simply unpredictable.
You need to work with the mpfr objects during the whole calculation, rather than just at the summation:
library(Rmpfr)
i <- 30
n <- 60
k <- 0:(i - 1)
nk <- mpfr(n - k, 128)
(U <- sum((-1)^(i-1-k)*choose(i-1,k)/(nk)*log(nk)))
#> 1 'mpfr' number of precision 128 bits
#> [1] -5.110333215290518581300810256453669394729e-20
nk <- mpfr(n - k, 256)
(U <- sum((-1)^(i-1-k)*choose(i-1,k)/(nk)*log(nk)))
#> 1 'mpfr' number of precision 256 bits
#> [1] -5.110333215285320173235309727002720346864555872897902728222060861935229197560667e-20
nk <- mpfr(n - k, 512)
(U <- sum((-1)^(i-1-k)*choose(i-1,k)/(nk)*log(nk)))
#> 1 'mpfr' number of precision 512 bits
#> [1] -5.1103332152853201732353097270027203468645558728979134452318939958128833820370490135678222208577277855238767473116630391351888405531035522832949562601913591e-20
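To see why plain double precision fails here (this comparison is my own illustration, not part of the original answer): the individual terms reach roughly 10^6 in magnitude while the sum is of order 10^-20, so essentially all 53 bits of a double are lost to cancellation and only rounding noise remains.
# The same expression evaluated entirely in doubles: the ~26 orders of
# magnitude of cancellation leave no correct significant digits.
k <- 0:29
sum((-1)^(29 - k) * choose(29, k) / (60 - k) * log(60 - k))
# returns rounding noise (order 1e-10 to 1e-9), nothing like the true -5.11e-20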

Getting 0 in R instead of a precise result

How can I get the actual precise result instead of the rounded ones?
result = ((0.61)**(10435)*(0.39)**(6565))/((0.63)**(5023)*(0.60)**(5412)*(0.37)**(2977)*(0.40)**(3588))
Output:
NaN
Both the numerator and the denominator underflow to 0, and 0/0 is NaN.
I think the logarithm is a powerful tool for dealing with exponentials that have large powers (see its properties at https://mathworld.wolfram.com/Logarithm.html).
You can take the log of your expression first and then apply exp at the end, i.e.,
result <- exp((10435*log(0.61)+6565*log(0.39)) - (5023*log(0.63)+5412*log(0.60)+ 2977*log(0.37)+3588*log(0.40)))
which gives
> result
[1] 0.001219116
R cannot handle such expressions in double precision because the intermediate powers underflow to 0; precision is not infinite. For what you want, you need an arbitrary-precision package, such as Rmpfr.
library(Rmpfr)
precision <- 120
result <- (mpfr(0.61, precision)**10435 * mpfr(0.39, precision)**6565) /
(mpfr(0.63, precision)**5023 * mpfr(0.60, precision)**5412 * mpfr(0.37, precision)**2977 * mpfr(0.40, precision)**3588)
print(result)
Output:
1 'mpfr' number of precision 120 bits
[1] 0.0012191160601483692718001967190171336975
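A further detail worth knowing (my addition): mpfr(0.61, precision) first rounds 0.61 to the nearest 53-bit double and only then extends it, so the extra bits describe that already-rounded value. If the inputs are meant to be exactly the decimal constants, passing them as character strings lets MPFR parse them at the target precision; the difference is tiny here, but it can matter when the exponents are this large.
library(Rmpfr)
precision <- 120
# Character input is parsed by MPFR directly, avoiding the initial
# 53-bit rounding of the numeric literals.
num   <- mpfr("0.61", precision)^10435 * mpfr("0.39", precision)^6565
denom <- mpfr("0.63", precision)^5023 * mpfr("0.60", precision)^5412 *
         mpfr("0.37", precision)^2977 * mpfr("0.40", precision)^3588
num / denom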

How to compute p-values from z-scores in R when the z-score is large (p-value extremely close to zero)?

In genetics very small p-values are common (for example 10^-400), and I am looking for a way to get very small p-values (two-tailed) when the z-score is large in R, for example:
z=40
pvalue = 2*pnorm(abs(z), lower.tail = F)
This gives me zero instead of a very small, highly significant value.
The inability to handle p-values less than about 10^(-308) (.Machine$double.xmin) is not really R's fault, but is rather a generic limitation of any computational system that uses double precision (64-bit) floats to store numeric information.
It's not hard to solve the problem by computing on the log scale, but you can't store the result as a numeric value in R; instead, you need to store (or print) the result as a mantissa plus exponent.
pvalue.extreme <- function(z) {
  log.pvalue <- log(2) + pnorm(abs(z), lower.tail = FALSE, log.p = TRUE)
  log10.pvalue <- log.pvalue / log(10)  ## from natural log to log10
  mantissa <- 10^(log10.pvalue %% 1)
  exponent <- log10.pvalue %/% 1
  ## or return(c(mantissa, exponent))
  return(sprintf("p value is %1.2f times 10^(%d)", mantissa, exponent))
}
Test with a not-too-extreme case:
pvalue.extreme(5)
## [1] "p value is 5.73 times 10^(-7)"
2*pnorm(5,lower.tail=FALSE)
## [1] 5.733031e-07
More extreme:
pvalue.extreme(40)
## [1] "p value is 7.31 times 10^(-350)"
There are a variety of packages that handle extremely large/small numbers with extended precision in R (Brobdingnag, Rmpfr, ...) For example,
2*Rmpfr::pnorm(Rmpfr::mpfr(40, precBits = 100), lower.tail = FALSE, log.p = FALSE)
## 1 'mpfr' number of precision 100 bits
## [1] 7.3117870818300594074979715966414e-350
However, you will pay a big cost in computational efficiency and convenience for working with an arbitrary-precision system.
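As a quick cross-check (my addition), the log-scale calculation in plain double precision agrees with the Rmpfr value above:
z <- 40
(log(2) + pnorm(-abs(z), log.p = TRUE)) / log(10)
## about -349.14, i.e. p is roughly 7.3e-350, matching the Rmpfr result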

R histogram breaks Error

I have to prepare an algorithm for my thesis to cross-check a theoretical result, namely that the binomial model for N periods converges to the lognormal distribution as N goes to infinity. For those of you not familiar with the concept: I have to create an algorithm that takes a starting value and multiplies it by an up-multiplier and a down-multiplier, and continues to do so for every value over N steps. The algorithm should return a vector whose elements are of the form S*u^i*d^(N-i), i = 0, ..., N.
The simple algorithm I proposed is
rata<-function(N,r,u,d,S){
length(x)<-N
for(i in 0:N){
x[i]<-S*u^{i}*d^{N-i}
}
return(x)
}
N is the number of periods and the rest are just unimportant values (u is the up factor, d the down factor, etc.).
In order to extract my results I need to make a histogram of the produced vector's logarithm to show that the values are normally distributed. However, for N = 100000 (I need a great number of steps to show convergence), when I type hist(x) I get the error: invalid number of breaks.
Can anyone help? Thanks in advance.
An example
taf<-rata(100000,1,1.1,0.9,1)
taf1<-log(taf)
hist(taf1,xlim=c(-400,400))
First I fix your function:
rata <- function(N, r, u, d, S) {
  x <- numeric(N + 1)
  for (i in 0:N) {
    x[i + 1] <- S * u^i * d^(N - i)
  }
  return(x)
}
Or relying on vectorization:
rata <- function(N, r, u, d, S) {
  x <- S * u^(0:N) * d^(N - (0:N))
  return(x)
}
taf<-rata(100000,1,1.1,0.9,1)
Looking at the result, we notice that it contains NaN values:
taf[7440 + 8:9]
#[1] 0 NaN
What happened? Apparently the multiplication became NaN:
1.1^7448*0.9^(1e5-7448)
#[1] NaN
1.1^7448
#[1] Inf
0.9^(1e5-7448)
#[1] 0
Inf * 0
#[1] NaN
Why does an Inf value occur? Well, because of double overflow (read help("double")):
1.1^(7440 + 7:8)
#[1] 1.783719e+308 Inf
You have the analogous problem, underflow, when a multiplicand gets too close to 0 (read help(".Machine")).
You may need to use arbitrary precision numbers.
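Alternatively, since the goal is a histogram of the logarithms anyway, you can sidestep overflow and underflow completely by computing the logs directly. The helper log_rata below is my own sketch, not part of the original answer:
# log(S * u^i * d^(N-i)) = log(S) + i*log(u) + (N-i)*log(d), which stays
# well within double range for these inputs.
log_rata <- function(N, u, d, S) {
  i <- 0:N
  log(S) + i * log(u) + (N - i) * log(d)
}
log_taf <- log_rata(100000, 1.1, 0.9, 1)
hist(log_taf)  # no Inf/NaN values, so hist() works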

R small pvalues

I am calculating z-scores to see if a value is far from the mean/median of the distribution.
I had originally done it using the mean, then turned these into 2-sided p-values. But now, using the median, I noticed that there are some NAs in the p-values.
I determined this is occurring for values that are very far from the median, and it looks to be related to the pnorm calculation.
"
'qnorm' is based on Wichura's algorithm AS 241 which provides
precise results up to about 16 digits. "
Does anyone know a way around this, as I would like the very small p-values?
Thanks,
> z<- -12.5
> 2-2*pnorm(abs(z))
[1] 0
> z<- -10
> 2-2*pnorm(abs(z))
[1] 0
> z<- -8
> 2-2*pnorm(abs(z))
[1] 1.332268e-15
As an intermediate step, you are actually calculating p-values very close to 1:
options(digits=22)
z <- c(-12.5,-10,-8)
pnorm(abs(z))
# [1] 1.0000000000000000000000 1.0000000000000000000000 0.9999999999999993338662
2-2*pnorm(abs(z))
# [1] 0.000000000000000000000e+00 0.000000000000000000000e+00 1.332267629550187848508e-15
I think you will be better off using the low p-values (close to zero) but I am not good enough at math to know whether the error at close-to-one p-values is in the AS241 algorithm or the floating point storage. Look how nicely the low values show up:
pnorm(z)
# [1] 3.732564298877713761239e-36 7.619853024160526919908e-24 6.220960574271784860433e-16
Keep in mind that 1 - pnorm(x) is equivalent to pnorm(-x). So 2 - 2*pnorm(abs(x)) is equivalent to 2*(1 - pnorm(abs(x))), which is 2*pnorm(-abs(x)); just go with:
2 * pnorm(-abs(z))
# [1] 7.465128597755427522478e-36 1.523970604832105383982e-23 1.244192114854356972087e-15
which should get more precisely what you are looking for.
One thought: you would need an exp() with larger precision to convert back to the p-value scale, but you might be able to use log(p) to get slightly more precision in the tails; otherwise you are effectively at 0 for the non-log p-values in terms of the range that can be calculated:
> z<- -12.5
> pnorm(abs(z),log.p=T)
[1] -7.619853e-24
Converting back to the p value doesn't work well, but you could compare on log(p)...
> exp(pnorm(abs(z),log.p=T))
[1] 1
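To illustrate the "compare on log(p)" idea (my addition): the tail probabilities can be compared directly on the log scale without ever exponentiating:
z1 <- -12.5; z2 <- -10
# More negative log-probability means a smaller (more extreme) p-value.
pnorm(-abs(z1), log.p = TRUE)  # about -81.6
pnorm(-abs(z2), log.p = TRUE)  # about -53.2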
pnorm is the function that gives the probability P(Z <= x) for a given x. If you do not specify more arguments, the default distribution is Normal with mean 0 and standard deviation 1.
By symmetry, pnorm(a) = 1 - pnorm(-a).
In R, the upper-tail form loses precision: once pnorm(a) rounds to exactly 1, 1 - pnorm(a) returns 0, whereas pnorm(-a) computes the small tail directly with full accuracy. So using this symmetry and negative arguments you can calculate the needed values.
> pnorm(0.25)
[1] 0.5987063
> 1-pnorm(-0.25)
[1] 0.5987063
> pnorm(20)
[1] 1
> pnorm(-20)
[1] 2.753624e-89
