R rounding off numbers like 8.829847e-07

How can I round off a number like 0.0000234889 (or, in the form 8.829847e-07) to a power of ten, either below or above it (whichever I choose), i.e. here 0.00001 or 0.0001?
I tried round(...., digits=-100000) but it returns a NaN error.
For example, round(2e-07, digits=6) gives 0, while I would like 1e-06, and another function to give 1e-07.

# Is this what you're looking for?
# find the nearest power of ten for some number
x <- 0.0000234889 # Set test input value
y <- log10(x) # What is the fractional base ten logarithm?
yy <- round(y) # What is the nearest whole number base ten log?
xx <- 10 ^ yy # What integer power of ten is nearest the input?
print(xx)
# [1] 1e-05
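If you want to choose the direction rather than take the nearest power of ten, floor() and ceiling() of the base-ten logarithm give the power below or above, respectively (a small sketch extending the idea above):
x <- 2e-07              # value from the question
10 ^ floor(log10(x))    # power of ten at or below the input
# [1] 1e-07
10 ^ ceiling(log10(x))  # power of ten at or above the input
# [1] 1e-06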

The digits argument to round() counts decimal places (negative values round to tens, hundreds, and so on), so an extreme value like -100000 is not meaningful. If you want your number to show up in scientific notation with an exponent n, you can try
round(value, 10^n)
However, this will only get you what you want up to a point. For example, round(0.0000234889, 10^6) still prints as 2.34889e-05. (Notice that an exponent of 6 was specified but you got 5.)

Use options("scipen" = ...) like this:
num <- 0.0000234889
num
# [1] 2.34889e-05
options("scipen" = 10)
options()$scipen
# [1] 10
num
# [1] 0.0000234889
This changes the global option for the rest of the session. Read the documentation here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/options.html
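If you only need the fixed (non-scientific) form for a single value rather than a session-wide option, format() with scientific = FALSE is a per-call alternative; note it returns a character string (a small sketch, not part of the original answer):
num <- 0.0000234889
format(num, scientific = FALSE)  # returns a character string
# [1] "0.0000234889"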

Related

R: how to have gamma() return the actual number instead of Inf

Running gamma(200) returns Inf in R. Would it be possible to have R return the actual number somehow? It looks like anything above gamma(171.6) returns an Inf in R.
The issue is that you cannot represent it with double precision:
gamma(200) # too large value
#R> [1] Inf
lgamma(200) # but log is not
#R> [1] 857.9337
exp(857) # the issue!
#R> [1] Inf
.Machine$double.xmax # maximum double value
#R> [1] 1.797693e+308
gamma(171) # almost there!
#R> [1] 7.257416e+306
You can work with the log of the gamma function instead, using lgamma. Otherwise you will need a third-party library with higher precision than R's floating-point numbers.
A Google search suggests that the Rmpfr::igamma function might be what you want if you cannot work with the log of the gamma function:
Rmpfr::igamma(171, 0)
#R> 1 'mpfr' number of precision 53 bits
#R> [1] 7.257415615307999e+306
Rmpfr::igamma(200, 0)
#R> 1 'mpfr' number of precision 53 bits
#R> [1] 3.9432893368239526e+372
Using lgamma as proposed by Benjamin Cristoffersen, you can calculate the significand and the exponent (base 10) as individual variables:
(res <- gamma(100))
9.332622e+155
# Natural logarithm of result
(ln_res <- lgamma(100))
359.1342
# log base 10 of result
(log10_res <- ln_res/log(10))
155.97
# 10 raised to the fractional part of the number above gives the significand
(significand_res <- 10 ^ (log10_res %% 1))
9.332622
# the integer part gives the base-10 exponent
(exp_res <- log10_res %/% 1)
155
For gamma(200) this returns: 3.9432 * 10 ^ 372
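The same steps can be wrapped in a small helper that works even where gamma() itself overflows (my own sketch built on the lgamma trick above; the function name is illustrative):
# significand and base-10 exponent of gamma(n), computed via lgamma()
gamma_sci <- function(n) {
  log10_res <- lgamma(n) / log(10)          # log10 of gamma(n)
  c(significand = 10 ^ (log10_res %% 1),    # 10 raised to the fractional part
    exponent    = log10_res %/% 1)          # integer part
}
gamma_sci(200)
# significand    exponent
#    3.943289  372.000000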

Why am I getting NAs in this calculation in R?

While working on an Rcpp program, I used the sample() function, which gave me the following error: "NAs not allowed in probability." I traced this issue to the fact that the probability vector I used had NA values in it. I have no idea how. Below is some R code that captures the errors:
n.0=20
n.1=20
n.reps=1
beta0.vals=rep(seq(-.3,.1,,n.0),n.reps)
beta1.vals=rep(seq(-7,0,,n.1),n.reps)
beta.grd=as.matrix(expand.grid(beta0.vals,beta1.vals))
n.rnd=200
beta.rnd.grd=cbind(runif(n.rnd,min(beta0.vals),max(beta0.vals)),runif(n.rnd,min(beta1.vals),max(beta1.vals)))
beta.grd=rbind(beta.grd,beta.rnd.grd)
N = 22670
count = 0
for(i in 1:dim(beta.grd)[1]){ # iterate through 600 possible beta values in beta grid
  beta.ind = 0 # indicator for current pair of beta values
  for(j in 1:N){ # iterate through all possible Nsums
    logit = beta.grd[i,1]/N*(j - .1*N)^2 + beta.grd[i,2];
    phi01 = exp(logit)/(1 + exp(logit))
    if(is.na(phi01)){
      count = count + 1
    }
  }
}
cat("Total number of invalid probabilities: ", count)
Here, $\beta_0 \in (-0.3, 0.1), \beta_1 \in (-7, 0), N = 22670, N_\text{sum} \in (1, N)$. Note that $N$ and $N_\text{sum}$ are integers, whereas the beta values may not be.
Since mathematically $\phi_{01} \in (0,1)$, I'm assuming the NAs arise because R does not like extremely small values. I am also getting an overwhelming number of NA values, more NAs than actual numbers. Why would I be getting NAs in this code?
Add print(logit) next to count = count + 1 and you will find lots of logit values greater than 1000. exp(1000) evaluates to Inf, so you divide Inf by Inf, which gives NaN, and NaN counts as NA:
> exp(500)
[1] 1.403592e+217
> Inf/Inf
[1] NaN
> is.na(NaN)
[1] TRUE
So the problem is not numbers that are too small but numbers that are too large, coming from the evaluation of exp(x) for x larger than roughly 700:
> exp(709)
[1] 8.218407e+307
> exp(710)
[1] Inf
Bernhard's answer correctly identifies the problem:
If logit is large, exp(logit) = Inf.
Here is a solution:
count = 0 # reset the counter before re-running the loop
for(i in 1:dim(beta.grd)[1]){ # iterate through 600 possible beta values in beta grid
  beta.ind = 0 # indicator for current pair of beta values
  for(j in 1:N){ # iterate through all possible Nsums
    logit = beta.grd[i,1]/N*(j - .1*N)^2 + beta.grd[i,2];
    ## This one isn't great because exp(logit) can be very large
    # phi01 = exp(logit)/(1 + exp(logit))
    ## So, we say instead
    ## phi01 = 1 / ( 1 + exp(-logit) )
    phi01 = plogis(logit)
    if(is.na(phi01)){
      count = count + 1
    }
  }
}
cat("Total number of invalid probabilities: ", count)
# Total number of invalid probabilities: 0
We can use the more stable form 1 / (1 + exp(-logit)) (to convince yourself of this, multiply your expression by exp(-logit) / exp(-logit)). Luckily, either way, R has a built-in function, plogis(), that calculates these probabilities quickly and accurately.
You can see from the help file (?plogis) that this function evaluates the expression I gave, but you can also double check to assure yourself
x = rnorm(1000)
y = 1 / (1 + exp(-x))
z = plogis(x)
all.equal(y, z)
[1] TRUE
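To see why the rearranged form (and plogis()) is more stable, compare the two expressions at a large logit value, where the naive version overflows:
big <- 1000
exp(big) / (1 + exp(big))   # Inf / Inf
# [1] NaN
1 / (1 + exp(-big))         # exp(-1000) underflows harmlessly to 0
# [1] 1
plogis(big)
# [1] 1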

Where does a function equal a certain value

I have fitted a function to my data:
BCF.plot <- function(x) {
  vv[2] + ((vv[3] / (2 * (1 - vv[4]))) * (cos(x - vv[1]) - vv[4] + abs(cos(x - vv[1]) - vv[4])))
}
It is a baseline (b) cosine wave, i.e. a baseline with a cosine wave on top of it. Now I have a certain value on the Y-axis (dlmo_val) and I want to know which x value corresponds to it. I have tried something like this:
BCF.dlmo <- function(x, dlmo_val = 0) {
  ## find the point where the function minus baseline & dlmo_val is 0
  vv[2] + ((vv[3] / (2 * (1 - vv[4]))) * (cos(x - vv[1]) - vv[4] + abs(cos(x - vv[1]) - vv[4]))) - b - dlmo_val
}
vv = c(2.3971780, 2.0666526, 11.1775231, 0.7870128)
b = 2.066653
H = 11.17752
dlmo_val = 0.4*H ## dlmo*peak height above baseline, H is result from optimisation
uniroot(BCF.dlmo, c(0.2617994, 6.021386), dlmo_val=dlmo_val) ## lower & upper are min(x) and max(x)
However, uniroot tells me
"...values at end points not of opposite sign"
I am not really sure how to go about this. Any recommendations are more than welcome!
As described in this post, uniroot() is designed for finding only one zero in a function, while you have two zeroes. You could call it on a smaller interval:
uniroot(BCF.dlmo, c(0.2617994, 2.5), dlmo_val = dlmo_val)$root
# [1] 1.886079
As that post describes, you can instead use the uniroot.all function from the rootSolve package to find both zeroes:
library(rootSolve)
uniroot.all(BCF.dlmo, c(0.2617994, 6.021386), dlmo_val = dlmo_val)
# [1] 1.886084 2.908276
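As a quick sanity check (reusing vv, b, and dlmo_val from the question), plugging the reported roots back into BCF.dlmo should give values numerically close to zero:
roots <- uniroot.all(BCF.dlmo, c(0.2617994, 6.021386), dlmo_val = dlmo_val)
BCF.dlmo(roots, dlmo_val = dlmo_val)
# both values should be very close to 0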

How to construct a sequence with a pattern in R

I would like to construct a sequence with length 50 of the following type:
$X_{n+1} = 4 X_n (1 - X_n)$. For your information, this is the Logistic Map for $r = 4$. In the case of the Logistic Map with parameter $r = 4$ and an initial state in $(0, 1)$, the attractor is also the interval $(0, 1)$ and the probability measure corresponds to the beta distribution with parameters $a = 0.5$ and $b = 0.5$. (The Logistic Map is a polynomial mapping (equivalently, a recurrence relation) of degree 2, often cited as an archetypal example of how complex, chaotic behaviour can arise from very simple non-linear dynamical equations.) How can I do this in R?
There are some ready-to-use solutions on the net. I cite the general solution from mage's blog, where you can find a more detailed description.
logistic.map <- function(r, x, N, M){
  ## r: bifurcation parameter
  ## x: initial value
  ## N: number of iterations
  ## M: number of iteration points to be returned
  z <- 1:N
  z[1] <- x
  for(i in c(1:(N-1))){
    z[i+1] <- r * z[i] * (1 - z[i])
  }
  ## Return the values from iteration N-M to iteration N
  z[c((N-M):N)]
}
For the OP's example:
logistic.map(4,0.2,50,49)
This isn't really an R question, is it? More basic programming. Anyway, you probably need an accumulator and a value to process.
values <- 0.2 ## this accumulates as a vector, starting with 0.2
xn <- values ## xn gets the first value
for (it in 2:50) { ## start the loop from the second iteration
  xn <- 4 * xn * (1 - xn) ## perform the sequence function
  values <- c(values, xn) ## add the new value to the vector
}
values
# [1] 0.2000000000 0.6400000000 0.9216000000 0.2890137600 0.8219392261 0.5854205387 0.9708133262 0.1133392473 0.4019738493 0.9615634951 0.1478365599 0.5039236459
# [13] 0.9999384200 0.0002463048 0.0009849765 0.0039360251 0.0156821314 0.0617448085 0.2317295484 0.7121238592 0.8200138734 0.5903644834 0.9673370405 0.1263843622
# [25] 0.4416454208 0.9863789723 0.0537419811 0.2034151221 0.6481496409 0.9122067356 0.3203424285 0.8708926280 0.4497546341 0.9899016128 0.0399856390 0.1535471506
# [37] 0.5198816927 0.9984188732 0.0063145074 0.0250985376 0.0978744041 0.3531800204 0.9137755744 0.3151590962 0.8633353611 0.4719496615 0.9968527140 0.0125495222
# [49] 0.0495681269 0.1884445109
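If you prefer to avoid growing the vector inside the loop, Reduce() with accumulate = TRUE expresses the same recurrence in one call (just an idiomatic sketch of the approach above):
values <- Reduce(function(acc, i) 4 * acc * (1 - acc),
                 x = seq_len(49), init = 0.2, accumulate = TRUE)
length(values)
# [1] 50
head(values)
# [1] 0.2000000 0.6400000 0.9216000 0.2890138 0.8219392 0.5854205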

Mode of density function using optimize

I want to find the mode (x-value) of a univariate density function using R's optimize function.
E.g. for a normal density f(x) ~ N(3, 1), the mode should be the mean, i.e. x = 3.
I tried the following:
# Define the function
g <- function(x) dnorm(x = x, mean = 3, sd = 1)
Dvec <- c(-1000, 1000)
# First get the gradient of the function (grad() comes from the numDeriv package; not used below)
library(numDeriv)
gradfun <- function(x){ grad(g, x) }
# Find the maximum value
x_mode <- optimize(f = g, interval = Dvec, maximum = TRUE)
x_mode
This gives an incorrect value for the mode:
$maximum
[1] 999.9999
$objective
[1] 0
This is wrong: it returns (essentially) the upper end of the (-1000, 1000) interval instead of x = 3.
Could anyone please help fix the optimisation code?
It will be used with more generic functions of x once this simple test case works.
I would use optim for this, without specifying an interval. You can tailor the starting value by taking the maximum of the function over the originally guessed interval:
guessedInterval = min(Dvec):max(Dvec)
superStarSeed = guessedInterval[which.max(g(guessedInterval))]
optim(par=superStarSeed, fn=function(y) -g(y))
#$par
#[1] 3
#$value
#[1] -0.3989423
#$counts
#function gradient
# 24 NA
#$convergence
#[1] 0
#$message
#NULL
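Alternatively, optimize() itself works once the interval is narrow enough that the density is not numerically zero over most of it; the original call fails because dnorm(x, 3, 1) is flat (essentially 0) almost everywhere on (-1000, 1000). A quick check, keeping the question's g():
optimize(f = g, interval = c(-10, 10), maximum = TRUE)
# $maximum
# [1] 3           (up to the default tolerance)
#
# $objective
# [1] 0.3989423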
