Round numbers in output to show small near-zero quantities as zero - r

I would like the output of my R console to look readable. To this end, I would like R to round all my numbers to the nearest N decimal places. I have some success but it doesn't work completely:
> options(scipen=100, digits=4)
> .000000001
[1] 0.000000001
> .1
[1] 0.1
> 1.23123123123
[1] 1.231
I would like the 0.000000001 to be displayed as simply 0. How does one do this? Let me be more specific: I would like a global fix for the entire R session. I realize I can start modifying things by rounding them but it's less helpful than simply setting things for the entire session.

Look at ?options, specifically the digits and scipen options.

try
sprintf("%.4f", 0.00000001)
[1] "0.0000"

Combine what Greg Snow and Ricardo Saporta already gave you to get the right answer: options('scipen'=+20) and options('digits'=2) , combined with round(x,4) .
round(x,4) will round small near-zero quantities.
Either you round off the results of your regression once and store it:
x <- round(x, 4)
... or else yes, you have to do that every time you display the small quantity, if you don't want to store its rounded value. In your case, since you said small near-zero quantities effectively represent zero, why don't you just round it?
If for some reason you need to keep both the precise and the rounded versions, then do.
x.rounded <- round(x, 4)

Related

Is there a way to handle calculations invovling exponential of big values in R?

I have looked a bit online and in the site but I did not find any solution. My problem is relatively simple so if you could point me to a possible solution, much appreciated.
test_vec <- c(2,8,709,600)
mean(exp(test_vec))
test_vec_bis <- c(2,8,710,600)
mean(exp(test_vec_bis))
exp(709)
exp(710)
# The numerical limit of R is at exp(709)
How can I calculate the mean of my vector and deal with the Inf values knowing that R could probably handle the mean value but not all values in the numerator of the mean calculation ?
There is an edge case where you can solve your problem by simply restating your problem mathematically, but that would require that the length of your vector is extremely large and/or that your large exp. numbers are close to the numeric limit:
Since the mean sum(x)/n can be written as sum(x/n) and since exp(x)/exp(y) = exp(x-y), you can calculate sum(exp(x-log(n))), which gives you a relief of log(n).
mean(exp(test_vec))
[1] 2.054602e+307
sum(exp(test_vec - log(length(test_vec))))
[1] 2.054602e+307
sum(exp(test_vec_bis - log(length(test_vec_bis))))
[1] 5.584987e+307
While this works for your example, most likely this won't work for your real vector.
In this case, you will have to consult packages like Rmpfr as suggested by #fra.
Here's one way where you qualify to only select those in your test_vec that give an answer < Inf:
mean(exp(test_vec)[which(exp(test_vec) < Inf)])
[1] 1.257673e+260
t2 <- c(2,8,600)
mean(exp(t2))
[1] 1.257673e+260
This assumes you were looking to exclude values that result in Inf, of course.

R: force format(scientific=FALSE) not to round

I have been playing around with this command for a while and cannot seem to make it work the way I would like it to. I would like format to give me the full list of numbers as a text without any rounding even when the whole number portion is large. For example:
format(2290000000000000000.000081 , scientific=FALSE)
[1] "2290000000000000000"
While what I want returned is:
"2290000000000000000.000081"
As noted, you can't store that number exactly using double precision. You'll need to use multiple-precision floating point numbers.
library(Rmpfr)
mpfr("2290000000000000000.000081", precBits=85)
## 1 'mpfr' number of precision 85 bits
## [1] 2290000000000000000.000081

R spline function given a fixed space

So, I need to generate a spline function to feed it into another program which only accepts a fixed space between consecutive points. So, I used spline function in R with a given number of points to genrate spline, however, the floating-point cutoff makes the space among the points variable, for example:
spline(d$V1, d$V2, n=(max(d$V1)-min(d$V1))/0.0200)
> head(t.spl, 7)
x y
1 2.3000 -3.0204
2 2.3202 -3.0204
3 2.3404 -3.0204
4 2.3606 -3.0204
5 2.3807 -3.0204
6 2.4009 -3.0204
7 2.4211 -3.0204
so, the space between 1st 1nd 2nd row is 0.0202, while between 4th and 5th is 0.0201. So because of this problem, the other program that I am feeding this spline into, doesn't accept this. So, is there any way to make this work?
As an aside: please provide a reproducible example next time (I can't copy/paste your code in because I don't have d or t.spl)
I think you'll find that the different intervals (0.0202 vs 0.0201) is an artifact of the number of characters you are printing on the screen, not of the spline function.
It seems R is printing 4 digits after the decimal point for you for neatness, so it's doing the rounding only for the purposes of displaying the results to you.
You can see how many digits are displayed with options('digits')$digits, and adjust it with options(digits=new_number_of_digits) (see ?options for details).
For example:
options(digits=4)
pi
# 3.142
options(digits=10)
pi
# 3.141592654
In summary, when you feed the values in to your other program, make sure you print the values with enough decimal points that the other program accepts the intervals as being "equal".
If you are writing to a file, for example, just make sure you write enough digits out. If you are copy-pasting from the R console, make sure you adjust R to print out enough digits.
MathematicalCoffee is probably right. I'm just adding an alternative for the sake of wordiness.
myspline <- splinefun(dV$1,dV$2)
mydata.y <- myspline(desired_x_values,deriv=0)
Will guarantee the uniform x-spacings you desire.

acos(1) returns NaN for some values, not others

I have a list of latitude and longitude values, and I'm trying to find the distance between them. Using a standard great circle method, I need to find:
acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1))
And multiply this by the radius of earth, in the units I am using. This is valid as long as the values we take the acos of are in the range [-1,1]. If they are even slightly outside of this range, it will return NaN, even if the difference is due to rounding.
The issue I have is that sometimes, when two lat/long values are identical, this gives me an NaN error. Not always, even for the same pair of numbers, but always the same ones in a list. For instance, I have a person stopped on a road in the desert:
Time |lat |long
1:00PM|35.08646|-117.5023
1:01PM|35.08646|-117.5023
1:02PM|35.08646|-117.5023
1:03PM|35.08646|-117.5023
1:04PM|35.08646|-117.5023
When I calculate the distance between the consecutive points, the third value, for instance, will always be NaN, even though the others are not. This seems to be a weird bug with R rounding.
Can't tell exactly without seeing your data (try dput), but this is mostly likely a consequence of FAQ 7.31.
(x1 <- 1)
## [1] 1
(x2 <- 1+1e-16)
## [1] 1
(x3 <- 1+1e-8)
## [1] 1
acos(x1)
## [1] 0
acos(x2)
## [1] 0
acos(x3)
## [1] NaN
That is, even if your values are so similar that their printed representations are the same, they may still differ: some will be within .Machine$double.eps and others won't ...
One way to make sure the input values are bounded by [-1,1] is to use pmax and pmin: acos(pmin(pmax(x,-1.0),1.0))
A simple workaround is to use pmin(), like this:
acos(pmin(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1),1))
It now ensures that the precision loss leads to a value no higher than exactly 1.
This doesn't explain what is happening, however.
(Edit: Matthew Lundberg pointed out I need to use pmin to get it tow work with vectorized inputs. This fixes the problem with getting it to work, but I'm still not sure why it is rounding incorrectly.)
I just encountered this. This is caused by input larger than 1. Due to the computational error, my inner product between unit norms becomes a bit larger than 1 (like 1+0.00001). And acos() can only deal with [-1,1]. So, we can clamp the upper bound to exactly 1 to solve the problem.
For numpy: np.clip(your_input, -1, 1)
For Pytorch: torch.clamp(your_input, -1, 1)

How to make "pretty rounding"?

I need to do a rounding like this and convert it as a character:
as.character(round(5.9999,2))
I expect it to become 6.00, but it just gives me 6
Is there anyway that I can make it show 6.00?
Try either one of these:
> sprintf("%3.2f", round(5.9999, digits=2))
[1] "6.00
> sprintf("%3.2f", 5.999) # no round needed either
[1] "6.00
There are also formatC() and prettyNum().
To help explain what's going on - the round(5.9999, 2) call is rounding your number to the nearest hundredths place, which gives you the number (not string) very close to (or exactly equal to, if you get lucky with floating-point representations) 6.00. Then as.character() looks at that number, takes up to 15 significant digits of it (see ?as.character) in order to represent it to sufficient accuracy, and determines that only 1 significant digit is necessary. So that's what you get.
As Dirk indicated, formatC() is another option.
formatC(x = 5.999, digits = 2, format = 'f')

Resources