R (and python) rounding up starting at 5 or 6 [duplicate] - r

This question already has answers here:
Round up from .5
(7 answers)
Closed 4 years ago.
I saw already a question with very large number of decimal digits R rounding explanation.
round(62.495, digits=2)
gives me 62.49. I would expect already 62.5, but it seems, R (3.4.3, 3.5.0) rounds up only starting at 6, e.g.,
round(62.485, 2) == 62.48
round(62.486, 2) == 62.49.
For other reasons, I am using the option
options(digits.secs=6)
From what I have learnt, one rounds up starting at 5. I tested also with Python and Matlab. Matlab rounds up, Python 3.5.4 down.
How can I change the behaviour or is this definition different, e.g. between Europe and US?

This is a floating point representation issue, 62.495 is actually represented by a slightly smaller number which then gets rounded downwards.
print(62.495,digits=22)
[1] 62.49499999999999744205
R's rounding is statistical rounding, or round half to even. It should round halves up or down to an even number, eg
round(0.5) # rounds the half down to 0
[1] 0
round(1.5) # rounds the half up to 2
[1] 2

Related

R factorial() is not yielding the correct result [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 3 days ago.
I am trying to calculate the factorial of 52 in R. Astonishingly, I am getting contradicting results.
aaa<-1*2*3*4*5*6*7*8*9*10*11*12*13*14*15*16*17*18*19*20*21*22*23*24*25*26*27*28*29*30*31*32*33*34*35*36*37*38*39*40*41*42*43*44*45*46*47*48*49*50*51*52
bbb<-factorial(52)
aaa
[1] 80658175170943876845634591553351679477960544579306048386139594686464
bbb
[1] 80658175170944942408940349866698506766127860028660283290685487972352
aaa==bbb #False
What am I doing wrong?
This is a well known problem in computing with large numbers; R uses double-precision floating-point, the precision of which may vary by machine. Thats why you are getting multiple results across methods (including the online calculator in your comments). If you want to change your precision (in bits), one option is to use the Rmpfr package:
Rmpfr::mpfr(factorial(52),
6) # six bits
#1 'mpfr' number of precision 6 bits
#[1] 8.09e+67
Rmpfr::mpfr(factorial(52),
8) # eight bits
#1 'mpfr' number of precision 8 bits
#[1] 8.046e+67
This will allow you to obtain a result with the same value:
x <-Rmpfr::mpfr(1*2*3*4*5*6*7*8*9*10*11*12*13*14*15*16*17*18*19*20*21*22*23*24*25*26*27*28*29*30*31*32*33*34*35*36*37*38*39*40*41*42*43*44*45*46*47*48*49*50*51*52,
8)
y <- Rmpfr::mpfr(factorial(52),
8)
x == y
#[1] TRUE

Round numbers in R correctly [duplicate]

This question already has answers here:
Round up from .5
(7 answers)
Closed 1 year ago.
There are a number of threads about this question. None seems to answer the simple question: why does R round incorrectly and how can I let it round correctly?
Correct rounding to the i-th decimal x considers the i+1-th decimal. If it is 5 or larger then x is is set to x+1. If it is 4 or smaller then x is returned. For example 1.45 is rounded to the first decimal as 1.5. 1.44 is rounded 1.4. However, in R
> round(1.45,1)
[1] 1.4
But
> round(1.46,1)
[1] 1.5
So it changes the convention to 'if the i+1th decimal is 6 or larger, then x is set to x+1'. Why? And how can I change this to the convention I am familiar with?
Most decimal fractions are not exactly representable in binary double precision
Learned here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/Round.html
Section "Warnings":
Rounding to decimal digits in binary arithmetic is non-trivial (when digits != 0) and may be surprising. Be aware that most decimal fractions are not exactly representable in binary double precision. In R 4.0.0, the algorithm for round(x, d), for d > 0, has been improved to measure and round “to nearest even”, contrary to earlier versions of R (or also to sprintf() or format() based rounding).

R fails to round 126.5 [duplicate]

This question already has answers here:
Round up from .5
(7 answers)
Sometimes rounding number are not consistent [duplicate]
(2 answers)
Closed 3 years ago.
R fails to round the number "126.5". I discovered this by accident.
round(125.5) # = 126, correct
round(126.5) # = 126, wrong
round(127.5) # = 128, correct
I expect that the output of round(126.5) to be 127, but the actual output is 126. R rounds other numbers correctly (see above). Does anybody know what the problem is and how can I fix it?
From documentation ?round -
Note that for rounding off a 5, the IEC 60559 standard is expected to
be used, ‘go to the even digit’. Therefore round(0.5) is 0 and
round(-1.5) is -2. However, this is dependent on OS services and on
representation error (since e.g. 0.15 is not represented exactly, the
rounding rule applies to the represented number and not to the printed
number, and so round(0.15, 1) could be either 0.1 or 0.2).

Is there an error in round function in R? [duplicate]

This question already has answers here:
Round up from .5
(7 answers)
Closed 6 years ago.
It seems there is an error in round function. Below I would expect it to return 6, but it returns 5.
round(5.5)
# 5
Other then 5.5, such as 6.5, 4.5 returns 7, 5 as we expect.
Any explanation?
This behaviour is explained in the help file of the ?round function:
Note that for rounding off a 5, the IEC 60559 standard is expected to
be used, ‘go to the even digit’. Therefore round(0.5) is 0 and
round(-1.5) is -2. However, this is dependent on OS services and on
representation error (since e.g. 0.15 is not represented exactly, the
rounding rule applies to the represented number and not to the printed
number, and so round(0.15, 1) could be either 0.1 or 0.2).
round( .5 + 0:10 )
#### [1] 0 2 2 4 4 6 6 8 8 10 10
Another relevant email exchange by Greg Snow: R: round(1.5) = round(2.5) = 2?:
The logic behind the round to even rule is that we are trying to
represent an underlying continuous value and if x comes from a truly
continuous distribution, then the probability that x==2.5 is 0 and the
2.5 was probably already rounded once from any values between 2.45 and 2.54999999999999..., if we use the round up on 0.5 rule that we learned in grade school, then the double rounding means that values
between 2.45 and 2.50 will all round to 3 (having been rounded first
to 2.5). This will tend to bias estimates upwards. To remove the
bias we need to either go back to before the rounding to 2.5 (which is
often impossible to impractical), or just round up half the time and
round down half the time (or better would be to round proportional to
how likely we are to see values below or above 2.5 rounded to 2.5, but
that will be close to 50/50 for most underlying distributions). The
stochastic approach would be to have the round function randomly
choose which way to round, but deterministic types are not
comforatable with that, so "round to even" was chosen (round to odd
should work about the same) as a consistent rule that rounds up and
down about 50/50.
If you are dealing with data where 2.5 is likely to represent an exact
value (money for example), then you may do better by multiplying all
values by 10 or 100 and working in integers, then converting back only
for the final printing. Note that 2.50000001 rounds to 3, so if you
keep more digits of accuracy until the final printing, then rounding
will go in the expected direction, or you can add 0.000000001 (or
other small number) to your values just before rounding, but that can
bias your estimates upwards.
When I was in college, a professor of Numerical Analysis told us that the way you describe for rounding numbers is the correct one. You shouldn't always round up the number (integer).5, because it is equally distant from the (integer) and the (integer + 1). In order to minimize the error of the sum (or the error of the average, or whatever), half of those situations should be rounded up and the other half should be rounded down. The R programmers seem to share the same opinion as my professor of Numerical Analysis...

Logic regarding summation of decimals [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 8 years ago.
Does the last statement in this series of statements make logical sense to anybody else? R seems to give similar results for a small subset of possible sums of decimals under 1. I cannot recall any basic mathematical principles that would make this true, but it seems to be unlikely to be an error.
> 0.4+0.6
[1] 1
> 0.4+0.6==1.0
[1] TRUE
> 0.3+0.6
[1] 0.9
> 0.3+0.6==0.9
[1] FALSE
Try typing 0.3+0.6-0.9, on my system the result is -1.110223e-16 this is because the computer doesn't actually sum them as decimal numbers, it stores binary approximations, and sums those. And none of those numbers can be exactly represented in binary, so there is a small amount of error present in the calculations, and apparently it's small enough not to matter in the first one, but not the second.
Floating point arithmetic is not exact, but the == operator is. Use all.equal to compare two floating point values in R.
isTRUE(all.equal(0.3+0.6, 0.9))
You can also define a tolerance when calling all.equals.
isTRUE(all.equal(0.3+0.6, 0.9, tolerance = 0.001))

Resources