How to properly compare numbers in scientific notation using R? [duplicate] - r

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 1 year ago.
I was reading the following tutorial for testing proportions in two populations. After running
prop.test(x=c(342,290), n=c(400,400))
I received a p-value of 9.558674e-06, which the tutorial says is greater than the alpha level of .05. I assumed this was a typo, and was just comparing the p-value to its value in decimal notation, 0.000009558674, but received "False". I even turned off scientific notation using
options(scipen=999)
and when printing the p-value from the object returned by prop.test, I still get FALSE when comparing the p-value to 0.000009558674 for equality; R reports the p-value as less than that number. Why is this the case?

You may want to consider using the all.equal() function. The tolerance between the values can be set with the tolerance argument.
isTRUE(all.equal(2, 2.00000001))
## [1] TRUE
isTRUE(all.equal(2, 2.00000001, tolerance = 0.0000000001))
## [1] FALSE
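The comparison in the question most likely fails because print() shows only about 7 significant digits by default, so the number typed back in is a rounded copy of the stored value. A minimal sketch of that effect follows; `p` is a made-up stand-in for the real prop.test() result, and its digits beyond the seventh are an assumption, not the actual output:

```r
# Hypothetical stored p-value; the trailing digits are invented for illustration.
p <- 9.5586738123e-06

print(p)               # default printing rounds to about 7 significant digits
print(p, digits = 15)  # shows more of the stored value

# Exact comparison against the rounded, printed number fails ...
exact_match <- p == 9.558674e-06

# ... while a comparison with an explicit tolerance succeeds.
close_enough <- isTRUE(all.equal(p, 9.558674e-06, tolerance = 1e-6))

# The comparison that matters for the test needs no equality check at all.
below_alpha <- p < 0.05
```

In practice it is usually simpler to keep the p-value in a variable and compare it to alpha with `<`, rather than retyping printed digits.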

Related

Sum function in R not giving the expected answer [duplicate]

This question already has answers here:
How can I disable scientific notation?
(4 answers)
Why are these numbers not equal?
(6 answers)
Closed 3 years ago.
I have just started with R and am stuck on this bug.
fuel_efficiency<-c(28.2, 28.3, 28.4, 28.5, 29.0)
mean=28.48
deviation<-(fuel_efficiency-mean)
deviation
sum(deviation)
I have written this code to subtract the mean from the elements of the fuel efficiency vector to get the deviation vector. Then I am trying to get the sum of the resulting deviation vector.
The sum answer should return 0 but instead gives something like -3.552714e-15
The deviation vector is printed properly as expected.
#[1] -0.28 -0.18 -0.08 0.02 0.52
These are just rounding errors; -3.552714e-15 is a very, very small number.
Computers store numbers in binary floating point, so most decimal fractions (such as 28.2 or 28.48) cannot be represented exactly. To work around this, R provides a function that checks for near-equality:
all.equal(sum(deviation), 0)
This returns:
[1] TRUE
format(sum(deviation), scientific = FALSE)
Your answer is very close to zero. It's a rounding error. R uses IEEE 754 double-precision floating point numbers. Read more here: https://en.wikipedia.org/wiki/Double-precision_floating-point_format
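A short sketch of how the sum from the question could be checked against zero with a tolerance. The names `centre` and `total` are illustrative and not from the original code, and the 1e-9 threshold is an arbitrary assumption:

```r
fuel_efficiency <- c(28.2, 28.3, 28.4, 28.5, 29.0)
centre <- 28.48                      # illustrative name; avoids shadowing mean()
deviation <- fuel_efficiency - centre
total <- sum(deviation)

# Show the tiny residual without scientific notation.
format(total, scientific = FALSE)

# Two ways to treat the residual as zero.
near_zero_all_equal <- isTRUE(all.equal(total, 0))
near_zero_manual    <- abs(total) < 1e-9   # assumed threshold, pick one that suits the data
```

Wrapping all.equal() in isTRUE() matters inside if() statements, because all.equal() returns a character description rather than FALSE when the values differ.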

Sum row that does not give exactly zero, when it should [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 5 years ago.
I am doing a simple row sum. Two rows give me 0 (which is the value they should give), but the last one gives a tiny nonzero value rather than exactly zero.
# Generate the row values; each row should sum to zero.
d<-0.8
c<-1-d
a<-0.5
b<-0.5
e<-0.2
f<-1-e
Perc<-c(-1, a,b,c,-1,d,e,f,-1)
# Put them in a 3x3 matrix
div<-matrix(ncol = 3, byrow = TRUE,Perc)
# Do the row sum
rowSums(div)
# RESULT
[1] 0.000000e+00 0.000000e+00 5.551115e-17
rowSums(div)[3]==0
[1] FALSE
I am using this version of R: version 3.4.1 (2017-06-30) -- "Single Candle"
Any idea why this happens, and how I can fix it?
This happens because computers can't store most decimal fractions exactly in binary floating point, so some results carry a small error.
The fix here is to use the all.equal function, which compares two numbers up to a small tolerance instead of requiring exact equality.
all.equal(sum(div[3, ]), 0)
[1] TRUE
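To check every row at once rather than only the third, a tolerance can be applied to the whole vector of row sums. This is a minimal sketch; the names `row1`, `row2`, `row3` and `tol` are illustrative, and the threshold is assumed to match all.equal()'s default numeric tolerance:

```r
# Rebuild the matrix from the question (variable names here are illustrative).
row1 <- c(-1, 0.5, 0.5)
row2 <- c(1 - 0.8, -1, 0.8)
row3 <- c(0.2, 1 - 0.2, -1)
div <- matrix(c(row1, row2, row3), ncol = 3, byrow = TRUE)

tol <- sqrt(.Machine$double.eps)   # about 1.5e-8
sums <- rowSums(div)

# Element-wise check: one logical value per row.
rows_are_zero <- abs(sums) < tol

# Single yes/no answer for the whole vector.
all_zero <- isTRUE(all.equal(sums, rep(0, 3)))
```

The element-wise form is handy for indexing, for example `div[rows_are_zero, ]`.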

Logic regarding summation of decimals [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 8 years ago.
Does the last statement in this series of statements make logical sense to anybody else? R seems to give similar results for a small subset of possible sums of decimals under 1. I cannot recall any basic mathematical principles that would make this true, but it seems to be unlikely to be an error.
> 0.4+0.6
[1] 1
> 0.4+0.6==1.0
[1] TRUE
> 0.3+0.6
[1] 0.9
> 0.3+0.6==0.9
[1] FALSE
Try typing 0.3+0.6-0.9. On my system the result is -1.110223e-16. This is because the computer doesn't actually sum them as decimal numbers; it stores binary approximations and sums those. None of those numbers can be represented exactly in binary, so there is a small amount of error in the calculations. Apparently that error is small enough not to matter in the first comparison, but not in the second.
Floating point arithmetic is not exact, but the == operator is. Use all.equal to compare two floating point values in R.
isTRUE(all.equal(0.3+0.6, 0.9))
You can also define a tolerance when calling all.equal.
isTRUE(all.equal(0.3+0.6, 0.9, tolerance = 0.001))
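The binary approximations can be made visible by printing more digits than R shows by default. A small sketch, where `lhs` and `rhs` are illustrative names:

```r
lhs <- 0.3 + 0.6
rhs <- 0.9

# Print 20 decimal places to expose the stored approximations.
sprintf("%.20f", lhs)
sprintf("%.20f", rhs)

exact  <- lhs == rhs                     # exact comparison of the two doubles
approx <- isTRUE(all.equal(lhs, rhs))    # comparison within the default tolerance

# In the first case from the question the rounding happens to land on exactly 1.
first_case <- (0.4 + 0.6) == 1
```

Whether a particular sum compares equal under `==` depends on how the individual rounding errors combine, which is why the two cases in the question behave differently.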

Why is this easy comparison false? [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 8 years ago.
Why does this simple statement evaluate to FALSE in R?
mean(c(7.18, 7.13)) == 7.155
Furthermore, what do I have to do in order to make this a TRUE statement? Thanks!
Floating point arithmetic is not exact. The answer to this question has more information.
You can actually see this:
> print(mean(c(7.18,7.13)), digits=16)
[1] 7.154999999999999
> print(7.155, digits=16)
[1] 7.155
In general, do not compare floating point numbers for equality (this applies to pretty much every programming language, not just R).
You can use all.equal to do an inexact comparison:
> all.equal(mean(c(7.18,7.13)), 7.155)
[1] TRUE
It's probably due to small rounding error. Rounding to the third decimal place shows that they are equal:
round(mean(c(7.18, 7.13)), 3) == 7.155
Generally, don't rely on exact equality comparisons of floating point numbers to give the logical result you expect :)
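If an element-wise comparison is needed (all.equal() gives one answer for the whole input), a small helper can wrap the tolerance check. This is a sketch only: `is_close` is a hypothetical name, not a base R function, and it uses an absolute difference with an assumed default tolerance:

```r
# Hypothetical helper: TRUE where x and y differ by less than tol (absolute difference).
is_close <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  abs(x - y) < tol
}

single <- is_close(mean(c(7.18, 7.13)), 7.155)

# Works on vectors too, returning one logical value per element.
several <- is_close(c(0.1 + 0.2, 1, 2), c(0.3, 1, 2.1))
```

An absolute tolerance suits values of moderate size; for very large or very small magnitudes a relative tolerance, as all.equal() uses by default, may be the better choice.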

Why did I obtain wrong answer when using "=="? [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 9 years ago.
If I type:
x<-seq(0,20,.05)
x[30]
x[30]==1.45
Why do I obtain a False from the last line of code? What did I do wrong here?
This question has been asked a million times, albeit in different forms. This is due to floating point inaccuracy. Also here's another link on floating point errors you may want to catch up on!
Try this to first see what's going on:
x <- seq(0, 20, 0.5)
sprintf("%.20f", x[30]) # convert value to string with 20 decimal places
# [1] "14.50000000000000000000"
x[30] == 14.5
# [1] TRUE
All is well so far. Now, try this:
x <- seq(0, 20, 0.05)
sprintf("%.20f", x[30]) # convert value to string with 20 decimal places
# [1] "1.45000000000000017764"
x[30] == 1.45
# [1] FALSE
You can see that the machine can represent this number accurately only up to a certain number of digits; here, up to 15 digits or so. So a direct comparison of course gives FALSE. Instead, you could use all.equal, whose tolerance parameter defaults to .Machine$double.eps ^ 0.5. On my machine this evaluates to 1.490116e-08. This means that if the difference between x[30] and 1.45 (measured relative to the target value by default) is below this threshold, all.equal evaluates to TRUE.
all.equal(x[30], 1.45)
[1] TRUE
Another way of doing this is to check explicitly against a specific threshold (as @eddi's answer shows).
This has to do with the fact that these are doubles, and the usual way of comparing doubles in any language is to do something like:
abs(x[30] - 1.45) < 1e-8 # or whatever precision you think is appropriate
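The same threshold idea also helps when looking up a value in the sequence, where `==` would find nothing. A minimal sketch, with `tol` and `target` as illustrative names and 1e-8 as an assumed precision:

```r
x <- seq(0, 20, 0.05)
target <- 1.45
tol <- 1e-8   # assumed precision; should be much smaller than the step size of 0.05

# Exact matching misses the element because of the rounding error in seq().
exact_hits <- which(x == target)

# Matching within a tolerance finds it.
close_hits <- which(abs(x - target) < tol)

# The default all.equal() tolerance agrees.
same <- isTRUE(all.equal(x[30], target))
```

The tolerance should stay well below the spacing between neighbouring values, otherwise more than one element can match.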
