How come as.character(1) == as.numeric(1) is TRUE? [duplicate] - r

Just like the title says, why is "1" == 1 TRUE? What is the real reason behind this? Is R trying to be kind, or is this something else? I was thinking that since "1" (or any number, it really doesn't matter) is read by R as a character, the comparison would automatically return FALSE when compared with as.numeric(1) or as.integer(1).
> as.character(1) == as.numeric(1)
[1] TRUE
or
> "1" == 1
[1] TRUE
I guess it is a simple question but I'd like to get an answer. Thank you.

According to ?==
For numerical and complex values, remember == and != do not allow for the finite representation of fractions, nor for rounding error. Using all.equal with identical is almost always preferable.
In another paragraph, it is also written
x, y
atomic vectors, symbols, calls, or other objects for which methods have been written. If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
identical(as.character(1), as.numeric(1))
#[1] FALSE
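A minimal sketch of what the quoted coercion rule implies. The "1.0" case is an illustrative addition, not taken from the documentation: the numeric side is turned into a string, so the comparison is really between two strings.

```r
# The numeric argument is converted to character before comparing,
# so "1" == 1 is effectively "1" == as.character(1).
as.character(1)      # "1"
"1" == 1             # TRUE
"1.0" == 1           # FALSE: the strings "1.0" and "1" differ
identical("1", 1)    # FALSE: identical() does not coerce its arguments
```

This is why identical() is the safer check when the types are not supposed to differ.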

Related

How does R compare version strings with the inequality operators?

Could someone explain this behavior in R?
> '3.0.1' < '3.0.2'
[1] TRUE
> '3.0.1' > '3.0.2'
[1] FALSE
What process is R doing to make the comparison?
It's making a lexicographic comparison in this case, as opposed to converting to numeric, as calling as.numeric('3.0.1') returns NA.
The logic here would be something like, "the strings '3.0.1' and '3.0.2' are equivalent until their final characters, and since 1 precedes 2 in an alphanumeric alphabet, '3.0.1' is less than '3.0.2'." You can test this with some toy examples:
'a' < 'b' # TRUE
'ab' < 'ac' # TRUE
'ab0' < 'ab1' # TRUE
Per the note in the manual in the post that @rawr linked in the comments, this will get hairy in different locales, where the alphanumeric alphabet may be sorted differently.
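One consequence of the character-by-character comparison is that it stops matching version order once a component has two digits. A small sketch, assuming the goal is a real version comparison (the version strings are made up for illustration); package_version() and utils::compareVersion() compare the components as integers:

```r
# Lexicographic comparison goes character by character,
# so "3.0.10" sorts before "3.0.9" because "1" precedes "9".
"3.0.10" < "3.0.9"                                     # TRUE

# package_version() compares the numeric components instead.
package_version("3.0.10") > package_version("3.0.9")   # TRUE

# utils::compareVersion() returns -1, 0 or 1.
utils::compareVersion("3.0.10", "3.0.9")               # 1
```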

Comparing integers with characters in R [duplicate]

It appears that as.character() of a number is still a number, which I find counterintuitive. Consider this example:
1 > "2"
[1] FALSE
2 > "1"
[1] TRUE
Even if I try to use as.character() or paste()
as.character(2)
[1] "2"
as.character(2) > 1
[1] TRUE
as.character(2) < 1
[1] FALSE
Why is that? Can't I have R return an error when I am comparing numbers with strings?
As explained in the comments, the problem is that the numeric 1 is coerced to character.
The < operator still works for characters: one string is smaller than another if it comes first in alphabetical order.
> "a" < "b"
[1] TRUE
> "z" < "b"
[1] FALSE
So in your case as.character(2) > 1 is transformed to as.character(2) > as.character(1), and because of the "alphabetical" order of digits, TRUE is returned.
To prevent this you would have to check for the class of an object manually.
The documentation of ?Comparison states that
If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
So in your case the number is automatically coerced to string and the comparison is made based on the respective collation.
In order to prevent it, the only option I know of is to manually compare the class first.
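A manual check along those lines could look like the sketch below. strict_gt() is a hypothetical helper written for this example, not a base R function; it simply refuses to compare a character vector with a non-character one.

```r
# strict_gt() is a hypothetical helper, not part of base R.
# It stops when exactly one argument is character, so the silent
# coercion of the other argument to a string never happens.
strict_gt <- function(x, y) {
  if (is.character(x) != is.character(y)) {
    stop("refusing to compare ", class(x)[1], " with ", class(y)[1])
  }
  x > y
}

strict_gt(2, 1)    # TRUE

# Capture the error message instead of aborting the script.
result <- tryCatch(strict_gt("2", 1), error = function(e) conditionMessage(e))
result             # "refusing to compare character with numeric"
```

The same pattern works for the other comparison operators; only the last line of the helper changes.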

Why does "one" < 2 equal FALSE in R?

I'm reading Hadley Wickham's Advanced R section on coercion, and I can't understand the result of this comparison:
"one" < 2
# [1] FALSE
I'm assuming that R coerces 2 to a character, but I don't understand why R returns FALSE instead of returning an error. This is especially puzzling to me since
-1 < "one"
# TRUE
So my question is two-fold: first, why this answer, and second, is there a way of seeing how R converts the individual elements within a logical vector like these examples?
From help("<"):
If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of precedence
being character, complex, numeric, integer, logical and raw.
So in this case, the numeric is of lower precedence than the character. So 2 is coerced to the character "2". Comparison of strings in character vectors is lexicographic which, as I understand it, is alphabetic but locale-dependent.
It coerces 2 into a character, then does an alphabetical comparison. Numeric characters are assumed to come before alphabetical ones.
To get a general idea of the behavior, try:
'a'<'1'
'1'<'.'
'b'<'B'
'a'<'B'
'A'<'B'
'C'<'B'
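On the second part of the question (seeing what the values are converted to), one option is to combine them with c(), which promotes mixed atomic types to character in the same way. This is only a sketch; sorting with method = "radix" is used here because it follows the C locale rather than the session locale, so the order shown does not depend on the machine.

```r
# c() promotes mixed atomic types to the highest type present,
# so it shows what the number turns into before the comparison.
c("one", 2)                                   # "one" "2"
c(-1, "one")                                  # "-1"  "one"

# method = "radix" sorts in the C locale, independent of the session locale.
sort(c("one", "2", "-1"), method = "radix")   # "-1" "2" "one"
```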

Logic regarding summation of decimals [duplicate]

Does the last statement in this series of statements make logical sense to anybody else? R seems to give similar results for a small subset of possible sums of decimals under 1. I cannot recall any basic mathematical principles that would make this true, but it seems to be unlikely to be an error.
> 0.4+0.6
[1] 1
> 0.4+0.6==1.0
[1] TRUE
> 0.3+0.6
[1] 0.9
> 0.3+0.6==0.9
[1] FALSE
Try typing 0.3+0.6-0.9; on my system the result is -1.110223e-16. This is because the computer doesn't actually sum them as decimal numbers: it stores binary approximations and sums those. None of those decimals can be exactly represented in binary, so there is a small amount of error in the calculations. Apparently the error is small enough not to matter in the first case, but not in the second.
Floating point arithmetic is not exact, but the == operator is. Use all.equal to compare two floating point values in R.
isTRUE(all.equal(0.3+0.6, 0.9))
You can also define a tolerance when calling all.equal.
isTRUE(all.equal(0.3+0.6, 0.9, tolerance = 0.001))
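To see the stored values and what a tolerance check amounts to, something like the following can help. nearly_equal() is a hypothetical helper written for this sketch; it uses an absolute tolerance, whereas all.equal works with a relative difference, so the two are not interchangeable for very large or very small numbers.

```r
# Print more digits than the default to expose the stored binary approximations.
sprintf("%.17g", 0.3 + 0.6)
sprintf("%.17g", 0.9)

# nearly_equal() is a hypothetical helper using an absolute tolerance.
nearly_equal <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  abs(x - y) < tol
}

(0.3 + 0.6) == 0.9              # FALSE
nearly_equal(0.3 + 0.6, 0.9)    # TRUE
```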

Why is this easy comparison false? [duplicate]

Why does this simple statement evaluate to FALSE in R?
mean(c(7.18, 7.13)) == 7.155
Furthermore, what do I have to do in order to make this a TRUE statement? Thanks!
Floating point arithmetic is not exact. The answer to this question has more information.
You can actually see this:
> print(mean(c(7.18,7.13)), digits=16)
[1] 7.154999999999999
> print(7.155, digits=16)
[1] 7.155
In general, do not compare floating point numbers for equality (this applies to pretty much every programming language, not just R).
You can use all.equal to do an inexact comparison:
> all.equal(mean(c(7.18,7.13)), 7.155)
[1] TRUE
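One caveat when using all.equal inside a condition: on a mismatch it returns a description of the difference rather than FALSE, so it is usually wrapped in isTRUE(). A short sketch (the value 7.2 is just an arbitrary non-matching number for illustration):

```r
x <- mean(c(7.18, 7.13))

# all.equal() returns TRUE on a match, but a character string describing
# the difference on a mismatch, so wrap it in isTRUE() before using it in if().
all.equal(x, 7.155)          # TRUE
all.equal(x, 7.2)            # a character string, not FALSE
isTRUE(all.equal(x, 7.2))    # FALSE

if (isTRUE(all.equal(x, 7.155))) {
  message("equal within tolerance")
}
```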
It's probably due to small rounding error. Rounding to the third decimal place shows that they are equal:
round(mean(c(7.18, 7.13)), 3) == 7.155
Generally, don't rely on numerical comparisons to give expected logical outputs :)
