R: boolean OR and double equal

Say a=1; b=2. Why does (a|b)==1 yield TRUE but (a|b)==2 FALSE? What then is a simple way to return TRUE if either (or both) variable is a match?

If you look at the numeric values that TRUE and FALSE evaluate to, they are 1 and 0 respectively:
as.numeric(c(TRUE, FALSE))
#[1] 1 0

| compares two Boolean values.
In this case, (a|b) itself returns TRUE because the numbers are coerced to Boolean values by turning 0 into FALSE, and everything else into TRUE.
From ?base::Logic:
Numeric and complex vectors will be coerced to logical values, with
zero being false and all non-zero values being true. Raw vectors are
handled without any coercion for !, &, | and xor, with these operators
being applied bitwise (so ! is the 1s-complement).
== doesn't work that way, though: it coerces the TRUE to its numeric form, 1. So (a|b)==1 becomes 1==1, which is TRUE, while (a|b)==2 becomes 1==2, which is FALSE.
From ?base::Comparison:
If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of precedence
being character, complex, numeric, integer, logical and raw.
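To answer the second part of the question, here is a minimal sketch: compare each variable separately and combine the logical results, or test the values together with any() or %in%. The names a and b come from the question; the result names (step1, either_is_2 and so on) are only for illustration.

```r
a <- 1
b <- 2

# (a | b) collapses both numbers to a single TRUE before == ever runs
step1 <- a | b        # TRUE
step2 <- step1 == 2   # TRUE is coerced to 1, and 1 == 2 is FALSE

# Compare first, then combine the logical results
either_is_2 <- a == 2 | b == 2

# Equivalent checks that scale to more than two values
any_is_2  <- any(c(a, b) == 2)
two_in_ab <- 2 %in% c(a, b)
```

All three of the last checks return TRUE here, because b is 2.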

Related

R: surprising behaviour of `identical`

In R, I stumbled upon this surprising behaviour of the function identical().
With simple ==:
(ncol(dpx)-1) == length(test)
TRUE
But with identical:
identical((ncol(dpx)-1) , length(test))
FALSE
They are both of type integer (81 each).
What is happening?
identical is the "safe and reliable way to test two objects for being exactly equal." ncol(dpx) - 1 returns a double because the literal 1 is a double, while length returns an integer. == coerces both sides to a common type before comparing, but identical also compares the types.
As pointed out by @amatsuo_net, we could change the code slightly and convert the 1 to be of type integer.
identical((ncol(iris) + 1L - 1L), length(iris))
# [1] TRUE
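The type difference can be checked directly with typeof(). This is a small sketch using the built-in iris data frame in place of dpx; the names n_cols and n_len are made up for the example.

```r
n_cols <- ncol(iris) - 1     # integer minus a double literal gives a double
n_len  <- length(iris) - 1L  # integer minus an integer literal stays integer

typeof(n_cols)  # "double"
typeof(n_len)   # "integer"

n_cols == n_len                       # TRUE: == coerces to a common type first
identical(n_cols, n_len)              # FALSE: the types differ
identical(as.integer(n_cols), n_len)  # TRUE once the types match
```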

Why do the expressions isTRUE(3) and isTRUE(NA) evaluate to false in R?

Since all values other than 0 are taken as true in R, isTRUE(3) should logically evaluate to True but it doesn't. Why so?
Also, I would like to know the reason behind isTRUE(NA) being evaluated to false.
Straight from the documentation (try ?isTRUE)
isTRUE(x) is an abbreviation of identical(TRUE, x), and so is true if and only if x is a length-one logical vector whose only element is TRUE and which has no attributes (not even names).
It's not just doing a check on value, it's doing a check to ensure it is a logical value.
I know in computer science often 0 is false and anything non-zero is true, but R approaches things from a statistics point of view, not a computer science point of view, so it's a bit stricter about the definition.
Saying this, you'll notice this if statement evaluates the way you would imagine
if(3){print("yay")}else{print("boo")}
It's just the way R goes about evaluation. The function isTRUE is just more specific.
Also note these behaviours
FALSE == 0 is true
TRUE == 1 is true
TRUE == 3 is false
So R treats FALSE and TRUE as 0 and 1 respectively when performing these sorts of comparisons, which is why TRUE == 3 is false.
I'm not sure what your planned implementation was (if there was any), but it's probably better to be precise about boolean logic in R, or to test things beforehand.
As for NA, there is more behaviour that can look strange at first.
TRUE & NA evaluates to NA
TRUE | NA evaluates to TRUE
NA is a logical value that stands for "unknown". Anything or'd with TRUE is TRUE, so R can resolve TRUE | NA without knowing the missing value. The result of an and with TRUE depends on the other operand, so TRUE & NA stays NA.
As for your particular case, isTRUE(NA) evaluates to FALSE because NA is not a length-one logical vector whose only element is TRUE.
isTRUE bypasses R's automatic conversion rules and checks that x is literally TRUE.
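A few calls make the distinction concrete. This is only an illustrative sketch; the comments show the values these expressions evaluate to.

```r
isTRUE(3)              # FALSE: 3 is numeric, not logical
isTRUE(as.logical(3))  # TRUE: explicit conversion first
isTRUE(NA)             # FALSE: NA is logical but it is not TRUE
isTRUE(c(TRUE, TRUE))  # FALSE: the length is not one

# NA behaves as "unknown" in logical operations
TRUE | NA    # TRUE
FALSE | NA   # NA
TRUE & NA    # NA
FALSE & NA   # FALSE
```

When a numeric value needs to be used as a condition, converting it explicitly with as.logical() keeps the intent visible.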

Subsetting a vector with a condition (excluding NA)

vector1 = c(1,2,3,NA)
condition1 = (vector1 == 2)
vector1[condition1]
vector1[condition1==TRUE]
In the above code, condition1 is "FALSE TRUE FALSE NA",
and the 3rd and 4th lines both give me the result "2 NA",
which is not what I expected.
I wanted the elements whose values are really 2, not including NA.
Could anybody explain why R is designed to work this way,
and how I can get the result I want with a simple command?
The subset vector[NA] will always be NA because the NA value is unknown and therefore the result of the subset is also unknown. %in% returns FALSE for NA, so it can be useful here.
vector1 = c(1,2,3,NA)
condition1 = (vector1 %in% 2)
vector1[condition1]
# [1] 2
If you are in RStudio and enter
?`[`
You will get the following explanation:
NAs in indexing
When extracting, a numerical, logical or character NA index picks an
unknown element and so returns NA in the corresponding element of a
logical, integer, numeric, complex or character result, and NULL for a
list. (It returns 00 for a raw result.)
When replacing (that is using indexing on the lhs of an assignment) NA
does not select any element to be replaced. As there is ambiguity as
to whether an element of the rhs should be used or not, this is only
allowed if the rhs value is of length one (so the two interpretations
would have the same outcome). (The documented behaviour of S was that
an NA replacement index ‘goes nowhere’ but uses up an element of
value: Becker et al p. 359. However, that has not been true of other
implementations.)
Try a logical operator in that case:
vector1 = c(1,2,3,NA)
condition1 <- (vector1 == 2 & !is.na(vector1))
condition1
# FALSE TRUE FALSE FALSE
vector1[condition1]
# 2
The & operator returns TRUE only when both of its operands are TRUE.
identical is "The safe and reliable way to test two objects for being exactly equal. It returns TRUE in this case, FALSE in every other case." (see ?identical)
As it does not compare elementwise, you can use it in sapply to compare each element of vector1 to 2, i.e.:
condition1 = sapply(vector1, identical, y = 2)
which will give:
vector1[condition1]
# [1] 2
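Another option, sketched here as an alternative to the answers above, is which(): it returns the positions of the TRUE elements only and drops both FALSE and NA, so the index no longer contains NA. The name picked is just for illustration.

```r
vector1 <- c(1, 2, 3, NA)

vector1 == 2         # FALSE TRUE FALSE NA
which(vector1 == 2)  # 2, the position of the only TRUE

picked <- vector1[which(vector1 == 2)]  # 2
```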

Does operator precedence explain the difference in these expressions involving multiplication of a numeric with a negated logical?

I have three expressions, each involving multiplication with a logical or its negation. These logicals and their negation represent indicator variables, so that the expressions are conditionally evaluated:
-2*3*!T + 5*7*T
5*7*T + -2*3*!T
(-2*3*!T) + 5*7*T
I expect the above to produce the same result. However:
> -2*3*!T + 5*7*T
[1] 0 # unexpected!
> 5*7*T + -2*3*!T
[1] 35
> (-2*3*!T) + 5*7*T
[1] 35
I am sure this has something to do with operator precedence and type coercion, but I can't work out how it makes sense to even evaluate !T after the *.
You're exactly right that this is about operator precedence. As ?base::Syntax (which you link above) states, ! has lower precedence than all of the arithmetic operators, so the first expression is equivalent to
(-2*3)*!(T + 5*7*T)
(because the expression containing ! has to be evaluated before the final multiplication can be done) or
-6*!(36) # T coerced to 1 in numeric operations
or
-6*FALSE # non-zero numbers coerced to TRUE in logical operations
or
-6*0 # FALSE coerced to 0 in numeric operations
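The grouping can also be inspected without evaluating anything, by looking at the parse tree that quote() returns. The following is a sketch; it uses TRUE rather than T (T is an ordinary variable that can be reassigned), and the variable names are made up for the example.

```r
e <- quote(-2*3*!TRUE + 5*7*TRUE)

top_call   <- as.character(e[[1]])       # "*": the outermost call is a multiplication
right_call <- as.character(e[[3]][[1]])  # "!": its right operand is the negation
negated    <- eval(e[[3]][[2]])          # the whole sum that ! applies to: TRUE + 5*7*TRUE

result_unparenthesized <- -2*3*!TRUE + 5*7*TRUE    # 0
result_parenthesized   <- (-2*3*!TRUE) + 5*7*TRUE  # 35
```

Here negated is 36, which matches the -6*!(36) step above.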

Why does "one" < 2 equal FALSE in R?

I'm reading Hadley Wickham's Advanced R section on coercion, and I can't understand the result of this comparison:
"one" < 2
# [1] FALSE
I'm assuming that R coerces 2 to a character, but I don't understand why R returns FALSE instead of returning an error. This is especially puzzling to me since
-1 < "one"
# TRUE
So my question is two-fold: first, why this answer, and second, is there a way of seeing how R converts the individual elements within a logical vector like these examples?
From help("<"):
If the two arguments are atomic vectors of different types, one is
coerced to the type of the other, the (decreasing) order of precedence
being character, complex, numeric, integer, logical and raw.
So in this case, the numeric is of lower precedence than the character. So 2 is coerced to the character "2". Comparison of strings in character vectors is lexicographic which, as I understand it, is alphabetic but locale-dependent.
It coerces 2 into a character, then does an alphabetical comparison, and numeric characters sort before alphabetical ones.
To get a general idea of the behavior, try:
'a'<'1'
'1'<'.'
'b'<'B'
'a'<'B'
'A'<'B'
'C'<'B'
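On the second part of the question, one way to see the conversion is to apply it explicitly with as.character(), or to combine the values with c(), which follows the same ordering of atomic types. This is a sketch; same_as_string and mixed are illustrative names, and the results of the string comparisons themselves depend on the locale.

```r
as.character(2)  # "2"

# The mixed comparison gives the same answer as the all-character one
same_as_string <- ("one" < 2) == ("one" < "2")

# c() coerces mixed elements to the highest type present, here character
mixed <- c("one", 2)
typeof(mixed)  # "character"
mixed          # "one" "2"
```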
