Logical operators (AND, OR) with NA, TRUE and FALSE in R

I cannot understand the properties of logical (boolean) values TRUE, FALSE and NA when used with logical OR (|) and logical AND (&). Here are some examples:
NA | TRUE
# [1] TRUE
NA | FALSE
# [1] NA
NA & TRUE
# [1] NA
NA & FALSE
# [1] FALSE
Can you explain these outputs?

To quote from ?Logic:
NA is a valid logical object. Where a component of x or y is NA, the
result will be NA if the outcome is ambiguous. In other words NA &
TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the
examples below.
The key there is the word "ambiguous". NA represents something that is "unknown". So NA & TRUE could be either true or false, but we don't know. Whereas NA & FALSE will be false no matter what the missing value is.
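One way to see the "ambiguous" rule in action is to replace the NA with each value it could stand for and check whether the answers agree. This is only a minimal sketch; `resolve` is a hypothetical helper written for this illustration and is not part of base R:

```r
# The two values an unknown logical could turn out to be
candidates <- c(TRUE, FALSE)

# Hypothetical helper: if every candidate gives the same answer, the result
# is known; otherwise the outcome is ambiguous and stays NA
resolve <- function(outcomes) {
  if (length(unique(outcomes)) == 1L) outcomes[1L] else NA
}

resolve(candidates | TRUE)   # both candidates agree, like NA | TRUE
resolve(candidates | FALSE)  # candidates disagree, like NA | FALSE
resolve(candidates & TRUE)   # candidates disagree, like NA & TRUE
resolve(candidates & FALSE)  # both candidates agree, like NA & FALSE
```

Each call should give the same value as the corresponding expression with NA in the question.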

It's explained in help("|"):
NA is a valid logical object. Where a component of x or y
is NA, the result will be NA if the outcome is ambiguous. In
other words NA & TRUE evaluates to NA, but NA & FALSE
evaluates to FALSE. See the examples below.
From the examples in help("|"):
x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, "&") ## AND table
#        <NA> FALSE  TRUE
# <NA>     NA FALSE    NA
# FALSE FALSE FALSE FALSE
# TRUE     NA FALSE  TRUE
outer(x, x, "|") ## OR table
#       <NA> FALSE TRUE
# <NA>    NA    NA TRUE
# FALSE   NA FALSE TRUE
# TRUE  TRUE  TRUE TRUE

Related

How to list only variables that have been evaluated?

When using ls() in a function, it lists arguments of the function even if they've not been evaluated yet (even if they are missing from the call with no default value).
fun <- function(x,y,z,m){
  a <- 1
  y <- 1
  force(z)
  print(ls())
  mget(ls())
}
fun(i,j,42)
# [1] "a" "m" "x" "y" "z"
# Error in mget(ls()) : object 'i' not found
How can I list only the variables that have been evaluated?
In this case I would be happy with a modified list giving either of:
# [1] "a" "y" "z"
# [1] "a" "y"
Alternatively (or additionally), a logical list telling me whether each argument has been evaluated (or overwritten) would be great; in this case, list(x = FALSE, y = TRUE, z = TRUE, m = FALSE).
This gets kind of close: there is an is_promise function in pryr. It expects a symbol, but the unexported version is_promise2 can take a name. So maybe something like this:
fun <- function(x,y,z,m){
  a <- 1
  y <- 1
  force(z)
  mget(ls()[!sapply(ls(), pryr:::is_promise2, environment())])
}
fun(i, j, 42)
This at least gets rid of the error about i, but it doesn't seem to capture x. Just as is_promise2 does, I think you will have to dip into C/C++ land to find out about evaluation/promise status, because R tries to hide most of that from the user.
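If the goal is only to list the variables that can be evaluated without an error, a base R sketch along the following lines may be enough. `evaluable_vars` is a hypothetical helper, not an existing function, and it works by trying to evaluate each variable, so it forces any pending promises (with whatever side effects they have). It therefore answers "can be evaluated" rather than "has already been evaluated":

```r
# Hypothetical helper: names in `env` whose values can be retrieved without
# an error. Caution: get() forces promises, so unevaluated arguments are
# evaluated as a side effect.
evaluable_vars <- function(env = parent.frame()) {
  force(env)
  nms <- ls(env)
  ok <- vapply(nms, function(nm) {
    tryCatch({
      get(nm, envir = env)
      TRUE
    }, error = function(e) FALSE)
  }, logical(1))
  nms[ok]
}

fun <- function(x, y, z, m) {
  a <- 1
  y <- 1
  force(z)
  evaluable_vars()
}

# i and j are deliberately undefined; m is missing from the call
res <- fun(i, j, 42)
print(res)
```

Here x fails because i does not exist and m fails because it is missing with no default, so only the remaining names are returned.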
MrFlick's answer is what I was looking for. Additional relevant information can be gathered using the function below, which wraps trace for ease of use.
Better sample data
defined_in_global <- 1
enclosing_fun <- function(){
  defined_in_enclos <- quote(qux)
  function(not_evaluated,
           overridden = "bar",
           forced = "baz",
           defined_in_global,
           defined_in_enclos,
           missing_with_default = 1,
           missing_overriden,
           missing_absent){
    overridden <- TRUE
    missing_overridden <- "a"
    new_var <- 1
  }
}
How to use, without trying to evaluate
fun <- enclosing_fun()
diagnose_vars(fun)
fun(not_evaluated = foo)
#> Tracing fun(not_evaluated = foo) on exit
#> name evaluable type is_formal missing absent_from_call is_promise has_default_value default_value called_with_value exists_in_parent exists_in_enclos
#> 1 not_evaluated FALSE <NA> TRUE FALSE FALSE TRUE FALSE NA foo FALSE FALSE
#> 2 overridden TRUE logical TRUE FALSE TRUE FALSE TRUE "bar" <NA> FALSE FALSE
#> 3 forced FALSE <NA> TRUE TRUE TRUE TRUE TRUE "baz" <NA> FALSE FALSE
#> 4 defined_in_global FALSE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> TRUE TRUE
#> 5 defined_in_enclos FALSE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> FALSE TRUE
#> 6 missing_with_default FALSE <NA> TRUE TRUE TRUE TRUE TRUE 1 <NA> FALSE FALSE
#> 7 missing_overriden FALSE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> FALSE FALSE
#> 8 missing_absent FALSE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> FALSE FALSE
#> 9 missing_overridden TRUE character FALSE NA NA NA NA NA <NA> FALSE FALSE
#> 10 new_var TRUE double FALSE NA NA NA NA NA <NA> FALSE FALSE
How to use, trying to evaluate
diagnose_vars(fun, eval = TRUE)
fun(not_evaluated = foo)
#> Tracing fun(not_evaluated = foo) on exit
#> name evaluable type is_formal missing absent_from_call is_promise has_default_value default_value called_with_value exists_in_parent exists_in_enclos
#> 1 not_evaluated TRUE <NA> TRUE FALSE FALSE TRUE FALSE NA foo FALSE FALSE
#> 2 overridden FALSE logical TRUE FALSE TRUE FALSE TRUE "bar" <NA> FALSE FALSE
#> 3 forced FALSE character TRUE TRUE TRUE TRUE TRUE "baz" <NA> FALSE FALSE
#> 4 defined_in_global TRUE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> TRUE TRUE
#> 5 defined_in_enclos TRUE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> FALSE TRUE
#> 6 missing_with_default FALSE double TRUE TRUE TRUE TRUE TRUE 1 <NA> FALSE FALSE
#> 7 missing_overriden TRUE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> FALSE FALSE
#> 8 missing_absent TRUE <NA> TRUE TRUE TRUE FALSE FALSE NA <NA> FALSE FALSE
#> 9 missing_overridden FALSE character FALSE NA NA NA NA NA <NA> FALSE FALSE
#> 10 new_var FALSE double FALSE NA NA NA NA NA <NA> FALSE FALSE
The code
diagnose_vars <- function(f, eval = FALSE, on.exit = TRUE, ...) {
  eval(substitute(
    if(on.exit) trace(..., what =f, exit = quote({
      diagnose_vars0(eval, print = TRUE)
      untrace(f)}))
    else trace(..., what =f, tracer = diagnose_vars0(eval, print = TRUE),
               exit = substitute(untrace(f)), ...)
  ))
  invisible(NULL)
}
diagnose_vars0 <- function(eval = FALSE, print = FALSE){
  f_env <- parent.frame()
  mc <- eval(quote(match.call()), f_env)
  f <- eval.parent(mc[[1]],2)
  f_parent_env <- parent.frame(2)
  f_enclos <- rlang::fn_env(f)
  vars <- ls(f_env)
  fmls <- eval(quote(formals()), f_env)
  fml_nms <- names(fmls)
  fml_syms <- rlang::syms(fml_nms)
  mc_args <- as.list(mc)[-1]
  # compute complete df cols when possible
  is_formal <- vars %in% fml_nms
  # build raw df, with NA cols when necessary to initiate
  data <- data.frame(row.names = vars,
                     name = vars,
                     evaluable = NA,
                     type = NA,
                     is_formal,
                     missing = NA,
                     absent_from_call = NA,
                     is_promise = NA,
                     has_default_value = NA)
  # absent_from_call : different from missing when variable is overriden
  data[fml_nms, "absent_from_call"] <- ! fml_nms %in% names(mc_args)
  # promise
  data[fml_nms, "is_promise"] <- sapply(fml_nms, pryr:::is_promise2, f_env)
  # missing
  data[fml_nms, "missing"] <- sapply(fml_syms, function(x)
    eval(substitute(missing(VAR), list(VAR = x)), f_env))
  # has default values
  formal_has_default_value <- !sapply(fmls,identical, alist(x=)[[1]])
  data[fml_nms, "has_default_value"] <- formal_has_default_value
  # default values
  data$default_value <-
    vector("list",length(vars))
  data$default_value[] <- NA
  data[fml_nms[formal_has_default_value], "default_value"] <-
    sapply(fmls[formal_has_default_value], deparse)
  # called_with_value
  data[names(mc_args), "called_with_value"] <-
    sapply(mc_args, deparse)
  # exists
  data$exists_in_parent <- sapply(vars, exists, envir= f_parent_env)
  data$exists_in_enclos <- sapply(vars, exists, envir= f_enclos)
  # types
  if(eval){
    types <- sapply(vars, function(x)
      try(eval(bquote(typeof(.(as.symbol(x)))), f_env),silent = TRUE))
    data$type <- ifelse(startsWith(types,"Error"), NA, types)
    data$evaluable <- is.na(data$type)
  } else {
    data$evaluable <-
      with(data,!is_formal | (!is_promise & !missing))
    data$type[data$evaluable] <-
      sapply(mget(vars[data$evaluable], f_env), typeof)
  }
  # arrange
  data <- rbind(data[fml_nms,],data[!data$name %in% fml_nms,])
  row.names(data) <- NULL
  if (print) print(data) else data
}

NA incorrectly appearing in selected/subsetted data

I'm stumped by the following:
z <- data.frame(a=c(1,2,3,4,5,6), b=c("Yes","Yes","No","No","",NA))
is.na(z$b)
# [1] FALSE FALSE FALSE FALSE FALSE TRUE
z$a[z$b=="Yes"]
# [1] 1 2 NA
is.na(z$a[z$b=="Yes"])
# [1] FALSE FALSE TRUE
Why is it that when I select with z$b=="Yes", NA appears as a third value of the subsetted z$a?
When I use subset(), however, this isn't a problem:
subset(z, b=="Yes")$a
# [1] 1 2
Many thanks in advance.
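A likely explanation, in line with the NA logic discussed above: comparing NA with "Yes" gives NA (the outcome is unknown), and indexing a vector with an NA yields an NA element. subset() instead treats missing values in the condition as FALSE. The sketch below shows the index vector and a few common ways to drop the NA; the variable name `idx` is introduced only for illustration:

```r
z <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                b = c("Yes", "Yes", "No", "No", "", NA))

idx <- z$b == "Yes"      # last element is NA, because NA == "Yes" is unknown
print(idx)
z$a[idx]                 # an NA in the index produces an NA in the result

z$a[which(idx)]          # which() keeps only the positions that are TRUE
z$a[z$b %in% "Yes"]      # %in% gives FALSE rather than NA for a missing value
z$a[!is.na(idx) & idx]   # FALSE & NA is FALSE, so the NA row is dropped
```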

R Programming - pmax with data frame

I was trying to use the pmax function in my program. I had a data frame of numbers that I wanted to compare with a single number, but the output contained NAs.
I found that pmax does not work with a data frame, so I converted the data frame to a matrix, and that worked. I am curious why the data frame returned NAs. Does it have something to do with recycling?
Code:
mat <- matrix(runif(100), nrow = 20, ncol = 5)
df <- as.data.frame(mat)
pmax(mat, .5) # No NA's
pmax(df, .5) # Many NA's
The problem is that the second argument is not fully recycled: it is replicated to length(df), which for a data.frame is the number of columns (5), not the number of cells (100). This could be the reason:
rep(0.5, ncol(df))[df < 0.5]
#[1] 0.5 0.5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
#[41] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Note that only the first two values are 0.5. They correspond to TRUE elements of df < 0.5 that fall within the first five positions. Every later TRUE element indexes past the end of the length-5 vector and returns NA, because 0.5 was replicated only ncol(df) times.
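The same effect can be reproduced on a small scale. In this sketch, the `each` and `change` vectors are made-up stand-ins for the recycled value and the logical matrix; a logical index that is longer than the vector returns NA for every TRUE position past the vector's end:

```r
each <- rep(0.5, 5)   # recycled only to the number of columns
change <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)  # longer index

picked <- each[change]
print(picked)
# positions 1 and 2 exist, positions 6 and 8 do not
```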
If we look at the source of pmax, the line
mmm[change] <- each[change]
is the problematic one. We can verify this by printing 'each' and 'each[change]'. Here is a copy of the function with the print statements added:
pmax2 <- function (..., na.rm = FALSE)
{
  elts <- list(...)
  if (length(elts) == 0L)
    stop("no arguments")
  if (all(vapply(elts, function(x) is.atomic(x) && !is.object(x),
                 NA))) {
    mmm <- .Internal(pmax(na.rm, ...))
    mostattributes(mmm) <- attributes(elts[[1L]])
  }
  else {
    mmm <- elts[[1L]]
    has.na <- FALSE
    as <- methods::as
    asL <- function(x) if (isS4(x))
      as(x, "logical")
    else x
    for (each in elts[-1L]) {
      l1 <- length(each)
      l2 <- length(mmm)
      if (l2 && (l2 < l1 || !l1)) {
        if (l1%%l2)
          warning("an argument will be fractionally recycled")
        mmm <- rep(mmm, length.out = l1)
      }
      else if (l1 && (l1 < l2 || !l2)) {
        if (l2%%l1)
          warning("an argument will be fractionally recycled")
        each <- rep(each, length.out = l2)
      }
      na.m <- is.na(mmm)
      na.e <- is.na(each)
      if (has.na || (has.na <- any(na.m) || any(na.e))) {
        if (any(na.m <- asL(na.m)))
          mmm[na.m] <- each[na.m]
        if (any(na.e <- asL(na.e)))
          each[na.e] <- mmm[na.e]
      }
      nS4 <- !isS4(mmm)
      if (isS4(change <- mmm < each) && (nS4 || !isS4(each)))
        change <- as(change, "logical")
      change <- change & !is.na(change)
      print(change)
      mmm[change] <- each[change]
      print(each)
      print(each[change])
      if (has.na && !na.rm)
        mmm[na.m | na.e] <- NA
      if (nS4)
        mostattributes(mmm) <- attributes(elts[[1L]])
    }
  }
  mmm
}
Now, we check the print output based on applying pmax2 on 'df'
invisible(pmax2(df, 0.5))
# V1 V2 V3 V4 V5
# [1,] TRUE TRUE TRUE TRUE FALSE
# [2,] TRUE FALSE TRUE TRUE TRUE
# [3,] FALSE FALSE TRUE TRUE FALSE
# [4,] FALSE TRUE TRUE TRUE TRUE
# [5,] FALSE TRUE TRUE FALSE TRUE
# [6,] FALSE FALSE TRUE TRUE TRUE
# [7,] TRUE TRUE TRUE FALSE TRUE
# [8,] FALSE FALSE TRUE FALSE FALSE
# [9,] FALSE FALSE TRUE FALSE TRUE
#[10,] TRUE TRUE TRUE TRUE FALSE
#[11,] FALSE TRUE TRUE TRUE TRUE
#[12,] TRUE TRUE FALSE TRUE FALSE
#[13,] FALSE TRUE TRUE TRUE FALSE
#[14,] FALSE TRUE FALSE FALSE TRUE
#[15,] TRUE FALSE FALSE FALSE TRUE
#[16,] FALSE TRUE FALSE TRUE FALSE
#[17,] TRUE FALSE TRUE FALSE FALSE
#[18,] TRUE FALSE TRUE FALSE TRUE
#[19,] FALSE FALSE TRUE TRUE TRUE
#[20,] TRUE FALSE TRUE FALSE TRUE
#[1] 0.5 0.5 0.5 0.5 0.5
# [1] 0.5 0.5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
#[41] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Note that this is exactly the same output we got with the rep mentioned earlier.
However, on a matrix this code is never reached: a matrix is atomic and not a classed object, so the first branch of the if/else is taken and .Internal(pmax(...)) is called directly.
invisible(pmax2(mat, 0.5))
# nothing is printed
When comparing against a single value, it is therefore better to apply pmax to a matrix than to a data.frame. Otherwise, we can unlist the data.frame or convert it to a matrix first:
all.equal(c(pmax(mat, .5)), pmax(unlist(df), .5), check.attributes = FALSE)
#[1] TRUE
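If the result should stay a data.frame, another option is to apply pmax column by column, since each column is a plain atomic vector and the single value is then recycled correctly. A minimal sketch under that assumption; the name `clamped` is only illustrative:

```r
set.seed(24)
mat <- matrix(runif(100), nrow = 20, ncol = 5)
df <- as.data.frame(mat)

clamped <- df
# assigning into clamped[] keeps the data.frame structure and column names
clamped[] <- lapply(df, pmax, 0.5)

any(is.na(clamped))   # no NAs are introduced this way
all.equal(as.matrix(clamped), pmax(mat, 0.5), check.attributes = FALSE)
```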
data
set.seed(24)
mat <- matrix(runif(100), nrow = 20, ncol = 5)
df <- as.data.frame(mat)

Vectorized OR function that evaluates FALSE | NA and NA | FALSE as FALSE?

Given the vectors:
vect1 <- c(TRUE,FALSE,FALSE,NA,NA,NA,TRUE,FALSE,NA,FALSE)
vect2 <- c(TRUE,NA,FALSE,NA,FALSE,TRUE,FALSE,NA,TRUE,NA)
vect3 <- vect1 | vect2
vect3 #c(TRUE,NA,FALSE,NA,NA,TRUE,TRUE,NA,TRUE,NA)
Is there a vectorized infix function x that evaluates elements like this:
TRUE x TRUE #TRUE
TRUE x FALSE #TRUE
FALSE x TRUE #TRUE
FALSE x FALSE #FALSE
TRUE x NA #TRUE
NA x TRUE #TRUE
FALSE x NA #FALSE - would have been NA with ordinary "|"
NA x FALSE #FALSE - would have been NA with ordinary "|"
NA x NA #NA
Producing a vector vect4 like this:
vect4 #c(TRUE,FALSE,FALSE,NA,FALSE,TRUE,TRUE,FALSE,TRUE,FALSE)
Or is there any other simple method to output vect4 from vect1 and vect2?
You can compute the parallel maximum (with na.rm = TRUE) and convert the result to logical:
as.logical(pmax(vect1, vect2, na.rm = TRUE))
# [1] TRUE FALSE FALSE NA FALSE TRUE TRUE FALSE TRUE FALSE
Note that by computing maxima of logical vectors, TRUE is interpreted as integer 1 and FALSE as integer 0.
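Since the question asks for an infix function, the pmax approach can be wrapped in a user-defined operator. This is only a sketch; the operator name `%or%` is made up for this example and does not exist in base R:

```r
# Hypothetical infix operator: OR that ignores an NA on one side,
# but still returns NA when both sides are NA
`%or%` <- function(x, y) as.logical(pmax(x, y, na.rm = TRUE))

vect1 <- c(TRUE, FALSE, FALSE, NA, NA, NA, TRUE, FALSE, NA, FALSE)
vect2 <- c(TRUE, NA, FALSE, NA, FALSE, TRUE, FALSE, NA, TRUE, NA)

vect4 <- vect1 %or% vect2
print(vect4)
```

An analogous AND that ignores a one-sided NA could be sketched the same way with pmin.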
