R Subsetting Specific Value Also Returns NA?

I am just starting to learn R and came across the following piece of code:
vec_1 <- c("a", "b", NA, "c", "d")
# create a subset of all elements which equal "a"
vec_1[vec_1 == "a"]
The result from this is
## [1] "a" NA
I'm just curious: since I am subsetting vec_1 for the value "a", why does NA also show up in my results?

This is because the result of anything == NA is NA. Even NA == NA is NA.
Here's the output of vec_1 == "a" -
[1] TRUE FALSE NA FALSE FALSE
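NA is neither TRUE nor FALSE, so wherever the logical index is NA, the subset returns NA for that position. A lone logical NA is recycled to the length of the vector, so every element comes back as NA:
vec_1[NA]
[1] NA NA NA NA NA
When dealing with NA, R tries to give the most informative answer it can. For example, T | NA returns TRUE because the result is TRUE no matter what the NA stands for. Here are some more examples: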
T | NA
[1] TRUE
F | NA
[1] NA
T & NA
[1] NA
F & NA
[1] FALSE
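The == operator cannot test for NA, because any comparison with NA yields NA (use is.na() to test for missing values). In your case you can use the %in% operator, which returns FALSE rather than NA when there is no match: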
5 %in% NA
[1] FALSE
"a" %in% NA
[1] FALSE
vec_1[vec_1 %in% "a"]
[1] "a"
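
As a minimal sketch of the alternatives (the result names below are illustrative and not from the original post), each of the first three lines drops the NA that vec_1 == "a" lets through, and is.na() is the usual way to select the missing values themselves:

```r
vec_1 <- c("a", "b", NA, "c", "d")

# which() keeps only the positions where the test is TRUE, so NA is dropped
by_which <- vec_1[which(vec_1 == "a")]

# explicitly exclude missing values before comparing
by_isna <- vec_1[!is.na(vec_1) & vec_1 == "a"]

# %in% returns FALSE, never NA, when there is no match
by_in <- vec_1[vec_1 %in% "a"]

# is.na() is the test for missingness; comparing with == NA always yields NA
only_na <- vec_1[is.na(vec_1)]
```

The which() form is handy inside longer expressions, while the is.na() form makes the intent explicit to a reader.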

Related

Can someone explain this strange behavior of R's logical test results? [duplicate]

I cannot understand the properties of logical (boolean) values TRUE, FALSE and NA when used with logical OR (|) and logical AND (&). Here are some examples:
NA | TRUE
# [1] TRUE
NA | FALSE
# [1] NA
NA & TRUE
# [1] NA
NA & FALSE
# [1] FALSE
Can you explain these outputs?

To quote from ?Logic:
NA is a valid logical object. Where a component of x or y is NA, the
result will be NA if the outcome is ambiguous. In other words NA &
TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the
examples below.
The key there is the word "ambiguous". NA represents something that is "unknown". So NA & TRUE could be either true or false, but we don't know. Whereas NA & FALSE will be false no matter what the missing value is.

It's explained in help("|"):
NA is a valid logical object. Where a component of x or y
is NA, the result will be NA if the outcome is ambiguous. In
other words NA & TRUE evaluates to NA, but NA & FALSE
evaluates to FALSE. See the examples below.
From the examples in help("|"):
x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, "&") ## AND table
#       <NA> FALSE  TRUE
# <NA>    NA FALSE    NA
# FALSE FALSE FALSE FALSE
# TRUE    NA FALSE  TRUE
outer(x, x, "|") ## OR table
#       <NA> FALSE TRUE
# <NA>    NA    NA TRUE
# FALSE   NA FALSE TRUE
# TRUE  TRUE  TRUE TRUE
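
The same "NA only when the outcome is ambiguous" rule carries over to any() and all(), which behave like a chain of | and & respectively. The following is a small sketch with made-up vectors x and y, not taken from the help page:

```r
x <- c(NA, TRUE)
y <- c(NA, FALSE)

any_x <- any(x)  # TRUE: one TRUE settles the answer whatever the NA is
all_x <- all(x)  # NA: the answer depends on the unknown value
any_y <- any(y)  # NA: the answer depends on the unknown value
all_y <- all(y)  # FALSE: one FALSE settles the answer

# na.rm = TRUE drops the missing values before testing
all_x_rm <- all(x, na.rm = TRUE)  # TRUE
any_y_rm <- any(y, na.rm = TRUE)  # FALSE

# if (NA) is an error, so isTRUE() is a common guard for conditions
guarded <- isTRUE(NA)  # FALSE
```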

lag() and lead() in base-R [duplicate]

I'm used to using dplyr's lag() and lead() in my code, but I'm wondering -- is there a base R alternative?
For example, assume the following dataframe:
df<-data.frame(a=c("a","a","a","b","b"),stringsAsFactors=FALSE)
Using dplyr, I could do this to mark the beginning of a new grouping in a:
library(dplyr)
df %>% mutate(groupstart = a != lag(a) | is.na(lag(a)))
  a groupstart
1 a       TRUE
2 a      FALSE
3 a      FALSE
4 b       TRUE
5 b      FALSE
Is there a way to do this in base R?

You could do something like this, where NAs are combined with a subset of df$a in lag_a, which is then compared with df$a:
lag_a <- c(rep(NA, 1), head(df$a, length(df$a) - 1))
df$groupstart <- df$a != lag_a | is.na(lag_a)
#### OUTPUT ####
  a groupstart
1 a       TRUE
2 a      FALSE
3 a      FALSE
4 b       TRUE
5 b      FALSE
You can generalize this principle in a function:
lead_lag <- function(v, n) {
  if (n > 0) c(rep(NA, n), head(v, length(v) - n))
  else c(tail(v, length(v) - abs(n)), rep(NA, abs(n)))
}
#### OUTPUT ####
lead_lag(df$a, 2) #[1] NA NA "a" "a" "a"
lead_lag(df$a, -2) #[1] "a" "b" "b" NA NA
lead_lag(df$a, 3) #[1] NA NA NA "a" "a"
lead_lag(df$a, -4) #[1] "b" NA NA NA NA
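
One edge case worth noting: when abs(n) exceeds length(v), head() and tail() receive a negative count and the result is longer than the input. A possible variant that caps the shift is sketched below. The name shift_vec is hypothetical and this is only one way to handle it, assuming a plain atomic vector rather than a factor:

```r
# Hypothetical variant of lead_lag(): positive n lags, negative n leads,
# and the result always has the same length as the input.
shift_vec <- function(v, n) {
  len <- length(v)
  k <- min(abs(n), len)  # never shift by more than the vector length
  pad <- rep(NA, k)
  if (n >= 0) c(pad, head(v, len - k)) else c(tail(v, len - k), pad)
}

a <- c("a", "a", "a", "b", "b")

lagged <- shift_vec(a, 2)     # NA NA "a" "a" "a"
led <- shift_vec(a, -1)       # "a" "a" "b" "b" NA
too_far <- shift_vec(a, 7)    # five NAs, still length 5

# marking the start of each group, as in the question
groupstart <- a != shift_vec(a, 1) | is.na(shift_vec(a, 1))
```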

Replacing NAs from one list by NAs in second list in equal positions in R

Here is the problem. I have two lists of vectors. Vectors in the same position have the same length, but some of them contain NAs. The data may look like this:
HH
[[1]]
[1] 2 1 5 NA
[[2]]
[1] 2 0 5
[[3]]
[1] NA 1 NA
JJ
[[1]]
[1] 0 5 8 9
[[2]]
[1] NA 1 3
[[3]]
[1] 2 8 3
My goal is to have NAs in equal positions in both lists, in all vectors. More exactly, I want to write code which finds each NA in the first list and replaces the value in the same position of the second list by NA. I successfully wrote a similar function for a vector, but I failed here. Can you help me? Here is my code.
D <- NULL
for (j in 1:length(PH)) {
  for (i in 1:length(PH[[j]])) {
    if (is.na(PH[[j]][i]) == FALSE) {
      D[[j]][i] = AB[[j]][i]
    } else {
      D[[j]][i] = NA
    }
  }
}

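Here's my two cents, using the data from @Colonel's answer. Note that this copies the NAs of secondlist into firstlist, and that unlist() coerces the mixed values to character: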
v1 <- unlist(firstlist)
v2 <- unlist(secondlist)
v1[is.na(v2)] <- NA
relist(v1, firstlist)
#[[1]]
#[1] NA "2" "3" NA
#[[2]]
#[1] "a" NA

You can use Map:
Map(function(u,v) {v[is.na(u)]<-NA;v}, firstlist, secondlist)
Example:
firstlist = list(c(1, 2, 3, NA), c('a', NA))
secondlist = list(c(NA, 22, 33, 5), c('b', 'd'))
Map(function(u, v) {v[is.na(u)] <- NA; v}, firstlist, secondlist)
#[[1]]
#[1] NA 22 33 NA
#[[2]]
#[1] "b" NA
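
If the goal is to have NAs in the same positions in both lists (as the question puts it), the same Map idea can be applied in both directions. The sketch below uses the HH and JJ data from the question. The names na_mask, HH_masked and JJ_masked are illustrative assumptions:

```r
HH <- list(c(2, 1, 5, NA), c(2, 0, 5), c(NA, 1, NA))
JJ <- list(c(0, 5, 8, 9), c(NA, 1, 3), c(2, 8, 3))

# TRUE wherever either list has a missing value in that position
na_mask <- Map(function(h, j) is.na(h) | is.na(j), HH, JJ)

# apply the combined mask to both lists
HH_masked <- Map(function(h, m) replace(h, m, NA), HH, na_mask)
JJ_masked <- Map(function(j, m) replace(j, m, NA), JJ, na_mask)

# HH_masked: (2 1 5 NA), (NA 0 5), (NA 1 NA)
# JJ_masked: (0 5 8 NA), (NA 1 3), (NA 8 NA)
```

Unlike the unlist()/relist() approach, this works vector by vector, so each element keeps its original type.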

Not able to convert dates from string in R

Please help. I am not sure what I am doing wrong here, but the simple code below for converting dates from character is not working for me in R. It gives NA instead of any values.
x <- c("3-Sep-13","3-Oct-13","10-Nov-2014")
x
# [1] "3-Sep-13" "3-Oct-13" "10-Nov-2014"
class(x)
# [1] "character"
as.Date(x,format="%d-%m-%Y")
# [1] NA NA NA
format(as.Date(x,"%d-%m-%Y"))
# [1] NA NA NA
as.Date(x,format="%Y-%m-%d")
# [1] NA NA NA
format(as.Date(x,"%Y-%m-%d"))
# [1] NA NA NA

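Besides the format codes (the months are abbreviated names, which need %b rather than the numeric %m), your character vector is ambiguous: the first two elements have a two-digit year (%y), while the third has a four-digit year (%Y).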
as.Date(x[1:2], "%d-%b-%y")
[1] "2013-09-03" "2013-10-03"
as.Date(x[3], "%d-%b-%Y")
[1] "2014-11-10"
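
To convert the whole vector in one go, one option is to pick the format per element based on the length of the year. This is only a sketch: the helper name four_digit is made up, and it assumes English month abbreviations and that every string ends in either a two-digit or a four-digit year. Trying "%d-%b-%Y" on everything first is not a safe shortcut, because %Y also accepts short years and would read "13" as the year 13.

```r
# Month abbreviations such as "Sep" are matched against the current locale.
# The "C" locale is forced here so the English names parse (an assumption
# about the data, not part of the original answer).
invisible(Sys.setlocale("LC_TIME", "C"))

x <- c("3-Sep-13", "3-Oct-13", "10-Nov-2014")

# TRUE where the string ends in a four-digit year
four_digit <- grepl("-[0-9]{4}$", x)

# start with all-missing dates, then fill each group with its own format
dates <- as.Date(rep(NA, length(x)))
dates[four_digit] <- as.Date(x[four_digit], format = "%d-%b-%Y")
dates[!four_digit] <- as.Date(x[!four_digit], format = "%d-%b-%y")

format(dates)
# [1] "2013-09-03" "2013-10-03" "2014-11-10"
```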
