check elements of a list - r

I have a list of 1000 lists of booleans which is the result of a duplicated() check on the original lists of numbers. I need to find which of these lists contains a TRUE result and I need to know the position of the list where it appears in the 1000. i.e. I can then type
my.list[[456]]
[1] FALSE FALSE FALSE TRUE FALSE
And then use this to delete the elements from my list where a TRUE appears

# An example
l <- list(c(TRUE, FALSE), c(FALSE, FALSE), c(FALSE))
# The indices you want
l2 <- lapply(l, which)
# The number of TRUEs for each element of l
l3 <- lengths(l2)
# The initial list, without the elements containing a TRUE
l4 <- l[l3 == 0]
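The question also asks for the position of each inner list that contains a TRUE. A minimal sketch of one way to get it, assuming the inner vectors contain no NA values (duplicated() never returns NA); the names has_true, positions and cleaned are illustrative only:

```r
# Same example list as above
l <- list(c(TRUE, FALSE), c(FALSE, FALSE), c(FALSE))
# One logical per inner vector: does it contain at least one TRUE?
has_true <- vapply(l, any, logical(1))
# Positions in the outer list of the vectors that contain a TRUE
positions <- which(has_true)
# The list with those vectors dropped
cleaned <- l[!has_true]
```

Here positions gives the indices to inspect with l[[i]], and cleaned matches l4 above.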

Related

Indexing tables of logical vectors with zero counts in R

I have the following:
> v1 <- c(T, F, T, T, F)
> table(v1)
v1
FALSE TRUE
2 3
To index the 'True' column, I do this:
> table(v1)[2]
TRUE
3
However, if a logical vector contains only FALSE values, the table will only have one column and the previous strategy no longer works to retrieve the TRUE column:
> v2 <- c(F, F, F, F, F)
> table(v2)[2]
<NA>
NA
How can one consistently index the TRUE column regardless of whether its count is zero? One solution is to do this:
> table(factor(v2, levels= c("FALSE", "TRUE")))[2]
TRUE
0
But this feels like cheating because it treats TRUE and FALSE as characters that become levels of a factor. For non-logical vectors, this behaviour is understandable, because there is no way of knowing what levels exist. (1) Is there a way to force table() to take into consideration the fact that logical vectors only take on two values and always present two columns for them? (2) Am I overthinking this and the last command is an acceptable and robust practice?
Convert to a factor with the levels specified so that it always has two levels. If the vector contains no TRUE value, table has no way to create a TRUE count, because that information is not present in the data. With the factor levels set, the TRUE count is reported as 0.
table(factor(v2, levels = c(FALSE, TRUE)))[2]
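Once both levels are guaranteed to be present, the counts can also be looked up by name rather than by position, which avoids depending on the column order. A small sketch, with counts, n_true and n_false as illustrative names:

```r
v2 <- c(FALSE, FALSE, FALSE, FALSE, FALSE)
# Both levels are forced, so both names always exist in the table
counts <- table(factor(v2, levels = c(FALSE, TRUE)))
# Index by level name instead of by position
n_true <- counts[["TRUE"]]
n_false <- counts[["FALSE"]]
```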
It is not clear why the TRUE values of a logical vector need to be counted with table and then extracted by the TRUE/FALSE names. It is easier to use sum, because TRUE is coerced to 1 and FALSE to 0; negating with ! reverses this.
> sum(v1)
[1] 3
> sum(!v1)
[1] 2
> sum(v2)
[1] 0
> sum(!v2)
[1] 5
Because the case of logical is so specific for the requirements, I would write a specific function:
logitable <- function(x) {
  x <- as.logical(x)
  kNA <- sum(is.na(x))          # number of missing values
  kT <- sum(x, na.rm = TRUE)    # number of TRUE values
  kF <- length(x) - kT - kNA    # everything else is FALSE
  structure(
    c(kT, kF, kNA),
    names = c("TRUE", "FALSE", "NA")
  )
}
Please note that the return value is a named numeric vector, not an object of class "table". Let me know if it is important to you that it return such an object.
Test with:
logitable(c(T,F,T,F,T))
logitable(c(T,T,T,T,T))
logitable(c(F,F,F,F,F))
logitable(c(T,F,T,F,NA))
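If an object of class "table" is needed, one possible variant combines the factor approach with the useNA argument of table(). This is only a sketch; logical_table is a hypothetical name, and the column order (FALSE, TRUE, NA) differs from logitable above:

```r
# Hypothetical helper: always returns a "table" with FALSE, TRUE and NA counts
logical_table <- function(x) {
  x <- as.logical(x)
  # Fixed levels keep zero counts; useNA = "always" keeps the NA column
  table(factor(x, levels = c(FALSE, TRUE)), useNA = "always")
}

tab <- logical_table(c(TRUE, FALSE, TRUE, FALSE, NA))
all_false <- logical_table(c(FALSE, FALSE, FALSE))
```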

Determine which elements of a vector partially match a second vector, and which elements don't (in R)

I have a vector A, which contains a list of genera, which I want to use to subset a second vector, B. I have successfully used grepl to extract anything from B that has a partial match to the genera in A. Below is a reproducible example of what I have done.
But now I would like to get a list of which genera in A matched something in B, and which genera did not. I.e. the "matched" list would contain Cortinarius and Russula, and the "unmatched" list would contain Laccaria and Inocybe. Any ideas on how to do this? In reality my vectors are very long, and the genus names in B are not all in the same position amongst the other info.
# create some dummy vectors
A <- c("Cortinarius","Laccaria","Inocybe","Russula")
B <- c("fafsdf_Cortinarius_sdfsdf","sdfsdf_Russula_sdfsdf_fdf","Tomentella_sdfsdf","sdfas_Sebacina","sdfsf_Clavulina_sdfdsf")
# extract the elements of B that have a partial match to anything in A.
new.B <- B[grepl(paste(A,collapse="|"), B)]
# But now how do I tell which elements of A were present in B, and which ones were not?
We could use lapply or sapply to loop over the patterns and then get a named output
out <- setNames(lapply(A, function(x) grep(x, B, value = TRUE)), A)
Then it is easy to separate the patterns that matched from the ones that returned empty elements
> out[lengths(out) > 0]
$Cortinarius
[1] "fafsdf_Cortinarius_sdfsdf"
$Russula
[1] "sdfsdf_Russula_sdfsdf_fdf"
> out[lengths(out) == 0]
$Laccaria
character(0)
$Inocybe
character(0)
and get the names of that
> names(out[lengths(out) > 0])
[1] "Cortinarius" "Russula"
> names(out[lengths(out) == 0])
[1] "Laccaria" "Inocybe"
You can use sapply with grepl to check each value of A against every value of B.
sapply(A, grepl, B)
# Cortinarius Laccaria Inocybe Russula
#[1,] TRUE FALSE FALSE FALSE
#[2,] FALSE FALSE FALSE TRUE
#[3,] FALSE FALSE FALSE FALSE
#[4,] FALSE FALSE FALSE FALSE
#[5,] FALSE FALSE FALSE FALSE
You can take column-wise sum of these values to get the count of matches.
result <- colSums(sapply(A, grepl, B))
result
#Cortinarius Laccaria Inocybe Russula
# 1 0 0 1
#values with at least one match
names(Filter(function(x) x > 0, result))
#[1] "Cortinarius" "Russula"
#values with no match
names(Filter(function(x) x == 0, result))
#[1] "Laccaria" "Inocybe"
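Both answers treat each genus name as a regular expression. If the names might contain regex metacharacters, a variant with fixed = TRUE matches them literally. A minimal sketch under that assumption; in_B, matched and unmatched are illustrative names:

```r
A <- c("Cortinarius", "Laccaria", "Inocybe", "Russula")
B <- c("fafsdf_Cortinarius_sdfsdf", "sdfsdf_Russula_sdfsdf_fdf",
       "Tomentella_sdfsdf", "sdfas_Sebacina", "sdfsf_Clavulina_sdfdsf")

# For each genus, is it a literal substring of at least one element of B?
in_B <- vapply(A, function(g) any(grepl(g, B, fixed = TRUE)), logical(1))

matched <- A[in_B]     # genera found somewhere in B
unmatched <- A[!in_B]  # genera not found in B
```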

Subsetting a logical vector with a logical vector in R

(Note: following the suggestions in the comments, I have changed the original title "Comparing the content of two vectors in R?" to "Subsetting a logical vector with a logical vector in R")
I am trying to understand the following R code snippet (by the way, the question originated while I was trying to understand this example.)
I have a vector a defined as:
a = c(FALSE, FALSE)
Then I can define b:
b <- a
I check b's content and everything looks OK:
b
#> [1] FALSE FALSE
Question
Now, what is the following code doing? Is it checking if b is equal to "not" a?
b[!a]
#> [1] FALSE FALSE
But if I try b[a] the result is different:
b[a]
#> logical(0)
I also tried a different example:
a = c(FALSE, TRUE)
b <- a
b
#> [1] FALSE TRUE
Now I try the same operations as above, but I get a different result:
b[!a]
#> [1] FALSE
b[a]
#> [1] TRUE
Created on 2021-03-23 by the reprex package (v0.3.0)
[] is used for subsetting a vector. You can subset a vector using integer index or logical values.
When you are using logical vector to subset a vector, a value in the vector is selected if it is TRUE. In your example you are subsetting a logical vector with a logical vector which might be confusing. Let's take another example :
a <- c(10, 20)
b <- c(TRUE, FALSE)
a[b]
#[1] 10
Since 1st value is TRUE and second is FALSE, the first value is selected.
Now if we invert the values, 20 would be selected because !b returns FALSE TRUE.
a[!b]
#[1] 20
Now implement this same logic in your example -
a = c(FALSE, FALSE)
b <- a
!a returns TRUE TRUE, hence both values are selected when you do b[!a], and none of the values is selected when you do b[a].
b[!a] displays the values of b at the positions where !a evaluates to TRUE.
!a is actually T, T, so it displays the first and second values of b, which are F and F.
A clearer illustration:
a <- 1:4
b <- c(T, T, F, T)
Now a[!b] is evaluated as a[c(F, F, T, F)], i.e. only the third element of a is returned.
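The same idea as a small runnable sketch. The values 10 to 40 are chosen here only so that values and positions cannot be confused; which() is shown to make the selected positions explicit:

```r
a <- c(10, 20, 30, 40)
b <- c(TRUE, TRUE, FALSE, TRUE)

kept <- a[b]       # values of a at the positions where b is TRUE
dropped <- a[!b]   # values of a at the positions where b is FALSE
idx <- which(!b)   # the integer positions selected by !b

# An all-FALSE index selects nothing, which is why b[a] gave logical(0) above
none <- a[c(FALSE, FALSE, FALSE, FALSE)]
```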

replace logical elements in a list based on another logical list in R

I am trying to change the logical values (elements) of my list based on another list. Basically, where both lists are "TRUE", I want to change the value in the main list to "FALSE". Both lists have length 5. For example
List_A <- list(c(TRUE,FALSE,TRUE),c(FALSE,TRUE,TRUE),c(FALSE,FALSE,FALSE),c(TRUE,TRUE,TRUE),c(TRUE,FALSE,TRUE))
List_B <-list(c(FALSE,FALSE,FALSE),c(TRUE,TRUE,FALSE),c(TRUE,TRUE,TRUE),c(FALSE,FALSE,TRUE),c(FALSE,TRUE,TRUE))
List B has sequences as name attributes.
Desired output:
Output <-
list(c(TRUE,FALSE,TRUE),c(FALSE,FALSE,TRUE),c(FALSE,FALSE,FALSE),c(TRUE,TRUE,FALSE),c(TRUE,FALSE,FALSE))
In other words, elements in List_A remain the same unless the value is TRUE in both lists, in which case it is replaced with FALSE.
I've tried running the for loop below but it doesn't work and I don't know how I would redirect the output, if it did.
for(i in 1:length(List_A)) { List_A[[i]][List_B[[i]]] <- FALSE }
You can use the Map function.
Where both values are TRUE, set the value to FALSE; otherwise keep the value from List_A.
Output <- Map(function(x, y) replace(x, x & y, FALSE), List_A, List_B)
Output
#[[1]]
#[1] TRUE FALSE TRUE
#[[2]]
#[1] FALSE FALSE TRUE
#[[3]]
#[1] FALSE FALSE FALSE
#[[4]]
#[1] TRUE TRUE FALSE
#[[5]]
#[1] TRUE FALSE FALSE
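For logical vectors without NA values, the replacement rule above reduces to the expression x & !y (TRUE in the first vector and not TRUE in the second). A small sketch on made-up vectors that cover all four combinations:

```r
x <- c(TRUE, TRUE, FALSE, FALSE)
y <- c(TRUE, FALSE, TRUE, FALSE)

via_replace <- replace(x, x & y, FALSE)  # the rule used with Map above
via_logic <- x & !y                      # the equivalent logical expression

# Applied element-wise over two small lists
out <- Map(function(x, y) x & !y, list(x, y), list(y, x))
```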
We can use map2
library(purrr)
# keep TRUE only where List_A is TRUE and List_B is not
map2(List_A, List_B, ~ .x & !.y)

Finding the existence of a vector within matrix within list within list

I am trying to use R to find a vector within a matrix within a list within a list. I have tried to check whether the vector 'ab' exists using the following 'exists' calls, but none of them work. How can I make it work?
aa <- list(x = matrix(1,2,3), y = 4, z = 3)
colnames(aa$x) <- c('ab','bb','cb')
aa
#$x
# ab bb cb
#[1,] 1 1 1
#[2,] 1 1 1
#
#$y
#[1] 4
#
#$z
#[1] 3
exists('ab', where=aa)
#[1] FALSE
exists('ab', where=aa$x)
# Error in exists("ab", where = aa$x) : invalid 'envir' argument
exists('ab', where=colnames(aa$x))
# Error in as.environment(where) : no item called "ab" on the search list
colnames(aa$x)
#[1] "ab" "bb" "cb"
Column names belong to matrices or data.frames. So we loop over the list using sapply, get the column names (colnames), unlist them, and check whether 'ab' is in the resulting vector
'ab' %in% unlist(sapply(aa, colnames))
#[1] TRUE
If we want to be more specific for a particular list element, we extract the element (aa$x), get the column names and check whether 'ab' is among them.
'ab' %in% colnames(aa$x)
#[1] TRUE
Another option is to loop through 'aa' and, if the element is a matrix, extract the 'ab' column and check whether it is a vector. Wrapping the sapply in any gives a single TRUE/FALSE output.
any(sapply(aa, function(x) if(is.matrix(x)) is.vector(x[, 'ab']) else FALSE))
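The call above stops with a subscript error if some matrix in the list has no 'ab' column. A more defensive sketch tests the column names first; has_colname is a hypothetical helper name:

```r
aa <- list(x = matrix(1, 2, 3), y = 4, z = 3)
colnames(aa$x) <- c("ab", "bb", "cb")

# Hypothetical helper: TRUE if any matrix in the list has a column called `name`
has_colname <- function(lst, name) {
  any(vapply(lst,
             function(el) is.matrix(el) && name %in% colnames(el),
             logical(1)))
}

found_ab <- has_colname(aa, "ab")
found_zz <- has_colname(aa, "zz")
```

Because colnames() returns NULL for a matrix without column names, the %in% test is simply FALSE in that case rather than an error.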

Resources