Extract all TRUE elements from a list in R - r

I have a list:
lst <- list(list(c(TRUE, TRUE, TRUE, TRUE),
                 c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
                 c(TRUE, TRUE)),
            list(c(FALSE, FALSE),
                 c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
                 c(TRUE, TRUE, TRUE)))
I want to extract only the elements that are entirely TRUE, together with their original indices.
The result should be:
[[1]][[1]]
[1] TRUE TRUE TRUE TRUE
[[1]][[3]]
[1] TRUE TRUE
[[2]][[3]]
[1] TRUE TRUE TRUE

We loop through the outer list, name each inner element by its position, and then keep only the elements whose values are all TRUE:
lapply(lst, function(x) {x1 <- setNames(x, seq_along(x)); x1[sapply(x1, all)] })
#[[1]]
#[[1]]$`1`
#[1] TRUE TRUE TRUE TRUE
#[[1]]$`3`
#[1] TRUE TRUE
#[[2]]
#[[2]]$`3`
#[1] TRUE TRUE TRUE
Another option is modify_depth from purrr, which leaves empty (zero-length) elements in place of those that do not satisfy the condition:
library(purrr)
lst %>%
  modify_depth(2, ~ .x[all(.x)])
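A base R variant of the first approach can be sketched with Filter, which keeps the names of the elements it retains. This is only an illustration under the assumption that every inner element is a logical vector without NA values; the variable name res is made up here.

```r
lst <- list(list(c(TRUE, TRUE, TRUE, TRUE),
                 c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
                 c(TRUE, TRUE)),
            list(c(FALSE, FALSE),
                 c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
                 c(TRUE, TRUE, TRUE)))

# Name each inner element by its position, then keep the all-TRUE ones.
# Filter() preserves names, so the original indices survive as "1", "3", ...
res <- lapply(lst, function(x) Filter(all, setNames(x, seq_along(x))))

names(res[[1]])  # "1" "3"
names(res[[2]])  # "3"
```

The surviving names are character strings, so convert them with as.integer() if numeric indices are needed.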

Related

Find the *first* longest sequence of TRUE in a boolean vector

I need to find the first longest sequence of TRUE in a boolean vector. Some examples:
bool <- c(FALSE, TRUE, FALSE, TRUE)
# should become
c(FALSE, TRUE, FALSE, FALSE)
bool <- c(FALSE, TRUE, FALSE, TRUE, TRUE)
# should become
c(FALSE, FALSE, FALSE, TRUE, TRUE)
bool <- c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)
# should become
c(FALSE, TRUE, TRUE, FALSE, FALSE, FALSE)
The answer from here handles all my cases correctly, except the first of the examples above.
How can I change
with(rle(bool), rep(lengths == max(lengths[values]) & values, lengths))
so that it also handles the first example correctly?
One option could be:
with(rle(bool), rep(seq_along(values) == which.max(lengths * values), lengths))
Results for the first vector:
[1] FALSE TRUE FALSE FALSE
For the second:
[1] FALSE FALSE FALSE TRUE TRUE
For the third:
[1] FALSE TRUE TRUE FALSE FALSE FALSE
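One edge case may be worth checking. When the vector contains no TRUE at all, lengths * values is all zeros, which.max returns 1, and the one-liner marks the first run of FALSE as TRUE. A minimal sketch of a guarded version follows; the function name first_longest_true is made up for illustration, and NA values are assumed to be absent.

```r
# Hypothetical helper: keep only the first longest run of TRUE.
first_longest_true <- function(bool) {
  r <- rle(bool)
  # Guard: with no TRUE run there is nothing to keep, return input unchanged
  if (!any(r$values)) return(bool)
  # lengths * values is 0 for FALSE runs; which.max picks the first maximum
  keep <- seq_along(r$values) == which.max(r$lengths * r$values)
  rep(keep, r$lengths)
}

first_longest_true(c(FALSE, TRUE, FALSE, TRUE))
# [1] FALSE  TRUE FALSE FALSE
first_longest_true(c(FALSE, FALSE))
# [1] FALSE FALSE
```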
Not elegant but might work:
bool <- c(FALSE, TRUE, FALSE, TRUE)
tt <- rle(bool)
t1 <- which.max(tt$lengths[tt$values])
tt$values[tt$values][-t1] <- FALSE
inverse.rle(tt)
#[1] FALSE TRUE FALSE FALSE
and as a function:
fun <- function(bool) {
  tt <- rle(bool)
  t1 <- which.max(tt$lengths[tt$values])
  tt$values[tt$values][-t1] <- FALSE
  inverse.rle(tt)
}
fun(c(FALSE, TRUE, FALSE, TRUE))
#[1] FALSE TRUE FALSE FALSE
fun(c(FALSE, TRUE, FALSE, TRUE, TRUE))
#[1] FALSE FALSE FALSE TRUE TRUE
fun(c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE))
#[1] FALSE TRUE TRUE FALSE FALSE FALSE
fun(FALSE)
#[1] FALSE
fun(logical(0))
#logical(0)

compare vector to dataframe by applying identical with booleans

apply does not work, but calling identical directly does.
Create the data frame:
gp130 <- data.frame(matrix(nrow=7,ncol=6))
rownames(gp130) <- c("ABCDEF","ABCDE","ABCD","ABC","AB","BCDEF","MUCV5")
names(gp130) <- c("A","B","C","D","E","F")
gp130$A <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
gp130$B <- c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE)
gp130$C <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
gp130$D <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
gp130$E <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
gp130$F <- c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE)
Evaluate dataframe
gp130
           A     B     C     D     E     F
ABCDEF  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
ABCDE   TRUE  TRUE  TRUE  TRUE  TRUE FALSE
ABCD    TRUE  TRUE  TRUE  TRUE FALSE FALSE
ABC     TRUE  TRUE  TRUE FALSE FALSE FALSE
AB      TRUE  TRUE FALSE FALSE FALSE FALSE
BCDEF  FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
MUCV5  FALSE FALSE FALSE FALSE FALSE FALSE
Create a vector that matches column C
myv <- c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,FALSE) ##matches column C
apply(gp130, 2, identical, myv)
    A     B     C     D     E     F
FALSE FALSE FALSE FALSE FALSE FALSE
Why is C FALSE?
identical(gp130$C, myv)
[1] TRUE
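OK, I think I've got it. sapply passes each column to the function as a plain, unnamed vector. apply first converts the data frame to a matrix, so each column arrives as a named vector (the names are the row names), and identical returns FALSE because myv has no names. See the output of the two versions below.
apply(gp130, 2, function(x){
  identical(x, myv)
  print(x)   # named vector: the row names are attached
  print(myv)
})
sapply(gp130, function(x){
  identical(x, myv)
  print(x)   # plain, unnamed vector
  print(myv)
})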
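Given that explanation, two possible fixes can be sketched: iterate over the columns with sapply, or keep apply and drop the names before comparing. The data frame is rebuilt here in a compact form so the snippet runs on its own; the result names by_sapply and by_apply are made up for illustration.

```r
gp130 <- data.frame(
  A = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE),
  B = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE),
  C = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE),
  D = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  E = c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE),
  F = c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE),
  row.names = c("ABCDEF", "ABCDE", "ABCD", "ABC", "AB", "BCDEF", "MUCV5")
)
myv <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)  # matches column C

# Option 1: sapply hands each column over as an unnamed vector
by_sapply <- sapply(gp130, identical, myv)

# Option 2: keep apply, but strip the row names from each column first
by_apply <- apply(gp130, 2, function(col) identical(unname(col), myv))

by_sapply
#     A     B     C     D     E     F
# FALSE FALSE  TRUE FALSE FALSE FALSE
```

Option 1 also avoids the data frame to matrix conversion that apply performs, which matters when the columns have different types.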

Split logical vector based on FALSE/TRUE patterns

Given logical vector x:
x <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE,
FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
How can I split x at every FALSE/TRUE pattern? (A split on TRUE/FALSE patterns then follows from applying the same approach to !x.)
So the split would search for the patterns FALSE, FALSE, ..., FALSE , TRUE, TRUE, ..., TRUE until we reach again a FALSE. At which point, we stop. Said differently, we do the split every time we move from a TRUE to a FALSE.
Here is what I ended up with:
p <- which(diff(x)==-1)+1
split(x, cumsum(seq_along(x) %in% p))
The output is, as expected:
# $`0`
# [1] FALSE FALSE FALSE TRUE TRUE
# $`1`
# [1] FALSE FALSE TRUE
# $`2`
# [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE
# $`3`
# [1] FALSE TRUE
Is there another solution to this problem, or a more efficient way to do this?
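One possible shortening, sketched here rather than benchmarked, is to build the group ids directly from the TRUE-to-FALSE transitions and skip the which / %in% step. The names grp and parts are made up for illustration.

```r
x <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE,
       FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)

# diff(x) == -1 marks a TRUE followed by a FALSE; padding with a leading
# FALSE shifts the mark onto the FALSE that starts the next group
grp <- cumsum(c(FALSE, diff(x) == -1))
parts <- split(x, grp)

names(parts)  # "0" "1" "2" "3"
```

This should give the same groups as the version above, since both start a new group at the first FALSE after a TRUE.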

Replace/Modify values in logical vector (Pattern Matching)

The question looks simple, but I couldn't figure out how it can be done in R.
I want to modify a logical vector depending on patterns in its values. There are two modification steps:
If there is a single FALSE surrounded by TRUE values, switch it to TRUE.
If there are fewer than 3 successive TRUE values, switch them to FALSE.
Everything else should remain as it is. Here's an example:
# input
x = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
# output
xo = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE,
TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
cbind(x,xo) is
x xo
[1,] FALSE FALSE
[2,] TRUE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE
[5,] TRUE FALSE
[6,] TRUE FALSE
[7,] FALSE FALSE
[8,] FALSE FALSE
[9,] TRUE TRUE
[10,] TRUE TRUE
[11,] TRUE TRUE
[12,] FALSE TRUE
[13,] TRUE TRUE
[14,] TRUE TRUE
[15,] FALSE FALSE
[16,] FALSE FALSE
[17,] TRUE TRUE
[18,] TRUE TRUE
[19,] TRUE TRUE
[20,] TRUE TRUE
[21,] FALSE FALSE
I don't want to use a for loop because it's slow and I would need a lot of if statements.
Is there a better way to get this working?
Here is an approach:
#sample data
x <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
First, find the indices where FALSE values need to be changed to TRUE, by looking for FALSE values that both follow and are followed by a TRUE value:
tochange <-
  intersect(
    intersect(
      which(x == FALSE),       # not strictly necessary
      which(diff(x) == 1)      # FALSEs followed by a TRUE
    ),
    which(diff(x) == -1) + 1   # FALSEs that follow a TRUE
  )
Change the values
x[tochange] <- TRUE
Next, look for runs of TRUE (and FALSE) that are less than 3 in length, and set them to FALSE.
xrle <- rle(x)
xrle$values[xrle$lengths < 3] <- FALSE
newx <- inverse.rle(xrle) # thanks to Frank for pointing out inverse.rle!
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#[10] TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE
#[19] TRUE TRUE FALSE
You can try rle (thanks to @Frank for the modification)
xtmp <- inverse.rle(within.list(rle(x), {
  n <- length(values)
  values[lengths == 1 & !values & !seq_len(n) %in% c(1, n)] <- TRUE
}))
res <- inverse.rle(within.list(rle(xtmp),
  values[lengths < 3 & values] <- FALSE
))
identical(xo,res) # TRUE
Try:
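make_true <- function(x) {
  # Encode the vector as a string of 0s and 1s, e.g. "0100110..."
  string <- paste(as.numeric(x), collapse='')
  # The lookahead finds every "101", including overlapping matches
  ans <- gregexpr('(?=(101))', string, perl=TRUE)
  # Match start + 1 is the position of the lone 0 (no match gives index 0, a no-op)
  x[as.numeric(ans[[1]])+1L] <- TRUE
  res <- rle(x)
  res$values[res$lengths < 3] <- FALSE
  inverse.rle(res)
}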
The function takes advantage of the fact that T and F can be coerced to numeric. The pattern searched for is "101".
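For comparison, both steps can also be sketched with shifted comparisons for step 1 and rle for step 2, then checked against the expected output xo from the question. The function name smooth_runs and its min_run parameter are made up for illustration, and NA values are assumed to be absent.

```r
# Hypothetical helper combining both modification steps.
smooth_runs <- function(x, min_run = 3L) {
  n <- length(x)
  if (n >= 3L) {
    i <- 2:(n - 1)
    # Step 1: interior FALSE with a TRUE on both sides becomes TRUE
    lone <- i[!x[i] & x[i - 1] & x[i + 1]]
    x[lone] <- TRUE
  }
  # Step 2: runs of TRUE shorter than min_run become FALSE
  r <- rle(x)
  r$values[r$values & r$lengths < min_run] <- FALSE
  inverse.rle(r)
}

x <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
       FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
xo <- c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE,
        TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)

identical(smooth_runs(x), xo)
# [1] TRUE
```

Step 1 uses the original values of the neighbours, so every lone FALSE is detected before any value is changed.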

which rows match a given vector in R

I have a matrix A,
A = as.matrix(data.frame(col1 = c(1,1,2,3,1,2), col2 = c(-1,-1,-2,-3,-1,-2), col3 = c(2,6,1,3,2,4)))
And I have a vector v,
v = c(-1, 2)
How can I get a vector of TRUE/FALSE that compares the last two columns of the matrix and returns TRUE if the last two columns match the vector, or false if they don't?
For example, if I try
A[,c(2:3)] == v
I obtain,
col2 col3
[1,] TRUE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE
[5,] TRUE FALSE
[6,] FALSE FALSE
That is not what I want. I want TRUE only for rows where both columns match the vector v, i.e.
result = c(TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
since the first and fifth rows match the vector v entirely.
Here's a simple alternative
> apply(A[, 2:3], 1, function(x) all(x==v))
[1] TRUE FALSE FALSE FALSE TRUE FALSE
Oops, by looking through the R mailing list I found an answer: https://stat.ethz.ch/pipermail/r-help/2010-September/254096.html
check.equal <- function(x, y)
{
  isTRUE(all.equal(y, x, check.attributes=FALSE))
}
result = apply(A[,c(2:3)], 1, check.equal, y=v)
Not sure I need to define a function and do all that, maybe there are easier ways to do it.
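Here's another straightforward option. Note that duplicated flags any row that recurs later, so it also marks rows of A[, 2:3] that merely repeat each other; it is only reliable when the rows that do not match v are unique: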
which(duplicated(rbind(A[, 2:3], v), fromLast=TRUE))
# [1] 1 5
results <- rep(FALSE, nrow(A))
results[which(duplicated(rbind(A[, 2:3], v), fromLast=TRUE))] <- TRUE
results
# [1] TRUE FALSE FALSE FALSE TRUE FALSE
Alternatively, as one line:
duplicated(rbind(A[, 2:3], v), fromLast=TRUE)[-(nrow(A)+1)]
# [1] TRUE FALSE FALSE FALSE TRUE FALSE
A dirty one:
result <- logical(nrow(A))  # preallocate instead of growing the vector
for (n in seq_len(nrow(A))) result[n] <- all(A[n, -1] == v)
> result
[1] TRUE FALSE FALSE FALSE TRUE FALSE
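The original attempt, A[, c(2:3)] == v, goes wrong because R recycles v down the columns of the matrix rather than across each row. A vectorised sketch that avoids apply is to transpose first, so that each column of the transposed matrix is one row of A and lines up with v. The matrix is built with cbind here only to keep the snippet short.

```r
A <- cbind(col1 = c(1, 1, 2, 3, 1, 2),
           col2 = c(-1, -1, -2, -3, -1, -2),
           col3 = c(2, 6, 1, 3, 2, 4))
v <- c(-1, 2)

# t(A[, 2:3]) is 2 x 6, so v is recycled once per original row;
# a row matches when all length(v) comparisons are TRUE
result <- colSums(t(A[, 2:3]) == v) == length(v)
result
# [1]  TRUE FALSE FALSE FALSE  TRUE FALSE
```

This compares with ==, so it tests exact equality; for values that come out of floating-point arithmetic, the all.equal version above is the more tolerant choice.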

Resources