Return all elements of list containing certain strings - r

I have a list of vectors containing strings and I want R to give me another list with all vectors that contain certain strings. MWE:
list1 <- list("a", c("a", "b"), c("a", "b", "c"))
Now, I want a list that contains all vectors with "a" and "b" in it. Thus, the new list should contain two elements, c("a", "b") and c("a", "b", "c").
As list1[grep("a|b", list1)] gives me a list of all vectors containing either "a" or "b", I expected list1[grep("a&b", list1)] to do what I want, but it did not (it returned a list of length 0).

This should work:
test <- list("a", c("a", "b"), c("a", "b", "c"))
test[sapply(test, function(x) sum(c('a', 'b') %in% x) == 2)]

Try purrr::keep
library(purrr)
keep(list1, ~ all(c("a", "b") %in% .))

We can use Filter
Filter(function(x) all(c('a', 'b') %in% x), test)
#[[1]]
#[1] "a" "b"
#[[2]]
#[1] "a" "b" "c"

A solution with grepl:
> list1[grepl("a", list1) & grepl("b", list1)]
[[1]]
[1] "a" "b"
[[2]]
[1] "a" "b" "c"

Related

Finding in which vector does the element belong to

suppose I have 3 vectors:
a = c("A", "B", "C")
b = c("D", "E", "F")
c = c("G", "H", "I")
then I have an element:
element = "E"
I want to find which list does my element belongs to. In this case, list b.
It will be appreciated if the solution to this problem is more general because my real data set have more than a hundred lists.
element = "E"
names(our_lists)[sapply(our_lists, `%in%`, x = element)]
# [1] "b"
Data
our_lists <- list(
a = c("A", "B", "C"),
b = c("D", "E", "F"),
c = c("G", "H", "I")
)
Using grep.
element <- "E"
l <- mget(c("a", "b", "c"))
names(l)[grep(element, l)]
# [1] "b"
If you keep the data in individual objects, you need to check for the element in each one individually. Get them in a list.
list_data <- mget(c('a', 'b', 'c'))
names(Filter(any, lapply(list_data, `==`, element)))
#[1] "b"
If all your vectors have the same length then a vectorised idea can be,
c('a', 'b', 'c')[ceiling(which(c(a, b, c) == 'E') / length(a))]
#[1] "b"
You can use dplyr::lst that creates named list from variable names. Then purrr::keep to keep only the vectors that contain your element.
require(tidyverse)
lst(a, b, c) %>%
keep(~ element %in% .x) %>%
names()
output:
[1] "b"

R use mapply on nested list

Using base R, I'd like to use the mapply function on a nested list. For example, in the code below, I'm trying to remove the letter "a" from each element of a nested list. I'd like to replace the last two lines with just a single line of code.
mylist <- list(
list(c("a", "b", "c"), c("d", "e", "f")),
list(c("a", "v", "w"), c("x", "y"), c("c", "b", "a"))
)
mylist
not_a <- lapply(mylist, lapply, `!=`, "a")
not_a
mylist[[1]] <- mapply(`[`, mylist[[1]], not_a[[1]], SIMPLIFY = FALSE)
mylist[[2]] <- mapply(`[`, mylist[[2]], not_a[[2]], SIMPLIFY = FALSE)
One option could be:
rapply(mylist, how = "replace", function(x) x[x != "a"])
[[1]]
[[1]][[1]]
[1] "b" "c"
[[1]][[2]]
[1] "d" "e" "f"
[[2]]
[[2]][[1]]
[1] "v" "w"
[[2]][[2]]
[1] "x" "y"
[[2]][[3]]
[1] "c" "b"
Or using map2
library(purrr)
map2(mylist, not_a, ~ map2(.x, .y, `[`))
Or using map_depth (if the OP is interested only in the final outcome)
map_depth(mylist, 2, ~ .x[.x != 'a'])
#[[1]]
#[[1]][[1]]
#[1] "b" "c"
#[[1]][[2]]
#[1] "d" "e" "f"
#[[2]]
#[[2]][[1]]
#[1] "v" "w"
#[[2]][[2]]
#[1] "x" "y"
#[[2]][[3]]
#[1] "c" "b"
Or more compactly
map_depth(mylist, 2, setdiff, 'a')
A double loop Map/mapply will do what the question asks for.
Map(function(i) mapply(`[`, mylist[[i]], not_a[[i]], SIMPLIFY = FALSE), seq_along(mylist))
Simpler:
Map(function(x, y) Map(`[`, x, y), mylist, not_a)

Generate all combinations (and their sum) of a vector of characters in R

Suppose that I have a vector of length n and I need to generate all possible combinations and their sums. For example:
If n=3, we have:
myVec <- c("a", "b", "c")
Output =
"a"
"b"
"c"
"a+b"
"a+c"
"b+c"
"a+b+c"
Note that we consider that a+b = b+a, so only need to keep one.
Another example if n=4,
myVec <- c("a", "b", "c", "d")
Output:
"a"
"b"
"c"
"d"
"a+b"
"a+c"
"a+d"
"b+c"
"b+d"
"c+d"
"a+b+c"
"a+c+d"
"b+c+d"
"a+b+c+d"
We can use sapply with varying length in combn and use paste as function to apply.
sapply(seq_along(myVec), function(n) combn(myVec, n, paste, collapse = "+"))
#[[1]]
#[1] "a" "b" "c"
#[[2]]
#[1] "a+b" "a+c" "b+c"
#[[3]]
#[1] "a+b+c"
myVec <- c("a", "b", "c", "d")
sapply(seq_along(myVec), function(n) combn(myVec, n, paste, collapse = "+"))
#[[1]]
#[1] "a" "b" "c" "d"
#[[2]]
#[1] "a+b" "a+c" "a+d" "b+c" "b+d" "c+d"
#[[3]]
#[1] "a+b+c" "a+b+d" "a+c+d" "b+c+d"
#[[4]]
#[1] "a+b+c+d"
We can unlist if we need output as single vector.

Returning the values of a list based on "two" parameters

Very new to R. So I am wondering if you can use two different parameters to get the position of both elements from a list. See the below example...
x <- c("A", "B", "A", "A", "B", "B", "C", "C", "A", "A", "B")
y <- c(which(x == "A"))
[1] 1 3 4 9 10
x[y]
[1] "A" "A" "A" "A" "A"
x[y+1]
[1] "B" "A" "B" "A" "B"
But I would like to return the positions of both y and y+1 together in the same list. My current solution is to merge the two above lists by row number and create a dataframe from there. I don't really like that and was wondering if there is another way. Thanks!
I dont know what exactly you want, but this could help:
newY = c(which(x == "A"),which(x == "A")+1)
After that you can sort it with
finaldata <- newY[order(newY)]
Or you do both in one step:
finaldata <- c(which(x == "A"),which(x == "A")+1)[order(c(which(x == "A"),which(x == "A")+1))]
Then you could also delete duplicates if you want to. Please tell me if this is what you wanted.

Order a numeric vector by length in R

I've got two numeric vectors that I want to order by the length of the their observations, i.e., the number of times each observation appears.
For example:
x <- c("a", "a", "a", "b", "b", "b", "b", "c", "e", "e")
Here, b occurs four times, a three times, e two and c one time. I'd like my result in this order.
ans <- c("b", "b", "b", "b", "a", "a", "a", "e", "e", "c")
I´ve tried this:
x <- x[order(-length(x))] # and some similar lines.
Thanks
Using rle you can get values lenghts. You order lengths, and use values to recreate the vector again using the new order:
xx <- c('a', 'a', 'a', 'b', 'b', 'b','b', 'c', 'e', 'e')
rr <- rle(xx)
ord <- order(rr$lengths,decreasing=TRUE)
rep(rr$values[ord],rr$length[ord])
## [1] "b" "b" "b" "b" "a" "a" "a" "e" "e" "c"
You may also use ave when calculating the lengths
x[order(ave(x, x, FUN = length), decreasing = TRUE)]
# [1] "b" "b" "b" "b" "a" "a" "a" "e" "e" "c"

Resources