which() function for lists in R

This should be easy, but I am hoping to find out how to return the indices of the list elements that contain a given value. For example, in the list below, let's say I want to find all indices where "a" is an element. I would want a function to return the index 1.
> x = list(c("a", "b"), "c")
> x
[[1]]
[1] "a" "b"
[[2]]
[1] "c"
> which(x=="a")
integer(0)
Of course, which() does not work here. Any help would be appreciated!

You need to iterate over the list elements and check for the value in each one.
sapply(x, function(e) is.element('a', e))
## [1] TRUE FALSE
which(sapply(x, function(e) is.element('a', e)))
## [1] 1
The sapply expression returns a logical vector indicating whether 'a' is present in each element of the list, and which returns the indices of the TRUE elements.
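To reuse this for other values, the same idea can be wrapped in a small helper. This is only a sketch: the name which_contains is made up for illustration, and vapply is swapped in for sapply so that the result is always a logical vector, even for an empty list.

```r
# Hypothetical helper (the name is an assumption, not part of base R):
# returns the indices of the list elements that contain `value`.
which_contains <- function(lst, value) {
  # vapply guarantees one logical per list element
  hits <- vapply(lst, function(e) value %in% e, logical(1))
  which(hits)
}

x <- list(c("a", "b"), "c", c("d", "a"))
which_contains(x, "a")
which_contains(x, "z")  # no match gives integer(0)
```

Here the first call should report elements 1 and 3, since both contain "a".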

It's not exactly clear to me how you want the result formatted. If you simply want the indices as a single vector, it becomes difficult to tell which list element each match came from once the list gets longer. To keep that information, you could use which on each element. Just write
sapply(x, function(y) which(y == "a"))
Or you could use grep, which returns the index of the matched pattern. Here I'll show it used on the unlisted list, and then iterated over the list.
> grep("a", unlist(x))
# [1] 1
> sapply(x, function(y) grep("a", y))
# [[1]]
# [1] 1
# [[2]]
# integer(0)
You could also use %in% to see exactly where the occurrences of "a" are. This returns a logical vector.
> lapply(x, `%in%`, "a") ## or lapply(x, `==`, "a")
# [[1]]
# [1] TRUE FALSE
# [[2]]
# [1] FALSE
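One thing to keep in mind when choosing between these: grep matches a pattern anywhere inside a string, while == and %in% test for exact equality. A small sketch with a made-up list (the value "cab" is added here only to show the difference):

```r
# Example data is an assumption chosen to show partial matching
x <- list(c("a", "b"), "cab")
flat <- unlist(x)

grep("a", flat)      # pattern match: also hits "cab"
which(flat == "a")   # exact match only
grep("^a$", flat)    # anchoring the pattern makes grep exact as well
```

So if you mean "the element is exactly 'a'", either use == / %in% or anchor the pattern.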

Related

subset of list of vector with grep?

I have a list of vectors and I want to create a new list containing every value that contains the letter 'a', while keeping the internal structure.
l = list ( g1 = c('a','b','ca') ,
g2 = c('a','b') )
lapply(l, function(x) grep('a',x) )
This only provides the index numbers, but what I want it to return are the values.
The end result should be a list with vector g1 containing a and ca whilst g2 with just a.
thanks!
Add value = TRUE.
lapply(l, function(x) grep('a', x, value = TRUE))
# $g1
# [1] "a" "ca"
#
# $g2
# [1] "a"
Alternatively, you can do:
lapply(l, function(x) x[grepl("a", x)])
$g1
[1] "a" "ca"
$g2
[1] "a"
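As a quick check, the two base R approaches above should give the same result. The sketch below adds a third vector, g3, which is not in the question and has no match, to show that its slot is kept as an empty character vector rather than dropped:

```r
# g3 is an extra, made-up element with no "a" in it
l <- list(g1 = c("a", "b", "ca"), g2 = c("a", "b"), g3 = c("x", "y"))

by_value <- lapply(l, function(x) grep("a", x, value = TRUE))
by_mask  <- lapply(l, function(x) x[grepl("a", x)])

identical(by_value, by_mask)
lengths(by_value)   # g3 has length 0 but is still present
```

If the empty slots are unwanted, Filter(length, by_value) would remove them.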
If you want to try the tidyverse, here are a couple of approaches.
library(tidyverse)
map(l, ~grep('a', .x, value = TRUE))
map(l, ~str_subset(.x, 'a')) # str_subset() from the stringr package is equivalent to the grep call above.

Is there a R function for limiting the length of list elements?

I am struggling with a list manipulation in R right now. I have a list containing about 3000 elements, where each element is a character vector. The length of these character vectors is between 7 and 10.
I would like to manipulate this list so that the character vectors containing more than 7 elements are limited to their first 7 elements, i.e. drop the 8th, 9th, and 10th element/word/number of the respective character vector.
Is there an easy way to do this? I hope you understand what I mean.
Thanks in advance!
You can use lapply as below:
mylist <- list(a = c("a", "b"),
b = c("a", "b", "c"))
mylist
$a
[1] "a" "b"
$b
[1] "a" "b" "c"
mylist2 <- lapply(mylist, function(x) {
x[1:min(length(x), 2)]
})
mylist2
$a
[1] "a" "b"
$b
[1] "a" "b"
What you need is an auxiliary function that will shorten your vector. Something like
shorten_vector <- function(y, max_length = 7){
# NOTE: assumes that there are at least 7 elements in the vector.
y[seq_len(max_length)]
}
you can then shorten the vectors in your list with
lapply(your_list, shorten_vector)
Or better
lapply(your_list, head, 7) # Thanks Moody
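The reason head is the safer choice: indexing past the end of a vector pads the result with NA, while head simply returns what is there. A minimal sketch with a made-up short vector:

```r
short <- c("a", "b", "c")   # fewer than 7 elements

padded  <- short[seq_len(7)]  # indexing past the end fills with NA
trimmed <- head(short, 7)     # head never returns more than exists

length(padded)
sum(is.na(padded))
trimmed
```

In the question every vector has at least 7 elements, so both versions behave the same there; the difference only shows up for shorter input.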
Reproducible example
# Make an object for an example. A list of length 15
# where each element is a character vector between length 7 and 10
random_length <- sample(7:10, 15, replace = TRUE)
char_list <-
lapply(random_length,
function(x){
letters[seq_len(x)]
})
# utility function
shorten_vector <- function(y, max_length = 7){
y[seq_len(max_length)]
}
lapply(char_list,
shorten_vector)
Bonus
You said in a comment on Sonny's answer that you weren't really sure how the lapply worked. At its conceptual core, lapply is a wrapper around a for loop. The equivalent for loop would be
for(i in seq_along(char_list)){
char_list[[i]] <- shorten_vector(char_list[[i]])
}
char_list
The lapply just handles the iteration limits for you and looks a little cleaner on the screen.
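If you want to convince yourself that the two forms agree, here is a small sketch. It uses a fixed example list instead of the random one above so the result is reproducible, and head in place of shorten_vector:

```r
# Fixed example data (an assumption, chosen so the check is deterministic)
char_list <- list(letters[1:9], letters[1:7], letters[1:10])

# lapply version
via_lapply <- lapply(char_list, head, 7)

# for-loop version, writing into a pre-allocated list
via_loop <- vector("list", length(char_list))
for (i in seq_along(char_list)) {
  via_loop[[i]] <- head(char_list[[i]], 7)
}

identical(via_lapply, via_loop)
```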

Remove duplicated elements from list

I have a list of character vectors:
my.list <- list(e1 = c("a","b","c","k"),e2 = c("b","d","e"),e3 = c("t","d","g","a","f"))
And I'm looking for a function that for any character that appears more than once across the list's vectors (in each vector a character can only appear once), will only keep the first appearance.
The result list for this example would therefore be:
res.list <- list(e1 = c("a","b","c","k"),e2 = c("d","e"),e3 = c("t","g","f"))
Note that it is possible that an entire vector in the list is eliminated so that the number of elements in the resulting list doesn't necessarily have to be equal to the input list.
We can unlist the list, get a logical vector using duplicated, reshape it to the structure of 'my.list' with relist, and extract the elements in 'my.list' based on that logical index
un <- unlist(my.list)
res <- Map(`[`, my.list, relist(!duplicated(un), skeleton = my.list))
identical(res, res.list)
#[1] TRUE
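If relist feels opaque, the same steps can be spelled out with split. This is a sketch of an alternative, not the code from the answer; the variable names group and keep are made up here:

```r
my.list <- list(e1 = c("a", "b", "c", "k"),
                e2 = c("b", "d", "e"),
                e3 = c("t", "d", "g", "a", "f"))

un    <- unlist(my.list, use.names = FALSE)        # all values, in order
group <- rep(names(my.list), lengths(my.list))     # which vector each value came from
keep  <- !duplicated(un)                           # TRUE for first appearances only

# Regroup the surviving values; fixing the factor levels keeps the original order
res <- split(un[keep], factor(group[keep], levels = names(my.list)))
res
```

A vector that loses all of its values would show up as character(0) here; Filter(length, res) would drop such entries if that is what you want.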
Here is an alternative using mapply with setdiff and Reduce.
# make a copy of my.list
res.list <- my.list
# take set difference between contents of list elements and accumulated elements
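# SIMPLIFY = FALSE keeps the result a list even when all pieces have equal length
res.list[-1] <- mapply("setdiff", res.list[-1],
head(Reduce(c, my.list, accumulate=TRUE), -1), SIMPLIFY = FALSE)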
Keeping the first element of the list as is, we take each subsequent element and remove from it everything in the corresponding cumulative vector of earlier elements, which Reduce produces with c and the accumulate=TRUE argument. head(..., -1) drops the final list item containing all elements so that the lengths align.
This returns
res.list
$e1
[1] "a" "b" "c" "k"
$e2
[1] "d" "e"
$e3
[1] "t" "g" "f"
Note that in Reduce, we could replace c with function(x, y) unique(c(x, y)) and accomplish the same ultimate output.
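To see what the accumulate step produces, it can help to look at the intermediate pieces. The sketch below uses Map instead of mapply (Map always returns a list); the names seen and previous are made up for illustration:

```r
my.list <- list(e1 = c("a", "b", "c", "k"),
                e2 = c("b", "d", "e"),
                e3 = c("t", "d", "g", "a", "f"))

# Running concatenation: e1, then e1+e2, then e1+e2+e3
seen <- Reduce(c, my.list, accumulate = TRUE)
lengths(seen)

# Element i+1 is compared against everything seen up to element i
previous <- head(seen, -1)
rest <- Map(setdiff, my.list[-1], previous)

res <- c(my.list[1], rest)
res
```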
I found the solutions here very complex for my understanding and sought a simpler technique. Suppose you have the following list.
my_list <- list(a = c(1,2,3,4,5,5), b = c(1,2,2,3,3,4,4),
d = c("Mary", "Mary", "John", "John"))
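The following much simpler piece of code removes the duplicates within each vector (it does not remove values that are repeated across different vectors, which is what the question asks for).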
sapply(my_list, unique)
You will end up with the following.
$a
[1] 1 2 3 4 5
$b
[1] 1 2 3 4
$d
[1] "Mary" "John"
There is beauty in simplicity!
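One caveat with sapply here: when every result happens to have the same length, sapply simplifies to a matrix instead of returning a list. A small sketch with made-up data where that happens, next to lapply, which always returns a list:

```r
# Both vectors reduce to two unique values, so sapply simplifies
my_list <- list(a = c(1, 1, 2), b = c(3, 4, 4))

as_matrix <- sapply(my_list, unique)   # 2 x 2 matrix with columns a and b
as_list   <- lapply(my_list, unique)   # list, regardless of lengths

is.matrix(as_matrix)
is.list(as_list)
```

So lapply(my_list, unique) is probably the safer form if a list is what you need downstream.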

Delete elements of list appearing before one element and itself with R

I have a list of elements (letters here in the example)
(l <- list(letters[1:2], letters[2:3]))
# [[1]]
# [1] "a" "b"
# [[2]]
# [1] "b" "c"
And another element
(r <- letters[2])
# [1] "b"
I created a function that deletes the elements appearing before a given element, as well as that element itself (here "b").
out = lapply(l, function(x) x[-c(1,which(x == "b"))])
Filter(length, out)
#[[1]]
#[1] "c"
Now my question is: in case r holds several elements rather than only "b", how can I loop over the whole list?
For example:
r
[1] a
[2] b
I would like to have a result like this
[1] c
Thank you
Cheers
We can use %in%
Filter(length, lapply(l, function(x)
x[-seq(tail(which(x %in% r),1))]))
#[[1]]
#[1] "c"
data
r <- c('a', 'b')
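The one-liner can also be written as a small named function, which makes the no-match case explicit. This is a sketch under an assumption: the helper name drop_through is made up, and here a vector with no match at all is returned unchanged, which may or may not be what you want.

```r
# Hypothetical helper: drop everything up to and including the last match of r
drop_through <- function(x, r) {
  hits <- which(x %in% r)
  if (length(hits) == 0) return(x)   # assumption: no match means keep x as is
  x[-seq_len(max(hits))]
}

# The third vector is added here only to exercise the no-match branch
l <- list(letters[1:2], letters[2:3], letters[4:5])
r <- c("a", "b")

out <- Filter(length, lapply(l, drop_through, r = r))
out
```

With this data the first vector becomes empty and is filtered out, the second is reduced to "c", and the third is left alone.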

Applying as.numeric only to elements of a list that can be coerced to numeric (in R)

I have a function which returns a list containing individual character vectors which I would like to convert to numeric. Most of the time, all the elements of the list can easily be coerced to numeric, and so a simple lapply(x, FUN = as.numeric) works fine.
e.g.
l <- list(a = c("1","1"), b = c("2","2"))
l
$a
[1] "1" "1"
$b
[1] "2" "2"
lapply(l, FUN = as.numeric)
$a
[1] 1 1
$b
[1] 2 2
However, in some situations, vectors contain true characters:
e.g.
l <- list(a = c("1","1"), b = c("a","b"))
l
$a
[1] "1" "1"
$b
[1] "a" "b"
lapply(l, FUN = as.numeric)
$a
[1] 1 1
$b
[1] NA NA
The solution I have come up with works but feels a little convoluted:
l.id <- unlist(lapply(l, FUN = function(x){all(!is.na(suppressWarnings(as.numeric(x))))}))
l.id
a b
TRUE FALSE
l[l.id] <- lapply(l[l.id], FUN = as.numeric)
l
$a
[1] 1 1
$b
[1] "a" "b"
So I was just wondering if anyone out there had a more streamlined and elegant solution to suggest.
Thanks!
One option would be to check whether all the elements in the vector consist only of digits and decimal points and, if so, convert to numeric; otherwise leave the vector unchanged.
lapply(l, function(x) if(all(grepl('^[0-9.]+$', x))) as.numeric(x) else x)
Or we can use type.convert to automatically convert the class, but depending on the R version and the as.is argument, the character vectors may be converted to factor class.
lapply(l, type.convert)
You could also do something like
lapply(l, function(x) if(is.numeric(t <- type.convert(x))) t else x)
# $a
# [1] 1 1
#
# $b
# [1] "a" "b"
This doesn't convert anything other than when a numeric results from type.convert(). Or, for this simple case we can use as.is = TRUE but note that this will not always give us what we want.
lapply(l, type.convert, as.is = TRUE)
# $a
# [1] 1 1
#
# $b
# [1] "a" "b"
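Note that the regular-expression check above only accepts digits and dots, so strings such as "-2.5" or "1e3" would be left as character. If those should count as numbers too, one option is to let as.numeric decide. This is a sketch; the helper name to_numeric_if_possible is made up, and it assumes that values that were already NA should not block the conversion:

```r
# Hypothetical helper: convert only if every non-missing value parses as a number
to_numeric_if_possible <- function(x) {
  converted <- suppressWarnings(as.numeric(x))
  # A new NA among previously non-missing values means the parse failed
  if (anyNA(converted[!is.na(x)])) x else converted
}

l <- list(a = c("1", "-2.5"), b = c("a", "b"), c = c("1e3", NA))
res <- lapply(l, to_numeric_if_possible)
str(res)

grepl("^[0-9.]+$", "-2.5")   # the regex check would reject this value
```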
