Extract list element by successive / cascading index chain - r

ll<-list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
looks like:
[[1]]
[[1]][[1]]
[1] "A" "B" "C"
[[1]][[2]]
[1] "Peter"
[[2]]
[1] "John"
[[3]]
[1] "Hans"
Let's say I have the indices for "Peter" and "B" stored in lists:
peter.ind <- list(1,2) # corresponds to ll[[1]][[2]]
B.ind <- list(1,1,2) # corresponds to ll[[1]][[1]][[2]]
How can I most efficiently extract a deeply nested list element by its chain of indices?
Here is my function, which already works:
extract0r <- function(x, l) {
  for (ind in l) {
    x <- x[[ind]]
  }
  return(x)
}
Calling the function:
extract0r(ll, peter.ind) # returns "Peter"
extract0r(ll, B.ind) # returns "B"
Is there a neater alternative to my function?

You can use a recursive function:
ll <- list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
my.ind <- function(L, ind) {
  if (length(ind) == 1) return(L[[ind]])
  my.ind(L[[ind[1]]], ind[-1])
}
my.ind(ll, c(1,2))
my.ind(ll, c(1,1,2))
# > my.ind(ll, c(1,2))
# [1] "Peter"
# > my.ind(ll, c(1,1,2))
# [1] "B"
The recursive function is relatively clear to read, but each level of nesting adds another function call, so it carries some overhead at run time.
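The loop in the question is a left fold over the index chain, so base R's Reduce can express the same idea in one line. The following is a minimal sketch rather than a benchmarked replacement; the name extract_by_chain is hypothetical and does not come from the question or the answers.

```r
ll <- list(list(c("A", "B", "C"), "Peter"), "John", "Hans")
peter.ind <- list(1, 2)
B.ind <- list(1, 1, 2)

# Hypothetical helper: start from x and apply `[[` once per index in the chain.
extract_by_chain <- function(x, chain) Reduce(`[[`, chain, x)

peter <- extract_by_chain(ll, peter.ind)
b <- extract_by_chain(ll, B.ind)
print(peter)
print(b)
```

Because Reduce accepts both lists and vectors, the index chain can be passed as list(1, 2) or c(1, 2) without conversion.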

There are many ways of doing this.
For example, you can build the commands from character strings:
my.ind.str <- function(L, ind) {
command <- paste0(c("L",sprintf("[[%i]]", ind)),collapse="")
return(eval(parse(text=command)))
}
With your example, I had to convert the lists of indices to vectors:
my.ind.str(ll, unlist(peter.ind))
[1] "Peter"
my.ind.str(ll, unlist(B.ind))
[1] "B"

Related

How to find NULL in nested lists [duplicate]

How do I remove the null elements from a list of lists, like below, in R:
lll <- list(list(NULL),list(1),list("a"))
The object I want would look like:
lll <- list(list(1),list("a"))
I saw a similar answer here: How can I remove an element from a list? but was not able to extend it from simple lists to a list of lists.
EDIT
That was a bad example on my part. Both answers work on the simpler case above. What if the list looks like this:
lll <- list(list(NULL),list(1,2,3),list("a","b","c"))
How to get:
lll <- list(list(1,2,3),list("a","b","c"))
This recursive solution has the virtue of working on even more deeply nested lists.
It's closely modeled on Gabor Grothendieck's answer to this quite similar question. My modification of that code is needed if the function is also to remove objects like list(NULL) (which is not the same as NULL), as you want.
## A helper function that tests whether an object is either NULL _or_
## a list of NULLs
is.NullOb <- function(x) is.null(x) | all(sapply(x, is.null))
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
  x <- Filter(Negate(is.NullOb), x)
  lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
rmNullObs(lll)
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Here is an example of its application to a more deeply nested list, on which the other currently proposed solutions variously fail.
LLLL <- list(lll)
rmNullObs(LLLL)
# [[1]]
# [[1]][[1]]
# [[1]][[1]][[1]]
# [[1]][[1]][[1]][[1]]
# [1] 1
#
#
# [[1]][[1]][[2]]
# [[1]][[1]][[2]][[1]]
# [1] "a"
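The output above uses the simpler list from the first version of the question. As a rough check on the list from the EDIT, the same logic can be run as the self-contained sketch below. The names is_null_ob and rm_null_obs are hypothetical stand-ins for is.NullOb and rmNullObs, and the use of || and vapply is my own variation, not part of the original answer.

```r
# Hypothetical re-statement of the helper: TRUE for NULL or a list holding only NULLs.
is_null_ob <- function(x) is.null(x) || all(vapply(x, is.null, logical(1)))

# Drop such objects at this level, then recurse into any remaining sub-lists.
rm_null_obs <- function(x) {
  x <- Filter(Negate(is_null_ob), x)
  lapply(x, function(el) if (is.list(el)) rm_null_obs(el) else el)
}

lll <- list(list(NULL), list(1, 2, 3), list("a", "b", "c"))
cleaned <- rm_null_obs(lll)
str(cleaned)
```

Here cleaned should hold only the two non-NULL sub-lists, each with its three elements intact.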
Here's an option that combines Filter and Negate:
Filter(Negate(function(x) is.null(unlist(x))), lll)
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Using purrr:
library(magrittr) # provides %>%
purrr::map(lll, ~ purrr::compact(.)) %>% purrr::keep(~length(.) != 0)
[[1]]
[[1]][[1]]
[1] 1
[[1]][[2]]
[1] 2
[[1]][[3]]
[1] 3
[[2]]
[[2]][[1]]
[1] "a"
[[2]][[2]]
[1] "b"
[[2]][[3]]
[1] "c"
For this particular example you can also use unlist with its recursive argument.
lll[!sapply(unlist(lll, recursive=FALSE), is.null)]
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Since you have lists in lists, you probably need to run l/sapply twice, like:
lll[!sapply(lll,sapply,is.null)]
#[[1]]
#[[1]][[1]]
#[1] 1
#
#
#[[2]]
#[[2]][[1]]
#[1] "a"
There is a new package rlist on CRAN, thanks to Kun Ren for making our life easier.
list.clean(.data, fun = is.null, recursive = FALSE)
or for recursive removal of NULL:
list.clean(.data, fun = is.null, recursive = TRUE)
A quick fix to Josh O'Brien's solution, which has a bit of an issue with lists that contain functions:
is.NullOb <- function(x) if(!(is.function(x))) is.null(x) | all(sapply(x, is.null)) else FALSE
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
  if(!(is.function(x))) {
    x = x[!(sapply(x, is.NullOb))]
    lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
  }
}

How to perform a vectorised operation with a self-defined function, adding the results to a list?

library(tidyverse)
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
test <- ridiculous_function("apple", "A")
> test
[[1]]
[1] "apple"
[[2]]
[1] "A"
This code produces a list containing a and b. However, what I would like is to run the function over two vectors in parallel and then put all of the results into the same list.
For example, with these two vectors:
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
I want to create a flat list whose elements are the character vectors "apple", "A", "apricot", "B", "avocado", "C", and so on. My real scenario is a lot more complex, so I need a solution that works within the confines of my function.
Expected output:
> test
[[1]]
[1] "apple"
[[2]]
[1] "A"
[[3]]
[1] "apricot"
[[4]]
[1] "B"
[[5]]
[1] "avocado"
[[6]]
[1] "C"
....
[[19]]
[1] "blueberry"
[[20]]
[1] "J"
How about:
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
library(tidyverse)
flatten(map2(fruits10, letters10, ridiculous_function))
which gives you
[[1]]
[1] "apple"
[[2]]
[1] "A"
[[3]]
[1] "apricot"
[[4]]
[1] "B"
[[5]]
[1] "avocado"
[[6]]
[1] "C"
[[7]]
[1] "banana"
[[8]]
[1] "D"
etc...
Here are a few different ways of doing this:
library(tidyverse)
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
# using mapply, base R for writing packages
mapply(ridiculous_function, fruits10, letters10) %>%
split(rep(1:ncol(.), each = nrow(.)))
# using map2, takes two args
map2(fruits10, letters10, ridiculous_function)
# using pmap, can take as many args as you want
list(a = fruits10,
b = letters10) %>%
pmap(ridiculous_function)
You ask for results in a flat list format, so you can pop a flatten at the end
of each of these, but usually you would want to retain the list structure.
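For completeness, the same result can be sketched in base R without purrr. This is only an illustration under a few assumptions: fruits3 is a hand-written stand-in for the first entries of stringr's fruit vector, so the block runs without the tidyverse, and unlist(..., recursive = FALSE) plays the role of flatten by removing exactly one level of nesting.

```r
ridiculous_function <- function(a, b) {
  moo <- a
  baz <- b
  list(moo, baz)
}

# Stand-in data (assumption): the first three fruits, typed by hand.
fruits3 <- c("apple", "apricot", "avocado")
letters3 <- LETTERS[1:3]

# Map walks both vectors in parallel and returns one two-element list per pair.
nested <- Map(ridiculous_function, fruits3, letters3)

# Drop the names Map adds, then remove one level of nesting.
flat <- unlist(unname(nested), recursive = FALSE)
str(flat)
```

Keeping nested around is usually the better default, as noted above; flat is only needed when the alternating layout is really required.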

R doubling list length when subsetting

I am currently trying to subset a list in R from a dataframe. My current attempt looks like:
list.level <- unique(buckets$group)
bucket.group <- vector("list",length(list.level))
for(i in list.level){
bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
}
However, instead of filling the list it seems to create a duplicate list of the same amount of rows, returning:
[[1]]
NULL
[[2]]
NULL
...
NULL
[[22]]
NULL
[[23]]
NULL
$A
[1] "A"
$C
[1] "C" "C" "C"
$D
[1] "D" "D" "D"
...
$AJ
[1] "AJ" "AJ" "AJ" "AJ" "AJ"
$AK
[1] "AK" "AK"
A should be filling into 1, C into 2, etc. etc. How do I get these to fill in the original rows rather than creating extra rows at the bottom of the list?
Here is what is going on. Suppose your buckets$group is c("a","a","b","b").
list.level <- unique(buckets$group)
Now list.level is c("a","b")
bucket.group <- vector("list",length(list.level))
Since length(list.level) is 2, your bucket.group is now a list of 2 NULL elements. They have no names and can only be reached by position, as bucket.group[[1]] and bucket.group[[2]].
for(i in list.level){
Recalling the value of list.level, it is the same as for i in c("a","b").
bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
Since i loops over "a" and "b", you now fill bucket.group[["a"]] and bucket.group[["b"]]. These names do not exist yet, so R appends two new named elements, while bucket.group[[1]] and bucket.group[[2]] remain NULL.
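The walkthrough above can be reproduced with a small stand-in data frame. This is a minimal sketch: buckets here is made-up data matching the c("a","a","b","b") example, not the asker's real data.

```r
# Made-up data matching the example in the explanation above.
buckets <- data.frame(group = c("a", "a", "b", "b"), stringsAsFactors = FALSE)

list.level <- unique(buckets$group)
bucket.group <- vector("list", length(list.level))

# Indexing by the character value appends named elements
# instead of filling positions 1 and 2.
for (i in list.level) {
  bucket.group[[i]] <- buckets$group[buckets$group == i]
}

n_elements <- length(bucket.group)
element_names <- names(bucket.group)
print(n_elements)
print(element_names)
```

The list ends up twice as long as intended: two untouched NULL slots followed by the two named elements.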
To fix this, you should write instead
list.level <- unique(buckets$group) # ok, this was correct
bucket.group <- list() # just an empty list
for(i in seq_along(list.level)){ # loop over positions, not values
  bucket.group[[i]] <- buckets$group[buckets$group == list.level[[i]] ]
}
I think the issue is with your for statement.
Your code is like this:
list.level<-letters[1:10]
> for(i in list.level) print(i)
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"
It assigns each element in list.level to i, so i is a letter. When you do
bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
in the first iteration, i is a letter. So it looks for a list element called bucket.group[["a"]] and does not find it, so it creates it and stores the data there. If instead you use seq_along
for(i in seq_along(list.level)) print(i)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
Now i will always be a number and the code will do what you want.
So use seq_along instead.
This should work:
list.level <- unique(buckets$group)
bucket.group <- vector("list",length(list.level))
for(i in seq_along(list.level)){
  bucket.group[[i]] <- subset(buckets$group,buckets$group == list.level[i])
}
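If the goal is simply one list element per group, base R's split may avoid the loop altogether. The sketch below uses made-up data in place of the asker's buckets. Note that split orders the groups by sorted factor level unless the levels are given explicitly, as the second call does.

```r
# Made-up stand-in for the asker's data frame.
buckets <- data.frame(group = c("C", "C", "A", "C", "D", "D"),
                      stringsAsFactors = FALSE)

# One element per group, named by group and ordered by sorted level.
by_name <- split(buckets$group, buckets$group)

# Keep the order of first appearance by fixing the factor levels.
by_first_seen <- split(buckets$group,
                       factor(buckets$group, levels = unique(buckets$group)))

# Drop the names if purely positional access is wanted.
by_position <- unname(by_first_seen)
str(by_name)
str(by_position)
```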

Create nested list structure from a string

I have a string that is a composite of n substrings. It could look like this:
string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")
Every subcomponent in this string is separated from the others by "_". Here, the first level consists of the values "A" and "B", the second level of "AA", "BB" and "CC", and the third level of "AAA". Deeper nestings are possible and the solution should extend to those cases. The nestings are not necessarily balanced: "A" has only two children while "B" has three, but "A" also has a grandchild, which "B" does not.
Essentially, I want to recreate the nested structure encoded in this string as some kind of R object, preferably a list. The nested list structure would look like this:
list("A" = list("AA", "BB" = list("AAA")),
"B" = list("AA", "BB", "CC"))
$A
$A[[1]]
[1] "AA"
$A$BB
$A$BB[[1]]
[1] "AAA"
$B
$B[[1]]
[1] "AA"
$B[[2]]
[1] "BB"
$B[[3]]
[1] "CC"
Any help on this is appreciated
You can make it into a matrix without too much fuss...
string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")
splitted<-strsplit(string,"_")
cols<-max(lengths(splitted))
mat<-do.call(rbind,lapply(splitted, "length<-", cols))
Not so straightforward, and not the most beautiful code, but it should do the job and return a list:
string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")
# loop through each element of the string "str_el"
list_els <- lapply(string, function(str_el) {
  # split the string into parts
  els <- strsplit(str_el, "_")[[1]]
  # loop backwards through the elements
  for (i in length(els):1){
    # the last element gives the value
    if (i == length(els)){
      # assign the value to a list and rename the list
      res <- list(els[[i]])
      names(res) <- els[[i - 1]]
    } else {
      # if its not the last element (value) assign the list res to another list
      # with the name of that element
      if (i != 1) {
        res <- list(res)
        names(res) <- els[[i - 1]]
      }
    }
  }
  return(res)
})
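# collect the lists; with a single list argument, mapply(c, ...) returns the
# per-string lists unchanged, so they are not merged into one nested list
res_list <- mapply(c, list_els, SIMPLIFY = FALSE)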
res_list
# [[1]]
# [[1]]$A
# [1] "AA"
#
#
# [[2]]
# [[2]]$A
# [1] "BB"
#
#
# [[3]]
# [[3]]$A
# [[3]]$A$BB
# [1] "AAA"
#
#
#
# [[4]]
# [[4]]$B
# [1] "AA"
#
#
# [[5]]
# [[5]]$B
# [1] "BB"
#
#
# [[6]]
# [[6]]$B
# [1] "CC"
Does that give you what you want?
I found this way to do it. It's weird, but it seems to work:
my_relist <- function(x){
  y=list()
  #This first loop creates the skeleton of the list
  for (name in x){
    split=strsplit(name,'_',fixed=TRUE)[[1]]
    char='y'
    l=length(split)
    for (i in 1:(l-1)){
      char=paste(char,'$',split[i],sep="")
    }
    char2=paste(char,'= list()',sep="")
    #Example of char2: "y$A$BB= list()"
    eval(parse(text=char2))
    #Evaluates the expression inside char2
  }
  #The second loop fills the list with the last element
  for (name in x){
    split=strsplit(name,'_',fixed=TRUE)[[1]]
    char='y'
    l=length(split)
    for (i in 1:(l-1)){
      char=paste(char,'$',split[i],sep="")
    }
    char3=paste(char,'=c(',char,',split[l])')
    #Example of char3: "y$A =c( y$A ,split[l])"; split[l] is looked up when evaluated
    eval(parse(text=char3))
  }
  return(y)
}
And this is the result:
example <- c("A_AA_AAA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")
my_relist(example)
#$A
#$BB
#1.'AAA'
#[[2]]
#'AA'
#[[3]]
#'BB'
#$B
#1.'AA'
#2.'BB'
#3.'CC'
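Neither approach above needs eval(parse()) in principle. As an alternative, here is a minimal recursive sketch that inserts each path into a tree of named lists. It deliberately differs from the structure requested in the question: every node, including each leaf, becomes a named (possibly empty) list, which sidesteps the case where "BB" is both a leaf and a parent. The function name insert_path is hypothetical.

```r
# Hypothetical helper: walk down `keys`, creating empty child lists as needed.
insert_path <- function(node, keys) {
  if (length(keys) == 0) return(node)
  key <- keys[1]
  child <- node[[key]]
  if (is.null(child)) child <- list()   # no child under this key yet
  node[[key]] <- insert_path(child, keys[-1])
  node
}

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

# Fold every path into one tree, starting from an empty list.
tree <- Reduce(
  function(acc, s) insert_path(acc, strsplit(s, "_", fixed = TRUE)[[1]]),
  string,
  list()
)
str(tree)
```

Children of any node can then be listed with names(), for example names(tree$A) or names(tree$A$BB), and the recursion handles deeper or unbalanced nestings without changes.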

Resources