Versions of lapply() and mclapply() that avoid redundant processing - r

I am looking for versions of lapply() and mclapply() that only process unique elements of the argument list X. Does something like this already exist?
EDIT: In other words, I want lapply() to not bother processing duplicates, but I want length(lapply(X, ...)) to equal length(X), not length(unique(X)) (and the appropriate values to match). Also, I am assuming each element of X is rather small, so taking unique values should not be too much trouble.
Current behavior:
long_computation <- function(task){
cat(task, "\n")
# Sys.sleep(1000) #
return(task)
}
tasks <- rep(LETTERS[1:2], 2)
lapply(tasks, long_computation)
## A
## B
## A
## B
## [[1]]
## [1] "A"
##
## [[2]]
## [1] "B"
##
## [[3]]
## [1] "A"
##
## [[4]]
## [1] "B"
Desired behavior:
lapply(tasks, long_computation)
## A
## B
## [[1]]
## [1] "A"
##
## [[2]]
## [1] "B"
##
## [[3]]
## [1] "A"
##
## [[4]]
## [1] "B"
You can find the intended use case here.

You can try something like this: I have created a map object which stores the result after long_computation for each unique 'key'. Once an existing 'key' is encountered, it returns from map else it calls the long_computation function and stores the result in map for future use. Not sure if it's the ideal way but it works.
tasks <- rep(letters[1:2], 2)
map=list()
lapply(tasks,function(t){if(t %in% names(.GlobalEnv$map)){
return(.GlobalEnv$map[[t]])
}else{
result=toupper(t)
if(!t %in% names(.GlobalEnv$map)){
.GlobalEnv$map[[t]]=result
}
}
})

This actually seems to work:
lightly_parallelize_atomic <- function(X, FUN, jobs = 1, ...){
keys <- unique(X)
index <- match(X, keys)
values <- mclapply(X = keys, FUN = FUN, mc.cores = jobs, ...)
values[index]
}
And in my case, it's okay that X is atomic.
But it would be neat to find something already built into either a package or R natively.

Related

Get the column of a list of lists in R

Using R, my data set L is a list of lists. My print(L) produces the following output:
[[1]]
[[1]][[1]]
[1] 0.8198689
[[1]][[2]]
[1] 0.8166747
[[2]]
[[2]][[1]]
[1] 0.5798426
[[2]][[2]]
[1] 0.5753511
[[3]]
[[3]][[1]]
[1] 0.4713508
[[3]][[2]]
[1] 0.4698621
And I want to get a vector of the second column. However unlist(L[[2]]) gives me the second row (not the second column) and L[,2] gives me the error Error in L[, 2] : incorrect number of dimensions. I tried also L$'2' and didn't work.
How can I get the vector of the second column of this data set in R?
1) Assuming the input shown reproducibly in the Note at the end use sapply (or use lapply if you want it as a list). No packages are used.
sapply(L, `[[`, 2)
## [1] 0.8166747 0.5753511 0.4698621
2) Using purrr we have:
library(purrr)
transpose(L)[[2]]
## [[1]]
## [1] 0.8166747
##
## [[2]]
## [1] 0.5753511
##
## [[3]]
## [1] 0.4698621
3) If we know that L is regularly shaped we could convert it to a matrix and then take the second column.
matrix(unlist(L), length(L), byrow = TRUE)[, 2]
## [1] 0.8166747 0.5753511 0.4698621
or as a list
do.call("rbind", L)[, 2]
4) The second column here is the last item in each "row" and if that is what you want as it would work even if the list is ragged then try this
mapply('[[', L, lengths(L))
## [1] 0.8166747 0.5753511 0.4698621
or as a list
Map('[[', L, lengths(L))
Note
# input in reproducible form
L <- list(
list(0.8198689, 0.8166747),
list(0.5798426, 0.5753511),
list(0.4713508, 0.4698621))
The simple way to do this using purrr is just to use map(), which returns a list.
library(purrr)
map(L, 2)
If you want a (numeric) vector, use map_dbl().
map_dbl(L, 2)
# [1] 0.8166747 0.5753511 0.4698621

How to find NULL in nested lists [duplicate]

How do I remove the null elements from a list of lists, like below, in R:
lll <- list(list(NULL),list(1),list("a"))
The object I want would look like:
lll <- list(list(1),list("a"))
I saw a similar answer here: How can I remove an element from a list? but was not able to extend it from simple lists to a list of lists.
EDIT
Bad example above on my part. Both answers work on simpler case (above). What if list is like:
lll <- list(list(NULL),list(1,2,3),list("a","b","c"))
How to get:
lll <- list(list(1,2,3),list("a","b","c"))
This recursive solution has the virtue of working on even more deeply nested lists.
It's closely modeled on Gabor Grothendieck's answer to this quite similar question. My modification of that code is needed if the function is to also remove objects like list(NULL) (not the same as NULL), as you are wanting.
## A helper function that tests whether an object is either NULL _or_
## a list of NULLs
is.NullOb <- function(x) is.null(x) | all(sapply(x, is.null))
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
x <- Filter(Negate(is.NullOb), x)
lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
rmNullObs(lll)
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Here is an example of its application to a more deeply nested list, on which the other currently proposed solutions variously fail.
LLLL <- list(lll)
rmNullObs(LLLL)
# [[1]]
# [[1]][[1]]
# [[1]][[1]][[1]]
# [[1]][[1]][[1]][[1]]
# [1] 1
#
#
# [[1]][[1]][[2]]
# [[1]][[1]][[2]][[1]]
# [1] "a"
Here's an option using Filter and Negate combination
Filter(Negate(function(x) is.null(unlist(x))), lll)
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Using purrr
purrr::map(lll, ~ purrr::compact(.)) %>% purrr::keep(~length(.) != 0)
[[1]]
[[1]][[1]]
[1] 1
[[1]][[2]]
[1] 2
[[1]][[3]]
[1] 3
[[2]]
[[2]][[1]]
[1] "a"
[[2]][[2]]
[1] "b"
[[2]][[3]]
[1] "c"
For this particular example you can also use unlist with its recursive argument.
lll[!sapply(unlist(lll, recursive=FALSE), is.null)]
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Since you have lists in lists, you probably need to run l/sapply twice, like:
lll[!sapply(lll,sapply,is.null)]
#[[1]]
#[[1]][[1]]
#[1] 1
#
#
#[[2]]
#[[2]][[1]]
#[1] "a"
There is a new package rlist on CRAN, thanks to Kun Ren for making our life easier.
list.clean(.data, fun = is.null, recursive = FALSE)
or for recursive removal of NULL:
list.clean(.data, fun = is.null, recursive = TRUE)
Quick fix on Josh O'Brien's solution. There's a bit of an issue with lists of functions
is.NullOb <- function(x) if(!(is.function(x))) is.null(x) | all(sapply(x, is.null)) else FALSE
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
if(!(is.function(x))) {
x = x[!(sapply(x, is.NullOb))]
lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
}

Extract list element by successive / cascading index chain

ll<-list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
looks like:
[[1]]
[[1]][[1]]
[1] "A" "B" "C"
[[1]][[2]]
[1] "Peter"
[[2]]
[1] "John"
[[3]]
[1] "Hans"
Lets say I have the indices in a list for "Peter" and "B" respectively.
peter.ind <- list(1,2) # correlates with ll[[1]][[2]]
B.ind <- list(1,1,2) # correlates with ll[[1]][[1]][[2]]
So how can I most effectively extract a "tangled" list element by its cascaded index chain?
Here is my already working function:
extract0r <- function(x,l) {
for(ind in l) {
x <- x[[ind]]
}
return(x)
}
call function:
extract0r(ll,peter.ind) #evals [1] "Peter"
extract0r(ll,B.ind) #evals [1] "B"
Is there a neater alternative to my function?
You can use a recursive function:
ll <- list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
my.ind <- function(L, ind) {
if (length(ind)==1) return(L[[ind]])
my.ind(L[[ind[1]]], ind[-1])
}
my.ind(ll, c(1,2))
my.ind(ll, c(1,1,2))
# > my.ind(ll, c(1,2))
# [1] "Peter"
# > my.ind(ll, c(1,1,2))
# [1] "B"
The recursive function has a (relative) clear coding, but during execution it has an overhead for the deep function calls.
There are many ways of doing this.
For example, you can build the commands from character strings:
my.ind.str <- function(L, ind) {
command <- paste0(c("L",sprintf("[[%i]]", ind)),collapse="")
return(eval(parse(text=command)))
}
With your example, I had to convert the lists of indices to vectors:
my.ind.str(ll, unlist(peter.ind))
[1] "Peter"
my.ind.str(ll, unlist(B.ind))
[1] "B"

What's the name of the first argument of `[`?

letter[2] is equivalent to '['(letters,i=2) , second argument is i.
What is the name of the first argument so the 2 following expressions would be equivalent ?
lapply(1:3,function(x){letters[x]})
lapply(1:3,`[`,param1 = letters) # param1 to be replaced with solution
For you to be able to define a function similar to the one above, you will have to pass two arguments to your function. The function [ does take various inputs. We can use Map instead of lapply to give it both the data where to extract from and the Indices to indicate the part of the data to be extracted:
Map("[",list(letters),1:3)
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
This is similar to what you have above. Hope this helps
You have to be could be more specific than "[", for instance:
lapply(1:3, `[.numeric_version`, x = letters)
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
#
# [[3]]
# [1] "c"
(Not sure [.numeric_version is the most appropriate, though... I'm digging a bit more)
rlang::as_closure and purrr::as_mapper ,both based on rlang::as_function (see doc)
will both convert [ to a function with named parameters:
lapply(1:3, purrr::as_mapper(`[`), .x = letters)
lapply(1:3, rlang::as_closure(`[`), .x = letters)
# [[1]]
# [1] "a"
#
# [[2]]
# [1] "b"
#
# [[3]]
# [1] "c"

Remove NULL elements from list of lists

How do I remove the null elements from a list of lists, like below, in R:
lll <- list(list(NULL),list(1),list("a"))
The object I want would look like:
lll <- list(list(1),list("a"))
I saw a similar answer here: How can I remove an element from a list? but was not able to extend it from simple lists to a list of lists.
EDIT
Bad example above on my part. Both answers work on simpler case (above). What if list is like:
lll <- list(list(NULL),list(1,2,3),list("a","b","c"))
How to get:
lll <- list(list(1,2,3),list("a","b","c"))
This recursive solution has the virtue of working on even more deeply nested lists.
It's closely modeled on Gabor Grothendieck's answer to this quite similar question. My modification of that code is needed if the function is to also remove objects like list(NULL) (not the same as NULL), as you are wanting.
## A helper function that tests whether an object is either NULL _or_
## a list of NULLs
is.NullOb <- function(x) is.null(x) | all(sapply(x, is.null))
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
x <- Filter(Negate(is.NullOb), x)
lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
rmNullObs(lll)
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Here is an example of its application to a more deeply nested list, on which the other currently proposed solutions variously fail.
LLLL <- list(lll)
rmNullObs(LLLL)
# [[1]]
# [[1]][[1]]
# [[1]][[1]][[1]]
# [[1]][[1]][[1]][[1]]
# [1] 1
#
#
# [[1]][[1]][[2]]
# [[1]][[1]][[2]][[1]]
# [1] "a"
Here's an option using Filter and Negate combination
Filter(Negate(function(x) is.null(unlist(x))), lll)
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Using purrr
purrr::map(lll, ~ purrr::compact(.)) %>% purrr::keep(~length(.) != 0)
[[1]]
[[1]][[1]]
[1] 1
[[1]][[2]]
[1] 2
[[1]][[3]]
[1] 3
[[2]]
[[2]][[1]]
[1] "a"
[[2]][[2]]
[1] "b"
[[2]][[3]]
[1] "c"
For this particular example you can also use unlist with its recursive argument.
lll[!sapply(unlist(lll, recursive=FALSE), is.null)]
# [[1]]
# [[1]][[1]]
# [1] 1
#
#
# [[2]]
# [[2]][[1]]
# [1] "a"
Since you have lists in lists, you probably need to run l/sapply twice, like:
lll[!sapply(lll,sapply,is.null)]
#[[1]]
#[[1]][[1]]
#[1] 1
#
#
#[[2]]
#[[2]][[1]]
#[1] "a"
There is a new package rlist on CRAN, thanks to Kun Ren for making our life easier.
list.clean(.data, fun = is.null, recursive = FALSE)
or for recursive removal of NULL:
list.clean(.data, fun = is.null, recursive = TRUE)
Quick fix on Josh O'Brien's solution. There's a bit of an issue with lists of functions
is.NullOb <- function(x) if(!(is.function(x))) is.null(x) | all(sapply(x, is.null)) else FALSE
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
if(!(is.function(x))) {
x = x[!(sapply(x, is.NullOb))]
lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
}

Resources