I have a nested list with some NAs, and I want to discard the NAs from the list.
purrr::discard does not work recursively:
l <- list(a = NA, b = T, c = c(F, F))
purrr::discard(l, is.na)
Throws this error:
Error: Predicate functions must return a single TRUE or FALSE, not a logical vector of length 2
I would like to end up with the following list in this case:
l2 <- list(b = T, c = c(F, F))
(purrr version: 0.3.2)
is.na(c(T,T,T)) returns c(F,F,F). To use discard, the function needs to return a single value for each list element as the error suggests.
This should work.
purrr::discard(l,function(x) all(is.na(x)))
Note that this discards an element only when every value in it is NA.
To remove every NA, including NAs that sit next to valid values inside an element, this should work:
library(tidyverse)
l <- list(a = NA, b = c(T,NA), c = c(F, F)) # Define a list
lapply(l,function(x) x[!is.na(x)])%>% # Remove all nested NA's
purrr::discard(.,function(x) length(x) == 0) # Remove all empty elements
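EDIT (another option): discard every element that contains at least one NA. The output is shown for the l from the question: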
purrr::discard(l,function(x) isTRUE(anyNA(x)))
$b
[1] TRUE
$c
[1] FALSE FALSE
If you already know which elements are NA, you can zap them by name:
purrr::list_modify(l,a=purrr::zap())
$b
[1] TRUE
$c
[1] FALSE FALSE
EDIT 2
If you want to remove all nested NAs, you can write up a helper zap_if():
zap_if <- function(x){
unlist(lapply(x, function(z) z[!is.na(z)]))
}
purrr::map(l,zap_if)
Result:
$a
[1] 1
$b
[1] TRUE
$c
[1] FALSE FALSE
Data for the zap_if part:
l <- list(a = c(NA,1), b = T, c = c(F, F))
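Since the question mentions nested lists, here is a minimal base R sketch that recurses into sub-lists. The helper name drop_na and the sample list nested are made up for illustration and are not part of purrr:

```r
# Hypothetical helper: recursively drop NA values, then drop anything left empty
drop_na <- function(x) {
  if (is.list(x)) {
    x <- lapply(x, drop_na)   # recurse into each sub-list or vector
    x[lengths(x) > 0]         # remove elements that ended up empty
  } else {
    x[!is.na(x)]              # atomic vector: keep the non-NA values
  }
}

nested <- list(a = NA, b = list(b1 = c(TRUE, NA), b2 = NA), c = c(FALSE, FALSE))
str(drop_na(nested))
```

Here a and b$b2 disappear entirely, while b$b1 keeps its single TRUE.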
There are three lists:
a = list(1,2)
b = list(2,3)
c = list(a,b)
The command a %in% c yields FALSE FALSE. The result I would like to see is TRUE since a is an element of list c. How do I achieve this?
Check whether each component is identical to a and return TRUE if any of those comparisons are TRUE.
any(sapply(c, identical, a))
## [1] TRUE
This should also help:
list(a) %in% c
Examples:
a = list(1,2)
b = list(2,3)
c = list(a,b)
y = list(3,4)
z = list(1)
list(a) %in% c # True
list(b) %in% c # True
list(y) %in% c # False
list(z) %in% c # False
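If you need this check in several places, the identical() approach can be wrapped in a small function. This is only a sketch: the name contains is made up, and the outer list is called container here instead of c so that it does not mask base::c:

```r
a <- list(1, 2)
b <- list(2, 3)
container <- list(a, b)

# Hypothetical helper: TRUE if any element of haystack is identical to needle
contains <- function(haystack, needle) {
  any(vapply(haystack, identical, logical(1), needle))
}

contains(container, a)        # a is one of the elements
contains(container, list(1))  # list(1) is not
a %in% container              # element-wise: asks whether 1 and 2 are elements
```

The last line shows why the original attempt returned two values: %in% tests each element of its left-hand side separately.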
I want to check if all elements in a list are named. I've come up with this solution, but I wanted to know if there is a more elegant way to check this.
x <- list(a = 1, b = 2)
y <- list(1, b = 2)
z <- list(1, 2)
any(stringr::str_length(methods::allNames(x)) == 0L) # FALSE, all elements are
# named.
any(stringr::str_length(methods::allNames(y)) == 0L) # TRUE, at least one
# element is not named.
# Throw an error here.
any(stringr::str_length(methods::allNames(z)) == 0L) # TRUE, at least one
# element is not named.
# Throw an error here.
I am not sure whether the following base R code works for your general cases, but it seems to work for the ones in your post.
Define a function f to check the names
f <- function(lst) length(lst) == sum(names(lst) != "",na.rm = TRUE)
and you will see
> f(x)
[1] TRUE
> f(y)
[1] FALSE
> f(z)
[1] FALSE
We can create a function that checks whether the names attribute is NULL or (|) contains a blank ("") name, and then negates (!) the result
f1 <- function(lst1) is.list(lst1) && !(is.null(names(lst1))| '' %in% names(lst1))
Checking:
f1(x)
#[1] TRUE
f1(y)
#[1] FALSE
f1(z)
#[1] FALSE
Or with allNames
f2 <- function(lst1) is.list(lst1) && !("" %in% allNames(lst1))
Checking:
f2(x)
#[1] TRUE
f2(y)
#[1] FALSE
f2(z)
#[1] FALSE
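Another base R variant, as a sketch only: nzchar() tests for non-empty strings directly, and an explicit anyNA() check also rejects NA names. The function name all_named is made up for this example, and it treats an empty list as not named:

```r
# Hypothetical helper: TRUE only if every element has a non-empty, non-NA name
all_named <- function(lst) {
  nms <- names(lst)
  !is.null(nms) && !anyNA(nms) && all(nzchar(nms))
}

x <- list(a = 1, b = 2)
y <- list(1, b = 2)
z <- list(1, 2)

all_named(x)
all_named(y)
all_named(z)
```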
I am having a relatively simple problem with R, which I hope we could find a solution to.
My aim is to define a following list, in which the c element should be the sum of a and b elements defined previously:
ex.list = list(
a = 1,
b = 2,
c = a+b
)
This code throws an error (Error: object 'a' not found), indicating that we cannot use the a and b elements defined just above.
Of course, we can simply compute the sum outside the list definition
ex.list = list(
a = 1,
b = 2
)
ex.list$c = ex.list$a + ex.list$b
Or use other variables when creating the list
a.ex = 1
b.ex = 2
ex.list = list(
a = a.ex,
b = b.ex,
c = a.ex+b.ex
)
Unfortunately, I am not interested in the above solutions. Is there any way to do the sum in the list definition?
You can write your own list function that does lazy evaluation:
lazyList <- function(...) {
tmp <- match.call(expand.dots = FALSE)$`...`
lapply(tmp, eval, envir = tmp)
}
lazyList(
a = 1,
b = 2,
c = a+b
)
#$a
#[1] 1
#
#$b
#[1] 2
#
#$c
#[1] 3
However, this approach cannot handle an element that refers to another computed element, because the arguments are stored unevaluated. The following therefore fails:
lazyList(
a = 1,
b = 2,
d = c * a,
c = a+b
)
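One way around the limitation shown above might be to evaluate the arguments one at a time, in order, storing each result where later arguments can see it. This is a sketch, not a tested general solution: seqList is a made-up name, it assumes every argument is named, and an element can only refer to elements defined before it:

```r
# Hypothetical helper: evaluate named arguments sequentially in a scratch environment
seqList <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]   # unevaluated arguments
  env <- new.env(parent = parent.frame())       # earlier results are stored here
  for (nm in names(exprs)) {
    assign(nm, eval(exprs[[nm]], env), envir = env)
  }
  mget(names(exprs), envir = env)               # return the results as a named list
}

res <- seqList(
  a = 1,
  b = 2,
  c = a + b,
  d = c * a
)
str(res)
```

Putting d before c would still fail here, since c has not been computed at that point.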
No, you can't do that. But you can do mad things like this:
> (function(a,b,c=a+b){list(a=a,b=b,c=c)})(11,22)
$a
[1] 11
$b
[1] 22
$c
[1] 33
But really, if you have a list you wish to construct in a particular way, write a function to do it. It's not difficult.
This is probably a trivial question.
Given a vector of characters, some of which are repeating:
vec <- c("a","b","d","e","e","f","g","a","d")
I'm looking for an efficient function that will return for each unique element in vec the indices of where it appears in vec.
I imagine that the return value would be something like this list:
list(a = c(1,8), b = 2, d = c(3,9), e = c(4,5), f = 6, g = 7)
Here are a few options:
lapply(setNames(unique(vec),unique(vec)), function(x) which(x == vec) )
# or to avoid setNames and still ensure you get a list:
sapply(unique(vec), function(x) which(x == vec), simplify=FALSE)
# or even better but maybe not as extensible:
split(seq_along(vec),vec)
All giving:
$a
[1] 1 8
$b
[1] 2
$d
[1] 3 9
$e
[1] 4 5
$f
[1] 6
$g
[1] 7
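One difference between these options: split() orders the result by sorted factor levels, while the unique() based versions keep the order of first appearance. For the vec above both orders coincide. If first-appearance order matters, a sketch along these lines should work (vec2 is a made-up example vector):

```r
vec2 <- c("b", "a", "b")

# Default: groups come out in sorted order of the values
sorted_idx <- split(seq_along(vec2), vec2)

# Fixing the factor levels keeps the order of first appearance
ordered_idx <- split(seq_along(vec2), factor(vec2, levels = unique(vec2)))

names(sorted_idx)
names(ordered_idx)
```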
The little-used rapply function should be a perfect solution to many problems (such as this one). However, it leaves the user-written function f with no ability to know where it is in the tree.
Is there any way to pass the name of the element in the nested list to f in an rapply call? Unfortunately, rapply calls .Internal pretty quickly.
I struggled with nested lists of arbitrary depth recently. Eventually, I arrived at a more or less acceptable solution for my case. It is not a direct answer to your question (it does not use rapply), but it seems to solve the same kind of problem. I hope it can be of some help.
Instead of trying to access the names of list elements inside rapply, I generated a vector of names and queried the list with it.
# Sample list with depth of 3
mylist <- list(a=-1, b=list(A=1,B=2), c=list(C=3,D=4, E=list(F=5,G=6)))
Generating the names vector is the tricky part in my case. Specifically, the names of the list elements must be safe, i.e. they must not contain the . symbol.
list.names <- strsplit(names(unlist(mylist)), split=".", fixed=TRUE)
node.names <- sapply(list.names, function(x) paste(x, collapse="$"))
node.names <- paste("mylist", node.names, sep="$")
node.names
[1] "mylist$a" "mylist$b$A" "mylist$b$B" "mylist$c$C" "mylist$c$D" "mylist$c$E$F"
[7] "mylist$c$E$G"
The next step is accessing a list element by its string name. I found nothing better than using a temporary file.
f <- function(x){
fname <- tempfile()
cat(x, file=fname)
source(fname)$value
}
Here f just returns the value that x refers to, where x is a string holding the full name of a list element.
Finally, we can query the list in a pseudo-recursive way.
sapply(node.names, f)
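As a possible alternative to the temporary file, `[[` accepts a character vector and indexes recursively, so the split names can be used as paths directly. This is a sketch under the same assumption that no element name contains a dot; paths and values are names chosen for this example:

```r
mylist <- list(a = -1, b = list(A = 1, B = 2), c = list(C = 3, D = 4, E = list(F = 5, G = 6)))

# Each path is a character vector such as c("c", "E", "F")
paths <- strsplit(names(unlist(mylist)), split = ".", fixed = TRUE)

# mylist[[c("c", "E", "F")]] is equivalent to mylist$c$E$F
values <- vapply(paths, function(p) mylist[[p]], numeric(1))
names(values) <- vapply(paths, paste, character(1), collapse = "$")
values
```

This avoids writing code to disk and evaluating it, and it does not depend on mylist living in the global environment.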
Referring to the question Find the indices of an element in a nested list?, you can write:
rappply <- function(x, f) {
setNames(lapply(seq_along(x), function(i) {
if (!is.list(x[[i]])) f(x[[i]], .name = names(x)[i])
else rappply(x[[i]], f)
}), names(x))
}
then,
> mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
>
> rappply(mylist, function(x, ..., .name) {
+ switch(.name, "a" = 1, "C" = 5, x)
+ })
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C
[1] 5
$c$D
[1] 3
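The function above passes only the name of the leaf itself. If f needs to know the whole position in the tree, one possible extension is to accumulate the names on the way down. This is a sketch: rapply_path and its path argument are made-up names:

```r
# Hypothetical variant: f receives the leaf value and the full vector of names leading to it
rapply_path <- function(x, f, path = character(0)) {
  out <- lapply(seq_along(x), function(i) {
    p <- c(path, names(x)[i])                 # extend the path with the current name
    if (is.list(x[[i]])) rapply_path(x[[i]], f, p) else f(x[[i]], p)
  })
  setNames(out, names(x))
}

mylist <- list(a = 1, b = list(A = 1, B = 2))
res <- rapply_path(mylist, function(x, path) {
  paste(paste(path, collapse = "$"), x, sep = " = ")
})
str(res)
```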
Update June 2020: the rrapply function in the rrapply package (an extended version of base rapply) supports this through the argument .xname in the f function. Inside f, the .xname variable evaluates to the name of the element in the nested list:
library(rrapply)
L <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
rrapply(L, f = function(x, .xname) paste(.xname, x, sep = " = "))
#> $a
#> [1] "a = 1"
#>
#> $b
#> $b$A
#> [1] "A = 1"
#>
#> $b$B
#> [1] "B = 2"
#>
#>
#> $c
#> $c$C
#> [1] "C = 1"
#>
#> $c$D
#> [1] "D = 3"