I have a list like this:
x = list(a = 1:4, b = 3:10, c = NULL)
x
#$a
#[1] 1 2 3 4
#
#$b
#[1] 3 4 5 6 7 8 9 10
#
#$c
#NULL
and I want to extract all elements that are not null. How can this be done? Thanks.
Here's another option:
Filter(Negate(is.null), x)
What about:
x[!unlist(lapply(x, is.null))]
Here is a brief description of what is going on.
First, lapply tells us which elements are NULL:
R> lapply(x, is.null)
$a
[1] FALSE
$b
[1] FALSE
$c
[1] TRUE
Next we convert the list into a vector:
R> unlist(lapply(x, is.null))
a b c
FALSE FALSE TRUE
Then we negate the vector, so that TRUE marks the elements to keep:
R> !unlist(lapply(x, is.null))
a b c
TRUE TRUE FALSE
Finally, we select the elements using the usual notation:
x[!unlist(lapply(x, is.null))]
x[!sapply(x,is.null)]
This generalizes to any logical test on the list elements: just substitute your own predicate for is.null.
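As a minimal sketch of that generalization, the same subsetting pattern can keep only the elements longer than four values. The predicate is_long is a hypothetical name chosen for illustration, and vapply is used here instead of sapply only because it always returns a logical vector, even for an empty list:

```r
x <- list(a = 1:4, b = 3:10, c = NULL)

# Hypothetical predicate: TRUE for elements with more than four values
is_long <- function(el) length(el) > 4

# vapply() guarantees one logical per element, which is what `[` needs
keep <- vapply(x, is_long, logical(1))
long_elements <- x[keep]

str(long_elements)
```

Any function that returns a single TRUE or FALSE per element can take the place of is_long.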
Simpler and likely quicker than the above, the following works when every value is non-recursive (in the sense of is.recursive) and has length one; atomic vectors longer than one element would be split into separate elements:
example_1_LST <- list(NULL, a=1.0, b=Matrix::Matrix(), c=NULL, d=4L)
example_2_LST <- as.list(unlist(example_1_LST, recursive=FALSE))
str(example_2_LST) prints:
List of 3
$ a: num 1
$ b:Formal class 'lsyMatrix' [package "Matrix"] with 5 slots
.. ..@ x : logi NA
.. ..@ Dim : int [1:2] 1 1
.. ..@ Dimnames:List of 2
.. .. ..$ : NULL
.. .. ..$ : NULL
.. ..@ uplo : chr "U"
.. ..@ factors : list()
$ d: int 4
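A short sketch of where the unlist approach and the Filter approach differ. The list mixed is made up for illustration: when all remaining values are atomic, unlist flattens them into a single vector, so longer vectors are split and types are coerced to a common one.

```r
# Hypothetical input: one element has length two
mixed <- list(a = 1:2, b = NULL, c = "z")

# unlist() drops the NULL, but also splits `a` and coerces everything to character
flat <- as.list(unlist(mixed, recursive = FALSE))
names(flat)

# Filter() drops the NULL and leaves the other elements untouched
kept <- Filter(Negate(is.null), mixed)
names(kept)
```

For lists like the x in the question, whose elements are longer vectors, the Filter or subsetting answers are therefore the safer choice.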
I have a list called master, which contains three IDs:
master = list(p1 = list(id = 'abc'), p2 = list(id = 'def'), p3 = list(id = 'ghi'))
str(master)
List of 3
$ p1:List of 1
..$ id: chr "abc"
$ p2:List of 1
..$ id: chr "def"
$ p3:List of 1
..$ id: chr "ghi"
To each level 1 element of this list, I would like to append the corresponding value and radius elements from the val and rad lists:
val = list(p1 = list(value = 5), p3 = list(value = 8))
str(val)
List of 2
$ p1:List of 1
..$ value: num 5
$ p3:List of 1
..$ value: num 8
rad = list(p1 = list(radius = 2), p2 = list(radius = 10))
str(rad)
List of 2
$ p1:List of 1
..$ radius: num 2
$ p2:List of 1
..$ radius: num 10
I have to be careful to match the elements by name because val and rad do not have the same structure as master, i.e. val is missing a slot for p2 and rad is missing a slot for p3.
I can use the following to partially achieve the desired result:
master_final = lapply(X=names(master),function(x, master, val, rad) c(master[[x]], val[[x]], rad[[x]]), master, val, rad)
str(master_final)
List of 3
$ :List of 3
..$ id : chr "abc"
..$ value : num 5
..$ radius: num 2
$ :List of 2
..$ id : chr "def"
..$ radius: num 10
$ :List of 2
..$ id : chr "ghi"
..$ value: num 8
But I would like each element of the resulting list to have the same structure, i.e. an id, value and radius slot. How can I do this in a way that generalises to any number of lists? I don't like having to write [[x]] for each list in the lapply function: function(x, master, val, rad) c(master[[x]], val[[x]], rad[[x]]).
One way would be to convert the lists to data frames and merge them on the list name. We can then split the merged data frame by list_name.
df1 <- Reduce(function(x, y) merge(x, y, all = TRUE, by = "ind"),
list(stack(master), stack(val),stack(rad)))
names(df1) <- c("list_name", "id", "value", "radius")
lapply(split(df1[-1], df1$list_name), as.list)
#$p1
#$p1$id
#[1] "abc"
#$p1$value
#[1] 5
#$p1$radius
#[1] 2
#$p2
#$p2$id
#[1] "def"
#$p2$value
#[1] NA
#$p2$radius
#[1] 10
#$p3
#$p3$id
#[1] "ghi"
#$p3$value
#[1] 8
#$p3$radius
#[1] NA
This keeps the NA values in the list. If we want to remove them, the code becomes a bit ugly:
lapply(split(df1[-1], df1$list_name), function(x)
{inds <- !is.na(x); as.list(setNames(x[inds], names(x)[inds]))})
You could first group all your lists in L and run
L = list(master,val,rad)
lapply(names(master),function(x) unlist(lapply(L,"[[",x)))
[[1]]
id value radius
"abc" "5" "2"
[[2]]
id radius
"def" "10"
[[3]]
id value
"ghi" "8"
Here is one way with tidyverse
library(dplyr)
library(purrr)
out <- list(master, rad, val) %>%
transpose %>%
map(flatten)
str(out)
#List of 3
# $ p1:List of 3
# ..$ id : chr "abc"
# ..$ radius: num 2
# ..$ value : num 5
# $ p2:List of 2
# ..$ id : chr "def"
# ..$ radius: num 10
# $ p3:List of 2
# ..$ id : chr "ghi"
# ..$ value: num 8
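The list-based answers above leave out fields that are missing for a given name, whereas the question asks for every element to have id, value and radius. A minimal base R sketch of one way to pad the gaps with NA while keeping the original types; the names L, fields and out are illustrative, and the approach assumes each inner list has uniquely named fields:

```r
master <- list(p1 = list(id = "abc"), p2 = list(id = "def"), p3 = list(id = "ghi"))
val <- list(p1 = list(value = 5), p3 = list(value = 8))
rad <- list(p1 = list(radius = 2), p2 = list(radius = 10))

# Group the lists so the code works for any number of them
L <- list(master, val, rad)

# Every field name that occurs anywhere: "id", "value", "radius"
fields <- unique(unlist(lapply(L, function(l) unlist(lapply(l, names)))))

ids <- names(master)
out <- lapply(setNames(ids, ids), function(nm) {
  # `[[` returns NULL for a missing name, and c() silently drops NULLs
  merged <- do.call(c, lapply(L, `[[`, nm))
  # Add an NA placeholder for each field this element lacks
  for (f in setdiff(fields, names(merged))) merged[[f]] <- NA
  merged[fields]
})

str(out)
```

Adding another list only requires appending it to L; the anonymous function does not change.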
I have written a function that returns a data frame with two variables. As a simple example, let's have:
test <- function(x) {y <- matrix( 5 , nrow= x , ncol = 2)
z<- data.frame(y)
return(z) }
I want to find out for which x values this function gives an error. (In our example I think it is negative values, but I just want to convey the concept.) So I try:
z <- rep(0)
testnumbers <- c(0,1,2,3,4,-1,5)
for (i in 1:length(testnumbers)) {
tempo <- tryCatch( testfun(testnumbers[i]) , error= function(e) return(0) )
if (tempo == 0 ) z[i] <- {testnumbers[i] next}
}
What is wrong with my process, and how can I find the inputs for which my function does not work?
If you're looking to run all of the testnumbers regardless of any of them failing, I suggest a slightly different tack.
Base R
This borrows from Rui's use of inherits which is more robust and unambiguous. It goes one step further by preserving not just which one had the error, but the actual error text as well:
testfun <- function(x) {
y <- matrix(5, nrow = x, ncol = 2)
z <- as.data.frame(y)
z
}
testnumbers <- c(0, 1, 2, 3, 4, -1, 5)
rets <- setNames(
lapply(testnumbers, function(n) tryCatch(testfun(n), error=function(e) e)),
testnumbers
)
sapply(rets, inherits, "error")
# 0 1 2 3 4 -1 5
# FALSE FALSE FALSE FALSE FALSE TRUE FALSE
Filter(function(a) inherits(a, "error"), rets)
# $`-1`
# <simpleError in matrix(5, nrow = x, ncol = 2): invalid 'nrow' value (< 0)>
(The setNames(lapply(...), ...) is there because the inputs are numbers, so sapply(..., simplify=FALSE) does not preserve the names, something I thought was important.)
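A small sketch of the naming behaviour mentioned in that note: sapply only derives names from its input when the input is a character vector, so numeric inputs need setNames. The doubling function is just a stand-in for illustration.

```r
testnumbers <- c(0, 1, -1)

# Numeric input: the result carries no names
unnamed <- sapply(testnumbers, function(n) n * 2, simplify = FALSE)
names(unnamed)

# setNames() attaches the inputs as (character) names
named <- setNames(lapply(testnumbers, function(n) n * 2), testnumbers)
names(named)

# Character input: sapply() uses the values themselves as names
chr <- sapply(c("a", "b"), toupper, simplify = FALSE)
names(chr)
```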
All of this falls in line with what some consider good practice: if you're applying one function to a lot of "things", then keep them in a list and use one of the *apply functions.
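Building on the base R approach above, here is a hedged sketch of pulling just the error text out of the failed runs with conditionMessage. The names failed and msgs are illustrative, and the shortened testnumbers vector is only to keep the example small:

```r
testfun <- function(x) {
  y <- matrix(5, nrow = x, ncol = 2)
  as.data.frame(y)
}
testnumbers <- c(0, 1, -1)

# Return the condition object itself when a call fails
rets <- setNames(
  lapply(testnumbers, function(n) tryCatch(testfun(n), error = function(e) e)),
  testnumbers
)

# Keep only the failures, then extract one message string per failure
failed <- Filter(function(a) inherits(a, "error"), rets)
msgs <- vapply(failed, conditionMessage, character(1))
msgs
```

The result is a named character vector, so names(msgs) gives the inputs that failed.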
tidyverse
There is a function in purrr that formalizes this a little: safely, which returns a function wrapped around its argument. For instance:
library(purrr)
safely(testfun)
# function (...)
# capture_error(.f(...), otherwise, quiet)
# <environment: 0x0000000015151d90>
It is returning a function that can then be passed. A one-time call would look like one of the following:
safely(testfun)(0)
# $result
# [1] V1 V2
# <0 rows> (or 0-length row.names)
# $error
# NULL
testfun_safe <- safely(testfun)
testfun_safe(0)
# $result
# [1] V1 V2
# <0 rows> (or 0-length row.names)
# $error
# NULL
To use it here, you can do:
rets <- setNames(
lapply(testnumbers, safely(testfun)),
testnumbers
)
str(rets[5:6])
# List of 2
# $ 4 :List of 2
# ..$ result:'data.frame': 4 obs. of 2 variables:
# .. ..$ V1: num [1:4] 5 5 5 5
# .. ..$ V2: num [1:4] 5 5 5 5
# ..$ error : NULL
# $ -1:List of 2
# ..$ result: NULL
# ..$ error :List of 2
# .. ..$ message: chr "invalid 'nrow' value (< 0)"
# .. ..$ call : language matrix(5, nrow = x, ncol = 2)
# .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
Filter(Negate(is.null), sapply(rets, `[[`, "error"))
# $`-1`
# <simpleError in matrix(5, nrow = x, ncol = 2): invalid 'nrow' value (< 0)>
and to get to the results of all runs (including the errant one):
str(sapply(rets, `[[`, "result"))
# List of 7
# $ 0 :'data.frame': 0 obs. of 2 variables:
# ..$ V1: num(0)
# ..$ V2: num(0)
# $ 1 :'data.frame': 1 obs. of 2 variables:
# ..$ V1: num 5
# ..$ V2: num 5
# $ 2 :'data.frame': 2 obs. of 2 variables:
# ..$ V1: num [1:2] 5 5
# ..$ V2: num [1:2] 5 5
# $ 3 :'data.frame': 3 obs. of 2 variables:
# ..$ V1: num [1:3] 5 5 5
# ..$ V2: num [1:3] 5 5 5
# $ 4 :'data.frame': 4 obs. of 2 variables:
# ..$ V1: num [1:4] 5 5 5 5
# ..$ V2: num [1:4] 5 5 5 5
# $ -1: NULL
# $ 5 :'data.frame': 5 obs. of 2 variables:
# ..$ V1: num [1:5] 5 5 5 5 5
# ..$ V2: num [1:5] 5 5 5 5 5
or just the results without the failed run:
str(Filter(Negate(is.null), sapply(rets, `[[`, "result")))
# List of 6
# $ 0:'data.frame': 0 obs. of 2 variables:
# ..$ V1: num(0)
# ..$ V2: num(0)
# $ 1:'data.frame': 1 obs. of 2 variables:
# ..$ V1: num 5
# ..$ V2: num 5
# $ 2:'data.frame': 2 obs. of 2 variables:
# ..$ V1: num [1:2] 5 5
# ..$ V2: num [1:2] 5 5
# $ 3:'data.frame': 3 obs. of 2 variables:
# ..$ V1: num [1:3] 5 5 5
# ..$ V2: num [1:3] 5 5 5
# $ 4:'data.frame': 4 obs. of 2 variables:
# ..$ V1: num [1:4] 5 5 5 5
# ..$ V2: num [1:4] 5 5 5 5
# $ 5:'data.frame': 5 obs. of 2 variables:
# ..$ V1: num [1:5] 5 5 5 5 5
# ..$ V2: num [1:5] 5 5 5 5 5
You were actually quite close. I'm not sure which change did the trick in the end, but I:
Replaced 1:length(testnumbers) with a loop over testnumbers itself, as the index is unnecessary.
Changed return(0) to return a character string.
Wrapped your if in another if, because the comparison kept failing when the result had length greater than 1 or the condition could not be evaluated.
With these changes you get the correct result. You could apply them one at a time to see which one was the problem.
test <- function(x) {y <- matrix( 5 , nrow = x , ncol = 2)
z<- data.frame(y)
return(z) }
errored <- numeric()
testnumbers <- c(0,1,2,3,4,-1,5)
for (i in testnumbers) {
tempo <- tryCatch(test(i), error = function(e) "error")
if (length(tempo) == 1) {
if (tempo == "error") errored <- c(errored, i)
}
}
errored
# [1] -1
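To illustrate why the original if (tempo == 0) kept failing, here is a small sketch. It assumes the function returns a data frame on success, as in the question; the variable names are illustrative. Comparing a data frame with 0 gives a whole matrix of logicals, or a zero-length one for the x = 0 case, while if needs exactly one TRUE or FALSE.

```r
# Successful call: `== 0` compares every cell of the data frame
ok <- as.data.frame(matrix(5, nrow = 2, ncol = 2))
cmp <- ok == 0
dim(cmp)            # a logical matrix, not a single TRUE/FALSE

# Zero-row result (x = 0): the comparison has length zero
empty_cmp <- as.data.frame(matrix(5, nrow = 0, ncol = 2)) == 0
length(empty_cmp)

# `if` cannot evaluate a zero-length condition; capture the error text
msg <- tryCatch(
  if (empty_cmp) "zero" else "nonzero",
  error = function(e) conditionMessage(e)
)
msg
```

A condition of length greater than one is likewise an error in recent R versions (and a warning in older ones), which is why checking length(tempo) first, or using inherits on a returned condition, is more robust.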
You need tryCatch to return the error, not zero.
testfun <- function(x) {
y <- matrix(5, nrow = x, ncol = 2)
z <- as.data.frame(y)
z
}
testnumbers <- c(0, 1, 2, 3, 4, -1, 5)
z <- numeric(length(testnumbers))
for (i in seq_along(testnumbers)) {
tempo <- tryCatch(testfun(testnumbers[i]), error = function(e) e)
if (inherits(tempo, "error")) {
z[i] <- testnumbers[i]
}
}
z
#[1] 0 0 0 0 0 -1 0
Also,
To coerce a matrix to a data.frame, use as.data.frame.
I have removed the calls to return, since the last value evaluated in a function is its return value.
rep(0) is the same as just 0, so I replaced it with numeric(length(testnumbers)).
seq_along(testnumbers) is safer than 1:length(testnumbers). Try it with testnumbers of length zero and see what happens.
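The zero-length case mentioned in the last point can be checked directly. In this short sketch the counters n_bad and n_good are illustrative names:

```r
testnumbers <- numeric(0)

bad <- 1:length(testnumbers)     # 1:0 counts down: 1, 0
good <- seq_along(testnumbers)   # integer(0), so a loop body never runs

n_bad <- 0
for (i in bad) n_bad <- n_bad + 1

n_good <- 0
for (i in good) n_good <- n_good + 1

c(n_bad = n_bad, n_good = n_good)
```

With 1:length() the loop runs twice on an empty input, indexing positions that do not exist, while seq_along() correctly skips the loop.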
Say I have a nested list like this
lst <- list(a=list(b=list("a", "b")), c=list("d"))
str(lst)
#List of 2
# $ a:List of 1
# ..$ b:List of 2
# .. ..$ : chr "a"
# .. ..$ : chr "b"
# $ c:List of 1
# ..$ : chr "d"
and I want to remove all the elements that don't match a vector of names (characters here), but I also want to remove the entire nested component if there are no matches. So, for example, using rapply I have this
## Just keep the branches that have an "a" value
keeps <- "a"
## Pass this function to rapply
f <- function(x) if(any(unlist(x) %in% keeps)) x else NULL
res <- rapply(lst, f, how="replace")
str(res)
# List of 2
# $ a:List of 1
# ..$ b:List of 2
# .. ..$ : chr "a"
# .. ..$ : NULL
# $ c:List of 1
# ..$ : NULL
So, I would have liked the entire c list to be cleaved. I don't think I can do this with a single rapply operation. If not, what would be a good way to do this?
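One possible approach, offered as a sketch rather than a definitive answer, is a small recursive function instead of rapply. The function name prune is hypothetical, and the sketch assumes that leaves are atomic values and that non-matching leaves should be dropped along with any branch left empty:

```r
lst <- list(a = list(b = list("a", "b")), c = list("d"))
keeps <- "a"

# Hypothetical helper: keep matching leaves, drop branches that end up empty
prune <- function(x, keeps) {
  if (!is.list(x)) {
    # Leaf: keep it only if it matches one of `keeps`
    return(if (any(x %in% keeps)) x else NULL)
  }
  # Branch: prune each child, then discard the children that became NULL
  kept <- Filter(Negate(is.null), lapply(x, prune, keeps = keeps))
  if (length(kept) > 0) kept else NULL
}

res <- prune(lst, keeps)
str(res)
```

Because the emptiness check happens on the way back up the recursion, a branch such as c disappears entirely once all of its leaves are removed.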
I have a data.frame
'data.frame': 4 obs. of 2 variables:
$ name:List of 4
..$ : chr "a"
..$ : chr "b"
..$ : chr "c"
..$ : chr "d"
$ tvd :List of 4
..$ : num 0.149
..$ : num 0.188
..$ : num 0.161
..$ : num 0.187
structure(list(name = list("a", "b", "c",
"d"), tvd = list(0.148831029536996, 0.187699857380692,
0.161428147003292, 0.18652668961466)), .Names = c("name",
"tvd"), row.names = c(NA, -4L), class = "data.frame")
It appears that as.data.frame(lapply(z,unlist)) converts it to the usual
'data.frame': 4 obs. of 2 variables:
$ name: Factor w/ 4 levels "a",..: 4 1 2 3
$ tvd : num 0.149 0.188 0.161 0.187
However, I wonder if I could do better.
I create my ugly data frame like this:
as.data.frame(do.call(rbind,lapply(my.list, function (m)
list(name = ...,
tvd = ...))))
I wonder if it is possible to modify this expression so that it produces a normal data frame.
It looks like you're just trying to tear down your original data then re-assemble it? If so, here are a few cool things to look at. Assume df is your data.
A data.frame is just a list in disguise. To see this, compare df[[1]] to df$name in your data. Both [[ and $ are list-indexing operators, so we are actually viewing a list item when we use df$name on a data frame.
> is.data.frame(df) # df is a data frame
# [1] TRUE
> is.list(df) # and it's also a list
# [1] TRUE
> x <- as.list(df) # as.list() can be more useful than unlist() sometimes
# take a look at x here, it's a bit long
> (y <- do.call(cbind, x)) # reassemble to matrix form
# name tvd
# [1,] "a" 0.148831
# [2,] "b" 0.1876999
# [3,] "c" 0.1614281
# [4,] "d" 0.1865267
> as.data.frame(y) # back to df
# name tvd
# 1 a 0.148831
# 2 b 0.1876999
# 3 c 0.1614281
# 4 d 0.1865267
I recommend doing
do.call(rbind,lapply(my.list, function (m)
data.frame(name = ...,
tvd = ...)))
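rather than trying to convert a list of lists into a data.frame. Calling rbind on plain lists builds a matrix whose cells are list elements, which is why every column ends up as a list.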
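A runnable sketch of that recommendation. The contents of my.list are not shown in the question, so the list below, with name and tvd fields, is a made-up stand-in:

```r
# Hypothetical input: each element carries the pieces needed for one row
my.list <- list(
  list(name = "a", tvd = 0.149),
  list(name = "b", tvd = 0.188)
)

# Build a one-row data frame per element, then stack the rows
df <- do.call(rbind, lapply(my.list, function(m)
  data.frame(name = m$name, tvd = m$tvd, stringsAsFactors = FALSE)))

str(df)
```

Here the columns come out as ordinary character and numeric vectors, so no unlist step is needed afterwards. Setting stringsAsFactors = FALSE explicitly keeps name as character on older R versions too.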