Concatenate lists with override in R - r

How do I concatenate two lists in R in such a way that duplicate elements from the second one will override corresponding elements from the first one?
I tried c(), but it keeps duplicates:
l1 <- list(a = 1, b = 2)
l1 <- c(l1, list(b = 3, c = 4))
# $a
# [1] 1
#
# $b
# [1] 2
#
# $b
# [1] 3
#
# $c
# [1] 4
I want something that will produce this instead:
# $a
# [1] 1
#
# $b
# [1] 3
#
# $c
# [1] 4
I hope there is something simpler than calling some *apply function.

This works with duplicated:
> l1[!duplicated(names(l1), fromLast=TRUE)]
$a
[1] 1
$b
[1] 3
$c
[1] 4
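The duplicated() approach can be wrapped in a small helper. This is a minimal sketch, assuming every element in both lists is named; c_override is a hypothetical name, not a base R function:

```r
# Hypothetical helper: concatenate, then keep only the last occurrence of each name.
# Assumes all elements are named; unnamed elements are not handled.
c_override <- function(x, y) {
  z <- c(x, y)
  z[!duplicated(names(z), fromLast = TRUE)]
}

l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)
res <- c_override(l1, l2)
str(res)
```

Because fromLast = TRUE marks the earlier copy of a repeated name as the duplicate, the value from the second list is the one that survives.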

I suspect that you could get the behavior you want by using the "rlist" package; specifically, list.merge seems to do what you are looking for:
l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)
library(rlist)
list.merge(l1, l2)
# $a
# [1] 1
#
# $b
# [1] 3
#
# $c
# [1] 4
Try list.merge(l2, l1) to get a sense of the function's behavior.
As noted by @PeterDee, list.merge pretty much just makes modifyList accept more than two lists. Thus, if you have just two lists, as in this example, you could skip loading any packages and directly do:
modifyList(l1, l2)
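If more than two lists need merging without an extra package, modifyList can probably be folded over them with Reduce. A sketch in base R; l3 is an extra list made up for illustration:

```r
l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)
l3 <- list(c = 5, d = 6)  # made-up third list, not from the question

# Two lists: later values override earlier ones by name
merged2 <- modifyList(l1, l2)

# Any number of lists: apply modifyList pairwise from left to right
merged3 <- Reduce(modifyList, list(l1, l2, l3))
str(merged3)
```

Two things to keep in mind: modifyList merges nested lists recursively, and a NULL value in the overriding list removes that element from the result.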

I am sure there is a quicker way than:
l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)
l1 <- c(l1[!(names(l1) %in% names(l2))], l2)
This keeps the elements of l1 whose names are not in l2, then concatenates them with l2.

I would suggest union:
l1 <- list(a = 1, b = 3)
l2 <- list(b = 3, c = 4)
union(l1, l2)
This works like a set union in mathematics.
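One caveat worth checking before relying on union: it compares element values, not names. The example above uses b = 3 in both lists, so the duplicate collapses; with the question's original l1 (b = 2) both values are kept. A short sketch:

```r
l2 <- list(b = 3, c = 4)

# Question's data: the two b values differ, so nothing is overridden
u_question <- union(list(a = 1, b = 2), l2)
length(u_question)

# Answer's data: the two b values are equal, so one copy is dropped
u_answer <- union(list(a = 1, b = 3), l2)
length(u_answer)
```

So union only looks like an override when the clashing elements already hold the same value.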

Related

discard elements from list recursively r

I have a nested list with some NAs, and I want to discard the NAs from the list.
purrr::discard does not work recursively:
l <- list(a = NA, b = T, c = c(F, F))
purrr::discard(l, is.na)
Throws this error:
Error: Predicate functions must return a single TRUE or FALSE, not a logical vector of length 2
I would like to end up with the following list in this case:
l2 <- list(b = T, c = c(F, F))
(purrr version: 0.3.2)
is.na(c(T,T,T)) returns c(F,F,F). As the error suggests, discard needs a predicate that returns a single value for each list element.
This should work:
purrr::discard(l,function(x) all(is.na(x)))
This discards an element only if all of its values are NA.
To remove every NA value, including those inside an element, this should work:
library(tidyverse)
l <- list(a = NA, b = c(T,NA), c = c(F, F)) # Define a list
lapply(l,function(x) x[!is.na(x)])%>% # Remove all nested NA's
purrr::discard(.,function(x) length(x) == 0) # Remove all empty elements
EDIT (another option)
purrr::discard(l,function(x) isTRUE(anyNA(x)))
$b
[1] TRUE
$c
[1] FALSE FALSE
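The two predicates used so far, all(is.na(x)) and anyNA(x), give different results once an element is only partly NA. A base R sketch, with Filter swapped in for purrr::discard so it runs without packages (Filter keeps the elements for which the predicate is TRUE, hence the negation):

```r
l <- list(a = NA, b = c(TRUE, NA), c = c(FALSE, FALSE))

# Drop an element only when every value in it is NA
keep_partial <- Filter(function(x) !all(is.na(x)), l)
names(keep_partial)

# Drop an element as soon as it contains any NA
keep_complete <- Filter(function(x) !anyNA(x), l)
names(keep_complete)
```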
You can identify all NA elements and zap them:
purrr::list_modify(l,a=purrr::zap())
$b
[1] TRUE
$c
[1] FALSE FALSE
EDIT 2
If you want to remove all nested NAs, you can write up a helper zap_if():
zap_if <- function(x){
  unlist(lapply(x, function(z) z[!is.na(z)]))
}
purrr::map(l,zap_if)
Result:
$a
[1] 1
$b
[1] TRUE
$c
[1] FALSE FALSE
Data for the zap_if part:
l <- list(a = c(NA,1), b = T, c = c(F, F))
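None of the snippets above descends into lists nested inside lists, which is what "recursively" in the title suggests. A minimal base R sketch of one way to do that; drop_na is a hypothetical helper name and the nested list is made up for illustration:

```r
# Hypothetical helper: remove NA values at every depth,
# then drop any element that ends up empty.
drop_na <- function(x) {
  if (is.list(x)) {
    x <- lapply(x, drop_na)   # recurse into sub-lists first
    x[lengths(x) > 0]         # remove elements that became empty
  } else {
    x[!is.na(x)]              # atomic vector: keep non-NA values
  }
}

nested <- list(a = NA, b = list(c = c(TRUE, NA), d = NA), e = c(FALSE, FALSE))
cleaned <- drop_na(nested)
str(cleaned)
```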

Using list elements in its definition

I have a relatively simple problem with R, which I hope we can find a solution to.
My aim is to define the following list, in which the c element should be the sum of the a and b elements defined previously:
ex.list = list(
a = 1,
b = 2,
c = a+b
)
This code throws an error (Error: object 'a' not found), indicating that we cannot use the a and b elements defined just above.
Of course, we can simply compute the sum outside the list definition:
ex.list = list(
a = 1,
b = 2
)
ex.list$c = ex.list$a + ex.list$b
Or use other variables when creating the list:
a.ex = 1
b.ex = 2
ex.list = list(
a = a.ex,
b = b.ex,
c = a.ex+b.ex
)
Unfortunately, I am not interested in the above solutions. Is there any way to do the sum in the list definition?
You can write your own list function that does lazy evaluation:
lazyList <- function(...) {
  tmp <- match.call(expand.dots = FALSE)$`...`
  lapply(tmp, eval, envir = tmp)
}
lazyList(
a = 1,
b = 2,
c = a+b
)
#$a
#[1] 1
#
#$b
#[1] 2
#
#$c
#[1] 3
However, the following does not work with this approach, because c is still an unevaluated expression when d is computed:
lazyList(
a = 1,
b = 2,
d = c * a,
c = a+b
)
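If elements should be able to refer to earlier computed elements, one option is to evaluate the arguments one at a time and store each result before moving on. A sketch only, assuming every argument is named; seqList is a hypothetical name:

```r
# Hypothetical helper: evaluate arguments in order, making each result
# visible to the expressions that follow it. Assumes all arguments are named.
seqList <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated argument expressions
  env <- new.env(parent = parent.frame())      # scratch environment for results
  out <- list()
  for (nm in names(exprs)) {
    out[[nm]] <- eval(exprs[[nm]], env)
    assign(nm, out[[nm]], envir = env)
  }
  out
}

res <- seqList(a = 1, b = 2, c = a + b, d = c * a)
str(res)
```

This still requires that an element is defined before it is used, so putting d = c * a ahead of c = a + b would fail to find the intended c.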
No, you can't do that. But you can do mad things like this:
> (function(a,b,c=a+b){list(a=a,b=b,c=c)})(11,22)
$a
[1] 11
$b
[1] 22
$c
[1] 33
But really, if you have a list you wish to construct in a particular way, write a function to do it. It's not difficult.

Mapping indices of vector elements

This is probably a trivial question.
Given a vector of characters, some of which are repeating:
vec <- c("a","b","d","e","e","f","g","a","d")
I'm looking for an efficient function that will return for each unique element in vec the indices of where it appears in vec.
I imagine that the return value would be something like this list:
list(a = c(1,8), b = 2, d = c(3,9), e = c(4,5), f = 6, g = 7)
Here are a few options:
lapply(setNames(unique(vec),unique(vec)), function(x) which(x == vec) )
# or to avoid setNames and still ensure you get a list:
sapply(unique(vec), function(x) which(x == vec), simplify=FALSE)
# or even better but maybe not as extensible:
split(seq_along(vec),vec)
All giving:
$a
[1] 1 8
$b
[1] 2
$d
[1] 3 9
$e
[1] 4 5
$f
[1] 6
$g
[1] 7
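One difference between these options is the order of the result: split groups by factor levels, which are sorted, while the unique-based versions follow first appearance. The example vector happens to be sorted already, so the outputs agree. A sketch with a toy vector where the two orders differ:

```r
vec <- c("b", "a", "b")  # toy input: first-appearance order differs from sorted order

# Default: names follow the sorted factor levels
by_sorted <- split(seq_along(vec), vec)
names(by_sorted)

# Explicit levels keep first-appearance order
by_first <- split(seq_along(vec), factor(vec, levels = unique(vec)))
names(by_first)
```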

Find the indices of an element in a nested list?

I have a list like:
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
Is there a (loop-free) way to identify the positions of the elements? E.g., if I want to replace the value of "C" with 5, and it does not matter where the element "C" is found, can I do something like:
Cindex <- find_index("C", mylist)
mylist[Cindex] <- 5
I have tried grepl, and in the current example, the following will work:
mylist[grepl("C", mylist)][[1]][["C"]]
but this requires an assumption of the nesting level.
The reason that I ask is that I have a deep list of parameter values, and a named vector of replacement values, and I want to do something like
replacements <- c(a = 1, C = 5)
for(i in names(replacements)){
indx <- find_index(i, mylist)
mylist[indx] <- replacements[i]
}
This is an adaptation of my previous question, "update a node (of unknown depth) using xpath in R?", using R lists instead of XML.
One method is to use unlist and relist.
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
tmp <- as.relistable(mylist)
tmp <- unlist(tmp)
tmp[grep("(^|\\.)C$",names(tmp))] <- 5
tmp <- relist(tmp)
Because list names from unlist are concatenated with a ., you'll need to be careful with grep and how your parameters are named. If there is not a . in any of your list names, this should be fine. Otherwise, names like list(.C = 1) will fall into the pattern and be replaced.
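To see why the dot matters, it helps to look at the names unlist produces. In a regular expression an unescaped . matches any character, so the sketch below escapes it; "abC" is a made-up name used only to show the difference:

```r
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
nms <- names(unlist(mylist))
print(nms)  # "a" "b.A" "b.B" "c.C" "c.D"

# Escaped dot: matches a name that is exactly "C" or ends in ".C"
hits <- grepl("(^|\\.)C$", nms)

# Unescaped dot matches any character, so "abC" is caught as well
loose  <- grepl("(^|.)C$",   c("c.C", "abC"))
strict <- grepl("(^|\\.)C$", c("c.C", "abC"))
```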
Based on this question, you could try it recursively like this:
find_and_replace <- function(x, find, replace){
  if(is.list(x)){
    n <- names(x) == find
    x[n] <- replace
    lapply(x, find_and_replace, find=find, replace=replace)
  }else{
    x
  }
}
Testing in a deeper mylist:
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3, d = list(C=10, D=55)))
find_and_replace(mylist, "C", 5)
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C ### it worked
[1] 5
$c$D
[1] 3
$c$d
$c$d$C ### it worked
[1] 5
$c$d$D
[1] 55
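The question's loop over a named vector of replacements can then be written on top of this function. A sketch that repeats the definition so it runs on its own; the value 10 for a is changed from the question's example so the replacement is visible:

```r
find_and_replace <- function(x, find, replace) {
  if (is.list(x)) {
    n <- names(x) == find
    x[n] <- replace
    lapply(x, find_and_replace, find = find, replace = replace)
  } else {
    x
  }
}

mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
replacements <- c(a = 10, C = 5)  # a = 10 is an illustrative value

# Apply one replacement per name, feeding the result back in each time
for (nm in names(replacements)) {
  mylist <- find_and_replace(mylist, nm, replacements[[nm]])
}
str(mylist)
```

Matching is case-sensitive, so replacing "a" leaves the nested "A" untouched.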
This can now also be done using rrapply in the rrapply-package (an extended version of base rapply). To return the position of an element in the nested list based on its name, we can use the special arguments .xpos and .xname. For instance, to look up the position of the element with name "C":
library(rrapply)
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
## get position C-node
(Cindex <- rrapply(mylist, condition = function(x, .xname) .xname == "C", f = function(x, .xpos) .xpos, how = "unlist"))
#> c.C1 c.C2
#> 3 1
We could then update its value in the nested list with:
## update value C-node
mylist[[Cindex]] <- 5
The two steps can also be combined directly in the call to rrapply:
rrapply(mylist, condition = function(x, .xname) .xname == "C", f = function(x) 5, how = "replace")
#> $a
#> [1] 1
#>
#> $b
#> $b$A
#> [1] 1
#>
#> $b$B
#> [1] 2
#>
#>
#> $c
#> $c$C
#> [1] 5
#>
#> $c$D
#> [1] 3

Get element names in rapply

The little-used rapply function should be a perfect solution to many problems (such as this one). However, it leaves the user-written function f with no ability to know where it is in the tree.
Is there any way to pass the name of the element in the nested list to f in an rapply call? Unfortunately, rapply calls .Internal pretty quickly.
I struggled with nested lists of arbitrary depth recently. Eventually, I arrived at a more or less acceptable solution for my case. It is not a direct answer to your question (it does not use rapply), but it seems to solve the same kind of problem. I hope it can be of some help.
Instead of trying to access the names of list elements inside rapply, I generated a vector of names and queried it for elements.
# Sample list with depth of 3
mylist <- list(a=-1, b=list(A=1,B=2), c=list(C=3,D=4, E=list(F=5,G=6)))
Generating the names vector is the tricky part in my case. Specifically, the names of list elements must be safe, i.e. must not contain the . symbol.
list.names <- strsplit(names(unlist(mylist)), split=".", fixed=TRUE)
node.names <- sapply(list.names, function(x) paste(x, collapse="$"))
node.names <- paste("mylist", node.names, sep="$")
node.names
[1] "mylist$a" "mylist$b$A" "mylist$b$B" "mylist$c$C" "mylist$c$D" "mylist$c$E$F"
[7] "mylist$c$E$G"
The next step is accessing a list element by its string name. I found nothing better than using a temporary file.
f <- function(x){
  fname <- tempfile()
  cat(x, file=fname)
  source(fname)$value
}
Here f just returns the value of the element named by x, where x is a string with the full name of a list element.
Finally, we can query the list in a pseudo-recursive way.
sapply(node.names, f)
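An alternative that avoids writing and sourcing code strings: `[[` accepts a character vector and indexes recursively into nested lists, so the split name paths can be used directly. A sketch under the same assumption that no element name contains a dot:

```r
mylist <- list(a = -1, b = list(A = 1, B = 2), c = list(C = 3, D = 4, E = list(F = 5, G = 6)))

# Each path is a character vector such as c("c", "E", "G")
paths <- strsplit(names(unlist(mylist)), split = ".", fixed = TRUE)

# Recursive indexing: mylist[[c("c", "E", "G")]] is mylist$c$E$G
values <- sapply(paths, function(p) mylist[[p]])
print(values)
```

This swaps the temporary-file step for plain indexing; it is not the method used in the answer above.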
Referring to the question Find the indices of an element in a nested list?, you can write:
rappply <- function(x, f) {
  setNames(lapply(seq_along(x), function(i) {
    if (!is.list(x[[i]])) f(x[[i]], .name = names(x)[i])
    else rappply(x[[i]], f)
  }), names(x))
}
then,
> mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
>
> rappply(mylist, function(x, ..., .name) {
+ switch(.name, "a" = 1, "C" = 5, x)
+ })
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C
[1] 5
$c$D
[1] 3
Update June 2020: the rrapply function in the rrapply package (an extended version of base rapply) supports this through the .xname argument of the f function. Inside f, the .xname variable evaluates to the name of the element in the nested list:
library(rrapply)
L <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
rrapply(L, f = function(x, .xname) paste(.xname, x, sep = " = "))
#> $a
#> [1] "a = 1"
#>
#> $b
#> $b$A
#> [1] "A = 1"
#>
#> $b$B
#> [1] "B = 2"
#>
#>
#> $c
#> $c$C
#> [1] "C = 1"
#>
#> $c$D
#> [1] "D = 3"

Resources