This is probably a trivial question.
Given a vector of characters, some of which are repeating:
vec <- c("a","b","d","e","e","f","g","a","d")
I'm looking for an efficient function that will return for each unique element in vec the indices of where it appears in vec.
I imagine that the return value would be something like this list:
list(a = c(1,8), b = 2, d = c(3,9), e = c(4,5), f = 6, g = 7)
Here are a few options:
lapply(setNames(unique(vec),unique(vec)), function(x) which(x == vec) )
# or to avoid setNames and still ensure you get a list:
sapply(unique(vec), function(x) which(x == vec), simplify=FALSE)
# or even better but maybe not as extensible:
split(seq_along(vec),vec)
All giving:
$a
[1] 1 8
$b
[1] 2
$d
[1] 3 9
$e
[1] 4 5
$f
[1] 6
$g
[1] 7
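One difference between these options is worth checking: split() orders its result by the sorted levels of the grouping vector, whereas the unique()-based versions keep the order of first appearance. A minimal sketch, assuming a made-up input (vec2 is not from the question) in which the two orders differ:

```r
# Hypothetical input in which first-appearance order differs from sorted order
vec2 <- c("b", "a", "b", "c", "a")

by_unique <- sapply(unique(vec2), function(x) which(x == vec2), simplify = FALSE)
by_split  <- split(seq_along(vec2), vec2)

names(by_unique)  # order of first appearance: "b" "a" "c"
names(by_split)   # sorted factor levels:      "a" "b" "c"

# Reordering the split() result recovers the first-appearance order
by_split_reordered <- by_split[unique(vec2)]
```

If the order of the groups matters downstream, the last line is probably the cheapest way to get it back.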
Proper subset: A proper subset S' of a set S is a subset that is strictly contained in S and so excludes S itself (note I am also excluding the empty set).
Suppose you have the following vectors in a list:
a = c(1,2)
b = c(1,3)
c = c(2,4)
d = c(1,2,3,4)
e = c(2,4,5)
f = c(1,2,3)
My aim is to keep only vectors which have no proper subset within the list, which in this example would be a, b and c. The following code is my solution,
possibilities = list(a,b,c,d,e,f)
final.list <- possibilities
for (i in possibilities) {
  for (j in rev(possibilities)) {
    if (all(i %in% j) && !all(j %in% i)) {
      final.list <- final.list[!(final.list %in% list(j))]
    } else {
      final.list <- final.list
    }
  }
}
which gives the intended output, though I am concerned with the scalability of this approach. Does anyone have an idea for a more efficient approach? Thanks!
* Note that for my true purpose the length of the possibilities list--and its sub-vectors--can grow quite large.
One purrr option could be:
map2(.x = possibilities,
     .y = seq_along(possibilities),
     ~ !any(map_lgl(possibilities[-.y], function(z) all(z %in% .x))))
[[1]]
[1] TRUE
[[2]]
[1] TRUE
[[3]]
[1] TRUE
[[4]]
[1] FALSE
[[5]]
[1] FALSE
[[6]]
[1] FALSE
To keep only the target vectors:
keep(possibilities,
     map2_lgl(.x = possibilities,
              .y = seq_along(possibilities),
              ~ !any(map_lgl(possibilities[-.y], function(z) all(z %in% .x)))))
[[1]]
[1] 1 2
[[2]]
[1] 1 3
[[3]]
[1] 2 4
Here is a base R option
final.list <- subset(
  possibilities,
  sapply(
    seq_along(possibilities),
    function(k) {
      !any(sapply(
        possibilities[-k],
        function(v) all(v %in% possibilities[[k]]) && length(v) < length(possibilities[[k]])
      ))
    }
  )
)
which gives
> final.list
[[1]]
[1] 1 2
[[2]]
[1] 1 3
[[3]]
[1] 2 4
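For longer lists, one way to cut down on comparisons is to test each vector only against the strictly shorter ones, since only those can be proper subsets. The following is a sketch under the assumption that no vector contains repeated values; has_proper_subset and keep_flags are hypothetical names, not part of any answer above:

```r
possibilities <- list(c(1, 2), c(1, 3), c(2, 4), c(1, 2, 3, 4), c(2, 4, 5), c(1, 2, 3))

# Hypothetical helper: does sets[[k]] contain a proper subset elsewhere in sets?
# Assumes each vector has no repeated values, so "shorter" implies "proper".
has_proper_subset <- function(k, sets) {
  target <- sets[[k]]
  candidates <- sets[lengths(sets) < length(target)]
  any(vapply(candidates, function(v) all(v %in% target), logical(1)))
}

keep_flags <- !vapply(seq_along(possibilities), has_proper_subset,
                      logical(1), sets = possibilities)
final.list <- possibilities[keep_flags]
```

This is still quadratic in the worst case, but the length filter skips every pair that cannot match.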
I have the following code in R:
a <- 2
evaluate <- function(x){
  b <- 2*x
  c <- 3*x
  d <- 4*x
  out <- list("b" = b, "c" = c, "d" = d)
  return(out)
}
evaluate(a)
I obtain something like
$b
[1] 4
$c
[1] 6
$d
[1] 8
How can I compute something like b + c + d ?
There are many options:
# with
with(evaluate(a), b + c + d)
[1] 18
# unlist the unnamed output object
sum(unlist(evaluate(a)))
[1] 18
# subset a named output object
result <- evaluate(a)
result$b + result$c + result$d
[1] 18
# subset an unnamed output object
evaluate(a)$b + evaluate(a)$c + evaluate(a)$d
[1] 18
# custom function with fancy arguments
f <- function(...) {
  args <- unlist(...)
  sum(args)
}
f(evaluate(a))
[1] 18
Also, +1 from @Gregor (double-bracket list subsetting):
result[["b"]] + result[["c"]] + result[["d"]]
[1] 18
In R you can access list members with the $ operator followed by the member name. In your code, for example:
result = evaluate(a)
result$b + result$c + result$d
Your function returns a list. You could return a vector instead and then use the sum() function to compute the sum of its elements. If you must use a list, the Reduce() function can work.
l <- list(2, 3, 4)
v <- c(2,3,4)
sum(v) # returns 9
Reduce("+", l) # returns 9
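Applied to the function from the question, the two approaches agree for a scalar input but differ when x is a vector: Reduce() adds the list elements position by position, while sum(unlist()) collapses everything to one number. A small sketch (evaluate is restated compactly here; the variable names are only for illustration):

```r
# Compact restatement of the question's function
evaluate <- function(x) list(b = 2 * x, c = 3 * x, d = 4 * x)

# Scalar input: both give the same total
total_reduce <- Reduce(`+`, evaluate(2))     # 4 + 6 + 8
total_unlist <- sum(unlist(evaluate(2)))

# Vector input: Reduce() is elementwise, sum(unlist()) is a grand total
elementwise <- Reduce(`+`, evaluate(c(1, 2)))
grand_total <- sum(unlist(evaluate(c(1, 2))))
```

Which one is wanted depends on whether b + c + d is meant per element of x or overall.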
How do I concatenate two lists in R in such a way that duplicate elements from the second one will override corresponding elements from the first one?
I tried c(), but it keeps duplicates:
l1 <- list(a = 1, b = 2)
l1 <- c(l1, list(b = 3, c = 4))
# $a
# [1] 1
#
# $b
# [1] 2
#
# $b
# [1] 3
#
# $c
# [1] 4
I want something that will produce this instead:
# $a
# [1] 1
#
# $b
# [1] 3
#
# $c
# [1] 4
Is there something simpler than calling some *apply function?
This works with duplicated:
> l1[!duplicated(names(l1), fromLast=TRUE)]
$a
[1] 1
$b
[1] 3
$c
[1] 4
I suspect that you could get the behavior you want by using the "rlist" package; specifically, list.merge seems to do what you are looking for:
l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)
library(rlist)
list.merge(l1, l2)
# $a
# [1] 1
#
# $b
# [1] 3
#
# $c
# [1] 4
Try list.merge(l2, l1) to get a sense of the function's behavior.
As noted by @PeterDee, list.merge essentially extends modifyList to accept more than two lists. With just two lists, as in this example, you can skip loading any packages and call modifyList directly:
modifyList(l1, l2)
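Two behaviours of modifyList are worth knowing before relying on it: by default a NULL in the second list removes the element from the first, and nested lists are merged recursively rather than replaced wholesale. A short sketch (the nested list and the variable names are made up for illustration):

```r
l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)

# Elements of l2 override same-named elements of l1; new names are appended
merged <- modifyList(l1, l2)

# A NULL value deletes the element (keep.null = FALSE is the default)
dropped <- modifyList(l1, list(a = NULL))

# Nested lists are merged element by element
nested <- modifyList(list(p = list(x = 1, y = 2)), list(p = list(y = 3)))
```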
I am sure there is a quicker way than:
l1 <- list(a = 1, b = 2)
l2 <- list(b = 3, c = 4)
l1 <- c(l1[!(names(l1) %in% names(l2))], l2)
This takes the elements of l1 whose names are not in l2 and concatenates them with l2.
I would suggest union:
l1 <- list(a = 1, b = 3)
l2 <- list(b = 3, c = 4)
union(l1, l2)
This works like a union in mathematics: elements are compared by value, not by name.
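Because union() looks at values only, the example above yields three elements only because l1$b was already 3. A sketch with the question's original l1 (where b = 2) shows that nothing is overridden:

```r
l1 <- list(a = 1, b = 2)   # original values from the question
l2 <- list(b = 3, c = 4)

u <- union(l1, l2)

# All four distinct values survive; the b in l2 does not replace the b in l1
n_elements <- length(u)
values <- unlist(u, use.names = FALSE)
```

So union() is only a substitute for name-based overriding when same-named elements happen to hold equal values.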
I have a list like:
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
Is there a (loop-free) way to identify the positions of the elements? For example, if I want to replace the value of "C" with 5, wherever the element "C" is found, can I do something like:
Cindex <- find_index("C", mylist)
mylist[Cindex] <- 5
I have tried grepl, and in the current example, the following will work:
mylist[grepl("C", mylist)][[1]][["C"]]
but this requires an assumption of the nesting level.
The reason that I ask is that I have a deep list of parameter values, and a named vector of replacement values, and I want to do something like
replacements <- c(a = 1, C = 5)
for(i in names(replacements)){
  indx <- find_index(i, mylist)
  mylist[indx] <- replacements[i]
}
This is an adaptation of my previous question, "update a node (of unknown depth) using xpath in R?", using R lists instead of XML.
One method is to use unlist and relist.
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
tmp <- as.relistable(mylist)
tmp <- unlist(tmp)
tmp[grep("(^|\\.)C$", names(tmp))] <- 5
tmp <- relist(tmp)
Because list names from unlist are concatenated with a ., you'll need to be careful with grep and how your parameters are named. If there is not a . in any of your list names, this should be fine. Otherwise, names like list(.C = 1) will fall into the pattern and be replaced.
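The dot matters in the pattern itself as well: in a regular expression an unescaped . matches any character, so it has to be written as \\. to match only the separator that unlist inserts. A small sketch on a made-up vector of flattened names (nms is hypothetical):

```r
# Hypothetical flattened names, as unlist() might produce them
nms <- c("a", "b.A", "c.C", "c.abC", "C")

# Escaped dot: matches "C" at top level or directly after a "." separator
escaped <- grep("(^|\\.)C$", nms, value = TRUE)

# Unescaped dot: also matches any name that merely ends in "C"
unescaped <- grep("(^|.)C$", nms, value = TRUE)
```

Here escaped picks out "c.C" and "C", while unescaped additionally picks up "c.abC".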
Based on this question, you could try it recursively like this:
find_and_replace <- function(x, find, replace){
  if(is.list(x)){
    n <- names(x) == find
    x[n] <- replace
    lapply(x, find_and_replace, find=find, replace=replace)
  }else{
    x
  }
}
Testing in a deeper mylist:
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3, d = list(C=10, D=55)))
find_and_replace(mylist, "C", 5)
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C ### it worked
[1] 5
$c$D
[1] 3
$c$d
$c$d$C ### it worked
[1] 5
$c$d$D
[1] 55
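To cover the use case from the question (a named vector of replacement values), the recursive function can simply be called once per name. A minimal sketch, restating find_and_replace so the block runs on its own; the replacement values are made up:

```r
find_and_replace <- function(x, find, replace) {
  if (is.list(x)) {
    n <- names(x) == find
    x[n] <- replace
    lapply(x, find_and_replace, find = find, replace = replace)
  } else {
    x
  }
}

mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
replacements <- c(a = 10, C = 5)   # illustrative values

# Apply each replacement in turn, at whatever depth the name occurs
for (nm in names(replacements)) {
  mylist <- find_and_replace(mylist, nm, replacements[[nm]])
}
```

Note that matching is case-sensitive, so "a" and "A" are treated as different names.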
This can now also be done using rrapply in the rrapply-package (an extended version of base rapply). To return the position of an element in the nested list based on its name, we can use the special arguments .xpos and .xname. For instance, to look up the position of the element with name "C":
library(rrapply)
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
## get position C-node
(Cindex <- rrapply(mylist, condition = function(x, .xname) .xname == "C", f = function(x, .xpos) .xpos, how = "unlist"))
#> c.C1 c.C2
#> 3 1
We could then update its value in the nested list with:
## update value C-node
mylist[[Cindex]] <- 5
The two steps can also be combined directly in the call to rrapply:
rrapply(mylist, condition = function(x, .xname) .xname == "C", f = function(x) 5, how = "replace")
#> $a
#> [1] 1
#>
#> $b
#> $b$A
#> [1] 1
#>
#> $b$B
#> [1] 2
#>
#>
#> $c
#> $c$C
#> [1] 5
#>
#> $c$D
#> [1] 3
The little-used rapply function should be a perfect solution to many problems (such as this one). However, it leaves the user-written function f with no way of knowing where it is in the tree.
Is there any way to pass the name of the element in the nested list to f in an rapply call? Unfortunately, rapply calls .Internal pretty quickly.
I struggled with nested lists of arbitrary depth recently. Eventually, I arrived at a more or less acceptable solution for my case. It is not a direct answer to your question (no rapply usage), but it seems to solve the same kind of problem. I hope it can be of some help.
Instead of trying to access the names of list elements inside rapply, I generated a vector of names and queried it for elements.
# Sample list with depth of 3
mylist <- list(a=-1, b=list(A=1,B=2), c=list(C=3,D=4, E=list(F=5,G=6)))
Generating the vector of names is the tricky part in my case. Specifically, the names of list elements should be safe, i.e. contain no . symbol.
list.names <- strsplit(names(unlist(mylist)), split=".", fixed=TRUE)
node.names <- sapply(list.names, function(x) paste(x, collapse="$"))
node.names <- paste("mylist", node.names, sep="$")
node.names
[1] "mylist$a" "mylist$b$A" "mylist$b$B" "mylist$c$C" "mylist$c$D" "mylist$c$E$F"
[7] "mylist$c$E$G"
The next step is accessing a list element by its string name. I found nothing better than using a temporary file.
f <- function(x){
  fname <- tempfile()
  cat(x, file=fname)
  source(fname)$value
}
Here f just returns the value of x, where x is a string with the full name of a list element.
Finally, we can query the list in a pseudo-recursive way.
sapply(node.names, f)
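The temporary file can probably be avoided altogether: [[ accepts a character vector and indexes a nested list recursively, so the split names can be used directly as paths. A sketch under the same assumption that no element name contains a dot (paths and values are names introduced here for illustration):

```r
mylist <- list(a = -1, b = list(A = 1, B = 2), c = list(C = 3, D = 4, E = list(F = 5, G = 6)))

# One character vector per leaf, e.g. c("c", "E", "F")
paths <- strsplit(names(unlist(mylist)), split = ".", fixed = TRUE)

# Recursive indexing: mylist[[c("c", "E", "F")]] is mylist$c$E$F
values <- vapply(paths, function(p) mylist[[p]], numeric(1))
names(values) <- vapply(paths, paste, character(1), collapse = "$")
```

This also avoids evaluating strings as code, and it does not depend on mylist living in the global environment.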
Referring to the question "Find the indices of an element in a nested list?", you can write:
rappply <- function(x, f) {
  setNames(lapply(seq_along(x), function(i) {
    if (!is.list(x[[i]])) f(x[[i]], .name = names(x)[i])
    else rappply(x[[i]], f)
  }), names(x))
}
then,
> mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
>
> rappply(mylist, function(x, ..., .name) {
+ switch(.name, "a" = 1, "C" = 5, x)
+ })
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C
[1] 5
$c$D
[1] 3
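The same idea extends to passing the full path of names rather than only the leaf name, which lets f tell apart two leaves that share a name at different depths. A minimal sketch; walk_with_path is a hypothetical helper, not a base R function:

```r
# Hypothetical helper: like the function above, but f receives the whole path
walk_with_path <- function(x, f, path = character(0)) {
  if (!is.list(x)) return(f(x, path))
  nms <- names(x)
  out <- lapply(seq_along(x), function(i) {
    walk_with_path(x[[i]], f, c(path, nms[i]))
  })
  setNames(out, nms)
}

mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))

# Label every leaf with its path, joined by "$"
labelled <- walk_with_path(mylist, function(x, path) {
  paste(paste(path, collapse = "$"), x, sep = " = ")
})
```

With this input, labelled$c$C is the string "c$C = 1" and the shape of the list is preserved.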
Update June 2020: the rrapply function in the rrapply package (an extended version of base rapply) makes this possible by defining the argument .xname in the f function. Inside f, the .xname variable will evaluate to the name of the element in the nested list:
library(rrapply)
L <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
rrapply(L, f = function(x, .xname) paste(.xname, x, sep = " = "))
#> $a
#> [1] "a = 1"
#>
#> $b
#> $b$A
#> [1] "A = 1"
#>
#> $b$B
#> [1] "B = 2"
#>
#>
#> $c
#> $c$C
#> [1] "C = 1"
#>
#> $c$D
#> [1] "D = 3"