Using list elements in its definition - r

I am having a relatively simple problem with R, which I hope we can find a solution to.
My aim is to define the following list, in which the c element should be the sum of the a and b elements defined just before it:
ex.list = list(
a = 1,
b = 2,
c = a+b
)
The code throws an error (Error: object 'a' not found), indicating that we cannot use the a and b elements defined just above.
Of course, we can simply compute the sum outside the list definition:
ex.list = list(
a = 1,
b = 2
)
ex.list$c = ex.list$a + ex.list$b
Or use other variables when creating the list:
a.ex = 1
b.ex = 2
ex.list = list(
a = a.ex,
b = b.ex,
c = a.ex+b.ex
)
Unfortunately, I am not interested in the above solutions. Is there any way to do the sum in the list definition?

You can write your own list function that does lazy evaluation:
lazyList <- function(...) {
tmp <- match.call(expand.dots = FALSE)$`...`
lapply(tmp, eval, envir = tmp)
}
lazyList(
a = 1,
b = 2,
c = a+b
)
#$a
#[1] 1
#
#$b
#[1] 2
#
#$c
#[1] 3
However, obviously, the following is not possible with lazy evaluation:
lazyList(
a = 1,
b = 2,
d = c * a,
c = a+b
)
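As far as I can tell, the call above fails because every expression is evaluated against the still-unevaluated arguments, so c is the call a + b rather than a number when d is computed. A minimal sketch of an alternative that evaluates the arguments one at a time, in order, and makes each result visible to the expressions after it. The name seq_list is hypothetical, and the sketch assumes every argument is named and only refers to elements defined earlier:

```r
# Hypothetical helper: evaluates named arguments sequentially so that
# later elements can use the values of earlier ones.
seq_list <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated arguments
  env <- new.env(parent = parent.frame())      # scratch environment for results
  out <- list()
  for (nm in names(exprs)) {
    value <- eval(exprs[[nm]], envir = env)    # earlier results are visible here
    assign(nm, value, envir = env)
    out[[nm]] <- value
  }
  out
}

res <- seq_list(a = 1, b = 2, c = a + b, d = c * a)
str(res)
```

Here d can use c because c has already been evaluated and stored. A forward reference (d listed before c) would still fail.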

No, you can't do that. But you can do mad things like this:
> (function(a,b,c=a+b){list(a=a,b=b,c=c)})(11,22)
$a
[1] 11
$b
[1] 22
$c
[1] 33
But really, if you have a list you wish to construct in a particular way, write a function to do it. It's not difficult.

Related

Updating a list object in the global environment from within a function in R

I am trying to update a global list from inside a function.
Here is the code that does not work (can be sourced as a whole file):
require(rlang)
(my_list <- list(a = 1, b = "two", c = "set outside"))
print( paste("my_list$c is" , my_list$c) )
my_function <- function(x = 1, y = 2, parent_object_name = "my_list") {
z <- x + y # do some stuff (irrelevant here)
some_names <- "updated inside"
upper_env_object_name <- paste0(parent_object_name, "$c")
# browser()
# env_poke(env = env_tail(), upper_env_object_name, some_names) # does not work
# env_poke(env = env_parents()[[1]], upper_env_object_name, some_names) # does not work
env_poke(env = caller_env(), upper_env_object_name, some_names ) # creates `my_list$c` character vector
# force(env_poke(env = caller_env(), upper_env_object_name, some_names )) # creates `my_list$c` character vector
# browser()
# env_poke(env = caller_env(), paste0("as.list(",upper_env_object_name,")"), some_names) # creates as.list(my_list$c)` character vector
return(z)
}
my_function(x = 1, y = 2, parent_object_name = "my_list")
print(class(`my_list$c`))
print( `my_list$c`)
print( paste("my_list$c is" , my_list$c) )
I found this but it does not help:
Updating a nested list object in the global environment from within a function in R
Tried also with assign, and specifying the environment.
Background: I have some S3 subclasses and want to keep track of them in the parent class object, which is also a list. The subclass objects are created "on demand" and I want to have an overview of what was created. My workaround for now is to create a new vector in the global environment and update it with:
if (exists("global_names_list")) global_names_list <<- unique(rbind(global_names_list, some_names)) else global_names_list <<- some_names
Modify List in Global Environment
Subscript the environment like this:
f <- function(listname = "my_list", envir = .GlobalEnv) {
envir[[listname]]$c <- "some value"
}
# test
my_list <- list(a = 1, b = "two", c = "set outside")
f()
str(my_list)
## List of 3
## $ a: num 1
## $ b: chr "two"
## $ c: chr "some value"
Functional Approach
Note that working via side effects, as in the code above, is not the usual style in R. An object-oriented style (using Reference Classes or another object-oriented framework) or a functional style is more common. Here is an example of the functional style:
g <- function(x) modifyList(x, list(c = "somevalue"))
# test
my_list <- list(a = 1, b = "two", c = "set outside")
my_list <- g(my_list)
str(my_list)
## List of 3
## $ a: num 1
## $ b: chr "two"
## $ c: chr "somevalue"
Example of object oriented processing
Regarding the background paragraph at the end of the question, here is an example in which a top object contains the properties a and last (both numeric) and a method add. Any number of sub-objects can have a property b and inherit the add method, which adds a in top to b in the current object. last is the value of the last sum calculated by any sub-object.
library(proto)
top <- proto(a = 1,
last = NULL,
add = function(.) { top$last <- .$a + .$b; .$last }
)
sub1 <- top$proto(b = 2)
sub1$add()
# [1] 3
top$last
# [1] 3
sub2 <- top$proto(b = 3)
sub2$add()
# [1] 4
top$last
# [1] 4
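For the bookkeeping described in the question's background (keeping an overview of which subclass objects were created), a plain environment can serve as a registry, because environments are modified in place rather than copied. This is only a sketch in base R. The names registry, register_name, make_sub and parent_class are hypothetical and not taken from the question:

```r
# Hypothetical registry: an environment is modified in place, so updates
# made inside a function persist without <<- or assign().
registry <- new.env()
registry$created <- character(0)

register_name <- function(name, env = registry) {
  env$created <- unique(c(env$created, name))
  invisible(env$created)
}

# Hypothetical constructor that records each subclass name it creates.
make_sub <- function(name) {
  register_name(name)
  structure(list(name = name), class = c(name, "parent_class"))
}

s1 <- make_sub("sub_a")
s2 <- make_sub("sub_b")
s3 <- make_sub("sub_a")   # duplicate name is recorded only once
registry$created
```

The registry can also be stored as an element of the parent object, since a list may hold an environment.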
This is probably not the best way, but it is a way:
my_list <- list(a = 1, b = "two", c = "set outside")
my_function <- function(x = 1, y = 2, parent_object_name = "my_list"){
z <- x+y
some_names <- "updated inside"
lst <- get(parent_object_name)
lst$c <- some_names
assign(parent_object_name, lst, envir = .GlobalEnv)
return(z)
}
my_function()
#> [1] 3
#check
my_list
#> $a
#> [1] 1
#>
#> $b
#> [1] "two"
#>
#> $c
#> [1] "updated inside"

enumerate in R over dataframe rows

I'm trying to modify a function so that, if I pass in a data frame, I get the row number and the row as output.
These functions, taken from "Zip or enumerate in R?", are a good starting point for me:
zip <- function(...) {
mapply(list, ..., SIMPLIFY = FALSE)
}
enumerate <- function(...) {
zip(k=seq_along(..1), ...)
}
I modified enumerate to work as I want when the input is a dataframe:
enumerate2 <- function(...){
mod <- ..1
if(is.data.frame(mod)){
mod = split(mod, seq(nrow(mod)))
}
zip(k = seq_along(mod), ...)
}
So for example:
g = data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
enumerate2(v = g)
This will enumerate the rows of a dataframe, so I can do:
library(magrittr) # provides %>%
for(i in enumerate2(v = g)){
"rowNumber = %s, rowValues = %s" %>% sprintf(i$k, list(i$v)) %>% print
}
The problem is I get a warning:
Warning message:
In mapply(list, ..., SIMPLIFY = FALSE) :
longer argument not a multiple of length of shorter
Also, I'd rather the dataframe still be a dataframe so that I can do things like i$v$b to return the value of row i$k column b from the dataframe.
How can I get rid of the warning, and how can I keep the dataframe structure after split?
edit:
example 1 - data frame input
output:
enumerate2(v = data.frame(A = c(1, 2), B = c(3, 4)))
[[1]]
[[1]]$k
[1] 1
[[1]]$v
A B
1 1 3
[[2]]
[[2]]$k
[1] 2
[[2]]$v
A B
1 2 4
example 2 - list input
output:
enumerate2(v = LETTERS[1:2])
[[1]]
[[1]]$k
[1] 1
[[1]]$v
[1] "A"
[[2]]
[[2]]$k
[1] 2
[[2]]$v
[1] "B"
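The warning comes from the last line of enumerate2: the ... passed on to zip still holds the original, unsplit data frame, which mapply treats as a list of columns (length 2 for g), while k has one entry per row (length 3). A minimal sketch that passes the split object instead. It takes a single argument rather than ..., which is a simplification, and the name enumerate_rows is hypothetical:

```r
zip <- function(...) {
  mapply(list, ..., SIMPLIFY = FALSE)
}

# Hypothetical variant: split a data frame into one-row data frames,
# then zip the row numbers with the split pieces.
enumerate_rows <- function(v) {
  if (is.data.frame(v)) {
    v <- split(v, seq_len(nrow(v)))  # list of one-row data frames
  }
  zip(k = seq_along(v), v = v)
}

g <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
for (i in enumerate_rows(g)) {
  print(sprintf("rowNumber = %s, b = %s", i$k, i$v$b))
}
```

Because split keeps each piece as a data frame, i$v$b returns the value of column b for row i$k.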

R: why providing a list to ellipsis (...) when ellipsis is the last argument does not work?

I was writing a function that takes advantage of an ellipsis (aka ...), which lets you specify a variable number of additional arguments. I wanted to provide a list of arguments as an additional argument. Below is a reproducible example:
f <- function(..., a =1, b = 2){
l <- list(...)
print(l)
}
f(list(a = 2))
[[1]]
[[1]]$a
[1] 2
The goal of providing additional arguments in a list was to avoid name conflicts (the function inside f could take an argument named a and I wanted to ensure the possibility of providing it).
While changing the implementation I noticed that moving the ellipsis to the last place in the function declaration returns a different result (namely, an empty list):
g <- function(a =1, b = 2, ...){
l <- list(...)
print(l)
}
g(list(a = 2))
list()
Being curious, I added printing the default arguments to both functions:
f <- function(..., a =1, b = 2){
l <- list(...)
print(l)
print(c(a = a, b = b))
}
g <- function(a =1, b = 2, ...){
l <- list(...)
print(l)
print(c(a = a, b = b))
}
f(list(a = 2)) # results of calling f
[[1]]
[[1]]$a
[1] 2
a b
1 2
g(list(a = 2)) # results of calling g
list()
$a.a
[1] 2
$b
[1] 2
So, the first function (f) returned the intended output but the second one (g) ignored(?) the default argument a and somehow modified the list provided thanks to ellipsis.
I would like to understand why both outputs differ from each other. Does it mean that passing a list as an additional argument is possible only when an ellipsis is a first argument in a function call?
The way arguments work in R is that when you don't name your first argument, R assumes it goes into the first argument of the function. The same holds for the second, third and subsequent arguments in the argument list, EXCEPT for arguments that come after the ...: because R doesn't know how many arguments you intend to fit into the ..., you have to name anything that comes after it if you want to change its default.
So in your case, when you call f(), the object list(a=2) goes into the .... But in g(), that same object goes into a. The only way to get something into ... when it is placed at the end of the argument list, without supplying arguments for a and b, is to name it something that isn't a or b, e.g. g(c=list(a=1)).
I think the functions work as expected if you do not use list in the function call: f(a=2) and g(a=2) would both give list() as the value of the variable l.
Since you pass a list as argument, it is treated as unnamed variable and is assigned to the first formal parameter, which is different for f and g.
Things would be different, again, if you would do do.call(f, list(a=2)) and do.call(g, list(a=2)). In this case, the value 2 would be assigned to the expected formal parameter a.
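The matching rules described above can be checked directly. In this sketch the two functions return their arguments instead of printing them. The names f_dots_first and g_dots_last, and the argument name opts, are made up for illustration:

```r
# Same signatures as f and g in the question, but returning what was matched.
f_dots_first <- function(..., a = 1, b = 2) list(dots = list(...), a = a, b = b)
g_dots_last  <- function(a = 1, b = 2, ...) list(dots = list(...), a = a, b = b)

str(f_dots_first(list(a = 2)))        # unnamed list is captured by ...
str(g_dots_last(list(a = 2)))         # unnamed list is matched positionally to a
str(do.call(g_dots_last, list(a = 2)))  # list elements become named arguments
str(g_dots_last(opts = list(a = 2)))  # a name that is not a formal goes to ...
```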
The short answer to your question "Does it mean that passing a list as an additional argument is possible only when an ellipsis is a first argument in a function call?" is no. If you use the correct names or positions when you pass values in a function call, you won't have issues with incorrect matching of arguments.
It is better practice to name the arguments (at least from the second argument onwards) in the function call, so that name matching takes place with the intended effect.
If you omit the names in a function call, arguments are matched by position. This may have unintended effects if you do not pass the right values in the right positions.
Note the order of the argument names in the printed output. It follows the order of the arguments as defined in the function. This order only appears if you name the arguments correctly.
When arguments are named properly, there is no need to worry about their position. You can pass anything to them. It does not necessarily have to be a list.
f <- function(..., a =1, b = 2){
mc <- match.call(expand.dots = TRUE )
print(names(mc))
}
g <- function(a =1, b = 2, ...){
mc <- match.call(expand.dots = TRUE )
print(names(mc))
}
f(c1 = list(z = 2, f = 5), a = 1, b = 2) # [1] "" "c1" "a" "b"
f(a = 1, c1 = list(z = 2, f = 5), b = 2) # [1] "" "c1" "a" "b"
f(a = 1, c1 = list(z = 2, f = 5), b = 2, c2 = 4) # [1] "" "c1" "c2" "a" "b"
f(c1 = list(z = 2, f = 5), 1, 2, 4) # [1] "" "c1" "" "" ""
g(c1 = list(z = 2, f = 5), a = 1, b = 2) # [1] "" "a" "b" "c1"
g(a = 1, c1 = list(z = 2, f = 5), b = 2) # [1] "" "a" "b" "c1"
g(a = 1, c1 = list(z = 2, f = 5), b = 2, c2 = 4) # [1] "" "a" "b" "c1" "c2"
g(c1 = list(z = 2, f = 5), 1, 2, 4) # [1] "" "a" "b" "c1" ""

Find the indices of an element in a nested list?

I have a list like:
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
Is there a (loop-free) way to identify the positions of the elements? For example, if I want to replace the value of "C" with 5, and it does not matter where the element "C" is found, can I do something like:
Aindex <- find_index("A", mylist)
mylist[Aindex] <- 5
I have tried grepl, and in the current example, the following will work:
mylist[grepl("C", mylist)][[1]][["C"]]
but this requires an assumption of the nesting level.
The reason that I ask is that I have a deep list of parameter values, and a named vector of replacement values, and I want to do something like
replacements <- c(a = 1, C = 5)
for(i in names(replacements)){
indx <- find_index(i, mylist)
mylist[indx] <- replacements[i]
}
This is an adaptation of my previous question, "update a node (of unknown depth) using xpath in R?", using R lists instead of XML.
One method is to use unlist and relist.
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
tmp <- as.relistable(mylist)
tmp <- unlist(tmp)
tmp[grep("(^|\\.)C$", names(tmp))] <- 5  # escape the dot so it matches a literal "."
tmp <- relist(tmp)
Because list names from unlist are concatenated with a ., you'll need to be careful with grep and how your parameters are named. If there is not a . in any of your list names, this should be fine. Otherwise, names like list(.C = 1) will fall into the pattern and be replaced.
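To see which names a pattern will hit before replacing anything, it can help to inspect the flattened names first. A small sketch. The pattern with an escaped dot is an assumption about what the regular expression is meant to match (a literal "." or the start of the name, followed by C):

```r
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))

flat <- unlist(mylist)
names(flat)                        # nested names are joined with "."

# Which flattened elements are named C, at any depth?
hits <- grepl("(^|\\.)C$", names(flat))
names(flat)[hits]
```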
Based on this question, you could try it recursively like this:
find_and_replace <- function(x, find, replace){
if(is.list(x)){
n <- names(x) == find
x[n] <- replace
lapply(x, find_and_replace, find=find, replace=replace)
}else{
x
}
}
Testing in a deeper mylist:
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3, d = list(C=10, D=55)))
find_and_replace(mylist, "C", 5)
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C ### it worked
[1] 5
$c$D
[1] 3
$c$d
$c$d$C ### it worked
[1] 5
$c$d$D
[1] 55
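If the positions themselves are needed, as with the find_index function the question asks for, a similar recursion can collect index paths instead of replacing values. This is only a sketch: it uses a plain for loop inside the recursion, so it is not loop-free, and the return format (a list of integer vectors, one per match) is an assumption about what such a function would look like:

```r
# Sketch of find_index: returns one integer path per element named `name`.
find_index <- function(x, name, path = integer(0)) {
  hits <- list()
  for (i in seq_along(x)) {
    here <- c(path, i)
    if (identical(names(x)[i], name)) hits <- c(hits, list(here))
    if (is.list(x[[i]])) hits <- c(hits, find_index(x[[i]], name, here))
  }
  hits
}

mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
idx <- find_index(mylist, "C")

# [[ accepts an index vector and descends one level per entry.
for (p in idx) mylist[[p]] <- 5
str(mylist)
```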
This can now also be done using rrapply in the rrapply-package (an extended version of base rapply). To return the position of an element in the nested list based on its name, we can use the special arguments .xpos and .xname. For instance, to look up the position of the element with name "C":
library(rrapply)
mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
## get position C-node
(Cindex <- rrapply(mylist, condition = function(x, .xname) .xname == "C", f = function(x, .xpos) .xpos, how = "unlist"))
#> c.C1 c.C2
#> 3 1
We could then update its value in the nested list with:
## update value C-node
mylist[[Cindex]] <- 5
The two steps can also be combined directly in the call to rrapply:
rrapply(mylist, condition = function(x, .xname) .xname == "C", f = function(x) 5, how = "replace")
#> $a
#> [1] 1
#>
#> $b
#> $b$A
#> [1] 1
#>
#> $b$B
#> [1] 2
#>
#>
#> $c
#> $c$C
#> [1] 5
#>
#> $c$D
#> [1] 3

Get element names in rapply

The little-used rapply function should be a perfect solution to many problems (such as this one). However, it leaves the user-written function f with no ability to know where it is in the tree.
Is there any way to pass the name of the element in the nested list to f in an rapply call? Unfortunately, rapply calls .Internal pretty quickly.
I struggled with nested lists of arbitrary depth recently. Eventually, I came up with a more or less acceptable solution for my case. It is not a direct answer to your question (no rapply usage), but it seems to solve the same kind of problem. I hope it can be of some help.
Instead of trying to access the names of list elements inside rapply, I generated a vector of names and queried it for elements.
# Sample list with depth of 3
mylist <- list(a=-1, b=list(A=1,B=2), c=list(C=3,D=4, E=list(F=5,G=6)))
Generating the names vector is tricky in my case. Specifically, the names of list elements must be safe, i.e. must not contain the . symbol.
list.names <- strsplit(names(unlist(mylist)), split=".", fixed=TRUE)
node.names <- sapply(list.names, function(x) paste(x, collapse="$"))
node.names <- paste("mylist", node.names, sep="$")
node.names
[1] "mylist$a" "mylist$b$A" "mylist$b$B" "mylist$c$C" "mylist$c$D" "mylist$c$E$F"
[7] "mylist$c$E$G"
The next step is accessing a list element by its string name. I found nothing better than using a temporary file.
f <- function(x){
fname <- tempfile()
cat(x, file=fname)
source(fname)$value
}
Here f just returns the value of x, where x is a string with the full name of a list element.
Finally, we can query the list in a pseudo-recursive way.
sapply(node.names, f)
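The temporary file is not strictly needed: base R's [[ accepts a character vector and indexes recursively into nested lists, so the split names can be used as paths directly. A minimal sketch, assuming as above that no element name contains a dot. The names paths and values are illustrative:

```r
mylist <- list(a = -1, b = list(A = 1, B = 2), c = list(C = 3, D = 4, E = list(F = 5, G = 6)))

# Each flattened name becomes a character path such as c("c", "E", "F").
paths <- strsplit(names(unlist(mylist)), split = ".", fixed = TRUE)

# Recursive indexing with [[ follows the path one level per element.
values <- sapply(paths, function(p) mylist[[p]])
values
```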
Referring to the question Find the indices of an element in a nested list?, you can write:
rappply <- function(x, f) {
setNames(lapply(seq_along(x), function(i) {
if (!is.list(x[[i]])) f(x[[i]], .name = names(x)[i])
else rappply(x[[i]], f)
}), names(x))
}
then,
> mylist <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
>
> rappply(mylist, function(x, ..., .name) {
+ switch(.name, "a" = 1, "C" = 5, x)
+ })
$a
[1] 1
$b
$b$A
[1] 1
$b$B
[1] 2
$c
$c$C
[1] 5
$c$D
[1] 3
Update June 2020: the rrapply function in the rrapply package (an extended version of base rapply) allows you to do this by defining the argument .xname in the f function. Inside f, the .xname variable will evaluate to the name of the element in the nested list:
library(rrapply)
L <- list(a = 1, b = list(A = 1, B = 2), c = list(C = 1, D = 3))
rrapply(L, f = function(x, .xname) paste(.xname, x, sep = " = "))
#> $a
#> [1] "a = 1"
#>
#> $b
#> $b$A
#> [1] "A = 1"
#>
#> $b$B
#> [1] "B = 2"
#>
#>
#> $c
#> $c$C
#> [1] "C = 1"
#>
#> $c$D
#> [1] "D = 3"

Resources