In my program, I am recursively going over a nested list and adding elements to an overall list that I will return. There are a few details to be taken care of, so I can't just use unlist.
formulaPart is taken to be a formula object.
My code is:
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart))
if(typeof(currentVar == 'language'
parseVariables(currentVar, myList)
else
if(! toString(currentVar) %in% c(\\various characters)
list <- c(list, currentVar)
}
I have checked that the function correctly adds elements to the list when it should. The problem is that the list loses elements due to recursion. The elements added during one inner recursive call are not saved for another recursive call.
If this was in C++, I could just use a pointer; the same for Java. However, I do not understand how to handle this error in R.
R does something like pass-by-value, so you can't modify (most) existing objects just by passing them into a function. If you want to add on to something recursively, one trick would be to use an environment instead, which get passed by reference. This can easily be coerced to list when you're done.
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart)) {
if(typeof(currentVar) == 'language') {
parseVariables(currentVar, myList)
}
else {
if(! toString(currentVar) %in% c(':', '+', '~'))
assign(toString(currentVar), currentVar, myList)
}
}
}
f1 <- z ~ a:b + x
f2 <- z ~ x + y
myList <- new.env()
parseVariables(f1, myList)
parseVariables(f2, mylist)
ls(myList)
# [1] "a" "b" "x" "z"
as.list(myList)
# $x
# x
#
# $z
# z
#
# $a
# a
#
# $b
# b
Related
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change a bit the syntax and define the function fetch_name for this purpose, which is able to get the name of the rhs from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b") will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect but would work in non interactive code, as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed, the <- is outside of the scope of your function and could only be located looking at the stack of function calls, but this fail. So a possible solution could be to redefine the '<-' else than as a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind following code can fit your needs, but you can define a "verbose assignation" :
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situation where the '<-' is not really an assignation :
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
>
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
I am scratching my head at the following problem:
I am creating two functions inside a for loop with parameters that depend on some dataframe. Each function is then put inside a list.
Printing the parameters inside the for loop shows that eachh function is well defined. Yet, when I use those outside of the loop, only the last parameters are used for both functions. The following example should make that clearer.
dt <- data.frame(color = c("red", "blue"),
a = c(3,9),
b = c(1.3, 1.8))
function_list <- list()
for (col in dt$color) {
a <- dt$a[dt$color == col]
b <- dt$b[dt$color == col]
foo <- function(x) {
a*x^b
}
print(paste(col, foo(1)))
function_list[[col]] <- foo
}
[1] "red 3"
[1] "blue 9"
function_list[["red"]](1)
[1] 9
function_list[["blue"]](1)
[1] 9
To note, this is inspired from the following question: R nested for loop to write multiple functions and plot them
The equivalent solution with assign and get works (my answer to the previous question).
The relevant values of a and b are those when you call the function and not when you define it. The way you create the list, they are taken from the global environment. The solution is to create closures. I'd use Map for this, but you can do the same with a for loop:
funs <- Map(function(a, b) function(x) a*x^b, a = dt$a, b = dt$b)
print(funs)
#[[1]]
#function (x)
#a * x^b
#<environment: 0x000000000a9a4298>
#
#[[2]]
#function (x)
#a * x^b
#<environment: 0x000000000a9a3728>
Notice the different environments.
environment(funs[[1]])$a
#[1] 3
environment(funs[[2]])$a
#[1] 9
funs[[1]](1)
#[1] 3
funs[[2]](1)
#[1] 9
Your confusion will be solved by going a bit deeper with Environments
Let's check why your code doesn't work. When I try to print(function_list), you can see that both of the functions stored will return a*x^b.
# Part 1 : Why it doesn't work
# --------------------------
print(function_list)
# $red
# function (x)
# {
# a * x^b
# }
#
# $blue
# function (x)
# {
# a * x^b
# }
If you try to remove a and re-run the function, an error will be returned .
rm(a)
function_list[['red']](1)
# Error in function_list[["red"]](1) : object 'a' not found
.
And now to how to make your code work:
There is more than one way to make it work, most of which will require either playing around with your environments or changing the data structure.
One way to manage your environments - in such way that it will keep your values and not search for the variable in the global environment - is returning a function from the function.
# Part 2 : How to make it work
# ----------------------------
function_list <- list()
for (col in dt$color) {
a <- dt$a[dt$color == col]
b <- dt$b[dt$color == col]
foo1 <- function(inner.a, inner.b) {
return(function(x) {inner.a*x^inner.b})
}
foo2 <- foo1(a,b)
print(paste(col, foo2(1)))
function_list[[col]] <- foo2
}
Now, if we check what's in the function_list, you can see that the functions are in two environments
print(function_list)
# $red
# function (x)
# {
# inner.a * x^inner.b
# }
# <environment: 0x186fb40>
#
# $blue
# function (x)
# {
# inner.a * x^inner.b
# }
# <environment: 0x2536438>
The output is also as expected. And even when we remove a, it will still work as expected.
function_list[['red']](1) # 3
function_list[['blue']](1) # 9
rm(a)
function_list[['red']](1) #[1] 3
I think that the for loop does not create new environments (you can check this by print(environment) within the loop), so the values of a and b are taken by foo in the global environment where they are 9 and 1.8, i.e. their last assigned values.
Edit: this question is silly, I confused [ and [[ (thanks #josilber), but I can't delete it.
How can one make an infinitely recursive list, L == L[1], in R ?
A terrible way (for an R novice)
is to look at dataset fmri in package
astsa:
data( fmri, package="astsa" ) # a list with fmri[1] == fmri[1][1] ... ??
print_recursive = function( x ){ # x x[1] x[1][1] ... while list
for( j in 1:5 ){
cat( class(x), len(x), names(x), "\n" )
str(x)
cat( "\n" )
x = x[1] # <-- wrong, should be x[[1]]
if( class(x) != "list" ) break
}
x
}
x = print_recursive( fmri )
The answer to "How can one make an infinitely recursive list, L == L[1], in R" is that it is impossible, because you would need an infinite amount of memory to store a list of infinite recursive depth.
That being said, you can build a recursive list of a specified depth with a simple for loop, at each iteration creating a new list that stores the old list as one of its elements:
l <- list()
depth <- 50
for (k in seq(depth-1)) {
l <- list(l)
}
You could write a recursive function to check the depth of a recursive list:
recursive.depth <- function(l) {
if (!is.list(l)) 0
else if (length(l) == 0) 1
else 1+max(sapply(l, recursive.depth))
}
recursive.depth(l)
# [1] 50
recursive.depth(fmri)
# [1] 1
Getting back to the example from the question, the list you have is actually not recursive at all (it's just a list of matrices). The reason you think it is recursive is that indexing l[1] subsets the list (aka it returns the list with just its first element). For instance, consider the following very simple list:
(l <- list(2))
# [[1]]
# [1] 2
No matter how many times I subset with the [ notation, I will get back the exact same list:
l[1]
# [[1]]
# [1] 2
l[1][1]
# [[1]]
# [1] 2
Running this list l through your print_recursive function will also result in an infinite loop. If you wanted to actually extract the first element of the list instead of subsetting the list itself, you should use the [[ notation (e.g. l[[1]]).
In R, I have a list of functions (strategies for a simulation). For example:
a <- function(x){
return(x)
}
b <- function(y){
return(y)
}
funclist <- list(a,b)
I'd like to write some code that returns the name of each function. Normally, for functions I would use:
as.character(substitute(a))
But this does not work for the list, as it just would return the list name (as expected). I then tried lapply:
> lapply(X = funclist,FUN = substitute)
Error in lapply(X = funclist, FUN = substitute) :
'...' used in an incorrect context
But get the above error.
Ideally I would get (lapply solution):
[[1]]
[1] "a"
[[2]]
[1] "b"
or even (sapply solution):
[1] "a" "b"
After you do
funclist <- list(a,b)
The parameters a and b are evaluated and the functions they point to are returned. There is no way to get back to the original names. (The substitute() "trick" works on parameters passed to functions as promises. It will not work on evaluated called without additional escaping.)
If you want to retain names, it's best to use a named list. You can do
funclist <- list(a=a,a=b)
or
funclist <- setNames(list(a,b), c("a","b"))
or even use mget() here
funclist <- mget(c("a","b"))
All these methods will returned a named list and you can use
names(funclist)
# [1] "a" "b"
to get the names
I'm working on an R package that has a number of functions that follow a non-R-standard practice of modifying in place the object passed in as an argument. This normally works OK, but fails when the object to be modified is on a list.
An function to give an example of the form of the assignments:
myFun<-function(x){
xn <- deparse(substitute(x))
ev <- parent.frame()
# would do real stuff here ..
# instead set simple value to modify local copy
x[[1]]<-"b"
# assign in parent frame
if (exists(xn, envir = ev))
on.exit(assign(xn, x, pos = ev))
# return invisibly
invisible(x)
}
This works:
> myObj <-list("a")
> myFun(myObj)
> myObj
[[1]]
[1] "b"
But it does not work if the object is a member of a list:
> myObj <-list("a")
> myList<-list(myObj,myObj)
> myFun(myList[[1]])
> myList
[[1]]
[[1]][[1]]
[1] "a"
[[2]]
[[2]][[1]]
[1] "a"
After reading answers to other questions here, I see the docs for assign clearly state:
assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.
Since there is an existing codebase using these functions, we cannot abandon the modify-in-place syntax. Does anyone have suggestions for workarounds or alternative approaches for modifying objects which are members of a list in a parent frame?
UPDATE:
I've considered trying to roll my own assignment function, something like:
assignToListInEnv<-function(name,env,value){
# assume name is something like "myList[[1]]"
#check for brackets
index<-regexpr('[[',name,fixed=TRUE)[1]
if(index>0){
lname<-substr(name,0,index-1)
#check that it exists
if (exists(lname,where=env)){
target<-get(lname,pos=env)
# make sure it is a list
if (is.list(target)){
eval(parse(text=paste('target',substr(name,index,999),'<-value',sep='')))
assign(lname, target, pos = env)
} else {
stop('object ',lname,' is not a list in environment ',env)
}
} else {
stop('unable to locate object ',lname,' in frame ',env)
}
}
}
But it seems horrible brittle, would need to handle many more cases ($ and [ as well as [[) and would probably still fail for [[x]] because x would be evaluated in the wrong frame...
Since it was in the first search results to my query, here's my solution :
You can use paste() with "<<-" to create an expression which will assign the value to your list element when evaluated.
assignToListInEnv<-function(name, value, env = parent.frame()){
cl <- as.list(match.call())
lang <- str2lang(paste(cl["name"], "<<-", cl["value"]))
eval(lang, envir = env)
}
EDIT : revisiting this answer because it got a vote up
I'm not sure why I used <<- instead of <-. If using the 'env' argument, <<-with assign to the parent.frame of that env.
So if you always want it to be the first parent.frame it can just be :
assignToListInParentFrame<-function(name, value){
cl <- as.list(match.call())
paste(cl["name"], "<<-", cl["value"]) |>
str2lang() |>
eval()
}
and if you want to precise in which env to modify the list :
assignToListInEnv<-function(name, value, env){
cl <- as.list(match.call())
paste(cl["name"], "<-", cl["value"]) |>
str2lang() |>
eval(envir = env)
}