Creating functions in a for loop with lists - r

I am scratching my head at the following problem:
I am creating two functions inside a for loop with parameters that depend on some dataframe. Each function is then put inside a list.
Printing the parameters inside the for loop shows that eachh function is well defined. Yet, when I use those outside of the loop, only the last parameters are used for both functions. The following example should make that clearer.
dt <- data.frame(color = c("red", "blue"),
a = c(3,9),
b = c(1.3, 1.8))
function_list <- list()
for (col in dt$color) {
a <- dt$a[dt$color == col]
b <- dt$b[dt$color == col]
foo <- function(x) {
a*x^b
}
print(paste(col, foo(1)))
function_list[[col]] <- foo
}
[1] "red 3"
[1] "blue 9"
function_list[["red"]](1)
[1] 9
function_list[["blue"]](1)
[1] 9
To note, this is inspired from the following question: R nested for loop to write multiple functions and plot them
The equivalent solution with assign and get works (my answer to the previous question).

The relevant values of a and b are those when you call the function and not when you define it. The way you create the list, they are taken from the global environment. The solution is to create closures. I'd use Map for this, but you can do the same with a for loop:
funs <- Map(function(a, b) function(x) a*x^b, a = dt$a, b = dt$b)
print(funs)
#[[1]]
#function (x)
#a * x^b
#<environment: 0x000000000a9a4298>
#
#[[2]]
#function (x)
#a * x^b
#<environment: 0x000000000a9a3728>
Notice the different environments.
environment(funs[[1]])$a
#[1] 3
environment(funs[[2]])$a
#[1] 9
funs[[1]](1)
#[1] 3
funs[[2]](1)
#[1] 9

Your confusion will be solved by going a bit deeper with Environments
Let's check why your code doesn't work. When I try to print(function_list), you can see that both of the functions stored will return a*x^b.
# Part 1 : Why it doesn't work
# --------------------------
print(function_list)
# $red
# function (x)
# {
# a * x^b
# }
#
# $blue
# function (x)
# {
# a * x^b
# }
If you try to remove a and re-run the function, an error will be returned .
rm(a)
function_list[['red']](1)
# Error in function_list[["red"]](1) : object 'a' not found
.
And now to how to make your code work:
There is more than one way to make it work, most of which will require either playing around with your environments or changing the data structure.
One way to manage your environments - in such way that it will keep your values and not search for the variable in the global environment - is returning a function from the function.
# Part 2 : How to make it work
# ----------------------------
function_list <- list()
for (col in dt$color) {
a <- dt$a[dt$color == col]
b <- dt$b[dt$color == col]
foo1 <- function(inner.a, inner.b) {
return(function(x) {inner.a*x^inner.b})
}
foo2 <- foo1(a,b)
print(paste(col, foo2(1)))
function_list[[col]] <- foo2
}
Now, if we check what's in the function_list, you can see that the functions are in two environments
print(function_list)
# $red
# function (x)
# {
# inner.a * x^inner.b
# }
# <environment: 0x186fb40>
#
# $blue
# function (x)
# {
# inner.a * x^inner.b
# }
# <environment: 0x2536438>
The output is also as expected. And even when we remove a, it will still work as expected.
function_list[['red']](1) # 3
function_list[['blue']](1) # 9
rm(a)
function_list[['red']](1) #[1] 3

I think that the for loop does not create new environments (you can check this by print(environment) within the loop), so the values of a and b are taken by foo in the global environment where they are 9 and 1.8, i.e. their last assigned values.

Related

Create an R function programmatically with non-fixed body

In a for loop I make a "string-formula" and allocate it to e.g. body1. And when I try to make a function with that body1 it fails... And I have no clue what I should try else...
This question How to create an R function programmatically? helped me a lot but sadly only quote is used to set the body...
I hope you have an idea how to work around with this issue.
And now my code:
A.m=matrix(c(3,4,2,2,1,1,1,3,2),ncol=3,byrow=TRUE)
for(i in 1:dim(A.m)[1]) {
body=character()
# here the string-formula emerges
for(l in 1:dim(A.m)[2]) {
body=paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
# only the last plus-sign is cutted off
assign(paste0("body",i),substr(body,1,nchar(body)-1))
}
args=alist(x = )
# just for your convenience the console output
body1
## [1] "A.m[1,1]*x[1]+A.m[1,2]*x[2]+A.m[1,3]*x[3]"
# in this code-line I don't know how to pass body1 in feasible way
assign("Function_1", as.function(c(args, ???body1???), env = parent.frame())
And this is my aim:
Function_1(x=c(1,1,1))
## 9 # 3*1 + 4*1 + 2*1
Since you have a string, you need to parse that string. You can do
assign("Function_1",
as.function(c(args, parse(text=body1)[[1]])),
env = parent.frame())
Though I would strongly discourage the use of assign for filling your global environment with a bunch of variables with indexes in their name. In general that makes things much tougher to program with. It would be much easier to collect all your functions in a list. For example
funs <- lapply(1:dim(A.m)[1], function(i) {
body <- ""
for(l in 1:dim(A.m)[2]) {
body <- paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
body <- substr(body,1,nchar(body)-1)
body <- parse(text=body)[[1]]
as.function(c(alist(x=), body), env=parent.frame())
})
And then you can call the different functions by extracting them with [[]]
funs[[1]](x=c(1,1,1))
# [1] 9
funs[[2]](x=c(1,1,1))
# [1] 4
Or you can ever call all the functions with an lapply
lapply(funs, function(f, ...) f(...), x=c(1,1,1))
# [[1]]
# [1] 9
# [[2]]
# [1] 4
# [[3]]
# [1] 6
Although if this is actually what your function is doing, there are easier ways to do this in R using matrix multiplication %*%. Your Function_1 is the same as A.m[1,] %*% c(1,1,1). You could make a generator funciton like
colmult <- function(mat, row) {
function(x) {
as.numeric(mat[row,] %*% x)
}
}
And then create the same list of functions with
funs <- lapply(1:3, function(i) colmult(A.m, i))
Then you don't need any string building or parsing which tends to be error prone.

Can I access the assignment of a function from inside that function? [duplicate]

For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change a bit the syntax and define the function fetch_name for this purpose, which is able to get the name of the rhs from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b") will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect but would work in non interactive code, as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed, the <- is outside of the scope of your function and could only be located looking at the stack of function calls, but this fail. So a possible solution could be to redefine the '<-' else than as a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind following code can fit your needs, but you can define a "verbose assignation" :
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situation where the '<-' is not really an assignation :
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
>
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"

Elegant way to define a function inside another function

I want to construct
f <- function(...) {
g <- function(x) x ^ 2
list(...)
}
so that I can invoke using f(g(4)) and have list(...) result in list(16).
In general I will define several temporary functions inside f that the user can invoke when calling f(...).
I have experimented with assign and newenvironment but have just gotten more confused. Help with an elegant solution is appreciated.
The reason for wanting this is that I want a function in the Hmisc package, drawPlot to be able to let the users specify generically named functions as input for building up a series of graphical elements, and I don't want to reserve these generic-type names. E.g.:
d <- drawPlot(Curve(), Points()) # interactively make a curve and
# a set of points
I'm guessing you in fact need something more elaborate than this, but the following does what you've asked for in your supplied example:
f <- function(...) {
g <- function(x) x ^ 2
list(eval(substitute(...)))
}
f(g(4))
# [[1]]
# [1] 16
Or, if users may supply one or more function calls, something like this:
f <- function(...) {
g <- function(x) x ^ 2
h <- function(x) 100*x
cc <- as.list(substitute(list(...))[-1])
res <- list()
for(i in seq_along(cc)) {
res[[i]] <- eval(cc[[i]])
}
res
}
f(g(4), h(5))
# [[1]]
# [1] 16
#
# [[2]]
# [1] 500
Very similar to this answer but I think maybe more extensible and closer to your original idea:
match.fun_wrapper <- function(...) {
# `match.fun` searches in the parent environment of the environment that
# calls `match.fun`, so this wrapper is a hack to be able to search in
# the current environment rather than the parent of the current environemnt
match.fun(...)
}
f <- function(fun, ...) {
g <- function(x) x ^ 2
fun <- match.fun_wrapper(substitute(fun))
fun(...)
}
If you wanted to do away with match.fun, you could also do away with the wrapper hack:
f <- function(fun, ...) {
g <- function(x) x ^ 2
fun(...)
}
It looks to me like what you're trying to do is something like this:
f <- function(fun, ...) {
g <- function(x) x ^ 2
h <- function(x) x ^ 3
i <- function(x) x ^ 4
switch(fun,
'g' = g(...),
'h' = h(...),
'i' = i(...))
}
> f('g', 3)
[1] 9
> f('h', 3)
[1] 27
> f('i', 3)
[1] 81
It's not obvious why you would want to, unless you're just trying to encapsulate functions with similar names inside different namespaces and using this as a hacky workaround for the fact R doesn't offer fully-featured classes. If that's the case, you can also just use actual namespaces, i.e. put your functions inside a package so they're called by package::g(arg) instead of f('g', arg).

Compose functions with function operators does not work as expected

In the following example I created the add_timing function operator. The input is a function (say mean) and it returns a function that does the same as mean, but reports on how long it took for the function to complete. See the following example:
library(pryr)
add_timing = function(input_function, specific_info) {
if (missing(specific_info)) specific_info = function(l) 'That'
function(...) {
relevant_value = specific_info(list(...))
start_time = Sys.time()
res = input_function(...)
cat(sprintf('%s took', relevant_value), difftime(Sys.time(), start_time, units = 'secs'), 'sec', '\n')
res
}
}
timed_mean = add_timing(mean)
# > timed_mean(runif(10000000))
# That took 0.4284899 sec
# [1] 0.4999762
Next I tried to use pryr::compose to create the same timed_mean function (I like the syntax):
timed_mean_composed = pryr::compose(add_timing, mean)
But this does get me the required output:
# > timed_mean_composed(runif(100))
# function(...) {
# relevant_value = specific_info(list(...))
# start_time = Sys.time()
# res = input_function(...)
# cat(sprintf('%s took', relevant_value), difftime(Sys.time(), start_time, units = 'secs'), 'sec', '\n')
# res
# }
It seems that the compose operation does not lead to the add_timing function actually being executed. Only after calling the function, the new timed_mean_compose actually shows the correct function output.
Based on the following example from Advanced R by #HadleyWickham I expected this to work as I used it (see below for an excerpt):
dot_every <- function(n, f) {
i <- 1
function(...) {
if (i %% n == 0) cat(".")
i <<- i + 1
f(...)
}
}
download <- pryr::compose(
partial(dot_every, 10),
memoise,
partial(delay_by, 1),
download_file
)
Where the dot_every function operator is used in the same way I use add_timing above.
What am I missing?
The difference is that in your first attempt, you are calling
(add_timing(mean))(runif(1e7)
and with the compose syntax you are calling something more similar to
add_timing(mean(runif(1e7))
These are not exactly equivalent. Actually, the pryr compose function is really expanding the syntax to something more like
x <- runif(1e7)
x <- mean(x)
x <- add_timing(x)
Maybe looking at this will help
a <- function(x) {print(paste("a:", x));x}
b <- function(x) {print(paste("b:", x));x}
x <- pryr::compose(a,b)(print("c"))
# [1] "c"
# [1] "b: c"
# [1] "a: c"
Notice how a isn't called until after b. This means that a would have no way to time b. compose would not be an appropriate way to create a timer wrapper.
The issue is that pryr::compose is aimed at doing something completely different from what you're trying to do in your initial example. You want to create a function factory (called add_timing), which will take a function as input and return a new function as output that does the same thing as the input function but with an additional time printing. I would write that as follows:
add_timing <- function(FUN) { function(...) { print(system.time(r <- FUN(...))); r }}
mean(1:5)
# [1] 3
add_timing(mean)(1:5)
# user system elapsed
# 0 0 0
# [1] 3
The compose function, by contrast, returns a function that represents a series of functions to be evaluated in sequence. The examples in ? compose are helpful here. Here's an example that builds on that:
add1 <- function(x) x + 1
times2 <- function(x) x * 2
# the following two are identical:
add1(1)
# [1] 2
compose(add1)(1)
# [1] 2
# the following two are identical:
times2(1)
# [1] 2
compose(times2)(1)
# [1] 2
compose becomes useful for nesting, when the order of nesting is important:
add1(times2(2))
# [1] 5
compose(add1, times2)(2)
# [1] 5
times2(add1(2))
# [1] 6
compose(times2, add1)(2)
# [1] 6
This means that the reason your example does not work is because your functions are not actually nested in the way that compose is intended to work. In your example, you're asking system.time to, for example, calculate the time to evaluate 3 (the output of mean) rather than the time to evaluate mean(1:5).

Recursively editing a list in R

In my program, I am recursively going over a nested list and adding elements to an overall list that I will return. There are a few details to be taken care of, so I can't just use unlist.
formulaPart is taken to be a formula object.
My code is:
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart))
if(typeof(currentVar == 'language'
parseVariables(currentVar, myList)
else
if(! toString(currentVar) %in% c(\\various characters)
list <- c(list, currentVar)
}
I have checked that the function correctly adds elements to the list when it should. The problem is that the list loses elements due to recursion. The elements added during one inner recursive call are not saved for another recursive call.
If this was in C++, I could just use a pointer; the same for Java. However, I do not understand how to handle this error in R.
R does something like pass-by-value, so you can't modify (most) existing objects just by passing them into a function. If you want to add on to something recursively, one trick would be to use an environment instead, which get passed by reference. This can easily be coerced to list when you're done.
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart)) {
if(typeof(currentVar) == 'language') {
parseVariables(currentVar, myList)
}
else {
if(! toString(currentVar) %in% c(':', '+', '~'))
assign(toString(currentVar), currentVar, myList)
}
}
}
f1 <- z ~ a:b + x
f2 <- z ~ x + y
myList <- new.env()
parseVariables(f1, myList)
parseVariables(f2, mylist)
ls(myList)
# [1] "a" "b" "x" "z"
as.list(myList)
# $x
# x
#
# $z
# z
#
# $a
# a
#
# $b
# b

Resources