Compose functions with function operators does not work as expected - r

In the following example I created the add_timing function operator. The input is a function (say mean) and it returns a function that does the same as mean, but reports on how long it took for the function to complete. See the following example:
library(pryr)
add_timing = function(input_function, specific_info) {
if (missing(specific_info)) specific_info = function(l) 'That'
function(...) {
relevant_value = specific_info(list(...))
start_time = Sys.time()
res = input_function(...)
cat(sprintf('%s took', relevant_value), difftime(Sys.time(), start_time, units = 'secs'), 'sec', '\n')
res
}
}
timed_mean = add_timing(mean)
# > timed_mean(runif(10000000))
# That took 0.4284899 sec
# [1] 0.4999762
Next I tried to use pryr::compose to create the same timed_mean function (I like the syntax):
timed_mean_composed = pryr::compose(add_timing, mean)
But this does get me the required output:
# > timed_mean_composed(runif(100))
# function(...) {
# relevant_value = specific_info(list(...))
# start_time = Sys.time()
# res = input_function(...)
# cat(sprintf('%s took', relevant_value), difftime(Sys.time(), start_time, units = 'secs'), 'sec', '\n')
# res
# }
It seems that the compose operation does not lead to the add_timing function actually being executed. Only after calling the function, the new timed_mean_compose actually shows the correct function output.
Based on the following example from Advanced R by #HadleyWickham I expected this to work as I used it (see below for an excerpt):
dot_every <- function(n, f) {
i <- 1
function(...) {
if (i %% n == 0) cat(".")
i <<- i + 1
f(...)
}
}
download <- pryr::compose(
partial(dot_every, 10),
memoise,
partial(delay_by, 1),
download_file
)
Where the dot_every function operator is used in the same way I use add_timing above.
What am I missing?

The difference is that in your first attempt, you are calling
(add_timing(mean))(runif(1e7)
and with the compose syntax you are calling something more similar to
add_timing(mean(runif(1e7))
These are not exactly equivalent. Actually, the pryr compose function is really expanding the syntax to something more like
x <- runif(1e7)
x <- mean(x)
x <- add_timing(x)
Maybe looking at this will help
a <- function(x) {print(paste("a:", x));x}
b <- function(x) {print(paste("b:", x));x}
x <- pryr::compose(a,b)(print("c"))
# [1] "c"
# [1] "b: c"
# [1] "a: c"
Notice how a isn't called until after b. This means that a would have no way to time b. compose would not be an appropriate way to create a timer wrapper.

The issue is that pryr::compose is aimed at doing something completely different from what you're trying to do in your initial example. You want to create a function factory (called add_timing), which will take a function as input and return a new function as output that does the same thing as the input function but with an additional time printing. I would write that as follows:
add_timing <- function(FUN) { function(...) { print(system.time(r <- FUN(...))); r }}
mean(1:5)
# [1] 3
add_timing(mean)(1:5)
# user system elapsed
# 0 0 0
# [1] 3
The compose function, by contrast, returns a function that represents a series of functions to be evaluated in sequence. The examples in ? compose are helpful here. Here's an example that builds on that:
add1 <- function(x) x + 1
times2 <- function(x) x * 2
# the following two are identical:
add1(1)
# [1] 2
compose(add1)(1)
# [1] 2
# the following two are identical:
times2(1)
# [1] 2
compose(times2)(1)
# [1] 2
compose becomes useful for nesting, when the order of nesting is important:
add1(times2(2))
# [1] 5
compose(add1, times2)(2)
# [1] 5
times2(add1(2))
# [1] 6
compose(times2, add1)(2)
# [1] 6
This means that the reason your example does not work is because your functions are not actually nested in the way that compose is intended to work. In your example, you're asking system.time to, for example, calculate the time to evaluate 3 (the output of mean) rather than the time to evaluate mean(1:5).

Related

Create an R function programmatically with non-fixed body

In a for loop I make a "string-formula" and allocate it to e.g. body1. And when I try to make a function with that body1 it fails... And I have no clue what I should try else...
This question How to create an R function programmatically? helped me a lot but sadly only quote is used to set the body...
I hope you have an idea how to work around with this issue.
And now my code:
A.m=matrix(c(3,4,2,2,1,1,1,3,2),ncol=3,byrow=TRUE)
for(i in 1:dim(A.m)[1]) {
body=character()
# here the string-formula emerges
for(l in 1:dim(A.m)[2]) {
body=paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
# only the last plus-sign is cutted off
assign(paste0("body",i),substr(body,1,nchar(body)-1))
}
args=alist(x = )
# just for your convenience the console output
body1
## [1] "A.m[1,1]*x[1]+A.m[1,2]*x[2]+A.m[1,3]*x[3]"
# in this code-line I don't know how to pass body1 in feasible way
assign("Function_1", as.function(c(args, ???body1???), env = parent.frame())
And this is my aim:
Function_1(x=c(1,1,1))
## 9 # 3*1 + 4*1 + 2*1
Since you have a string, you need to parse that string. You can do
assign("Function_1",
as.function(c(args, parse(text=body1)[[1]])),
env = parent.frame())
Though I would strongly discourage the use of assign for filling your global environment with a bunch of variables with indexes in their name. In general that makes things much tougher to program with. It would be much easier to collect all your functions in a list. For example
funs <- lapply(1:dim(A.m)[1], function(i) {
body <- ""
for(l in 1:dim(A.m)[2]) {
body <- paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
body <- substr(body,1,nchar(body)-1)
body <- parse(text=body)[[1]]
as.function(c(alist(x=), body), env=parent.frame())
})
And then you can call the different functions by extracting them with [[]]
funs[[1]](x=c(1,1,1))
# [1] 9
funs[[2]](x=c(1,1,1))
# [1] 4
Or you can ever call all the functions with an lapply
lapply(funs, function(f, ...) f(...), x=c(1,1,1))
# [[1]]
# [1] 9
# [[2]]
# [1] 4
# [[3]]
# [1] 6
Although if this is actually what your function is doing, there are easier ways to do this in R using matrix multiplication %*%. Your Function_1 is the same as A.m[1,] %*% c(1,1,1). You could make a generator funciton like
colmult <- function(mat, row) {
function(x) {
as.numeric(mat[row,] %*% x)
}
}
And then create the same list of functions with
funs <- lapply(1:3, function(i) colmult(A.m, i))
Then you don't need any string building or parsing which tends to be error prone.

How could I define an application method on an S3 object in R? (like a "function object" in c++)

The problem I am trying to tackle here is needing to apply (execute) an S3 object which is essentially a vector-like structure. This may contain various formulas which at some stage I need to evaluate for a single argument, in order to get back a vector-like object of the original shape, containing the evaluation of its constituent formulas at the given argument.
Examples of this (just to illustrate) might be a matrix of transformation - say rotation - which would take the angle to rotate by, and produce a matrix of values by which to multiply a point, for the given rotation. Another example might be the vector of states in a problem in classical mechanics. Then given t, v, a, etc, it could return s...
Now, I have created my container object in S3, and its working fine in most respects, using generic methods; I also found the Ops.myClass system of operator overloading very useful.
To complete my class, all I need now is a way to specify it as executable.
I see that there are various mechanisms that will do what I want in part, for instance I suppose that as.function() will convert the object to behave as I want, and something like lapply() could be used for the "reverse" application of the argument to the functions. What I am not sure how to do is link it all up so that I can do something like this mock-up:
new_Object <- function(<all my function vector stuff spec>)
vtest <- new_Object(<say, sin, cos, tan>)
vtest(1)
==>
myvec(.8414709848078965 .5403023058681398 1.557407724654902)
(Yes, I have already specified a generic print() routine that will make it appear nice)
All suggestions, sample code, links to examples are welcome.
PS =====
I have added some basic example code as per request.
I am not sure how much would be too much, so the full working minimal example, including operator overloading is in this gist here.
I am only showing the constructor and helper functions below:
# constructor
new_Struct <- function(stype , vec){
stopifnot(is.character(stype)) # enforce up | down
stopifnot(is.vector(vec))
structure(vec,class="Struct", type=stype)
}
# constructor helper functions --- need to allow for nesting!
up <-function(...){
vec <- unlist(list(...),use.names = FALSE)
new_Struct("up",vec)
}
down <-function(...){
vec <- unlist(list(...),use.names = FALSE)
new_Struct("down",vec)
}
The above code behaves thus:
> u1 <- up(1,2,3)
> u2 <- up(3,4,5)
> d1 <- down(u1)
> d1
[1] down(1, 2, 3)
> u1+u2
[1] up(4, 6, 8)
> u1+d1
Error: '+' not defined for opposite tuple types
> u1*d1
[1] 14
> u1*u2
[,1] [,2] [,3]
[1,] 3 4 5
[2,] 6 8 10
[3,] 9 12 15
> u1^2
[1] 14
> s1 <- up(sin,cos,tan)
> s1
[1] up(.Primitive("sin"), .Primitive("cos"), .Primitive("tan"))
> s1(1)
Error in s1(1) : could not find function "s1"
What I need, is for it to be able to do this:
> s1(1)
[1] up(.8414709848078965 .5403023058681398 1.557407724654902)
You can not call each function in a list of functions without a loop.
I'm not fully understanding all requirements, but this should give you a start:
new_Struct <- function(stype , vec){
stopifnot(is.character(stype)) # enforce up | down
stopifnot(is.vector(vec) || is.function(vec))
structure(vec,class="Struct", type=stype)
}
# constructor helper functions --- need to allow for nesting!
up <- function(...) UseMethod("up")
up.default <- function(...){
vals <- list(...)
stopifnot(all(vapply(vals, is.vector, FUN.VALUE = logical(1))))
vec <- unlist(vals, use.names = FALSE)
new_Struct("up",vec)
}
up.function <- function(...){
funs <- list(...)
stopifnot(all(vapply(funs, is.function, FUN.VALUE = logical(1))))
new_Struct("up", function(x) new_Struct("up", sapply(funs, do.call, list(x))))
}
up(1, 2, 3)
#[1] 1 2 3
#attr(,"class")
#[1] "Struct"
#attr(,"type")
#[1] "up"
up(1, 2, sin)
#Error in up.default(1, 2, sin) :
# all(vapply(vals, is.vector, FUN.VALUE = logical(1))) is not TRUE
up(sin, 1, 2)
#Error in up.function(sin, 1, 2) :
# all(vapply(funs, is.function, FUN.VALUE = logical(1))) is not TRUE
s1 <- up(sin, cos, tan)
s1(1)
#[1] 0.8414710 0.5403023 1.5574077
#attr(,"class")
#[1] "Struct"
#attr(,"type")
#[1] "up"
After some thought I have come up with a way to approach this, it's not perfect, it would be great if someone could figure out a way to make the function call implicit/transparent.
So, for now I just use the call() mechanism on the object, and that seems to work fine. Here's the pertinent part of the code, minus checks. I'll put up the latest full version on the same gist as above.
# constructor
new_Struct <- function(stype , vec){
stopifnot(is.character(stype)) # enforce up | down
stopifnot(is.vector(vec))
structure(vec,class="Struct", type=stype)
}
# constructor helper functions --- need to allow for nesting!
up <- function(...){
vec <- unlist(list(...), use.names = FALSE)
new_Struct("up",vec)
}
down <- function(...){
vec <- unlist(list(...), use.names = FALSE)
new_Struct("down",vec)
}
# generic print for tuples
print.Struct <- function(s){
outstr <- sprintf("%s(%s)", attributes(s)$type, paste(c(s), collapse=", "))
print(noquote(outstr))
}
# apply the structure - would be nice if this could be done *implicitly*
call <- function(...) UseMethod("call")
call.Struct <- function(s,x){
new_Struct(attributes(s)$type, sapply(s, do.call, list(x)))
}
Now I can do:
> s1 <- up(sin,cos,tan)
> length(s1)
[1] 3
> call(s1,1)
[1] up(0.841470984807897, 0.54030230586814, 1.5574077246549)
>
Not as nice as my ultimate target of
> s1(1)
[1] up(0.841470984807897, 0.54030230586814, 1.5574077246549)
but it will do for now...

Objects between functions within larger function in R (reproducible code)

This has to simple but it has me confused for the life of me.
I'm trying to get objects (numeric variables, data frames, etc.) created in a larger function into a smaller function within that larger function.
The output of the example functions below should be a data.frame with 20 rows, but I only get "test". Would someone please kindly explain this object function behavior in R and how to get created objects between the functions? Thanks in advance!
testfunload <- function(test){
###All my different tries to get test dataframe into testfunload function
txt <- get("test")
try1 <- get(test)
try2 <- get(paste0(txt))
try3 <- get(paste0(test))
try4 <- get(test)
try5 <- test
try6 <- get("test")
print(try1)
print(try2)
print(try3)
print(try4)
print(try5)
print(try6)
}
###Testing with test data.frame in local environment
testfunload2 <- function(){
test <- data.frame(i1=rep("i",20), i2=rep("I",20))
testfunload(test="test")
}
###Testing with test data.frame in local environment
testfunload2global <- function(){
test <<- data.frame(i1=rep("i",20), i2=rep("I",20))
testfunload(test="test")
}
###PROBLEM:Doesn't have data.frame only "test" as char
testfunload2()
[1] "test"
[1] "test"
[1] "test"
[1] "test"
[1] "test"
[1] "test"
###PROBLEM:Doesn't have data.frame only "test" as char
testfunload2global()
[1] "test"
[1] "test"
[1] "test"
[1] "test"
[1] "test"
[1] "test"
Desired output:
i1 i2
1 i I
2 i I
3 i I
4 i I
5 i I
6 i I
###on for 20 rows etc.
To me, it looks a bit like a complicated way to pass variables to a function. Maybe you can get what you want using the ... argument?
If you want to access the calling scope, though, you can look at parent.frame(). It gives you the environment in the scope where your function is called.
Consider this function:
f <- function() {
e <- parent.frame()
with(e, x + y)
}
It evaluates x + y in the calling scope -- you are getting dynamic rather than lexical scoping, in a way.
So with these functions, you are passing x and y to f without doing so explicitly.
g <- function(x, y) f()
h <- function(x) {
y <- 2
f()
}
They will work like this:
> g(1, 1)
[1] 2
> h(3)
[1] 5
It would be much simpler just to make x and y parameters of f, of course.

Creating functions in a for loop with lists

I am scratching my head at the following problem:
I am creating two functions inside a for loop with parameters that depend on some dataframe. Each function is then put inside a list.
Printing the parameters inside the for loop shows that eachh function is well defined. Yet, when I use those outside of the loop, only the last parameters are used for both functions. The following example should make that clearer.
dt <- data.frame(color = c("red", "blue"),
a = c(3,9),
b = c(1.3, 1.8))
function_list <- list()
for (col in dt$color) {
a <- dt$a[dt$color == col]
b <- dt$b[dt$color == col]
foo <- function(x) {
a*x^b
}
print(paste(col, foo(1)))
function_list[[col]] <- foo
}
[1] "red 3"
[1] "blue 9"
function_list[["red"]](1)
[1] 9
function_list[["blue"]](1)
[1] 9
To note, this is inspired from the following question: R nested for loop to write multiple functions and plot them
The equivalent solution with assign and get works (my answer to the previous question).
The relevant values of a and b are those when you call the function and not when you define it. The way you create the list, they are taken from the global environment. The solution is to create closures. I'd use Map for this, but you can do the same with a for loop:
funs <- Map(function(a, b) function(x) a*x^b, a = dt$a, b = dt$b)
print(funs)
#[[1]]
#function (x)
#a * x^b
#<environment: 0x000000000a9a4298>
#
#[[2]]
#function (x)
#a * x^b
#<environment: 0x000000000a9a3728>
Notice the different environments.
environment(funs[[1]])$a
#[1] 3
environment(funs[[2]])$a
#[1] 9
funs[[1]](1)
#[1] 3
funs[[2]](1)
#[1] 9
Your confusion will be solved by going a bit deeper with Environments
Let's check why your code doesn't work. When I try to print(function_list), you can see that both of the functions stored will return a*x^b.
# Part 1 : Why it doesn't work
# --------------------------
print(function_list)
# $red
# function (x)
# {
# a * x^b
# }
#
# $blue
# function (x)
# {
# a * x^b
# }
If you try to remove a and re-run the function, an error will be returned .
rm(a)
function_list[['red']](1)
# Error in function_list[["red"]](1) : object 'a' not found
.
And now to how to make your code work:
There is more than one way to make it work, most of which will require either playing around with your environments or changing the data structure.
One way to manage your environments - in such way that it will keep your values and not search for the variable in the global environment - is returning a function from the function.
# Part 2 : How to make it work
# ----------------------------
function_list <- list()
for (col in dt$color) {
a <- dt$a[dt$color == col]
b <- dt$b[dt$color == col]
foo1 <- function(inner.a, inner.b) {
return(function(x) {inner.a*x^inner.b})
}
foo2 <- foo1(a,b)
print(paste(col, foo2(1)))
function_list[[col]] <- foo2
}
Now, if we check what's in the function_list, you can see that the functions are in two environments
print(function_list)
# $red
# function (x)
# {
# inner.a * x^inner.b
# }
# <environment: 0x186fb40>
#
# $blue
# function (x)
# {
# inner.a * x^inner.b
# }
# <environment: 0x2536438>
The output is also as expected. And even when we remove a, it will still work as expected.
function_list[['red']](1) # 3
function_list[['blue']](1) # 9
rm(a)
function_list[['red']](1) #[1] 3
I think that the for loop does not create new environments (you can check this by print(environment) within the loop), so the values of a and b are taken by foo in the global environment where they are 9 and 1.8, i.e. their last assigned values.

Proper way to have two functions access a single function's environment?

Based on the answer provided in1088639, I set up a pair of functions which both access the same sub-function's environment. This example works, but I wanted to see if I'd missed some cleaner way to "connect" both top-level functions to the internal environment.
( Back story: I wanted to write a pair of complementary functions which shared a variable, e.g. "count" in this example, and meet CRAN package requirements which do not allow functions to modify the global environment. )
static.f <- function() {
count <- 0
f <- function(x) {
count <<- count + 1
return( list(mean=mean(x), count=count) )
}
return( f )
}
# make sure not to delete this command, even tho' it's not
# creating a function.
f1 <- static.f()
statfoo <- function(x){
tmp<-f1(x)
tmp<- list(tmp,plus=2)
return(tmp)
}
statbar <- function(x){
tmp<-f1(x)
tmp<- list(tmp,minus=3)
return(tmp)
}
Sample output:
> statfoo(5)
[[1]]
[[1]]$mean
[1] 5
[[1]]$count
[1] 1
$plus
[1] 2
Rgames> statfoo(5)
[[1]]
[[1]]$mean
[1] 5
[[1]]$count
[1] 2
$plus
[1] 2
> statbar(4)
[[1]]
[[1]]$mean
[1] 4
[[1]]$count
[1] 3
$minus
[1] 3
> statfoo(5)
[[1]]
[[1]]$mean
[1] 5
[[1]]$count
[1] 4
$plus
[1] 2
A cleaner method would be to use an object oriented approach. There is already an answer using reference classes.
A typical object oriented approach with classes would create a class and then create a singleton object, i.e. a single object of that class. Of course it is a bit wasteful to create a class only to create one object from it so here we provide a proto example. (Creating a function to enclose count and the function doing the real work has a similar problem -- you create an enclosing function only to run it once.) The proto model allows one to create an object directly bypassing the need to create a class only to use it once. Here foobar is the proto object with property count and methods stats, statfoo and statbar. Note that we factored out stats to avoid duplicating its code in each of statfoo and statbar. (continued further down)
library(proto)
foobar <- proto(count = 0,
stats = function(., x) {
.$count <- .$count + 1
list(mean = mean(x), count = .$count)
},
statfoo = function(., x) c(.$stats(x), plus = 2),
statbar = function(., x) c(.$stats(x), plus = 3)
)
foobar$statfoo(1:3)
foobar$statbar(2:4)
giving:
> foobar$statfoo(1:3)
$mean
[1] 2
$count
[1] 1
$plus
[1] 2
> foobar$statbar(2:4)
$mean
[1] 3
$count
[1] 2
$plus
[1] 3
A second design would be to have statfoo and statbar as independent functions and only keep count and stats in foobar (continued further down)
library(proto)
foobar <- proto(count = 0,
stats = function(., x) {
.$count <- .$count + 1
list(mean = mean(x), count = .$count)
}
)
statfoo <- function(x) c(foobar$stats(x), plus = 2)
statbar <- function(x) c(foobar$stats(x), plus = 3)
statfoo(1:3)
statbar(2:4)
giving similar output to the prior example.
Third approach Of course the second variation could easily be implemented by using local and a function getting us close to where you started. This does not use any packages but does not create a function only to throw it away:
foobar <- local({
count <- 0
function(x) {
count <<- count + 1
list(mean = mean(x), count = count)
}
})
statfoo <- function(x) c(foobar(x), plus = 2)
statbar <- function(x) c(foobar(x), plus = 3)
statfoo(1:3)
statbar(2:4)
Another simple option is tocreate an environment and assign it to both functions. Here I use simpler functions for illustrative purposes, but this can be easily extended:
f1 <- function() {count <<- count + 1; return(paste("hello", count))}
f2 <- function() {count <<- count + 1; return(paste("goodbye", count))}
environment(f1) <- environment(f2) <- list2env(list(count=0))
Then:
> f1()
[1] "hello 1"
> f2()
[1] "goodbye 2"
> f1()
[1] "hello 3"
Both functions have the same environment.
You can use reference class like this:
foobar <- setRefClass(
'foobar',
fields = list(count='numeric'),
methods = list(
initialize=function() {
.self$initFields(count = 0L)
},
statfoo = function(x) {
count <<- count + 1L
list(list(mean=mean(x), count=count), plus=2)
},
statbar = function(x){
count <<- count + 1L
list(list(mean=mean(x), count=count), minus=3)
}
)
)()
foobar$statfoo(5)
foobar$statbar(3)
It makes it relatively clear that neither statfoo nor statbar is a pure function.
You could get rid of the factory functions, and more explicitly use environments. A solution like this would work as well
.env<-(function() {
count <- 0
f <- function(x) {
count <<- count + 1
return( list(mean=mean(x), count=count))
}
return(environment())
})()
statfoo <- function(x){
list(.env$f(x),plus=2)
}
statbar <- function(x){
list(.env$f(x),minus=3)
}
The .env variable is created by immediately executing an anonymous function to get its environment. We then extract the function from the environment itself to modify its values.

Resources