How can I decorate a function in R?

I'm trying to instrument some functions in R by replacing them with my own versions, which record some information and then call the original versions. My trouble is: how do I construct the call so that it exactly replicates the original arguments?
First, we'll set up an environment
env <- new.env()
One way I could do this is by getting the call and modifying it. However, I don't know how to do this with a primitive function.
## Option 1: get the call and modify it
env$`[` <- function(x, i, j, ...) {
my.call <- match.call()
## This isn't allowed.
my.call[[1]] <- as.name(.Primitive("["))
record.some.things.about(i, j)
eval(my.call, envir = parent.frame())
}
Alternatively, I could just get the arguments and use do.call. However, I haven't worked out how to extract the arguments correctly.
## Option 2: do a call with the arguments
env$`[` <- function(x, i, j, ...) {
## This outputs a list, so if 'i' is missing, then 'j' ends up in the 'i' position.
## If I could get an alist here instead, I could keep 'j' in the right position.
my.args <- match.call()[-1]
record.some.things.about(i, j)
do.call(
.Primitive("["),
my.args,
envir = parent.frame()
)
}
Then, if we've done things right, we can eval some expression which uses [ inside the environment we've constructed, and our instrumented function will be called instead:
## Should return c(4, 5, 6) (a single selected column is dropped to a vector by default)
evalq(
data.frame(a = c(1, 2, 3), b = c(4, 5, 6))[, "b"],
envir = env
)
Can anyone help/advise?

Use trace:
trace(`[.data.frame`, quote({print("hello!")}), at = 1)
iris[1,]
#Tracing `[.data.frame`(iris, 1, ) step 1
#[1] "hello!"
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosa

Can't you just capture all of the original function's arguments with the triple-dot argument (...) and pass them on to the original function?
sum <- function(..., na.rm = TRUE) {
print("my sum") # here is where you can "record some info"
base::sum(..., na.rm = na.rm) # then call original function w/ ...
}
base::sum(5,5, NA)
##[1] NA
# your function
sum(5,5, NA)
##[1] "my sum"
##[1] 10
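For the original `[` case, one possible sketch builds on the question's Option 1 but swaps match.call() for sys.call(), which keeps the call as it was written, including the empty slot in df[, "b"]. The `recorded` list and the missing() bookkeeping are illustrative stand-ins for the question's record.some.things.about(); they are assumptions, not part of either answer above.

```r
env <- new.env()
recorded <- list()  # hypothetical log, standing in for record.some.things.about()

env$`[` <- function(x, i, j, ...) {
  # Record which index arguments were supplied, without forcing them
  recorded[[length(recorded) + 1L]] <<- c(i = !missing(i), j = !missing(j))
  # sys.call() returns the call as written, so empty arguments keep their position
  cl <- sys.call()
  # A function object may sit in the function position of a call
  cl[[1L]] <- .Primitive("[")
  eval(cl, envir = parent.frame())
}

result <- evalq(
  data.frame(a = c(1, 2, 3), b = c(4, 5, 6))[, "b"],
  envir = env
)
result
```

Because the wrapper never touches x, i, or j itself, the arguments are only evaluated once, by the primitive, in the caller's environment.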

Related

Pass distribution functions, named in vectors, to a function

This question is a follow up on two questions I had answered before:
Create the function
Calculate mean
I have a couple of variables (var1, var2 and var3), which have different distribution functions:
var1_distr1 <- pdqr::as_d(function(x)dnorm(x, mean = 3, sd = 1))
var1_distr2 <- pdqr::as_d(function(x)dnorm(x, mean = 6, sd = 1))
var1_distr3 <- pdqr::as_d(function(x)dnorm(x, mean = 2, sd = 2))
var2_distr1 <- pdqr::as_d(function(x)dnorm(x, mean = 5, sd = 3))
var2_distr2 <- pdqr::as_d(function(x)dnorm(x, mean = 3, sd = 1))
var2_distr3 <- pdqr::as_d(function(x)dnorm(x, mean = 4, sd = 2))
var3_distr1 <- pdqr::as_d(function(x)dnorm(x, mean = 4, sd = 1))
var3_distr2 <- pdqr::as_d(function(x)dnorm(x, mean = 5, sd = 1))
var3_distr3 <- pdqr::as_d(function(x)dnorm(x, mean = 7, sd = 2))
To create a proportional distribution function that matches the combination of two or three different variables with their appropriate probability functions, I have created the following function, which I learned in the first question:
foo <- function(...){
#set x values
x <- seq(1, 10, by = 1)
#create y values
y <- 1L
for (fun in list(...)) y <- y * fun(x)
#create new PDF
p <- data.frame(x,y)
pdqr::new_d(p, type = "continuous")
}
So, if I want to create a proportional distribution function var2_distr1__var3_distr3 of var2_distr1 and var3_distr3, I can just do this: var2_distr1__var3_distr3 <- foo(var2_distr1, var3_distr3), which works like a charm.
Now, for each variable and each case, I have selected the appropriate distribution using a simple if_else, which returns the name of the appropriate distribution in a data frame like this:
df <- data.frame(var1 = c("var1_distr1", "var1_distr3", "var1_distr1", "var1_distr2", "var1_distr2", "var1_distr1", "var1_distr3"),
var2 = c("var2_distr2", "var2_distr1", "var2_distr2", "var2_distr1", "var2_distr3", "var2_distr3", "var2_distr1"),
var3 = c("var3_distr2", "var3_distr3", "var3_distr1", "var3_distr1", "var3_distr2", "var3_distr3", "var3_distr1"))
If I want the mean of the relevant individual distributions per case for a single variable, I can use this
df$var2_mean <- sapply(mget(df$var2), pdqr::summ_mean)
df$var3_mean <- sapply(mget(df$var3), pdqr::summ_mean)
which I learned in the second question.
However, if I want to get the mean of the proportional distributions given in var1 and var2 I get into trouble.
> df$var1_2_mean <- mapply(pdqr::summ_mean, foo(df$var1, df$var2))
Error in fun(x) : could not find function "fun"
While if I individually pass the distribution functions, this happens:
> df$var1_2_mean <- mapply(summ_mean, foo(var1_distr1, var2_distr2))
Error in dots[[1L]][[1L]] : object of type 'closure' is not subsettable
As suggested by @Limey, I put the PDFs in a list:
PDFS <- list(var1_distr1 = var1_distr1, var1_distr2 = var1_distr2, var1_distr3 = var1_distr3,
var2_distr1 = var2_distr1, var2_distr2 = var2_distr2, var2_distr3 = var2_distr3,
var3_distr1 = var3_distr1, var3_distr2 = var3_distr2, var3_distr3 = var3_distr3)
However, when calling that (using this approach apply-list-of-functions-to-list-of-values) I get this:
> df$var1_2_mean <- foo(sapply(PDFS, mapply, df$var1, df$var2))
Error in (function (x) : unused argument (dots[[2]][[1]])
> sapply(PDFS, mapply, df$var1, df$var2)
Error in (function (x) : unused argument (dots[[2]][[1]])
> sapply(PDFS, mapply, df$var1)
Error: `x` must be 'numeric', not 'character'.
> df$var1_2_mean <- foo(sapply(PDFS, mapply, paste(df$var1, df$var2, sep = ", ")))
Error: `x` must be 'numeric', not 'character'.
> df$var1_2_mean <- summ_mean(foo(sapply(PDFS, mapply, paste(df$var1, df$var2, sep = ", "))))
Error: `x` must be 'numeric', not 'character'.
> df$var1_2_mean <- sapply(foo(mget(mapply(PDFS, sapply, df$var1, df$var2))), pdqr::summ_mean)
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'PDFS' of mode 'function' was not found
> lapply(PDFS, function(x) x())
Error in x() : argument "x" is missing, with no default
I'm still missing something, and I believe it's on vectorisation. Might invoke_map work?
I don't have the pdqr package, so I can't solve your exact problem, but here's a proof-of-concept example that may be helpful. As I mention in comments, you haven't specified your exact use case, but I do feel you are imposing constraints that make your life more difficult than it needs to be: for example, passing function names rather than functions to your summary function, or using a data frame rather than a list.
Anyway, start by defining some functions and store them in a list.
foo1 <- function() {"Foo 1"}
foo2 <- function() {"Foo 2"}
foo3 <- function() {"Foo 3"}
funcList <- list(foo1, foo2, foo3)
Now use utils::combn() to generate all combinations of two of these three functions and call each member of each pair in turn.
combn(
funcList,
m=2,
FUN=function(combination) {
lapply(combination, function(x) x())
}
)
Giving
[,1] [,2] [,3]
[1,] "Foo 1" "Foo 1" "Foo 2"
[2,] "Foo 2" "Foo 3" "Foo 3"
combn() takes the list of functions as input. m=2 requests the generation of all combinations of 2 elements from the list. FUN= specifies a function to be applied to each combination. The anonymous function supplied here takes the combination and calls each of its elements in turn.
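Since the answer above does not use pdqr, here is a base-R sketch of the same idea applied to data shaped like the question's: keep the densities in a named list, look them up by name, and combine them row by row with mapply(). It assumes plain dnorm() closures in place of pdqr objects and approximates the mean on the question's grid seq(1, 10, by = 1), so `product_mean` is a hypothetical helper and not a substitute for pdqr::summ_mean().

```r
# Plain density closures stand in for the pdqr objects (assumption)
PDFS <- list(
  var1_distr1 = function(x) dnorm(x, mean = 3, sd = 1),
  var1_distr2 = function(x) dnorm(x, mean = 6, sd = 1),
  var2_distr1 = function(x) dnorm(x, mean = 5, sd = 3),
  var2_distr2 = function(x) dnorm(x, mean = 3, sd = 1)
)

df <- data.frame(
  var1 = c("var1_distr1", "var1_distr2", "var1_distr1"),
  var2 = c("var2_distr2", "var2_distr1", "var2_distr1"),
  stringsAsFactors = FALSE
)

x <- seq(1, 10, by = 1)

# Hypothetical helper: grid-based mean of the product of two named densities
product_mean <- function(name1, name2) {
  y <- PDFS[[name1]](x) * PDFS[[name2]](x)
  sum(x * y) / sum(y)
}

# mapply() walks both name columns in parallel, one pair per row
df$var1_2_mean <- mapply(product_mean, df$var1, df$var2, USE.NAMES = FALSE)
df
```

The key difference from the failing attempts in the question is that the character names are turned into functions (via PDFS[[name]]) before anything is called.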

R function ambiguous call

I'm working on a project in which I'm trying to convert an R function to CUDA C++, but I can't understand one of the R function calls. I'm really new to R and I can't find what I'm looking for. To be exact, this is the main R function code:
for (i in 1:ncy) {
res <- apply(allsubset, 2, banddepthforonecurve, xdata=x, ydata=y[,i], tau=tau, use=use)
depth[i] <- sum(res[1,])
localdepth[i] <- sum(res[2,])
}
The part that I can't really understand is "banddepthforonecurve" function call, this is the "banddepthforonecurve" function code:
banddepthforonecurve <- function(x, xdata, ydata, tau, use) {
envsup <- apply(xdata[,x], 1, max)
envinf <- apply(xdata[,x], 1, min)
inenvsup <- ydata <= envsup
inenvinf <- ydata >= envinf
depth <- all(inenvsup) & all(inenvinf)
localdepth <- depth & use(envsup-envinf) <= tau
res <- c(depth,localdepth)
return(res)
}
When it is called in:
res <- apply(allsubset, 2, banddepthforonecurve, xdata=x, ydata=y[,i], tau=tau, use=use)
I don't really get what it sets for the first parameter "x" of "banddepthforonecurve". I assumed it is like banddepthforonecurve(i, xdata=x, ydata=y[,i], tau = tau, use=use)
but if I try to run it separately in RStudio to understand it better, I get:
apply(xdata[, x], 1, max) : dim(X) must have a positive length
Why doesn't this error occur when I run the whole R project? What is set for the "x" parameter when it is called in "res <- apply(...)"? I hope I was clear, and sorry for my bad English. Thank you in advance!
# This apply function
res = apply(X = input, MAR = 2, FUN = foo, ...)
# is essentially syntactic sugar for this:
res = list()
for(i in 1:ncol(input)) {
  res[[i]] = foo(input[, i], ...)
}
# plus an attempt to simplify `res` (e.g., to a matrix or vector)
So in your line:
apply(allsubset, 2, banddepthforonecurve, xdata=x, ydata=y[,i], tau=tau, use=use)
In a single iteration of your for loop, the first parameter of banddepthforonecurve (x) will be allsubset[, 1], then allsubset[, 2], ..., allsubset[, ncol(allsubset)].
The xdata parameter is always x, the tau and use parameters are always tau and use, and the for loop iterates over the columns of y to use as the ydata argument. You can think of it as a nested loop, for each column of y, use it as ydata and (via apply) iterate over all columns of allsubset.
(If the MAR argument of apply was 1, then it would iterate over rows instead of columns.)
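The equivalence between apply() and an explicit loop over columns can be checked on toy data. In this sketch, `envelope_width` is a hypothetical, simplified stand-in for banddepthforonecurve, and the matrix values are arbitrary. The last lines reproduce the error from the question: passing a single index makes xdata[, x] a plain vector, which apply() rejects.

```r
# Toy data: 5 points on each of 4 curves (values are arbitrary)
xdata <- matrix(c(1, 4, 2, 8,
                  3, 1, 5, 2,
                  6, 6, 0, 1,
                  2, 9, 4, 4,
                  7, 3, 3, 5), nrow = 5, byrow = TRUE)

# Each column of allsubset holds the indices of one pair of curves
allsubset <- combn(ncol(xdata), 2)

# Hypothetical stand-in for banddepthforonecurve: x is a vector of column indices
envelope_width <- function(x, xdata) {
  envsup <- apply(xdata[, x], 1, max)
  envinf <- apply(xdata[, x], 1, min)
  c(max(envsup - envinf), min(envsup - envinf))
}

# apply() passes each column of allsubset as the first argument
via_apply <- apply(allsubset, 2, envelope_width, xdata = xdata)

# The same thing written as an explicit iteration over columns
via_loop <- sapply(seq_len(ncol(allsubset)), function(k) {
  envelope_width(allsubset[, k], xdata = xdata)
})

all(via_apply == via_loop)

# A single index drops xdata[, x] to a plain vector, which apply() rejects
single <- try(apply(xdata[, 1], 1, max), silent = TRUE)
inherits(single, "try-error")
```

For a CUDA port, this suggests treating each column of allsubset as one independent work item.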

List objects as function arguments with overridable list element defaults

I have an R function which takes a large number of arguments (18) which I would like to pass in as a list. When I am running this function by hand, so to speak, I generally want to use all the defaults but one or two, but I also want to run this same function many times with various combinations of default and non-default items, which I would like to assemble programmatically as lists.
I know that I could just have my 18+ arguments as individual formals and then assemble them into a list inside the function, but I wish I could have a list as a default for a formal, and then have the elements have defaults as well. Like this:
> f <<- function(x, y = list(a=0, b=3)) {with(y, (x + a + b))}
> f(1)
[1] 4
> f(x=1, y$a = 1)
Error: unexpected '=' in "f(x=1, y$a ="
(or alternatively)
In y$a <- 1 :
Error in eval(substitute(expr), data, enclos = parent.frame()) :
object 'a' not found
except with the output of 5 rather than an error. I suspect there is no way to do this, because R does not recognise the assignments in the list as creating defaults, but only as creating named elements. But maybe with the assignment form of formals? or through some clever use of do.call?
Here are some alternatives:
1) modifyList Use modifyList to process the defaults.
f1 <- function(x, y = list()) {
defaults <- list(a = 0, b = 3)
with(modifyList(defaults, y), {
x + a + b
})
}
f1(x = 1)
## [1] 4
f1(x = 1, y = list(a = 1))
## [1] 5
2) do.call Another possibility is to have two functions. The first does not use a list and the second (which is the one the user calls) does using do.call to invoke the first.
f2impl <- function(x, a = 0, b = 3) x + a + b
f2 <- function(x, y = list()) do.call("f2impl", c(list(x), y))
f2(x = 1)
## [1] 4
f2(x = 1, y = list(a = 1))
## [1] 5
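For the programmatic use case in the question (many runs with different combinations of defaults and overrides), the do.call approach extends naturally. A minimal sketch, assuming a hypothetical `run_model` with ordinary formals and a list of override lists:

```r
# Hypothetical function with ordinary defaults for each setting
run_model <- function(x, a = 0, b = 3) x + a + b

# Each element lists only the settings that differ from the defaults
overrides <- list(
  list(),
  list(a = 1),
  list(a = 1, b = 10)
)

# do.call() matches the named elements to formals; the rest keep their defaults
results <- vapply(
  overrides,
  function(o) do.call(run_model, c(list(x = 1), o)),
  numeric(1)
)
results
```

An empty list() means "use every default", so the same code path serves both the by-hand and the generated cases.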

Get the names of functions inside a list

What I wish to achieve
So I want to get the names of the functions inside a list of functions.
Here is an example:
foo = list(foo1 = sum, foo2 = mean)
What I wish to extract from foo is:
list("sum", "mean")
And I would like it to be a function, meaning:
> foo = list(foo1 = sum, foo2 = mean)
> super_function(foo)
list("sum", "mean")
What I have checked
Applying names:
> sapply(foo , names)
$`foo1`
NULL
$foo2
NULL
Applying deparse(substitute())
> my_f <- function(x)deparse(substitute(x))
> sapply(foo, my_f)
foo1 foo2
"X[[i]]" "X[[i]]"
Neither idea works....
More background:
Here are some more details. They are not needed to understand the first question, but they were requested by the community.
I'm using those functions as aggregation functions given by the user.
data(iris)
agg_function<-function(data, fun_to_apply){
res <- list()
for (col_to_transform in names(fun_to_apply)){
res[col_to_transform] <- (fun_to_apply[[col_to_transform]])(data[[col_to_transform]])
}
res
}
agg_function(iris, fun_to_apply = list("Sepal.Length" = mean, "Petal.Length" = sum))
Result is:
$`Sepal.Length`
[1] 5.843333
$Petal.Length
[1] 563.7
In this example I'm performing aggregation on two columns of iris. But I wish to have the name of the performed function in the name of each field of my result.
NB: This is an over simplification of what I'm doing;
Conclusion:
Do you have any ideas?
If you are starting from just the list foo = list(foo1 = sum, foo2 = mean), then it's not possible. The call to list() evaluates its arguments and returns the values that the variables sum and mean point to, but it does not remember those variable names. Functions don't have names in R; they can be assigned to variables, but they can also exist without any name at all.
You've basically just created a named list of functions. That might also look like this
foo = list(foo1 = function(x) sum(x+1),
foo2 = function(x) mean(x+1))
Here we also have functions, but these functions don't have "names" other than the names you gave to them in the list.
The only chance you have of making this work is to use something other than list() when creating foo in the first place, or to have users explicitly call list() in the function call (which isn't very practical).
Although you already said that the tidyverse is not suitable for you, I will add this as another idea.
library(dplyr)
# Note: funs() is deprecated in recent dplyr releases
agg_function <- function(df, x, ...){
  df %>%
    summarise_at(.vars = x, funs(...))
}
agg_function(iris, c("Sepal.Length", "Petal.Length"), mean, sum)
Sepal.Length_mean Petal.Length_mean Sepal.Length_sum Petal.Length_sum
1 5.843333 3.758 876.5 563.7
You can use a list with the functions as strings
foo <- list(foo1 = "mean", foo2 = "sum")
foo
$foo1
[1] "mean"
$foo2
[1] "sum"
get(foo[[1]])(1:10)
[1] 5.5
get(foo[[2]])(1:10)
[1] 55
Or use the rlang package and do something like
library(rlang)
foo <- quos(foo1 = mean, foo2 = sum)
getNames <- function(x) {
  sapply(x, function(x) x[[2]])
}
getNames(foo)
$foo1
mean
$foo2
sum
eval_tidy(foo[[1]])(1:10)
[1] 5.5
eval_tidy(foo[[2]])(1:10)
[1] 55
This also works with anonymous functions
foo <- quos(foo1 = function(x) sum(x + 1), foo2 = sum)
getNames(foo)
$foo1
function(x) sum(x + 1)
$foo2
sum
eval_tidy(foo[[1]])(1:10)
[1] 65

Passing a function argument to other arguments which are functions themselves

Assume I have an outer function that has a numeric argument and an argument which is a function itself (inner function). How can I pass the value of the numeric argument of the outer function as an argument to the inner function? Consider this toy example:
innerfun <- function(M){
1:M
}
outerfun <- function(x, fun){
x * fun
}
outerfun(x = 3, fun = innerfun(M = 3)) ## works
outerfun(x = 3, fun = innerfun(M = x)) ## error because innerfun can't find 'x'
outerfun(x = 3, fun = innerfun(M = get("x"))) ## doesn't work either...
So what I want to do is to call innerfun at the moment the arguments of outerfun are evaluated, using those outerfun-arguments in the call to innerfun. Any ideas or suggestions?
I would do something like this:
outerfun <- function(x, fun,...){
x * fun(x,...)
}
innerfun <- function(M){
seq_len(M) ## safer than 1:M
}
outerfun(x=3, innerfun)
[1] 3 6 9
Note that if the inner function has more than one argument, it still works:
innerfun2 <- function(M,Z){
seq(M+Z)
}
outerfun(x=3, innerfun2,Z=3)
[1] 3 6 9 12 15 18
Add a "global" variable:
param = 3;
outerfun(x = param, fun = innerfun(M = param))
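If the call really has to be written as innerfun(M = x), another option is to delay its evaluation: capture the unevaluated argument with substitute() and evaluate it with x made visible. This is a sketch of that technique, not what either answer above proposes. It works because supplied arguments are normally evaluated in the caller's environment, where outerfun's x does not exist.

```r
innerfun <- function(M) seq_len(M)

outerfun <- function(x, fun) {
  # Capture the expression passed as 'fun' without evaluating it
  fun_expr <- substitute(fun)
  # Evaluate it with 'x' visible; everything else is looked up in the caller
  value <- eval(fun_expr, list(x = x), parent.frame())
  x * value
}

res <- outerfun(x = 3, fun = innerfun(M = x))
res
```

The trade-off is that outerfun now uses non-standard evaluation, so a variable named x in the caller is shadowed by the argument x inside the captured expression.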

Resources