What I'm trying to do is trivial but I've not found a clear solution to it:
For instance, I have the following function:
sample.function <- function(a, b, named="test") {
...
}
I wish I could inspect the function and obtain the arguments (maybe as an R list), given ret is the returned value of the desired function, fhe following assertions should be all True
ret <- magicfunction(sample.function)
ret[[1]] == "a"
ret[[2]] == "b"
ret$named == "test"
can it be done?
Here are a few things you can look at, inside or outside of the function.
> f <- function(FUN = sum, na.rm = FALSE) {
c(formals(f), args(f), match.fun(FUN))
}
> f()
$FUN
sum
$na.rm
[1] FALSE
[[3]]
function (FUN = sum, na.rm = FALSE)
NULL
[[4]]
function (..., na.rm = FALSE) .Primitive("sum")
This will work if the function encloses its body with brace brackets (which nearly all functions do). It gives a list whose names are the argument names and whose values are the defaults:
sample.function <- function(a, b, named="test") {} # test function
L <- as.list(formals(sample.function))); L
## $a
##
## $b
##
## $named
## [1] "test"
This is slightly longer but works even for functions whose bodies are not surrounded by brace brackets:
head(as.list(args(sample.function)), -1)
# same output
head(as.list(args(sin)), -1) # sin has no {}
## $x
Returning to the first example, to examine the default values for missing:
sapply(L, identical, formals(function(x) {})$x)
## a b named
## TRUE TRUE FALSE
Revised
Related
My R function uses missing() to switch between two alternative ways of specifying input data. However, if the input is a factor, I want to automatically apply the function on the factor levels instead, so I make a recursive call. Now I have the problem that I forward all arguments to the recursive call, so none of the parameters are missing anymore! How do I make a recursive call with all the parameters missing that are also missing in the parent call?
Minimal example:
f <- function(a, b = 1){
print(missing(b))
if(length(a)>0) f(a = a[-1], b = b)
}
f(1:2) prints:
[1] TRUE
[1] FALSE
[1] FALSE
What I want is
[1] TRUE
[1] TRUE
[1] TRUE
This works when b has no default value, so f is instead
f <- function(a, b){
...
Also of course I have the option to switch the recursive call using if(missing(b))
if(missing(b)) f(a = a[-1]) else f(a = a[-1], b = b)
... but this gets complicated for multiple parameters and also deprives me of the option to learn something about the strange wonders of parameter handling in R ;-)
You can capture the call with match.call, and substitute a for a[-1] in the call. Then instead of calling the function with arguments, use do.call, which will only supply the arguments initially put into the function in the recursive calls.
f <- function(a, b = 1){
print(missing(b))
call_list <- as.list(match.call())[-1]
if(length(a) > 0) {
a <- a[-1]
call_list$a <- a
do.call(f, call_list)
}
}
f(1:2)
#> [1] TRUE
#> [1] TRUE
#> [1] TRUE
Given a regular R function f, I'd like to be able to create a new function f_debug that acts just like f, but lets me keep track of all the assignments to function-local variables that happened inside it.
For example:
f <- function(x, y) {
z <- x + y
df <- data.frame(z=z)
df
}
# This function doesn't work as intended - would like it to (in the case of `f` above)
# write out a list containing `z` and `df` to an RDS file
capturing <- function(func) {
e <- new.env()
altered <- function(...) {
parent <- parent.frame()
e <- something...(func, environment(), parent, etc., etc.)
result <- func(...)
saveRDS(as.list(e), 'foo.rds')
result
}
environment(func) <- e
altered
}
f_debug <- capturing(f)
I'm not sure whether my knowledge gap to do this is large or small, anyone have a solution?
Solution 1: Steal the function's code
Here's a solution which doesn't return a new function which captures intermediate calculations, but rather calls the given function's code internally. There's some limitations, such as it probably only works with named arguments. Instead of storing the intermediate calculations as an RDS, it attaches them as an attribute.
capturing <- function(fun, ...) {
fun <- match.fun(fun)
code <- body(fun)
parent <- environment(fun)
env <- new.env(parent = parent)
for (val in names(list(...))) {
env[[val]] <- list(...)[[val]]
}
result <- eval(code, envir = env, enclos = parent.frame())
attr(result, "intermediate") <- env
result
}
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
intermediates <- function(x) {
attr(x, "intermediate", exact = TRUE)
}
value <- capturing(my_add, x = 1, y = 7)
ls(envir = intermediates(value))
#> [1] "u" "w" "x" "y" "z"
intermediates(value)$x
#> [1] 1
# Created on 2022-02-08 by the reprex package (v2.0.1)
Solution 2: Modify the function's code
One weakness of this solution is that if the chosen function features a call to on.exit(add=FALSE), some additional work needs to be done to modify the function so the internal environment is captured. However, it does work when the function accepts ... arguments.
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
insert_capture <- function(code) {
# `<<-` assigns into the global environment if no variable of the given name is found
# while traveling up to the global environment. If you need this assignment to go elsewhere,
# I'd recommend passing in `assign()`. Of course, you could also modify the `on.exit()`
# to use saveRDS.
parse(text=append(deparse(code),
"on.exit(._last_capture <<- environment(), add = TRUE)",
after = 1L))
}
capturing2 <- function(fun) {
fun <- match.fun(fun)
code <- insert_capture(body(fun))
body(fun) <- code
fun
}
my_add2 <- capturing2(my_add)
my_add2(1, 7)
#> [1] 8
ls(envir = ._last_capture)
#> [1] "u" "w" "x" "y" "z"
._last_capture$u
#> [1] -6
Created on 2022-02-08 by the reprex package (v2.0.1)
What you are describing is already implemented in base R with utils::dump.frames, in an even more sophisticated way. It saves the frame (environment) associated with each call in the call stack to an object of class "dump.frames", which you can explore retroactively with utils::debugger as if you had actually run your code under a debugger.
capturing <- function(func, ...) {
cc <- as.call(c(quote(utils::dump.frames), list(...)))
cc <- call("on.exit", cc, add = TRUE)
body(func) <- call("{", cc, body(func))
func
}
capturing injects the call on.exit(utils::dump.frames(...), add = TRUE) into the body of func and returns the modified function.
Here, ... is a list of arguments to dump.frames:
dumpto, a character string giving the name to be used for the "dump.frames" object
to.file, a logical flag indicating whether the "dump.frames" object should be assigned in the global environment or save-ed to paste0(dumpto, ".rda") in the current working directory
include.GlobalEnv, a logical flag indicating whether the global environment should be saved as well
A quick example, which you should try yourself:
tmp <- tempfile()
dir.create(tmp)
cwd <- setwd(tmp)
f <- function(x, y) {
z <- x + y
z + 1
}
g <- capturing(f, dumpto = "zzz", to.file = TRUE)
h <- function(a, b) {
d <- g(a, b)
d + 1
}
h12 <- h(1, 2)
load("zzz.rda")
zzz
## $`h(1, 2)`
## <environment: 0x14c16cb58>
##
## $`#2: g(a, b)`
## <environment: 0x14c16ca40>
##
## attr(,"error.message")
## [1] ""
## attr(,"class")
## [1] "dump.frames"
ls(zzz[[1L]])
## [1] "a" "b"
ls(zzz[[2L]])
## [1] "z" "x" "y"
utils::debugger(zzz)
## Message: Available environments had calls:
## 1: h(1, 2)
## 2: #2: g(a, b)
##
## Enter an environment number, or 0 to exit
## Selection: 2
## Browsing in the environment with call:
## #2: g(a, b)
## Called from: debugger.look(ind)
## Browse[1]> ls()
## [1] "x" "y" "z"
## Browse[1]> x == 1 && y == 2 && z == x + y
## [1] TRUE
## Browse[1]> Q
setwd(cwd)
unlink(tmp, recursive = TRUE)
See ?browser if you are unfamiliar with R's environment browser.
My capturing function has the limitation that on.exit calls in the body of func must also use add = TRUE. If you have written func yourself, then it is not much of a limitation at all, and passing add = TRUE is a good habit anyway.
Ultimately, there is no completely safe way to inject code into functions, but, in an interactive setting, I would say that this level of "unsafety" is fine.
The problem I am trying to tackle here is needing to apply (execute) an S3 object which is essentially a vector-like structure. This may contain various formulas which at some stage I need to evaluate for a single argument, in order to get back a vector-like object of the original shape, containing the evaluation of its constituent formulas at the given argument.
Examples of this (just to illustrate) might be a matrix of transformation - say rotation - which would take the angle to rotate by, and produce a matrix of values by which to multiply a point, for the given rotation. Another example might be the vector of states in a problem in classical mechanics. Then given t, v, a, etc, it could return s...
Now, I have created my container object in S3, and its working fine in most respects, using generic methods; I also found the Ops.myClass system of operator overloading very useful.
To complete my class, all I need now is a way to specify it as executable.
I see that there are various mechanisms that will do what I want in part, for instance I suppose that as.function() will convert the object to behave as I want, and something like lapply() could be used for the "reverse" application of the argument to the functions. What I am not sure how to do is link it all up so that I can do something like this mock-up:
new_Object <- function(<all my function vector stuff spec>)
vtest <- new_Object(<say, sin, cos, tan>)
vtest(1)
==>
myvec(.8414709848078965 .5403023058681398 1.557407724654902)
(Yes, I have already specified a generic print() routine that will make it appear nice)
All suggestions, sample code, links to examples are welcome.
PS =====
I have added some basic example code as per request.
I am not sure how much would be too much, so the full working minimal example, including operator overloading is in this gist here.
I am only showing the constructor and helper functions below:
# constructor
new_Struct <- function(stype , vec){
stopifnot(is.character(stype)) # enforce up | down
stopifnot(is.vector(vec))
structure(vec,class="Struct", type=stype)
}
# constructor helper functions --- need to allow for nesting!
up <-function(...){
vec <- unlist(list(...),use.names = FALSE)
new_Struct("up",vec)
}
down <-function(...){
vec <- unlist(list(...),use.names = FALSE)
new_Struct("down",vec)
}
The above code behaves thus:
> u1 <- up(1,2,3)
> u2 <- up(3,4,5)
> d1 <- down(u1)
> d1
[1] down(1, 2, 3)
> u1+u2
[1] up(4, 6, 8)
> u1+d1
Error: '+' not defined for opposite tuple types
> u1*d1
[1] 14
> u1*u2
[,1] [,2] [,3]
[1,] 3 4 5
[2,] 6 8 10
[3,] 9 12 15
> u1^2
[1] 14
> s1 <- up(sin,cos,tan)
> s1
[1] up(.Primitive("sin"), .Primitive("cos"), .Primitive("tan"))
> s1(1)
Error in s1(1) : could not find function "s1"
What I need, is for it to be able to do this:
> s1(1)
[1] up(.8414709848078965 .5403023058681398 1.557407724654902)
You can not call each function in a list of functions without a loop.
I'm not fully understanding all requirements, but this should give you a start:
new_Struct <- function(stype , vec){
stopifnot(is.character(stype)) # enforce up | down
stopifnot(is.vector(vec) || is.function(vec))
structure(vec,class="Struct", type=stype)
}
# constructor helper functions --- need to allow for nesting!
up <- function(...) UseMethod("up")
up.default <- function(...){
vals <- list(...)
stopifnot(all(vapply(vals, is.vector, FUN.VALUE = logical(1))))
vec <- unlist(vals, use.names = FALSE)
new_Struct("up",vec)
}
up.function <- function(...){
funs <- list(...)
stopifnot(all(vapply(funs, is.function, FUN.VALUE = logical(1))))
new_Struct("up", function(x) new_Struct("up", sapply(funs, do.call, list(x))))
}
up(1, 2, 3)
#[1] 1 2 3
#attr(,"class")
#[1] "Struct"
#attr(,"type")
#[1] "up"
up(1, 2, sin)
#Error in up.default(1, 2, sin) :
# all(vapply(vals, is.vector, FUN.VALUE = logical(1))) is not TRUE
up(sin, 1, 2)
#Error in up.function(sin, 1, 2) :
# all(vapply(funs, is.function, FUN.VALUE = logical(1))) is not TRUE
s1 <- up(sin, cos, tan)
s1(1)
#[1] 0.8414710 0.5403023 1.5574077
#attr(,"class")
#[1] "Struct"
#attr(,"type")
#[1] "up"
After some thought I have come up with a way to approach this, it's not perfect, it would be great if someone could figure out a way to make the function call implicit/transparent.
So, for now I just use the call() mechanism on the object, and that seems to work fine. Here's the pertinent part of the code, minus checks. I'll put up the latest full version on the same gist as above.
# constructor
new_Struct <- function(stype , vec){
stopifnot(is.character(stype)) # enforce up | down
stopifnot(is.vector(vec))
structure(vec,class="Struct", type=stype)
}
# constructor helper functions --- need to allow for nesting!
up <- function(...){
vec <- unlist(list(...), use.names = FALSE)
new_Struct("up",vec)
}
down <- function(...){
vec <- unlist(list(...), use.names = FALSE)
new_Struct("down",vec)
}
# generic print for tuples
print.Struct <- function(s){
outstr <- sprintf("%s(%s)", attributes(s)$type, paste(c(s), collapse=", "))
print(noquote(outstr))
}
# apply the structure - would be nice if this could be done *implicitly*
call <- function(...) UseMethod("call")
call.Struct <- function(s,x){
new_Struct(attributes(s)$type, sapply(s, do.call, list(x)))
}
Now I can do:
> s1 <- up(sin,cos,tan)
> length(s1)
[1] 3
> call(s1,1)
[1] up(0.841470984807897, 0.54030230586814, 1.5574077246549)
>
Not as nice as my ultimate target of
> s1(1)
[1] up(0.841470984807897, 0.54030230586814, 1.5574077246549)
but it will do for now...
I'm writing a function such that callers of this function can write schemas declaratively:
myschema <- Schema(
patientID = character,
temp = numeric,
treated = logical,
reason_treated = factor(levels=c('fever', 'chills', 'nausea'))
)
Later, I'd to be able to assemble dataframes using the types declared in this schema. I think the best candidate for this job is to use the metaprogramming features available in rlang:
Schema = function(...) {
schematypes = rlang::enexprs(...)
}
However, most of the examples pertain to capturing the expression and thereafter using them as arguments to functions, rather than as functions themselves. That is, I'm finding it hard to capture the right side of the following expression:
patientID = character
and then later being able to evaluate it later as character(myvec), whenever I get myvec. The same applies to the following:
reason_treated = factor(levels=c('fever', 'chills', 'nausea'))
which I would later like to evaluate as factor(myvec, levels=c('fever', 'chills', 'nausea'))
Thanks!
If I understand correctly, you are effectively constructing a schema out of functions, and you want to apply those functions to some arguments when those become available. This falls under the umbrella of functional programming rather than rlang metaprogramming.
A large portion of the functionality you want is already captured by purrr::map and its "engine" as_mapper. You can employ it directly to define
Schema <- function(...) { purrr::map( list(...), purrr::as_mapper ) }
You can now employ it to build new schemas like you suggested (with minor modifications to function definitions):
myschema <- Schema(
patientID = as.character, # Note the correct function name
temp = as.numeric, # Ditto
treated = as.logical, # Tritto
reason_treated = ~factor(., levels=c('fever', 'chills', 'nausea'))
)
# $patientID
# function (x, ...)
# as.character(x = x, ...)
# <environment: base>
#
# $temp
# function (x, ...)
# as.double(x = x, ...)
# <environment: base>
#
# $treated
# function (x, ...)
# as.logical(x = x, ...)
# <environment: base>
#
# $reason_treated
# function (..., .x = ..1, .y = ..2, . = ..1)
# factor(., levels = c("fever", "chills", "nausea"))
# <bytecode: 0x00000000027a2d00>
Given your new schema, registering new patients can be done using a sister function of map that lines up arguments from two lists / vectors:
register_patient <- function(myvec) { purrr::map2( myschema, myvec, ~.x(.y) ) }
JohnDoe <- register_patient( c(1234, 100, TRUE, "fever") )
# $patientID
# [1] "1234"
#
# $temp
# [1] 100
#
# $treated
# [1] TRUE
#
# $reason_treated
# [1] fever
# Levels: fever chills nausea
Let's verify the type of each element:
purrr::map( JohnDoe, class )
# $patientID
# [1] "character"
#
# $temp
# [1] "numeric"
#
# $treated
# [1] "logical"
#
# $reason_treated
# [1] "factor"
I would like to ask this, because it is hard to search for. Is there more efficient way to write the following:
a <- list(x=FALSE,z=TRUE,l=list()) # a$y is not defined, list contains also lists
f <- function() 1
if(!is.null(a$x)) { if(a$x==TRUE) f() }
if(!is.null(a$y)) { if(a$y==TRUE) f() }
if(!is.null(a$z)) { if(a$z==TRUE) f() }
[1] 1
The idea is that if list any of pre-given list elements x, y or z have value TRUE function f() is called and otherwise not.
To aim is to run function f() only once, and write the function call f() only once to the code. Function f() is run if one of conditions x, y or z hold. The conditions are stored in the list a, which contains also other elements. However, list a might not contain all conditions, only some of them, which makes the missing conditions to be false.
EDIT:
I found quite convenient solution:
for (b in c("x","y","z")) {
if (!is.null(a[[b]]) & c(a[[b]]),F)[1] == T ) {
print(f())
break
}
}
but in order to prevent error:
if(!is.null(a[["y"]]) & a[["y"]] == T) 1
Error in if (!is.null(a[["y"]]) & a[["y"]] == T) 1 :
argument is of length zero
I had to make a coalesce-like solution c(a[["y"]],F)[1]:
if(!is.null(a[["y"]]) & c(a[["y"]],F)[1] == T) 1
which works, but does not look so nice, because I am not sure whether the following condition will work always, even if it does here (?):
> c(NULL,1) == c(1)
[1] TRUE
Since there is a list embedded inside another list, ie nested lists, we would rather use rapply
a <- list(x=FALSE,z=TRUE,l=list())
f <- function() 1
rapply(a,function(k)if(!is.null(k)&k==T)f())
z
1
b=list(x=FALSE,Z=TRUE,l=list(x=TRUE,y=TRUE))
rapply(b,function(k)if(!is.null(k)&k==T)f())
Z l.x l.y
1 1 1
With this kind of data:
a <- list(x=FALSE,z=TRUE,l=list()) # a$y is not defined, list contains also lists
f <- function() 1
the aim was to run function f() if one of conditions x, y or z hold. Thanks to hint from #Onyambu, it is necessary to test only conditions which are found in the list, because not found conditions are false. Below are steps which were used in final solution (last execution row):
> intersect(names(a),c("x","y","z"))
[1] "x" "z"
> a[intersect(names(a),c("x","y","z"))]
$x
[1] FALSE
$z
[1] TRUE
> unlist(a[intersect(names(a),c("x","y","z"))])
x z
FALSE TRUE
> any(unlist(a[intersect(names(a),c("x","y","z"))]))
[1] TRUE
> if (any(unlist(a[intersect(names(a),c("x","y","z"))]))) f()
[1] 1