Given a list of arguments, and a function, I'd like to see if I have a valid function call.
My attempt so far:
args_match <- function (fun, ...) {
fun_name <- deparse(substitute(fun))
safe_mc <- purrr::safely(match.call)
x <- safe_mc(fun, do.call(call, list(fun_name, ...)))
return(is.null(x$error))
}
This works for some functions:
> args_match(fivenum, y = 1:10)
[1] FALSE
> args_match(fivenum, x = 1:10)
[1] TRUE
But it fails for functions with a ... argument:
> args_match(mean, x = 1:10)
[1] TRUE
> args_match(mean, y = 1:10)
[1] TRUE
> mean(y = 1:10)
Error in mean.default(y = 1:10) :
argument "x" is missing, with no default
Is there a way to improve on this? The answer may be no, because some R functions indeed don't care if their arguments are missing... :-/
Related
Given a regular R function f, I'd like to be able to create a new function f_debug that acts just like f, but lets me keep track of all the assignments to function-local variables that happened inside it.
For example:
f <- function(x, y) {
z <- x + y
df <- data.frame(z=z)
df
}
# This function doesn't work as intended - would like it to (in the case of `f` above)
# write out a list containing `z` and `df` to an RDS file
capturing <- function(func) {
e <- new.env()
altered <- function(...) {
parent <- parent.frame()
e <- something...(func, environment(), parent, etc., etc.)
result <- func(...)
saveRDS(as.list(e), 'foo.rds')
result
}
environment(func) <- e
altered
}
f_debug <- capturing(f)
I'm not sure whether my knowledge gap to do this is large or small, anyone have a solution?
Solution 1: Steal the function's code
Here's a solution which doesn't return a new function which captures intermediate calculations, but rather calls the given function's code internally. There's some limitations, such as it probably only works with named arguments. Instead of storing the intermediate calculations as an RDS, it attaches them as an attribute.
capturing <- function(fun, ...) {
fun <- match.fun(fun)
code <- body(fun)
parent <- environment(fun)
env <- new.env(parent = parent)
for (val in names(list(...))) {
env[[val]] <- list(...)[[val]]
}
result <- eval(code, envir = env, enclos = parent.frame())
attr(result, "intermediate") <- env
result
}
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
intermediates <- function(x) {
attr(x, "intermediate", exact = TRUE)
}
value <- capturing(my_add, x = 1, y = 7)
ls(envir = intermediates(value))
#> [1] "u" "w" "x" "y" "z"
intermediates(value)$x
#> [1] 1
# Created on 2022-02-08 by the reprex package (v2.0.1)
Solution 2: Modify the function's code
One weakness of this solution is that if the chosen function features a call to on.exit(add=FALSE), some additional work needs to be done to modify the function so the internal environment is captured. However, it does work when the function accepts ... arguments.
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
insert_capture <- function(code) {
# `<<-` assigns into the global environment if no variable of the given name is found
# while traveling up to the global environment. If you need this assignment to go elsewhere,
# I'd recommend passing in `assign()`. Of course, you could also modify the `on.exit()`
# to use saveRDS.
parse(text=append(deparse(code),
"on.exit(._last_capture <<- environment(), add = TRUE)",
after = 1L))
}
capturing2 <- function(fun) {
fun <- match.fun(fun)
code <- insert_capture(body(fun))
body(fun) <- code
fun
}
my_add2 <- capturing2(my_add)
my_add2(1, 7)
#> [1] 8
ls(envir = ._last_capture)
#> [1] "u" "w" "x" "y" "z"
._last_capture$u
#> [1] -6
Created on 2022-02-08 by the reprex package (v2.0.1)
What you are describing is already implemented in base R with utils::dump.frames, in an even more sophisticated way. It saves the frame (environment) associated with each call in the call stack to an object of class "dump.frames", which you can explore retroactively with utils::debugger as if you had actually run your code under a debugger.
capturing <- function(func, ...) {
cc <- as.call(c(quote(utils::dump.frames), list(...)))
cc <- call("on.exit", cc, add = TRUE)
body(func) <- call("{", cc, body(func))
func
}
capturing injects the call on.exit(utils::dump.frames(...), add = TRUE) into the body of func and returns the modified function.
Here, ... is a list of arguments to dump.frames:
dumpto, a character string giving the name to be used for the "dump.frames" object
to.file, a logical flag indicating whether the "dump.frames" object should be assigned in the global environment or save-ed to paste0(dumpto, ".rda") in the current working directory
include.GlobalEnv, a logical flag indicating whether the global environment should be saved as well
A quick example, which you should try yourself:
tmp <- tempfile()
dir.create(tmp)
cwd <- setwd(tmp)
f <- function(x, y) {
z <- x + y
z + 1
}
g <- capturing(f, dumpto = "zzz", to.file = TRUE)
h <- function(a, b) {
d <- g(a, b)
d + 1
}
h12 <- h(1, 2)
load("zzz.rda")
zzz
## $`h(1, 2)`
## <environment: 0x14c16cb58>
##
## $`#2: g(a, b)`
## <environment: 0x14c16ca40>
##
## attr(,"error.message")
## [1] ""
## attr(,"class")
## [1] "dump.frames"
ls(zzz[[1L]])
## [1] "a" "b"
ls(zzz[[2L]])
## [1] "z" "x" "y"
utils::debugger(zzz)
## Message: Available environments had calls:
## 1: h(1, 2)
## 2: #2: g(a, b)
##
## Enter an environment number, or 0 to exit
## Selection: 2
## Browsing in the environment with call:
## #2: g(a, b)
## Called from: debugger.look(ind)
## Browse[1]> ls()
## [1] "x" "y" "z"
## Browse[1]> x == 1 && y == 2 && z == x + y
## [1] TRUE
## Browse[1]> Q
setwd(cwd)
unlink(tmp, recursive = TRUE)
See ?browser if you are unfamiliar with R's environment browser.
My capturing function has the limitation that on.exit calls in the body of func must also use add = TRUE. If you have written func yourself, then it is not much of a limitation at all, and passing add = TRUE is a good habit anyway.
Ultimately, there is no completely safe way to inject code into functions, but, in an interactive setting, I would say that this level of "unsafety" is fine.
I am trying to define options programmatically by using a string vector as shown below. However, the option does not get defined and it returns a NULL value. Are there any best practices or functions for this?
f <- "z"
options(f = TRUE)
getOption("z")
# returns NULL
According to the docs:
Options can also be passed by giving a single unnamed argument which is a named list
So you can do
f <- list(z = TRUE)
options(f)
getOption("z")
#> [1] TRUE
Or, if you want to be able to use the input format in your question, you can use the following function:
prog_options <- function(...)
{
mc <- as.list(match.call()[-1])
names(mc) <-
sapply(names(mc), function(x) eval(as.name(x), envir = parent.frame()))
options(mc)
}
Which allows the following:
f <- "z"
g <- "y"
prog_options(f = TRUE, g = "Yes")
getOption("z")
#> [1] TRUE
getOption("y")
#> [1] "Yes"
I have the following code:
fn <- 'George'
mn <- 'Walker'
ln <- 'Bush'
f <- function(...) { print(list(...)) }
When I call it, it produces the following output:
f(fn,mn,ln)
[[1]]
[1] "George"
[[2]]
[1] "Walker"
[[3]]
[1] "Bush"
Suppose I wanted something similar to this (note the parameter names):
fn:George
mn:Walker
ln:Bush
Question: I know how to get the VALUES of the arguments inside a function. How do I get the NAMES of the arguments inside the function?
Thanks, CC.
You may use
f <- function(...) {
nm1 <- as.list(match.call()[-1])
val <- list(...)
cat(paste(nm1, val, sep=":", collapse="\n"),'\n') }
f(fn,mn,ln)
#fn:George
#mn:Walker
#ln:Bush
I expect this code to set plt equal to 10:
> var = "plt"
> eval(paste0(var, "<-", 10))
[1] "plt<-10"
But instead, it returns a string.
I tried eval(as.expression(paste0(var, "<-", 10))) and other options, but it still doesn't give the expected result.
What's wrong with the code?
If I understand your comment correctly there is no reason to dive into the shark-infested waters of eval(parse()). Try something like this instead:
myfun <- function(x, fun) {
if (is.character(fun)) fun <- match.fun(fun)
fun(x)
}
myfun(1:5, mean)
#[1] 3
myfun(1:5, "mean")
#[1] 3
See: ?parse. Your demo code:
> var = "plt"
> eval(parse(text = paste0(var, "<-", 10)))
> plt
[1] 10
Update: based on #Anton's comment about the original goal - what about:
> f <- function(type, ...) {
+ assign('plt', do.call(deparse(substitute(type)), list(...)), envir = .GlobalEnv)
+ }
> f(mean, x = 1:20)
> plt
[1] 10.5
PS: I still trying to implement what the OP is after, not what he might or should be after -- that's why I used above assign and .GlobalEnv, although it's not a great idea BTW.
Is there a way to write a function in which one of the arguments indicates what function to apply?
For example, if I have a function:
mf = function(data, option, level)
where I want option to tell whether to calculate the mean, median or sd of a data set?
Yes, one option is to just pass a function to option. E.g.
mf <- function(data, option) {
option <- match.fun(option)
option(data)
}
set.seed(42)
dat <- rnorm(10)
mf(dat, option = mean)
Which gives:
> set.seed(42)
> dat <- rnorm(10)
> mean(dat)
[1] 0.5472968
> mf(dat, option = mean)
[1] 0.5472968
> sd(dat)
[1] 0.8354488
> mf(dat, option = sd)
[1] 0.8354488
match.fun() is the standard R way of matching to an available function. In the example I pass the function itself, but match.fun() allows other ways of referring to a function, for example as a character string:
> mf(dat, option = "mean")
[1] 0.5472968
match.fun() returns a function that can be used as any other function, hence option() is a function that is essentially the same as the function passed to the option argument or is the function named in the option argument.
It isn't clear how the level argument was supposed to be used to I have ignored that above.
I should probably add that if you want to pass in any arguments to the applied function then you'll want to use ... in the function definition, e.g.:
mf <- function(data, option, ...) {
option <- match.fun(option)
option(data, ...)
}
Hence we can do things like this
set.seed(42)
dat2 <- rnorm(10)
dat2[4] <- NA
mean(dat2)
mean(dat2, na.rm = TRUE)
mf(dat2, mean, na.rm = TRUE)
the last three lines giving
> mean(dat2)
[1] NA
> mean(dat2, na.rm = TRUE)
[1] 0.5377895
> mf(dat2, mean, na.rm = TRUE)
[1] 0.5377895
There is a bit of a problem in that "data set" in R usually means a dataframe and there is no median.data.frame so you need to use both lapply and do.call:
df <- data.frame(x=rnorm(10), y=rnorm(10))
mf = function(data, option="mean") {lapply( data,
function(col) do.call(option, list(col))) }
mf(df)
#-------------
$x
[1] 0.01646814
$y
[1] 0.5388518
You did not indicate what "level" was supposed to do, so I left it out of the equation,
> mf(df, sd)
$x
[1] 1.169847
$y
[1] 0.8907117