R: Helper function that checks arguments within an environment

In R (3.4.3), I'm trying to make my code more succinct before I move it to a package. I'm struggling to find a simple way to check the actual arguments passed to multiple functions within an environment, especially when some of those arguments have default values. My goal is to avoid pasting redundant code into each function while maintaining readability and not running into scoping issues. Here is an example (imagine these functions being within the same environment/script):
Current Code
foo_add <- function(x, y = x, divisible = TRUE) {
  if (missing(x) || !is.numeric(x)) stop("define x as a number")
  if (!is.numeric(y)) y <- x
  if (!is.logical(divisible)) stop("define divisible as TRUE or FALSE")
  ...
}
foo_subtract <- function(x, y = x, divisible = TRUE) {
  if (missing(x) || !is.numeric(x)) stop("define x as a number")
  if (!is.numeric(y)) y <- x
  if (!is.logical(divisible)) stop("define divisible as TRUE or FALSE")
  ...
}
foo_divide <- function(x, y = x, divisible = TRUE) {
  if (missing(x) || !is.numeric(x)) stop("define x as a number")
  if (!is.numeric(y)) y <- x
  if (!is.logical(divisible)) stop("define divisible as TRUE or FALSE")
  ...
}
You can see that I'm repeating code just to verify that the user defined the arguments correctly. Below, I've added a list return just to check whether things worked. My rendition of a cleaner version:
Desired Code
foo_add <- function(x, y = x, divisible = TRUE) {
  do.call("check_args", mget(names(formals())))
  list(x, y, divisible)
}
foo_subtract <- function(x, y = x, divisible = TRUE) {
  do.call("check_args", mget(names(formals())))
  list(x, y, divisible)
}
foo_divide <- function(x, y = x, divisible = TRUE) {
  do.call("check_args", mget(names(formals())))
  list(x, y, divisible)
}
check_args <- function(x, y, divisible) {
  if (missing(x) || !is.numeric(x)) stop("define x as a number")
  if (!is.numeric(y)) y <- x
  if (!is.logical(divisible)) stop("define divisible as TRUE or FALSE")
}
Basically, what I want to do is always check whether actual arguments are specified correctly while controlling for default and missing arguments. From my basic knowledge, I kind of want to make a constructor-esque function for actual arguments. I also want the ability not only to error-check but also to modify actual argument values and return them to the parent function for use (e.g., the condition if(!is.numeric(y)) y <- x).
The Desired Code seems to work; I just want to make sure, given how new I am to R programming, that I'm accounting for everything, specifically when I modify objects within check_args() and then pass them back to the parent function that called it with do.call. Is there a cleaner way to go about doing this?
Updated Solution: I've created a somewhat extensible solution; see here.
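One thing worth checking in the Desired Code: an assignment such as y <- x inside check_args() only changes the helper's local copy, so the parent function never sees the corrected value. A minimal sketch of one way around that, assuming the helper returns the validated values as a named list and the caller reads from that list (the args variable is only an illustrative name, not part of the original post):

```r
# Validate the arguments and return the (possibly corrected) values.
check_args <- function(x, y = x, divisible = TRUE) {
  if (missing(x) || !is.numeric(x)) stop("define x as a number")
  if (!is.numeric(y)) y <- x  # fall back to x when y is unusable
  if (!is.logical(divisible)) stop("define divisible as TRUE or FALSE")
  list(x = x, y = y, divisible = divisible)
}

foo_add <- function(x, y = x, divisible = TRUE) {
  # Use the returned list so that corrections made by the helper are kept.
  args <- check_args(x, y, divisible)
  args$x + args$y
}

foo_add(2)       # y defaults to x
foo_add(2, "a")  # non-numeric y is replaced by x inside check_args()
```

Passing the arguments explicitly keeps the data flow visible and avoids writing into the caller's environment from the helper.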

Related

Effective way to do sanity checks in R programming functions?

Does anyone know about a package or function that lets me do different sanity checks about data classes or check matching lengths of the variables?
Any suggestions are welcomed beyond the basic:
f1 <- function(data, x, y) {
  if (!is.data.frame(data)) stop("data must be data.frame!")
  if (!is.vector(x)) stop("x must be a vector!")
  ...code...
}
I am looking for something along these lines (any other suggestions are welcome):
f2 <- function(data, x, y) {
  check(
    data = data.frame,
    err1 = "data must be data.frame",
    x = vector,
    err2 = "x must be vector",
    ...
  )
  ... code ...
}
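Base R may already get close to the check(...) idea above. A minimal sketch, assuming R 4.0.0 or later, where stopifnot() uses the name of each argument as the error message when that condition fails (the length check on x and y and the nrow() return value are only illustrative additions):

```r
f2 <- function(data, x, y) {
  # Each name is the message shown if its condition is not TRUE (R >= 4.0.0).
  stopifnot(
    "data must be a data.frame" = is.data.frame(data),
    "x must be a vector" = is.vector(x),
    "x and y must have the same length" = length(x) == length(y)
  )
  nrow(data)  # placeholder for the real work
}

f2(data.frame(a = 1:3), 1:3, 4:6)
```

On older R versions the same conditions still work unnamed, but the error messages are generated from the failing expression instead.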

Function call with additional custom arguments

I'm trying to wrap my head around ellipsis in R. I have a function and want to be able to pass additional arguments to the function as needed, for example whether or not to return a df or similar. Can I not specify variable names? This is a very simplified example and I want to be able to make this optional to keep function calls as easy and clean as possible with multiple possible conditionals within the function for various scenarios.
custom.fun <- function(x, y, z, ...) {
  a <- sum(x, y, z)
  if (exists('return.var') && return.var == 'yes') {
    return(a)
  }
}
A <- custom.fun(1, 2, 3, return.var = 'yes')
This returns NULL, because return.var is captured by ... and never becomes a variable inside the function.
I guess you can do something like this: capture all the optional arguments in a list and check whether any of them has the required name and value.
custom.fun <- function(x, y, z, ...) {
  opt_args <- list(...)
  a <- sum(x, y, z)
  if (any(names(opt_args) == 'return.var' & opt_args == 'yes'))
    return(a)
  else
    return('No arg')
}
custom.fun(1, 2, 3, return.var = 'yes')
#[1] 6
custom.fun(1, 2, 3, var = 'yes')
#[1] "No arg"
custom.fun(1, 2, 3, var = 'no', return.var = 'yes')
#[1] 6
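A slightly tighter variant of the same idea, offered as a sketch rather than the answer's own code: instead of comparing the whole list against 'yes', pull the one option out by name with [[ ]] (which matches names exactly) and test it with identical(). The name custom_fun is just an illustrative rename.

```r
custom_fun <- function(x, y, z, ...) {
  opt_args <- list(...)  # capture everything passed through ...
  a <- sum(x, y, z)
  # [[ ]] returns NULL when the name is absent, so identical() is simply FALSE
  if (identical(opt_args[["return.var"]], "yes")) {
    a
  } else {
    "No arg"
  }
}

custom_fun(1, 2, 3, return.var = "yes")
custom_fun(1, 2, 3, var = "yes")
custom_fun(1, 2, 3)
```

If return.var is always a known option, declaring it as a regular argument with a default is usually simpler than fishing it out of the dots.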

ISO a good way to let a function accept a mix of supplied arguments, arguments from a list, and defaults

I would like to have a function accept arguments in the usual R way, most of which will have defaults. But I would also like it to accept a list of named arguments corresponding to some or all of the formals. Finally, I would like arguments supplied to the function directly, and not through the list, to override the list arguments where they conflict.
I could do this with a bunch of nested if-statements. But I have a feeling there is some elegant, concise, R-ish programming-on-the-language solution -- probably multiple such solutions -- and I would like to learn to use them. To show the kind of solution I am looking for:
> arg_list <- list(x=0, y=1)
> fn <- function(a_list = NULL, x=2, y=3, z=5, ...){
    <missing code>
    print(c(x, y, z))
  }
> fn(a_list = arg_list, y=7)
Desired output:
x y z
0 7 5
I like a lot about @jdobres's approach, but I don't like the use of assign and the potential scoping breaks.
I also don't like the premise, that a function should be written in a special way for this to work. Wouldn't it be better to write a wrapper, much like do.call, to work this way with any function? Here is that approach:
Edit: solution based off of purrr::invoke
Thinking a bit more about this, purrr::invoke almost gets there, but it will result in an error if a list argument is also passed to .... We can make slight modifications to the code and get a working version more concisely. This version seems more robust.
library(purrr)
h_invoke = function (.f, .x = NULL, ..., .env = NULL) {
  .env <- .env %||% parent.frame()
  args <- c(list(...), as.list(.x)) # switch order so ... is first
  args = args[!duplicated(names(args))] # remove duplicates
  do.call(.f, args, envir = .env)
}
h_invoke(fn, arg_list, y = 7)
# [1] 0 7 5
Original version borrowing heavily from jdobres's code:
hierarchical_do_call = function(f, a_list = NULL, ...){
  formal_args = formals() # get the function's defined inputs and defaults
  formal_args[names(formal_args) %in% c('f', 'a_list', '...')] = NULL # remove the wrapper's own arguments from formals
  supplied_args <- as.list(match.call())[-1] # get the supplied arguments
  supplied_args[c('f', 'a_list')] = NULL # ...but remove the argument list and the function
  a_list[names(supplied_args)] = supplied_args # directly supplied arguments override the list
  do.call(what = f, args = a_list)
}
fn = function(x=2, y=3, z=5) {
  print(c(x, y, z))
}
arg_list <- list(x=0, y=1)
hierarchical_do_call(f = fn, a_list = arg_list, y=7)
# [1] 0 7 5
I'm not sure how "elegant" this is, but here's my best attempt to satisfy the OP's requirements. The if/else logic is actually pretty straightforward (no nesting needed, per se). The real work is in collecting and sanitizing the three different input types (formal defaults, the list object, and any supplied arguments).
fn <- function(a_list = NULL, x = 2, y = 3, z = 5, ...) {
  formal_args <- formals() # get the function's defined inputs and defaults
  formal_args[names(formal_args) %in% c('a_list', '...')] <- NULL # remove these two from formals
  supplied_args <- as.list(match.call())[-1] # get the supplied arguments
  supplied_args['a_list'] <- NULL # ...but remove the argument list
  # for each uniquely named item among the 3 inputs (argument list, defaults, and supplied args):
  for (i in unique(c(names(a_list), names(formal_args), names(supplied_args)))) {
    if (!is.null(supplied_args[[i]])) {
      assign(i, supplied_args[[i]])
    } else if (!is.null(a_list[[i]])) {
      assign(i, a_list[[i]])
    }
  }
  print(c(x, y, z))
}
arg_lst <- list(x = 0, y = 1)
fn(a_list = arg_lst, y=7)
[1] 0 7 5
With a little more digging into R's meta-programming functions, it's actually possible to pack this hierarchical assignment into its own function, which is designed to operate on the function environment that called it. This makes it easier to reuse this functionality, but it definitely breaks scope and should be considered dangerous.
The "hierarchical assignment" function, mostly the same as before:
hierarchical_assign <- function(a_list) {
  formal_args <- formals(sys.function(-1)) # get the function's defined inputs and defaults
  formal_args[names(formal_args) %in% c('a_list', '...')] <- NULL # remove these two from formals
  supplied_args <- as.list(match.call(sys.function(-1), sys.call(-1)))[-1] # get the supplied arguments
  supplied_args['a_list'] <- NULL # ...but remove the argument list
  # for each uniquely named item among the 3 inputs (argument list, defaults, and supplied args):
  for (i in unique(c(names(a_list), names(formal_args), names(supplied_args)))) {
    if (!is.null(supplied_args[[i]])) {
      assign(i, supplied_args[[i]], envir = parent.frame())
    } else if (!is.null(a_list[[i]])) {
      assign(i, a_list[[i]], envir = parent.frame())
    }
  }
}
And the usage. Note that the calling function must have an argument named a_list, and it must be passed to hierarchical_assign.
fn <- function(a_list = NULL, x = 2, y = 3, z = 5, ...) {
  hierarchical_assign(a_list)
  print(c(x, y, z))
}
fn(a_list = arg_lst, y = 7)
[1] 0 7 5
I think do.call() does exactly what you want. It accepts a function and a list, the list holding the arguments for that function. I think you will need a wrapper function to create this behavior of "overwriting defaults".
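A minimal sketch of such a wrapper, assuming all arguments are named: base R's modifyList() overlays the directly supplied arguments on top of the list, and do.call() leaves anything not mentioned to the function's own defaults. The name call_with_overrides is hypothetical.

```r
# Call f with arguments from a_list, letting named arguments in ... win.
call_with_overrides <- function(f, a_list = list(), ...) {
  args <- modifyList(as.list(a_list), list(...))  # ... overrides a_list
  do.call(f, args)                                # defaults fill the rest
}

fn <- function(x = 2, y = 3, z = 5) c(x = x, y = y, z = z)

arg_list <- list(x = 0, y = 1)
call_with_overrides(fn, arg_list, y = 7)
```

Note that modifyList() treats a NULL value as "remove this element", so this sketch cannot be used to pass an explicit NULL as an override.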

Setting conditional(ifelse) arguments in a function

I have a simple function and one of its arguments needs to be made conditional, in the sense:
IFELSE NOT MISSING take the given value
IFELSE NOT EXISTS then give some default value
ELSE give the global value of that argument
something like this:
f <- function(x,y=ifelse(!missing("y"),y,ifelse(!exists("y"),1,get("y",envir=.GlobalEnv))))
{
  assign("y",y,envir=.GlobalEnv)
  return(x+y)
}
required outputs:
# :f(3) should give me 4 with global y=1
# :f(4,2) should give me 6 with global y=2
# :f(5) should give me 7 with global y=2
#Note that y does not exist when we first run the function
Running the above function with f(3) gives me: Error in get("y", envir = .GlobalEnv) : object 'y' not found
If I specify y on the first go, the function works perfectly, but I want it to run even if the user doesn't specify the y argument on the first go.
Any suggestions?
Thanks.
EDIT:
@Richard I don't think it matters whether missing() is in the function body or in the argument list; in any case, try to do the same using this function:
f <- function(x,y)
{
  y=ifelse(!missing("y"),y,ifelse(!exists("y"),1,get("y",envir=.GlobalEnv)))
  assign("y",y,envir=.GlobalEnv)
  return(x+y)
}
First, I'll begin by stating that I don't recommend you do this. You'll be messing around with reassignment of global objects, which can be dangerous and lead to trouble if you need to go back and recall a value that you've overwritten.
Next, missing should not be used in the argument list of a function. It should be used in the function body to check the arguments in the argument list, and has specific usage that is noted in the help file.
That said, if you must do it, here you go.
f <- function(x, y)
{
  if(missing(y)) {
    y <- if(exists("y", envir = .GlobalEnv)) {
      get("y", envir = .GlobalEnv)
    } else {
      1L
    }
  }
  assign("y", y, .GlobalEnv)
  x + y
}
> rm(y)
> f(3)
#[1] 4
> f(4, 2)
#[1] 6
> f(5)
#[1] 7
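If the only reason for the global y is to remember the last value used, a closure can hold that state without touching .GlobalEnv. This is a different technique from the one above, sketched under that assumption; make_f, default_y and last_y are hypothetical names.

```r
# Build a function that remembers the most recently supplied y.
make_f <- function(default_y = 1) {
  last_y <- default_y          # private state, lives in the closure
  function(x, y) {
    if (!missing(y)) last_y <<- y  # update the remembered value
    x + last_y
  }
}

f <- make_f()
f(3)     # uses the default, 1
f(4, 2)  # remembers 2
f(5)     # reuses 2
```

The trade-off is that the remembered value is no longer visible as a global variable, which is usually the point.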
Try:
f <- function(x,y=NULL)
{
  y = ifelse( !is.null(y), y,
        ifelse(!exists("y"), 1,get("y",envir=.GlobalEnv)
        )
      )
  assign("y",y,envir=.GlobalEnv)
  return(x+y)
}
If y does not exist, try:
f <- function(x,y)
{
  if(missing("y") || !exists('y')) y = 1
  assign("y",y,envir=.GlobalEnv)
  x+y
}

Passing arguments to iterated function through apply

I have a function like this dummy-one:
FUN <- function(x, parameter){
  if (parameter == 1){
    z <- DO SOMETHING WITH "x"}
  if (parameter ==2){
    z <- DO OTHER STUFF WITH "x"}
  return(z)
}
Now, I would like to use the function on a dataset using apply.
The problem is that apply(data,1,FUN(parameter=1)) won't work, as FUN doesn't know what "x" is.
Is there a way to tell apply to call FUN with "x" as the current row/col?
You want apply(data,1,FUN,parameter=1). Note the ... in the function definition:
> args(apply)
function (X, MARGIN, FUN, ...)
NULL
and the corresponding entry in the documentation:
...: optional arguments to ‘FUN’.
You can make an anonymous function within the call to apply so that FUN will know what "x" is:
apply(data, 1, function(x) FUN(x, parameter = 1))
See ?apply for examples at the bottom that use this method.
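Both forms can be tried on a small matrix. In this sketch the placeholder bodies of FUN are filled in with sum() and prod() purely for illustration, and m is a made-up dataset:

```r
# Illustrative stand-in for the dummy function in the question.
FUN <- function(x, parameter) {
  if (parameter == 1) {
    z <- sum(x)
  }
  if (parameter == 2) {
    z <- prod(x)
  }
  return(z)
}

m <- matrix(1:6, nrow = 2)  # rows are (1, 3, 5) and (2, 4, 6)

# Extra arguments after FUN are forwarded to it through apply's ...
by_arg <- apply(m, 1, FUN, parameter = 1)

# Equivalent form with an anonymous function that fixes the parameter
by_closure <- apply(m, 1, function(x) FUN(x, parameter = 2))
```

Here by_arg holds the row sums and by_closure the row products; in both cases apply supplies the current row as x.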
Here's a practical example of passing arguments using the ... object and *apply. It's slick, and this seemed like an easy example to explain the use. An important point to remember is that anything passed through ... should be named, so R understands what you're trying to put where. For example, I could have called times <- fperform(longfunction, 10, noise = 5000), but leaving off noise = would have sent 5000 through ... unnamed, where it is matched by position to runtime()'s display argument instead of reaching longfunction. My personal style is to name all of the arguments if a ... is used, just to be safe.
You can see that the argument noise is defined in the call to fperform(FUN = longfunction, ntimes = 10, noise = 5000) but isn't used until two levels down: it is forwarded in the call diff <- rbind(c(x, runtime(FUN, ...))) and ultimately consumed in fun <- FUN(...).
# Made this to take up time
longfunction <- function(noise = 2500, ...) {
  lapply(seq(noise), function(x) {
    z <- noise * runif(x)
  })
}
# Takes a function and clocks the runtime
runtime <- function(FUN, display = TRUE, ...) {
  before <- Sys.time()
  fun <- FUN(...)
  after <- Sys.time()
  if (isTRUE(display)) {
    print(after-before)
  } else {
    after-before
  }
}
# Vectorizes runtime() to allow for multiple tests
fperform <- function(FUN, ntimes = 10, ...) {
  out <- sapply(seq(ntimes), function(x) {
    diff <- rbind(c(x, runtime(FUN, ...)))
  })
}
times <- fperform(FUN = longfunction, ntimes = 10, noise = 5000)
avgtime <- mean(times[2,])
print(paste("Average Time difference of ", avgtime, " secs", sep=""))
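To see why naming matters when an argument travels through ... over several levels, here is a stripped-down sketch of the same three-level forwarding. The functions inner, middle and outer are hypothetical stand-ins for longfunction, runtime and fperform, with the timing removed:

```r
inner  <- function(noise = 1, ...) noise               # final consumer
middle <- function(FUN, display = TRUE, ...) FUN(...)  # forwards the dots
outer  <- function(FUN, ntimes = 2, ...) middle(FUN, ...)

# Named: noise is not a formal of outer or middle, so it rides in ...
named <- outer(inner, 2, noise = 5)

# Unnamed: 5 is matched by position to middle's display argument,
# so inner never receives it and falls back to its default.
unnamed <- outer(inner, 2, 5)
```

With the name, inner receives noise = 5; without it, the value is silently absorbed one level up and inner returns its default of 1.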

Resources