R: check that mandatory/required arguments are provided

I'd like a way to check that all arguments without defaults are specified when calling a function in R. This seems sensible to me, as it avoids a function failing later on (possibly after a lot of processing) when a value is found to be missing.
One way to accomplish this would be to write an if (missing(arg)) statement for each argument, but this requires keeping the function's arguments and those statements consistent with each other, so I'm looking for a better solution.
At present I use the following function, which works in most cases, but not all.
# check for required arguments by getting arguments for the
# definition of the calling function and comparing to the arguments
# in the function call
check_required_args <- function () {
  def <- sys.function(-1)
  f_args <- formals(def)
  f_args <- f_args[sapply(f_args, is.name)] # remove arguments with defaults
  f_args <- names(f_args)
  f_args <- f_args[f_args != '...'] # remove ellipsis argument if present
  call <- match.call(definition=def, call=sys.call(-1))
  f_name <- call[1]
  c_args <- names(as.list(call[-1]))
  for(n in f_args) {
    if (!n %in% c_args) {
      stop("Argument '", n, "' missing from call to function ",
           f_name, "()", call.=FALSE)
    }
  }
}
f <- function(a, b, c=2) check_required_args()
f(a=1) # should fail (missing argument b)
f(2, 3) # should work
f(2, c=5) # should fail (missing argument b)
f(2, 3, 4) # should work
f <- function(a, b, ...) f2(a, b, ...)
f2 <- function(a, b, c, ...) check_required_args()
f2(a=1, b=2, c=3) # should work
f2(a=1, b=2) # should fail (missing argument c for function f2)
f(a=1, b=2, c=3) # should work
f(a=1, b=2) # should fail (missing argument c for function f2)
Can this function be improved to work in all of these cases? If not, is there a more appropriate way to perform this check?

Maybe this?
check_required_args <- function (fun = sys.function(-1), ncall = 3) {
  f_args <- formals(fun)
  f_args <- f_args[vapply(f_args, is.symbol, FUN.VALUE = TRUE)]
  f_args <- names(f_args)
  f_args <- setdiff(f_args, "...")
  test <- vapply(f_args,
                 function(x) missingArg(as.name(x), envir = parent.frame(ncall), eval = TRUE),
                 FUN.VALUE = TRUE)
  stopifnot(!any(test))
  return(invisible(NULL))
}
f <- function(a, b, c=2) {
  check_required_args()
  return("Hello!")
}
f(a=1) # should fail (missing argument b)
#Error: !any(test) is not TRUE
f(2, 3) # should work
#[1] "Hello!"
f(2, c=5) # should fail (missing argument b)
# Error: !any(test) is not TRUE
f(2, 3, 4) # should work
#[1] "Hello!"
x <- 1
f(a=x, 3)
#[1] "Hello!"
f <- function(a, b, ...) f2(a, b, ...)
f2 <- function(a, b, c, ...) {
  check_required_args()
  return("Hello!")
}
f2(a=1, b=2, c=3) # should work
#[1] "Hello!"
f2(a=1, b=2) # should fail (missing argument c for function f2)
#Error: !any(test) is not TRUE
f(a=1, b=2, c=3) # should work
#[1] "Hello!"
f(a=1, b=2) # should fail (missing argument c for function f2)
#Error: !any(test) is not TRUE
Edit:
You might want to use get to check for existence:
check_required_args <- function (fun = sys.function(-1), ncall = 3) {
  f_args <- formals(fun)
  f_args <- f_args[vapply(f_args, is.symbol, FUN.VALUE=TRUE)]
  f_args <- names(f_args)
  f_args <- setdiff(f_args, "...")
  test <- lapply(f_args,
                 function(x) {
                   get(x, envir = parent.frame(ncall), inherits = TRUE)
                   return(NULL)
                 })
  #possibly use a for loop instead
  #wrap in tryCatch for customized error messages
}
f <- function(a, b, ...) f2(a, b, ...)
f2 <- function(a, b, c, ...) {
  check_required_args()
  return("Hello!")
}
f(c=2)
#Error in get(x, envir = parent.frame(ncall), inherits = TRUE) :
# argument "a" is missing, with no default
If you don't want to check in enclosing frames, set inherits = FALSE.
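The two comments in the get-based version above (use a for loop, wrap in tryCatch) could be sketched roughly as follows. This is only a sketch under some assumptions: it drops the ncall parameter and instead uses sys.function(-1) and parent.frame() directly, so it has to be called straight from the body of the function being checked, and the wording of the error message is made up for illustration. Note that get() forces evaluation of each required argument.

```r
check_required_args <- function() {
  fun <- sys.function(-1)       # definition of the calling function
  caller_env <- parent.frame()  # frame in which its arguments live
  f_args <- formals(fun)
  # arguments whose formal is a bare symbol, as in the versions above
  required <- names(f_args)[vapply(f_args, is.symbol, FUN.VALUE = TRUE)]
  for (arg in setdiff(required, "...")) {
    tryCatch(
      get(arg, envir = caller_env, inherits = FALSE),
      error = function(e) {
        # illustrative message; the original error text is appended
        stop("Required argument '", arg, "' is not usable: ",
             conditionMessage(e), call. = FALSE)
      }
    )
  }
  invisible(NULL)
}

f <- function(a, b, c = 2) {
  check_required_args()
  "Hello!"
}
f(2, 3)     # passes the check
# f(a = 1)  # would stop with an error that names 'b'
```

Because the loop stops at the first failure, only the first missing argument is reported; collecting all failures before calling stop() would be a straightforward variation.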

Related

Pass a function as an another function argument

I have two functions, as for example:
a <- function(x) return(mean(x))
b <- function(x) return(median(x))
I would like to have another function that takes either a or b as an argument.
The goal is something like this:
oper <- function(f, x) {
  ifelse(f == "a", a(x), b(x))
}
If, for example, I execute the function:
oper(a, c(3,4,5))
I get the following error message:
Error in f == "a" :
comparison (1) is possible only for atomic and list types
Note: mean(x) and median(x) are just for example purposes.
Because R has first-class functions, you can simply pass your function and call it directly:
oper2 <- function(f, x) {
  f(x)
}
x <- c(2, 3, 8)
oper2(a, x)
# 4.333333
oper2(b, x)
# 3
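If the caller should also be able to give the function by name, as the f == "a" attempt in the question suggests, base R's match.fun() can resolve a string to a function. A minimal sketch, where oper3 is a hypothetical name chosen for this illustration:

```r
a <- function(x) mean(x)
b <- function(x) median(x)

# oper3 is a hypothetical variant of oper2 that accepts either a
# function object or the name of a function as a string
oper3 <- function(f, x) {
  f <- match.fun(f)  # returns f unchanged if it is already a function
  f(x)
}

x <- c(2, 3, 8)
oper3("a", x)
# 4.333333
oper3(b, x)
# 3
```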

Unpack dots provided from another function with missing named arguments

Similar to the question here. Given a function f with named arguments and a function g taking any number of arguments through ..., how would one handle a missing argument of f inside g?
f <- function(a)
  g(a = a)
g <- function(...)
  list(...)
f()
Error in g(a = a) : argument "a" is missing, with no default
Sadly, rlang::dots_list did not provide an answer either:
f2 <- function(a)
  h(a = a)
h <- function(...)
  rlang::dots_list(..., .ignore_empty = 'all')
f2()
Error in eval(expr, p) : argument "a" is missing, with no default
Edit:
To make the problem clearer: the function g may be called by a myriad of functions, and I'm looking for a way to handle the missing arguments within g and not f.
You can forward ... to subfunctions through multiple levels without evaluating it, as long as the subfunctions don't perform any evaluation themselves. So you don't have to handle this in every function that receives ..., but at the point where it is evaluated you will need to deal with it somehow.
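A small sketch of that point, using hypothetical helper names (count_dots, pass_along, entry) that are not part of the question: the dots travel through two levels and are only counted with ...length(), never evaluated, so the missing argument causes no error.

```r
# Hypothetical helper names, used only for this illustration.
count_dots <- function(...) ...length()      # counts the arguments, evaluates none
pass_along <- function(...) count_dots(...)  # forwards the dots untouched
entry <- function(a) pass_along(a = a)       # 'a' may be missing here

n <- entry()  # no error: the missing 'a' is never evaluated
print(n)
# [1] 1
```

Replacing ...length() with list(...) in count_dots would force the argument and bring back the "argument "a" is missing" error.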
Assuming that f() should return an empty list, handle the missing argument separately within g:
f <- function(a) g(a = a)
g <- function(..., default = list()) if (missing(..1)) default else list(...)
f()
## list()
or the following which checks each element of ... :
g <- function(..., default = list()) {
  L <- list()
  for(i in seq_len(...length())) {
    x <- try(eval.parent(list(...)[[i]]), silent = TRUE)
    L[[i]] <- if (inherits(x, "try-error")) default else x
  }
  names(L) <- names(substitute(alist(...))[-1])
  L
}
f()
## $a
## list()
or within f:
f <- function(a) if (missing(a)) g() else g(a = a)
g <- function(...) list(...)
f()
## list()
Your code seems to be OK, except that you call f() without an argument at the end. Try this:
f <- function(a)
  g(a = a)
g <- function(...)
  list(...)
f("example")
Or you have to provide a default value for a:
f <- function(a = "example")
  g(a = a)
g <- function(...)
  list(...)
f()
So the problem is not a missing argument in g(...), but a missing value for a in f() when g(a = a) is called.

How to pass additional parameters (stored in "...") to multiple cores for parLapply, in R?

I'm writing a function f1() that accepts additional parameters "..." to be passed to a function f0() inside f1(). More precisely, f0() is called inside a parallel lapply call inside f1(). It works fine as long as there is at least one parameter to pass, but if "..." is empty, I get the error message: "Error in get(name, envir = envir) : argument "..." is missing, with no default"
It works fine if I use lapply, instead of parLapply.
Is there a proper solution? I don't want to define all parameters for f0() explicitly in the definition of f1().
First, I define the function with the two optional parameters.
f0 <- function(a, b, d1 = NULL, d2 = NULL){
  if(is.null(d1)){
    ret <- a * b
  }else{
    ret <- a * b / d1
  }
  if(!is.null(d2)){
    ret <- ret - d2
  }
  ret
}
Next, the functions f1() and f1_par(), which do the same thing, one with sapply and the other with parSapply.
f1 <- function(A, ...){
  # A ... vector of a
  B <- rev(A)
  sapply(seq_along(A), function(i){
    f0(A[i], B[i], ...)
  })
}
f1_par <- function(A, ...){
  # A ... vector of a
  B <- rev(A)
  cl <- parallel::makeCluster(2)
  parallel::clusterExport(cl, envir = environment(), c("A", "B", "f0","..."))
  ret <- parallel::parSapply(cl, seq_along(A), function(i){
    f0(A[i], B[i], ...)
  })
  parallel::stopCluster(cl)
  ret
}
I get the right results for all of the following six function calls, except for the last one:
A <- 1:4
# sapply
f1(A, d1 = 2, d2 = 4)
f1(A, d1 = 2)
f1(A)
# parSapply
f1_par(A, d1 = 2, d2 = 4)
f1_par(A, d1 = 2)
f1_par(A) # this one causes the error
I'm pretty sure you cannot export ... that way. Instead, make sure to pass down ... as arguments as in:
f1 <- function(A, ...) {
  # A ... vector of a
  B <- rev(A)
  sapply(seq_along(A), function(i, ...) {
    f0(A[i], B[i], ...)
  }, ...)
}
Then, do the same with parallel::parSapply().
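One way the parSapply() variant might look is sketched below. It is an illustration rather than a tested recommendation: it assumes a two-worker local cluster as in the question, exports only f0 by name, and hands A and B to the worker function as explicit arguments (a choice made here for clarity, not something the answer prescribes). The dots are forwarded as trailing arguments of parSapply(), so nothing is looked up under the name "...".

```r
f0 <- function(a, b, d1 = NULL, d2 = NULL) {
  ret <- if (is.null(d1)) a * b else a * b / d1
  if (!is.null(d2)) ret <- ret - d2
  ret
}

f1_par <- function(A, ...) {
  B <- rev(A)
  cl <- parallel::makeCluster(2)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  # export only ordinary objects; "..." is not exported by name
  parallel::clusterExport(cl, "f0", envir = environment())
  # arguments after the function are passed on to every worker call
  parallel::parSapply(cl, seq_along(A), function(i, A, B, ...) {
    f0(A[i], B[i], ...)
  }, A = A, B = B, ...)
}

f1_par(1:4)  # empty dots no longer cause an error
# [1] 4 6 6 4
```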

Subsetting in a second level R function

Function foo1 can subset a list by a requested variable (e.g., by = type == 1). Otherwise, foo1 will simply output the inputted list itself.
For my purposes, I need to use foo1 within a new function called foo2.
In my code below, my desired output is obtained like so: foo2(data = D, by = G[[1]]) ; foo2(data = D, by = G[[2]]) ; foo2(data = D, by = G[[3]]).
But I wonder why I get the error shown below when I loop over G using lapply.
foo1 <- function(data, by){
  L <- split(data, data$study.name) ; L[[1]] <- NULL
  if(!missing(by)){
    L <- lapply(L, function(x) do.call("subset", list(x, by)))
  }
  return(L)
}
foo2 <- function(data, by){
  eval(substitute(foo1(data = data, by = by)))
}
## EXAMPLE OF USE:
D <- read.csv("https://raw.githubusercontent.com/izeh/i/master/k.csv", h = T) ## Data
G <- lapply(unique(na.omit(D$type)), function(i) bquote(type == .(i)))# all levels of `type`
foo2(data = D, by = G[[1]]) # Works fine without `lapply` :-)
lapply(1:3, function(i) foo2(data = D, by = G[[i]])) # Doesn't work with `lapply`! :-(
# Error in do.call("subset", list(x, by)) : object 'i' not found
Your foo2 function tries to evaluate the expression
foo1(data = D, by = G[[i]])
but it doesn't have i available. You need to evaluate G[[i]] in the anonymous function you're passing to lapply to get an expression defining the subset, and then evaluate that subset in foo2. I recommend naming that function instead of using an anonymous one; it makes debugging a lot easier.
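The effect can be seen in isolation with a small sketch. The helper capture_expr and the two-element G below are stand-ins invented for this illustration, not the question's data: substitute() records the argument expression G[[i]] as written, and the symbol i exists only inside the anonymous function.

```r
# capture_expr is a hypothetical helper that shows what substitute() sees
capture_expr <- function(by) substitute(by)

# small stand-in for the question's list of subsetting expressions
G <- list(quote(type == 1), quote(type == 2))

captured <- lapply(1:2, function(i) capture_expr(G[[i]]))
print(captured[[1]])
# G[[i]]
```

Both elements of captured are the same unevaluated call G[[i]], which is why evaluating it later, outside the anonymous function, fails with "object 'i' not found".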
Here's some recoding that appears to work:
Redefine foo2 to
foo2 <- function(data, by){
  by <- eval(by, envir = data)
  foo1(data = data, by = by)
}
and
foo3 <- function(i) {
  expr <- G[[i]]
  foo2(data = D, by = expr)
}
and then
lapply(1:3, foo3)
I'm not sure this does exactly what you want, but it should be close enough that you can fix it up.
Instead of lapply, a for loop can be used here:
lst1 <- vector("list", length(G))
for(i in 1:3) lst1[[i]] <- foo2(data = D, by = G[[i]])
Checking:
identical(lst1[[2]], foo2(data = D, by = G[[2]]))
#[1] TRUE
identical(lst1[[3]], foo2(data = D, by = G[[3]]))
#[1] TRUE
For the lapply part, there seems to be a conflict between the i of the anonymous function and the i used when G was created. If we use a different variable, say 'j',
lst2 <- lapply(1:3, function(j) foo1(data = D, by = G[[j]]))
should work
identical(lst2[[2]], lst1[[2]])
#[1] TRUE

Passing all arguments to another function

I'd like to be able to pass the current arguments of a function to another function without individually listing each of them. This is for a slightly more complex function that will have about 15 arguments, with potentially more added later (it's based on an API for data that might have more complex data added later):
f_nested <- function(a, b, ...) {
  c <- a + b
  return(c)
}
f_main <- function(a, b) {
  d <- do.call(f_nested, as.list(match.call(expand.dots = FALSE)[-1]))
  c <- 2 / d
  return(c)
}
f_main(2, 3)
#> [1] 0.4
sapply(2:4, function(x) f_main(x, 4))
#> Error in (function (a, b, ...) : object 'x' not found
Created on 2019-06-28 by the reprex package (v0.3.0)
The first call to f_main(2, 3) produces the expected result. However, when iterating over a vector of values with sapply an error arises that the object was not found. I suspect my match.call() use is not correct and I'd like to be able to iterate over my function.
I'll borrow from lm's use of match.call, replacing the first element with the next function. I think one key is to call eval with the parent.frame(), so that x will be resolved correctly.
# no change
f_nested <- function(a, b, ...) {
  c <- a + b
  return(c)
}
# changed, using `eval` instead of `do.call`, reassigning the function name
f_main <- function(a, b) {
  thiscall <- match.call(expand.dots = TRUE)
  thiscall[[1]] <- as.name("f_nested")
  d <- eval(thiscall, envir = parent.frame())
  c <- 2 / d
  return(c)
}
sapply(2:4, function(x) f_main(x, 4))
# [1] 0.3333333 0.2857143 0.2500000
As @MrFlick suggested, this can be shortened slightly with:
f_main <- function(a, b) {
  thiscall <- match.call(expand.dots = TRUE)
  thiscall[[1]] <- as.name("f_nested")
  d <- eval.parent(thiscall)
  c <- 2 / d
  return(c)
}
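To see why the evaluation frame matters, it may help to look at what match.call() actually records. In the sketch below, show_call is a hypothetical helper introduced only for this illustration: the recorded call holds the symbol x rather than its value, so it can only be evaluated where x is visible, which is the caller's frame.

```r
# show_call is a hypothetical helper that returns its own matched call
show_call <- function(a, b) match.call()

calls <- lapply(2:3, function(x) show_call(x, 4))
print(calls[[1]])
# show_call(a = x, b = 4)
```

The do.call() version in the question ends up evaluating such a call from inside f_main, where no x exists, whereas eval.parent() evaluates it in the anonymous function's frame, where x is defined.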
