Semi-automating argument validation for R functions - r

I would like the end-user functions in my R package (S3 style) to validate their arguments and give the user informative errors or warnings when a particular validity check fails.
The obvious (but tedious and unmaintainable) way to do this would be:
foo <- function(aa, bb, cc, dd) {
  if (length(aa) != 1) stop("The argument 'aa' must have a single value")
  if (!is.numeric(aa)) stop("The argument 'aa' must be numeric")
  if (!is.character(bb)) stop("The argument 'bb' must be a character")
  if (length(bb) >= 4 || length(bb) <= 2) stop("The argument 'bb' must be a vector with a length between 2 and 4")
  if (!is.recursive(cc)) stop("The argument 'cc' must be a list-like object")
  if (!is.integer(dd)) stop("The argument 'dd' must contain only integers")
  if (any(dd < aa)) stop("All values in the argument 'dd' must be greater than the value of argument 'aa'")
  ## ...and so on
}
I assume I'm far from the first person to need this. Can anybody suggest a package that automates all or part of such validation tasks? Failing that, are there concise, generic idioms that limit the ugliness to as few lines as possible within each function?
Thanks.

stopifnot might be close to what you're looking for, although the error messages won't be quite as nice:
foo <- function(x){
  stopifnot(length(x) == 1, is.numeric(x))
  return(x)
}
which gives
> foo(c(1,3))
Error: length(x) == 1 is not TRUE
> foo("a")
Error: is.numeric(x) is not TRUE
> foo(3)
[1] 3
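If nicer messages matter, stopifnot can carry them itself: since R 4.0.0, the name given to each expression is used as the error message. A minimal sketch applied to part of the question's foo, assuming R 4.0.0 or later (the reduced two-argument signature is only for illustration):

```r
foo <- function(aa, bb) {
  # Each name is the message shown when its expression is not TRUE (R >= 4.0.0)
  stopifnot(
    "The argument 'aa' must have a single value" = length(aa) == 1,
    "The argument 'aa' must be numeric"          = is.numeric(aa),
    "The argument 'bb' must be a character"      = is.character(bb)
  )
  invisible(TRUE)
}

foo(1, "a")  # passes silently
```

This keeps one line per check while still telling the user which argument is at fault.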

You can write a helper function like this (rudimentary example):
validate <- function(x, ...){
  for(s in c(...)) switch(s,
    lengthone = if(length(x)!=1) stop("argument has length != 1."),
    numeric = if(!all(is.numeric(x))) stop("non-numeric arguments."),
    positive = if(any(x <= 0)) stop("non-positive arguments."),
    nonmissing = if(any(is.na(x))) stop("Missing values in arguments.")
  )
}
Results:
> validate(1, "numeric", "positive")
> validate(0, "numeric", "positive")
Error in validate(0, "numeric", "positive") : non-positive arguments.
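To get the argument's own name into the message, as the question asks, the helper can capture the expression that was passed as x. The following is a minimal sketch, not a packaged solution: the name check_arg and the small table of checks are hypothetical and chosen only for illustration.

```r
# Hypothetical helper: runs named checks and reports the offending argument by name.
check_arg <- function(x, ...) {
  arg <- deparse(substitute(x))  # the expression the caller passed, e.g. "aa"
  checks <- list(
    lengthone  = list(ok = function(v) length(v) == 1, msg = "must have length 1"),
    numeric    = list(ok = function(v) is.numeric(v),  msg = "must be numeric"),
    positive   = list(ok = function(v) all(v > 0),     msg = "must be positive"),
    nonmissing = list(ok = function(v) !anyNA(v),      msg = "must not contain missing values")
  )
  for (s in c(...)) {
    chk <- checks[[s]]
    if (is.null(chk)) stop("unknown check '", s, "'", call. = FALSE)
    # isTRUE() treats an NA result as a failed check
    if (!isTRUE(chk$ok(x))) {
      stop("The argument '", arg, "' ", chk$msg, call. = FALSE)
    }
  }
  invisible(x)
}

foo <- function(aa) {
  check_arg(aa, "lengthone", "numeric", "positive")
  aa * 2
}

foo(3)  # 6
```

Each end-user function then needs one line per argument, and new checks are added in one place.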

Related

Conditional expression for '...' not working in function

I am trying to pass multiple if-else conditions in my function and so I am using dots because the function will take many arguments.
I am trying to return yes when the first element of dots is called as a list. For example (simple version):
testt <- function(...){
  ar <- list(...)
  val <- ar[[1]]
  if(is(val == 'list') == TRUE){
    print('yes')
  } else {print('no')}
}
testt(list(1))
>[1] "no"
Warning message:
In if (is(val == "list") == TRUE) { :
the condition has length > 1 and only the first element will be used
It won't accept that ar[[1]] is a list. How can I work around this? Additionally, is the warning trying to tell me why this is not working?
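A likely explanation, offered as a sketch rather than a definitive answer: is(val == 'list') first compares val with the string 'list' and then returns the classes of that logical result, which is a character vector with more than one element. That is what the length warning points at, and from R 4.2.0 on such a condition is an error. Assuming the intent is simply to test whether the first argument is a list, is.list() returns a single TRUE or FALSE. The name first_is_list is hypothetical:

```r
first_is_list <- function(...) {
  ar <- list(...)
  # is.list() returns a single TRUE/FALSE, which is what if() expects
  if (length(ar) > 0 && is.list(ar[[1]])) "yes" else "no"
}

first_is_list(list(1))  # "yes"
first_is_list(1)        # "no"
```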

How to check if vector is a single NA value without length warning and without suppression

I have a function whose argument defaults to NA; if it is not NA, it should be a character vector of any length. I have a check to validate this, but is.na produces the standard length warning when the input is a character vector with length greater than 1.
so_function <- function(x = NA) {
if (!(is.na(x) | is.character(x))) {
stop("This was just an example for you SO!")
}
}
so_function(c("A", "B"))
#> Warning in if (!(is.na(x) | is.character(x))) {: the condition has length >
#> 1 and only the first element will be used
One option I came up with to prevent the warning is to use identical:
so_function <- function(x = NA) {
if (!(identical(x, NA) | is.character(x))) {
stop("This was just an example for you SO!")
}
}
My issue here is that this function will generally be taking Excel sheet data loaded into R as inputs, and the NA values generated from that are often NA_character_, NA_integer_, and NA_real_, so identical(x, NA) is often FALSE when I actually need it to be TRUE.
For the broader context, I am experiencing this issue for S3 classes I am creating for a package, and the function below approximates how I am validating multiple attributes for that class, which is when the warnings are appearing. Because of this, I am trying to avoid suppressing warnings as the solution, so would be interested to know what best practice exists to solve this issue.
Edit
In order to make use cases clearer, this is validating attributes for a class, where I want to ensure the attribute is either a single NA value, or a character vector of any length:
so_function(NA_character_) # should pass
so_function(NA_integer_) # should pass
so_function(c(NA, NA)) # should fail
so_function(c("A", "B")) # should pass
so_function(c(1, 2, 3)) # should fail
The length warning comes from combining if, which expects a length-1 condition, with is.na, which is vectorised.
You could wrap is.na in any or all to collapse it to a length-1 value, but there may be edge cases where that doesn't behave as you expect. I would instead use short-circuit evaluation, so that is.na is only evaluated when x has length 1:
so_function <- function(x = NA) {
if (!((length(x)==1 && is.na(x)) | is.character(x))) {
stop("This was just an example for you SO!")
}
}
so_function(NA_character_) # should pass
so_function(NA_integer_) # should pass
so_function(c(NA, NA)) # should fail
Error in so_function(c(NA, NA)) : This was just an example for you SO!
so_function(c("A", "B")) # should pass
so_function(c(1, 2, 3)) # should fail
Error in so_function(c(1, 2, 3)) : This was just an example for you SO!
Another option is to use NULL as the default value instead.
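The short-circuit check can also be factored into a small predicate, which keeps the if condition readable when several attributes are validated the same way. A minimal sketch; the name is_na_or_character and the error message are hypothetical:

```r
# Hypothetical predicate: TRUE for a single NA of any type, or for any character vector.
is_na_or_character <- function(x) {
  # || short-circuits, so is.na() is only reached when x has length 1
  is.character(x) || (length(x) == 1 && is.na(x))
}

so_function <- function(x = NA) {
  if (!is_na_or_character(x)) {
    stop("x must be a single NA or a character vector")
  }
  invisible(TRUE)
}

so_function(NA_integer_)   # passes
so_function(c("A", "B"))   # passes
```

Because every branch of the predicate yields a single logical value, if never sees a condition of length greater than 1.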
I don't think the problem arises from is.na() itself: it is a vectorized function, so it produces a vector as output. is.character(x), on the other hand, is not vectorized, so it only outputs a single value.
You can use apply-like functions to test each element, e.g. (using a list here, because c("a", NA, 5) would coerce every element to character):
sapply(list("a", NA, 5), is.character)
if likewise expects a single value; for element-wise comparison you are better off using ifelse.
I don't think I quite grasped what you want to do with your function, but it could be rewritten like this:
so_function_2 <- function(x = NA) {
condit <- !(is.na(x) | sapply(x, is.character))
ifelse(condit, "This was just an example for you SO!", "FALSE")
}

How to fix "multi-argument returns are not permitted" when making function

To learn how to create functions, I am trying to make one that calculates averages, with three different error codes. However, I get two different error messages when running this code.
If I try avg(5), that is, the average of just one number, I get "argument "no" is missing, with no default". And when trying avg("f"), that is, something that is not a number, I get the error "multi-argument returns are not permitted".
What I want is for it to state that it needs several numbers if just one is given, and that the argument must be numeric if a character is given. I believe the second problem could be solved by some kind of "halt" command, but my (probably horrible) googling hasn't led me to anything like that.
I appreciate all help, and thanks in advance!
avg <- function(x){
ifelse(class(x) == "numeric" & length(x)>1,
return(sum(x)/length(x)),
ifelse(class(x)!= "numeric",
return("Need to be numeric",
ifelse(length(x) <= 1,
return("Need more than one number"),
return("Unknown error")))))
}
This is just to show you that the problem is your inappropriate use of ifelse. It should only be used when the condition has length > 1. Otherwise, you should (and in this specific case must) use if and else:
avg <- function(x){
  if (class(x) == "numeric" & length(x)>1)
    return(sum(x)/length(x)) else
  if (class(x)!= "numeric")
    return("Need to be numeric") else
  if (length(x) <= 1)
    return("Need more than one number") else
    return("Unknown error")
}
avg(5)
#[1] "Need more than one number"
avg("f")
#[1] "Need to be numeric"
avg(c(1.5, 1.6))
#[1] 1.55
There are other issues here:
You should not return these messages. Instead you should create an error (using stop).
You should use is.numeric(x) instead of class(x) == "numeric". The former will be TRUE for integers, the latter won't.
The else is not actually needed if you return or stop if the condition is TRUE.
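Taken together, those three points suggest a version along these lines. This is a sketch; the exact wording of the messages is an assumption:

```r
avg <- function(x) {
  # stop() raises an error and exits the function, so no else branches are needed
  if (!is.numeric(x)) stop("'x' must be numeric")
  if (length(x) <= 1) stop("'x' must contain more than one number")
  sum(x) / length(x)
}

avg(c(1.5, 1.6))  # 1.55
avg(1:4)          # 2.5, integers are accepted because is.numeric(1:4) is TRUE
```

Calling avg(5) or avg("f") now stops with the corresponding error instead of returning a string that the caller might mistake for a result.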
Here is your code, without simplification, fixed so that it works as is:
avg <- function(x){
ifelse(
class(x) == "numeric" & length(x)>1,
return(sum(x)/length(x)),
ifelse(
class(x)!= "numeric",
return("Need to be numeric"),
ifelse(length(x) <= 1,
return("Need more than one number"),
return("Unknown error")
)
)
)
}
avg(numeric())
avg(1)
avg(c(1, 2))
avg("f")
avg(c(NA, 1))
Please have a look at @Roland's answer to improve code quality.

Default argument to function only if condition is met

I am currently writing a function that I want to pass default arguments if a condition is met. If the condition is not met, no argument should be passed. How can I achieve this?
I tried it with ifelse and NULL like in this minimal example but it did not work:
my_function <- function(.data,
.variable = ifelse("var1" %in% names(.data), "var1", NULL)){
...
}
If "var1" is not a variable name in .data and I don't pass another value for .variable, I want to get an error like "argument ".variable" is missing, with no default". My solution works but I get other error messages.
It seems that ifelse doesn't like having NULL as the response in the case the condition is FALSE:
ifelse(2 < 1, 1, NULL)
# Error in ans[!test & ok] <- rep(no, length.out = length(ans))[!test & :
# replacement has length zero
# In addition: Warning message:
# In rep(no, length.out = length(ans)) :
# 'x' is NULL so the result will be NULL
It seems to come from the fact that ifelse returns
A vector of the same length and attributes (including dimensions and
"class") as test and data values from the values of yes or no.
and
If yes or no are too short, their elements are recycled.
Seeing rep in the error message, together with the fact that length(NULL) is zero, seems to be good evidence. So, instead you may want to use, e.g.,
my_function <- function(.data, .variable = if("var1" %in% names(.data)) "var1" else NULL)
is.null(.variable)
my_function("1")
# [1] TRUE
See ?ifelse for other warnings.
I would suggest not doing it directly in the default argument, but at the start of the function, with something along the lines of:
my_function <- function(.data,
                        .variable = NULL) {
  if (is.null(.variable)) {
    if ("var1" %in% names(.data)) {
      .variable = "var1"
    } else {
      stop(".variable undefined with no suitable default")
    }
  }
  ...
}
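To see that pattern run end to end, here is a self-contained sketch. The name pick_variable and its body (returning the chosen column) are hypothetical stand-ins for the elided part of the function:

```r
# pick_variable is a hypothetical stand-in; the original function body is not shown.
pick_variable <- function(.data, .variable = NULL) {
  if (is.null(.variable)) {
    if ("var1" %in% names(.data)) {
      .variable <- "var1"
    } else {
      stop(".variable undefined with no suitable default")
    }
  }
  .data[[.variable]]
}

pick_variable(data.frame(var1 = 1:3))                # falls back to "var1"
pick_variable(data.frame(a = 4:6), .variable = "a")  # explicit choice
```

With a data frame that has no var1 column and no explicit .variable, the call stops with the custom error rather than the obscure one from ifelse.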

Better error message for stopifnot?

I am using stopifnot and I understand it just reports the first expression that was not TRUE. If that is some freaky dynamic expression, someone who is not familiar with the custom function cannot really make sense of it. So I would love to add a custom error message. Any suggestions?
Error: length(unique(nchar(check))) == 1 is not TRUE
Basically states that the elements of the vector check do not have the same length.
Is there a way of saying: Error: Elements of your input vector do not have the same length!?
Use stop and an if statement:
if(length(unique(nchar(check))) != 1)
stop("Error: Elements of your input vector do not have the same length!")
Just remember that stopifnot states the condition that must hold, so the condition in your if needs to be the negation of your stopifnot condition.
This is what the error message looks like:
> check = c("x", "xx", "xxx")
> if(length(unique(nchar(check))) != 1)
+ stop("Error: Elements of your input vector do not have the same length!")
Error in eval(expr, envir, enclos) :
Error: Elements of your input vector do not have the same length!
A custom message can be added as a name (label) for your expression (this works since R 4.0.0):
stopifnot("Elements of your input vector do not have the same length!" =
length(unique(nchar(check))) == 1)
# Error: Elements of your input vector do not have the same length!
The assertive and assertthat packages have more readable check functions.
library(assertthat)
assert_that(length(unique(nchar(check))) == 1)
## Error: length(unique(nchar(check))) == 1 are not all true.
library(assertive)
assert_is_scalar(unique(nchar(check)))
## Error: unique(nchar(check)) does not have length one.
if(!is_scalar(unique(nchar(check))))
{
stop("Elements of check have different numbers of characters.")
}
## Error: Elements of check have different numbers of characters.
Or you could package it up.
assert <- function (expr, error) {
if (! expr) stop(error, call. = FALSE)
}
So you have:
> check = c("x", "xx", "xxx")
> assert(length(unique(nchar(check))) == 1, "Elements of your input vector do not have the same length!")
Error: Elements of your input vector do not have the same length!
What about embedding the stopifnot in tryCatch and then re-raising the exception with stop, using a customized message?
Something like:
tryCatch(stopifnot(...), error = function(e) stop("Your customized error message"))
Unlike some other solutions, this does not require additional packages. Compared with an if statement combined with stop, you retain the performance advantages of stopifnot in recent R versions: since R version 3.5.0, stopifnot evaluates its expressions sequentially and stops on the first failure.
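A runnable sketch of this tryCatch approach, wrapped in a hypothetical function check_same_length so that it can be reused. Note that the error handler has to be a function taking the condition object:

```r
# check_same_length is a hypothetical wrapper used only for illustration.
check_same_length <- function(check) {
  tryCatch(
    stopifnot(length(unique(nchar(check))) == 1),
    # Replace the default stopifnot message with a custom one
    error = function(e) {
      stop("Elements of your input vector do not have the same length!", call. = FALSE)
    }
  )
  invisible(check)
}

check_same_length(c("ab", "cd"))  # passes silently
# check_same_length(c("x", "xx", "xxx")) would raise the custom error
```

One caveat of this design is that any error raised while evaluating the condition, not only a failed check, is replaced by the custom message.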
I would recommend you check out Hadley's testthat package. It allows for intuitive testing: the names of the functions are great and the way you write them is like a sentence -- "I expect that length(unique(nchar(check))) is [exactly|approximately] 1". The errors produced are informative.
See here:
http://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf
In your case,
> library(testthat)
> check = c("x", "xx", "xxx")
> expect_that(length(unique(nchar(check))), equals(1))
Error: length(unique(nchar(check))) not equal to 1
Mean relative difference: 2
Also note that you don't have the problem that @Andrie referenced, of sometimes having to think about double negatives with stopifnot. I know it seems simple, but it caused me many headaches!
The answers already provided are quite good, and mine is just an addition to that collection. For some people it may be more convenient to have a one-liner in the form of the following function:
stopifnotinform <- function(..., message = "err") {
  args <- list(...)
  if (length(args) == 0) {
    return(NULL)
  }
  for (i in 1:length(args)) {
    result <- args[[i]]
    if (is.atomic(result) && result == FALSE) {
      stop(message)
    }
  }
}
# throws an error
stopifnotinform(is.integer(1L), is.integer(2), message = "Some number(s) provided is not an integer")
# continues with execution
stopifnotinform(is.integer(1L), is.integer(2L), message = "Some number(s) provided is not an integer")
Bear in mind that this solution provides you with only one (common) error message for all parameters in ....
Try this:
same_length <- FALSE
stopifnot("Elements of your input vector do not have the same length!" = same_length)
#> Error : Elements of your input vector do not have the same length!
