Default argument to function only if condition is met - r

I am currently writing a function that I want to pass default arguments if a condition is met. If the condition is not met, no argument should be passed. How can I achieve this?
I tried it with ifelse and NULL like in this minimal example but it did not work:
my_function <- function(.data,
.variable = ifelse("var1" %in% names(.data), "var1", NULL)){
...
}
If "var1" is not a variable name in .data and I don't pass another argument for .variable, I want to get an error like "argument ".variable" is missing, with no default". My attempt works when the condition is met, but otherwise I get different error messages.

It seems that ifelse doesn't like having NULL as the response in the case the condition is FALSE:
ifelse(2 < 1, 1, NULL)
# Error in ans[!test & ok] <- rep(no, length.out = length(ans))[!test & :
# replacement has length zero
# In addition: Warning message:
# In rep(no, length.out = length(ans)) :
# 'x' is NULL so the result will be NULL
It seems to come from the fact that ifelse returns
A vector of the same length and attributes (including dimensions and
"class") as test and data values from the values of yes or no.
and
If yes or no are too short, their elements are recycled.
Seeing rep in the error message, together with the fact that length(NULL) is zero, is good evidence for this. So, instead you may want to use, e.g.,
my_function <- function(.data, .variable = if ("var1" %in% names(.data)) "var1" else NULL) {
  is.null(.variable)
}
my_function("1")
# [1] TRUE
See ?ifelse for other warnings.

I would suggest not doing it directly in the default argument, but at the start of the function, with something along the lines of:
my_function <- function(.data,
.variable = NULL) {
if (is.null(.variable)) {
if ("var1" %in% names(.data)) {
.variable = "var1"
} else {
stop(".variable undefined with no suitable default")
}
}
...
}
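Because R evaluates default arguments lazily, inside the function's own environment, the default expression can also raise the error itself. A minimal sketch, assuming the function only needs to pull one column out of .data (the body and the exact error text are illustrative, not taken from the question):

```r
# Sketch: the default for .variable is only evaluated when .variable is first
# used, so it can see .data and can call stop() when no fallback exists.
my_function <- function(.data,
                        .variable = if ("var1" %in% names(.data)) {
                          "var1"
                        } else {
                          stop("argument \".variable\" is missing, with no default")
                        }) {
  .data[[.variable]]  # illustrative body: return the selected column
}

my_function(data.frame(var1 = 1:3))     # falls back to "var1"
my_function(data.frame(x = 4:6), "x")   # explicit argument, default never evaluated
# my_function(data.frame(x = 4:6))      # would stop with the message above
```

This keeps the fallback logic in the signature, at the cost of a longer default expression than the NULL-plus-check version above.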

Related

Conditional expression for '...' not working in function

I am trying to pass multiple if-else conditions in my function and so I am using dots because the function will take many arguments.
I am trying to return yes when the first element of dots is called as a list. For example (simple version):
testt <- function(...){
ar <- list(...)
val <- ar[[1]]
if(is(val == 'list') == TRUE){
print('yes')
} else {print('no')}
}
testt(list(1))
>[1] "no"
Warning message:
In if (is(val == "list") == TRUE) { :
the condition has length > 1 and only the first element will be used
It won't accept that ar[[1]] is a list. How can I work around this? Additionally, is the warning trying to tell me something about why this is not working?
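One possible fix, sketched under the assumption that the goal is simply to test whether the first argument passed through ... is a list. is() takes the object and the class name as two separate arguments, whereas is(val == 'list') passes a single logical value and returns a vector of class names, which is why the condition has length greater than 1. The base R function is.list() performs the test directly:

```r
# Sketch of a corrected version; returns the string instead of printing it.
testt <- function(...) {
  ar <- list(...)
  # Guard against an empty call before indexing the first element
  if (length(ar) > 0 && is.list(ar[[1]])) "yes" else "no"
}

testt(list(1))  # "yes"
testt(1)        # "no"
```

Note that is.list() is also TRUE for data frames, so a stricter check would be needed if those should be excluded.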

How to include a not-yet indexed parameter into if-else

I have the following function:
foo <- function(...){
dots <- list(...)
response <- dots[[1]]
if(is(dots[[2]],'list') == TRUE){print('yes')} else print('no')
}
This produces the following output:
foo('yes'):
Error in dots[[2]] : subscript out of bounds
How can I use a 'not-yet' indexed parameter so that the function can branch on whether it's TRUE or FALSE? For example, when it's TRUE I would do some stuff based on this; otherwise, when it's FALSE, the part of the function that uses it won't run.
However, R wants me to at least index dots with some list values.
For example, If I wanted to use just:
foo('yes')
>Error in dots[[2]] : subscript out of bounds
#otherwise
foo('yes',c('some','list'))
>'yes'
I want to be able to run foo('yes') and for it to print no. Essentially, some parameters won't get used in the function, and so in this case when it's not assigned anything then run the else statement.
Picking up on @Rui Barradas's and @Allan Cameron's comments, I can get the same behaviour as function(pred=NULL,...) by using:
foo <- function(...){
dots <- list(...)
response <- dots[[1]]
print(response)
if(length(dots) > 1){
if(is(dots[[2]],'list') == TRUE){
print('yes')
} else print('no')
} else if (length(dots) == 1){
dots[[2]] = NULL
}
}
Results:
> foo('yes',list(1, 2, 3))
[1] "yes"
> foo('yes')
[1] "yes"
Are there any cleaner alternatives to this that reduce the amount of code? My approach produces quite some clutter. The only issue I have with this is that If I wanted dots[[3]], I would have to implement further conditionals to access this or set it to NULL.
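A shorter variant, as a sketch: naming the optional argument and giving it a NULL default avoids indexing into dots at all. The argument name pred is taken from the function(pred=NULL,...) form mentioned above; returning the string instead of printing it is an assumption made here to keep the example easy to check.

```r
foo <- function(response, pred = NULL, ...) {
  # is.list(NULL) is FALSE, so an omitted pred falls through to "no"
  if (is.list(pred)) "yes" else "no"
}

foo("yes", list(1, 2, 3))  # "yes"
foo("yes")                 # "no"
```

A third optional value would become another named argument with its own NULL default, rather than another layer of length(dots) checks.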

Trying to understand error length of 'dimnames' [2] not equal to array extent in R data.frame

Here is my original function:
best <- function(state=NULL, outcome){
colNum <- list("heart attack" = 11, "heart failure" = 17, "pneumonia" = 23)[outcome]
if (!is.null(colNum[[1]])){
df <- read.csv("outcome-of-care-measures.csv", colClasses = "character")[,c(2,7,colNum[[1]])]
df <- df[df[2] == state & df[3] != "Not Available",]
if(nrow(df)==0){stop("invalid state")}
df
} else {stop("invalid outcome")}
}
When I call best(outcome = "heart attack") I get the following error:
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
length of 'dimnames' [2] not equal to array extent
Debug shows the error being thrown on this line:
df <- df[df[2] == state & df[3] != "Not Available",]
However, when I change this line to df <- df[df[[2]] == state & df[[3]] != "Not Available",] the code runs correctly.
Alternatively, if I change the default value of state to "NULL" instead of NULL, I also don't get an error.
I believe the problem has something to do with the fact that length(NULL) == 0, but what I don't quite understand is why the problem goes away when I use double square brackets here: df <- df[df[[2]] == state & df[[3]] != "Not Available",]. I was also under the assumption that it was good practice to use NULL as a default argument value, but this suggests not?
The following function I wrote purely to recreate the error without the CSV so feel free to use:
best2 <- function(state=NULL, outcome=NULL){
df <- data.frame(hospital = c("H1", "H2", "H3"), state = c("NY","NY","CA"), mortality = c("Not Available", "14.1", "16.2"))
df <- df[df[2] == state & df[3] != "Not Available",]
df
}
If you call best2() with single square brackets it throws the same Error in matrix but with double square brackets it throws a warning Warning message: In is.na(e2) : is.na() applied to non-(list or vector) of type 'NULL' but runs.
So my questions are:
1) Please can someone explain how the error is occurring?
2) Is it bad practice to use 'NULL' as default value?
3) What is the difference between df[2] and df[[2]] here? According to class(), df[2] is a data.frame and df[[2]] is a character vector, but I'm confused about why they both work, why one affects the aforementioned error, and which is best practice to use.
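The difference between the two forms can be seen without the CSV. A sketch, assuming the toy data frame from best2 and a hypothetical variant best3 that rejects a NULL state up front: df[2] is a one-column data frame, df[[2]] is the underlying vector, and comparing a vector with NULL yields a zero-length logical, so the double-bracket version selects no rows rather than failing.

```r
df <- data.frame(
  hospital  = c("H1", "H2", "H3"),
  state     = c("NY", "NY", "CA"),
  mortality = c("Not Available", "14.1", "16.2"),
  stringsAsFactors = FALSE
)

class(df[2])             # "data.frame"
class(df[[2]])           # "character"
length(df[[2]] == NULL)  # 0, because comparing with NULL gives logical(0)

# best3 is a hypothetical name; it validates state before subsetting.
best3 <- function(state = NULL) {
  if (is.null(state)) stop("state must be supplied")
  keep <- df[["state"]] == state & df[["mortality"]] != "Not Available"
  df[keep, , drop = FALSE]
}

best3("NY")  # one row: H2
```

Under this reading, NULL is a reasonable default as long as the function checks for it before using the value in a comparison.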

How to check if vector is a single NA value without length warning and without suppression

I have a function with NA as a default, but if not NA should be a character vector not restricted to size 1. I have a check to validate these, but is.na produces the standard warning when the vector is a character vector with length greater than 1.
so_function <- function(x = NA) {
if (!(is.na(x) | is.character(x))) {
stop("This was just an example for you SO!")
}
}
so_function(c("A", "B"))
#> Warning in if (!(is.na(x) | is.character(x))) {: the condition has length >
#> 1 and only the first element will be used
An option to prevent the warning I came up with was to use identical:
so_function <- function(x = NA) {
if (!(identical(x, NA) | is.character(x))) {
stop("This was just an example for you SO!")
}
}
My issue here is that this function will generally be taking Excel sheet data loaded into R as inputs, and the NA values generated from that are often NA_character_, NA_integer_, and NA_real_, so identical(x, NA) is often FALSE when I actually need it to be TRUE.
For the broader context, I am experiencing this issue for S3 classes I am creating for a package, and the function below approximates how I am validating multiple attributes for that class, which is when the warnings are appearing. Because of this, I am trying to avoid suppressing warnings as the solution, so would be interested to know what best practice exists to solve this issue.
Edit
In order to make use cases clearer, this is validating attributes for a class, where I want to ensure the attribute is either a single NA value, or a character vector of any length:
so_function(NA_character_) # should pass
so_function(NA_integer_) # should pass
so_function(c(NA, NA)) # should fail
so_function(c("A", "B")) # should pass
so_function(c(1, 2, 3)) # should fail
The length warning comes from the use of if, which expects a length 1 vector, and is.na which is vectorised.
You could use any or all around the is.na to compress it to a length 1 vector, but there may be edge cases where it doesn't work as you expect, so I would use short-circuit evaluation to check that x has length 1 before the is.na check:
so_function <- function(x = NA) {
if (!((length(x)==1 && is.na(x)) | is.character(x))) {
stop("This was just an example for you SO!")
}
}
so_function(NA_character_) # should pass
so_function(NA_integer_) # should pass
so_function(c(NA, NA)) # should fail
Error in so_function(c(NA, NA)) : This was just an example for you SO!
so_function(c("A", "B")) # should pass
so_function(c(1, 2, 3)) # should fail
Error in so_function(c(1, 2, 3)) : This was just an example for you SO!
Another option is to use NULL as the default value instead.
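A sketch of the NULL-default variant, with the single-NA check kept so that NA values coming from spreadsheet imports still pass. The name validate_attr and the error text are hypothetical; || and && are used because each operand is a single TRUE or FALSE.

```r
validate_attr <- function(x = NULL) {
  ok <- is.null(x) ||                 # default: nothing supplied
    is.character(x) ||                # character vector of any length
    (length(x) == 1L && is.na(x))     # a single NA of any type
  if (!ok) stop("x must be NULL, a single NA, or a character vector")
  invisible(TRUE)
}

validate_attr()               # passes
validate_attr(NA_integer_)    # passes
validate_attr(c("A", "B"))    # passes
# validate_attr(c(NA, NA))    # would fail
# validate_attr(c(1, 2, 3))   # would fail
```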
I don't think the problem arises from is.na() - it is a vectorized function which produces a vector as an output. is.character(x) on the other hand is not vectorized so it only will output a single value.
You can leverage apply-like functions to overcome this e.g.
sapply(c("a", NA, 5), is.character)
if behaves similarly, in that it expects a single value - you are better off using ifelse for by-element comparison.
I don't think I quite grasped what you want to do with your function, but it could be rewritten like this:
so_function_2 <- function(x = NA) {
condit <- !(is.na(x) | sapply(x, is.character))
ifelse(condit, "This was just an example for you SO!", "FALSE")
}

Semi-automating argument validation for R functions

I would like the end-user functions in my R package (S3 style) to validate their arguments and give the user informative errors or warnings when a particular validity check fails.
The obvious (but tedious and unmaintainable) way to do this would be:
foo<-function(aa,bb,cc,dd){
if(length(aa)!=1) stop("The argument 'aa' must have a single value");
if(!is.numeric(aa)) stop("The argument 'aa' must be numeric");
if(!is.character(bb)) stop("The argument 'bb' must be a character");
if(length(bb)>4||length(bb)<2) stop("The argument 'bb' must be a vector with a length between 2 and 4");
if(!is.recursive(cc)) stop("The argument 'cc' must be a list-like object");
if(!is.integer(dd)) stop("The argument 'dd' must contain only integers");
if(any(dd<aa)) stop("All values in the argument 'dd' must be greater than the value of argument 'aa'");
## ...and so on
}
I'm assuming that I'm by far not the first one to do this. So, can anybody suggest a package that automates all or part of such validation tasks? Or, failing that, some concise, generic idioms that will limit the ugliness to as few lines as possible within each function?
Thanks.
stopifnot might be similar to what you're looking for. The error messages won't be quite as nice though
foo <- function(x){
stopifnot(length(x) == 1, is.numeric(x))
return(x)
}
which gives
> foo(c(1,3))
Error: length(x) == 1 is not TRUE
> foo("a")
Error: is.numeric(x) is not TRUE
> foo(3)
[1] 3
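Since R 4.0.0, stopifnot() also accepts named arguments and uses the name as the error message, which goes some way towards nicer messages. A minimal sketch (foo2 is a hypothetical name):

```r
foo2 <- function(x) {
  stopifnot(
    "'x' must have length 1" = length(x) == 1,
    "'x' must be numeric"    = is.numeric(x)
  )
  x
}

foo2(3)          # returns 3
# foo2(c(1, 3))  # would stop with: 'x' must have length 1
# foo2("a")      # would stop with: 'x' must be numeric
```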
You can write a helper function like this (rudimentary example):
validate <- function(x, ...){
for(s in c(...)) switch(s,
lengthone = if(length(x)!=1) stop("argument has length != 1."),
numeric = if(!all(is.numeric(x))) stop("non-numeric arguments."),
positive = if(any(x <= 0)) stop("non-positive arguments."),
nonmissing = if(any(is.na(x))) stop("Missing values in arguments.")
)
}
Results:
> validate(1, "numeric", "positive")
> validate(0, "numeric", "positive")
Error in validate(0, "numeric", "positive") : non-positive arguments.
