alternative to "!is.null()" in R - r

My R code ends up containing a plethora of statements of the form:
if (!is.null(aVariable)) {
do whatever
}
But this kind of statement is hard to read because it contains two negations. I would prefer something like:
if (is.defined(aVariable)) {
do whatever
}
Does an is.defined-type function, one that does the opposite of is.null, exist as standard in R?
cheers,
yannick

You may be better off working out what value type your function or code accepts, and asking for that:
if (is.integer(aVariable))
{
do whatever
}
This may be an improvement over !is.null(), because it provides type checking. On the other hand, it may reduce the genericity of your code.
Alternatively, just make the function you want:
is.defined = function(x)!is.null(x)

If it's just a matter of easy reading, you could always define your own function:
is.not.null <- function(x) !is.null(x)
You can then use it throughout your program.
is.not.null(3)
is.not.null(NULL)

Ian put this in the comment, but I think it's a good answer:
if (exists("aVariable"))
{
do whatever
}
Note that the variable name is quoted.
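A quick sketch of how exists() differs from !is.null(), since the two are not interchangeable. The name `noSuchVariable` is an assumption: it is taken to be undefined in the session.

```r
aVariable <- NULL

# exists() only asks whether the name is bound to anything at all,
# so a variable that holds NULL still counts as existing.
existsResult <- exists("aVariable")        # TRUE
nullResult   <- is.null(aVariable)         # TRUE

# A name that was never assigned (assumed undefined here).
missingResult <- exists("noSuchVariable")  # FALSE
```

So exists() answers "is this name defined?", while is.null() answers "is the value NULL?".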

I have also seen:
if(length(obj)) {
# do this if object has length
# NULL has no length
}
I don't think it's great though, because some vectors can have length 0, such as character(0), logical(0) and integer(0), and these would be treated like NULL instead of raising an error.
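The point can be checked directly. A minimal sketch, using a made-up variable `empty`:

```r
empty <- character(0)

nullLength  <- length(NULL)    # 0
emptyLength <- length(empty)   # 0 as well
emptyIsNull <- is.null(empty)  # FALSE: an empty vector is not NULL

# if (length(obj)) therefore skips both NULL and empty vectors,
# whereas if (!is.null(obj)) skips only NULL.
```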

To handle undefined variables as well as nulls, you can use substitute with deparse:
nullSafe <- function(x) {
if (!exists(deparse(substitute(x))) || is.null(x)) {
return(NA)
} else {
return(x)
}
}
nullSafe(my.nonexistent.var)

The shiny package provides the convenient functions validate() and need() for checking that variables are both available and valid. need() evaluates an expression. If the expression is not valid, then an error message is returned. If the expression is valid, NULL is returned. One can use this to check if a variable is valid. See ?need for more information.
I suggest defining a function like this:
is.valid <- function(x) {
require(shiny)
is.null(need(x, message = FALSE))
}
This function is.valid() will return FALSE if x is FALSE, NULL, NA, NaN, an empty string "", an empty atomic vector, a vector containing only missing values, a logical vector containing only FALSE, or an object of class try-error. In all other cases, it returns TRUE.
That means need() (and is.valid()) cover a really broad range of failure cases. Instead of writing:
if (!is.null(x) && !is.na(x) && !is.nan(x)) {
...
}
one can write simply:
if (is.valid(x)) {
...
}
With the check for class try-error, it can even be used in conjunction with a try() block to silently catch errors: (see https://csgillespie.github.io/efficientR/programming.html#communicating-with-the-user)
bad = try(1 + "1", silent = TRUE)
if (is.valid(bad)) {
...
}

Related

R - Writing a function to return binary output using if statement

Good day,
I am a beginner and trying to understand why I am getting the error below.
I am trying to create a function that returns 0 or 1 based on column values in a data set.
LT = function(Lost.time) {
For (i in 1:dim(df)) {
if (df$Lost.time > 0) {
x = 1
}
else {
x = 0
}
return(x)
}
}
Error: no function to return from, jumping to top level
In addition: Warning message:
In if (df$Lost.time > 0) { :
the condition has length > 1 and only the first element will be used
> }
Error: unexpected '}' in "}"
There are a couple of mistakes in the code:
R is case sensitive. Use for instead of For.
If you are looping over the entries in df$Lost.time, the individual elements should be addressed within the loop using df$Lost.time[i]. However, a loop is not necessary for this task.
At the top level (outside any curly braces), an else statement should not begin on a new line of the code, because the parser cannot know that the if statement is not finished after the first block. Inside a function body this is tolerated, but writing } else { on one line avoids the problem everywhere.
The parameter passed to the function is not suitable. Maybe you could pass df, instead of Lost.time, but it may be necessary to rewrite parts of the function.
The use of 1:dim(df) in the for loop should work, but it will trigger a warning message. It is better to use 1:nrow(df).
Those are syntax problems. However, the main issue is probably what has been addressed in the answer by @TimBiegeleisen: in the loop you are checking, for each of the nrow(df) elements of df$Lost.time, whether a specific condition is fulfilled. It therefore does not seem to make sense to have a single binary result as the return value. The purpose of the function should be clarified before it is implemented.
An alternative to this function could be constructed in a one-liner with ifelse.
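A minimal sketch of that ifelse one-liner. The data frame below is made up for illustration; only the column name Lost.time comes from the question.

```r
# Made-up example data; the real df is not shown in the question.
df <- data.frame(Lost.time = c(0, 3, 0, 7))

# ifelse() is vectorised: it tests every row at once and picks 1 or 0 per row.
df$x <- ifelse(df$Lost.time > 0, 1, 0)
```

This yields one value per row rather than a single binary result, which is usually what is wanted for a new column.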
It is not clear what you actually want to return in your function. return can only be called once, after which it will return a single value and the function will terminate.
If you want to get a vector which will contain 1 or 0 depending on whether a given row in your data frame has Lost.time > 0, then the following one liner should do the trick:
x <- as.numeric(df$Lost.time > 0)
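If a loop is used inside the function, each element should be addressed by its index.
Create a variable x in your data frame; it is set to 1 if the condition is true and 0 otherwise. Pass the data frame in as the argument and return it, since changes made to df inside the function are otherwise lost:
LT = function(df) {
df$x <- numeric(nrow(df)) # create the new column, initialised to 0
for (i in seq_len(nrow(df))) {
if (as.numeric(df$Lost.time[i]) > 0) {
df$x[i] <- 1
} else {
df$x[i] <- 0
}
}
df # return the modified data frame
}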

Catching use of return without parentheses in R

I just tracked down a silly bug in some R code that I had written. The bug was equivalent to this:
brokenEarlyReturn = function(x=TRUE) {
if (x) return # broken with bare return
stop("Should not get here if x is TRUE. x == ", x)
}
brokenEarlyReturn(TRUE)
# Error in brokenEarlyReturn(TRUE) :
# Should not get here if x is TRUE. x == TRUE
The problem is that instead of return() I had just a bare return without the following parentheses. This causes the if statement to be roughly equivalent to if (x) constant, where the body is a bareword that performs no action. In this case, the bareword is the definition of the return function itself, and the function continues rather than returning. The correct version would look like this:
workingEarlyReturn = function(x=TRUE) {
if (x) return() # parentheses added to return
stop("Should not get here if x is TRUE. x == ", x)
}
It makes sense that R requires parentheses after return, but as a C programmer I'm likely to occasionally forget to add them. Usually there would be a parsing error if they are omitted, but in this case of a bare return in the body of an if statement there is not.
Assuming I want the ability to put a "guard" statement at the top of a function that will return without a value if some condition is not met, how can I avoid making this error in the future? Or at least, how can I make it easier to track down this error when I do make it? Is there some "expression has no effect" warning that I can turn on?
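The behaviour described in the question can be reproduced in a few lines. This is only a sketch of the symptom, not a way to detect it automatically; the function names `bareReturn` and `calledReturn` are made up for illustration.

```r
bareReturn <- function(x = TRUE) {
  if (x) return   # only looks up the object named `return`; nothing is called
  "fell through"
}

calledReturn <- function(x = TRUE) {
  if (x) return(invisible(NULL))   # an actual call, so the function exits here
  "fell through"
}

bareResult   <- bareReturn()     # execution continues past the if
calledResult <- calledReturn()   # NULL, returned invisibly
returnIsFun  <- is.function(return)
```

Because `return` is an ordinary function object in R rather than a keyword, evaluating the bare name is legal and silent.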

What does substitute(substitute()) do?

I am not entirely sure I understand what substitute does, although I've used it in code before. Today I encountered in shiny::exprToFunction the following lines of code:
function (expr, env = parent.frame(2), quoted = FALSE, caller_offset = 1)
{
expr_sub <- eval(substitute(substitute(expr)),
...
}
Can someone please explain why nested substitute is used here? An easy-to-run example would really help.
Take a look at
a<-function(aa) {
b(aa)
}
b<-function(bb) {
z(bb)
}
z<-function(zz) {
print(substitute(zz))
print(substitute(substitute(zz)))
print(eval(substitute(substitute(zz)), parent.frame()))
}
q<-5
a(q)
# bb
# substitute(bb)
# aa
The first/inner substitute grabs the name/symbol that was passed to the called function. The second/outer substitute() simply wraps a substitute() command around that discovered name/symbol. Then that substitute() is evaluated in the parent environment where it came from.
Using substitute() to capture a variable name only works while the parameter is still bound to its promise. Once the parameter has been reassigned inside the function, substitute() returns its value rather than the original expression.
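A small sketch of where the captured expression is and is not available. The helper names `keepsPromise` and `losesPromise` are hypothetical and not part of the original answer: forcing the argument leaves the expression recoverable, whereas reassigning the parameter replaces the promise with a plain value.

```r
keepsPromise <- function(x) {
  force(x)        # evaluates the promise, but x is still bound to it
  substitute(x)   # the original expression is still recoverable
}

losesPromise <- function(x) {
  x <- x          # x is now an ordinary local variable holding a value
  substitute(x)   # substitute() inserts that value instead
}

q <- 5
kept <- keepsPromise(q)   # the symbol q
lost <- losesPromise(q)   # the number 5
```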

checking if data types are appropriate and returning an error message if not

I want to return my own error message if the arguments passed to the function are not of the required type (character):
if (!typeof(param1) == "character" || !typeof(param2) == "character") {
stop("You must provide valid arguments")
}
This only works if I provide invalid arguments. How can I ensure the message is displayed if some of the parameters are missing? It doesn't work if I call the function without any parameters.
You can use missing() to check whether an argument is provided. This is very much preferred over the other answers that suggest using default values of a different type than what is expected (how confusing!). Only use defaults when it makes sense to have a default.
Also, it is not a good idea to use typeof() for type checking, in general. The typeof() function returns how the data are stored in the R implementation. Usually, a function cares whether an object presents a particular API, which corresponds to the class. To check inheritance, use is(). But in this case, for both readability and just to follow conventions, consider using is.character().
So you might have something like:
if (missing(param1) || !is.character(param1)) {
stop("'param1' must be provided as a character vector")
}
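To illustrate the difference between storage type and class mentioned above, here is a brief sketch with made-up objects `f` and `d`:

```r
library(methods)  # for is(); normally attached already

f <- factor(c("a", "b"))
factorType  <- typeof(f)        # "integer": how the data are stored
factorClass <- class(f)         # "factor": what the object presents itself as
factorIsChr <- is.character(f)  # FALSE, although it prints like text

d <- data.frame(x = 1)
frameType <- typeof(d)             # "list"
frameIsDf <- is(d, "data.frame")   # TRUE
```

A check written against typeof() would treat a factor as an integer and a data frame as a plain list, which is rarely what a function's caller means.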
Also, things to keep in mind when checking vectors:
Often we really are expecting a scalar, i.e., a length-one vector, but a vector can have arbitrary length, so we should check that it is of length one.
Vectors can contain missing values, which code often cannot handle, so we often need to ensure that the values are not missing.
You might find it useful to define helpers for this, such as this function from the S4Vectors package in Bioconductor:
isSingleString <- function (x)
{
is.character(x) && length(x) == 1L && !is.na(x)
}
Then:
if (missing(param1) || !isSingleString(param1)) {
stop("param1 must be a single, non-NA string")
}
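Put together, the guard might be exercised as in the following sketch. The function `greet` is hypothetical and exists only to host the check; isSingleString is repeated so the example runs on its own.

```r
isSingleString <- function(x) {
  is.character(x) && length(x) == 1L && !is.na(x)
}

# Hypothetical function used only to exercise the guard.
greet <- function(param1) {
  if (missing(param1) || !isSingleString(param1)) {
    stop("param1 must be a single, non-NA string")
  }
  paste("Hello,", param1)
}

ok <- greet("world")

# Each of these calls fails the guard; tryCatch() collects the message.
failures <- c(
  missingArg = tryCatch(greet(), error = conditionMessage),
  wrongType  = tryCatch(greet(1), error = conditionMessage),
  tooLong    = tryCatch(greet(c("a", "b")), error = conditionMessage),
  naValue    = tryCatch(greet(NA_character_), error = conditionMessage)
)
```

Because missing() is tested first and || short-circuits, the missing argument is never evaluated, so the custom message appears instead of R's own "argument is missing" error.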
To avoid problems with missing parameters, you should provide default arguments.
Use stopifnot; it is designed to check arguments.
Here is how I would do this:
func_check <-
function(param1="",param2=""){
stopifnot(typeof(param1) == "character",
typeof(param2) == "character")
}
## param2 is numeric
func_check(param1= 'a',param2=2)
## param2 is missing
func_check(param1= 1)
EDIT
In case you want to check for missing values, you should use dotted parameters. Then you can deal with them using match.call. Here is an example where I test for missing and invalid parameters.
func_check <-
function(...){
ll <- as.list((match.call()[-1]))
stopifnot(c('param1','param2' )%in% names(ll))
param1 = ll$param1
param2 = ll$param2
stopifnot(typeof(param1) == "character",
typeof(param2) == "character")
}
func_check(param1= 'a',param2=2)
If I define the following function:
myfunc <- function(param1, param2, numeric1, param3) {
if (!is.character(param1) || !is.character(param2)) {
stop("You must provide valid arguments for param1 and param2")
} else if(!is.numeric(numeric1)) {
stop("Please provide numeric input")
} else if(!is.character(param3)){
stop("Please provide character input for param3")
}
#some code here for your function, e.g.
paste0(param1, param2, numeric1, param3)
}
It throws an error whenever I don't give it 4 input parameters in the call of myfunc, including if I call it without any arguments (so the error occurs both if arguments are missing and when they are not of the correct type). Is that different from what you are looking for?

Forcing specific data types as arguments to a function

I was just wondering if there was a way to force a function to only accept certain data types, without having to check for it within the function; or, is this not possible because R's type-checking is done at runtime (as opposed to those programming languages, such as Java, where type-checking is done during compilation)?
For example, in Java, you have to specify a data type:
class t2 {
public int addone (int n) {
return n+1;
}
}
In R, a similar function might be
addone <- function(n)
{
return(n+1)
}
but if a vector is supplied, a vector will (obviously) be returned. If you only want a single integer to be accepted, is the only way to do this to have a condition within the function, along the lines of
addone <- function(n)
{
if(is.vector(n) && length(n)==1)
{
return(n+1)
} else
{
return ("You must enter a single integer")
}
}
Thanks,
Chris
This is entirely possible using S3 classes. Your example is somewhat contrived in the context of R, since I can't think of a practical reason why one would want to create a class of a single value. Nonetheless, this is possible. As an added bonus, I demonstrate how the function addone can be used to add the value of one to numeric vectors (trivial) and character vectors (so A turns to B, etc.):
Start by creating a generic S3 method for addone, utilising the S3 dispatch mechanism UseMethod:
addone <- function(x){
UseMethod("addone", x)
}
Next, create the contrived class single, defined as the first element of whatever is passed to it:
as.single <- function(x){
ret <- unlist(x)[1]
class(ret) <- "single"
ret
}
Now create methods to handle the various classes. The default method will be called unless a specific class is defined:
addone.default <- function(x) x + 1
addone.character <- function(x)rawToChar(as.raw(as.numeric(charToRaw(x))+1))
addone.single <- function(x)x + 1
Finally, test it with some sample data:
addone(1:5)
[1] 2 3 4 5 6
addone(as.single(1:5))
[1] 2
attr(,"class")
[1] "single"
addone("abc")
[1] "bcd"
Some additional information:
Hadley's devtools wiki is a valuable source of information on all things, including the S3 object system.
The S3 method doesn't provide strict typing. It can quite easily be abused. For stricter object orientation, have a look at S4 classes, reference classes or the proto package for prototype object-based programming.
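As a rough sketch of the stricter S4 route, a generic can be given a method for "numeric" only, so any other argument type is rejected by dispatch. The name `addOneS4` is hypothetical, chosen to avoid clashing with addone above.

```r
library(methods)

# Hypothetical generic; only a "numeric" method is defined for it.
setGeneric("addOneS4", function(n) standardGeneric("addOneS4"))
setMethod("addOneS4", "numeric", function(n) n + 1)

numericResult <- addOneS4(1:3)   # integer vectors dispatch to the numeric method

# No method exists for character input, so dispatch itself raises an error.
characterResult <- tryCatch(addOneS4("a"), error = function(e) e)
```

The check still happens when the function is called, not ahead of time, but it no longer has to be written inside the function body.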
You could write a wrapper like the following:
check.types = function(classes, func) {
n = as.name
params = formals(func)
param.names = lapply(names(params), n)
handler = function() { }
formals(handler) = params
checks = lapply(seq_along(param.names), function(I) {
as.call(list(n('assert.class'), param.names[[I]], classes[[I]]))
})
body(handler) = as.call(c(
list(n('{')),
checks,
list(as.call(list(n('<-'), n('.func'), func))),
list(as.call(c(list(n('.func')), lapply(param.names, as.name))))
))
handler
}
assert.class = function(x, cls) {
stopifnot(cls %in% class(x))
}
And use it like
f = check.types(c('numeric', 'numeric'), function(x, y) {
x + y
})
> f(1, 2)
[1] 3
> f("1", "2")
Error: cls %in% class(x) is not TRUE
Made somewhat inconvenient by R not having decorators. This is kind of hacky and it suffers from some serious problems:
(1) You lose lazy evaluation, because you must evaluate an argument to determine its type.
(2) You still can't check the types until call time; real static type checking lets you check the types even of a call that never actually happens.
Since R uses lazy evaluation, (2) might make type checking not very useful, because the call might not actually occur until very late, or never.
The answer to (2) would be to add static type information. You could probably do this by transforming expressions, but I don't think you want to go there.
I've found stopifnot() to be highly useful for these situations as well.
x <- function(n) {
stopifnot(is.vector(n) && length(n)==1)
print(n)
}
It is useful because it gives the user a fairly clear error message when the condition is false.
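A short sketch of what that message looks like when each condition is passed to stopifnot() separately. The function name `addOneChecked` is made up for illustration.

```r
# Hypothetical function; each condition is a separate argument so that
# the error names exactly the one that failed.
addOneChecked <- function(n) {
  stopifnot(is.numeric(n), length(n) == 1)
  n + 1
}

good <- addOneChecked(1)

# Capture the message instead of stopping the script.
msg <- tryCatch(addOneChecked(1:3), error = conditionMessage)
```

Here `msg` reports that `length(n) == 1` is not TRUE, which points the caller at the failing condition.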
