Disclaimer: I'm trying to make someone else's code more user friendly and I'm pretty new to R, so if you see me using mismatched coding conventions, that's why.
I'm trying to write my script's status to the terminal as it goes through a list of files, checking to make sure they are valid before using them as input to a model. Therefore, I need to pass a variable (filename) and status ("looks good") to a function that will concatenate them and write them to the terminal in green. When I test the function like so, it works:
say <- function(words){
cat(green(words))
}
hi <- "Hello"
say(c(hi, "World!"))
# Hello World!
But when I call say() from within the ifelse() that I need to call it from, I get an error I cannot decipher:
FileList = as.data.frame(list.files(path = "./R_ModelInputs_SecondaryData",
pattern = ".tif$", all.files = FALSE,
full.names = FALSE, recursive = FALSE,
ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE))
names(FileList)=c("FileName")
for(NAME in FileList$FilName){
data=raster(paste("./R_ModelInputs_SecondaryData/",NAME,sep=""))
ifelse(nrow(data)!=1737,
say(c(NAME, "has a problem"),
ifelse(ncol(data)!=4008,
say(c(NAME, "has a problem")),
say(c(NAME, "looks good"))
))
}
# Goode_FireBrightness_80_10kMax_20002015.tif looks goodError in ans[!test & ok] <- rep(no, length.out = length(ans))[!test &
# :
# replacement has length zero
# Calls: ifelse -> ifelse
# In addition: Warning message:
# In rep(no, length.out = length(ans)) :
# 'x' is NULL so the result will be NULL
# Execution halted
I've tried googling this error but all I've come up with is that it seems to be from the ifelse() call. This doesn't make sense to me because the fact that it's writing the "looks good" part means that it's successfully navigated both ifelse()'s. I inserted a print() statement at the top of the for loop to ensure that it wasn't throwing the error when it tried to evaluate the ifelse() in the second iteration of my for loop, but that's not what it is, as that print() statement only printed once.
ifelse() should be used when you want to return a vector. It expects to return a vector the same length as your first parameter. Your say() function is returning the value from cat() which just returns NULL. There's no way to make NULL the same length as your test condition. This is throwing the ifelse off.
ifelse should not be used for control flow logic. You should be using a standard if/else here for conditional code execution.
Use
if(nrow(data)!=1737) {
say(c(NAME, "has a problem")
} else if (ncol(data)!=4008) {
say(c(NAME, "has a problem"))
} else {
say(c(NAME, "looks good"))
}
Or do this in a more R-y way like this
ff <- list.files(path = ".", pattern = "\\.tif$", full=TRUE)
r <- lapply(ff, raster)
x <- t(sapply(r, dim))
good <- x[,1] == 1737 & x[,2] == 4008
# good
basename(ff)[good]
# problem
basename(ff)[!good]
Related
I have a regression model (lm or glm or lmer ...) and I do fitmodel <- lm(inputs) where inputs changes inside a loop (the formula and the data). Then, if the model function does not produce any warning I want to keep fitmodel, but if I get a warning I want to update the model and I want the warning not printed, so I do fitmodel <- lm(inputs) inside tryCatch. So, if it produces a warning, inside warning = function(w){f(fitmodel)}, f(fitmodel) would be something like
fitmodel <- update(fitmodel, something suitable to do on the model)
In fact, this assignation would be inside an if-else structure in such a way that depending on the warning if(w$message satisfies something) I would adapt the suitable to do on the model inside update.
The problem is that I get Error in ... object 'fitmodel' not found. If I use withCallingHandlers with invokeRestarts, it just finishes the computation of the model with the warning without update it. If I add again fitmodel <- lm(inputs) inside something suitable to do on the model, I get the warning printed; now I think I could try suppresswarnings(fitmodel <- lm(inputs)), but yet I think it is not an elegant solution, since I have to add 2 times the line fitmodel <- lm(inputs), making 2 times all the computation (inside expr and inside warning).
Summarising, what I would like but fails is:
tryCatch(expr = {fitmodel <- lm(inputs)},
warning = function(w) {if (w$message satisfies something) {
fitmodel <- update(fitmodel, something suitable to do on the model)
} else if (w$message satisfies something2){
fitmodel <- update(fitmodel, something2 suitable to do on the model)
}
}
)
What can I do?
The loop part of the question is because I thought it like follows (maybe is another question, but for the moment I leave it here): it can happen that after the update I get another warning, so I would do something like while(get a warning on update){update}; in some way, this update inside warning should be understood also as expr. Is something like this possible?
Thank you very much!
Generic version of the question with minimal example:
Let's say I have a tryCatch(expr = {result <- operations}, warning = function(w){f(...)} and if I get a warning in expr (produced in fact in operations) I want to do something with result, so I would do warning = function(w){f(result)}, but then I get Error in ... object 'result' not found.
A minimal example:
y <- "a"
tryCatch(expr = {x <- as.numeric(y)},
warning = function(w) {print(x)})
Error in ... object 'x' not found
I tried using withCallingHandlers instead of tryCatch without success, and also using invokeRestart but it does the expression part, not what I want to do when I get a warning.
Could you help me?
Thank you!
The problem, fundamentally, is that the handler is called before the assignment happens. And even if that weren’t the case, the handler runs in a different scope than the tryCatch expression, so the handler can’t access the names in the other scope.
We need to separate the handling from the value transformation.
For errors (but not warnings), base R provides the function try, which wraps tryCatch to achieve this effect. However, using try is discouraged, because its return type is unsound.1 As mentioned in the answer by ekoam, ‘purrr’ provides soundly typed functional wrappers (e.g. safely) to achieve a similar effect.
However, we can also build our own, which might be a better fit in this situation:
with_warning = function (expr) {
self = environment()
warning = NULL
result = withCallingHandlers(expr, warning = function (w) {
self$warning = w
tryInvokeRestart('muffleWarning')
})
list(result = result, warning = warning)
}
This gives us a wrapper that distinguishes between the result value and a warning. We can now use it to implement your requirement:
fitmodel = with(with_warning(lm(inputs)), {
if (! is.null(warning)) {
if (conditionMessage(warning) satisfies something) {
update(result, something suitable to do on the model)
} else {
update(result, something2 suitable to do on the model)
}
} else {
result
}
})
1 What this means is that try’s return type doesn’t distinguish between an error and a non-error value of type try-error. This is a real situation that can occur, for example, when nesting multiple try calls.
It seems that you are looking for a functional wrapper that captures both the returned value and side effects of a function call. I think purrr::quietly is a perfect candidate for this kind of task. Consider something like this
quietly <- purrr::quietly
foo <- function(x) {
if (x < 3)
warning(x, " is less than 3")
if (x < 4)
warning(x, " is less than 4")
x
}
update_foo <- function(x, y) {
x <- x + y
foo(x)
}
keep_doing <- function(inputs) {
out <- quietly(foo)(inputs)
repeat {
if (length(out$warnings) < 1L)
return(out$result)
cat(paste0(out$warnings, collapse = ", "), "\n")
# This is for you to see the process. You can delete this line.
if (grepl("less than 3", out$warnings[[1L]])) {
out <- quietly(update_foo)(out$result, 1.5)
} else if (grepl("less than 4", out$warnings[[1L]])) {
out <- quietly(update_foo)(out$result, 1)
}
}
}
Output
> keep_doing(1)
1 is less than 3, 1 is less than 4
2.5 is less than 3, 2.5 is less than 4
[1] 4
> keep_doing(3)
3 is less than 4
[1] 4
Are you looking for something like the following? If it is run with y <- "123", the "OK" message will be printed.
y <- "a"
#y <- "123"
x <- tryCatch(as.numeric(y),
warning = function(w) w
)
if(inherits(x, "warning")){
message(x$message)
} else{
message(paste("OK:", x))
}
It's easier to test several argument values with the code above rewritten as a function.
testWarning <- function(x){
out <- tryCatch(as.numeric(x),
warning = function(w) w
)
if(inherits(out, "warning")){
message(out$message)
} else{
message(paste("OK:", out))
}
invisible(out)
}
testWarning("a")
#NAs introduced by coercion
testWarning("123")
#OK: 123
Maybe you could assign x again in the handling condition?
tryCatch(
warning = function(cnd) {
x <- suppressWarnings(as.numeric(y))
print(x)},
expr = {x <- as.numeric(y)}
)
#> [1] NA
Perhaps not the most elegant answer, but solves your toy example.
Don't put the assignment in the tryCatch call, put it outside. For example,
y <- "a"
x <- tryCatch(expr = {as.numeric(y)},
warning = function(w) {y})
This assigns y to x, but you could put anything in the warning body, and the result will be assigned to x.
Your "what I would like" example is more complicated, because you want access to the expr value, but it hasn't been assigned anywhere at the time the warning is generated. I think you'll have to recalculate it:
fitmodel <- tryCatch(expr = {lm(inputs)},
warning = function(w) {if (w$message satisfies something) {
update(lm(inputs), something suitable to do on the model)
} else if (w$message satisfies something2){
update(lm(inputs), something2 suitable to do on the model)
}
}
)
Edited to add:
To allow the evaluation to proceed to completion before processing the warning, you can't use tryCatch. The evaluate package has a function (also called evaluate) that can do this. For example,
y <- "a"
res <- evaluate::evaluate(quote(x <- as.numeric(y)))
for (i in seq_along(res)) {
if (inherits(res[[i]], "warning") &&
conditionMessage(res[[i]]) == gettext("NAs introduced by coercion",
domain = "R"))
x <- y
}
Some notes: the res list will contain lots of different things, including messages, warnings, errors, etc. My code only looks at the warnings. I used conditionMessage to extract the warning message, but
it will be translated to the local language, so you should use gettext to translate the English version of the message for comparison.
I am trying to write a function that cleans spreadsheets. However, some of the spreadsheets are corrupted and will not open. I want the function to recognize this, print an error message, and skip execution of the rest of the function (since I am using lapply() to iterate across files), and continues. My current attempt looks like this:
candidate.cleaner <- function(filename){
#this function cleans candidate data spreadsheets into an R dataframe
#dependency check
library(readxl)
#read in
cand_df <- tryCatch(read_xls(filename, col_names = F),
error = function (e){
warning(paste(filename, "cannot be opened; corrupted or does not exist"))
})
print(filename)
#rest of function
cand_df[1,1]
}
test_vec <- c("test.xls", "test2.xls", "test3.xls")
lapply(FUN = candidate.cleaner, X = test_vec)
However, this still executes the line of the function after the tryCatch statement when given a .xls file that does not exist, which throws a stop since I'm attempting to index a dataframe that doesn't exist. This exits the lapply call. How can I write the tryCatch call to make it skip execution of the rest of the function without exiting lapply?
One could set a semaphore at the start of the tryCatch() indicating that things have gone OK so far, then handle the error and signal that things have gone wrong, and finally check the semaphore and return from the function with an appropriate value.
lapply(1:5, function(i) {
value <- tryCatch({
OK <- TRUE
if (i == 2)
stop("stopping...")
i
}, error = function(e) {
warning("oops: ", conditionMessage(e))
OK <<- FALSE # assign in parent environment
}, finally = {
## return NA on error
OK || return(NA)
})
## proceed
value * value
})
This allows one to continue using the tryCatch() infrastructure, e.g., to translate warnings into errors. The tryCatch() block encapsulates all the relevant code.
Turns out, this can be accomplished in a simple way with try() and an additional help function.
candidate.cleaner <- function(filename){
#this function cleans candidate data spreadsheets into an R dataframe
#dependency check
library(readxl)
#read in
cand_df <- try(read_xls(filename, col_names = F))
if(is.error(cand_df) == T){
return(list("Corrupted: rescrape", filename))
} else {
#storing election name for later matching
election_name <- cand_df[1,1]
}
}
Where is.error() is taken from Hadley Wickham's Advanced R chapter on debugging. It's defined as:
is.error <- function(x) inherits(x, "try-error")
I was looking at the source code for the r function expand.grid and noticed a command I don't understand (the last line):
function (..., KEEP.OUT.ATTRS = TRUE, stringsAsFactors = TRUE)
{
nargs <- length(args <- list(...))
if (!nargs)
I am not familiar with this syntax for an if-statement. What is if(!nargs) testing? I tried testing it with something that did exist, and it worked. But it doesn't work with something that doesn't exist...
x <- list(1, 4, 6)
nargs <- length(x)
if(nargs)
print("Success") #This does print "Success" and nargs exists
if(!nargs)
print("Fail") #Doesn't print, as you'd expect
if(!dogs)
print("Success") #Error: object 'dogs' not found
So my guess (that it was an existence test) is wrong, or I am testing it wrong.
!nargs means you are testing the value of nargs is FALSE (or 0 in your case). nargs must exist before testing its value. In your example it does exist.
!dogs is trying to test a non-existing object, so you cannot test its value, object dogs not found is the correct error message there.
What I'm trying to achieve
I'm trying to write my own 'impute' function in R with a tryCatch statement which:
1. outputs a warning/error message containing the function name so I can debug easier.
2. Raises a warning if the function runs ok but doesn't impute all the missing values.
ImputeVariables <- function(impute.var, impute.values,
filter.var){
# function to impute values.
# impute.var = variables with NAs
# impute.values = the missing value(s) to replace NAs, value labesl are levels
# filter.var = the variables to filter on.
# filter.levels = the categories of filter.var
tryCatch({
filter.levels <- names(impute.values)
# Validation
stopifnot(class(impute.var) == class(impute.values),
length(impute.values) > 0,
sum(is.na(impute.values)) == 0)
# Impute values
for(level in filter.levels){
impute.var[which(filter.var == level & is.na(impute.var))] <-
impute.values[level]
}
# Check if all NAs removed. Throw warning if not.
if(sum(is.na(impute.var)) > 0){
warning("Not all NAs removed")
}
# Return values
return(impute.var)
},
error = function(err) print(paste0("ImputeValues: ",err)),
warning = function(war) print(paste0("ImputeValues: ",war))
)
}
impute.var and filter.var are vectors taken from a data.frame (they are vectors of Ages and Titles (e.g. 'Mr', 'Mrs')
impute.values is a vector of the same type as impute.var but with labels taken from filter.var (i.e. is of the form c('Mr' = 30, 'Mrs' = 25...))
The problem
To check if my validation was working I supplied the function with a named vector of NAs, thusly:
ages <- c(34, 22, NA, 17, 38, NA)
titles <- c("Mr", "Mr", "Mr", "Mrs", "Mrs", "Mrs")
ages.values <- c("Mr" = NA, "Mrs" = NA)
ages.new <- ImputeVariables(ages, ages.values, titles)
print(ages.new)
But it outputs the following:
"ImputeValues: Error: class(impute.var) == class(impute.values) is not TRUE\n"
"ImputeValues: Error: class(impute.var) == class(impute.values) is not TRUE\n"
The two lines are due to the function printing the ages.new vector and the following print statement printing ages.new (why?)
If I comment out the validation (the stopifnot function) then I just get:
"ImputeValues: simpleWarning in doTryCatch(return(expr), name, parentenv, handler): Not all NAs removed\n"
What I'm asking
Why does the tryCatch block behave this way?
Is my validation and error handling strategy optimal (obviously without the bug)?
Many thanks for your time.
Rob
Thanks Oliver.
The working code is now:
ImputeVariables <- function(impute.var, impute.values,
filter.var){
# function to impute values.
# impute.var = variables with NAs
# impute.values = the missing value(s) to replace NAs, value labesl are levels
# filter.var = the variables to filter on.
# filter.levels = the categories of filter.var
tryCatch({
filter.levels <- names(impute.values)
# Validation
stopifnot(class(impute.var) == class(impute.values),
length(impute.values) > 0,
sum(is.na(impute.values)) == 0)
# Impute values
for(level in filter.levels){
impute.var[which(filter.var == level & is.na(impute.var))] <-
impute.values[level]
}
# Check if all NAs removed. Throw warning if not.
if(sum(is.na(impute.var)) > 0){
warning("Not all NAs removed")
}
# Return values
return(impute.var)
},
error = function(err) stop(paste0("ImputeValues: ",err)),
warning = function(war) {
message(paste0("ImputeValues: ",war))
return(impute.var)}
)
}
This is essentially two different problems. The first problem is that print statements within a function do not print to the terminal, they print to the scope of the function. As an example:
> foo <- function(){
print("bar")
}
> foo()
[1] "bar"
It didn't print "bar" to your screen, it printed it to the function scope and then returned it. The reason it returned it was that it was the last value printed to the function scope, and so (lacking an explicit return() call) is the best candidate for what to return.
So, your code is (in sequence):
Throwing an error;
Not treating that error normally, but instead passing it into tryCatch's error handler, where it is printed;
Because it is the last thing printed within the function scope, since the return() statement is never hit due to the error, treating it as the return value from the function.
If you really want to continue processing the input values even if the stopifnot() conditions are met, you don't want a stopifnot(): however you structure that it's likely to prevent the return() call from running and cause weirdness. What I'd suggest is instead moving the conditional checks currently in stopifnot() outside the tryCatch, and sticking them in a series of if() statements that throw warnings (not errors) if they don't match up. tryCatch isn't really necessary in this situation.
I am using stopifnot and I understand it just returns the first value that was not TRUE. I f that is some freaky dynamic expression someone who is not into the custom function cannot really make something out of that. So I would love to add a custom error message. Any suggestions?
Error: length(unique(nchar(check))) == 1 is not TRUE
Basically states that the elements of the vector check do not have the same length.
Is there a way of saying: Error: Elements of your input vector do not have the same length!?
Use stop and an if statement:
if(length(unique(nchar(check))) != 1)
stop("Error: Elements of your input vector do not have the same length!")
Just remember that stopifnot has the convenience of stating the negative, so your condition in the if needs to be the negation of your stop condition.
This is what the error message looks like:
> check = c("x", "xx", "xxx")
> if(length(unique(nchar(check))) != 1)
+ stop("Error: Elements of your input vector do not have the same length!")
Error in eval(expr, envir, enclos) :
Error: Elements of your input vector do not have the same length!
A custom message can be added as a label to your expression:
stopifnot("Elements of your input vector do not have the same length!" =
length(unique(nchar(check))) == 1)
# Error: Elements of your input vector do not have the same length!
The assertive and assertthat packages have more readable check functions.
library(assertthat)
assert_that(length(unique(nchar(check))) == 1)
## Error: length(unique(nchar(check))) == 1 are not all true.
library(assertive)
assert_is_scalar(unique(nchar(check)))
## Error: unique(nchar(check)) does not have length one.
if(!is_scalar(unique(nchar(check))))
{
stop("Elements of check have different numbers of characters.")
}
## Error: Elements of check have different numbers of characters.
Or you could package it up.
assert <- function (expr, error) {
if (! expr) stop(error, call. = FALSE)
}
So you have:
> check = c("x", "xx", "xxx")
> assert(length(unique(nchar(check))) == 1, "Elements of your input vector do not have the same length!")
Error: Elements of your input vector do not have the same length!
What about embedding the stopifnot into tryCatch and then recasting the exception with stop using customized message?
Something like:
tryCatch(stopifnot(...,error=stop("Your customized error message"))
Unlike some other solutions this does not require additional packages. Compared to using if statement combined with stop you retain the performance advantages of stopifnot, when you use new R versions. Since R version 3.5.0 stopifnot evaluates expressions sequentially and stops on first failure.
I would recommend you check out Hadley's testthat package. It allows for intuitive testing: the names of the functions are great and the way you write them is like a sentence -- "I expect that length(unique(nchar(check))) is [exactly|approximately] 1". The errors produced are informative.
See here:
http://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf
In your case,
> library(testthat)
> check = c("x", "xx", "xxx")
> expect_that(length(unique(nchar(check))), equals(1))
Error: length(unique(nchar(check))) not equal to 1
Mean relative difference: 2
Also note that you don't have the problem that #Andrie referenced with sometimes having to think about double negatives with stopifnot. I know it seems simple, but it caused me many headaches!
The answers already provided are quite good, and mine is just an addition to that collection. For some people it could be more convenient to use one-liner in form of the following function:
stopifnotinform <- function(..., message = "err") {
args <- list(...)
if (length(args) == 0) {
return(NULL)
}
for (i in 1:length(args)) {
result <- args[[i]]
if (is.atomic(result) && result == FALSE) {
stop(message)
}
}
}
# throws an error
stopifnotinform(is.integer(1L), is.integer(2), message = "Some number(s) provided is not an integer")
# continues with execution
stopifnotinform(is.integer(1L), is.integer(2L), message = "Some number(s) provided is not an integer")
Bear in mind that this solution provides you with only one (common) error message for all parameters in ....
Try this:
same_length <- FALSE
stopifnot("Elements of your input vector do not have the same length!" = same_length)
#> Error : Elements of your input vector do not have the same length!