I'm using lapply to run a complex function on a large number of items, and I'd like to save the output from each item (if any) together with any warnings/errors that were produced so that I can tell which item produced which warning/error.
I found a way to catch warnings using withCallingHandlers (described here). However, I need to catch errors as well. I can do it by wrapping it in a tryCatch (as in the code below), but is there a better way to do it?
catchToList <- function(expr) {
val <- NULL
myWarnings <- NULL
wHandler <- function(w) {
myWarnings <<- c(myWarnings, w$message)
invokeRestart("muffleWarning")
}
myError <- NULL
eHandler <- function(e) {
myError <<- e$message
NULL
}
val <- tryCatch(withCallingHandlers(expr, warning = wHandler), error = eHandler)
list(value = val, warnings = myWarnings, error=myError)
}
Sample output of this function is:
> catchToList({warning("warning 1");warning("warning 2");1})
$value
[1] 1
$warnings
[1] "warning 1" "warning 2"
$error
NULL
> catchToList({warning("my warning");stop("my error")})
$value
NULL
$warnings
[1] "my warning"
$error
[1] "my error"
There are several questions here on SO that discuss tryCatch and error handling, but none that I found that address this particular issue. See How can I check whether a function call results in a warning?, warnings() does not work within a function? How can one work around this?, and How to tell lapply to ignore an error and process the next thing in the list? for the most relevant ones.
Maybe this is the same as your solution, but I wrote a factory to convert plain old functions into functions that capture their values, errors, and warnings, so I can
test <- function(i)
switch(i, "1"=stop("oops"), "2"={ warning("hmm"); i }, i)
res <- lapply(1:3, factory(test))
with each element of the result containing the value, error, and / or warnings. This would work with user functions, system functions, or anonymous functions (factory(function(i) ...)). Here's the factory
factory <- function(fun)
function(...) {
warn <- err <- NULL
res <- withCallingHandlers(
tryCatch(fun(...), error=function(e) {
err <<- conditionMessage(e)
NULL
}), warning=function(w) {
warn <<- append(warn, conditionMessage(w))
invokeRestart("muffleWarning")
})
list(res, warn=warn, err=err)
}
and some helpers for dealing with the result list
.has <- function(x, what)
!sapply(lapply(x, "[[", what), is.null)
hasWarning <- function(x) .has(x, "warn")
hasError <- function(x) .has(x, "err")
isClean <- function(x) !(hasError(x) | hasWarning(x))
value <- function(x) sapply(x, "[[", 1)
cleanv <- function(x) sapply(x[isClean(x)], "[[", 1)
Try the evaluate package.
library(evaluate)
test <- function(i)
switch(i, "1"=stop("oops"), "2"={ warning("hmm"); i }, i)
t1 <- evaluate("test(1)")
t2 <- evaluate("test(2)")
t3 <- evaluate("test(3)")
It currently lacks a nice way of evaluating expression though - this is mainly because it's targetted towards reproducing exactly what R output's given text input at the console.
replay(t1)
replay(t2)
replay(t3)
It also captures messages, output to the console, and ensures that everything is correctly interleaved in the order in which it occurred.
I have merged Martins soulution (https://stackoverflow.com/a/4952908/2161065) and the one from the R-help mailing list you get with demo(error.catching).
The main idea is to keep both, the warning/error message as well as the command triggering this problem.
myTryCatch <- function(expr) {
warn <- err <- NULL
value <- withCallingHandlers(
tryCatch(expr, error=function(e) {
err <<- e
NULL
}), warning=function(w) {
warn <<- w
invokeRestart("muffleWarning")
})
list(value=value, warning=warn, error=err)
}
Examples:
myTryCatch(log(1))
myTryCatch(log(-1))
myTryCatch(log("a"))
Output:
myTryCatch(log(1))
# $value
# [1] 0
#
# $warning
# NULL
#
# $error
# NULL
myTryCatch(log(-1))
# $value
# [1] NaN
#
# $warning
# <simpleWarning in log(-1): NaNs produced>
#
# $error
# NULL
myTryCatch(log("a"))
# $value
# NULL
#
# $warning
# NULL
#
# $error
# <simpleError in log("a"): non-numeric argument to mathematical function>
The purpose of my answer (and modification to Martin's excellent code) is so that the factory-ed function returns the data structure expected if all goes well. If a warning is experienced, it is attached to the result under the factory-warning attribute. data.table's setattr function is used to allow for compatibility with that package. If an error is experienced, the result is the character element "An error occurred in the factory function" and the factory-error attribute will carry the error message.
#' Catch errors and warnings and store them for subsequent evaluation
#'
#' Factory modified from a version written by Martin Morgan on Stack Overflow (see below).
#' Factory generates a function which is appropriately wrapped by error handlers.
#' If there are no errors and no warnings, the result is provided.
#' If there are warnings but no errors, the result is provided with a warn attribute set.
#' If there are errors, the result retutrns is a list with the elements of warn and err.
#' This is a nice way to recover from a problems that may have occurred during loop evaluation or during cluster usage.
#' Check the references for additional related functions.
#' I have not included the other factory functions included in the original Stack Overflow answer because they did not play well with the return item as an S4 object.
#' #export
#' #param fun The function to be turned into a factory
#' #return The result of the function given to turn into a factory. If this function was in error "An error as occurred" as a character element. factory-error and factory-warning attributes may also be set as appropriate.
#' #references
#' \url{http://stackoverflow.com/questions/4948361/how-do-i-save-warnings-and-errors-as-output-from-a-function}
#' #author Martin Morgan; Modified by Russell S. Pierce
#' #examples
#' f.log <- factory(log)
#' f.log("a")
#' f.as.numeric <- factory(as.numeric)
#' f.as.numeric(c("a","b",1))
factory <- function (fun) {
errorOccurred <- FALSE
library(data.table)
function(...) {
warn <- err <- NULL
res <- withCallingHandlers(tryCatch(fun(...), error = function(e) {
err <<- conditionMessage(e)
errorOccurred <<- TRUE
NULL
}), warning = function(w) {
warn <<- append(warn, conditionMessage(w))
invokeRestart("muffleWarning")
})
if (errorOccurred) {
res <- "An error occurred in the factory function"
}
if (is.character(warn)) {
data.table::setattr(res,"factory-warning",warn)
} else {
data.table::setattr(res,"factory-warning",NULL)
}
if (is.character(err)) {
data.table::setattr(res,"factory-error",err)
} else {
data.table::setattr(res, "factory-error", NULL)
}
return(res)
}
}
Because we don't wrap the result in an extra list we can't make the kind of assumptions that allow for some of his accessor functions, but we can write simple checks and decide how to handle the cases as is appropriate to our particular resulting data-structure.
.has <- function(x, what) {
!is.null(attr(x,what))
}
hasWarning <- function(x) .has(x, "factory-warning")
hasError <- function(x) .has(x, "factory-error")
isClean <- function(x) !(hasError(x) | hasWarning(x))
Related
I want to make a new function that has 3 skills
if an error occurs, print "error". Do not stop the whole process, make it keep going
if a warning occurs, print the warning message
if there is no error nor warning, print "Nothing".
For 1) I made a function as follows:
efunc <- function(error){
return(NA)
}
tryc <- function(x){tryCatch(x, error = efunc)}
for 2), 3) , I refer to this page( How do I save warnings and errors as output from a function?) and
adapted some code, making the following functions
myTryCatch <- function(expr) {
warn <- err <- NULL
value <- withCallingHandlers(
tryCatch(expr, error=function(e) {
err <<- e # <<- is not typo
print("error")
}), warning=function(w) {
warn <<- w # <<- is not typo
invokeRestart("muffleWarning")
})
if(is.null(warn)){
warn<-'Nothing'
}
if(is.null(err)){
err<-'Nothing'
}
warning=warn
paste(unlist(warning),collapse="")
}
However, thinking back, maybe it is possible to combine these two functions..
or, myTryCatch() already has a tryc()'s skill..
but I am not sure.
The following function gives all the structures of an object in R. This function (plus dput()/str() for object attributes) fully qualifies everything about an arbitrary R object.
ObjectStructure <- function(x, ShowAll=FALSE) {
op <- options("warn")
options(warn=-1) # Assign "-1" to warn option to temporarily close warnings
on.exit(options(op)) # Reset the settings to initial ones after exit from the ObjectStructure function
is.Functions <- grep(methods(is), pattern="<-", invert=TRUE, value=TRUE) # 55 (is.X) functions
isDotlessFunctions <- character()
packs <- c('base', 'utils', 'methods') # include more packages if needed
for (pkg in packs) {
library(pkg, character.only = TRUE)
objects <- grep("^is.+\\w$", ls(envir=as.environment(paste('package', pkg, sep=':'))), value=TRUE)
objects <- grep("<-", objects, invert=TRUE, value=TRUE)
if (length(objects) > 0)
isDotlessFunctions <- append(isDotlessFunctions, objects[sapply(objects, function(x) class(eval(parse(text=x))) == "function")])
}
FunctionsList <- union(is.Functions, isDotlessFunctions)
result <- data.frame(test=character(), value=character(), warn=character(), stringsAsFactors=FALSE)
# Loop all the "is.(...)" functions and save the results
for(islev in FunctionsList) {
res <- try(eval(call(islev,x)),silent=TRUE) # In cases of error, let error be processed
# in errored cases, try produces try-error object that contains error message
if(class(res)=="try-error") { next() # In case of error, ignore current iteration and pass to the next iteration in the loop
} else if (length(res)>1) {
warn <- "*Applies only to the first element of the provided object"
value <- paste(res,"*",sep="")
} else {
warn <- ""
value <- res
}
result[nrow(result)+1,] <- list(islev, value, warn)
}
result <- result[order(result$value, decreasing=TRUE),] # Order the results
rownames(result) <- NULL # to arrange row numbers in a way that they start from 1 and ordered
if(ShowAll) return(result) # Show only the structures that give TRUE
else return(result[which(result$value=="TRUE"),]) # All the function results that give a TRUE/FALSE value
}
ObjectStructure(1L, TRUE) # See how the function works
As the user #dominic-comtois emphasize here, more packages can be included in packs variable. So,
What are the packages that include internal is.X (is.vector, is.numeric, etc.) and all isX (isS4, isOpen etc.) functions in general?
I would like to write a wrapper for the debug() function so that I can remove all debugging flag when needed.
For functions in the search path it is simple.
.debugged <- NULL
debug.wrapper <- function(fun){
f <- deparse(substitute(fun))
.debugged <<- unique(c(.debugged, f))
debug(f)
}
debug.wrapper.off <- function() {
z=sapply(.debugged, undebug)
.debugged <<- NULL
}
It works because I can use the character version of the function symbol.
f <- function() print("hello")
debug.wrapper(f)
isdebugged(f)
# [1] TRUE
debug.wrapper.off()
isdebugged(f)
# [1] FALSE
Anyway with namespaces it does not work:
debug.wrapper(tools:::psnice)
# Error in debug(f) could not find function "tools:::psnice"
Also:
debug(substitute(tools:::psnice))
# Error in debug(fun, text, condition) : argument must be a function
How can I store the function symbols for later reuse?
Note
It seems that concatenating function symbols creates a sort of "soft pointer" rather than a copy, that is:
x <- c(tools:::psnice, identity)
Taking the first function, we get:
x[[1]]
# function (pid = Sys.getpid(), value = NA_integer_)
# {
# res <- .Call(ps_priority, pid, value)
# if (is.na(value))
# res
# else invisible(res)
# }
# <bytecode: 0x00000000189f1f80>
# <environment: namespace:tools>
The bytecode and environment are the same as with tools:::psnice.
Therefore un/debug(x[[1]]) is like un/debug(tools:::psnice)
Solution
Given the note above, the solution is trivial. Debug wrappers are defined as:
.debugged <- NULL
debug.wrapper <- function(fun){
.debugged <<- unique(c(.debugged, fun))
debug(fun)
}
debug.wrapper.off <- function() {
z=sapply(.debugged, undebug)
.debugged <<- NULL
}
Using them brings:
f <- function() print("hello")
debug.wrapper(f)
debug.wrapper(tools:::psnice)
isdebugged(f)
# [1] TRUE
isdebugged(tools:::psnice)
# [1] TRUE
debug.wrapper.off()
isdebugged(f)
isdebugged(tools:::psnice)
.debugged
# NULL
Of course, one can add conditions to manage the case when passed fun is a string.
Thanks to #Rich Scriven, who gave useful insights.
So I'm changing the class of some functions that I'm building in R in order to add a description attribute and because I want to use S3 generics to handle everything for me. Basically, I have a structure like
foo <- function(x) x + 1
addFunction <- function(f, description) {
class(f) <- c("addFunction", "function")
attr(f, "description") <- description
f
}
foo <- addFunction(foo, "Add one")
and then I do stuff like
description <- function(x) UseMethod("description")
description.default <- function(x) deparse(substitute(x))
description.addFunction <- function(x) attr(x, "description")
This works fine, but it's not that elegant. I'm wondering if it is possible to define a new class of functions such that instances of this class can be defined in a syntax similar to the function syntax. In other words, is it possible to define addFunction such that foo is generated in the following way:
foo <- addFunction(description = "Add one", x) {
x + 1
}
(or something similar, I have no strong feelings about where the attribute should be added to the function)?
Thanks for reading!
Update: I have experimented a bit more with the idea, but haven't really reached any concrete results yet - so this is just an overview of my current (updated) thoughts on the subject:
I tried the idea of just copying the function()-function, giving it a different name and then manipulating it afterwards. However, this does not work and I would love any inputs on what is happening here:
> function2 <- `function`
> identical(`function`, function2)
[1] TRUE
> function(x) x
function(x) x
> function2(x) x
Error: unexpected symbol in "function2(x) x"
> function2(x)
Error: incorrect number of arguments to "function"
As function() is a primitive function, I tried looking at the C-code defining it for more clues. I was particularly intrigued by the error message from the function2(x) call. The C-code underlying function() is
/* Declared with a variable number of args in names.c */
SEXP attribute_hidden do_function(SEXP call, SEXP op, SEXP args, SEXP rho)
{
SEXP rval, srcref;
if (TYPEOF(op) == PROMSXP) {
op = forcePromise(op);
SET_NAMED(op, 2);
}
if (length(args) < 2) WrongArgCount("function");
CheckFormals(CAR(args));
rval = mkCLOSXP(CAR(args), CADR(args), rho);
srcref = CADDR(args);
if (!isNull(srcref)) setAttrib(rval, R_SrcrefSymbol, srcref);
return rval;
}
and from this, I conclude that for some reason, at least two of the four arguments call, op, args and rho are now required. From the signature of do_function() I am guessing that the four arguments passed to do_function should be a call, a promise, a list of arguments and then maybe an environment. I tried a lot of different combinations for function2 (including setting up to two of these arguments to NULL), but I keep getting the same (new) error message:
> function2(call("sum", 2, 1), NULL, list(x=NULL), baseenv())
Error: invalid formal argument list for "function"
> function2(call("sum", 2, 1), NULL, list(x=NULL), NULL)
Error: invalid formal argument list for "function"
This error message is returned from the C-function CheckFormals(), which I also looked up:
/* used in coerce.c */
void attribute_hidden CheckFormals(SEXP ls)
{
if (isList(ls)) {
for (; ls != R_NilValue; ls = CDR(ls))
if (TYPEOF(TAG(ls)) != SYMSXP)
goto err;
return;
}
err:
error(_("invalid formal argument list for \"function\""));
}
I'm not fluent in C at all, so from here on I'm not quite sure what to do next.
So these are my updated questions:
Why do function and function2 not behave in the same way? Why
do I need to call function2 using a different syntax when they are
deemed identical in R?
What are the proper arguments of function2
such that function2([arguments]) will actually define a function?
Some keywords in R such as if and function have special syntax in the way that the underlying functions get called. It's quite easy to use if as a function if desired, e.g.
`if`(1 == 1, "True", "False")
is equivalent to
if (1 == 1) {
"True"
} else {
"False"
}
function is trickier. There's some help on this at a previous question.
For your current problem here's one solution:
# Your S3 methods
description <- function(x) UseMethod("description")
description.default <- function(x) deparse(substitute(x))
description.addFunction <- function(x) attr(x, "description")
# Creates the pairlist for arguments, handling arguments with no defaults
# properly. Also brings in the description
addFunction <- function(description, ...) {
args <- eval(substitute(alist(...)))
tmp <- names(args)
if (is.null(tmp)) tmp <- rep("", length(args))
names(args)[tmp==""] <- args[tmp==""]
args[tmp==""] <- list(alist(x=)$x)
list(args = as.pairlist(args), description = description)
}
# Actually creates the function using the structure created by addFunction and the body
`%{%` <- function(args, body) {
stopifnot(is.pairlist(args$args), class(substitute(body)) == "{")
f <- eval(call("function", args$args, substitute(body), parent.frame()))
class(f) <- c("addFunction", "function")
attr(f, "description") <- args$description
f
}
# Example. Note that the braces {} are mandatory even for one line functions
foo <- addFunction(description = "Add one", x) %{% {
x + 1
}
foo(1)
#[1] 2
Suppose we have this functions in a R package.
prova <- function() {
print(attr(prova, 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
'myattr<-' <- function(x, value) {
attr(x, 'myattr') <- value
x
}
myattr <- function(x) attr(x, 'myattr')
So, I install the package and then I test it. This is the result:
prova()
# NULL
# NULL
myattr(prova) <- 'ciao' # setting 'ciao' for 'myattr' attribute
prova()
# NULL
# NULL # Why NULL here ?
myattr(prova)
# [1] "ciao"
attr(prova, 'myattr')
# [1] "ciao"
The question is: how to get the attribute of the function from within itself?
Inside the function itself I cannot get its attribute, as demonstrated by the example.
I suppose that the solution will be of the serie "computing on the language" (match.call()[[1L]], substitute, environments and friends). Am I wrong?
I think that the important point here is that this function is in a package (so, it has its environment and namespace) and I need its attribute inside itself, in the package, not outside.
you can use get with the envir argument.
prova <- function() {
print(attr(get("prova", envir=envir.prova), 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
eg:
envir.prova <- environment()
prova()
# NULL
# NULL
myattr(prova) <- 'ciao'
prova()
# [1] "ciao"
# [1] "ciao"
Where envir.prova is a variable whose value you set to the environment in which prova is defined.
Alternatively you can use get(.. envir=parent.frame()), but that is less reliable as then you have to track the calls too, and ensure against another object with the same name between the target environment and the calling environment.
Update regarding question in the comments:
regarding using parent.frame() versus using an explicit environment name: parent.frame, as the name suggests, goes "up one level." Often, that is exactly where you want to go, so that works fine. And yet, even when your goal is get an object in an environment further up, R searches up the call stack until it finds the object with the matching name. So very often, parent.frame() is just fine.
HOWEVER if there are multiple calls between where you are invoking parent.frame() and where the object is located AND in one of the intermediary environments there exists another object with the same name, then R will stop at that intermediary environment and return its object, which is not the object you were looking for.
Therefore, parent.frame() has an argument n (which defaults to 1), so that you can tell R to begin it's search at n levels back.
This is the "keeping track" that I refer to, where the developer has to be mindful of the number of calls in between. The straightforward way to go about this is to have an n argument in every function that is calling the function in question, and have that value default to 1. Then for the envir argument, you use: get/assign/eval/etc (.. , envir=parent.frame(n=n) )
Then if you call Func2 from Func1, (both Func1 and Func2 have an n argument), and Func2 is calling prova, you use:
Func1 <- function(x, y, ..., n=1) {
... some stuff ...
Func2( <some, parameters, etc,> n=n+1)
}
Func2 <- function(a, b, c, ..., n=1) {
.... some stuff....
eval(quote(prova()), envir=parent.frame(n=n) )
}
As you can see, it is not complicated but it is * tedious* and sometimes what seems like a bug creeps in, which is simply forgetting to carry the n over.
Therefore, I prefer to use a fixed variable with the environment name.
The solution that I found is:
myattr <- function(x) attr(x, 'myattr')
'myattr<-' <- function(x, value) {
# check that x is a function (e.g. the prova function)
# checks on value (e.g. also value is a function with a given precise signature)
attr(x, 'myattr') <- value
x
}
prova <- function(..., env = parent.frame()) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], env)
# print(eval(as.call(c(myattr, this)), env)) # alternative
print(myattr(this))
# print(attr(this, 'myattr')
invisible(TRUE)
}
I want to thank #RicardoSaporta for the help and the clarification about keeping tracks of the calls.
This solution doesn't work when e.g. myattr(prova) <- function() TRUE is nested in func1 while prova is called in func2 (that it's called by func1). Unless you do not properly update its parameter env ...
For completeness, following the suggestion of #RicardoSaporta, I slightly modified the prova function:
prova <- function(..., pos = 1L) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], parent.frame(n = pos)
print(myattr(this))
# ...
}
This way, it works also when nested, if the the correct pos parameter is passed in.
With this modification it is easier to go to fish out the environment in which you set the attribute on the function prova.
myfun1 <- function() {
myattr(prova) <- function() print(FALSE)
myfun2(n = 2)
}
myfun2 <- function(n) {
prova(pos = n)
}
myfun1()
# function() print(FALSE)
# <environment: 0x22e8208>