Debugging lapply/sapply calls - r

Code written using lapply and friends is usually easier on the eyes and more Rish than loops. I love lapply just as much as the next guy, but how do I debug it when things go wrong? For example:
> ## a list composed of numeric elements
> x <- as.list(-2:2)
> ## turn one of the elements into characters
> x[[2]] <- "what?!?"
>
> ## using sapply
> sapply(x, function(x) 1/x)
Error in 1/x : non-numeric argument to binary operator
Had I used a for loop:
> y <- rep(NA, length(x))
> for (i in 1:length(x)) {
+ y[i] <- 1/x[[i]]
+ }
Error in 1/x[[i]] : non-numeric argument to binary operator
But I would know where the error happened:
> i
[1] 2
What should I do when using lapply/sapply?

Use the standard R debugging techniques to stop exactly when the error occurs:
options(error = browser)
or
options(error = recover)
When done, revert to standard behaviour:
options(error = NULL)

If you wrap your inner function with a try() statement, you get more information:
> sapply(x, function(x) try(1/x))
Error in 1/x : non-numeric argument to binary operator
[1] "-0.5"
[2] "Error in 1/x : non-numeric argument to binary operator\n"
[3] "Inf"
[4] "1"
[5] "0.5"
In this case, you can see which index fails.
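To go from reading the printed output to something you can act on, one option is to keep the try() results in a list and ask which elements are errors. This is a minimal sketch using the same x as in the question; the names res and failed are only illustrative.

```r
x <- as.list(-2:2)
x[[2]] <- "what?!?"

# lapply keeps each result as-is, so failures stay "try-error" objects
# instead of being flattened into a character vector as sapply does.
res <- lapply(x, function(xi) try(1 / xi, silent = TRUE))

# Positions of the elements whose computation failed.
failed <- which(vapply(res, inherits, logical(1), what = "try-error"))
failed
```

Here failed holds the index of the element that could not be inverted, and x[failed] gives you the offending inputs to inspect.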

Use the plyr package, with .inform = TRUE:
library(plyr)
laply(x, function(x) 1/x, .inform = TRUE)

Like geoffjentry said:
> sapply(x, function(x) {
    res <- tryCatch(1 / x,
                    error=function(e) {
                      cat("Failed on x = ", x, "\n", sep="") ## browser()
                      stop(e)
                    })
  })
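If the offending value is large or does not print well, reporting its position can be more useful than printing the value. The following is a sketch of the same idea that iterates over indices; safe_inverse and the message wording are my own, not part of any package.

```r
x <- as.list(-2:2)
x[[2]] <- "what?!?"

# Iterate over positions so the handler knows which element failed,
# then re-raise the error with the index added to the message.
safe_inverse <- function(i) {
  tryCatch(
    1 / x[[i]],
    error = function(e) {
      stop(sprintf("element %d failed: %s", i, conditionMessage(e)),
           call. = FALSE)
    }
  )
}

# Wrapped in tryCatch here only so that the example runs to completion.
msg <- tryCatch(sapply(seq_along(x), safe_inverse),
                error = function(e) conditionMessage(e))
msg
```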
Also, your for loop could be rewritten to be much cleaner (possibly a little slower):
> y <- NULL
> for (xi in x)
+   y <- c(y, 1 / xi)
Error in 1/xi : non-numeric argument to binary operator
For loops are slow in R, but unless you really need the speed I'd go with a simple iterative approach over a confusing list comprehension.
If I need to figure out some code on the fly, I'll always go:
sapply(x, function(x) {
  browser()
  ...
})
And write the code from inside the function so I see what I'm getting.
-- Dan

Using debug or browser isn't a good idea in this case, because it will stop your code so frequently. Use try or tryCatch instead, and deal with the situation when it arises.

You can debug() the function, or put a browser() inside the body. This is only particularly useful if you don't have a gajillion iterations to work through.
Also, I've not personally done this, but I suspect you could put a browser() in as part of a tryCatch(), such that when the error is generated you can use the browser() interface.

I've faced the same problem and have tended to make my calls with (l)(m)(s)(t)apply to be functions that I can debug().
So, instead of blah<-sapply(x,function(x){ x+1 })
I'd say,
myfn<-function(x){x+1}
blah<-sapply(x,function(x){myfn(x)})
and use debug(myfn) with options(error=recover).
I also like the advice about sticking print() lines here and there to see what is happening.
Even better is to design a test that myfn(x) has to pass, and to be sure it passes that test before subjecting it to sapply. I only have the patience to do this about half the time.
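For the example in the original question, that workflow might look like the sketch below. The name inverse is hypothetical, and the checks are just the kind of quick test I have in mind, not a full test suite.

```r
# A named function can be tested, debug()-ed, or given a browser() call.
inverse <- function(v) 1 / v

# Quick checks before handing the function to sapply.
stopifnot(identical(inverse(2), 0.5))
stopifnot(identical(inverse(0), Inf))

# A named function can be passed to sapply directly, no wrapper needed.
y <- sapply(list(-2, -1, 1, 2), inverse)
y
```

Calling debug(inverse) before the sapply line would then drop you into the browser on each call.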

Related

Is there a way to use tryCatch (or similar) in R as a loop, or to manipulate the expr in the warning argument?

I have a regression model (lm or glm or lmer ...) and I do fitmodel <- lm(inputs) where inputs changes inside a loop (the formula and the data). Then, if the model function does not produce any warning I want to keep fitmodel, but if I get a warning I want to update the model and I want the warning not printed, so I do fitmodel <- lm(inputs) inside tryCatch. So, if it produces a warning, inside warning = function(w){f(fitmodel)}, f(fitmodel) would be something like
fitmodel <- update(fitmodel, something suitable to do on the model)
In fact, this assignment would be inside an if-else structure, so that depending on the warning (if w$message satisfies something) I would adapt the "something suitable to do on the model" part inside update.
The problem is that I get Error in ... object 'fitmodel' not found. If I use withCallingHandlers with invokeRestart, it just finishes computing the model with the warning, without updating it. If I add fitmodel <- lm(inputs) again inside "something suitable to do on the model", the warning is printed. I think I could try suppressWarnings(fitmodel <- lm(inputs)), but that does not seem an elegant solution, since I have to write the line fitmodel <- lm(inputs) twice and therefore do all the computation twice (inside expr and inside warning).
Summarising, what I would like but fails is:
tryCatch(expr = {fitmodel <- lm(inputs)},
         warning = function(w) {
           if (w$message satisfies something) {
             fitmodel <- update(fitmodel, something suitable to do on the model)
           } else if (w$message satisfies something2) {
             fitmodel <- update(fitmodel, something2 suitable to do on the model)
           }
         }
)
What can I do?
The loop part of the question arises because I thought of it as follows (maybe it is another question, but for the moment I leave it here): it can happen that after the update I get another warning, so I would do something like while(get a warning on update){update}; in some way, this update inside warning should also be treated as expr. Is something like this possible?
Thank you very much!
Generic version of the question with minimal example:
Let's say I have a tryCatch(expr = {result <- operations}, warning = function(w){f(...)}) and if I get a warning in expr (produced in fact in operations) I want to do something with result, so I would do warning = function(w){f(result)}, but then I get Error in ... object 'result' not found.
A minimal example:
y <- "a"
tryCatch(expr = {x <- as.numeric(y)},
warning = function(w) {print(x)})
Error in ... object 'x' not found
I tried using withCallingHandlers instead of tryCatch without success, and also using invokeRestart but it does the expression part, not what I want to do when I get a warning.
Could you help me?
Thank you!
The problem, fundamentally, is that the handler is called before the assignment happens. And even if that weren’t the case, the handler runs in a different scope than the tryCatch expression, so the handler can’t access the names in the other scope.
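The first point can be checked directly. The sketch below (demo is a hypothetical helper, written only for this illustration) shows that once the warning handler of tryCatch runs, the rest of the expression is abandoned and x is never assigned.

```r
demo <- function() {
  y <- "a"
  outcome <- tryCatch(
    {
      x <- as.numeric(y)      # the warning is signalled inside as.numeric()
      "expression finished"   # never reached: tryCatch has already exited
    },
    warning = function(w) "handler ran"
  )
  # Report the outcome and whether x exists in this function's own frame.
  list(outcome = outcome,
       x_assigned = exists("x", envir = environment(), inherits = FALSE))
}

str(demo())
```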
We need to separate the handling from the value transformation.
For errors (but not warnings), base R provides the function try, which wraps tryCatch to achieve this effect. However, using try is discouraged, because its return type is unsound [1]. As mentioned in the answer by ekoam, ‘purrr’ provides soundly typed functional wrappers (e.g. safely) to achieve a similar effect.
However, we can also build our own, which might be a better fit in this situation:
with_warning = function (expr) {
  self = environment()
  warning = NULL
  result = withCallingHandlers(expr, warning = function (w) {
    self$warning = w
    tryInvokeRestart('muffleWarning')
  })
  list(result = result, warning = warning)
}
This gives us a wrapper that distinguishes between the result value and a warning. We can now use it to implement your requirement:
fitmodel = with(with_warning(lm(inputs)), {
  if (! is.null(warning)) {
    if (conditionMessage(warning) satisfies something) {
      update(result, something suitable to do on the model)
    } else {
      update(result, something2 suitable to do on the model)
    }
  } else {
    result
  }
})
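To see the wrapper in action on the toy example from the question, here is a small runnable sketch. The function is restated so that the block runs on its own, and it uses invokeRestart rather than tryInvokeRestart only so that it also works on R versions before 4.0.0; that substitution is my assumption, not part of the code above.

```r
with_warning <- function(expr) {
  self <- environment()
  warning <- NULL
  result <- withCallingHandlers(expr, warning = function(w) {
    self$warning <- w                 # remember the condition
    invokeRestart("muffleWarning")    # and let evaluation continue
  })
  list(result = result, warning = warning)
}

ok  <- with_warning(as.numeric("123"))
bad <- with_warning(as.numeric("a"))

is.null(ok$warning)                # no warning was captured
inherits(bad$warning, "warning")   # the condition object was captured
bad$result                         # evaluation finished despite the warning
```

Note that if the expression raises several warnings, only the last one is kept; collecting them in a list would be a straightforward extension.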
[1] What this means is that try’s return type doesn’t distinguish between an error and a non-error value of type try-error. This is a real situation that can occur, for example, when nesting multiple try calls.
It seems that you are looking for a functional wrapper that captures both the returned value and side effects of a function call. I think purrr::quietly is a perfect candidate for this kind of task. Consider something like this
quietly <- purrr::quietly
foo <- function(x) {
  if (x < 3)
    warning(x, " is less than 3")
  if (x < 4)
    warning(x, " is less than 4")
  x
}
update_foo <- function(x, y) {
  x <- x + y
  foo(x)
}
keep_doing <- function(inputs) {
  out <- quietly(foo)(inputs)
  repeat {
    if (length(out$warnings) < 1L)
      return(out$result)
    cat(paste0(out$warnings, collapse = ", "), "\n")
    # This is for you to see the process. You can delete this line.
    if (grepl("less than 3", out$warnings[[1L]])) {
      out <- quietly(update_foo)(out$result, 1.5)
    } else if (grepl("less than 4", out$warnings[[1L]])) {
      out <- quietly(update_foo)(out$result, 1)
    }
  }
}
Output
> keep_doing(1)
1 is less than 3, 1 is less than 4
2.5 is less than 3, 2.5 is less than 4
[1] 4
> keep_doing(3)
3 is less than 4
[1] 4
Are you looking for something like the following? If it is run with y <- "123", the "OK" message will be printed.
y <- "a"
#y <- "123"
x <- tryCatch(as.numeric(y),
warning = function(w) w
)
if(inherits(x, "warning")){
message(x$message)
} else{
message(paste("OK:", x))
}
It's easier to test several argument values with the code above rewritten as a function.
testWarning <- function(x){
  out <- tryCatch(as.numeric(x),
                  warning = function(w) w
  )
  if(inherits(out, "warning")){
    message(out$message)
  } else{
    message(paste("OK:", out))
  }
  invisible(out)
}
testWarning("a")
#NAs introduced by coercion
testWarning("123")
#OK: 123
Maybe you could assign x again in the handling condition?
tryCatch(
warning = function(cnd) {
x <- suppressWarnings(as.numeric(y))
print(x)},
expr = {x <- as.numeric(y)}
)
#> [1] NA
Perhaps not the most elegant answer, but solves your toy example.
Don't put the assignment in the tryCatch call, put it outside. For example,
y <- "a"
x <- tryCatch(expr = {as.numeric(y)},
warning = function(w) {y})
This assigns y to x, but you could put anything in the warning body, and the result will be assigned to x.
Your "what I would like" example is more complicated, because you want access to the expr value, but it hasn't been assigned anywhere at the time the warning is generated. I think you'll have to recalculate it:
fitmodel <- tryCatch(expr = {lm(inputs)},
                     warning = function(w) {
                       if (w$message satisfies something) {
                         update(lm(inputs), something suitable to do on the model)
                       } else if (w$message satisfies something2) {
                         update(lm(inputs), something2 suitable to do on the model)
                       }
                     }
)
Edited to add:
To allow the evaluation to proceed to completion before processing the warning, you can't use tryCatch. The evaluate package has a function (also called evaluate) that can do this. For example,
y <- "a"
res <- evaluate::evaluate(quote(x <- as.numeric(y)))
for (i in seq_along(res)) {
if (inherits(res[[i]], "warning") &&
conditionMessage(res[[i]]) == gettext("NAs introduced by coercion",
domain = "R"))
x <- y
}
Some notes: the res list will contain lots of different things, including messages, warnings, errors, etc. My code only looks at the warnings. I used conditionMessage to extract the warning message, but it will be translated to the local language, so you should use gettext to translate the English version of the message for comparison.

R: eval parse function call not accessing correct environments

I'm trying to read a function call as a string and evaluate this function within another function. I'm using eval(parse(text = )) to evaluate the string. The function I'm calling in the string doesn't seem to have access to the environment in which it is nested. In the code below, my "isgreater" function finds the object y, defined in the global environment, but can't find the object x, defined within the function. Does anybody know why, and how to get around this? I have already tried adding the argument envir = .GlobalEnv to both of my evals, to no avail.
str <- "isgreater(y)"
isgreater <- function(y) {
return(eval(y > x))
}
y <- 4
test <- function() {
x <- 3
return(eval(parse(text = str)))
}
test()
Error:
Error in eval(y > x) : object 'x' not found
Thanks to @MrFlick and @r2evans for their useful and thought-provoking comments. As for a solution, I've found that this code works. x must be passed into the function and cannot be a default value. In the code below, my function generates a list of results with the x variable being changed within the function. If anyone knows why this is, I would love to know.
str <- "isgreater(y, x)"
isgreater <- function(y, x) {
return(eval(y > x))
}
y <- 50
test <- function() {
list <- list()
for(i in 1:100) {
x <- i
bool <- eval(parse(text = str))
list <- append(list, bool)
}
return(list)
}
test()
After considering the points made by @r2evans, I have elected to change my approach to the problem so that I do not arrive at this string-parsing step. Thanks a lot, everyone.
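On the "why": R functions look up free variables in the environment where the function was defined (lexical scoping), not in the environment of whoever calls them. isgreater is defined at top level, so x is searched for there and never in the local frame of test(). Passing x explicitly is the clean fix. Purely to illustrate the mechanism, here is a sketch that reads x from the caller's frame instead; the env parameter is my addition and not something I would recommend in real code.

```r
isgreater <- function(y, env = parent.frame()) {
  # Look x up in the caller's frame rather than relying on lexical scope.
  x <- get("x", envir = env)
  y > x
}

y <- 4
test <- function() {
  x <- 3
  # eval() evaluates in the frame it was called from, i.e. test()'s frame,
  # so that frame is what isgreater sees as its parent.frame().
  eval(parse(text = "isgreater(y)"))
}
test()
```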
I offer the following code, not as a solution, but rather as an insight into how R "works". The code does things that are quite dangerous and should only be examined for its demonstration of how to assert a value for x. Unfortunately, that assertion does destroy the x-value of 3 inside the isgreater-function:
str <- "isgreater(y)"
isgreater <- function(y) {
return(eval( y > x ))
}
y <- 4
test <- function() {
environment(isgreater)$x <- 5
return(eval(parse(text = str) ))
}
test()
#[1] FALSE
The environment<- function is used in the R6 programming paradigm. Take a look at ?R6 if you are interested in working with a more object-oriented set of structures and syntax. (I will note that when I first ran your code, there was an object named x in my workspace and some of my efforts were able to succeed to the extent of not throwing an error, but they were finding that length-10000 vector and filling up my console with logical results until I escaped the console. Yet another argument for passing both x and y to isgreater.)

Warnings instead of errors from assert_that()?

I'm using R's assertthat package and want to (temporarily) output a warning instead of an error on assertion failure. What's the easiest way to do that with the assertthat package?
I realize that wanting warnings instead of errors kind of goes against what assertions are supposed to be used for. In the long term, we indeed want to be outputting errors on assertion failure. In the short term, we still want the code to function even with bad input, since the output with bad inputs is still "good enough" for now.
A simple example: suppose I have a function that takes x as input and outputs x+5. I want to output a warning if x!=3. Since we will be using assert_that ultimately, it would be nice if we can use assertthat package for the warning.
In the long term, we'll use this:
> x <- 3
> fn <- function(x) {assert_that(x==3); return(x+5)}
> fn(3)
[1] 8
> fn(4)
Error: x not equal to 3
In the short term, here's the best I have so far:
> fn <- function(x) {if(!see_if(x==3)) warning(validate_that(x==3)); return(x+5)}
> fn(3)
[1] 8
> fn(4)
[1] 9
Warning message:
In fn(4) : x not equal to 3
I'm looking for a more concise solution, if possible (best case would be passing an "output_warning" parameter to assert_that, but I don't think that exists).
I created a user-defined function which accepts the logical expression you would otherwise pass to validate_that() (ultimately assert_that()). The function prints a message if the assertion fails and remains silent otherwise. See below for usage. You could easily extend this custom function to accept more than one expression if necessary. Note that I also use sys.calls() to obtain the call of the function which called this helper function. This is an important piece of information so you can correlate your warnings with the code that actually generated them.
assert_that_soft <- function(exp) {
  if (!exp) {
    # deparse the caller's call, e.g. fn(8), so it can be pasted into the message
    print(paste("Error in function:",
                deparse(sys.calls()[[sys.nframe() - 1]])))
  }
}
Usage:
> fn <- function(x) { assert_that_soft(x==3); return(x+5) }
> fn(3)
[1] 8
> fn(8)
[1] "Error in function: fn(8)"
[1] 13
Another option is to wrap assert_that in tryCatch.
fn <- function(x) tryCatch(assert_that(x == 3), error = function(e) warning(e), finally = return(x+5))
fn(3)
# [1] 8
fn(8)
# [1] 13
# Warning message:
# x not equal to 3
I think the easiest way to overwrite the function would be to copy most of the assert_that function as is, and call the new function by the same name so you don't need to change all the code when you go into error mode.
assert_that <- function(..., env=parent.frame()) {
res <- see_if(..., env=env)
if (res)
return(TRUE)
warning(attr(res, "msg"))
TRUE
}
fn <- function(x) { assert_that(x==3); return(x+5) }
fn(3)
# [1] 8
fn(8)
# [1] 13
# Warning message:
# In assert_that(x == 3) : x not equal to 3
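If the goal is to flip between "warn" and "stop" without touching the call sites, another possibility is a small wrapper controlled by an option. The sketch below is base R only and does not use assertthat; soft_assert and the option name soft_assert.strict are hypothetical names I made up for illustration.

```r
soft_assert <- function(cond, strict = getOption("soft_assert.strict", FALSE)) {
  # Capture the text of the condition for the message, e.g. "x == 3".
  label <- paste(deparse(substitute(cond)), collapse = " ")
  if (isTRUE(cond)) return(invisible(TRUE))
  msg <- paste(label, "is not TRUE")
  if (strict) stop(msg, call. = FALSE)   # long-term behaviour: fail hard
  warning(msg, call. = FALSE)            # short-term behaviour: warn and go on
  invisible(FALSE)
}

fn <- function(x) {
  soft_assert(x == 3)
  x + 5
}

fn(3)                      # no warning
suppressWarnings(fn(4))    # still returns a value; the warning is hidden here
```

Setting options(soft_assert.strict = TRUE) later turns every failed check into an error.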
I am proposing an extension of the assertthat package to allow for simple warnings, see
https://github.com/hadley/assertthat/issues/69
any feedback is welcome!

R language: Unexpected behaviour with function arguments in lapply

I am attempting to create a list of matrices containing iid Normal numbers. For the sake of a simple example, let the matrices be 4 by 2 and consider a list of length 3. The following code seemed like it should work (to me):
MyMatrix <- lapply(1:3, function() {matrix(rnorm(8), 4, 2)})
But it failed, with the following error:
Error in FUN(1:3[[1L]], ...) : unused argument (1:3[[1]])
On a whim, I tried:
MyMatrix <- lapply(1:3, function(x) {matrix(rnorm(8), 4, 2)})
And it worked! But why? x is not used anywhere in the function, and on experimentation, the behaviour of the expression is not affected by whether x already exists in the workspace or not. It appears to be entirely superfluous.
I am new to R, so I would be very grateful if an experienced user could explain what is going on here and why my first line fails.
You can't have a function that doesn't take arguments and then pass it arguments. Which is exactly what you are doing when you run lapply, as each value is passed in turn as the first argument to the function. E.g.
out <- lapply(1:3, function(x) x)
str(out)
#List of 3
# $ : int 1
# $ : int 2
# $ : int 3
Simple example throwing an error:
test <- function() {"woot"}
test()
#[1] "woot"
test(1)
#Error in test(1) : unused argument (1)
lapply(1:3, test)
#Error in FUN(1:3[[1L]], ...) : unused argument (1:3[[1]])
It's good form for R to error out, as it likely means you're expecting the function's returned result to change based on the arguments passed to the function. And it wouldn't. There are functions like this included in base R, like Sys.time(), which will fail if you try to pass it superfluous arguments which might otherwise make sense:
Sys.time()
#[1] "2014-07-07 13:22:11 EST"
Sys.time(tz="UTC")
#Error in Sys.time(tz = "UTC") : unused argument (tz = "UTC")
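For the original task of building a list of random matrices, two ways to avoid the dummy x are sketched below. The set.seed call is there only to make the example reproducible.

```r
set.seed(1)

# replicate() re-evaluates the expression n times and needs no dummy argument.
m1 <- replicate(3, matrix(rnorm(8), 4, 2), simplify = FALSE)

# With lapply, the function must accept the element it is handed;
# a named parameter or ... both work, even if the value is ignored.
m2 <- lapply(1:3, function(...) matrix(rnorm(8), 4, 2))

length(m1)
dim(m2[[1]])
```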

Get variables in error messages?

Here is my code:
test <- function(y){
irisname <- c("Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species")
if(y %in% irisname){
print(y)
} else{
test <- function(...) stop("dummy error")
test(y)
}
}
> test("ds")
Error in test(y) : dummy error
In the resulting message, "Error in test(y) : dummy error", I need the call to be shown as test("ds"), not test(y).
How can I do that?
This almost does it (there's an extra colon ...), by using call.=FALSE to suppress the information about the call and hacking it into the error message.
update: added quotation marks to error #1; explained a bit more about why this problem is hard.
I don't know the structure of your code, but you are making life considerably harder for yourself by passing objects farther down into the structure. It would be a lot easier to call stop() directly from within your first level, or to use the information carried in y directly within your error message.
test <- function(y,stop=FALSE){
  irisname <- c("Sepal.Length","Sepal.Width",
                "Petal.Length","Petal.Width","Species")
  if (stop) stop(sprintf("premature stop: var %s",y))
  if(y %in% irisname){
    print(y)
  } else{
    test <- function(...) {
      stop(sprintf("in test(\"%s\"): dummy error",...),
           call.=FALSE)
    }
    test(y)
  }
}
test("junk")
## Error: in test("junk"): dummy error
test("junk",stop=TRUE)
## Error in test("junk", stop = TRUE) : premature stop: var junk
Getting rid of the spurious first colon in the output of test("junk") will be considerably harder, because the Error: string is hard-coded within R. Your best bet is probably, somehow, to print your own custom error message and then stop silently, or recreate the behaviour of stop() without generating the message (see ?condition: e.g. return(invisible(simpleError("foo")))). However, you're going to have to jump through a lot of hoops to do this, and it will be hard to ensure that you get exactly the same behaviour that you would have with stop() (e.g. will the error message have been saved in the error-message buffer?)
What you want to do is probably possible by mucking around with R internals enough, but in my opinion so hard that it would be better to rethink the problem ...
Good luck.
You could check the argument right at the start of the function. match.arg might come in handy, or you could print a custom message and return NA.
two updates below
> test <- function(y)
{
if(!(y %in% names(iris))){
message(sprintf('test("%s") is an error. "%s" not found in string', y, y))
return(NA) ## stop all executions and exit the function
}
return(y) ## ... continue
}
> test("Sepal.Length")
# [1] "Sepal.Length"
> test("ds")
# test("ds") is an error. "ds" not found in string
# [1] NA
Add/Edit: Is there a reason why you're nesting a function in the else branch? I removed it, and now get the following. It seems all you are doing is checking an argument, and end-users (and RAM) want to know immediately if they enter an incorrect argument. Otherwise, you're calling up unnecessary jobs and using memory when you don't need to.
test <- function(y){
irisname <- c("Sepal.Length","Sepal.Width","Petal.Length","Petal.Width","Species")
if(y %in% irisname){
print(y)
} else{
stop("dummy error")
}
}
> test("ds")
# Error in test("ds") : dummy error
> test("Sepal.Length")
# [1] "Sepal.Length"
You could also use pmatch, rather than match.arg, since match.arg prints a default error.
> test2 <- function(x)
{
y <- pmatch(x, names(iris))
if(is.na(y)) stop('dummy error')
names(iris)[y]
}
> test2("ds")
# Error in test2("ds") : dummy error
> test2("Sepal.Length")
# [1] "Sepal.Length"
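If the nested helper has to stay, one more possibility is to capture the outer call first and attach it to the error condition yourself. This is only a sketch: fail and this_call are hypothetical names, and the message text is made up.

```r
test <- function(y) {
  irisname <- c("Sepal.Length", "Sepal.Width",
                "Petal.Length", "Petal.Width", "Species")
  this_call <- sys.call()   # the call as the user wrote it

  # Nested helper that raises the error on behalf of the outer call.
  fail <- function(msg) stop(simpleError(msg, call = this_call))

  if (y %in% irisname) y else fail(sprintf("unknown column '%s'", y))
}

err <- tryCatch(test("ds"), error = function(e) e)
conditionMessage(err)
deparse(conditionCall(err))
```

Because sys.call() records the call as written, calling test(v) with a variable v would report test(v); the sprintf in the message is what guarantees the value itself appears.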

Resources