Correct use of tryCatch to skip plot error - r

This is a follow up question to this question which didn't get any traction. I realise now that this has nothing to do with purrr::possibly specifically as tryCatch also doesn't work like I thought.
I'm trying to write a function which will run any other arbitrary function without throwing an error. This might not be good practice but I want to understand why this does not work:
library(ggplot2)
## function I thought I could use to run other functions safely
safe_plot <- function(.FUN, ...) {
tryCatch({
.FUN(...)
},
error = function(e) {
message(e)
print("I have got passed the error")
})
}
## Simple plot function
plot_fun <- function(df) {
ggplot(df, aes(xvar, yvar)) +
geom_point()
}
## Works for good data
some_data <- data.frame(xvar = 1:10, yvar = 1:10)
safe_plot(.FUN = plot_fun, df = some_data)
## Works here for data that is not defined:
> safe_plot(.FUN = plot_fun, df = not_defined)
object 'not_defined' not found[1] " I have got passed the error"
## Falls over for an empty data frame
empty_data <- data.frame()
safe_plot(.FUN = plot_fun, df = empty_data)
## Why does't this get past this error and print the message?
> safe_plot(.FUN = plot_fun, df = empty_data)
Error in FUN(X[[i]], ...) : object 'xvar' not found
Why doesn't the last call get to the print statement? I suspect I am abusing tryCatch but I don't know why. Reading some other questions (1, 2) I checked for the class of what tryCatch returns:
> ## What is returned
> empty_data <- data.frame()
> what_is_this <- safe_plot(.FUN = plot_fun, df = empty_data)
> what_is_this
Error in FUN(X[[i]], ...) : object 'xvar' not found
> class(what_is_this)
[1] "gg" "ggplot"
So in the plot function above ggplot(df, does not fail because there is a df of empty_data. The error must occur in aes(xar, . But why does this not return an error class and instead returns a ggplot?

Related

How to use purrr::possibly to catch plot errors

I tried to write a function which would run other functions "safely". By safely here I just mean I don't want scripts to fall over if some plot functions fail. Simple example:
library(ggplot2)
library(purrr)
## function I thought I could use to run other functions safely
safe_plot <- function(.FUN, ...) {
safe_fun <- purrr::possibly(.f = .FUN, otherwise = NULL)
safe_fun(...)
}
## Simple plot function
plot_fun <- function(df) {
df %>%
ggplot(aes(xvar, yvar)) +
geom_point()
}
## Works for good data
some_data <- data.frame(xvar = 1:10, yvar = 1:10)
safe_plot(.FUN = plot_fun, df = some_data)
## Works here for data that is not defined:
> safe_plot(.FUN = plot_fun, df = not_defined)
NULL
## Falls over for an empty data frame
empty_data <- data.frame()
> safe_plot(.FUN = plot_fun, df = empty_data)
Error in FUN(X[[i]], ...) : object 'xvar' not found
What I would like is a generic function that I can pass plot functions to which - if an error occurs - won't stop a script. I know I could do the following for example:
## Simple plot function with check
plot_fun <- function(df) {
if (nrow(df) > 0) {
df %>%
ggplot(aes(xvar, yvar)) +
geom_point()
}
}
But here I'm more interested in learning about why purrr::possibly did not catch the error and a perhaps a better way to go about this problem...assuming I don't care what may have caused an error.
You could directly use tryCatch.
In either cases, the reason for the error not being catched by possibly and eventually tryCatch is due to the lazy evaluation of print.ggplot which occurs outside tryCatch.
Solution is to force print() into tryCatch():
library(ggplot2)
safe_plot <- function(.FUN, ...) {
tryCatch(print(.FUN(...)),error = function(e) NULL)
}
## Simple plot function
plot_fun <- function(df) {
df %>%
ggplot(aes(xvar, yvar)) +
geom_point()
}
## Works for good data
some_data <- data.frame(xvar = 1:10, yvar = 1:10)
safe_plot(.FUN = plot_fun, df = some_data)
#> NULL
empty_data <- data.frame()
safe_plot(.FUN = plot_fun, df = empty_data)
#> NULL

Trigger a helpful error if I provide a wrongly named argument

When writing a function including a ... argument, making a mistake
in an argument name will not trigger an error immediately.
it's especially annoying when we forget to "dot" an argument
addxy <- function(..., .x, .y, .z){
.x + .y
}
addxy(.x=1,.y=2)
# [1] 3
addxy(.x=1,y=2)
# Error in addxy(.x = 1, y = 2) : argument ".y" is missing, with no default
In a real situation the error might be less explicit, and y might be a valid
input or not so I can't dismiss it from the start.
How can I give a friendly error in this case to help the user correct their call ?
We can wrap the function call in try and then if it fails, do some
gymnastics to get analyse the original call and the formals and determine what
might have gone wrong.
Putting it all in one function we get :
with_friendly_dot_error <- function(fun){
fiendly_fun <- fun
body(fiendly_fun) <- substitute({
MC <- match.call()
MC[[1]] <- quote(fun)
res <- try(eval.parent(MC),silent = TRUE)
if(inherits(res,"try-error")){
frmls <- setdiff(names(formals()),"...")
dot_names <- names(eval(substitute(alist(...))))
candidates <- intersect(paste0(".",dot_names), frmls)
stop(attr(res,"condition")$message,
"\nDid you forget the dots in argument(s): ",
paste0(candidates, collapse = ", ")," ?")
}
res
})
fiendly_fun
}
Works normally when no error :
with_friendly_dot_error(addxy)(.x=1,.y=2)
# [1] 3
Gives helpful error when relevant :
with_friendly_dot_error(addxy)(.x=1, y=2)
# Error in with_friendly_dot_error(addxy)(.x = 1, y = 2) :
# argument ".y" is missing, with no default
# Did you forget the dots in argument(s): .y ?

lapply not working with my own variable

I created this custom function of mine here:
generate_portfolio <- function(price_list_w_returns, initial_AUM){
price_list_w_returns1 = lapply(price_list_w_returns, transform, USD_portfolio = initial_AUM*cum_returns )
}
where initial_AUM is something that someone can change to whatever number he or she wants.
USD_portfolio in this case is a new column that i am trying to create and cum_returns is an existing column. price_list_w_returns is a list of dataframes with similar columns and rows.
The error i am getting is:
Error in eval(substitute(list(...)), `_data`, parent.frame()) :
object 'initial_AUM' not found
The problem is of not fully correct definition of lambda-function in lapply. You are trying to pass an argument, which is not defined, to the function. Please see below how it is solved in generate_portfolio2:
# simulate list of data.frames
price_list_w_returns <- replicate(10, data.frame(id = 1:10, cum_returns = abs(rnorm(10)/ 10)), simplify = FALSE)
str(price_list_w_returns)
## Not run:
generate_portfolio <- function(price_list_w_returns, initial_AUM){
price_list_w_returns1 = lapply(price_list_w_returns, transform, USD_portfolio = initial_AUM * cum_returns )
}
generate_portfolio(price_list_w_returns, 2)
## Error in eval(substitute(list(...)), `_data`, parent.frame()) :
## object 'initial_AUM' not found
## End(Not run)
generate_portfolio2 <- function(price_list_w_returns, initial_AUM){
lapply(price_list_w_returns, function(x) {
x$USD_portfolio = initial_AUM * x$cum_returns
x
})
}
generate_portfolio2(price_list_w_returns, 2)

"undefined columns selected" in lapply() call

Error in [.data.frame(meuse#data, , x) : undefined columns selected
MWE:
library(automap)
data(meuse)
coordinates(meuse) = ~ x+y
lapply(1:1, function (x) {
automap::autofitVariogram(meuse#data[, x] ~ 1, input_data = meuse)
})
Executing meuse#data[,1] outside the lapplycall works fine and returns a numeric vector.
Also automap::autofitVariogram(meuse#data[, 1] ~ 1, input_data = meuse) runs fine.
Hence I expected it the problem to be caused by the lapply call. However, using another dataset of mine (SpPointsDaFr) does not cause the problem and runs fine.
Looking at the error message more closely, I am not sure if the second "comma" after "meuse#data," is always present in 'subset' error messages?
Edit:
Another approach which does not work: Addressing via string (note that I only use [1:1] instead of [1] for further function use)
cols <- names(meuse#data) [1:1]
> lapply(cols, function (x) {
+ automap::autofitVariogram(meuse#data[, x] ~ 1, input_data = meuse)
+ })
I found a workaround. Addressing/subsetting the required values of meuse before the call of autofitVariogram and then putting the object tmp in works.
lapply(1:1, function (x) {
tmp <- meuse#data[, x]
emp.svgm <- automap::autofitVariogram(tmp ~ 1, meuse)
})
The error when trying to subset inside the function call is still open for discussion though.

I do not understand error "object not found" inside the function

I have roughly this function:
plot_pca_models <- function(models, id) {
library(lattice)
splom(models, groups=id)
}
and I'm calling it like this:
plot_pca_models(data.pca, log$id)
wich results in this error:
Error in eval(expr, envir, enclos) : object 'id' not found
when I call it without the wrapping function:
splom(data.pca, groups=log$id)
it raises this error:
Error in log$id : object of type 'special' is not subsettable
but when I do this:
id <- log$id
splom(models, groups=id)
it behaves as expected.
Please can anybody explain why it behaves like this and how to correct it? Thanks.
btw:
I'm aware of similar questions here, eg:
Help understand the error in a function I defined in R
Object not found error with ddply inside a function
Object disappears from namespace in function
but none of them helped me.
edit:
As requested, there is full "plot_pca_models" function:
plot_pca_models <- function(data, id, sel=c(1:4), comp=1) {
# 'data' ... princomp objects
# 'id' ... list of samples id (classes)
# 'sel' ... list of models to compare
# 'comp' ... which pca component to compare
library(lattice)
models <- c()
models.size <- 1:length(data)
for(model in models.size) {
models <- c(models, list(data[[model]]$scores[,comp]))
}
names(models) <- 1:length(data)
models <- do.call(cbind, models[sel])
splom(models, groups=id)
}
edit2:
I've managed to make the problem reproducible.
require(lattice)
my.data <- data.frame(pca1 = rnorm(100), pca2 = rnorm(100), pca3 = rnorm(100))
my.id <- data.frame(id = sample(letters[1:4], 100, replace = TRUE))
plot_pca_models2 <- function(x, ajdi) {
splom(x, group = ajdi)
}
plot_pca_models2(x = my.data, ajdi = my.id$id)
which produce the same error like above.
The problem is that splom evaluates its groups argument in a nonstandard way.A quick fix is to rewrite your function so that it constructs the call with the appropriate syntax:
f <- function(data, id)
eval(substitute(splom(data, groups=.id), list(.id=id)))
# test it
ir <- iris[-5]
sp <- iris[, 5]
f(ir, sp)
log is a function in base R. Good practice is to not name objects after functions...it can create confusion. Type log$test into a clean R session and you'll see what's happening:
object of type 'special' is not subsettable
Here's a modification of Hong Oi's answer. First I would recommend to include id in the main data frame, i.e
my.data <- data.frame(pca1 = rnorm(100), pca2 = rnorm(100), pca3 = rnorm(100), id = sample(letters[1:4], 100, replace = TRUE))
.. and then
plot_pca_models2 <- function(x, ajdi) {
Call <- bquote(splom(x, group = x[[.(ajdi)]]))
eval(Call)
}
plot_pca_models2(x = my.data, ajdi = "id")
The cause of the confusion is the following line in lattice:::splom.formula:
groups <- eval(substitute(groups), data, environment(formula))
... whose only point is to be able to specify groups without quotation marks, that is,
# instead of
splom(DATA, groups="ID")
# you can now be much shorter, thanks to eval and substitute:
splom(DATA, groups=ID)
But of course, this makes using splom (and other functions e.g. substitute which use "nonstandard evaluation") harder to use from within other functions, and is against the philosophy that is "mostly" followed in the rest of R.

Resources