tryCatch with a complicated function and plyr in R - r

I've got a complicated, long function that I'm using to do simulations. It can generate errors, mostly having to do with random vectors ending up with equal values with zero-variance, getting fed either into PCA's or logistic regressions.
I'm executing it on a cluster using doMC and plyr. I don't want to tryCatch every little thing inside of the function, because the possibilities for errors are many and the probabilities of each of them are small.
How do I tryCatch each run, rather than tryCatching every little line?? The code is something like this:
iteration = function(){
a really long simulation function where errors can happen
}
reps = 10000
results = llply(1:reps, function(idx){out<-iteration()},.parallel=TRUE)
EDIT about a year later:
The foreach package makes this substantially easier than it is with plyr
library(foreach)
output <- foreach(i=1:reps, .errorhandling = 'remove')%dopar%{
function
}

Can you wrap the try catch loop in the function you pass to llply?
results = llply(1:reps, function(idx){
out = NA
try({
out<-iteration()
}, silent=T)
out
},.parallel=TRUE)

You can put tryCatch within you function iteration, For example:
iteration <- function(idx){
tryCatch(
{ idx <- idx+1
## very long treatments here...
## I add a dummy error here to test my tryCatch
if(idx %% 2000 ==0) stop("too many iterations")
},error = function(e) print(paste('error',idx)))
}
Now testing it within llply,
library(plyr)
reps = 10000
results = llply(1:reps, iteration,.parallel=TRUE)
1] "error 2000"
[1] "error 4000"
[1] "error 6000"
[1] "error 8000"
[1] "error 10000"

Related

Errors when parallelizing for loop

I'm new to parallel-processing and attempting to parallelize a for loop in which I create new columns in a data frame by matching a column in said data frame with two other data frames. j, the data frame I'm attempting to create columns in is 400000 x 54. a and c, the two data frames I'm matching j with are 5000 x 12 and 45000 x 8 (respectively).
Below is my initial loop prior to the attempt at parallelizing:
for(i in 1:nrow(j)) {
if(j$Inspection_Completed[i] == TRUE) {
next
}
j$Assigned_ID <- a$Driver[match(j$car_name, a$CarName)]
j$Title <- c$Title[match(j$Site_ID, c$LocationID)]
j$Status <- c$Status[match(j$Site_ID, c$LocationID)]
}
So far I have attempted the following:
cl <- snow::makeCluster(4)
doSNOW::registerDoSNOW(cl)
foreach::foreach(i = 1:nrow(j)) foreach::`%dopar%` {
if(j$Inspection_Completed[i] == TRUE) {
next
}
j$Assigned_ID <- a$Driver[match(j$car_name, a$CarName)]
j$Title <- c$Title[match(j$Site_ID, c$LocationID)]
j$Status <- c$Status[match(j$Site_ID, c$LocationID)]
}
stopCluster(cl)
However, when I run the code above I receive several errors.
Error: unexpected symbol in "foreach::foreach(i = 1:nrow(j)) foreach"
And:
Error: object 'i' not found
Lastly:
Error: unexpected '}' in "}"
I'm not sure why I'm getting these errors. None of the columns in any of the data frames are factors and I haven't been able to spot any mismatched parentheses or brackets. I've also done this without the snow and doSNOW packages and the result is the same. I've ran it without the tick marks around dopar as well with the same result.
(I didn't know this before.)
R doesn't like infix operators with the ::-notation. Even if you're doing that for namespace management, R isn't having it:
1L %in% 1:2
# [1] TRUE
1L base::%in% 1:2
# Error: unexpected symbol in "1L base"
1L base::`%in%` 1:2
# Error: unexpected symbol in "1L base"
Workarounds:
Redefine your own infix that just mimics the other, as in
`%myin%` <- base::`%in%`
1L %myin% 1:2
# [1] TRUE
Use explicit namespace inclusion with library(foreach) before that point in your code, and just use %dopar%. (Not that it helps much, but using library(foreach) does not mean you cannot use foreach::foreach, though it is unnecessary.)

Is there a way to use tryCatch (or similar) in R as a loop, or to manipulate the expr in the warning argument?

I have a regression model (lm or glm or lmer ...) and I do fitmodel <- lm(inputs) where inputs changes inside a loop (the formula and the data). Then, if the model function does not produce any warning I want to keep fitmodel, but if I get a warning I want to update the model and I want the warning not printed, so I do fitmodel <- lm(inputs) inside tryCatch. So, if it produces a warning, inside warning = function(w){f(fitmodel)}, f(fitmodel) would be something like
fitmodel <- update(fitmodel, something suitable to do on the model)
In fact, this assignation would be inside an if-else structure in such a way that depending on the warning if(w$message satisfies something) I would adapt the suitable to do on the model inside update.
The problem is that I get Error in ... object 'fitmodel' not found. If I use withCallingHandlers with invokeRestarts, it just finishes the computation of the model with the warning without update it. If I add again fitmodel <- lm(inputs) inside something suitable to do on the model, I get the warning printed; now I think I could try suppresswarnings(fitmodel <- lm(inputs)), but yet I think it is not an elegant solution, since I have to add 2 times the line fitmodel <- lm(inputs), making 2 times all the computation (inside expr and inside warning).
Summarising, what I would like but fails is:
tryCatch(expr = {fitmodel <- lm(inputs)},
warning = function(w) {if (w$message satisfies something) {
fitmodel <- update(fitmodel, something suitable to do on the model)
} else if (w$message satisfies something2){
fitmodel <- update(fitmodel, something2 suitable to do on the model)
}
}
)
What can I do?
The loop part of the question is because I thought it like follows (maybe is another question, but for the moment I leave it here): it can happen that after the update I get another warning, so I would do something like while(get a warning on update){update}; in some way, this update inside warning should be understood also as expr. Is something like this possible?
Thank you very much!
Generic version of the question with minimal example:
Let's say I have a tryCatch(expr = {result <- operations}, warning = function(w){f(...)} and if I get a warning in expr (produced in fact in operations) I want to do something with result, so I would do warning = function(w){f(result)}, but then I get Error in ... object 'result' not found.
A minimal example:
y <- "a"
tryCatch(expr = {x <- as.numeric(y)},
warning = function(w) {print(x)})
Error in ... object 'x' not found
I tried using withCallingHandlers instead of tryCatch without success, and also using invokeRestart but it does the expression part, not what I want to do when I get a warning.
Could you help me?
Thank you!
The problem, fundamentally, is that the handler is called before the assignment happens. And even if that weren’t the case, the handler runs in a different scope than the tryCatch expression, so the handler can’t access the names in the other scope.
We need to separate the handling from the value transformation.
For errors (but not warnings), base R provides the function try, which wraps tryCatch to achieve this effect. However, using try is discouraged, because its return type is unsound.1 As mentioned in the answer by ekoam, ‘purrr’ provides soundly typed functional wrappers (e.g. safely) to achieve a similar effect.
However, we can also build our own, which might be a better fit in this situation:
with_warning = function (expr) {
self = environment()
warning = NULL
result = withCallingHandlers(expr, warning = function (w) {
self$warning = w
tryInvokeRestart('muffleWarning')
})
list(result = result, warning = warning)
}
This gives us a wrapper that distinguishes between the result value and a warning. We can now use it to implement your requirement:
fitmodel = with(with_warning(lm(inputs)), {
if (! is.null(warning)) {
if (conditionMessage(warning) satisfies something) {
update(result, something suitable to do on the model)
} else {
update(result, something2 suitable to do on the model)
}
} else {
result
}
})
1 What this means is that try’s return type doesn’t distinguish between an error and a non-error value of type try-error. This is a real situation that can occur, for example, when nesting multiple try calls.
It seems that you are looking for a functional wrapper that captures both the returned value and side effects of a function call. I think purrr::quietly is a perfect candidate for this kind of task. Consider something like this
quietly <- purrr::quietly
foo <- function(x) {
if (x < 3)
warning(x, " is less than 3")
if (x < 4)
warning(x, " is less than 4")
x
}
update_foo <- function(x, y) {
x <- x + y
foo(x)
}
keep_doing <- function(inputs) {
out <- quietly(foo)(inputs)
repeat {
if (length(out$warnings) < 1L)
return(out$result)
cat(paste0(out$warnings, collapse = ", "), "\n")
# This is for you to see the process. You can delete this line.
if (grepl("less than 3", out$warnings[[1L]])) {
out <- quietly(update_foo)(out$result, 1.5)
} else if (grepl("less than 4", out$warnings[[1L]])) {
out <- quietly(update_foo)(out$result, 1)
}
}
}
Output
> keep_doing(1)
1 is less than 3, 1 is less than 4
2.5 is less than 3, 2.5 is less than 4
[1] 4
> keep_doing(3)
3 is less than 4
[1] 4
Are you looking for something like the following? If it is run with y <- "123", the "OK" message will be printed.
y <- "a"
#y <- "123"
x <- tryCatch(as.numeric(y),
warning = function(w) w
)
if(inherits(x, "warning")){
message(x$message)
} else{
message(paste("OK:", x))
}
It's easier to test several argument values with the code above rewritten as a function.
testWarning <- function(x){
out <- tryCatch(as.numeric(x),
warning = function(w) w
)
if(inherits(out, "warning")){
message(out$message)
} else{
message(paste("OK:", out))
}
invisible(out)
}
testWarning("a")
#NAs introduced by coercion
testWarning("123")
#OK: 123
Maybe you could assign x again in the handling condition?
tryCatch(
warning = function(cnd) {
x <- suppressWarnings(as.numeric(y))
print(x)},
expr = {x <- as.numeric(y)}
)
#> [1] NA
Perhaps not the most elegant answer, but solves your toy example.
Don't put the assignment in the tryCatch call, put it outside. For example,
y <- "a"
x <- tryCatch(expr = {as.numeric(y)},
warning = function(w) {y})
This assigns y to x, but you could put anything in the warning body, and the result will be assigned to x.
Your "what I would like" example is more complicated, because you want access to the expr value, but it hasn't been assigned anywhere at the time the warning is generated. I think you'll have to recalculate it:
fitmodel <- tryCatch(expr = {lm(inputs)},
warning = function(w) {if (w$message satisfies something) {
update(lm(inputs), something suitable to do on the model)
} else if (w$message satisfies something2){
update(lm(inputs), something2 suitable to do on the model)
}
}
)
Edited to add:
To allow the evaluation to proceed to completion before processing the warning, you can't use tryCatch. The evaluate package has a function (also called evaluate) that can do this. For example,
y <- "a"
res <- evaluate::evaluate(quote(x <- as.numeric(y)))
for (i in seq_along(res)) {
if (inherits(res[[i]], "warning") &&
conditionMessage(res[[i]]) == gettext("NAs introduced by coercion",
domain = "R"))
x <- y
}
Some notes: the res list will contain lots of different things, including messages, warnings, errors, etc. My code only looks at the warnings. I used conditionMessage to extract the warning message, but
it will be translated to the local language, so you should use gettext to translate the English version of the message for comparison.

Infinite loop using WHILE even though condition is met in R

I am trying to learn how to implement control structures such as FOR and while loops.
I created a function the simulates betting in a famous Brazilian lotto.
In the lotto, a player bets on 6 unique integers from a 1:60 vector (called your_bet).
The function samples 6 values from the 1 to 60 universe ("result") and tests how many values in result match your_bet, printing out:
your_bet
result
total points (out of 6 possible)
one of three possible commentaries on the result of the bet.
Code as follows:
```
LOTTO<-function(your_bet=sample(1:60, size=6, replace=FALSE)){
result<-sample(1:60, size=6, replace=FALSE)
logical_vector<-(your_bet %in% result)
total_points<-sum(as.integer(logical_vector))
print(paste(c("Your bet:", as.character(your_bet))), collapse="")
print(paste(c("Result", as.character(result))), collapse="")
print(paste(c("Total points", as.character(total_points))), collapse="")
if (total_points==6)
print("You are a millonaire")
else if (total_points==5)
print("5 points, you are rich!")
else print("good luck next time")
}
```
I then tried to implement a loop that would make the function go over and over again in a loop until total points>= a given target (here as target_points), modifying the function as bellow.
```
LOTTO<-function(your_bet=sample(1:60, size=6, replace=FALSE), stubborn_until_x_points=FALSE,
target_points)#inserted stubborn_until_x_points and target_points arguments{
result<-sample(1:60, size=6, replace=FALSE)
logical_vector<-(your_bet %in% result)
total_points<-sum(as.integer(logical_vector))
print(paste(c("Your bet:", as.character(your_bet))), collapse="")
print(paste(c("Result", as.character(result))), collapse="")
print(paste(c("Total points", as.character(total_points))), collapse="")
if (total_points==6)
print("You are a millonaire")
else if (total_points==5)
print("5 points, you are rich!")
else print("good luck next time")
if (stubborn_until_x_points==TRUE)#Inserted WHILE loop here{
while(total_points < target_points){
LOTTO(your_bet, stubborn_until_x_points=TRUE, target_points)}
}
}
```
This did make the function repeat in a loop, but for some reason, it keeps looping even when the condition is met.
Observation - I realized that when the condition is met on the first run, it actually stops, but after entering a loop, it goes on forever.
I can´t find what is wrong. Any ideas?
Thanks in advance.
I haven't dug quite deep into the function, but the only reason that this should not break at any point is when target_points > n_samples (6 here).
The problem in this case is quite obvious and simple to fix. Reduce target_points to be less than 6 or add n_samples (6 atm.) and make it greater than target_points. Rather than this I suspect the main problem lies within the recursive function though. R is rather restrictive when it comes to recursion, if one tries to do a simple recursion for example
i <- 0
f <- function(){
i <<- i + 1
f()
}
f()
i
# 896 on my pc
one can see that we cannot use recursion for very deep recursive functions (in R). This throws the very unhelpful error
Error: C stack usage 7974196 is too close to the limit
To alleviate this, one simply has to remove the recursion (simply in italian because sometimes it is not simple). In this case we just move the while loop to contain the main body of the function, and use an if statement to break early if necessary.
Below is a slightly modified version of the function (note that sample_n and number_range has been added as arguments).
In this function the while loop has been moved to contain the main body, and the result is instead printed at the end (using the variable res to figure out the result). At the end of the loop I use a if(isFALSE(stubborn_until_x_points))break statement to exit early if necessary.
LOTTO <- function(your_bet,
sample_n = 6,
number_range = 1:60,
stubborn_until_x_points = FALSE,
target_points){
if(missing(target_points) || target_points > sample_n)
stop('missing target_points or target_points too large')
total_points <- -Inf # Always smaller than target_points
i <- 0
res <- 0
# If you want a new bet every iteration.
# Add this at the end of the loop, but remove the 'if'
if(missing(your_bet))
your_bet <- sample(number_range, size=sample_n, replace=FALSE)
while(total_points < target_points){
result <- sample(number_range, size=sample_n, replace=FALSE)
logical_vector <- your_bet %in% result
total_points <- sum(logical_vector)
if (total_points==6){
res <- 1
}else if (total_points==5){
res <- 2
}
i <- i + 1
if(isFALSE(stubborn_until_x_points))
break
}
if(res == 1)
cat(res <- 'You\'re a millionaire!\n', sep = '\n')
else if(res == 2)
cat(res <- '5 points, you are rich!\n', sep = '\n')
else
cat(res <- 'Better luck next time.', sep = '\n')
c(result = res, number_of_tries = i)
}
The function is called as before, but now also returns the number of attempts and the result obtained from trials as shown below.
LOTTO(target_points = 6, stubborn_until_x_points = TRUE)
You're a millionaire!
#Output:
result number_of_tries
"You're a millionaire!\n" "8297820"
Removing recursion, including the body of the function inside the while loop, assigning -Inf to the initial total_points and adding the break statement were much usefull.
Addapting the answer by #Oliver yielded exactly what I was looking for:
LOTTO<-function(your_bet=sample(1:60, size=6, replace=FALSE), stubborn_until_x_points=FALSE,
target_points=0){
total_points<--Inf
while(total_points < target_points){
result<-sample(1:60, size=6, replace=FALSE)
logical_vector<-(your_bet %in% result)
total_points<-sum(as.integer(logical_vector))
print(paste(c("Your bet:", as.character(your_bet))), collapse="")
print(paste(c("Result", as.character(result))), collapse="")
print(paste(c("Total points", as.character(total_points))), collapse="")
if (total_points==6)
print("You are a millonaire")
else if (total_points==5)
print("5 points, you are rich!")
else print("good luck next time")
if (isFALSE(stubborn_until_x_points==TRUE))
break
}
His answer, however, yields interesting additional results, and better control over inadequate argument inputs

Function raise error with return statement

I want to process a own designed function on every cell using the calc function of the "raster" package.
Everything works perfectly when I try to print the "final" result of the function (value I want to return), but when I try to use return statement, I got an error :
Error in .local(x, values, ...) :
values must be numeric, integer or logical.
Here is the code leading to that error
inR <- 'D://test/TS_combined_clipped.tif'
outR <- 'D://test/R_test3.tif'
rasterB <- brick(inR)
fun1 <-function(x){
years = seq(1, 345)
na_idx = which(is.na(x))
years = years[-na_idx]
x <- na.omit(x)
idx = detectChangePoint(x, cpmType='Student', ARL0=500)$changePoint
return(years[idx]) # this raises error
# print(years[idx]) # This does *not* raises any error
}
r <- calc(rasterB, fun=fun1, filename=outR, overwrite=TRUE)
How is it possible to have a return statement to make it fails ?
Some of my tests leads to the fact that it seems that the process fails just after the execution of the calc function on the very last cell of the rasterBrick.
But I have no clue of where to start to try to fix this.
Input image is available here
[EDIT]
I just noticed that if I use return(idx) instead of return(year[idx]) the process works without error raised.
So it seems that the problem is more at fetching the value of the year variable.
Is therefore any particular thing that I missed in the use of indexes with R ?
Comment of user2554330 put me on the good track, issue was that calc cannot handle a "numeric(0)" result.
Updated code is then
inR <- 'D://test/TS_combined_clipped.tif'
outR <- 'D://test/R_test3.tif'
rasterB <- brick(inR)
fun1 <-function(x){
years = seq(1, 345)
na_idx = which(is.na(x))
years = years[-na_idx]
x <- na.omit(x)
idx = detectChangePoint(x, cpmType='Student', ARL0=500)$changePoint
if (idx==0){
return(0)
} else {
return(as.integer(years[idx]))
}
}
r <- calc(rasterB, fun=fun1, filename=outR, overwrite=TRUE)

How to make 'for loop' skip some iterations in between which shows error?

I am running a for loop from (1:1700) in R, but I am loading different data in each iteration. But I am getting error in some iterations in between (may be because of corresponding data is missing).
I want to know if there is any way I could skip those particular iterations in which I get error and at least for loop should complete all the 1700 iterations skipping aforementioned error showing iterations.
I have to run a for loop, there is no other option.
Yoy can use tryCatch within your loop. here an example where I loop from 1 to 5 , and for some counter value I get an error ( i create it here using stop), I catch it and then I continue for other values of the counters.
for( i in 1:5) ## replace 5 by 1700
tryCatch({
if(i %in% c(2,5)) stop(e)
print(i) ## imagine you read a file here, or any more complicated process
}
,error = function(e) print(paste(i,'is error')))
[1] 1
[1] "2 is error"
[1] 3
[1] 4
[1] "5 is error"
I use try for such issues. It allows your loop to continue through the cycle of values without stopping at the error message.
Example
make data
set.seed(1)
dat <- vector(mode="list", 1800)
dat
tmp <- sample(1800, 900) # only some elements are filled with data
for(i in seq(tmp)){
dat[[tmp[i]]] <- rnorm(10)
}
dat
loop without try
#gives warning
res <- vector(mode="list", length(dat))
for(i in seq(dat)){
res[[i]] <- log(dat[[i]]) # warning given when trying to take the log of the NULL element
}
loop with try
#cycles through
res <- vector(mode="list", length(dat))
for(i in seq(dat)){
res[[i]] <- try(log(dat[[i]]), TRUE) # cycles through
}

Resources