Can future processes free themselves? - r

i have a script/workframe that calls a function with the future package. The function does not return anything, but instead starts an API-Call, something like this:
library(future)
future::plan("multisession")
future_get_function = function (msg) {
link <- ("http://127.0.0.1:7774/echo?")
message <- list(msg=msg)
httr::GET(link, query=message)
Sys.sleep(20)
}
future::future({
future_get_function(msg)
}, globals = list(msg=msg)}
library(plumber)
#* Echo back the input
#* #param msg The message to echo
#* #get /echo
function(msg="") {
print("received something")
list(msg = paste0("The message is: '", msg, "'"))
}
Now, from what ive seen, I need to free up the processes that are done with the value function afterwards, correct? Otherwise, if I have more calls then processers/tasks available, they are stuck?
So is there a function that i can insert at the end of future_get_function or somewhere in the future call (promises would also be fine), that frees the process automatically once its done, without me having to call value or similar?
Thanks in advance.

Future processes definitely free themselves after completion.
You could use then to follow up promises completion, and promise_all to check when all promises are finished :
library(future)
library(promises)
plan(multisession)
tic <- Sys.time()
future_get_function = function (msg,wait) {
#link <- ("http://127.0.0.1:7774/echo?")
#message <- list(msg=msg)
#httr::GET(link, query=message)
print(paste(msg,": launched after",round(difftime(Sys.time(),tic),1), 'seconds'))
Sys.sleep(wait)
invisible(msg)
}
f1 <- future(future_get_function('msg1',5))
f2 <- future(future_get_function('msg2',10))
f_after <- function(res) {
print(paste(res,"sent after",round(difftime(Sys.time(),tic),1), 'seconds'))}
f_failed <- function(res) {
print(paste("future failed"))}
f_end <- function(res) {
print(paste("All messages sent after",round(difftime(Sys.time(),tic),1), 'seconds'))}
promises::then(f1, onFulfilled = f_after, onRejected = f_failed)
promises::then(f2, onFulfilled = f_after, onRejected = f_failed)
f_wait_all <- promises::promise_all(f1,f2)
promises::then(f_wait_all, onFulfilled = f_end, onRejected = f_failed)
[1] "msg1 : launched after 0 seconds"
[1] "msg1 sent after 5.1 seconds"
[1] "msg2 : launched after 0.2 seconds"
[1] "msg2 sent after 10.2 seconds"
[1] "All messages sent after 10.2 seconds"

Related

why future_promise error is "Unhandled" with onRejected

Using future_promise with catch(.) (or then(onRejected=)), the error is still marked as "Unhandled" even though I think I'm attempting to handle it.
library(future)
library(promises)
plan(multicore, workers = 3)
Sys.getpid()
# [1] 1270066
nbrOfFreeWorkers()
# [1] 3
thisproc <- promises::future_promise({
setTimeLimit(elapsed = 2, transient = TRUE)
Sys.sleep(3)
1L
}, seed = TRUE) %>%
promises::then(
function(val) {
message("survived: ", val)
val
}) %>%
catch(
function(error) {
message("terminated: ", substring(conditionMessage(error), 1, 16))
NULL
})
# terminated: Unexpected resul
# Unhandled promise error: Unexpected result (of class ‘NULL’ != ‘FutureResult’) retrieved for MulticoreFuture future (label = ‘<none>’, expression = ‘{; setTimeLimit(elapsed = 2, transient \
# = TRUE); Sys.sleep(3); 1L; }’):
once that returned, then
nbrOfFreeWorkers()
# [1] 2
resetWorkers(plan())
# NULL
nbrOfFreeWorkers()
# [1] 3
(I used substring just to differentiate an error-like string messaged by my function, and the full condition message being catted to the console.)
It reports "Unhandled" despite the code handling it (the terminated: is printed);
It reports it twice;
and most importantly,
The worker is not freed until I explicitly free it, despite the then-chain functioning.
According to the source, it appears that rejectionHandled is only referenced/set in this one file, and it should be setting the flag:
doRejectFinalReason = function(reason) {
private$value <- reason
private$state <- "rejected"
later::later(function() {
lapply(private$onRejected, function(f) {
private$rejectionHandled <- TRUE
f(private$value)
})
private$onRejected <- list()
later::later(~{
if (!private$rejectionHandled) {
# warning() was unreliable here
cat(file=stderr(), "Unhandled promise error: ", reason$message, "\n", sep = "")
}
})
})
}
I assumed that since f(private$value) is called (as evidenced by the "terminated:" message I inserted), then the private_rejectionHandled should already have been updated. There are no other errors in this pipe, so it should not have reset the rejectionHandled flag.
Incidentally, I tried (without success) to call later::run_now() throughout the process to make sure that all promises are being run, though I was only stabbing in the dark (and it did not change the result).
Also, I can find attributes(thisproc)$promise_impl$.__enclos_env__$private$rejectionHandled, see it is FALSE, and change it to TRUE, but we still see the "Unhandled" console output, indicating something else is at work here. https://github.com/rstudio/promises/issues/86#issuecomment-1252420305 is related (or perhaps the same thing).
Linux, R-4.2.2, future-1.30.0 (1.31.0 is released), promises-1.2.0.1, later-1.3.0.

Add logs in a R Plumber Api

I'm creating some APIs with R and Plumber. I configure the entrypoint.R like that
library(plumber)
library(logger)
# Create Log directory
log_dir <- "/api/logs"
if (!fs::dir_exists(log_dir)) fs::dir_create(log_dir)
log_appender(appender_tee(tempfile("plumber_", log_dir, ".log")))
convert_empty <- function(string) {
if (string == "") {
"-"
} else {
string
}
}
# Initiate
# pr <- plumber::plumb(here::here('plumber.R')) # local test
pr <- plumber::plumb(rev(commandArgs())[1]) # via Docker
args <- list(host = '0.0.0.0', port = 8000)
if (packageVersion('plumber') >= '1.0.0') { pr$setDocs(TRUE) } else { args$swagger <- TRUE }
# Create log hook
pr$registerHooks(
list(
preroute = function() {
# Start timer for log info
tictoc::tic()
},
postroute = function(req, res) {
end <- tictoc::toc(quiet = TRUE)
# Log details about the request and the response
# log_info('{convert_empty(req$REMOTE_ADDR)} "{convert_empty(req$HTTP_USER_AGENT)}" {convert_empty(req$HTTP_HOST)} {convert_empty(req$REQUEST_METHOD)} {convert_empty(req$PATH_INFO)} {convert_empty(res$status)} {round(end$toc - end$tic, digits = getOption("digits", 5))}')
log_info('{convert_empty(req$REMOTE_ADDR)}')
}
)
)
# Run entrypoint
do.call(pr$run, args)
I can't have the full logs because this line
log_info('{convert_empty(req$REMOTE_ADDR)} "{convert_empty(req$HTTP_USER_AGENT)}" {convert_empty(req$HTTP_HOST)} {convert_empty(req$REQUEST_METHOD)} {convert_empty(req$PATH_INFO)} {convert_empty(res$status)} {round(end$toc - end$tic, digits = getOption("digits", 5))}')
creates an error and the application stops to work.
Also, I want to add my custom logs in the API to track the flow for debug purposes. For example in the api
#* Ping
#* #get /ping
function() {
rtn <- jsonlite::unbox(data.frame(Status = TRUE))
return(rtn)
}
I want to add my log so I can check what functions are calling and what is happening in them.
I found a library called Log4R that allows me to add logs from my functions and save them on a file. Same examples here.
Install the package
install.packages("log4r")
Use the package
logger <- logger()
info(logger, "Located nearest gas station.")
#> INFO [2019-09-04 16:31:04] Located nearest gas station.
warn(logger, "Ez-Gas sensor network is not available.")
#> WARN [2019-09-04 16:31:04] Ez-Gas sensor network is not available.
debug(logger, "Debug messages are suppressed by default.")

Is it possible to handle simple messages in R? If yes, how?

To handle warnings or errors one can use
result = tryCatch({
expr
}, warning = function(w) {
warning-handler-code
}, error = function(e) {
error-handler-code
}, finally = {
cleanup-code
}
but if expr gives a message through simpleMessage, how can I get it? Is there something like?:
message = function(m) {message-handler-code}
Or another function which allows me to capture the message?
Thank you!
Yes, you can use message = just as you can with warning and error:
result = tryCatch({
message("Hello world")
1 + 1
}, message = function(m) {m}
)
result
#> <simpleMessage in message("Hello world"): Hello world
>
It's more likely however that you would want to capture your result and message separately:
result <- withCallingHandlers({
message("Hello world")
1 + 1
}, message = function(m) {
lastMessage <<- m
invokeRestart("muffleMessage")
})
result
#> [1] 2
if(exists("lastMessage")) message(lastMessage)
#> Hello world
tryCatch is the most commonly useful solution for handling conditions.
However, tryCatch aborts execution and returns the value of the handler, rather than resuming execution of the code. This may not always be what you want; sometimes you want to handle a condition and carry on.
R allows this thanks to its incredibly powerful condition system.
For example, you can choose to silence all messages:
suppressMessages(expr)
The nice thing here is that the suppressMessages isn’t magic — it’s a plain old R function and you could write it yourself (but you do need to understand the condition system, and the price for its versatility is that it’s fairly complicated).
To illustrate this, here’s another way of handling messages — using withCallingHandlers — which suppresses the messages, carries on executing the code it’s called with, but at the same time logs the message in a list:
messages = list()
withCallingHandlers({
message('Hello world')
1 + 1
}, message = function (msg) {
messages <<- c(messages, msg)
tryInvokeRestart('muffleMessage')
})
tryInvokeRestart('muffleMessage') is the same method by which suppressMessages works. The only difference is that we added code to store the message.
As a last step, the above can even be wrapped inside a function:
with_logging = function (expr) {
messages = list()
log_message = function (msg) {
messages <<- c(messages, conditionMessage(msg))
tryInvokeRestart('muffleMessage')
}
result = withCallingHandlers(expr, message = log_message)
list(result = result, messages = messages)
}
And to use it:
with_logging({
message('this is a test')
1 + 1
})
$result
[1] 2
$messages
$messages[[1]]
[1] "this is a test\n"

Reconnect if connection fails using R

I have written small function but somehow does not work as expected.
I have connection to server and sometime the server is down so I cannot connect. The script is running in batch so I have to have it automatized.
The script should run the conn <- function(..) successfully (it means no error message) if not restart/re-check and restart again in approx. 1min time. This should run in loop until connection is established something like 12 hours. (approx.). The connection should be assigned to conn object so the object must return successful connection. (something like <Connection established, # 20180522 20:43:41 CET>
The function which does not work is here:
connect <- function(c) { message(paste("remaining run", c));
withRestarts(err <- tryCatch({ conn <- server.coonect(settings...) },
error=function(e) { invokeRestart("reconnect") }), reconnect = function() { message("re-connecting");
stopifnot(c > 0); for(i in 1:10) { Sys.sleep(6); cat(i) }; connect(c-1) }) }
connect(1000) # with approx. 1min sleep we get here over 12 hours running..
So the question is what is wrong and how to rewrite the function such it runs as expected. Thanks.
EDIT
It seems that the function should be:
connect <- function(c) { message(paste("remaining run", c));
withRestarts(err <- tryCatch({ server.coonect(settings...) },
error=function(e) { invokeRestart("reconnect") }), reconnect = function() { message("re-connecting");
stopifnot(c > 0); for(i in 1:10) { Sys.sleep(6); cat(i) } }) }
conn <- connect(1000)
EDIT 2
Here is comment for the above function I have tested:
I have tested the EDIT function by simulating the connection by first running the function without internet connection (now the function checks every 1:10 o 6sec, and after the function is running I connect to the internet, now I expect the function in next iteration pics up and connects to server if available...) what happens is that the function does not pick up the later possibility to connect...
If you only want to loop over the connection establishment this will work:
# simulate an instable connection
server.connect <- function(...) {
if (round(as.numeric(Sys.time())) %% 10 != 0) # about 90 % failed connections
stop("Connection error")
return(TRUE) # success
}
connect <- function(attempts = 1, sleep.seconds = 6) {
for (i in 1:attempts) {
res <- try(server.connect("my connection string"), silent = TRUE)
if (!("try-error" %in% class(res))) {
print("Connected...")
return(res)
}
print(paste0("Attempt #", i, " failed"))
Sys.sleep(sleep.seconds)
}
stop("Maximum number of connection attempts exceeded")
}
con <- connect(attempts = 10, sleep = 1)
Example execution log:
[1] "Attempt #1 failed"
[1] "Attempt #2 failed"
[1] "Attempt #3 failed"
[1] "Attempt #4 failed"
[1] "Attempt #5 failed"
[1] "Attempt #6 failed"
[1] "Connected..."

Do loop again when error

I tried to read everything, but i kind of got stuck on one problem.
By using bigrquery I create queries to Google BigQuery to get data - unfortunately sometimes my query doesn't work because of a time-out.
Q is a SQL-Query and BQ is supposed to store the data downloaded from BigQuery.
Does anybody know how to re-do a loop every time tryCatch gives me an error?
I got this so far:
BQ_Foo <- NULL
tryCatch(
{
repeat{
BQ_Foo <- query_exec(Q_foo,"bigquery")
if(is.list(BQ_Foo) == TRUE)break }
}
,error=function(e){cat("ERROR : Query not loaded!", "\n")}
)
EDIT:
I tried my first approach again and this time i received this error message:
Error in curl::curl_fetch_memory(url, handle = handle) :
Operation was aborted by an application callback
Does anybody know how to handle this?
Widely based on r2evans answer, here's how to do the same kind of things with withRestarts, with some helps from This blog post:
set.seed(2)
foo <- NULL
operation <- function(x,tries) {
message(paste("x is",x,"remaining tries",tries))
withRestarts(
tryCatch({
if (runif(1) < x) stop("fail!") else 1
},
error=function(e) { invokeRestart("retry")}),
retry = function() {
message("Retrying")
stopifnot(tries > 0)
operation(x,tries-1)
}
)
}
> operation(0.9,5)
# x is 0.9 remaining tries 5
# Retrying
# x is 0.9 remaining tries 4
# Retrying
# x is 0.9 remaining tries 3
# Retrying
# x is 0.9 remaining tries 2
# Retrying
# x is 0.9 remaining tries 1
[1] 1
It's a kind of recursive call, so you can do whatever you want before calling the function again.
You may do it in the tryCatch error handler the same way, the interest to use restarts handlers is to call a specific function, if you had two tryCatch for which you want nearly same handler behavior then you can add a parameter and use the same handler for each try catch, i.e.:
testfun <- function(x) {
withRestarts({
tryCatch(
{
ifelse(runif(1) < 0.5,stop("Error Message"),warning("Warning message"))
},
warning=function(e) { invokeRestart("logger", level="warning", message=e ) },
error=function(e) { invokeRestart("logger", level="error", message=e ) }
)
},
logger = function(level,message) {
message(date()," [",level,"]: ",message[['message']])
}
)
}
Giving:
> set.seed(2)
> testfun()
Fri Jul 29 14:15:11 2016 [error]: Error Message
> testfun()
Fri Jul 29 14:15:12 2016 [warning]: Warning message
> testfun()
Fri Jul 29 14:15:13 2016 [warning]: Warning message
> testfun()
Fri Jul 29 14:15:13 2016 [error]: Error Message
Main interest here is the factorizing of the logger method and to reduce code duplication.
Naïve Solution
You might start with a mildly naïve attempt of putting the repeat/while outside the tryCatch, something like this:
set.seed(2)
foo <- NULL
while (is.null(foo)) {
foo <- tryCatch({
if (runif(1) < 0.9) stop("fail!") else 1
},
error = function(e) { message("err"); NULL; }
)
}
# err
# err
# err
# err
message("success: ", foo)
# success: 1
Unfortunately you introduce the possibility that the loop will never return. To protect against this, you can try a counter ...
Less-Naïve Solution
set.seed(2)
foo <- NULL
max_attempts <- 3
counter <- 0
while (is.null(foo) && counter < max_attempts) {
counter <- counter + 1
foo <- tryCatch({
if (runif(1) < 0.9) stop("fail!") else 1
},
error = function(e) { message("err"); NULL; }
)
}
# err
# err
# err
if (is.null(foo)) message("final failure") else message("success: ", foo)
# final failure
Now this is better for you, but it may still inadvertently introduce a denial-of-service "attack" on the server. (Consider "why" the query failed: if it is because the server is temporarily inundated, then you are making things worse by clobbering it even for a few limited requests.) Though it slows you down a little, in the case of a busy server, putting in pauses will ease the burden on the server and possibly give you a better chance of a successful query before failing.
Better Solution
In network parlance, small TCP packets can cause congestion when repeated retried (see Nagle's Algorithm for a quick reference). Using some form of exponential backoff is common, and to guard against two (or more) clients doing exactly same backoff simultaneously, some clients jitter slightly (for example, httr::RETRY).
set.seed(2)
foo <- NULL
max_attempts <- 3
# borrowed from hadley/httr::RETRY
pause_cap <- pause_base <- 1
counter <- 0
while (is.null(foo) && counter < max_attempts) {
if (counter > 0L) {
length <- stats::runif(1, max = min(pause_cap, pause_base * (2 ^ counter)))
message("sleeping ", round(length, 1))
Sys.sleep(length)
}
counter <- counter + 1
foo <- tryCatch({
if (runif(1) < 0.9) stop("fail!") else 1
},
error = function(e) { message("err"); NULL; }
)
}
# err
# sleeping 0.7
# err
# sleeping 0.2
if (is.null(foo)) message("final failure") else message("success: ", foo)
# success: 1
Wrap-Up
Somewhat sloppy code, but I hope you get the point. Putting loops on network queries without some form of self-limit can very easily escalate into an inadvertent DOS.
Based on your ideas i created this code, which seems to work - i just need to stresstest it.
QFoo <- paste0('SQL Code', dateBQ, ' ')
BQFoo <- NULL
testfun <- function(QFoo) {
withRestarts({
tryCatch({
query_exec(QFoo, "bigquery")
},
warning = function(e) { invokeRestart("logger", level="warning", message = e) },
error = function(e) { invokeRestart("logger", level="error", message = e) })
},
logger = function(level, message) {
message(date(), " [", level, "]: ", message[['message']])
})
}
testfun(QFoo)

Resources