R - plumber, how to add a stop/quit condition to the run?

I have a small API that I need to stop running when a certain time of day is reached, say midnight for example, and then continue on to the rest of the program, independently of any incoming request. One could include a function that breaks the process by checking Sys.time(), but that would only be executed when a request comes through.
My guess is that I have to modify some attribute of $run, but I can't find much about it on the internet.
I know that one could get the PID and kill it with a system command, but I don't know whether that solution makes sure that the rest of the program still runs.
Does anyone have an idea?
Thanks in advance.
The file containing the function looks like this:
(my_file.R)
#* @param x My argument
#* @get /lag_lead
function(x){
  x <- as.numeric(x)
  c(x - 1, x + 1)
}
and the running script:
library(plumber)
mon_api <- plumb('my_file.R')
mon_api$run(port = 8000)
print('hello')

One way to do so would be to set a timeout with withTimeout, which is a wrapper around setTimeLimit:
library(R.utils)
withTimeout(mon_api$run(port = 8000), timeout = 30)
# timeout is in seconds
# calculate timeout, using difftime between now and midnight
tmo <- as.numeric(difftime(as.POSIXct("2018-12-28 00:00:00"), Sys.time(), units = "secs"))
withTimeout(mon_api$run(port = 8000), timeout = tmo)
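To avoid hardcoding the date, the seconds until the next midnight can be computed from Sys.Date(), for example:
# next midnight = start of tomorrow, in the local time zone
tmo <- as.numeric(difftime(as.POSIXct(paste(Sys.Date() + 1, "00:00:00")), Sys.time(), units = "secs"))
withTimeout(mon_api$run(port = 8000), timeout = tmo)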
Some other timeout solutions, one using parallel, can be found here:
Time out an R command via something like try()
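Not from the linked answers, but another sketch of the same idea: run the API in a background R process with the callr package (assuming it is installed), sleep the main script until midnight, then kill the API and carry on with the rest of the program.
library(callr)
# start the API in a separate background R process
api_proc <- r_bg(function() plumber::plumb("my_file.R")$run(port = 8000))
# tmo computed as above: seconds until the next midnight
Sys.sleep(tmo)
api_proc$kill()  # stop the API at midnight
print('hello')   # the rest of the program continues here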

Related

Using Sys.sleep() to delay API call

I'm using R to make an API call to a weather data provider to download some weather forecasts. I'm using a free key that allows me to make no more than 10 calls per minute. I've tried using Sys.sleep() to ensure I don't go over the threshold but the API resource monitor tells me that I've exceeded the number of calls.
For example, if I'm making 6 calls, a time interval of 10 seconds between the calls ought to be sufficient (not taking into account the time R would need).
dat <- list()
for(i in 1:6){
  dat[[i]] <- getWeatherData(web_url, api_key, history_date, data_format)
  Sys.sleep(10)
  web_url <- gsub(i-1, i, url)
}
The getWeatherData function does the following:
makes the API call (only one API call is made each time the function is invoked; it uses httr::GET() to get the data),
parses the XML output to get the desired variables (regular expressions),
performs some clean-up (for missing/garbage values),
converts strings to R date-time objects (POSIXct), and
rounds values to the nearest hour (lubridate::round_date()).
Function inputs:
web_url is a custom url,
api_key is my personal key,
history_date is a string (formatted as "%d/%m/%Y %H:%M:%S"), and
data_format specifies if I want an .XML or .json file as output.
I cannot share the url/key for obvious reasons. As soon as I run this, I get a notification from the data provider that I've exceeded the allowable calls per minute (10). I don't get a notification every time - not sure why that is either.
Any help is appreciated!
This solution should be helpful for you if Sys.sleep doesn't do the trick.
Basically, this replaces the use of Sys.sleep with while logic.
dat <- list()
delay_seconds <- 10
for(i in 1:6){
  dat[[i]] <- getWeatherData(web_url, api_key, history_date, data_format)
  date_time <- Sys.time()
  while((as.numeric(Sys.time()) - as.numeric(date_time)) < delay_seconds){}
  web_url <- gsub(i-1, i, url)
}
Here, we are:
defining a number of seconds to wait ( delay_seconds <- 10 )
defining a start time for comparison ( date_time <- Sys.time() )
using a while loop that compares the present time to our comparison time and checks whether the difference is less than our chosen delay interval ( (as.numeric(Sys.time()) - as.numeric(date_time)) < delay_seconds )
doing nothing until the wait time is over ( {} ) - the same idea is wrapped into a reusable helper in the sketch below
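If you need this delay in several places, the logic can be wrapped in a small helper function (a sketch; the name wait_seconds is mine, not from the original answer):
wait_seconds <- function(delay_seconds) {
  start_time <- Sys.time()
  # spin until delay_seconds have elapsed since start_time
  while ((as.numeric(Sys.time()) - as.numeric(start_time)) < delay_seconds) {}
  invisible(NULL)
}
# e.g. wait_seconds(10) in place of Sys.sleep(10) inside the loop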
Not knowing whether you need/want to, but in case you're hoping to get your data out of the list and into a longer combined form, I recommend the dplyr function bind_rows().
dat2 <- bind_rows(dat)
Thanks to an answer by rbtj to this question: How to make execution pause, sleep, wait for X seconds in R?

Loop to wait for result or timeout in r

I've written a very quick blast script in R to enable interfacing with the NCBI blast API. Sometimes however, the result url takes a while to load and my script throws an error until the url is ready. Is there an elegant way (i.e. a tryCatch option) to handle the error until the result is returned or timeout after a specified time?
library(rvest)
library(xml2) ## for read_xml()
## Definitive set of blast API instructions can be found here: https://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/new/BLAST_URLAPI.html
## Generate query URL
query_url <- function(QUERY,
                      PROGRAM = "blastp",
                      DATABASE = "nr",
                      ...) {
  put_url_stem <- 'https://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Put'
  arguments = list(...)
  paste0(
    put_url_stem,
    "&QUERY=", QUERY,
    "&PROGRAM=", PROGRAM,
    "&DATABASE=", DATABASE,
    arguments
  )
}
blast_url <- query_url(QUERY = "NP_001117.2") ## test query
blast_session <- html_session(blast_url)      ## create session
blast_form <- html_form(blast_session)[[1]]   ## pull form from session
RID <- blast_form$fields$RID$value            ## extract RID identifier
get_url <- function(RID, ...) {
  get_url_stem <- "https://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Get"
  arguments = list(...)
  paste0(get_url_stem, "&RID=", RID, "&FORMAT_TYPE=XML", arguments)
}
hits_xml <- read_xml(get_url(RID)) ## this is the sticky part
Sometimes it takes several minutes for the get_url to go live so what I would like is to do is to keep trying let's say every 20-30 seconds until it either produces the url or times out after a pre-specified time.
I think you may find this answer about the use of tryCatch useful.
Regarding the 'keep trying until timeout' part, I imagine you can work on top of this other answer about a tryCatch loop on error.
Hope it helps.
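For illustration, a minimal retry-until-timeout sketch along those lines (the 20-second polling interval comes from the question; poll_interval, give_up_after and the messages are my own names/assumptions):
library(xml2)
poll_interval <- 20    # seconds between attempts
give_up_after <- 600   # give up after 10 minutes
start_time <- Sys.time()
hits_xml <- NULL
while (is.null(hits_xml) &&
       as.numeric(difftime(Sys.time(), start_time, units = "secs")) < give_up_after) {
  hits_xml <- tryCatch(
    read_xml(get_url(RID)),
    error = function(e) {
      message("Result not ready yet, retrying in ", poll_interval, " seconds...")
      Sys.sleep(poll_interval)
      NULL
    }
  )
}
if (is.null(hits_xml)) stop("Timed out waiting for the BLAST result")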

Prompt user without waiting

I have a long running process, coded in "R". I would like to continue running it in RStudio, I don't want to use batch mode.
I would like to allow the user to gracefully terminate the long running process, for example by pressing the escape key. If the user doesn't press anything, the process continues, without waiting.
I have read other StackOverflow posts, perhaps I need to prompt the user using scan/readline on a different thread. That way, the main execution thread isn't blocked.
Isn't there a simpler way?
Thank you for any pointers/suggestions.
Richard Rogers
Further comments:
I've made a few mistakes:
I didn't realize that pressing escape in RStudio while the code is running halts execution.
I can't seem to determine where execution ends when I press escape.
Maybe I can use a simpler question.
Here is a simple function:
ProcessData <- function()
{
  Continue <- TRUE
  Iteration <- 1
  TestData <- vector(mode = "integer", length = 100000)
  while (Continue)
  {
    writeLines(sprintf("Processing iteration %d, Current time is %s", Iteration, Sys.time()))
    process.events()
    TestData <- round(runif(100000, min = 1, max = 10))
    # Continue <- PromptUser()
    Iteration <- Iteration + 1
  }
  writeLines("Processing ending.")
  head(TestData)
}
If I press escape while the loop is running, the writeLines and head calls don't get executed. How can I ensure that they do?
Thank you again,
Richard
I know this is an old question, but since I had a similar context (long-running task), here is what I came up with:
long_computation <- function() for(i in 1:10) Sys.sleep(1)
exit_gracefully <- function() cat("Saving results so far...\n")
tryCatch(
long_computation(),
finally = exit_gracefully()
)
If we press escape during the computation, no error condition seems to be thrown; however the finally part of tryCatch gets executed. This allows us to clean up, close connections etc.
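Applied to the ProcessData example from the question (omitting the undefined process.events() call), a sketch that wraps only the loop, so the final writeLines and head calls still run when escape is pressed:
ProcessData <- function()
{
  Continue <- TRUE
  Iteration <- 1
  TestData <- vector(mode = "integer", length = 100000)
  tryCatch(
    while (Continue)
    {
      writeLines(sprintf("Processing iteration %d, Current time is %s", Iteration, Sys.time()))
      TestData <- round(runif(100000, min = 1, max = 10))
      Iteration <- Iteration + 1
    },
    finally = {
      writeLines("Processing ending.")
      print(head(TestData))
    }
  )
}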

Manual API rate limiting

I am trying to write a manual rate-limiting function for the rgithub package. So far this is what I have:
library(rgithub)
pull <- function(i){
  commits <- get.pull.request.commits(owner = owner, repo = repo, id = i, ctx = get.github.context(), per_page = 100)
  links <- digest_header_links(commits)
  number_of_pages <- links[2,]$page
  if (number_of_pages != 0)
    try_default(for (n in 1:number_of_pages){
      if (as.integer(commits$headers$`x-ratelimit-remaining`) < 5)
        Sys.sleep(as.integer(commits$headers$`x-ratelimit-reset`) - as.POSIXct(Sys.time()) %>% as.integer())
      else
        get.pull.request.commits(owner = owner, repo = repo, id = i, ctx = get.github.context(), per_page = 100, page = n)
    }, default = NULL)
  else
    return(commits)
}
list <- c(500, 501, 502)
pull_lists <- lapply(list, pull)
The intention is that if the x-ratelimit-remaining variable goes below a certain threshold the script should wait until the time specified in x-ratelimit-reset has passed, and then continue the script. However, I'm not sure if this is the actual behavior of the if/else setup that I have here.
The function runs fine, but I have some doubts about whether it actually does the rate limiting or whether it somehow skips that step. Hence I ask: a) how can I find out if it actually does rate limiting, and b) if not, how can I rewrite it so that it actually does rate limiting? Would a while condition/loop perhaps be better?
You can test whether it does the rate limiting by changing 5 to a large enough number and adding a display of the timing of Sys.sleep using:
print(system.time(Sys.sleep(...)))
That said, the function seems OK to me; unfortunately I cannot test it easily as rgithub is not available for my version of R (3.1.3).
Not a canonical answer, but a working example.
You should add some logging to your script, even just something like write.csv(append = TRUE).
I've implemented an automatic anti-DDoS process which prevents your IP from being banned by the exchange market. You can find it in jangorecki/Rbitcoin/R/utils.R.
Rbitcoin.last_api_call is an environment object stored in the package namespace, a kind of session-level package cache.
This can help you with setting it up in your package.
You should also consider an optional parallel-supported version, linking to a database with concurrent reads. My function can be easily modified to queue calls and recheck timing every X seconds.
Edit
I forgot to add that the mentioned function supports multiple source systems. That allows you, for example, to extend your rgithub to bitbucket, etc. and still effectively manage API rate limiting.
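A minimal sketch of the pattern described above (the names last_api_call and rate_limited_call are mine, not from Rbitcoin): keep the time of the last call in an environment and sleep until a minimum interval has passed before making the next one.
last_api_call <- new.env()
last_api_call$time <- Sys.time() - 999  # pretend the previous call happened long ago
rate_limited_call <- function(fun, ..., min_interval = 6) {
  elapsed <- as.numeric(difftime(Sys.time(), last_api_call$time, units = "secs"))
  if (elapsed < min_interval) Sys.sleep(min_interval - elapsed)
  last_api_call$time <- Sys.time()
  fun(...)
}
# e.g. at most one call every 6 seconds (10 calls per minute):
# pull_lists <- lapply(list, function(i) rate_limited_call(pull, i, min_interval = 6))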

Kill a calculation programme after user defined time in R

Say my executable is c:\my directory\myfile.exe and my R script calls this executable with system("myfile.exe").
The R script gives parameters to the executable programme, which uses them to do numerical calculations. From the output of the executable, the R script then tests whether the parameters are good or not. If they are not good, the parameters are changed and the executable is rerun with updated parameters.
Now, as this executable carries out mathematical calculations and solutions may converge only slowly, I wish to be able to kill the executable once it takes too long to carry out the calculations (say 5 seconds).
How do I do this time-dependent kill?
PS:
My question is a little related to this one (time-independent kill):
how to run an executable file and then later kill or terminate the same process with R in Windows
You can add code to your R function which issued the executable call:
setTimeLimit(elapsed = 5, transient = TRUE)
This will kill the calling function, returning control to the parent environment (which could well be a function as well). Then use the examples in the question you linked to for further work.
Alternatively, set up a loop which examines Sys.time() and, if the expected update to the parameter set has not taken place after 5 seconds, breaks the loop and issues the system kill command to terminate myfile.exe.
There might possibly be nicer ways but it is a solution.
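A sketch of that second (polling) approach, assuming myfile.exe writes its result to an output file; result.txt, the path and the 5-second limit are placeholders:
system('"c:\\my directory\\myfile.exe"', wait = FALSE)  # launch without blocking
start_time <- Sys.time()
while (!file.exists("result.txt")) {
  if (as.numeric(difftime(Sys.time(), start_time, units = "secs")) > 5) {
    # took too long: kill the executable and move on
    system('taskkill /im "myfile.exe" /f', show.output.on.console = FALSE)
    break
  }
  Sys.sleep(0.1)
}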
The assumption here is that myfile.exe successfully completes its calculation within 5 seconds.
library(R.utils)  # evalWithTimeout() comes from R.utils
try.wtl <- function(timeout = 5)
{
  y <- evalWithTimeout(system("myfile.exe"), timeout = timeout, onTimeout = "warning")
  if (inherits(y, "try-error")) NA else y
}
case 1 (myfile.exe is closed after successful calculation)
g <- try.wtl(5)
case 2 (myfile.exe is not closed after successful calculation)
g <- try.wtl(0.1)
MS-DOS taskkill is required for case 2 to recommence from the beginning:
if (class(g) == "NULL") { system('taskkill /im "myfile.exe" /f', show.output.on.console = FALSE) }
PS: inspiration came from Time out an R command via something like try()
