Asynchronous API endpoints with plumber

How do I define asynchronous API endpoints in plumber?
I didn't really find plumber-specific documentation on the topic except this example and this GitHub issue.
When I try to reproduce the example, I get an error that R doesn't know how to turn a promise into JSON (at least that's what I think the problem is):
<simpleError: No method asJSON S3 class: promise>
Example
library(promises)

sleep_count <- 5

# add 5 seconds of sleep time, split into many short then() steps
add_async_sleep <- function(p) {
  n <- 20
  for (i in 1:(sleep_count * n)) {
    p <- then(p, function(value) {
      Sys.sleep(1 / n)
      "" # return value
    })
  }
  p
}

# use name_ as a placeholder for name when there are extra args
time <- function(name_, name = name_) {
  paste0(name, ": ", Sys.time())
}

new_promise <- function() {
  promise(function(resolve, reject) { resolve(NULL) })
}

#' @get /async
function() {
  new_promise() %>%
    add_async_sleep() %...>%
    time("async")
}
Say this code lives in the file plumber.R; then you should be able to start the API server and bring up Swagger with
r <- plumber::plumb(here::here("plumber.R"))
r$run()
Once I try out the endpoint /async, my R console reports
Starting server to listen on port 7361
Running the swagger UI at http://127.0.0.1:7361/__swagger__/
<simpleError: No method asJSON S3 class: promise>
and Swagger shows the failed response (screenshot omitted).
Disclaimer
I'm new to future and promises and have only made it halfway through the docs at https://rstudio.github.io/promises/ so far.
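For what it's worth, the error usually means the installed plumber predates native promise support (added in plumber 1.0.0, as far as I can tell); with a current plumber, an endpoint can simply return a promise and the result is serialized once it resolves. A minimal sketch, assuming plumber >= 1.0.0 and promises >= 1.2.0 (which provides future_promise()):

library(promises)
library(future)
plan(multisession)  # background workers for the slow code

#* @get /async
function() {
  future_promise({
    Sys.sleep(5)                   # blocking work runs on a worker,
    paste0("async: ", Sys.time())  # not on the main R process
  })
}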

Related

Logging variables in Plumber

I have followed the RStudio example here for logging requests in Plumber (the R API package) and would like to add other variables to the log. However, the registerHooks statement does not recognise the global variables (created with <<-).
# Authentication filter
#' @filter auth_filter
auth_filter <- function(req, res) {
  req_user <<- req$HEADERS['authorization'] %>% as.character()
  req_tenant <<- req$HTTP_TENANT
  plumber::forward()  # pass the request on to the route
}
pr$registerHooks(
  list(
    preroute = function() {
      # Start timer for log info
      tictoc::tic()
    },
    postroute = function(req, res) {
      end <- tictoc::toc(quiet = TRUE)
      # Log details about the request and the response
      log_info('{convert_empty(req_user)} {convert_empty(req_tenant)} {convert_empty(req$REMOTE_ADDR)} "{convert_empty(req$HTTP_USER_AGENT)}" {convert_empty(req$HTTP_HOST)} {convert_empty(req$REQUEST_METHOD)} {convert_empty(req$PATH_INFO)} {convert_empty(res$status)} {round(end$toc - end$tic, digits = getOption("digits", 5))}')
    }
  )
)
In the above example, req_user and req_tenant change for every request. The above example gives an error message, stating that req_user and req_tenant do not exist. I have also tried preserialize as an alternative to postroute. How can these variables be logged? They don't need to be global; this was just an additional attempt to solve the problem.
I believe the error is produced because the convert_empty function cannot handle NULLs or NAs.
I adjusted the function to handle NULLs and NAs in case they do occur:
convert_empty <- function(string) {
  if (is.null(string) || is.na(string) || string == "") {
    "-"
  } else {
    string
  }
}
Using the sample plumber.R code that's produced in RStudio, this should work:
# plumber.R
library(plumber)
library(logger)

# Specify how logs are written
log_dir <- "logs"
if (!fs::dir_exists(log_dir)) fs::dir_create(log_dir)
log_appender(appender_tee(tempfile("plumber_", log_dir, ".log")))

#* Return the sum of two numbers
#* @param a The first number to add
#* @param b The second number to add
#* @post /sum
function(a, b) {
  as.numeric(a) + as.numeric(b)
}
Then we register the hooks as follows:
library(plumber)

pr <- plumb("plumber.R")

convert_empty <- function(string) {
  if (is.null(string) || is.na(string) || string == "") {
    "-"
  } else {
    string
  }
}

pr$registerHooks(
  list(
    preroute = function() {
      # Start timer for log info
      tictoc::tic()
    },
    postroute = function(req, res) {
      end <- tictoc::toc(quiet = TRUE)
      # Log details about the request and the response
      log_info('{convert_empty(as.character(req$HEADERS["authorization"]))} {convert_empty(req$HTTP_TENANT)} {convert_empty(req$REMOTE_ADDR)} {convert_empty(req$HTTP_USER_AGENT)} {convert_empty(req$HTTP_HOST)} {convert_empty(req$REQUEST_METHOD)} {convert_empty(req$PATH_INFO)} {convert_empty(res$status)} {round(end$toc - end$tic, digits = getOption("digits", 5))}')
    }
  )
)

pr
In the console, you can then run
pr$run()
to serve your API locally.
From there, go to the terminal in RStudio and do a curl. Assuming 9520 is your port, an example would be:
curl -H "Authorization: Bearer my_token" -H "TENANT: 123" -X POST "http://127.0.0.1:9520/sum?a=1&b=2"
You should see a return of 3 in the terminal, and if you look at the R console, you will see the Authorization and tenant headers logged.
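As an aside, the globals are avoidable altogether: the req object in plumber is an environment that travels with the request, so a filter can stash values on it and a hook can read them back. A small sketch of that pattern (the field names req$user and req$tenant are made up here):

#' @filter auth_filter
auth_filter <- function(req, res) {
  # stash per-request values on the request environment itself
  req$user   <- as.character(req$HEADERS["authorization"])
  req$tenant <- req$HTTP_TENANT
  plumber::forward()
}

# in the postroute hook, read them back from req, e.g.:
# log_info('{convert_empty(req$user)} {convert_empty(req$tenant)} ...')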
So it turns out my specific problem was actually the format of tenant name, e.g. req$HTTP_TENANT = "Company A/S". When this is passed by log_info, it gets split into two columns, which is more than the logging table expects. Maybe the forward slash is escaping something.
For some reason, you have to put it in quotes in log_info, so log_info('"{convert_empty(req$HTTP_TENANT)}"') instead of log_info('{convert_empty(req$HTTP_TENANT)}') as with req$HTTP_USER_AGENT in the original post.
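A tiny illustration of the quoting behaviour (nothing plumber-specific, just logger's glue interpolation with values containing spaces):

library(logger)
tenant <- "Company A/S"  # value containing a space
log_info('{tenant}')     # logged bare: a space-delimited log parser sees two fields
log_info('"{tenant}"')   # logged quoted: parsed as a single field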

R - curl (not httr) POST request w/ JSON body

Let me start by saying that I understand how to do a POST request using the "httr" and "crul" packages. I am working on an asynchronous method for sending multiple POST requests, each with a unique JSON body, using the basic "curl" package. I have legitimate reasons for trying this with this package, but more importantly I'm just determined to get it to work. This may not be possible, or I may even be trying the wrong functions in "curl"... but wanted to see if anyone had any ideas.
I am trying to send POST requests using curl_fetch_multi() with a JSON body like this...
{
  "configuration": {
    "Id": 4507
  },
  "age": 0,
  "zip": 32411,
  "Date": "2020-12-23"
}
I have at least succeeded in getting error messages back from the API indicating an invalid body input, starting with an object containing each body I need to submit:
library(curl)
library(jsonlite)
library(magrittr)

pool <- new_pool()

# results only available through callback function
cb <- function(req){
  cat("done:", req$url, ": HTTP:", req$status, "\n", "content:", rawToChar(req$content), "\n")
}

# Create a request for each body
for(i in 1:nrow(df)){
  curl_fetch_multi(
    "http://api.com/values?api_key=1234",
    done = cb,
    pool = pool,
    handle = new_handle() %>%
      handle_setopt(post = TRUE) %>%
      handle_setheaders("Content-Type" = "application/vnd.v1+json") %>%
      handle_setform(body = df$body[[i]]) # df$body[[i]] is a JSON string
  )
}

# This actually performs the requests
out <- multi_run(pool = pool)
done: http://api.com/values?api_key=1234 : HTTP: 400
content: {"errors":[{"code":"Service.input.invalid","message":"Invalid input"}]}
done: http://api.com/values?api_key=1234 : HTTP: 400
content: {"errors":[{"code":"Service.input.invalid","message":"Invalid input"}]}
....
I'm 90% positive it has to do with how the JSON is passed in the handle_setform() call on the handle. This is about where I am over my head, and documentation is scarce.
Also, I am pretty sure the JSON is structured properly, as I can use them in other packages with no problem.
Any assistance would be greatly appreciated.
Found the solution!!
I needed to use the following settings with handle_setopt():
for(i in 1:nrow(df)){
  curl_fetch_multi(
    "http://api.com/values?api_key=1234",
    done = cb,
    pool = pool,
    handle = new_handle() %>%
      handle_setheaders("Content-Type" = "application/v1+json") %>%
      handle_setopt(customrequest = "POST") %>%
      handle_setopt(postfields = df$body[[i]]) # df$body is a list of JSON strings
  )
}
out <- multi_run(pool = pool)
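For context on why this works: handle_setform() builds a multipart/form-data request, so the JSON string was being sent as a form field rather than as the request body, whereas the postfields option sends the string verbatim as the raw body, which is what a JSON API expects. A quick way to verify a single body before queuing the whole pool (using httpbin.org, which simply echoes the request back):

library(curl)
library(jsonlite)

# one body, shaped like the entries in df$body
body <- toJSON(
  list(configuration = list(Id = 4507), age = 0, zip = 32411, Date = "2020-12-23"),
  auto_unbox = TRUE
)

h <- new_handle()
handle_setheaders(h, "Content-Type" = "application/json")
handle_setopt(h, customrequest = "POST", postfields = as.character(body))

res <- curl_fetch_memory("https://httpbin.org/post", handle = h)
cat(rawToChar(res$content))  # the echo should show the JSON under "data"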

Why is there a Service Unavailable error when using lapply?

I am using the spotifyr library where I want to find audio features for multiple tracks. For example, I can do this to find the audio features of a specific song using its id:
analysis2 <- get_track_audio_features("2xLMifQCjDGFmkHkpNLD9h",
                                      authorization = get_spotify_access_token())
Yesterday, I wrote the function below that takes all the tracks in a dataframe, finds the audio features for all of them, and stores them in a list, and it was working fine.
get_analysis <- function(track_id) {
  analysis <- get_track_audio_features(track_id,
                                       authorization = get_spotify_access_token())
}
tracks_list <- lapply(all_tracks$track.id, get_analysis)
Now I am getting an error saying Request failed [503] and Error in get_track_audio_features(track_id, authorization = get_spotify_access_token()) : Service Unavailable (HTTP 503).
I am still able to find the audio features of a specific song so I am not sure which service is unavailable.
I suspect you are reaching a song in your data for which the response is denied by Spotify. You could try adding an error-catching mechanism to see which one it is:
get_analysis <- function(track_id){
  analysis <- tryCatch(
    expr = {
      get_track_audio_features(track_id, authorization = get_spotify_access_token())
    },
    error = function(e){
      print(track_id)
    }
  )
  return(analysis)
}
tracks_list <- lapply(all_tracks$track.id, get_analysis)
I looked at the source code of the package and didn't see any sneaky rate-limiting issues, and the Web API page lists error 503 as a generic error that resolves itself after waiting (https://developer.spotify.com/documentation/web-api/). Thus you could also try just adding a 10-minute wait (I couldn't find the exact duration on Spotify's website):
get_analysis <- function(track_id){
  output <- tryCatch(
    expr = {
      get_track_audio_features(track_id, authorization = get_spotify_access_token())
    },
    error = function(e){
      print(track_id)
      return(e)
    }
  )
  return(output)
}

wait.function <- function(){
  Sys.sleep(600)
}

get_analysis_master <- function(all_tracks){
  k <- 1
  tracks_list <- list()
  for(track.id in all_tracks$track.id){
    output <- get_analysis(track.id)
    if(!inherits(output, "error")){
      tracks_list[[k]] <- output
      k <- k + 1
    } else {
      wait.function()  # wait, then move on to the next track
    }
  }
  return(tracks_list)
}

tracks_list <- get_analysis_master(all_tracks)
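Note that the loop above waits after a failure but then skips the failing track. If you would rather retry it, a small wrapper along these lines (get_analysis_retry is a hypothetical helper built on the get_analysis() above) does the job:

# retry each failing track up to max_tries times, waiting between attempts
get_analysis_retry <- function(track_id, max_tries = 3, wait_s = 600) {
  for (attempt in seq_len(max_tries)) {
    output <- get_analysis(track_id)
    if (!inherits(output, "error")) return(output)
    Sys.sleep(wait_s)  # back off before the next attempt
  }
  output  # still an error after max_tries attempts
}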

R: 'unable to connect to 'maps.googleapis.com' on port 80' inside foreach loop

I'm new to stackoverflow, so please correct me if I make any major mistakes.
As part of a bigger project, I have a function that requests routes from Google and calculates the driving time; I do this with the package ggmap. This worked perfectly fine until I tried to speed things up in other parts of the project and needed to call the driving-time function within a foreach loop. When I use %dopar% in the loop, it throws this error:
unable to connect to 'maps.googleapis.com' on port 80.
Does anyone know, where this error comes from and how it can be fixed?
I managed to produce a small example that shows the behaviour:
# necessary packages
library(ggmap)
library(doParallel)
library(doSNOW)
library(foreach)

# some lines to test the function in a for and a foreach loop
Origins <- c("Bern", "Biel", "Thun", "Spiez")
Destinations <- c("Biel", "Thun", "Spiez", "Bern")
numRoutes <- length(Origins)

# numCores <- detectCores()
# I use only 1 core in testing to make sure that the debug file is readable
cl <- snow::makeCluster(1, outfile = "debug.txt")
registerDoSNOW(cl)

timesDoPar <- foreach(idx = 1:numRoutes,
                      .packages = c("ggmap")) %dopar% {
  getDrivingTime(Origins[idx], Destinations[idx])
}

timesDo <- foreach(idx = 1:numRoutes,
                   .packages = c("ggmap")) %do% {
  getDrivingTime(Origins[idx], Destinations[idx])
}

stopCluster(cl)
The function (with some extra messaging for debugging):
getDrivingTime <- function(from, to){
  if (from == to){
    drivingTimeMin <- 0
  } else {
    route_simple <- tryCatch({
      message("Trying to get route from Google")
      route(from, to, structure = "route", mode = "driving", output = "simple")
    },
    error = function(cond) {
      message("Route throws an error:\nHere's the original error message:")
      message(cond)
      return(data.frame(minutes = 0))
    },
    warning = function(cond) {
      message("Route throws a warning:\nHere's the original warning message:")
      message(cond)
      return(data.frame(minutes = 0))
    },
    finally = {
      message(paste0("\nProcessed route: ", from, "; ", to, "\n\n"))
    })
    drivingTimeMin <- sum(route_simple$minutes, na.rm = TRUE)
  }
  return(drivingTimeMin)
}
I'm aware that in this example it would make absolutely no sense to use parallel programming - especially with using only one core - but in the scope of the full project it is needed.
I couldn't find any useful information related to this except for this question, where the person asking suggests that the problem might be with the network in their company. I don't think that this is the case for me, since it works with %do%. I couldn't test it in another network yet, though.
(I'm working on Windows 7, using a portable version of R (version 3.1.0) and RStudio (version 0.98.501).)
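One thing worth ruling out (a guess, not a confirmed diagnosis): snow workers start as fresh R sessions, so proxy settings, environment variables, or API credentials configured in the main session do not exist on them, which can make HTTP calls fail only under %dopar%. Per-session setup can be re-applied on each worker before the loop, e.g.:

# re-run session setup on every worker (the proxy URL is a made-up placeholder)
snow::clusterEvalQ(cl, {
  library(ggmap)
  Sys.setenv(http_proxy = "http://proxy.example.com:8080")  # only if behind a proxy
  # register_google(key = "...")  # required by newer ggmap versions
  NULL
})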

R Shiny REST API communication

I have a shiny app that takes a JSON input file, runs it through a classifier and returns a classified JSON object.
I want the app to be able to communicate with an API. I want the API to post a file to the Shiny App which will do its work and return a classified object.
Basically I want the Shiny app to sit in the background until a file is posted and then do its work.
I know that I can use GET from the httr package to get a file from a URL. I can put this in the shiny server file, which is fine if I know the file name for the GET command.
However, the filenames coming from the API will be different, so is there any way that I can make this dynamic according to the POST request that comes from the API?
If you do not have to use Shiny, you can use OpenCPU. OpenCPU exposes each of your R packages as a REST service automatically. I work with OpenCPU and it works fine! It is the easiest way to use R from another program.
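For a flavor of what that looks like, here is a minimal sketch (assuming the opencpu package is installed); every exported function of an installed package becomes an HTTP endpoint:

library(opencpu)
ocpu_start_server(port = 5656)  # single-user development server

# From any HTTP client, POST a function's arguments to call it, e.g.:
#   curl http://localhost:5656/ocpu/library/stats/R/rnorm/json -d "n=5"
# runs stats::rnorm(n = 5) and returns the result as JSON.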
By now, library(plumber) needs to be mentioned as an alternative in this context (a minimal sketch follows below); the example after that shows how to handle POST requests directly in shiny.
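The plumber sketch, with classify() as a hypothetical stand-in for the asker's classifier:

# plumber.R
#* Accept a JSON body, classify it, and return the classified object
#* @post /classify
function(req) {
  input <- jsonlite::fromJSON(req$postBody)
  classify(input)  # classify() is a stand-in for the actual model
}
# run with: plumber::plumb("plumber.R")$run(port = 8000)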
The Shiny example is based on Joe Cheng's gist here, which suggests adding an attribute "http_methods_supported" to the UI and using httpResponse to answer the requests.
The code below starts a shiny app in a background R process (this is done only to have a single-file MRE; of course, you can put the app in a separate file and remove the r_bg line). After the app is launched, the parent process sends the iris data.frame to the UI.
In the UI function, req$PATH_INFO is checked (see uiPattern = ".*"); then the numerical columns are multiplied by 10 (query_params$factor) and sent back as a JSON string.
library(shiny)
library(jsonlite)
library(callr)
library(datasets)

ui <- function(req) {
  # The `req` object is a Rook environment
  # See https://github.com/jeffreyhorner/Rook#the-environment
  if (identical(req$REQUEST_METHOD, "GET")) {
    fluidPage(
      h1("Accepting POST requests from Shiny")
    )
  } else if (identical(req$REQUEST_METHOD, "POST")) {
    # Handle the POST
    query_params <- parseQueryString(req$QUERY_STRING)
    body_bytes <- req$rook.input$read(-1)
    if (req$PATH_INFO == "/iris") {
      postedIris <- jsonlite::fromJSON(rawToChar(body_bytes))
      modifiedIris <- postedIris[sapply(iris, class) == "numeric"] * as.numeric(query_params$factor)
      httpResponse(
        status = 200L,
        content_type = "application/json",
        content = jsonlite::toJSON(modifiedIris, dataframe = "columns")
      )
    } else {
      httpResponse(
        status = 200L,
        content_type = "application/json",
        content = '{"status": "ok"}'
      )
    }
  }
}
attr(ui, "http_methods_supported") <- c("GET", "POST")

server <- function(input, output, session) {}

app <- shinyApp(ui, server, uiPattern = ".*")

# shiny::runApp(app, port = 80, launch.browser = FALSE, host = "0.0.0.0")
shiny_process <- r_bg(function(x){
  shiny::runApp(x, port = 80, launch.browser = FALSE, host = "0.0.0.0")
}, args = list(x = app))

library(httr)
r <- POST(url = "http://127.0.0.1/iris?factor=10", body = iris, encode = "json", verbose())
receivedIris <- as.data.frame(fromJSON(rawToChar(r$content)))
print(receivedIris)

shiny_process$kill()
Please also check this related PR, which provides further examples (also showing how to use session$registerDataObj) and aims at a better description of the httpResponse function.
