How to remove backslashes in a jsonlite response in R?

I have the following code for an R Plumber API server:
library(jsonlite)
library(data.table)
#' Home endpoint
#' @get /
function(){
  df <- data.table(msg = "Welcome")
  toJSON(df)
}
This gives me ["[{\"msg\":\"Welcome\"}]"] as the API result.
How can I replace \" with " to make the output more human-friendly when working in a browser or Postman? The expected result is "msg":"Welcome".
Thanks!

Plumber already serializes the result to JSON; you are encoding it twice. Try this:
#' Home endpoint
#' @get /
function(){
  df <- data.table::data.table(msg = "Welcome")
  return(df)
}
And then in my console, I ran:
pr <- plumber::plumb("~/StackOverflow/4393334/60918243.R")
pr$run()
# Starting server to listen on port 5225
# Running the swagger UI at http://127.0.0.1:5225/__swagger__/
And then on my bash shell:
$ curl -s localhost:5225
[{"msg":"Welcome"}]

Related

How to send curl variables to plumber function dynamically?

I want to dynamically call the plumber API based on any number of input variables, mapping the curl input to the inputs of a function by name. For example, if the function has an input hi, then curl -s --data 'hi=2' should pass hi=2 as an input parameter to the function. This can be done directly in R with match.call(), but it fails when called through the plumber API.
Take the function
#' @post /API
#' @serializer unboxedJSON
tmp <- function(hi) {
  out <- list(hi = hi)
  out <- toJSON(out, pretty = TRUE, auto_unbox = TRUE)
  return(out)
}
tmp(hi=2)
out: {hi:2}
Then
curl -s --data 'hi=10' http://127.0.0.1:8081/API
out: {\n \"hi\": \"10\"\n}
Everything looks good. However, take the function
#' @post /API
#' @serializer unboxedJSON
tmp <- function(...) {
  out <- match.call() %>%
    as.list() %>%
    .[2:length(.)]
  out <- toJSON(out, pretty = TRUE, auto_unbox = TRUE)
  return(out)
}
tmp(hi=2)
out: {hi:2}
Then
curl -s --data 'hi=10' http://127.0.0.1:8081/API
out: {"error":"500 - Internal server error","message":"Error: No method asJSON S3 class: R6\n"}
In practice, what I really want to do is load my ML model to predict a score with the plumber API. For example:
model <- readRDS('model.rds') # Load model as a global variable

predict_score <- function(...) {
  df_in <- match.call() %>%
    as.list() %>%
    .[2:length(.)] %>%
    as.data.frame()
  json_out <- list(
    score_out = predict(model, df_in) %>%
      toJSON(., pretty = T, auto_unbox = T)
  )
  return(json_out)
}
This function works as expected when run locally, but when I call it through the API via curl -s --data 'var1=1&var2=2...etc' http://listen_address,
I get the following error: {"error":"500 - Internal server error","message":"Error in as.data.frame.default(x[[i]], optional = TRUE): cannot coerce class \"c(\"PlumberResponse\", \"R6\")\" to a data.frame\n"}
Internally, plumber matches the parameters in your request to the names of the parameters in your function. There are special arguments you can use to explore all the args in a request. If your function has an argument named req, it will receive an environment containing the entire request metadata, one item of which is req$args, which you can then parse. The first two args are self-references to the special arguments req and res. They are environments and should not be serialized. I would not advise doing what is shown here in any production code, as it opens the API up to abuse.
model <- readRDS('model.rds') # Load model as a global variable

#' @post /API
#' @serializer unboxedJSON
predict_score <- function(req) {
  df_in <- as.data.frame(req$args[-(1:2)])
  json_out <- list(
    score_out = predict(model, df_in)
  )
  return(json_out)
}
But for your use case, what I would actually advise is having a single parameter named df_in. Here is how you would set that up.
model <- readRDS('model.rds') # Load model as a global variable

#' @post /API
#' @param df_in
#' @serializer unboxedJSON
predict_score <- function(df_in) {
  json_out <- list(
    score_out = predict(model, df_in)
  )
  return(json_out)
}
Then with curl
curl --header "Content-Type: application/json" \
--request POST \
--data '{"df_in":{"hi":2, "othercrap":4}}' \
http://listen_address
When the body of the request starts with "{", plumber parses the content of the body with jsonlite::fromJSON and uses the names of the parsed objects to map to parameters in your function.
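As a quick illustration of that parsing step, here is roughly what plumber sees (a sketch you can run in a plain R session):
library(jsonlite)
body <- '{"df_in":{"hi":2, "othercrap":4}}'
# each top-level name becomes a candidate argument for the endpoint function
str(fromJSON(body))
# List of 1
#  $ df_in:List of 2
#   ..$ hi       : num 2
#   ..$ othercrap: num 4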
Currently, neither CRAN nor the master branch on GitHub handles this correctly via the swagger UI, but it works just fine via curl or other direct calling methods. The next plumber version should handle all that and more, I believe.
See a similar answer to this sort of question here: https://github.com/rstudio/plumber/issues/512#issuecomment-605735332

Quit a Plumber API once a condition is met

I am trying to run a Plumber API inline to receive an input, and once the proper input is received and a specified condition is met, the input is returned to the globalenv and the API closes itself such that the script can continue to run.
I've specified a condition within a @get endpoint that calls quit(), stop(), etc., none of which successfully shuts down the API.
I've attempted to run the API in parallel using future such that the parent script can close the Plumber API.
It appears that there isn't actually a method in the Plumber API class object to close the Plumber API, and the API can't be closed from within itself.
I've been through the extended documentation, SO, and the GitHub issues in search of a solution. The only semi-relevant suggestion is to use R.utils::withTimeout to create a time-bounded timeout. However, this method is also unable to close the API.
A simple use case:
Main Script:
library(plumber)
code_api <- plumber::plumb("code.R")
code_api$run(port = 8000)
code.R
#' @get /<code>
function(code) {
  print(code)
  if (nchar(code) == 3) {
    assign("code", code, envir = globalenv())
    quit()
  }
  return(code)
}
#' @get /exit
function(exit){
  stop()
}
The input is successfully returned to the global environment, but the API does not shut down afterward, nor after calling the /exit endpoint.
Any ideas on how to accomplish this?
You could look at Iterative testing with plumber (Irène Steve, Dec 23 2018), which uses:
trml <- rstudioapi::terminalCreate()
rstudioapi::terminalKill(trml)
An excerpt of her article (2nd version of 3):
.state <- new.env(parent = emptyenv()) # create .state when package is first loaded

start_plumber <- function(path, port) {
  trml <- rstudioapi::terminalCreate(show = FALSE)
  rstudioapi::terminalSend(trml, "R\n")
  Sys.sleep(2)
  cmd <- sprintf('plumber::plumb("%s")$run(port = %s)\n', path, port)
  rstudioapi::terminalSend(trml, cmd)
  .state[["trml"]] <- trml # store terminal name
  invisible(trml)
}

kill_plumber <- function() {
  rstudioapi::terminalKill(.state[["trml"]]) # access terminal name
}
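Usage would then look like this (the path and port here are hypothetical):
start_plumber("code.R", 8000)  # launches the API in a hidden terminal
# ... interact with the running API ...
kill_plumber()                 # kills that terminal, stopping the API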
Running Plumber in a terminal might work in some cases, but as I needed access to the R session (for insertText), I had to come up with a different approach. While not ideal, the following solution worked:
# plumber.R
#* Insert
#* @param msg The msg to insert at the cursor location
#* @post /insert
function(msg="") {
  rstudioapi::insertText(paste0(msg))
  stop_plumber(Sys.getpid())
}

.state <- new.env(parent = emptyenv()) # create .state when package is first loaded

stop_plumber <- function(pid) {
  trml <- rstudioapi::terminalCreate(show = FALSE)
  Sys.sleep(2) # Wait for the terminal to initialize
  # Wait a bit for Plumber to flush its buffers, then send a SIGINT to the
  # R session process to terminate Plumber
  cmd <- sprintf("sleep 2 && kill -SIGINT %s\n", pid)
  rstudioapi::terminalSend(trml, cmd)
  .state[["trml"]] <- trml # store terminal name
  Sys.sleep(2) # Wait for Plumber to terminate, then kill the terminal
  rstudioapi::terminalKill(.state[["trml"]]) # access terminal name
}
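An alternative that avoids the terminal round-trip entirely, sketched here on the assumption that the later package is installed and that plumber is backed by httpuv (whose stopAllServers() stops every running httpuv server in the session): schedule the shutdown slightly after the response so the client still gets a reply.
# plumber.R
#* @get /exit
function() {
  # schedule the stop for just after this response has been sent
  later::later(httpuv::stopAllServers, delay = 1)
  "Shutting down..."
}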

R Plumber posting a PDF

I am trying to access a PDF through an HTTP post request with R Plumber, read it with the tabulizer package, and respond with the PDF in JSON format. I am posting a 53kb PDF through Postman to my route and receiving the error:
Error in normalizePath(path.expand(path), winslash, mustWork).
My R API route code is below:
#' @post /tab
#' @json
function(req){
  library("tabulizer")
  f <- req$postBody
  extract_tables(f)
}
When I use the extract_tables() function with a local path to the PDF, it works perfectly as a get route.
#' @get /tab
#' @json
function(){
  library("tabulizer")
  f <- "C:/Users/kelse/Desktop/Rscripts/Tessaract/table.pdf"
  extract_tables(f)
}
Does anybody know how to post a pdf file through Plumber and access it in a function?
You can try using the @serializer annotation to return a PDF over HTTP:
#* @serializer contentType list(type="application/pdf")
#* @get /pdf
function(){
  tmp <- tempfile()
  pdf(tmp)
  plot(1:10, type = "b")
  text(4, 8, "PDF from plumber!")
  text(6, 2, paste("The time is", Sys.time()))
  dev.off()
  readBin(tmp, "raw", n = file.info(tmp)$size)
}
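For the posting side of the question, a sketch of receiving a PDF upload, assuming a plumber version that exposes the raw request body as req$bodyRaw (req$postBody is a character string and will mangle binary data): write the bytes to a temp file and hand that path to tabulizer.
#* @post /tab
function(req) {
  library(tabulizer)
  # req$bodyRaw is assumed to hold the raw bytes of the uploaded PDF
  tmp <- tempfile(fileext = ".pdf")
  writeBin(req$bodyRaw, tmp)
  on.exit(unlink(tmp))
  extract_tables(tmp)
}
You would then post the file with something like curl --data-binary @table.pdf http://localhost:8000/tab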

Web scraping PDF files from a map

I've been trying to download the PDFs embedded in a map using the code below (the original can be found here). Each PDF refers to a Brazilian municipality (5,570 files).
library(XML)
library(RCurl)
url <- "http://simec.mec.gov.br/sase/sase_mapas.php?uf=RJ&tipoinfo=1"
page <- getURL(url)
parsed <- htmlParse(page)
links <- xpathSApply(parsed, path="//a", xmlGetAttr, "href")
inds <- grep("*.pdf", links)
links <- links[inds]
regex_match <- regexpr("[^/]+$", links, perl=TRUE)
destination <- regmatches(links, regex_match)
for(i in seq_along(links)){
  download.file(links[i], destfile=destination[i])
  Sys.sleep(runif(1, 1, 5))
}
I've used this code in other projects a few times and it worked. For this specific case, it doesn't. In fact, I've tried many things to scrape these files, but it seems impossible to me. Recently, I found the following link, which makes it possible to combine uf (state) and muncod (municipal code) to download a file, but I don't know how to incorporate this into the code.
http://simec.mec.gov.br/sase/sase_mapas.php?uf=MT&muncod=5100102&acao=download
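For a single municipality, that link pattern can be fetched directly, e.g. (mode = "wb" keeps the PDF binary-safe on Windows):
download.file(
  "http://simec.mec.gov.br/sase/sase_mapas.php?uf=MT&muncod=5100102&acao=download",
  destfile = "5100102.pdf", mode = "wb"
)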
Thanks in advance!
devtools::install_github("ropensci/RSelenium")
library(rvest)
library(httr)
library(RSelenium)
# connect to selenium server from within r (REPLACE SERVER ADDRESS)
rem_dr <- remoteDriver(
  remoteServerAddr = "192.168.50.25", port = 4445L, browserName = "firefox"
)
rem_dr$open()
# get the two-digit state codes for brazil by scraping the below webpage
tables <- "https://en.wikipedia.org/wiki/States_of_Brazil" %>%
  read_html() %>%
  html_table(fill = T)
states <- tables[[4]]$Abbreviation
# for each state, we are going to go navigate to the map of that state using
# selenium, then scrape the list of possible municipality codes from the drop
# down menu present in the map
get_munip_codes <- function(state) {
  url <- paste0("http://simec.mec.gov.br/sase/sase_mapas.php?uf=", state)
  rem_dr$navigate(url)
  # have to wait until the drop-down menu loads. 8 seconds will be enough time
  # for each state
  Sys.sleep(8)
  src <- rem_dr$getPageSource()
  out <- read_html(src[[1]]) %>%
    html_nodes(xpath = "//select[@id='muncod']/option[boolean(@value)]") %>%
    xml_attrs("value") %>%
    unlist(use.names = F)
  print(state)
  out
}
state_munip <- sapply(
  states, get_munip_codes, USE.NAMES = TRUE, simplify = FALSE
)
# now you can download each pdf. first create a directory for each state, where
# the pdfs for that state will go:
lapply(names(state_munip), function(x) dir.create(file.path("brazil-pdfs", x)))
# ...then loop over each state/municipality code and download the pdf
lapply(names(state_munip), function(state) {
  lapply(state_munip[[state]], function(munip) {
    url <- sprintf(
      "http://simec.mec.gov.br/sase/sase_mapas.php?uf=%s&muncod=%s&acao=download",
      state, munip
    )
    file <- file.path("brazil-pdfs", state, paste0(munip, ".pdf"))
    this_one <- paste0("state ", state, ", munip ", munip)
    tryCatch({
      GET(url, write_disk(file, overwrite = TRUE))
      print(paste0(this_one, " downloaded"))
    },
    error = function(e) {
      print(paste0("couldn't download ", this_one))
      try(unlink(file, force = TRUE))
    })
  })
})
STEPS:
Get the IP address of your Windows machine (see https://www.digitalcitizen.life/find-ip-address-windows).
Start the Selenium server Docker container by running this:
docker run -d -p 4445:4444 selenium/standalone-firefox:2.53.1
Start the rocker/tidyverse Docker container by running this:
docker run -v `pwd`/brazil-pdfs:/home/rstudio/brazil-pdfs -dp 8787:8787 rocker/tidyverse
Go to http://localhost:8787 in your preferred browser. This will take you to the login screen for RStudio Server. Log in using the username "rstudio" and password "rstudio".
Copy/paste the code shown above into a new RStudio .R document. Replace the value of remoteServerAddr with the IP address you found in step 1.
Run the code. This should write the PDFs to a directory "brazil-pdfs" that is both inside the container and mapped to your Windows machine (in other words, the PDFs will show up in the brazil-pdfs dir on your local machine as well). Note that it takes a while to run because there are a lot of PDFs.

Using RCurl postForm to gather JSON data

I am trying to extract time series data from the BLS API using RCurl.
The BLS supplies the following sample code for command line extraction:
curl -i -X POST -H 'Content-Type: application/json'
-d '{"seriesid":["LEU0254555900", "APU0000701111"],
"startyear":"2002", "endyear":"2012"}'
http://api.bls.gov/publicAPI/v1/timeseries/data/
I have also confirmed that the specified series ids are both present, as the following both yield a JSON-formatted object:
require(RCurl)
bls.content_test1 <- getURLContent("http://api.bls.gov/publicAPI/v1/timeseries/data/LEU0254555900")
bls.content_test2 <- getURLContent("http://api.bls.gov/publicAPI/v1/timeseries/data/APU0000701111")
Based on a variety of posts with the RCurl tag (and this post in particular), I have ported the command line script to the following chunk of code:
require(RJSONIO)
jsonbody <- toJSON(list("seriesid"=paste('"["CFU0000008000"', '[LEU0254555900"]"')
,"startyear"="2012"
,"endyear"="2013"))
httpheader <- c(Accept="application/json; charset=UTF-8",
"Content-Type"="application/json")
bls.content <- postForm("http://api.bls.gov/publicAPI/v1/timeseries/data/"
,.opts=list(httpheader=httpheader
,postfields=jsonbody))
Which yields:
[1] "{\"status\":\"REQUEST_FAILED\",\"responseTime\":0,\"message\":[\"Your request has failed, please check your input parameters and try your request again.\"],\"Results\":null}"
attr(,"Content-Type")
charset
"application/json" "UTF-8"
Does this appear to be a problem with my implementation of RCurl or does this appear to be a problem with the BLS API?
It's actually a problem with the way you are creating your JSON body. With your version, if you do cat(jsonbody), you get
{
"seriesid": "\"[\"CFU0000008000\" [LEU0254555900\"]\"",
"startyear": "2012",
"endyear": "2013"
}
which has those extra escapes and brackets in there. It's not correct. Instead try
jsonbody <- toJSON(list("seriesid"=c("CFU0000008000", "LEU0254555900"),
                        startyear="2012",
                        endyear="2013"))
which gives
{
"seriesid": [ "CFU0000008000", "LEU0254555900" ],
"startyear": "2012",
"endyear": "2013"
}
which is valid JSON. Just changing that part and using the rest of the code as you had it gives me a REQUEST_SUCCEEDED message.
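Putting it together, the working version of your original RCurl call looks like this (same endpoint and headers you already had):
require(RCurl)
require(RJSONIO)
jsonbody <- toJSON(list(seriesid = c("CFU0000008000", "LEU0254555900"),
                        startyear = "2012",
                        endyear = "2013"))
httpheader <- c(Accept = "application/json; charset=UTF-8",
                "Content-Type" = "application/json")
bls.content <- postForm("http://api.bls.gov/publicAPI/v1/timeseries/data/",
                        .opts = list(httpheader = httpheader,
                                     postfields = jsonbody))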
Here's an approach with httr:
library(httr)
library(jsonlite)
# Need unbox() to tell jsonlite the numbers are scalars, not vectors of length 1
body <- list(
  seriesid = c("CFU0000008000", "LEU0254555900"),
  startyear = unbox(2012),
  endyear = unbox(2013)
)
r <- POST("http://api.bls.gov/publicAPI/v1/timeseries/data/", body = body, encode = "json")
stop_for_status(r)
# Need to specify type since site returns incorrect type of text/plain
str(content(r, type = "application/json"))
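If you then want the series as plain R objects, parse the body with jsonlite; a sketch (the exact nesting under Results is assumed from the shape of the BLS v1 responses):
parsed <- fromJSON(content(r, as = "text", type = "application/json"))
str(parsed$Results, max.level = 2)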
