How to set an empty body in an httr GET request? - r

How do I set an empty body for my GET request without leaving out the body parameter?
See the following example:
url <- "https://www.capitalonecareers.com/search-jobs/results?ActiveFacetID=0&CurrentPage=3&RecordsPerPage=15&Distance=50&RadiusUnitType=0&Keywords=&Location=&Latitude=&Longitude=&ShowRadius=False&CustomFacetName=&FacetTerm=&FacetType=0&SearchResultsModuleName=Search+Results&SearchFiltersModuleName=Search+Filters&SortCriteria=0&SortDirection=1&SearchType=5&CategoryFacetTerm=&CategoryFacetType=&LocationFacetTerm=&LocationFacetType=&KeywordType=&LocationType=&LocationPath=&OrganizationIds=&PostalCode=&fc=&fl=&fcf=&afc=&afl=&afcf="
GET(url = url, verbose())$headers$`content-length`
I get a result with a content length of 9125.
How can I do the equivalent while setting a body parameter?
GET(url = url, body = NULL, verbose())$headers$`content-length`
(It has status code 200, but returns nothing besides an empty JSON -> content length of 55.)
What I tried:
Finding documentation on body. E.g. https://cran.r-project.org/web/packages/httr/vignettes/quickstart.html.
Trying to set "empty values":
Code:
GET(url = url, body = list(), verbose())$headers$`content-length`
GET(url = url, body = "", verbose())$headers$`content-length`
GET(url = url, body = NULL, verbose())$headers$`content-length`
GET(url = url, body = c(), verbose())$headers$`content-length`
Examining the output of verbose() for the code above. But I don't see any differences in the request being sent.
Why I want to do it:
For some code (with dynamic request methods) it seems easier to specify an empty default value than to add an if statement that passes a body parameter when one is present and leaves it out when it is not needed.
I am aware that a body should not have an impact in a GET request, see e.g.
https://stackoverflow.com/a/983458/3502164. And for this question that would be the case.
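For the dynamic-method use case described above, one option is to build the argument list conditionally and call the verb through do.call(), so the if statement lives in a single place. This is only a sketch: send_request is a hypothetical helper name, and it sidesteps the question of why body = NULL changes the response rather than explaining it.

```r
# Hypothetical helper: `method` is any httr verb function (GET, POST, ...).
# A NULL body means "do not pass a body argument at all".
send_request <- function(method, url, body = NULL, ...) {
  args <- list(url = url, ...)
  if (!is.null(body)) {
    # Only add the body argument when the caller supplied one.
    args <- c(args, list(body = body))
  }
  do.call(method, args)
}

# Usage (needs the httr package and network access):
# send_request(httr::GET, url, httr::verbose())
# send_request(httr::POST, url, body = list(a = 1), encode = "json")
```

Callers can then always pass a body value, with NULL acting as the empty default.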

Related

HTTR POST method not sending JSON encoding right, getting unsupported media response back

Hopefully a quick question. I'm trying to connect to the KuCoin API, although the specific API is not very relevant: I think this is more an issue with how I'm using the POST function and how it sends the JSON along.
Here is my function that is supposed to place an order:
API.Order <- function(pair, buysell, price, size) {
  path = "/api/v1/orders"
  now = as.integer(Sys.time()) * 1000
  json <- list(
    clientOid = as.character(now),
    side = buysell,
    symbol = pair,
    type = "limit",
    price = price,
    size = size
  )
  json = toJSON(json, auto_unbox = TRUE)
  str_to_sign = (paste0(as.character(now), 'POST', path, json))
  signature = as.character(base64Encode(hmac(api_secret, str_to_sign, "sha256", raw = TRUE)))
  passphrase = as.character(base64Encode(hmac(api_secret, api_passphrase, "sha256", raw = TRUE)))
  response = content(POST(url = url,
                          path = path,
                          body = json,
                          encode = "json",
                          config = add_headers("KC-API-SIGN" = signature,
                                               "KC-API-TIMESTAMP" = as.character(now),
                                               "KC-API-KEY" = api_key,
                                               "KC-API-PASSPHRASE" = passphrase,
                                               "KC-API-KEY-VERSION" = "2")
                          ),
                     "text", encoding = "UTF-8")
  response
  data.table(fromJSON(response)$data)
}
API.Order(pair, "sell", 1.42, 1.0)
And everything works, except I get the following response:
"{\"code\":\"415000\",\"msg\":\"Unsupported Media Type\"}"
Which is puzzling to me. Everything else checks out (the signature and the other auth headers), and I set encode to "json" in the POST. I have also tried setting it to the standard "application/json", and neither works. I've been staring at this for hours now and I can't see what (likely very small) thing I got wrong.
Thanks
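One thing that may be worth checking: the httr documentation for POST says that a character body is sent as-is and that content_type() should be used to tell the server what kind of data is being sent. Since json is already a string produced by toJSON, the sketch below sets the content type explicitly with content_type_json(). The helper names order_body and post_json are hypothetical, and whether this resolves the 415 response is an assumption rather than something confirmed above.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: serialise the order once, so the string that is
# signed and the string that is sent are identical.
order_body <- function(pair, buysell, price, size, client_oid) {
  as.character(toJSON(
    list(clientOid = client_oid, side = buysell, symbol = pair,
         type = "limit", price = price, size = size),
    auto_unbox = TRUE
  ))
}

# Hypothetical helper: POST a pre-serialised JSON string and state its
# content type explicitly instead of relying on `encode`.
# `headers` is a named character vector of the KC-API-* headers.
post_json <- function(url, path, json, headers) {
  POST(url = url, path = path, body = json,
       content_type_json(),              # Content-Type: application/json
       add_headers(.headers = headers))
}
```

The same json string would be used both in str_to_sign and as the body argument.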

Running a POST request to get a Service Ticket inside a for loop

I'm working with the NIH/NLM REST API and attempting to programmatically pull lots of data at once. I've never worked with an API that authenticates with Service Tickets (TGT and ST) instead of OAuth, where a ticket needs to be refreshed for every GET request you make, so I'm not sure if I'm even going about this the right way. Any help is much appreciated.
Here's the code I currently have:
library(httr)
library(jsonlite)
library(xml2)
UTS_API_KEY <- 'MY API KEY'
# post to the CAS endpoint
response <- POST('https://utslogin.nlm.nih.gov/cas/v1/api-key', encode='form', body=list(apikey = 'MY API KEY'))
# print out the status_code and content_type
status_code(response)
headers(response)$`content-type`
doc <- content(response)
action_uri <- xml_text(xml_find_first(doc, '//form/@action'))
action_uri
# Service Ticket
response <- POST(action_uri, encode='form', body=list(service = 'http://umlsks.nlm.nih.gov'))
ticket <- content(response, 'text')
ticket #this is the ST I need for every GET request I make
# build search_uri using the paste function for string concatenation
version <- 'current'
search_uri <- paste('https://uts-ws.nlm.nih.gov/rest/search/', version, sep='')
# pass the the query params into httr GET to get the response
query_string <- 'diabetic foot'
response <- GET(search_uri, query=list(ticket=ticket, string=query_string))
## print out some of the results
search_uri
status_code(response)
headers(response)$`content-type`
search_results_auto_parsed <- content(response)
search_results_auto_parsed
class(search_results_auto_parsed$result$results)
search_results_data_frame <- fromJSON(content(response,'text'))
search_results_data_frame
This code works perfectly for just a handful of GET requests. However, I'm attempting to pull 300-something medical terms. For example, for query_string, I'd like to loop through an array of strings (e.g., "diabetes", "blood pressure", "cardiovascular care", "EMT", etc.). I'd need to make the POST request and pass the ST into the GET parameter for every string in the array.
I've played around with this code:
for (i in 1:length(Entity_Subset$Entities)){
  ent = Entity_Subset$Entities[i] #Entities represents my df of strings
  url <- paste(' https://uts-ws.nlm.nih.gov/rest/search/current?string=',
               ent,'&ticket=', sep = "")
  print(url)
}
But I haven't had much luck piecing together the POST and GET requests after putting the strings into the (GET) HTTPS request.
Sidebar: I also attempted writing some pre-scripts in Postman, but oddly the Service Ticket doesn't return as JSON (no key-value pair to grab and pass). Just plain text.
Thank you for any advice you can provide!
I think you can simply wrap both POST and GET requests in a function. Then, lapply that function to a list of characters.
library(httr)
library(jsonlite)
library(xml2)
fetch_data <- function(query_string = 'diabetic foot', UTS_API_KEY = 'MY API KEY', version = 'current') {
  response <- POST('https://utslogin.nlm.nih.gov/cas/v1/api-key', encode='form', body=list(apikey = UTS_API_KEY))
  # print out the status_code and content_type
  message(status_code(response), "\n", headers(response)$`content-type`)
  # the form's action attribute holds the URI used to request Service Tickets
  action_uri <- xml_text(xml_find_first(content(response), '//form/@action')); message(action_uri)
  # Service Ticket
  response <- POST(action_uri, encode = 'form', body=list(service = 'http://umlsks.nlm.nih.gov'))
  ticket <- content(response, 'text'); message(ticket)
  # build search_uri using the paste function for string concatenation
  search_uri <- paste0('https://uts-ws.nlm.nih.gov/rest/search/', version)
  # pass the query params into httr GET to get the response
  response <- GET(search_uri, query=list(ticket=ticket, string=query_string))
  ## print out some of the results
  message(search_uri, "\n", status_code(response), "\n", headers(response)$`content-type`)
  fromJSON(content(response, 'text'))
}
# if you have a list of query strings, then
lapply(Entity_Subset$Entities, fetch_data, UTS_API_KEY = "blah blah blah")
# The `lapply` above is logically equivalent to
entities <- Entity_Subset$Entities
result <- vector("list", length(entities))
# index by position so each result fills its preallocated slot
for (i in seq_along(entities)) {
  result[[i]] <- fetch_data(entities[[i]], "blah blah blah")
}

Problem with authorization to COINAPI REST API with custom header and key in R

I would like to connect to the CoinAPI resources. They provide two types of authorization (https://docs.coinapi.io/#authorization):
Custom authorization header named X-CoinAPI-Key
Query string parameter named apikey
The first method I tried works for basic requests, but it responds with an error for more advanced ones. For example, this endpoint works:
endpoint<-"/v1/exchangerate/BTC?apikey="
But when I specify endpoint like this:
endpoint <- "/v1/trades/BITSTAMP_SPOT_BTC_USD/history?time_start=2016-01-01T00:00:00/?apikey="
I get error 401.
The second method is not working so far; I do not really understand how to specify the custom header name here.
I need to get data from here:
https://rest.coinapi.io/v1/ohlcv/BTC/USD/history?period_id=1DAY&time_start=2017-01-02T00:00:00.0000000Z&time_end=2019-01-02T00:00:00.0000000Z&limit=10000&include_empty_items=TRUE
I would appreciate any help on this issue.
Method 1 (working)
library(httr)
library(jsonlite)
base <- "https://rest.coinapi.io"
endpoint <- "/v1/exchangerate/BTC?apikey="
api_key <- <KEY>
call <- paste0(base, endpoint, api_key)
call
get_prices <- GET(call)
http_status(get_prices)
class(get_prices)
get_prices_text <- content(get_prices, "text", encoding = 'UTF-8')
get_prices_json <- fromJSON(get_prices_text, flatten = TRUE)
names(get_prices_json)
get_prices_json$asset_id_base
head(get_prices_json$rates)
data<-as.data.frame(get_prices_json)
Method 2 (not working)
key<-<KEY>
GET(
  url = sprintf("https://rest.coinapi.io/v1/exchangerate/BTC"),
  add_headers(`Authorization` = sprintf("X-CoinAPI-Key: ", key))
) -> res
http_status(res)
From reading the examples in the documentation, it looks like it's just looking for a simple header, not an "Authorization" header specifically. Try this:
GET(
  url = sprintf("https://rest.coinapi.io/v1/exchangerate/BTC"),
  add_headers(`X-CoinAPI-Key` = key)
) -> res
http_status(res)
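For the history endpoints with several query parameters, it may be easier to let httr assemble the query string from a named list instead of pasting it by hand, while still sending the key in the X-CoinAPI-Key header. A minimal sketch, assuming the OHLCV path from the question; ohlcv_url and get_ohlcv are hypothetical helper names, and values are passed as strings so they are not reformatted.

```r
library(httr)

# Hypothetical helper: build the OHLCV history URL from a named list
# of query parameters (modify_url handles the "?" and "&" separators).
ohlcv_url <- function(base = "BTC", quote = "USD", params = list()) {
  modify_url(
    "https://rest.coinapi.io",
    path = paste("v1/ohlcv", base, quote, "history", sep = "/"),
    query = params
  )
}

# Hypothetical helper: request the URL with the key in the custom header.
get_ohlcv <- function(key, ...) {
  GET(ohlcv_url(...), add_headers(`X-CoinAPI-Key` = key))
}

# Usage (needs a valid key and network access):
# res <- get_ohlcv(key, params = list(period_id = "1DAY",
#                                     time_start = "2017-01-02T00:00:00",
#                                     limit = "10000"))
# http_status(res)
```

Because the key travels in the header, no apikey parameter needs to be appended to the URL.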

Setting cookies and reading html

I need to read a page's source code for research. I can read the full text when I use a browser, but in R part of it is hidden: the code is replaced by a message saying that the content is only available to browsers that use cookies.
Based on the question
How to properly set cookies to get URL content using httr
I am using the following code:
library(httr)
url<-"https://www.ogol.com.br/player_results.php?id=5637"
r <- GET(url, query = list(a = 1))
cookies(r)
response<-GET(url,
              set_cookies(`__cfduid` = "dde27d084f28a84488910bf48f22f5fa01530024956",
                          `FORCE_SITE_VERSION` = "desktop",
                          `FORCE_MODALIDADE` = "1",
                          `PHPSESSID` = "uou4jukkosdaafidp26857k8t3"))
player_code<-content(x = response,as = "text", encoding = "ISO-8859-1")
But it also hides a part of the code and returns the message:
"Este conteúdo apenas está disponível para browsers que aceitam cookies" (I include the message so you can check whether your attempt gives the same result :) )
It means: the content is only available to browsers that accept cookies.
Am I using the wrong cookie values, or is there any other clue? Thanks in advance.

R - Parse HTML only if http status response is 200

I have a data frame urls, which is just a list of URLs that I want to crawl to obtain a variable pageName defined in the source code. For this purpose I use the following code:
# Crawl Page Names
for(n in 1:length(urls$URL))
{
  if (domain(urls$URL[n])=="www.domain.com") {
    # keep a reference to the connection so it can be closed afterwards
    con = file(as.character(urls$URL[n]), encoding = "UTF-8")
    doc = readLines(con = con)
    close(con)
    rowNumber = grep('s.pageName', doc)
    datalines = grep(pageNamePattern,doc[rowNumber],value=TRUE)
    gg = gregexpr(pageNamePattern,datalines)
    matches = mapply(getexpr,datalines,gg)
    matches = gsub(" ", "", matches[1], fixed = TRUE)
    result = gsub(pageNamePattern,'\\1',matches)
    names(result) = NULL
    urls$pageName[n] = stri_unescape_unicode(result[1])
  } else {
    urls$pageName[n] <- NA
  }
}
if (domain(urls$URL[n])=="www.domain.com") uses the function domain from the urltools package and lets me crawl only those URLs where I know the pageName variable is defined, which are those in a specific domain.
However, my code is interrupted if the HTTP status of the requested page is a 4XX client error or a 5XX server error.
I would like to add a second if to the code so that the crawl happens only if the HTTP status of the response for con is 200 (OK). Does someone have an idea of how to do it, or which package or functions to use?
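One possible approach is to fetch the page with httr instead of readLines, since the response object exposes the status code before any parsing happens. The sketch below is only an illustration: read_lines_if_ok is a hypothetical helper name, and it returns NULL both for non-200 responses and for requests that fail outright.

```r
library(httr)

# Hypothetical helper: return the page as a character vector of lines,
# or NULL if the request fails or the status is not 200.
read_lines_if_ok <- function(page_url) {
  resp <- tryCatch(GET(page_url), error = function(e) NULL)
  if (is.null(resp) || status_code(resp) != 200) {
    return(NULL)
  }
  txt <- content(resp, as = "text", encoding = "UTF-8")
  # Split the body into lines, similar to what readLines would return.
  strsplit(txt, "\r?\n")[[1]]
}

# Usage inside the loop (sketch):
# doc <- read_lines_if_ok(as.character(urls$URL[n]))
# if (is.null(doc)) { urls$pageName[n] <- NA; next }
```

The existing grep and gsub steps could then run on doc unchanged whenever it is not NULL.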
