How to get data from the WTO API in R

library(httr)
library(jsonlite)
headers = c(
  # Request headers
  'Ocp-Apim-Subscription-Key' = '{subscription key}'
)
params = list()
# Request parameters
params['countries[]'] = '{array}'
resp <- GET(paste0("https://api.wto.org/tfad/transparency/procedures_contacts_single_window?",
                   paste0(names(params), '=', params, collapse = "&")),
            add_headers(headers))
if (!http_error(resp)) {
  jsonRespText <- fromJSON(rawToChar(content(resp, encoding = 'UTF-8')))$Dataset
  jsonRespText
} else {
  stop('Error in Response')
}
I don't know how to get a response from an API in R. I have executed this code, but the server does not seem to respond...

If you examine the value of the resp object after running your code you'll notice a status code:
> resp
Response [https://tfadatabase.org/api/transparency/procedures_contacts_single_window?countries[]=%7Barray%7D]
Date: 2020-04-17 19:25
Status: 422
Content-Type: application/json
Size: 77 B
So the server actually did respond; it just didn't give you what you were hoping for. In the API documentation we can look up this code:
422 Unprocessable Entity
If a member cannot be found, or the request parameters are poorly
formed.
So I just went to the Query Builder and looked for a valid request URL and updated the code. It ran fine - i.e. Status 200.
This was the URL I used in the code:
https://api.wto.org/timeseries/v1/data?i=TP_A_0100&r=000&fmt=json&mode=full&lang=1&meta=false
and the value of resp was
Date: 2020-04-17 19:30
Status: 200
Content-Type: application/json; charset=utf-8
Size: 88 B
I cut out the subscription key in my results above. You can find the Query Builder here. Incidentally, in the Query Builder it automatically includes the subscription key and other "header" info in the URL. You can either remove that first and re-add it in your code, or just change your code to run GET() directly on their version of the URL.
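As a rough sketch of the approach described above, the query string can be built from a named list instead of being pasted together by hand, and the status code checked before parsing. The parameter values are the ones from the Query Builder URL above; the helper name fetch_wto_data and the WTO_API_KEY environment variable are illustrative assumptions, not something taken from the WTO documentation.

```r
library(httr)
library(jsonlite)

# Query parameters copied from the Query Builder URL in the answer above.
wto_query <- list(i = "TP_A_0100", r = "000", fmt = "json",
                  mode = "full", lang = "1", meta = "false")

# Build the full request URL; httr URL-encodes the names and values.
wto_url <- modify_url("https://api.wto.org/timeseries/v1/data", query = wto_query)

# Hypothetical helper: sends the request with the subscription key header
# and parses the JSON body. WTO_API_KEY is an assumed variable name.
fetch_wto_data <- function(url, key = Sys.getenv("WTO_API_KEY")) {
  resp <- GET(url, add_headers(`Ocp-Apim-Subscription-Key` = key))
  if (http_error(resp)) {
    stop("Request failed with status ", status_code(resp))
  }
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}
```

Calling fetch_wto_data(wto_url) would then perform the request. Reporting the status code in the error message makes a 422 like the one above visible immediately.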

Related

Calling a REST API in R

I recently discovered the dataforseo api and tried to call it via R
library(httr)
username <- 'mygmailadress@gmail.com'
password <- 'mypassword'
dataforseo_api <- POST('https://api.dataforseo.com/v2/op_tasks_post/$data',
authenticate(username,password),
body = list(grant_type = 'client_credentials'),
type = "basic",
verbose()
)
This is the message I have received:
<- HTTP/1.1 401 Unauthorized
<- Server: nginx/1.14.0 (Ubuntu)
<- Date: Sun, 08 Jul 2018 13:31:34 GMT
<- Content-Type: application/json
<- Transfer-Encoding: chunked
<- Connection: keep-alive
<- WWW-Authenticate: Basic realm="Rest Server"
<- Cache-Control: no-cache, must-revalidate
<- Expires: 0
<- Access-Control-Allow-Origin: *
<- Access-Control-Allow-Methods: POST, GET, OPTIONS
<- Access-Control-Allow-Headers: Content-Type, Access-Control-Allow-Headers, Authorization, X-Requested-With
Do you know where my issue comes from? Can you please help?
It looks like you're improperly configuring config. I don't see a config= in your code. The body is also not encoded correctly.
Also, in the API documentation I don't see anything about grant_type. It looks like an array of tasks should go there, e.g. something like:
{882394209: {'site': 'ranksonic.com', 'crawl_max_pages': 10}}
Response:
{'results_count': 1, 'results_time': '0.0629 sec.', 'results': {'2308949': {'post_id': 2308949, 'post_site': 'ranksonic.com',
'task_id': 882394209, 'status': 'ok'}}, 'status': 'ok'}
OK, so first off we need set_config or config=:
username <- 'Hack-R@stackoverflow.com' # fake email
password <- 'vxnyM9s7FAKESeIO' # fake password
set_config(authenticate(username,password), override = TRUE)
GET("https://api.dataforseo.com/v2/cmn_se")
Response [https://api.dataforseo.com/v2/cmn_se]
Date: 2018-07-08 16:20
Status: 200
Content-Type: application/json
Size: 551 kB
{
"status": "ok",
"results_time": "0.0564 sec.",
"results_count": 2187,
"results": [
{
"se_id": 37,
"se_name": "google.com.af",
"se_country_iso_code": "AF",
"se_country_name": "Afghanistan",
...
GET("https://api.dataforseo.com/v2/cmn_se/$country_iso_code")
Response [https://api.dataforseo.com/v2/cmn_se/$country_iso_code]
Date: 2018-07-08 15:48
Status: 200
Content-Type: application/json
Size: 100 B
{
"status": "ok",
"results_time": "0.0375 sec.",
"results_count": 0,
"results": []
GET("https://api.dataforseo.com/v2/cmn_se/$op_tasks_post")
Response [https://api.dataforseo.com/v2/cmn_se/$op_tasks_post]
Date: 2018-07-08 16:10
Status: 200
Content-Type: application/json
Size: 100 B
{
"status": "ok",
"results_time": "0.0475 sec.",
"results_count": 0,
"results": []
That was one thing. Also to POST data they need you to specify it as json, e.g. encode = "json". From their docs:
All POST data should be sent in the JSON format (UTF-8 encoding). The
keywords are sent by POST method passing tasks array. The data should
be specified in the data field of this POST array. We recommend to
send up to 100 tasks at a time.
Further:
The task setting is done using POST method when array of tasks is sent to
the data field. Each of the array elements has the following
structure:
then it goes on to list 2 required fields and many optional ones.
Note also that you can use reset_config() afterwards as a better practice. If you're going to be running this a lot, sharing it, or using more than one computer, I would also suggest putting your credentials in environment variables instead of your script, for security and ease.
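The points above (a JSON-encoded body with the tasks in a data field, and credentials read from environment variables) could be combined roughly as follows. This is only a sketch: the endpoint is the one from the question with the trailing /$data dropped, which is an assumption, and post_tasks, DATAFORSEO_USER and DATAFORSEO_PASS are hypothetical names.

```r
library(httr)
library(jsonlite)

# Task array keyed by task id, wrapped in the "data" field the docs describe.
# The values mirror the documentation example quoted earlier.
payload <- list(data = list(
  "882394209" = list(site = "ranksonic.com", crawl_max_pages = 10)
))

# Preview the JSON for this payload.
payload_json <- toJSON(payload, auto_unbox = TRUE)

# Hypothetical helper; the environment variable names are assumptions.
post_tasks <- function(body,
                       user = Sys.getenv("DATAFORSEO_USER"),
                       pass = Sys.getenv("DATAFORSEO_PASS")) {
  resp <- POST("https://api.dataforseo.com/v2/op_tasks_post",
               authenticate(user, pass),
               body = body, encode = "json")
  stop_for_status(resp)
  content(resp, as = "parsed", type = "application/json")
}
```

Passing authenticate() with each request, instead of calling set_config(), leaves the global configuration untouched, so no reset_config() is needed afterwards.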
Another final word of advice is that you may want to just leverage their published Python client library and large compilation of examples. Since every new API request is something you'll be pioneering in R without their support, it may pay off to just do the data collection in Python.
This is an interesting API. If you get over to the Open Data Stack Exchange you should consider sharing it with that community.

R: fetching pdf documents from Companies House API

I'm trying to fetch documents from the API using R. Appreciate the clarification of the process in this post. I've been following the above steps with partial success, but still fail the last step to get access to documents' content:
Find the document filing you're interested in (e.g. make a filing history request for the company). Parse the response for the link to the document in the field "links" : { "document_metadata" : "link URI fragment here" }.
No problem:
library(httr)
library(jsonlite)
library(openssl)
### retrieving filing history ####
company_num = 'FC013908'
key = 'my_key'
fh_path = paste0('/company/', toupper(company_num), "/filing-history") # base R toupper(); str_to_upper() would need stringr
fh_url <- modify_url("https://api.companieshouse.gov.uk/", path = fh_path)
fh_test <- GET(fh_url, authenticate(key, "")) #status_code = 200
fh_parsed <- jsonlite::fromJSON(content(fh_test, "text",encoding = "utf-8"), flatten = TRUE)
docs <- fh_parsed$items
Done.
For a given document, request the document metadata via the CH Document API. Parse the response to get the document (mime) types available and the link to the actual document data (document URI fragment).
No problems here:
md_meta_url = docs$links.document_metadata[1]
key_pass <- paste0(key,":")
decoded_auth <- paste0('Basic ', base64_encode(key_pass))
md_test <- GET(md_meta_url,
add_headers(Authorization = decoded_auth)
)
md_test #status_code = 200!
md_parsed <- jsonlite::fromJSON(content(md_test, "text",encoding = "utf-8"), flatten = TRUE)
This way I can obtain the content URL:
cont_url = md_parsed$links$document
Request the actual document, specifying the mime type (e.g. "application/pdf").
I do it while NOT following the redirect and, as expected, I get the 302 status code with the location header:
accept = 'application/pdf'
cont_test <- GET(cont_url,
add_headers(Authorization = decoded_auth,
Accept = accept),
config(followlocation = FALSE)
)
final_url <- cont_test$headers$location
> final_url
[1] "https://s3-eu-west-1.amazonaws.com/document-api-images-prod/docs/LjBouRHeXXpIYAvqYIPWL06iXaliPz6Pucp1OXCXQhI/application-pdf?AWSAccessKeyId=ASIAJX7TVURFXZTY5DNQ&Expires=1529483765&Signature=uUQx6RTW7XBLqx4L6pYr5tOUySg%3D&x-amz-security-token=FQoDYXdzEP%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDGxe7meYGe3OYhNwcSK3AwcVYJUXaUMf19oVO9s4qNPWN8AHjNNd5rrZhgE9YTkF1OmzyZSL5xHbls664kDP%2Bxd7dz9PIU5O1D%2BVxoDyoYcFiS6acDnO28KpfFE56lUZNfedf1jys%2FP0SJ8f%2F50Cbn93bfOlm0MZA9%2BQ2DYQvPfkWSvrDjMyCXHbu57gpZHjQKPNRTgzGXzUUCvFwREytGMM4eThhn4Glvvx%2FA8IiLbnsvgmEKw9iAj7KWIenhoJq3cTRytUpVeipLnQoBVLau8dFYkKdAHZaYM2Tlx0z6ObRb%2BGdm7W7eOVA1bFXuUXmUmnAHruDIwwLlgOVN2IJ9CxmJU22lY8jrEm%2BUivtrdp2oofn32PryBEJ8jJOg9cIpLbBBx%2FeOkng9zJwnZbute7Nmh%2BnaY2btsId6JjraFNsTvR%2B1qEZX9uuznUdJdqgVfTMj2gGrAmntwk0JAkILlvamzjWC%2F9vAqK7Xvt8aC6hlIMB2vdzTCU9Jf%2FrIMTClTJkk0BzBuvJ86t1l%2BXb4rF5Pab%2FegFpJ6nvZKqde%2F77wMMiTyG35EndmYx4AWqTIh9EofYwKZa9uciNvRT0E2%2BYnT5jZMo%2BdWn2QU%3D"
However, when I try to
Request this URI from Amazon again passing the content type you want again.
I get 400 error:
final_test <- GET(final_url,
add_headers(Authorization = decoded_auth,
Accept = accept
))
> final_test
Response [https://s3-eu-west-1.amazonaws.com/document-api-images-prod/docs/LjBouRHeXXpIYAvqYIPWL06iXaliPz6Pucp1OXCXQhI/application-pdf?AWSAccessKeyId=ASIAJX7TVURFXZTY5DNQ&Expires=1529483765&Signature=uUQx6RTW7XBLqx4L6pYr5tOUySg%3D&x-amz-security-token=FQoDYXdzEP%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDGxe7meYGe3OYhNwcSK3AwcVYJUXaUMf19oVO9s4qNPWN8AHjNNd5rrZhgE9YTkF1OmzyZSL5xHbls664kDP%2Bxd7dz9PIU5O1D%2BVxoDyoYcFiS6acDnO28KpfFE56lUZNfedf1jys%2FP0SJ8f%2F50Cbn93bfOlm0MZA9%2BQ2DYQvPfkWSvrDjMyCXHbu57gpZHjQKPNRTgzGXzUUCvFwREytGMM4eThhn4Glvvx%2FA8IiLbnsvgmEKw9iAj7KWIenhoJq3cTRytUpVeipLnQoBVLau8dFYkKdAHZaYM2Tlx0z6ObRb%2BGdm7W7eOVA1bFXuUXmUmnAHruDIwwLlgOVN2IJ9CxmJU22lY8jrEm%2BUivtrdp2oofn32PryBEJ8jJOg9cIpLbBBx%2FeOkng9zJwnZbute7Nmh%2BnaY2btsId6JjraFNsTvR%2B1qEZX9uuznUdJdqgVfTMj2gGrAmntwk0JAkILlvamzjWC%2F9vAqK7Xvt8aC6hlIMB2vdzTCU9Jf%2FrIMTClTJkk0BzBuvJ86t1l%2BXb4rF5Pab%2FegFpJ6nvZKqde%2F77wMMiTyG35EndmYx4AWqTIh9EofYwKZa9uciNvRT0E2%2BYnT5jZMo%2BdWn2QU%3D]
Date: 2018-06-20 08:37
Status: 400
Content-Type: application/xml
Size: 523 B
<BINARY BODY>
Needless to say, executing
browseURL(final_test$url)
returns Access Denied error. I suspect it may have something to do with Amazon authorization problems similar to those described here. Any ideas how to solve this final hurdle?
Thanks!
The answer was provided by @voracityemail in response to my question on the Companies House Developers Hub. Basically, the final call doesn't require the Authorization header, so if you run the following code for final_test:
final_test <- GET(final_url, add_headers(Accept = accept))
it will return status code 200:
> final_test
Response [https://s3-eu-west-1.amazonaws.com/document-api-images-prod/docs/Rl1qKy2kNqdskHUIsqU9u0bGzH2goTfJfnCrNg4S0lg/application-pdf?AWSAccessKeyId=ASIAJMG7NTZHYC4NH3MA&Expires=1530093768&Signature=EteMSmwXS%2FqqdOFRmYY%2Fgf187Aw%3D&x-amz-security-token=FQoDYXdzELf%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDOMKrcNPR6jb5bnzGSK3A1yzaoVZWhgAeXYCN9WJnxx8b%2BTKCEEZyZui3aR5j0WoNWIQhW9GIQ8R4xTGVkRjwQIhzgDp%2BRCfXGQ0CfPCOfseaQri5m%2BWTEWBgjfToL7%2FMdcC1IINMTFRrih1APE%2FmmTcQaW7SvyZWv3Q4bVQB%2FtOsiX5k8rWVsT7%2FecfQmnJMljcKF0%2F3vDRTtLRURTCtrdegfnIFrSqXkelLxVVypKY9UeURBgxAgngOgoP7YhYt3wD%2BEz5rBdNfMvF1Zuv91hLGDyBaKuV4fRKMRXlymDHCwNgNZl3JeyuAmnX8pexK6PJzH7MerM8QX8LoPfge1yutvqEj0%2FjRSYEShOWUebecQ2tJqWIEOZly0Ji8fc%2BMtFDO1FWZBrMl6lXgkwTMpELnTH5%2BP4ULMdFfEz30bWSnAuTGXcAxsoFWsFTIE2uO35zgkOsAUT2un4UNGnL2S8XexWbgwq%2B%2Bhtxo9ruP9WA8mTpjBkup2Qe5EpvUiNwGX9APjThi7QFTllVWWvpKgzKTSBh%2Btua9xK8RgiNAYDgEa5k%2BH%2FmWIP56WglBE6r3HGsXgbi%2Bff8Rg8z2lVFLo8f9hVv%2BCYoptXM2QU%3D]
Date: 2018-06-27 10:02
Status: 200
Content-Type: application/pdf
Size: 21.7 kB
<BINARY BODY>
and then
browseURL(final_test$url)
will open the specified document in the browser. Victory!
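To keep the file instead of only opening it in the browser, the last step could be wrapped roughly like this. It is a sketch under the assumptions of this thread: the key is used as the Basic auth user name with an empty password, and the pre-signed Amazon URL is requested without an Authorization header. The helper names basic_auth_header and save_document, and the default file name, are made up for illustration.

```r
library(httr)
library(jsonlite)

# Basic auth header for the Companies House calls: the API key is the
# user name and the password is empty.
basic_auth_header <- function(key) {
  paste("Basic", base64_enc(paste0(key, ":")))
}

# Hypothetical helper: download from the pre-signed URL taken from the
# Location header. No Authorization header is sent on this request.
save_document <- function(final_url, dest = "document.pdf") {
  resp <- GET(final_url,
              add_headers(Accept = "application/pdf"),
              write_disk(dest, overwrite = TRUE))
  stop_for_status(resp)
  invisible(dest)
}
```

Note that the pre-signed URL carries an Expires parameter, so it has to be used shortly after it is obtained.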

ArangoDB can't send request with curl

I can't understand what I am doing wrong, but when I send the following request with curl, I get an error:
echo {"id":1,"question":"aaa"},{"id":2,"question":"bbb?"} | curl -X POST --data-binary @- --dump - http://localhost:8529/_db/otest/_api/document/?collection=sitetestanswers
HTTP/1.1 100 (Continue)
HTTP/1.1 400 Bad Request
Server: ArangoDB
Connection: Keep-Alive
Content-Type: application/json; charset=utf-8
Content-Length: 100
{"error":true,"errorMessage":"failed to parse json object: expecting EOF","code":400,"errorNum":600}
Any ideas? I tried wrapping it in [...]. Nothing helps.
With [...] a validator marks this as valid.
Same with D. Here is my code:
void sendQuestionsToArangoDB(Json questions)
{
string collectionUrl = "http://localhost:8529/_db/otest/_api/document/?collection=sitetestanswers";
auto rq = Request();
rq.verbosity = 2;
string s = `{"id":"1","question":"foo?"},{"id":2}`;
auto rs = rq.post(collectionUrl, s, "application/json");
writeln("SENDED");
}
--
POST /_db/otest/_api/document/?collection=sitetestanswers HTTP/1.1
Content-Length: 37
Connection: Close
Host: localhost:8529
Content-Type: application/json
HTTP/1.1 400 Bad Request
Server: ArangoDB
Connection: Close
Content-Type: application/json; charset=utf-8
Content-Length: 100
100 bytes of body received
For D I use this lib: https://github.com/ikod/dlang-requests
Same issue with vibed.
ArangoDB does not understand the JSON if it comes as an array like [...]. It should be passed as key-value pairs. So if you need to pass an array, it should have a key, e.g. mykey : [].
Here is working code:
import std.stdio;
import requests.http;
void main(string[] args)
{
string collectionUrl = "http://localhost:8529/_db/otest/_api/document?collection=sitetestanswers";
auto rq = Request();
rq.verbosity = 2;
string s = `{"some_data":[{"id":1, "question":"aaa"},{"id":2, "question":"bbb"}]}`;
auto rs = rq.post(collectionUrl, s, "application/json");
writeln("SENDED");
}
otest - DB name
sitetestanswers - collection name (should be created in DB)
echo '[{"id":1,"question":"aaa"},{"id":2,"question":"bbb?"}]'
should do the trick. You need to put ticks around the JSON. The array brackets are necessary otherwise this is not valid JSON.
You are trying to send multiple documents. The data in the original question separates the documents by comma ({"id":1,"question":"aaa"},{"id":2,"question":"bbb?"}) which is invalid JSON. Thus the failed to parse json object answer from ArangoDB.
Putting the documents into square brackets ([ ... ]) as some of the commenters suggested will make the request payload valid JSON again.
However, you're sending the data to a server endpoint that handles a single document. The API for POST /_api/document/?collection=... currently accepts a single document at a time. It does not work with multiple documents in a single request. It expects a JSON object, and whenever it is sent something different it will respond with an error code.
If you're looking for batch inserts, please try the API POST /_api/import, described in the manual here: https://docs.arangodb.com/HttpBulkImports/ImportingSelfContained.html
This will work with multiple documents in a single request. ArangoDB 3.0 will also allow sending multiple documents to the POST /_api/document?collection=... API, but this version is not yet released. A technical preview will be available soon however.
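The distinction drawn above (invalid JSON, a valid array, and a single object per request) can be checked offline. The sketch below uses R with jsonlite purely for illustration, to stay consistent with the other threads on this page; it does not talk to ArangoDB, and the variable names are made up.

```r
library(jsonlite)

# The payload from the question: two objects separated by a comma.
bad_payload <- '{"id":1,"question":"aaa"},{"id":2,"question":"bbb?"}'
# The same documents wrapped in an array.
array_payload <- '[{"id":1,"question":"aaa"},{"id":2,"question":"bbb?"}]'

bad_is_valid <- isTRUE(validate(bad_payload))      # comma-separated objects
array_is_valid <- isTRUE(validate(array_payload))  # array of objects

# Split the array into one JSON object per document, which is the shape
# the single-document endpoint expects.
docs <- fromJSON(array_payload, simplifyVector = FALSE)
single_bodies <- vapply(
  docs,
  function(d) as.character(toJSON(d, auto_unbox = TRUE)),
  character(1)
)
```

Each element of single_bodies could then be sent in its own POST to /_api/document?collection=..., while the bulk import API remains the better fit for larger batches.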

Refresh Token for Access Token Google API: R Code

I am attempting to retrieve an access token using my refresh token, client id and client secret for the youtube api using R Code.
This is google's example of how to POST a request.
POST /o/oauth2/token HTTP/1.1 Host: accounts.google.com Content-Type: application/x-www-form-urlencoded client_id=21302922996.apps.googleusercontent.com&client_secret=XTHhXh1SlUNgvyWGwDk1EjXB&refresh_token=1/6BMfW9j53gdGImsixUH6kU5RsR4zwI9lUVX-tqf8JXQ&grant_type=refresh_token
This was my R code:
library(httr)
url<- paste("https://accounts.google.com/o/oauth2/token?client_id=", client_id, "&client_secret=", client_secret, "&refresh_token=", refresh_token, "&grant_type=access_token", sep="")
POST(url)
And I keep getting this response:
Response [https://accounts.google.com/o/oauth2/token?client_id=xxxxxxxxxx&client_secret=xxxxxxxx&refresh_token=xxxxxxxxxxxxxxxxxxxxxx&grant_type=refresh_token]
Date: 2015-09-02 16:43
Status: 400
Content-Type: application/json
Size: 102 B
{
"error" : "invalid_request",
"error_description" : "Required parameter is missing: grant_type"
Is there a better way to do this? Maybe using RCurl? If so, what would the format of the request be? I would appreciate help on this!
The RAdwords package has a function that uses the refresh token to obtain a new access token. If you don't want to add the entire package, you can just add the following code to your script.
refreshToken = function(google_auth) {
# This function refreshes the access token.
# The access token expires after one hour and has to be updated
# with the refresh token.
#
# Args:
# access.token$refresh_token and credentials as input
# Returns:
# New access.token with corresponding time stamp
rt = rjson::fromJSON(RCurl::postForm('https://accounts.google.com/o/oauth2/token',
refresh_token=google_auth$access$refresh_token,
client_id=google_auth$credentials$c.id,
client_secret=google_auth$credentials$c.secret,
grant_type="refresh_token",
style="POST",
.opts = list(ssl.verifypeer = FALSE)))
access <- rt
access
}
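Since the question uses httr, the same request can also be sketched without RCurl. Google's example above sends the fields form-encoded in the request body with grant_type=refresh_token, which is what encode = "form" does here. The helper names refresh_body and refresh_access_token are illustrative assumptions.

```r
library(httr)

# Form fields from Google's example request; note grant_type = "refresh_token".
refresh_body <- function(client_id, client_secret, refresh_token) {
  list(client_id = client_id,
       client_secret = client_secret,
       refresh_token = refresh_token,
       grant_type = "refresh_token")
}

# Hypothetical helper: sends the fields form-encoded in the request body
# rather than in the URL query string.
refresh_access_token <- function(client_id, client_secret, refresh_token) {
  resp <- POST("https://accounts.google.com/o/oauth2/token",
               body = refresh_body(client_id, client_secret, refresh_token),
               encode = "form")
  stop_for_status(resp)
  content(resp, as = "parsed", type = "application/json")
}
```

Keeping the secrets in the body also keeps them out of the URL that httr prints with the response.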

Dealing with gzip encoded GET/OAUTH response in R

I'm new to both R and OAuth. I've learned a little using Coursera examples on the GitHub API, where the OAuth request gave a plaintext response, but now I'm trying to do something that is practical for me and access the EVE-Online CREST OAuth API. Instead of what I got when I tried the GitHub API (I'm using the "httr" library):
Response [https://api.github.com/users/jtleek/repos]
Date: 2014-12-14 08:57
Status: 200
Content-type: application/json; charset=utf-8
Size: 154 kB
[
{
"id": 12441219,
"name": "ballgown",
"full_name": "jtleek/ballgown",
"owner": {
"login": "jtleek",
"id": 1571674,
"avatar_url": "https://avatars.githubusercontent.com/u/1571674?v=3",
"gravatar_id": "",
...
I got this BINARY BODY response:
Response [https://crest-tq.eveonline.com/market/10000002/orders/buy/?type=https://crest-tq.eveonline.com/types/185/]
Date: 2014-12-14 08:05
Status: 200
Content-type: application/vnd.ccp.eve.MarketOrderCollection-v1+json; charset=utf-8
Size: 7.61 kB
<BINARY BODY>
And frankly I have no idea what to do with it. I'm pretty sure it's gzip (I used the Chrome extension Postman to access the same information and the header says it's encoded with gzip), but I don't know how to uncompress it. Maybe there is a standard way of dealing with a binary/gzip response, but my google-fu has failed me.
Here is exact code I'm running:
library(httr)
myapp <- oauth_app("my app name redacted", "my id redacted", "my secret redacted")
eve_token <- oauth2.0_token(oauth_endpoint(authorize = "https://login-tq.eveonline.com/oauth/authorize/",access = "https://login-tq.eveonline.com/oauth/token/"), myapp, scope = "publicData")
token <- config(token = eve_token)
req <- GET("https://crest-tq.eveonline.com/market/10000002/orders/buy/?type=https://crest-tq.eveonline.com/types/185/", token)
EDIT:
YES!!! :)
managed to figure it out :)
result <- content(req, type = "application/json; charset=utf-8")
While the regular content(req) produced just raw binary data, the above translated it to JSON :)
Like I wrote above, what I needed to do was pass more information about the content type and encoding to the content function, like this:
result <- content(req, type = "application/json; charset=utf-8")
The gzip part, as it turned out, was handled automagically; the issue was the strange content type used by the EVE API. When I explicitly passed the desired content type, R was able to read the data as JSON without a problem.
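An alternative sketch, in case overriding the type is not convenient: take the raw bytes and parse them as JSON directly, ignoring the declared content type. The helper name parse_json_body is made up, and the example bytes and their field names are stand-ins for illustration, not real CREST output.

```r
library(jsonlite)

# Parse a raw response body as JSON regardless of the declared content type.
parse_json_body <- function(raw_body) {
  fromJSON(rawToChar(raw_body), simplifyVector = FALSE)
}

# Stand-in bytes for illustration; with httr these would come from
# content(req, as = "raw").
example_body <- charToRaw('{"totalCount": 2, "items": []}')
parsed <- parse_json_body(example_body)
```

With a real response, parse_json_body(content(req, as = "raw")) would play the same role as the content(req, type = ...) call above.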

Resources