Generate OAuth Token using R

Novice - first time attempting to extract data via an API and using R.
I obtained an API key and the secret.
Converted them to base64.
Now perplexed as to the next step, where the instructions that I have state that I should "Enter the generated base64value in the header and request body and call the token URI as shown below":
[Code]
Authorization: Basic {base64value}
Content-Type: application/x-www-form-urlencoded
POST https://api.destination.com/oauth/token
grant_type=client_credentials
[/Code]
Any insight as to if R can be used to obtain the OAuth Token?
If so, what are the required packages that I need to install?
What are the specific steps?
Currently reading several books on R but thought that someone will be able to provide some insight.
Thanks in advance.

The httr package is a great place to start learning APIs in R. If you haven't been led there already, I highly recommend taking the time to check it out.
library(httr)
# the base64 string you generated from your key and secret
base64_value <- "your_generated_base64string"
response <-
  POST(url = "https://api.precisely.com/oauth/token",
       add_headers(Authorization = paste("Basic", base64_value)),
       body = list(grant_type = "client_credentials"),
       encode = "form")
# we're hoping this is 200
response$status_code
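A common stumbling block with this call is how the base64 value itself is produced. A minimal sketch, assuming the server expects standard HTTP Basic credentials, that is, the key and secret joined by a colon and then base64-encoded (the provider's documentation is the authority on this). `basic_credentials` is a hypothetical helper, and the key and secret are sample values only:

```r
library(jsonlite)

# Hypothetical helper: base64-encode "key:secret" for an HTTP Basic header.
basic_credentials <- function(key, secret) {
  pair <- charToRaw(paste0(key, ":", secret))
  # Strip any whitespace so the value is safe to place in a header.
  gsub("[[:space:]]", "", base64_enc(pair))
}

# Sample values only; substitute your own key and secret.
auth_header <- paste("Basic", basic_credentials("Aladdin", "open sesame"))
print(auth_header)
```

The resulting string can be passed as `add_headers(Authorization = auth_header)`.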

The following code does not work. A status code of 401 is the result.
After attempting this numerous times, maybe I need to once again regenerate another key and secret. Then, obtain another base64 value. Then, try again. Other options include trying Rcurl or even trying Python.
I assume that the server eventually locks out a base64value after so many unsuccessful attempts.
Appreciate the time/insight.
Any additional insight is appreciated.
library(httr)
base64_value <-
"123456789="
response14 <-
httr::POST (url = "https://api.precisely.com/oauth/token",
httr::add_headers(Authorization = paste("Basic", base64_value)),
body = list(grant_type = "client_credentials"),
encode = "form"
)

Latest iteration.
Error received is regarding timeout.
library(httr)
base64_value <-
"123456789="
response14 <-
httr::POST (url = "https://api.precisely.com/oauth/token",
httr::add_headers(Authorization = paste("Basic", "123456789=")),
body = list(grant_type = "client_credentials"),
encode = "form"
)
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [api.precisely.com] Resolving timed out after 10000 milliseconds

Related

Using R to download data automatically

I want to download all the data in either pdf or excel for
each State X Crop Year X Standard Reports combination from this website.
I followed this tutorial to do what I want.
Download data from URL
However, I hit an error on the second line.
driver <- rsDriver()
Error in subprocess::spawn_process(tfile, ...) :
group termination: could not assign process to a job: Access is denied
Are there any alternative methods that I could use to download these data?
First, check the website's robots.txt, if there is one. Then read the terms and conditions, if there are any. It is also important to throttle your requests, as the code below does.
After checking all the terms and conditions, the code below should get you started:
library(httr)
library(xml2)
link <- "https://aps.dac.gov.in/LUS/Public/Reports.aspx"
r <- GET(link)
doc <- read_html(content(r, "text"))
#write_html(doc, "temp.html")
states <- sapply(xml_find_all(doc, ".//select[@name='DdlState']/option"), function(x)
  setNames(xml_attr(x, "value"), xml_text(x)))
states <- states[!grepl("^Select", names(states))]
years <- sapply(xml_find_all(doc, ".//select[@name='DdlYear']/option"), function(x)
  setNames(xml_attr(x, "value"), xml_text(x)))
years <- years[!grepl("^Select", names(years))]
rptfmt <- sapply(xml_find_all(doc, ".//select[@name='DdlFormat']/option"), function(x)
  setNames(xml_attr(x, "value"), xml_text(x)))
stdrpts <- unlist(lapply(xml_find_all(doc, ".//td/a"), function(x) {
  id <- xml_attr(x, "id")
  if (grepl("^TreeView1t", id)) return(setNames(id, xml_text(x)))
}))
# collect the hidden form fields (__VIEWSTATE etc.) that must be posted back
get_vs <- function(doc) sapply(xml_find_all(doc, ".//input[@type='hidden']"), function(x)
  setNames(xml_attr(x, "value"), xml_attr(x, "name")))
fmt <- rptfmt[2] #Excel format
for (sn in names(states)) {
for (yn in names(years)) {
for (srn in seq_along(stdrpts)) {
s <- states[sn]
y <- years[yn]
sr <- stdrpts[srn]
r <- POST(link,
body=as.list(c("__EVENTTARGET"="DdlState",
"__EVENTARGUMENT"="",
"__LASTFOCUS"="",
"TreeView1_ExpandState"="ennnn",
"TreeView1_SelectedNode"="",
"TreeView1_PopulateLog"="",
get_vs(doc),
DdlState=unname(s),
DdlYear=0,
DdlFormat=1)),
encode="form")
doc <- read_html(content(r, "text"))
treeview <- c("__EVENTTARGET"="TreeView1",
"__EVENTARGUMENT"=paste0("sStandard Reports\\", srn),
"__LASTFOCUS"="",
"TreeView1_ExpandState"="ennnn",
"TreeView1_SelectedNode"=unname(stdrpts[srn]),
"TreeView1_PopulateLog"="")
vs <- get_vs(doc)
ddl <- c(DdlState=unname(s), DdlYear=unname(y), DdlFormat=unname(fmt))
r <- POST(link, body=as.list(c(treeview, vs, ddl)), encode="form")
if (r$headers$`content-type`=="application/vnd.ms-excel")
writeBin(content(r, "raw"), paste0(sn, "_", yn, "_", names(stdrpts)[srn], ".xls"))
Sys.sleep(5)
}
}
}
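The nested loops above visit every state × year × report combination. A small sketch of an equivalent way to enumerate the combinations up front with `expand.grid()`, which also makes it easy to build file names that are safe on disk. The vectors here are made-up stand-ins for the scraped `states`, `years` and `stdrpts`, and `safe_name` is a hypothetical helper:

```r
# Hypothetical stand-ins for the vectors scraped from the page.
states  <- c("Andhra Pradesh" = "1", "Kerala" = "2")
years   <- c("2001-02" = "2001", "2002-03" = "2002")
stdrpts <- c("Report A" = "TreeView1t1", "Report B/C" = "TreeView1t2")

# One row per state x year x report combination.
combos <- expand.grid(state = names(states),
                      year = names(years),
                      report = names(stdrpts),
                      stringsAsFactors = FALSE)

# Replace characters that are awkward in file names (spaces, slashes, ...).
safe_name <- function(x) gsub("[^A-Za-z0-9._-]+", "_", x)
combos$file <- paste0(
  safe_name(paste(combos$state, combos$year, combos$report, sep = "_")),
  ".xls")

print(combos)
```

Iterating over `seq_len(nrow(combos))` then replaces the three nested loops, and a report title containing a slash no longer breaks the output path.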
Here is my best attempt:
If you look in the network activities you will see a post request is sent:
Request body data:
If you scroll down you will see the form data that is used.
body <- structure(list(`__EVENTTARGET` = "TreeView1", `__EVENTARGUMENT` = "sStandard+Reports%5C4",
`__LASTFOCUS` = "", TreeView1_ExpandState = "ennnn", TreeView1_SelectedNode = "TreeView1t4",
TreeView1_PopulateLog = "", `__VIEWSTATE` = "", `__VIEWSTATEGENERATOR` = "",
`__VIEWSTATEENCRYPTED` = "", `__EVENTVALIDATION` = "", DdlState = "35",
DdlYear = "2001", DdlFormat = "1"), .Names = c("__EVENTTARGET",
"__EVENTARGUMENT", "__LASTFOCUS", "TreeView1_ExpandState", "TreeView1_SelectedNode",
"TreeView1_PopulateLog", "__VIEWSTATE", "__VIEWSTATEGENERATOR",
"__VIEWSTATEENCRYPTED", "__EVENTVALIDATION", "DdlState", "DdlYear",
"DdlFormat"))
There are certain session related values:
attr_names <- c("__EVENTVALIDATION", "__VIEWSTATEGENERATOR", "__VIEWSTATE", "__VIEWSTATEENCRYPTED")
You could add them like this:
setAttrNames <- function(attr_name){
  name <- doc %>%
    html_nodes(xpath = glue("//*[@id = '{attr_name}']")) %>%
    html_attr(name = "value")
  body[[attr_name]] <<- name
}
Then you can add these session-specific values:
library(rvest)
library(glue)
url <- "https://aps.dac.gov.in/LUS/Public/Reports.aspx"
doc <- url %>% GET %>% content("text") %>% read_html
sapply(attr_names, setAttrNames)
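An alternative to assigning into `body` with `<<-` is to collect all hidden inputs in one pass and merge them into the form data. A minimal sketch on an inline HTML snippet; the markup and field values are made up for illustration, and the real page's hidden fields may differ:

```r
library(xml2)

# Made-up form markup standing in for the real page.
html <- paste0(
  '<form>',
  '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123"/>',
  '<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789"/>',
  '<input type="text" name="q" value="ignored"/>',
  '</form>')
page <- read_html(html)

# Named character vector: hidden field name -> value.
hidden <- xml_find_all(page, ".//input[@type='hidden']")
fields <- setNames(xml_attr(hidden, "value"), xml_attr(hidden, "name"))

# Merge the session fields into the rest of the form data.
form_body <- modifyList(list(DdlState = "35", DdlYear = "2001"),
                        as.list(fields))
str(form_body)
```

Because the fields are read fresh from each response, the same few lines can be rerun after every POST to pick up the updated session values.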
Then you can send the request:
response <- POST(
url = url,
encode = "form",
body = body,
hdrs
)
response$status_code # still indicates that we have an error in the request.
Follow up ideas:
I checked for cookies. There is a session cookie, but it does not seem to be necessary for the request.
Adding headers.
Trying to set the request headers
header <- structure(c("aps.dac.gov.in", "keep-alive", "3437", "max-age=0",
"https://aps.dac.gov.in", "1", "application/x-www-form-urlencoded",
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36",
"?1", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
"same-origin", "navigate", "https://aps.dac.gov.in/LUS/Public/Reports.aspx",
"gzip, deflate, br", "de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7"), .Names = c("Host",
"Connection", "Content-Length", "Cache-Control", "Origin", "Upgrade-Insecure-Requests",
"Content-Type", "User-Agent", "Sec-Fetch-User", "Accept", "Sec-Fetch-Site",
"Sec-Fetch-Mode", "Referer", "Accept-Encoding", "Accept-Language"
))
hdrs <- header %>% add_headers
response <- POST(
url = url,
encode = "form",
body = body,
hdrs
)
But I get a timeout for this request.
Note: The site does not seem to have a robots.txt. But check the Terms and Conditions of the site.
I tried running these 2 lines myself at work and got a somewhat more explicit error message than you.
Could not open chrome browser.
Client error message:
Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
Further Details: run errorDetails method
Check server log for further details.
It might be because if you are at work without admin privileges, R can't create a child process.
As a matter of fact I used to run into absolutely awful problems myself trying to build a bot using RSelenium. rsDriver() was not consistent at all and kept crashing. I had to include it in a loop with error catching in order to keep it running, but then I had to find out and delete gigabytes of temp files manually.
I tried to install Docker and spent a lot of time doing the setup but finally it wasn't supported on my Windows non-professional edition.
Solution: Selenium from Python is very well documented, never crashes, works like a charm. Coding in the interactive Spyder editor from Anaconda feels almost like R.
And of course you can use something like system("python myscript.py") from R in order to get the process started and the resulting files back into R if you wish so.
EDIT: No admin privileges are required at all for Anaconda or Selenium. I run it myself without any problem from work. If you have trouble with pip install commands being SSL-blocked like me you can bypass it using the --trusted-host argument.
Selenium is useful when you must run the javascript on a webpage. For websites that don't require the javascript to be run (i.e. if the information you're after is contained within the webpage HTML), rvest or httr are your best bets.
In your case though, to download a file, simply use download.file(), which is a function in base R.
The website in your question is currently down (so I can't see it), but here's an example using a random file from another website
download.file("https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf", "mygreatfile.pdf")
To check that it worked
dir()
# [1] "mygreatfile.pdf"
Depending on how the website is structured, you may be able to obtain a list of the file urls, then loop through them in R downloading one after another.
Lastly, an extra tip. Depending on the file type, and what you're doing with them, you may be able to read them directly into R (instead of saving them first). For example read.csv() works with a url to directly read the csv from the web. Other read functions may be able to do the same.
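The loop over file URLs can be sketched as follows. To keep the example runnable without a network connection it uses two temporary local files addressed through `file://` URLs as stand-ins for the remote documents; with a real site, `file_urls` would hold the http(s) links instead. All names here are illustrative:

```r
# Create two small local files to stand in for remote documents.
src_dir <- tempfile("src")
dir.create(src_dir)
src_files <- file.path(src_dir, c("a.csv", "b.csv"))
for (f in src_files) writeLines(c("x,y", "1,2"), f)
file_urls <- paste0("file://", normalizePath(src_files, winslash = "/"))

dest_dir <- tempfile("downloads")
dir.create(dest_dir)

for (u in file_urls) {
  dest <- file.path(dest_dir, basename(u))
  # download.file() returns 0 on success; treat errors as failures too.
  ok <- tryCatch(download.file(u, dest, mode = "wb", quiet = TRUE) == 0,
                 error = function(e) FALSE)
  if (!ok) message("Failed: ", u)
  Sys.sleep(0.5)  # pause between requests to avoid hammering the server
}

downloaded <- list.files(dest_dir)
print(downloaded)
```

Wrapping each download in `tryCatch()` lets the loop carry on when a single file fails, and `mode = "wb"` matters for binary formats such as PDF or Excel files.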
Update
I currently see an internal 500 error when I visit the site, but I can see the site via the wayback machine, so I can see there is indeed javascript on the webpage. When the site is back up and running, I will attempt to download the files

GET from api containing Key and Client from R

I need to bring data from this site (image below) in R
but I don't know how to pass the parameters Key and Client via code. Can anyone help me?
Follow my code:
dados_get <- GET('http://api.climatempo.com.br/api/v1/forecast/72hours/temperature?idlocale=6873')
Try the following code:
require(httr)
require(jsonlite)
#list of parameters
a = list("key"="01234","type"="json","client"="abc","localeid"="6873")
dados_post <- POST(url='http://api.climatempo.com.br/api/v1/monitoring/weather',
body = toJSON(a))
Let me know if it works.
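If the API accepts the credentials as query-string parameters (an assumption; the parameter names below are simply taken from the snippets above and should be checked against the Climatempo documentation), httr can build the URL from a named list. This sketch only constructs the URL and sends nothing:

```r
library(httr)

base_url <- "http://api.climatempo.com.br/api/v1/forecast/72hours/temperature"

# Placeholder values; the parameter names are assumptions.
params <- list(idlocale = "6873", key = "01234", client = "abc")

# Build the full request URL without performing the request.
request_url <- modify_url(base_url, query = params)
print(request_url)

# To actually send it:
# dados_get <- GET(base_url, query = params)
```

Passing `query = params` to `GET()` keeps the credentials out of hand-pasted URL strings and lets httr take care of escaping.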

how to retrieve data from a web server using an oauth2 token in R?

I've successfully received an access token from an OAuth 2.0 request so that I can start obtaining some data from the server. However, I keep getting error 403 on each attempt. APIs are very new to me and I am only entry level in R, so I can't figure out what's wrong with my request. I'm currently using the crul package, and I've tried the httr package as well, but nothing gets through without the 403 error. I have a Shiny app that I'd eventually like to refresh with data imported from this other application, which actually stores the data, but I want to pull data to my console locally first so I can understand the basic process. I will post some of my current attempts.
library(crul)
(x <- HttpClient$new(
  url = 'https://us.castoredc.com',
  opts = list(exceptions = FALSE),
  headers = list())
)
res.token <- x$post('oauth/token',
  body = list(client_id = "{id}",
              client_secret = "{secret}",
              grant_type = 'client_credentials'))
# parse the token response (the object is res.token, not res)
importantStuff <- jsonlite::fromJSON(res.token$parse("UTF-8"))
token <- paste("Bearer", importantStuff$access_token)
I obtain my token, but the following doesn't seem to work.
I'm attempting to get the list of study codes so that I can call on them in further requests to actually get data from a study.
res.studies <- x$get('/api/study',headers = list(Authorization =
token,client_id = "{id}",
client_secret = "{secret}",
grant_type = 'client_credentials'),
body = list(
content_type = 'application/json'))
Their support team gave me the above endpoint to access the content, but I get 403, so I think I'm not using my token correctly?
status: 403
access-control-allow-headers: Authorization
access-control-allow-methods: Get,Post,Options,Patch
I'm the CEO at Castor EDC and although it's pretty cool to see a Castor EDC question on here, I apologize for the time you lost trying to figure this out. Was our support team not able to provide more assistance?
Regardless, I have actually used our API quite a bit in R and we also have an amazing R Engineer in house if you need more help.
Reflecting on your answer, yes, you always need a Study ID to be able to do anything interesting with the API. One thing that could make your life A LOT easier is our R API wrapper, you can find that here: https://github.com/castoredc/castoRedc
With that you would:
remotes::install_github("castoredc/castoRedc")
library(castoRedc)
castor_api <- CastorData$new(key = Sys.getenv("CASTOR_KEY"),
secret = Sys.getenv("CASTOR_SECRET"),
base_url = "https://data.castoredc.com")
example_study_id <- studies[["study_id"]][1]
fields <- castor_api$getFields(example_study_id)
etc.
Hope that makes your life a lot easier in the future.
So, after some investigation, it turns out that you first have to make a request to obtain another ID for each Castor study under your username. I will post some example code that finally worked.
req.studyinfo <- httr::GET(url = "us.castoredc.com/api/study"
,httr::add_headers(Authorization = token))
json <- httr::content(req.studyinfo,as = "text")
studies <- fromJSON(json)
Then, this will give you a list of your studies in Castor for which you can obtain the ID that you care about for your endpoints. It will be a list that contains a data frame containing this information.
You use the same format with whatever endpoint you like from their documentation to retrieve data. Thank you for your observations! I will leave this here in case anyone is employed to develop anything from data stored in Castor EDC. Their documentation was vague to me, so maybe it will help someone in the future.
Example for next step:
req.studydata <- httr::GET("us.castoredc.com/api/study/{study id obtained from previous step}/data-point-collection/study",
                           httr::add_headers(Authorization = token))
json.data <- httr::content(req.studydata,as = "text")
data <- fromJSON(json.data)
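The two pieces that tend to go wrong in this flow are the `Authorization` value and the endpoint path. A small offline sketch of both, assuming the token endpoint returns a standard OAuth 2.0 JSON body with an `access_token` field; the JSON string and the study ID below are made up for illustration:

```r
library(jsonlite)

# Made-up token response in the usual OAuth 2.0 shape (an assumption).
token_json <- '{"access_token":"abc123","token_type":"bearer","expires_in":18000}'
parsed <- fromJSON(token_json)

# The header value is the word "Bearer", a space, then the token.
token <- paste("Bearer", parsed$access_token)

# Fill the study ID into the endpoint path instead of editing the string by hand.
study_id <- "STUDY-ID-FROM-PREVIOUS-STEP"  # placeholder
endpoint <- sprintf(
  "https://us.castoredc.com/api/study/%s/data-point-collection/study",
  study_id)

print(token)
print(endpoint)
```

Only the `Authorization` header is needed on data requests; the client ID, secret and grant type belong to the token request alone.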
This worked for me, I removed the Sys.getenv() part
library(castoRedc)
castor_api <- CastorData$new(key = "CASTOR_KEY",
secret = "CASTOR_SECRET",
base_url = "https://data.castoredc.com")
example_study_id <- studies[["study_id"]][1]
fields <- castor_api$getFields(example_study_id)

Connecting to Pocket API with R

I'm trying to connect to the Pocket API via R. I can do this easily by running a POST request in json format like this:
URL: http://getpocket.com/v3/get
POST /v3/get HTTP/1.1
Host: getpocket.com
Content-Type: application/json
{"consumer_key":"xxx-xxxxx",
"access_token":"aaaaa-aaaaaaaaaaaa"}
In R I tried using the POST function in the httr package, but I wasn't able to figure out how to pass the correct parameters:
library(rjson); library(httr)
the_url <- "https://getpocket.com/v3/get"
the_body <- toJSON(list(consumer_key = "xxx-xxxxx", access_token="aaaaa-aaaaaaaaaaaa"))
results <- POST(url=the_url, encode="json", body=the_body)
I always get the status "400 Bad Request". I know the example is not reproducible, but for security reasons I'd rather not share the consumer_key and access_token .
Are you sure your access_token is a good one? If so, I think you just need to change to
url <- "https://getpocket.com/v3/get"
body <- list(consumer_key = "xxx-xxxxx", access_token="aaaaa-aaaaaaaaaaaa")
results <- POST(url, body = body)
content(results)
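A possible reason the first attempt fails: httr documents that a character body is sent as is, and that `encode` applies to named lists, so pre-serializing with `toJSON()` bypasses the JSON handling. A sketch of what passing the list with `encode = "json"` would serialize, using jsonlite directly so it runs offline; that this mirrors httr's internal call is my understanding rather than something the thread confirms, and the real request is left commented out because it needs valid credentials:

```r
library(jsonlite)

the_body <- list(consumer_key = "xxx-xxxxx",
                 access_token = "aaaaa-aaaaaaaaaaaa")

# auto_unbox = TRUE turns length-one vectors into JSON scalars, not arrays.
payload <- toJSON(the_body, auto_unbox = TRUE)
print(payload)

# Sending the list and letting httr encode it:
# results <- httr::POST("https://getpocket.com/v3/get",
#                       body = the_body, encode = "json")
```

Without `auto_unbox = TRUE`, each value would be wrapped in a JSON array, which is a frequent cause of 400 responses from APIs expecting plain strings.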

Using R to send tweets

I saw a cute demonstration of tweeting from R in a presentation some months ago. The scratch code used by the presenter is here:
http://www.r-bloggers.com/twitter-from-r%E2%80%A6-sure-why-not/
the code is short and sweet:
library("RCurl")
opts <- curlOptions(header = FALSE,
userpwd = "username:password", netrc = FALSE)
tweet <- function(status){
method <- "http://twitter.com/statuses/update.xml?status="
encoded_status <- URLencode(status)
request <- paste(method,encoded_status,sep = "")
postForm(request,.opts = opts)
}
With this function, you can send a tweet simply by calling the tweet function:
tweet("This tweet comes from R! #rstats")
I thought that this could be a useful way of announcing when long jobs are completed. I tried to run this on my machine, and I got some error:
[1] "\n\n Basic authentication is not supported\n\n"
attr(,"Content-Type")
charset
"application/xml" "utf-8"
Warning message:
In postForm(request, .opts = opts) : No inputs passed to form
I'm wondering if there have been some changes on the Twitter end of this that make this code produce this error? I don't know too much about getting R to talk to webpages, so any guidance is much appreciated!!
Yes, the basic authentication scheme was disabled on the 16th August 2010. You'll need to set it up to use OAuth. Unfortunately, that is not nearly as simple as using basic authentication.
See this twitter wiki page for more information and this StackOverflow question about OAuth for R.
Besides the code you show, there is also a full-blown twitteR package on CRAN you could look at.
The easiest way to tweet in R through the Twitter-API is to use the twitteR Package.
You can set your Twitter-API-APP here: https://apps.twitter.com/
First step is to authenticate:
library(twitteR)
consumer_key <- "yourcredentials"
consumer_secret <- "yourcredentials"
access_token <- "yourcredentials"
access_secret <- "yourcredentials"
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
And just tweet (limit per day:2400 tweets):
tweet("Hello World")
If twitteR does not work or you simply want to try to build it yourself ...
See here for a demo of how to do your own Twitter authentication and use of the API with help of the httr package.
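Separately from authentication, anyone building the request URL by hand, as the `tweet()` helper in the question does, should note that `URLencode()` leaves reserved characters such as `#` untouched by default, so a hashtag would be read as a URL fragment. A short sketch of the difference:

```r
status <- "This tweet comes from R! #rstats"

# Default: reserved characters like "#" and "!" are left as they are.
default_enc <- URLencode(status)

# reserved = TRUE also escapes reserved characters, which is what a
# query-string value needs.
reserved_enc <- URLencode(status, reserved = TRUE)

print(default_enc)
print(reserved_enc)
```

Packages such as httr handle this escaping automatically when parameters are passed as a named list.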
