How to extract data from JSON with CURL - r

I have a curl request that authenticates with the username and password.
library(curl)
library(rjson)
library(jsonlite)
link1 <- "The URL of the required data source"
# h is presumably a curl handle carrying the credentials; its definition is not shown
resp <- curl_fetch_memory(link1, handle = h)
zips <- jsonlite::fromJSON(rawToChar(resp$content, multiple = FALSE))
The expected result is a list within a list, but I am getting a list with a row of characters all merged ...
Am I going about this with the wrong logic? Any leads would be greatly appreciated.

Related

Convert my API result into a dataframe in R

I am really struggling to understand how this newly released API works. Can someone please help me turn it into a useful dataframe in R? My res looks like the below (edited):
library(httr)
library(jsonlite)
library(dplyr)
#GET Function
res = GET("https://comtradeapi.un.org/data/v1/get/C/A/HS?reporterCode=826&period=2020&partnerCode=000&partner2Code=000&cmdCode=TOTAL&flowCode=M HTTP/1.1&subscription-key=6509aa2a08d54ca7b47a2fece2ab5bee")
df= fromJSON(rawToChar(res$content)) #this doesn't work
By pasting your URL into a browser we get:
{"elapsedTime":"0.02 secs","count":0,"data":[],"error":""}
So there appears to be an error with the result itself. Also, I'd strongly advise against publishing your secret API key, as it allows others to access the data you're subscribing to!
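A minimal sketch of checking a response like this before converting it, using the exact body shown above as a literal string so that it runs offline. The helper name to_df is hypothetical, and it assumes that a successful call returns its rows in the data element:

```r
library(jsonlite)

# Body copied from the browser output above
body <- '{"elapsedTime":"0.02 secs","count":0,"data":[],"error":""}'
parsed <- fromJSON(body)

# to_df is a hypothetical helper: it returns an empty data frame (with a
# warning) when the API reports no rows, otherwise the rows as a data frame
to_df <- function(parsed) {
  if (length(parsed$data) == 0) {
    warning("API returned no rows; check the query parameters")
    return(data.frame())
  }
  as.data.frame(parsed$data)
}

df <- suppressWarnings(to_df(parsed))
nrow(df)
```

With a live response, the same check would be applied to fromJSON(rawToChar(res$content)), so an empty result is noticed before any further processing.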

Can't connect to Qualtrics API using httr package

I am trying to connect to the Qualtrics API using the "httr" package in RStudio Cloud to download mailing lists. After reviewing the API documentation I was still unable to download the data, and I get the following error after running the code:
"{"meta":{"httpStatus":"400 - Bad Request","error":{"errorMessage":"Expected authorization in headers, but none provided.","errorCode":"ATP_2"},"requestId":"8fz33cca-f9ii-4bca-9288-5tc69acaea13"}}"
This does not make any sense to me, since I am using an "inherit auth from parent" token. Here is the code:
install.packages("httr")
library(httr)
directoryId<-"POOL_XXXXX"
mailingListId <- "CG_XXXXXX"
apiToken<-"XXXX"
url<- paste("https://iad1.qualtrics.com/API/v3/directories/",directoryId,
"/mailinglists/",mailingListId,"/optedOutContacts", sep = "")
response <- VERB("GET",url, add_headers('X_API-TOKEN' = apiToken),
content_type("application/octet-stream"))
content(response, "text")
Any help will be appreciated.
Thanks in advance.
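Your call to httr::VERB passes the API token and the content type as two separate arguments; it is simpler to pass them together in a vector to a single add_headers() call. Note also that Qualtrics expects the token in a header named X-API-TOKEN (hyphens only), whereas your code spells it X_API-TOKEN, which is consistent with the "Expected authorization in headers, but none provided" error. httr does export a content_type() helper, but here the content type is set as an element of the header vector. This should work:
response <- VERB("GET", url, add_headers(c(
  'X-API-TOKEN' = apiToken,
  'Content-Type' = "application/octet-stream")))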
Note that mailing lists will be returned by Qualtrics as lists that include both a "meta" element and a "result" element, both of which are themselves lists. If the list is long, only the first 100 contacts on the list will be returned; there will be an element response$result$nextpage that provides the URL required to access the next 100 results. The qualtRics::fetch_mailinglist() function does not work with XM Directory contact lists (which is probably why you got a 500 error when using it), but the code for unpacking the list and looping over each "nextpage" element might be helpful.
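The paging described above could be sketched roughly as follows. This is not the qualtRics implementation: collect_contacts and fetch_page are hypothetical names, the elements field is an assumption about where the contacts sit, and nextpage is spelled as in the text above (check the exact spelling in a real response). Mock pages stand in for the API so the loop runs offline:

```r
# collect_contacts is a hypothetical helper. fetch_page is any function that
# takes a URL and returns the parsed response as a list.
collect_contacts <- function(first_url, fetch_page) {
  contacts <- list()
  url <- first_url
  while (!is.null(url)) {
    page <- fetch_page(url)
    # 'elements' is an assumed field name for the contacts on one page
    contacts <- c(contacts, page$result$elements)
    # a missing or NULL 'nextpage' ends the loop
    url <- page$result$nextpage
  }
  contacts
}

# Mock pages keyed by made-up URLs, in place of real API calls
mock_pages <- list(
  page1 = list(result = list(elements = list("contact A", "contact B"),
                             nextpage = "page2")),
  page2 = list(result = list(elements = list("contact C"),
                             nextpage = NULL))
)

all_contacts <- collect_contacts("page1", function(url) mock_pages[[url]])
length(all_contacts)
```

Against the real API, fetch_page would wrap the VERB call and content(), returning the parsed list for each URL.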

Data from httr POST-request is long string instead of table

I'm receiving the data I'm requesting but don't understand how to sufficiently extract the data. Here is the POST request:
library(httr)
url <- "http://tools-cluster-interface.iedb.org/tools_api/mhci/"
body <- list(method="recommended", sequence_text="SLYNTVATLYCVHQRIDV", allele="HLA-A*01:01,HLA-A*02:01", length="8,9")
data <- httr::POST(url, body = body,encode = "form", verbose())
If I print the data with:
data
it shows the request details followed by a nicely formatted table. However, if I try to extract it with:
httr::content(data, "text")
This returns a single string with all the values of the original table. The output looks delimited by "\" but I couldn't str_replace or tease it out properly.
I'm new to requests using R (and httr) and assume it's an option I'm missing with httr. Any advice?
API details here: http://tools.iedb.org/main/tools-api/
The best way to do this is to specify the MIME type:
content(data, type = 'text/tab-separated-values')
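If you already hold the response as a single string, base R can also split it into a table. A small sketch, assuming the body is tab-separated with a header row; the column names and rows below are made up for illustration and are not real IEDB output:

```r
# Made-up tab-separated text standing in for httr::content(data, "text")
raw_text <- paste0(
  "allele\tseq_num\tpeptide\n",
  "HLA-A*01:01\t1\tSLYNTVATL\n",
  "HLA-A*02:01\t1\tLYNTVATLY\n"
)

# read.delim() defaults to sep = "\t" and header = TRUE
tbl <- read.delim(text = raw_text, stringsAsFactors = FALSE)
str(tbl)
```

The backslashes seen when printing the string are just R's escaped display of the tab and newline characters, which is why str_replace on a literal "\" does not match anything.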

Set cookies with rvest

I would like to programmatically export the records available at this website. To do this manually, I would navigate to the page, click export, and choose the csv.
I tried copying the link from the export button which will work as long as I have a cookie (I believe). So a wget or httr request will result in the html site instead of the file.
I've found some help in an issue on the rvest GitHub repo but, like the person who opened the issue, I ultimately can't figure out how to use objects to save the cookie and use it in a request.
Here is where I'm at:
library(httr)
library(rvest)
apoc <- html_session("https://aws.state.ak.us/ApocReports/Registration/CandidateRegistration/CRForms.aspx")
headers <- headers(apoc)
GET(url = "https://aws.state.ak.us/ApocReports/Registration/CandidateRegistration/CRForms.aspx?exportAll=False&exportFormat=CSV&isExport=True",
add_headers(headers)) # how can I take the output from headers in httr and use it as an argument in GET from httr?
I have checked the robots.txt and this is permissible.
You can get the __VIEWSTATE and __VIEWSTATEGENERATOR values from the hidden input fields in the HTML returned when you GET https://aws.state.ak.us/ApocReports/Registration/CandidateRegistration/CRForms.aspx, and then reuse them in your subsequent POST query and in the GET for the csv.
options(stringsAsFactors=FALSE)
library(httr)
library(curl)
library(xml2)
url <- 'https://aws.state.ak.us/ApocReports/Registration/CandidateRegistration/CRForms.aspx'
#get session headers
req <- GET(url)
req_html <- read_html(rawToChar(req$content))
fields <- c("__VIEWSTATE","__VIEWSTATEGENERATOR")
viewheaders <- lapply(fields, function(x) {
  # look up the hidden <input> by its id attribute and read its value
  xml_attr(xml_find_first(req_html, paste0(".//input[@id='",x,"']")), "value")
})
names(viewheaders) <- fields
#post request. you can get the list of form fields using tools like Fiddler
params <- c(viewheaders,
list(
"M$ctl19"="M$UpdatePanel|M$C$csfFilter$btnExport",
"M$C$csfFilter$ddlNameType"="Any",
"M$C$csfFilter$ddlField"="Elections",
"M$C$csfFilter$ddlReportYear"="2017",
"M$C$csfFilter$ddlStatus"="Default",
"M$C$csfFilter$ddlValue"=-1,
"M$C$csfFilter$btnExport"="Export"))
resp <- POST(url, body=params, encode="form")
print(resp$status_code)
resptext <- rawToChar(resp$content)
#writeLines(resptext, "apoc.html")
#get response i.e. download csv
url <- "https://aws.state.ak.us//ApocReports/Registration/CandidateRegistration/CRForms.aspx?exportAll=True&exportFormat=CSV&isExport=True"
req <- GET(url, body=params)
read.csv(text=rawToChar(req$content))
You might need to play around with the inputs/code to get what you want precisely.
Here is another similar solution using RCurl:
how-to-login-and-then-download-a-file-from-aspx-web-pages-with-r
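The hidden-field lookup used in the answer above can be tried offline on a small HTML fragment. This is only a sketch: the fragment and its values are invented, and only the XPath pattern mirrors the code above:

```r
library(xml2)

# Invented HTML fragment standing in for the real ASP.NET page
page <- read_html(paste0(
  '<form>',
  '<input type="hidden" id="__VIEWSTATE" value="abc123"/>',
  '<input type="hidden" id="__VIEWSTATEGENERATOR" value="DEF456"/>',
  '</form>'
))

fields <- c("__VIEWSTATE", "__VIEWSTATEGENERATOR")

# For each field, find the <input> with that id and read its value attribute
values <- vapply(fields, function(id) {
  xml_attr(xml_find_first(page, paste0(".//input[@id='", id, "']")), "value")
}, character(1))

as.list(values)
```

The resulting named list has the same shape as viewheaders and can be combined with the other form fields before the POST.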

R - Twitter - fromJSON - get list of tweets

I would like to retrieve a list of tweets from Twitter for a given hashtag using package RJSONIO in R. I think I am pretty close to the solution, but I seem to miss one step.
My code reads as follows (in this example, I use #NBA as a hashtag):
library(httr)
library(RJSONIO)
# 1. Find OAuth settings for twitter:
# https://dev.twitter.com/docs/auth/oauth
oauth_endpoints("twitter")
# 2. Replace key and secret below
myapp <- oauth_app("twitter",
key = "XXXXXXXXXXXXXXX",
secret = "YYYYYYYYYYYYYYYYY"
)
# 3. Get OAuth credentials
twitter_token <- oauth1.0_token(oauth_endpoints("twitter"), myapp)
# 4. Use API
req=GET("https://api.twitter.com/1.1/search/tweets.json?q=%23NBA&src=typd",
config(token = twitter_token))
req <- content(req, as = "text")
response=fromJSON(req)
How can I get the list of tweets from object 'response'?
Eventually, I would like to get something like:
searchTwitter("#NBA", n=5000, lang="en")
Thanks a lot in advance!
The response object should be a list of length two: statuses and metadata. So, for example, to get the text of the first tweet, try:
response$statuses[[1]]$text
However, there are a couple of R packages designed to make just this kind of thing easier: Try streamR for the streaming API, and twitteR for the REST API. The latter has a searchTwitter function exactly as you describe.
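To pull the text of every tweet rather than just the first, the same indexing can be applied across response$statuses. A sketch on a mock object shaped like the parsed response; the tweets, ids and the metadata element are invented:

```r
# Mock of the parsed response: a list of statuses plus a metadata element
response <- list(
  statuses = list(
    list(text = "first tweet about #NBA", id = 1),
    list(text = "second tweet about #NBA", id = 2)
  ),
  metadata = list(count = 2)
)

# Extract the text field from each status into a character vector
tweet_texts <- vapply(response$statuses, function(s) s$text, character(1))
tweet_texts
```

This assumes each status is itself a list with a text element, which is what response$statuses[[1]]$text above relies on.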
