I'm trying to extract recipe data from the Edamam API, and all of my GET/POST requests fail.
Extracting the data through Python works seamlessly, but R gives "must not be NULL".
Here is the code:
library(httr)
library(jsonlite)
# Store the ID and Key in variables
APP_ID = "XXXXXX"
APP_KEY = "XXXXXXXXXX"
# Note: these are not a real ID and key;
# replace them with the ones you received upon registration
# Setting up the request URL
api_endpoint = "https://api.edamam.com/api/recipes/v2"
url = paste0(api_endpoint, "?app_id=", APP_ID, "&app_key=", APP_KEY)
# Defining the header (as stated in the documentation)
headers = list(
  `Content-type` = 'application/json'
)
# Defining the payload of the request (the data we actually want processed)
recipe = list(
  `mealType` = 'breakfast'
)
# Submitting the request: each of these attempts fails the same way
tmp <- POST(url, body = recipe, encode = "json")
tmp <- GET(url)
tmp <- httr::POST(url, body = recipe, verbose(), content_type("application/json"))
appData <- content(tmp)
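For reference, the v2 recipe search is typically called as a plain GET with everything in the query string; below is a minimal sketch of that shape, assuming (per the Edamam docs) that the required type=public parameter and filters such as mealType are passed as query parameters:
# A sketch, not a verified call: pass everything as query parameters
res <- GET(
  "https://api.edamam.com/api/recipes/v2",
  query = list(
    type     = "public",   # required by the v2 API
    app_id   = APP_ID,
    app_key  = APP_KEY,
    mealType = "breakfast"
  )
)
stop_for_status(res)
appData <- content(res)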
The POST request looks like this:
curl --location --request POST ''
--form 'ids=["ae8a1312","59569d79"]'
Here the ids need to be passed as an array in the POST request. I tried the following code:
body <- list(
sweep_ids = ids
)
resp <- POST(url, body = body, encode = "form")
Here the variable ids is the list of IDs. The above code gives the following error:
Warning: Error in vapply: values must be length 1,
but FUN(X[[1]]) result is length 2
Using encode = "json" isn't an option, since the endpoint expects form data.
How can I pass the list of ids to this form-encoded POST request in R?
Collapse the ids into a JSON-array string yourself, then send that single string as the form field:
l <- paste0('["', paste(ids, collapse = '","'), '"]')
body <- list(
  sweep_ids = l
)
resp <- POST(url, body = body, encode = "form")
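Equivalently, jsonlite can build that array string for you; a small sketch of the same idea:
library(jsonlite)
# toJSON() on a character vector yields the ["a","b"] array string directly
l <- as.character(toJSON(ids))
resp <- POST(url, body = list(sweep_ids = l), encode = "form")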
I'm trying to pass multiple VIN numbers to the NHTSA API.
My working solution looks like this:
library(jsonlite)
vins <- rep('4JGCB5HE1CA138466', 11)  # the same VIN repeated, as in the original example
for (i in vins){
  json <- fromJSON(paste0('https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVinValues/', i, '?format=json'))
  print(json)
}
This solution is very slow. I tried pbapply, but it is just as slow because it still sends one VIN at a time.
There is a batch option that I just can't figure out. Can someone please assist?
Here is my code so far:
data <- list(data='4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466')
json <- toJSON(list(data=data), auto_unbox = TRUE)
result <- POST('https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/', body = data)
Output <- content(result)
The VIN string has to be in the following format: vin;vin;vin;vin
Here is the link to the API documentation: https://vpic.nhtsa.dot.gov/api/ (the batch endpoint is the last one listed).
Thanks in advance.
UPDATE:
I also tried this from some other threads but no luck:
headers = c(
`Content-Type` = 'application/json'
)
data = '[{"data":"4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466"}]'
r <- httr::POST(url = 'https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/', httr::add_headers(.headers=headers), body = data)
print(r$status_code)
I am getting status code 200 but server code 500 with no data.
I am not sure if this is possible. The batch endpoint is specifically looking for a dictionary to be passed (ruling out string representations). httr states:
body: must be NULL, FALSE, character, raw or list
I tried using the collections library to generate a dict:
data <- Dict$new(list(format = 'json', data = "4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466"))
httr unsurprisingly rejected it as the wrong body type.
I tried using jsonlite to convert with:
data <- jsonlite::toJSON(data)
Yielding:
Error: No method asJSON S3 class: R6
I think this is because data is an environment (an R6 object).
Converting the dictionary-like string to JSON also returns no data:
library(httr)
library(jsonlite)
headers = c(
'Accept' = '*/*',
'Accept-Encoding' = 'gzip, deflate',
'Content-Type' = 'application/x-www-form-urlencoded',
'User-Agent' = 'Mozilla/5.0'
)
data = jsonlite::toJSON('{"format":"json","data":"4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466"}')
r <- httr::POST(url = 'https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/', httr::add_headers(.headers=headers), body = data, encode = 'json')
print(content(r))
If we examine the converted data:
> data
["{\"format\":\"json\",\"data\":\"4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466\"}"]
This is no longer the dictionary structure the server expects.
So, I am new to R, but it seems like it might be easier to just go with Python, which has a dictionary object and a json library that comfortably handles the string-to-JSON conversion:
import requests,json
url = 'https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/'
data = json.loads('{"format": "json", "data":"4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466"}')
r = requests.post(url, data=data)
print(r.json())
Or, passing a dict directly:
import requests
url = 'https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/'
data = {'format': 'json', 'data':'4JGCB5HE1CA138466;4JGCB5HE1CA138466;4JGCB5HE1CA138466'}
r = requests.post(url, data=data).json()
print(r)
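That said, the working Python call above is just a form-encoded POST, which httr can express with a named list and encode = "form". A sketch of the equivalent R request, mirroring the Python fields one for one (not verified against the live endpoint):
library(httr)
library(jsonlite)
vins <- rep('4JGCB5HE1CA138466', 3)
body <- list(
  format = 'json',
  data   = paste(vins, collapse = ';')  # the vin;vin;vin format the batch endpoint expects
)
r <- POST('https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/',
          body = body, encode = 'form')
result <- fromJSON(content(r, as = 'text', encoding = 'UTF-8'))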
I tried to retrieve data from an SFTP server with the code below:
library(RCurl)
protocol <- "sftp"
server <- "xxxx#sftp.xxxx.com"
userpwd <- "xxx:yyy"
tsfrFilename <- "cccccc.tsv"
ouptFilename <- "out.csv"
opts = list(
#ssh.public.keyfile = "true", # file name
ssh.private.keyfile = "xxxxx.ppk",
keypasswd = "userpwd"
)
# Run #
## Download Data
url <- paste0(protocol, "://", server, "/", tsfrFilename)
data <- getURL(url = url, .opts = opts, userpwd=userpwd)
and I received an error message:
Error in function (type, msg, asError = TRUE) : Authentication failure
What am I doing wrong?
Thanks
With a private key you do not need a password with your username. So your getURL statement will be:
data <- getURL(url = url, .opts = opts, username="username")
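Putting that together with the key options from the question, the whole call would look something like this (a sketch; option names are the same ones used in the question, and keypasswd is only needed if the key itself is protected):
opts <- list(
  ssh.private.keyfile = "xxxxx.ppk",
  keypasswd = "key-passphrase"  # only if your private key has a passphrase
)
data <- getURL(url = url, .opts = opts, username = "username")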
I had exactly the same problem and have just spent an hour trying things out. What worked for me was changing the format of the private key to OpenSSH.
To do this, I used the PuTTYgen key generator. Go to the menu item "Conversions" to import the original private key and export it to the OpenSSH format. I exported the converted key to the same folder my original key was in, under a new filename, and kept the *.ppk extension.
Then I used the following commands:
opts <- list(
ssh.private.keyfile = "<path to my new OpenSSH Key>.ppk"
)
data <- getURL(url = URL, .opts = opts, username = username, verbose = TRUE)
This seemed to work fine.
I'm trying to download a PDF from the National Information Center via RCurl, but I've been having some trouble. For the example URL below, I want the PDF corresponding to the default settings, except for "Report Format", which should be "PDF". When I run the following script, it saves the file associated with selecting the other buttons ("Parent(s) of..."/HMDA, not the default). I tried adding these input elements to params, but it didn't change anything. Could somebody help me identify the problem? Thanks.
library(RCurl)
curl = getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', curl = curl)
params = list(rbRptFormatPDF = 'rbRptFormatPDF')
url = 'https://www.ffiec.gov/nicpubweb/nicweb/OrgHierarchySearchForm.aspx?parID_RSSD=2162966&parDT_END=99991231'
html = getURL(url, curl = curl)
viewstate = sub('.*id="__VIEWSTATE" value="([0-9a-zA-Z+/=]*).*', '\\1', html)
event = sub('.*id="__EVENTVALIDATION" value="([0-9a-zA-Z+/=]*).*', '\\1', html)
params[['__VIEWSTATE']] = viewstate
params[['__EVENTVALIDATION']] = event
params[['btnSubmit']] = 'Submit'
result = postForm(url, .params=params, curl=curl, style='POST')
writeBin( as.vector(result), 'test.pdf')
Does this provide the correct PDF?
library(httr)
library(rvest)
library(purrr)
# setup inane sharepoint viewstate parameters
res <- GET(url = "https://www.ffiec.gov/nicpubweb/nicweb/OrgHierarchySearchForm.aspx",
query=list(parID_RSSD=2162966, parDT_END=99991231))
# extract them
pg <- content(res, as="parsed")
hidden <- html_nodes(pg, xpath=".//form/input[@type='hidden']")
params <- setNames(as.list(xml_attr(hidden, "value")), xml_attr(hidden, "name"))
# pile on more params
params <- c(
params,
grpInstitution = "rbCurInst",
lbTopHolders = "2961897",
grpHMDA = "rbNonHMDA",
lbTypeOfInstitution = "-99",
txtAsOfDate = "12/28/2016",
txtAsOfDateErrMsg = "",
lbHMDAYear = "2015",
grpRptFormat = "rbRptFormatPDF",
btnSubmit = "Submit"
)
# submit the req and save to disk
POST(url = "https://www.ffiec.gov/nicpubweb/nicweb/OrgHierarchySearchForm.aspx",
query=list(parID_RSSD=2162966, parDT_END=99991231),
add_headers(Origin = "https://www.ffiec.gov"),
body = params,
encode = "form",
write_disk("/tmp/output.pdf")) -> res2
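As a quick sanity check on the result, both of these helpers come straight from httr: stop_for_status() raises an R error on a non-2xx response, and headers() lets you confirm the server actually sent a PDF:
stop_for_status(res2)                  # error out if the POST failed
headers(res2)[["content-type"]]        # should report application/pdf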
I'm trying to scrape some table data from a password-protected website (I have a valid username/password) using R and have yet to succeed.
As an example, here's the login page for my dentist's website: http://www.deltadentalins.com/uc/index.html
I have tried the following:
library(httr)
download <- "https://www.deltadentalins.com/indService/faces/Home.jspx?_afrLoop=73359272573000&_afrWindowMode=0&_adf.ctrl-state=12pikd0f19_4"
terms <- "http://www.deltadentalins.com/uc/index.html"
values <- list(username = "username", password = "password", TARGET = "", SMAUTHREASON = "", POSTPRESERVATIONDATA = "",
bundle = "all", dups = "yes")
POST(terms, body = values)
GET(download, query = values)
I have also tried:
your.username <- 'username'
your.password <- 'password'
require(SAScii)
require(RCurl)
require(XML)
agent="Firefox/23.0"
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
curl = getCurlHandle()
curlSetOpt(
cookiejar = 'cookies.txt' ,
useragent = agent,
followlocation = TRUE ,
autoreferer = TRUE ,
curl = curl
)
# list parameters to pass to the website (pulled from the source html)
params <-
list(
'lt' = "",
'_eventID' = "",
'TARGET' = "",
'SMAUTHREASON' = "",
'POSTPRESERVATIONDATA' = "",
'SMAGENTNAME' = agent,
'username' = your.username,
'password' = your.password
)
#logs into the form
# log in to the form
html = postForm('https://www.deltadentalins.com/siteminderagent/forms/login.fcc', .params = params, curl = curl)
html
I can't get either to work. Are there any experts out there that can help?
Updated 3/5/16 to work with the RSelenium package
#### FRONT MATTER ####
library(devtools)
library(RSelenium)
library(XML)
library(plyr)
######################
## This block will open the Firefox browser, which is linked to R
RSelenium::checkForServer()
remDr <- remoteDriver()
startServer()
remDr$open()
url="yoururl"
remDr$navigate(url)
This first section loads the required packages, sets the login URL, and then opens it in a Firefox instance. I type in my username & password, and then I'm in and can start scraping.
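Because the login happens by hand in the Firefox window, one simple way to pause the script until you are done is a base-R readline() prompt:
# Block until the user confirms they have logged in
readline(prompt = "Log in in the Firefox window, then press Enter to continue...")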
infoTable <- readHTMLTable(remDr$getPageSource()[[1]], header = TRUE)
infoTable
Table1 <- infoTable[[1]]
Apps <- Table1[, 1] # application numbers
For this example, the first page contained two tables. The first is the one I'm interested in; it holds application numbers and names. I pull out the first column (the application numbers).
Links2 <- paste("https://yourURL?ApplicantID=", Apps, sep = "")
The data I want are stored in individual applications, so this bit creates the links that I want to loop through.
### Grab the contact info table from each page
LL <- lapply(seq_along(Links2),
  function(i) {
    remDr$navigate(Links2[i])
    infoTable <- readHTMLTable(remDr$getPageSource()[[1]], header = TRUE)
    if ("First Name" %in% colnames(infoTable[[2]])) {
      infoTable2 <- cbind(infoTable[[1]][1, ], infoTable[[2]][1, ])
    } else {
      infoTable2 <- cbind(infoTable[[1]][1, ], infoTable[[3]][1, ])
    }
    print(infoTable2)
  }
)
results <- do.call(rbind.fill, LL)
results
write.csv(results, "C:/pathway/results2.csv")
This final section loops through the link for each application, then grabs the table with their contact information (which is either table 2 or table 3, so R has to check first). Thanks again to Chinmay Patil for the tip on RSelenium!
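One housekeeping note: when the loop finishes, it's worth shutting the browser down; close() ends the session, and closeServer() (the counterpart to the startServer() call used above) stops the Selenium server:
remDr$close()
remDr$closeServer()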