I want to make a POST request to https://rest.ensembl.org. Currently the following works:
server <- "https://rest.ensembl.org"
ext <- "/vep/human/hgvs"
r <- POST(paste(server, ext, sep = ""),
          content_type("application/json"),
          accept("application/json"),
          body = '{ "hgvs_notations" : ["chr2:g.10216G>T"] }')
which sends the request to https://rest.ensembl.org/vep/human/hgvs with the HGVS notation chr2:g.10216G>T in the request body. I would like to add a query parameter (after a ?) so that the request goes to https://rest.ensembl.org/vep/human/hgvs?CADD=1, but I can't see how to do this with the POST request function in R.
Any help would be great!
If it is always the same parameter you need to send, why not just include it in the URI then?
You could do something like POST( paste0(server, ext, '?CADD=1'), [...] ).
Or would that not be dynamic enough for your use case?
The following would be the less hacky way to include parameters, using the query argument:
library(httr)
library(jsonlite)
r <- POST(
  paste0(server, ext),
  query = list(CADD = 1),
  content_type_json(),
  accept_json(),
  body = toJSON(list(hgvs_notations = c("chr2:g.10216G>T")))
)
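To preview the URL that the query argument produces (without sending anything), httr::modify_url() assembles it the same way; the CADD=1 parameter here is the one from the question:

```r
library(httr)

# modify_url() assembles and percent-encodes the query string the same
# way the `query` argument of POST()/GET() does, so you can inspect it
url <- modify_url("https://rest.ensembl.org/vep/human/hgvs",
                  query = list(CADD = 1))
# url is now "https://rest.ensembl.org/vep/human/hgvs?CADD=1"
```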
I'm trying to extract the province and city options respectively from this website: https://www.handmadepizza.co.kr/store/store.
For the province, no request body was needed for the POST call, so the code below works.
raw.data <- POST("https://www.handmadepizza.co.kr/store/StoreSO/selectStoreSiList.do",
                 body = NULL,
                 encode = "form")
raw.data <- content(raw.data)
However, the POST request for the city requires a body. I tried to add the body as in the code chunk below, but no data was received.
argus <- '{"_window_name":{
"ADDR_SI" : "%EC%9A%B8%EC%82%B0"
}
}'
raw.data <- POST("https://www.handmadepizza.co.kr/store/StoreSO/selectStoreGuList.do",
                 body = argus,
                 encode = "form")
raw.data <- content(raw.data)
I know it's a matter of finding the right way to add the body to the POST call, but I just can't wrap my head around it!
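One thing worth trying (a sketch, not a tested solution): with encode = "form", httr expects the body as a named list of fields and does the percent-encoding itself, so the field is passed by name with its plain, decoded value. The field name ADDR_SI and the encoded value are taken from the JSON in the question; the server may expect additional fields.

```r
library(httr)

# "%EC%9A%B8%EC%82%B0" is the percent-encoded form of "울산" (Ulsan);
# httr encodes form fields itself, so pass the decoded value
addr_si <- URLdecode("%EC%9A%B8%EC%82%B0")

raw.data <- POST("https://www.handmadepizza.co.kr/store/StoreSO/selectStoreGuList.do",
                 body = list(ADDR_SI = addr_si),
                 encode = "form")
raw.data <- content(raw.data)
```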
I'm using R and the httr package to make an HTTP request.
I want to use the GET function with query parameters.
GET(url = NULL, config = list(), ..., handle = NULL)
The request URL contains query parameters, separated from the base URL by a question mark (?):
1- base url: https://example.com
2- url parameter: 'title='
# Function to get links to a specific page
page_link <- function() {
  url <- "https://example.com?"
  q1 <- list(title = "")
  page_link <- GET(url, query = q1)
  return(page_link)
}
If you're asking how to combine the base URL and one parameter value into a single string to request with GET, you should try paste0(), e.g.:
url <- paste0("https://example.com?", q1[x])
page_link <- GET(url)
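Alternatively (a sketch using the title parameter from the question), httr can assemble and percent-encode the query string for you via GET's query argument, which avoids hand-building the URL; modify_url() shows the URL such a request would use:

```r
library(httr)

page_link <- function(title_value) {
  # httr appends "?title=..." and percent-encodes the value for you
  GET("https://example.com", query = list(title = title_value))
}

# inspect the URL that would be requested:
modify_url("https://example.com", query = list(title = "some title"))
```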
I am not very good at working with APIs "from scratch," so to speak. My issue here probably has more to do with my ignorance of RESTful APIs than with the Todoist API specifically, but I'm struggling with Todoist because all of their documentation is geared around Python, and I'm not sure why my feeble attempts are failing. Once I get connected/authenticated I think I'll be fine.
Todoist documentation
I've tried a couple of configurations using httr::GET(). I would appreciate a little push here as I get started.
Things I've tried, where key is my api token:
library(httr)
r <- GET("https://beta.todoist.com/API/v8/", add_headers(hdr))
for hdr, I've used a variety of things:
hdr <- paste0("Authorization: Bearer", key)
just my key
I also tried with projects at the end of the url
UPDATE These are now implemented in the R package rtodoist.
I think you nearly had it, except for the URL (or maybe it has changed since then) and the header. The following works for me, after replacing my_todoist_token with the API token found here.
library(jsonlite)
library(httr)
library(magrittr)  # provides the %>% pipe used below
projects_api_url <- "https://api.todoist.com/rest/v1/projects"
# to get the projects as a data frame
header <- add_headers(Authorization = paste("Bearer", my_todoist_token))
project_df <- GET(url = projects_api_url, header) %>%
  content("text", encoding = "UTF-8") %>%
  fromJSON(flatten = TRUE)
# to create a new project
# unfortunately no way to change the dot color associated with project
header2 <- add_headers(
  Authorization = paste("Bearer", my_todoist_token),
  `Content-Type` = "application/json",
  `X-Request-Id` = uuid::UUIDgenerate())
POST(url = projects_api_url, header2,
     body = list(name = "Your New Project Name"
                 # parent = parentID
     ),
     encode = "json")
# get a project given project id
GET(url = paste0(projects_api_url, "/", project_df$id[10]),
    header) %>%
  content("text", encoding = "UTF-8") %>%
  fromJSON(flatten = TRUE)
# update a project
POST(url = paste0(projects_api_url, "/", project_df$id[10]),
     header2, body = list(name = "IBS-AR Biometric 2019"), encode = "json")
I wanted to scrape some data from the following website:
http://predstecajnenagodbe.fina.hr/pn-public-web/predmet/search
but when I tried to use rvest:
library(rvest)
session <- html_session("http://predstecajnenagodbe.fina.hr/pn-public-web/predmet/search")
form <- html_form(session)
form
it doesn't find the form, even though it is there (as you can see on the page).
I have also tried with POST function from httr package:
parameters <- list(since = "1.6.2018", until = "5.6.2018", `g-recaptcha-response` = "03AF6jDqXcBw1qmbrxWqadGqh9k8eHAzB9iPbYdnwzhEVSgCwO0Mi6DQDgckigpeMH1ikV70egOC0UppZsO7tO9hgdpEIaI04jTpG6JxGMR6wov27kEkLuVsEp1LhxZB4WFDRkDWdqcZeVN1YkiojUpje4k-swFG7tPyG2pJN86SdT290D9_0fyfrxlpfFNL2VUwE_c15vVthcBEdXIQ68V5qv7ZVooLiwrdTO2qLDLF1yUZWiu9IJoLuBWdFzJ_zdSP6fbuj5wTpfPdsYJ2n988Gcb3q2aYdn-2TVuWoQzqs1wbh7ya_Geo7_8gnDUL92l2nqTeV9CMY58fzppPPYDJcchdHFTTxadGwCGZyKC3WUSh81qiGZ5JhNDUpPnOO-MgSr5aPbA7tei7bbypHV9OOVjPGLLtqA9g")
httr::POST(
  url,
  body = parameters,
  config = list(
    add_headers(Referer = "http://predstecajnenagodbe.fina.hr"),
    user_agent(get_header()),
    accept_encoding = get_encoding(),
    use_proxy("xxxx", port = 80,
              username = "xxx", password = "xxxx"),
    timeout(20L),
    tcp_keepalive = FALSE
  ),
  encode = "form",
  verbose()
)
but it returns some JS code and message:
Please enable JavaScript to view the page content.Your support ID is:
10544975822212666004
Could you please explain why rvest doesn't recognize the form, and why POST doesn't work either?
First I'd like to take a moment and thank the SO community,
You helped me many times in the past without me needing to even create an account.
My current problem involves web scraping with R. Not my strong point.
I would like to scrape http://www.cbs.dtu.dk/services/SignalP/
what I have tried:
library(rvest)
url <- "http://www.cbs.dtu.dk/services/SignalP/"
seq <- "MTSKTCLVFFFSSLILTNFALAQDRAPHGLAYETPVAFSPSAFDFFHTQPENPDPTFNPCSESGCSPLPVAAKVQGASAKAQESDIVSISTGTRSGIEEHGVVGIIFGLAFAVMM"
session <- rvest::html_session(url)
form <- rvest::html_form(session)[[2]]
form <- rvest::set_values(form, `SEQPASTE` = seq)
form_res_cbs <- rvest::submit_form(session, form)
#rvest prints out:
Submitting with 'trunc'
rvest::html_text(rvest::html_nodes(form_res_cbs, "head"))
# output:
"Configuration error"
rvest::html_text(rvest::html_nodes(form_res_cbs, "body"))
# output:
"Exception:WebfaceConfigErrorPackage:Webface::service : 358Message:Unhandled #parameter 'NULL' in form "
I am unsure what the unhandled parameter is.
Is the problem in the submit button? I can not seem to force:
form_res_cbs <- rvest::submit_form(session, form, submit = "submit")
#rvest prints out
Error: Unknown submission name 'submit'.
Possible values: trunc
Is the problem that the submit button's name is NULL?
form[["fields"]][[23]]
I tried defining the fake submit button as suggested here:
Submit form with no submit button in rvest
with no luck.
I am open to solutions using rvest or RCurl/httr; I would like to avoid using RSelenium.
EDIT: thanks to hrbrmstr awesome answer I was able to build a function for this task. It is available in the package ragp: https://github.com/missuse/ragp
Well, this is doable. But it's going to require elbow grease.
This part:
library(rvest)
library(httr)
library(tidyverse)
POST(
  url = "http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi",
  encode = "form",
  body = list(
    `configfile` = "/usr/opt/www/pub/CBS/services/SignalP-4.1/SignalP.cf",
    `SEQPASTE` = "MTSKTCLVFFFSSLILTNFALAQDRAPHGLAYETPVAFSPSAFDFFHTQPENPDPTFNPCSESGCSPLPVAAKVQGASAKAQESDIVSISTGTRSGIEEHGVVGIIFGLAFAVMM",
    `orgtype` = "euk",
    `Dcut-type` = "default",
    `Dcut-noTM` = "0.45",
    `Dcut-TM` = "0.50",
    `graphmode` = "png",
    `format` = "summary",
    `minlen` = "",
    `method` = "best",
    `trunc` = ""
  ),
  verbose()
) -> res
Makes the request you made. I left verbose() in so you can watch what happens. It's missing the "filename" field, but you specified the string, so it's a good mimic of what you did.
Now, the tricky part is that it uses an intermediary redirect page that gives you a chance to enter an e-mail address for notification when the query is done. It does do a regular (every ~10s or so) check to see if the query is finished and will redirect quickly if so.
That page has the query id which can be extracted via:
content(res, as = "parsed") %>%
  html_nodes("input[name='jobid']") %>%
  html_attr("value") -> jobid
Now, we can mimic the final request, but I'd add in a Sys.sleep(20) before doing so to ensure the report is done.
GET(
  url = "http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi",
  query = list(
    jobid = jobid,
    wait = "20"
  ),
  verbose()
) -> res2
That grabs the final results page:
html_print(HTML(content(res2, as="text")))
You can see images are missing because GET only retrieves the HTML content. You can use functions from rvest/xml2 to parse through the page and scrape out the tables and the URLs that you can then use to get new content.
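As a sketch of that parsing step (the selectors here are generic assumptions; inspect the actual results page to refine them), rvest/xml2 can pull out the tables and resolve the image URLs:

```r
library(httr)
library(rvest)
library(xml2)

# parse the results page returned by the final GET
page <- read_html(content(res2, as = "text"))

# scrape any HTML tables into a list of data frames
tables <- html_table(page, fill = TRUE)

# collect image URLs (e.g. the PNG plots), resolving relative links
img_urls <- url_absolute(
  html_attr(html_nodes(page, "img"), "src"),
  "http://www.cbs.dtu.dk/"
)
```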
To do all this, I used burpsuite to intercept a browser session and then my burrp R package to inspect the results. You can also visually inspect in burpsuite and build things more manually.