As a way of exploring how to make a package in R for the Denver RUG, I decided that it would be a fun little project to write an R wrapper around the datasciencetoolkit API. The basic R tools come from the RCurl package as you might imagine. I am stuck on a seemingly simple problem and I'm hoping that somebody in this forum might be able to point me in the right direction. The basic problem is that I can't seem to use postForm() to pass an un-keyed string as part of the data option in curl, i.e. curl -d "string" "address_to_api".
For example, from the command line I might do
$ curl -d "Tim O'Reilly, Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
with success. However, it seems that postForm() requires an explicit key when passing additional arguments into the POST request. I've looked through the datasciencetoolkit code and developer docs for a possible key, but can't seem to find anything.
As an aside, it's pretty straightforward to pass inputs via a GET request to other parts of the DSTK API. For example,
ip2coordinates <- function(ip) {
api <- "http://www.datasciencetoolkit.org/ip2coordinates/"
result <- getURL(paste(api, URLencode(ip), sep=""))
names(result) <- "ip"
return(result)
}
ip2coordinates('67.169.73.113')
will produce the desired results.
To be clear, I've read through the RCurl docs on DTL's omegahat site, the RCurl docs with the package, and the curl man page. However, I'm missing something fundamental with respect to curl (or perhaps .opts() in the postForm() function) and I can't seem to get it.
In python, I could basically make a 'raw' POST request using httplib.HTTPConnection -- is something like that available in R? I've looked at the simplePostToHost function in the httpRequest package as well and it just seemed to lock my R session (it seems to require a key as well).
FWIW, I'm using R 2.13.0 on Mac 10.6.7.
Any help is much appreciated. All of the code will soon be available on github if you're interested in playing around with the data science toolkit.
Cheers.
With httr, this is just:
library(httr)
r <- POST("http://www.datasciencetoolkit.org/text2people",
body = "Tim O'Reilly, Archbishop Huxley")
stop_for_status(r)
content(r, "parsed", "application/json")
Generally, in those cases where you're trying to POST something that isn't keyed, you can just assign a dummy key to that value. For example:
> postForm("http://www.datasciencetoolkit.org/text2people", a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
charset
"text/html" "utf-8"
Would work the same if I'd used b="Archbishop Huxley", etc.
Enjoy RCurl - it's probably my favorite R package. If you get adventurous, upgrading to ~ libcurl 7.21 exposes some new methods via curl (including SMTP, etc.).
From Duncan Temple Lang on the R-help list:
postForm() is using a different style (or specifically Content-Type) of submitting the form than the curl -d command.
Switching the style = 'POST' uses the same type, but at a quick guess, the parameter name 'a' is causing confusion
and the result is the empty JSON array - "[]".
A quick workaround is to use curlPerform() directly rather than postForm()
r = dynCurlReader()
curlPerform(postfields = 'Archbishop Huxley', url = 'http://www.datasciencetoolkit.org/text2people', verbose = TRUE,
post = 1L, writefunction = r$update)
r$value()
This yields
[1]
"[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":0,\"end_index\":17,\"matched_string\":\"Archbishop
Huxley\"}]"
and you can use fromJSON() to transform it into data in R.
I just wanted to point out that there must be an issue with passing a raw string via the postForm function. For example, if I use curl from the command line, I get the following:
$ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]
and in R I get
> api <- "http://www.datasciencetoolkit.org/text2people"
> postForm(api, a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
charset
"text/html" "utf-8"
Note that it returns two elements in the JSON string and neither one matches on the start_index or end_index. Is this a problem with encoding or something?
The simplePostToHost function in the httpRequest package might do what you are looking for here.
Related
I see several methods of making a GET in R.
Using httr
myurl <- "https://stackoverflow.com"
library(httr)
httr::GET(myurl)
Using rvest/xml2
library(rvest)
xml2::read_html(myurl)
Using curl
I have tested using curl and can confirm the following works from a standard macbook and from a windows 10 device:
command <- paste("curl", myurl)
system(command)
Question
The third method above (using curl) seems to work fairly universally.
Is there any better way of making a GET request (or similarly a HEAD, POST etc) than the method above using curl?
'better' in this case means works universally across operating systems with minimal coding / external programs/libraries being installed), or is using curl (through a call to system) the best way?
From base R there is url with method = "libcurl"
con <- url("https://www.stackoverflow.com", method = "libcurl")
tmp <- readLines(con)
Also, this is not strictly base R, but from utils there is url.show
utils::url.show("https://stackoverflow.com", method = "curl")
Slack offers a method to upload files through their api. The documentation is found here:
Slack files.upload method
On this page it gives an example of how to post a file:
curl -F file=#dramacat.gif -F "initial_comment=Shakes the cat" -F channels=C024BE91L,D032AC32T -H "Authorization: Bearer xoxa-xxxxxxxxx-xxxx" https://slack.com/api/files.upload
I am trying to translate how to execute this line of code using the httr package in R, with a file in my R working directory. I'm having trouble translating the different parts of the command. Here is what I have so far.
api_token='******'
f_path='c:/mark/consulting/dreamcloud' #this is also my working directory
f_name='alert_picture.png'
res<-httr::POST(url='https://slack.com/api/files.upload', httr::add_headers(`Content-Type` = "multipart/form-data"),
body = list(token=api_token, channels='CCJL7TMC7', title='test', file = httr::upload_file(f_path), filename=f_name))
When I run this I get the following error:
Error in curl::curl_fetch_memory(url, handle = handle) :
read function returned funny value
I tried to find better examples to use but so far no luck. Any suggestions are appreciated!
There's an example in slackr's own gg_slackr method, which creates an image of a GGPlot and uploads it to Slack:
res <- POST(url="https://slack.com/api/files.upload",
add_headers(`Content-Type`="multipart/form-data"),
body=list(file=upload_file(ftmp),
token=api_token, channels=modchan))
Your code seems to be passing a path to a directory rather than a file as the file parameter - consider changing that parameter to file=upload_file(paste(f_path, f_name, sep="/") and see if that fixes your error.
I try to send a request to API, use RCurl library.
My code:
start = "2018-07-30"
end = "2018-08-15"
api_request <- paste("https://api-metrika.yandex.ru/stat/v1/data.csv?id=34904255&date1=",
start,
"&date2=",
end,
"&dimensions=ym:s:searchEngine&metrics=ym:s:visits&dimensions=ym:s:<attribution>SearchPhrase&filters=ym:s:<attribution>SearchPhrase!~'some|phrases|here'&limit=100000&oauth_token=OAuth_token_here", sep="")
s <- getURL(api_request)
And every time I try to do it I have the response "Error 400" or "Bad Request" if I use getUrlContent instead. When I just open this url in my browser - it works correctly.
I still couldn't find any solution for this problem, so if somebody knows something about it - please help me, kind man =)
There are several approaches you can use in case the URL is right. First you can add the following parameter to the getURL function. Setting the parameter followlocation equal TRUE allows to follow any "Location:" header that the server returns as part of the HTTP headers. Processing is recursive, PHP will follow any "Location:" header.
> s <- getURL(url1, .opts=curlOptions(followlocation = TRUE))
If this is not working an alternative way is to use the XML package by calling the htmlParse method instead of getURL
> library(XML)
> s <- htmlParse(api_request)
Another approach would be to use the httr package and call the GET function:
> library(httr)
> s <- GET(api_request)
I would like to call my Philips Hue lights from R via the API and httr package. The problem is however that I can't get the body right. I'm sure the API works because GET calls work fine.
For example, the body in a PUT call to turn the lights on and off should look exactly like {"on":false}. The call looks like PUT(url = url), body = body1)
However, I cannot get this to work in the body section from the httr package. I already tried: body1 <- '{on:"false"}'
Which returns: "{on:\"false\"}", body2 <- list(on = "false") returns $on [1] "false" and body3 <- toJSON(body2) returns {"on":["false"]}.
As you can see none of the above options exactly desired return and they all produce extra punctuation marks. Any idea how I can get exactly {"on":false} in the body?
Unfortunately, I cannot provide you with a reproducible example because there is no public sandbox environment available and I don't want everyone to control my lights ;-) However the documentation can be found here.
If you are using toJSON from the jsonlite package, then you can do
library(jsonlite)
PUT("https://url", body=toJSON(list(on = unbox(FALSE))))
The unbox() will prevent the R vector from being wrapped in the brackets for a JSON array.
R version 3.0.1 (2013-05-16) for Windows 8 knitr version 1.5 Rstudio 0.97.551
I am using knitr to do the markdown of my R code.
As part of my analysis I downloaded various data sets from the web, knitr is totally fine with getting data from http sites but from https ones where it generates an unsupported URL scheme message.
I know when using the download.file function on a mac the method parameter has to be set to curl to get data from an https however this doesn't help when using knitr.
What do I need to do so that knitr will gather data from Https websites?
Edit:
Here is the code chunk that returns an error in Knitr but when run through R works without error.
```{r}
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy")
```
You could use https with download.file() function by passing "curl" to method as :
download.file(url,destination,method="curl")
Edit (May 2016): As of R 3.3.0, download.file() should handle SSL websites automatically on all platforms, making the rest of this answer moot.
You want something like this:
library(RCurl)
data <- getURL("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv",
ssl.verifypeer=0L, followlocation=1L)
That reads the data into memory as a single string. You'll still have to parse it into a dataset in some way. One strategy is:
writeLines(data,'temp.csv')
read.csv('temp.csv')
You can also separate out the data directly without writing to file:
read.csv(text=data)
Edit: A much easier option is actually to use the rio package:
library("rio")
import("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv")
This will read directly from the HTTPS URL and return a data.frame.
Use setInternet2(use = TRUE) before using the download.file() function. It works on Windows 7.
setInternet2(use = TRUE)
download.file(url, destfile = "test.csv")
I am sure you have already found solution to your problem by now.
I was working on an assignment right now and ended up getting the same error. I tried some of the tricks, but that did not work for me. Maybe because I am working on Windows machine.
Anyhow, I changed the link to http: rather than https: and that did the trick.
Following is chunk of my code:
if (!file.exists("./PeerAssesment2")) {dir.create("./PeerAssessment2")}
fileURL <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileURL, dest = "./PeerAssessment2/Data.zip")
install.packages("R.utils")
library(R.utils)
if (!file.exists("./PeerAssessment2/Data")) {
bunzip2 ("./PeerAssessment2/Data.zip", destname = "./PeerAssessment2/Data")
}
list.files("./PeerAssessment2")
noaaData <- read.csv ('./PeerAssessment2/Data')
Hope this helps.
I had the same issue with knitr and download.file() with a https url, on Windows 8.
You could try setInternet2(TRUE) before using the download.file() function. However I'm not sure that this fix works on Unix-like systems.
setInternet2(TRUE) # set the R_WIN_INTERNET2 to TRUE
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy") # now it should work
Source : R documentation (?download.file()) :
Note that https:// URLs are only supported if --internet2 or environment variable R_WIN_INTERNET2 was set or setInternet2(TRUE) was used (to make use of Internet Explorer internals), and then only if the certificate is considered to be valid.
I had the same problem with a https with the following code running perfectly in R and getting unsupported URL scheme when knitting to html:
temp = tempfile()
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2Factivity.zip", temp)
data = read.csv(unz(temp, "activity.csv"), colClasses = c("numeric", "Date", "numeric"))
I tried all the solutions posted here and nothing worked, in my absolute desperation I just eliminated the "s" in the "https" in the url and everything got fine...
Using the R download package takes care of the quirky details typically associated with file downloads. For you example, all you needed to do would have been:
```{r}
library(download)
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download(fileurl, destfile = "C:/Users/xxx/yyy")
```