R install.packages() parameterize download.file() with options() - r

I am running a Jenkins build and leveraging the r-base docker image.
I'm trying to install devtools, and I suspect that self-signed certificates are my problem.
When I try to install the package:
install.packages("devtools",
method = options("extra", " --insecure --user"))
I get the following error
'arg' must be NULL or a character vector
How can I set up install.packages() to ignore the certificates? From what I've read, I need to parameterize download.file() with options() for the method parameter in install.packages(), but I cannot figure out how.
NOTE: I am not an R programmer, if this is something basic, I am happy to learn if there is an R tutorial on stuff like this somewhere.
What am I doing wrong with method = options(...) and how can I pass -k or --insecure to libcurl?

To call download.file() with method = "libcurl" and some extra options, pass those values in the respective arguments; install.packages() forwards them to download.file().
install.packages("devtools", method = "libcurl", extra = " --insecure --user")
These options can also be set with options(). The example below sets the method and the extra download.file options. The previous settings are saved in old_opt.
libcurl_opts <- list(
download.file.method = "libcurl",
download.file.extra = " --insecure --user"
)
old_opt <- options(libcurl_opts)
Check to see it worked.
getOption("download.file.method")
#[1] "libcurl"
Now reset when done.
options(old_opt)
getOption("download.file.method")
#NULL
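One caveat worth checking against ?download.file: the extra argument (and the download.file.extra option) is documented as applying only to the "curl" and "wget" methods, which call the command-line tools, so with method = "libcurl" the flags are probably ignored. A minimal sketch, assuming the curl binary is available in the image and that skipping certificate verification is acceptable for this build, sets the options, runs some code, and restores the previous settings. The helper name with_insecure_curl is hypothetical.

```r
# Hypothetical helper: temporarily route downloads through the curl
# command-line tool with certificate checks disabled, then restore options.
with_insecure_curl <- function(code) {
  old_opt <- options(
    download.file.method = "curl",        # external curl binary, honours `extra`
    download.file.extra  = "--insecure"   # same as curl -k
  )
  on.exit(options(old_opt), add = TRUE)   # always restore the previous options
  force(code)                             # evaluate the code with the new options
}

# Check which settings are active inside the helper (no network needed).
active <- with_insecure_curl(
  c(method = getOption("download.file.method"),
    extra  = getOption("download.file.extra"))
)
print(active)

# Assumed usage inside the Jenkins job:
# with_insecure_curl(install.packages("devtools"))
```

Wrapping the change in a function with on.exit() keeps the insecure setting from leaking into later downloads in the same session.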

Related

How can I switch virtualenv from reticulate in R?

The R package reticulate has a function use_virtualenv(), but it does not look like I can call it twice with different virtualenvs: the second call is always ignored.
Is there a way to deactivate the first virtualenv so that I can call use_virtualenv("venv2") with the expected behavior?
#initialize
require(reticulate)
virtualenv_create("venv1")
virtualenv_create("venv2")
#call first virtualenv
use_virtualenv("venv1")
py_config() #show venv1 specs
#call second virtualenv
use_virtualenv("venv2")
py_config() # still show venv1 specs, I want venv2 here
I think unloadNamespace("reticulate") could work, but in my case the first call is made by another package...
In short: By restarting the R session! You can't switch virtualenv in reticulate once chosen!
I tried it (but chose "venv2" first).
> use_python("venv1", T)
Error in use_python("venv1", T) :
Specified version of python 'venv1' does not exist.
> use_python("~/.virtualenvs/venv1", T)
ERROR: The requested version of Python ('~/.virtualenvs/venv1') cannot
be used, as another version of Python
('/home/josephus/.virtualenvs/venv2/bin/python') has already been
initialized. Please restart the R session if you need to attach
reticulate to a different version of Python.
Error in use_python("~/.virtualenvs/venv1", T) :
failed to initialize requested version of Python
So reticulate reports that one has to start a new session to choose a different virtual environment.
This must apply to use_virtualenv(<xxx>, T) too, though it is not as verbose as use_python(<xxx>, T).
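Since the binding cannot be changed inside a running session, one workaround is to run the code that needs the other environment in a separate R process. The sketch below uses only base R (system2() and Rscript); the helper name run_in_fresh_session is hypothetical, and the reticulate call in the final comment assumes venv2 exists.

```r
# Hypothetical helper: evaluate R code (given as text) in a brand-new R
# process, so any Python binding made there does not affect this session.
run_in_fresh_session <- function(code_text) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # stdout = TRUE captures the child's printed output as a character vector
  system2(rscript, c("-e", shQuote(code_text)), stdout = TRUE)
}

# Each call starts from a clean state; this one needs no Python at all.
out <- run_in_fresh_session("print(1 + 1)")
print(out)

# Assumed usage with reticulate:
# run_in_fresh_session(
#   "reticulate::use_virtualenv('venv2', required = TRUE); reticulate::py_config()"
# )
```

This does not help when objects must be shared between the two environments in memory; results would have to be passed back as text or files.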

Post image to slack using httr package in R

Slack offers a method to upload files through their API. The documentation is found here:
Slack files.upload method
On this page it gives an example of how to post a file:
curl -F file=@dramacat.gif -F "initial_comment=Shakes the cat" -F channels=C024BE91L,D032AC32T -H "Authorization: Bearer xoxa-xxxxxxxxx-xxxx" https://slack.com/api/files.upload
I am trying to translate how to execute this line of code using the httr package in R, with a file in my R working directory. I'm having trouble translating the different parts of the command. Here is what I have so far.
api_token='******'
f_path='c:/mark/consulting/dreamcloud' #this is also my working directory
f_name='alert_picture.png'
res<-httr::POST(url='https://slack.com/api/files.upload', httr::add_headers(`Content-Type` = "multipart/form-data"),
body = list(token=api_token, channels='CCJL7TMC7', title='test', file = httr::upload_file(f_path), filename=f_name))
When I run this I get the following error:
Error in curl::curl_fetch_memory(url, handle = handle) :
read function returned funny value
I tried to find better examples to use but so far no luck. Any suggestions are appreciated!
There's an example in slackr's own gg_slackr method, which creates an image of a ggplot and uploads it to Slack:
res <- POST(url="https://slack.com/api/files.upload",
add_headers(`Content-Type`="multipart/form-data"),
body=list(file=upload_file(ftmp),
token=api_token, channels=modchan))
Your code seems to be passing a path to a directory rather than a file as the file parameter. Consider changing that parameter to file=upload_file(paste(f_path, f_name, sep="/")) and see if that fixes your error.
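A small sketch of that fix builds the full path with file.path() and checks that it points at a file before uploading. The helper name slack_upload_path is hypothetical, the temporary file stands in for alert_picture.png, and the POST call is left commented out because it needs a real token and channel.

```r
# Hypothetical helper: join directory and file name, and fail early if the
# result is not an existing regular file (a directory is rejected).
slack_upload_path <- function(f_path, f_name) {
  full <- file.path(f_path, f_name)
  if (!file.exists(full) || dir.exists(full)) {
    stop("Not a file: ", full)
  }
  full
}

# Demonstration with a throwaway file instead of the real picture.
f_path <- tempdir()
f_name <- "alert_picture.png"
file.create(file.path(f_path, f_name))
upload_path <- slack_upload_path(f_path, f_name)
print(basename(upload_path))

# Assumed upload call, mirroring the question's arguments:
# res <- httr::POST(
#   url  = "https://slack.com/api/files.upload",
#   body = list(token = api_token, channels = "CCJL7TMC7", title = "test",
#               file = httr::upload_file(upload_path), filename = f_name)
# )
```

httr encodes a list body as multipart by default, so the explicit Content-Type header is probably unnecessary here.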

TreeTagger in R

I have downloaded TreeTagger v3.2 for Windows and configured it per the install.txt. I am trying to use it in R with the koRpus package. I have set the kRp.env as follows:
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en",
preset="en", treetagger="manual", format="file",
TT.tknz=TRUE, encoding="UTF-8" )
My data to be tagged is in a file, and I am trying to tag it with treetag("myfile.txt"), but it throws this error:
Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE, :
'data' must be of a vector type, was 'NULL'
In addition: Warning message:
running command 'C:\windows\system32\cmd.exe /c C:\TreeTagger\bin\tag-english.bat
C:\Users\vivsingh\Desktop\NLP\tree_tag_ex.txt' had status 255
The standalone TreeTagger is working on my Windows machine. Any idea how to get it working from R?
I had the exact same error and warning while trying lemmatization on an R word vector, following the Bernhard Learns blog, using Windows 7 and R 3.4.1 (x64). The issue also appeared with the textstem package, but TreeTagger was running properly in a cmd window.
I mixed several answers I found on this post; here are my steps and the code that runs properly:
Go to the R win-library (~\Documents\R\win-library\3.4\rJava\jri\x64\jri.dll) and copy jri.dll (thanks kravi!) into the parent folder, replacing the copy there.
Close and restart R.
library(koRpus)
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")
lemma_tagged <- treetag(lemma_unique$word_clean, treetagger="manual", format="obj", TT.tknz=FALSE , lang="en", TT.options=list(path="c:/TreeTagger", preset="en"))
lemma_tagged_tbl <- tbl_df(lemma_tagged@TT.res)
Hope it helps.
I am posting this answer to keep a record. I faced the same issue because the location of jri.dll was specified incorrectly on a 64-bit processor with Windows 8.1. If we call
set.kRp.env(TT.cmd="manual", lang="en", TT.options=list(path="/path/to/tree-tagger-windows-x.x/TreeTagger", preset="en"))
and follow either of the following two steps, we can resolve this error:
If only the 64-bit version of R is installed, specify the proper paths for these variables:
LD_LIBRARY_PATH = /path/to/rJava/jri
JAVA_HOME = /path/to/jdk1.x.x
java.library.path = /path/to/rJava/jri/jri.dll
CLASSPATH = /path/to/rJava/jri
If both the 32-bit and 64-bit versions of R are already installed on your computer, copy jri.dll from /path/to/rJava/jri/x64/jri.dll and use it to replace path/to/rJava/jri/jri.dll. The four variables mentioned above still need to be set.
I ran into this issue (a very similar one, I guess) and posted a query on GitHub:
https://github.com/unDocUMeantIt/koRpus/issues/7
The working solution for me was easier than I expected: just downgrade the koRpus package. This may change with time, but this version should remain appropriate.
library("devtools")
install_github("unDocUMeantIt/koRpus", ref="0.06-5")
They said this package is not Java-related.
You can face the same error while setting up the koRpus environment and getting the result from TreeTagger. For example, when you use:
tagged.text <- treetag(
"C:/temp/sample_text.txt",
treetagger = "manual",
lang = "en",
TT.options = list(
path = "c:/Treetagger",
preset = "en"
),
doc_id = "sample"
)
You would receive a similar error
Error: Awww, this should not happen: TreeTagger didn't return any useful data.
This can happen if the local TreeTagger setup is incomplete or different from what presets expected.
You should re-run your command with the option 'debug=TRUE'. That will print all relevant configuration.
Look for a line starting with 'sys.tt.call:' and try to execute the full command following it in a command line terminal. Do not close this R session in the meantime, as 'debug=TRUE' will keep temporary files that might be needed.
If running the command after 'sys.tt.call:' does fail, you'll need to fix the TreeTagger setup.
If it does not fail but produce a table with proper results, please contact the author!
Here you need to change the value of treetagger, from
treetagger = "manual"
to
treetagger = "kRp.env"
However, before that, remember to set the kRp.env as @Xochitl C. suggested in their answer:
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")
Once you do this, you'll get the desired result.

R produces "unsupported URL scheme" error when getting data from https sites

R version 3.0.1 (2013-05-16) for Windows 8, knitr version 1.5, RStudio 0.97.551
I am using knitr to do the markdown of my R code.
As part of my analysis I downloaded various data sets from the web. knitr is totally fine with getting data from http sites, but for https sites it generates an "unsupported URL scheme" message.
I know that when using the download.file function on a Mac, the method parameter has to be set to curl to get data from an https site; however, this doesn't help when using knitr.
What do I need to do so that knitr will gather data from https websites?
Edit:
Here is the code chunk that returns an error in Knitr but when run through R works without error.
```{r}
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy")
```
You can use https with the download.file() function by passing "curl" as the method:
download.file(url,destination,method="curl")
Edit (May 2016): As of R 3.3.0, download.file() should handle SSL websites automatically on all platforms, making the rest of this answer moot.
You want something like this:
library(RCurl)
data <- getURL("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv",
ssl.verifypeer=0L, followlocation=1L)
That reads the data into memory as a single string. You'll still have to parse it into a dataset in some way. One strategy is:
writeLines(data,'temp.csv')
read.csv('temp.csv')
You can also separate out the data directly without writing to file:
read.csv(text=data)
Edit: A much easier option is actually to use the rio package:
library("rio")
import("https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv")
This will read directly from the HTTPS URL and return a data.frame.
Use setInternet2(use = TRUE) before using the download.file() function. It works on Windows 7.
setInternet2(use = TRUE)
download.file(url, destfile = "test.csv")
I am sure you have already found solution to your problem by now.
I was working on an assignment right now and ended up getting the same error. I tried some of the tricks, but that did not work for me. Maybe because I am working on Windows machine.
Anyhow, I changed the link to http: rather than https: and that did the trick.
Following is chunk of my code:
if (!file.exists("./PeerAssessment2")) {dir.create("./PeerAssessment2")}
fileURL <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileURL, dest = "./PeerAssessment2/Data.zip")
install.packages("R.utils")
library(R.utils)
if (!file.exists("./PeerAssessment2/Data")) {
bunzip2 ("./PeerAssessment2/Data.zip", destname = "./PeerAssessment2/Data")
}
list.files("./PeerAssessment2")
noaaData <- read.csv ('./PeerAssessment2/Data')
Hope this helps.
I had the same issue with knitr and download.file() with a https url, on Windows 8.
You could try setInternet2(TRUE) before using the download.file() function. However I'm not sure that this fix works on Unix-like systems.
setInternet2(TRUE) # set the R_WIN_INTERNET2 to TRUE
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download.file(fileurl, destfile = "C:/Users/xxx/yyy") # now it should work
Source: R documentation (?download.file):
Note that https:// URLs are only supported if --internet2 or environment variable R_WIN_INTERNET2 was set or setInternet2(TRUE) was used (to make use of Internet Explorer internals), and then only if the certificate is considered to be valid.
I had the same problem with a https with the following code running perfectly in R and getting unsupported URL scheme when knitting to html:
temp = tempfile()
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2Factivity.zip", temp)
data = read.csv(unz(temp, "activity.csv"), colClasses = c("numeric", "Date", "numeric"))
I tried all the solutions posted here and nothing worked. In my absolute desperation I just removed the "s" from "https" in the URL and everything was fine...
Using the R downloader package takes care of the quirky details typically associated with file downloads. For your example, all you would have needed to do is:
```{r}
library(downloader)
fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
download(fileurl, destfile = "C:/Users/xxx/yyy")
```
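On a current R, a quick way to see whether the session can fetch https URLs natively is to query capabilities("libcurl") and then request that method explicitly. This is only a sketch, assuming R 3.3.0 or later as mentioned in the May 2016 edit; the download itself is commented out because the Dropbox link may no longer resolve, and the destination file name is made up.

```r
# TRUE when this R build has libcurl support, which handles https natively.
has_libcurl <- isTRUE(unname(capabilities("libcurl")))
print(has_libcurl)

# Pick the method explicitly instead of relying on the platform default.
dl_method <- if (has_libcurl) "libcurl" else "auto"

fileurl <- "https://dl.dropbox.com/u/7710864/data/csv_hid/ss06hid.csv"
destfile <- file.path(tempdir(), "ss06hid.csv")  # hypothetical destination

# download.file(fileurl, destfile = destfile, method = dl_method, mode = "wb")
```

Because knitr runs chunks in a non-interactive session, stating the method in the chunk avoids depending on whatever default the interactive console happened to use.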

POST request using RCurl

As a way of exploring how to make a package in R for the Denver RUG, I decided that it would be a fun little project to write an R wrapper around the datasciencetoolkit API. The basic R tools come from the RCurl package as you might imagine. I am stuck on a seemingly simple problem and I'm hoping that somebody in this forum might be able to point me in the right direction. The basic problem is that I can't seem to use postForm() to pass an un-keyed string as part of the data option in curl, i.e. curl -d "string" "address_to_api".
For example, from the command line I might do
$ curl -d "Tim O'Reilly, Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
with success. However, it seems that postForm() requires an explicit key when passing additional arguments into the POST request. I've looked through the datasciencetoolkit code and developer docs for a possible key, but can't seem to find anything.
As an aside, it's pretty straightforward to pass inputs via a GET request to other parts of the DSTK API. For example,
ip2coordinates <- function(ip) {
api <- "http://www.datasciencetoolkit.org/ip2coordinates/"
result <- getURL(paste(api, URLencode(ip), sep=""))
names(result) <- "ip"
return(result)
}
ip2coordinates('67.169.73.113')
will produce the desired results.
To be clear, I've read through the RCurl docs on DTL's omegahat site, the RCurl docs with the package, and the curl man page. However, I'm missing something fundamental with respect to curl (or perhaps .opts() in the postForm() function) and I can't seem to get it.
In python, I could basically make a 'raw' POST request using httplib.HTTPConnection -- is something like that available in R? I've looked at the simplePostToHost function in the httpRequest package as well and it just seemed to lock my R session (it seems to require a key as well).
FWIW, I'm using R 2.13.0 on Mac 10.6.7.
Any help is much appreciated. All of the code will soon be available on github if you're interested in playing around with the data science toolkit.
Cheers.
With httr, this is just:
library(httr)
r <- POST("http://www.datasciencetoolkit.org/text2people",
body = "Tim O'Reilly, Archbishop Huxley")
stop_for_status(r)
content(r, "parsed", "application/json")
Generally, in those cases where you're trying to POST something that isn't keyed, you can just assign a dummy key to that value. For example:
> postForm("http://www.datasciencetoolkit.org/text2people", a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
charset
"text/html" "utf-8"
Would work the same if I'd used b="Archbishop Huxley", etc.
Enjoy RCurl - it's probably my favorite R package. If you get adventurous, upgrading to ~ libcurl 7.21 exposes some new methods via curl (including SMTP, etc.).
From Duncan Temple Lang on the R-help list:
postForm() is using a different style (or specifically Content-Type) of submitting the form than the curl -d command.
Switching the style = 'POST' uses the same type, but at a quick guess, the parameter name 'a' is causing confusion
and the result is the empty JSON array - "[]".
A quick workaround is to use curlPerform() directly rather than postForm()
r = dynCurlReader()
curlPerform(postfields = 'Archbishop Huxley', url = 'http://www.datasciencetoolkit.org/text2people', verbose = TRUE,
post = 1L, writefunction = r$update)
r$value()
This yields
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":0,\"end_index\":17,\"matched_string\":\"Archbishop Huxley\"}]"
and you can use fromJSON() to transform it into data in R.
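As an illustration of that last step, the sketch below parses the response string shown above. It assumes the jsonlite package is installed (the original message may well have meant fromJSON() from RJSONIO); the variable names are made up.

```r
# Response text as returned by the service in the example above.
json_txt <- '[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]'

# Assumption: jsonlite is available; the block is skipped quietly otherwise.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  people <- jsonlite::fromJSON(json_txt)  # array of objects -> data.frame
  print(people$matched_string)
  print(people$end_index - people$start_index)
}
```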
I just wanted to point out that there must be an issue with passing a raw string via the postForm function. For example, if I use curl from the command line, I get the following:
$ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]
and in R I get
> api <- "http://www.datasciencetoolkit.org/text2people"
> postForm(api, a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
charset
"text/html" "utf-8"
Note that it returns two elements in the JSON string and neither one matches on the start_index or end_index. Is this a problem with encoding or something?
The simplePostToHost function in the httpRequest package might do what you are looking for here.

Resources