Download & decompress JSON file in R using curl or RCurl

I have the following bash script to download & decompress a JSON file:
#!/bin/sh -ex
# Ensure data directory (or a link) exists.
test -e results || mkdir results
# Download and decompress data.
curl -u $GISAID_USERNAME:$GISAID_PASSWORD --retry 4 \
https://www.epicov.org/epi3/3p/$GISAID_FEED/export/provision.json.xz \
| xz -d -T8 > results/gisaid.json
Ideally I would like an R function that downloads & decompresses this file into a given directory, with the environment variables above ($GISAID_USERNAME, $GISAID_PASSWORD & $GISAID_FEED) passed as arguments. Would anyone know how to accomplish this, e.g. using the curl or RCurl package? It would also be OK not to decompress it and to leave it as .json.xz, since I would later be reading the file with:
library(jsonlite)
GISAID_json <- jsonlite::stream_in(gzfile(".//data//GISAID_json//provision.json.xz"))

Something like this should work:
library(curl)
library(glue)
custom_curl <- function(user, pass, feed, dest) {
  custom_handle <- curl::new_handle()
  curl::handle_setopt(
    custom_handle,
    username = user,
    password = pass
  )
  url <- glue::glue(
    "https://www.epicov.org/epi3/3p/{feed}/export/provision.json.xz"
  )
  curl::curl_download(url, dest, handle = custom_handle)
}
custom_curl('my_user', 'xxxxxx', 'feed1', 'dest/filename.json.xz')
As I can't test with the real files and URL, I'm not sure whether a little tinkering with the function is needed, but at least it's a starting point for you.
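If you want the environment variables and the destination directory handled inside the function, here is a hedged variant of the same idea (untested against the real GISAID feed, since I don't have credentials); the file can then be streamed while still compressed, as in your jsonlite call:
library(curl)
library(glue)
library(jsonlite)

download_gisaid <- function(dest_dir = "results",
                            user = Sys.getenv("GISAID_USERNAME"),
                            pass = Sys.getenv("GISAID_PASSWORD"),
                            feed = Sys.getenv("GISAID_FEED")) {
  # Mirror `test -e results || mkdir results` from the shell script.
  if (!dir.exists(dest_dir)) dir.create(dest_dir, recursive = TRUE)

  h <- curl::new_handle()
  curl::handle_setopt(h, username = user, password = pass)

  url  <- glue::glue("https://www.epicov.org/epi3/3p/{feed}/export/provision.json.xz")
  dest <- file.path(dest_dir, "provision.json.xz")
  curl::curl_download(url, dest, handle = h)
  dest
}

# Stream the still-compressed file; xzfile() is base R's connection for .xz data.
# GISAID_json <- jsonlite::stream_in(xzfile(download_gisaid()))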

Does executing your terminal commands in R using the system function already help you?
Put your terminal call into system() and it should execute and create your file. Afterwards, read in the file. Of course you would have to replace $GISAID_USERNAME and $GISAID_PASSWORD with your actual information. If the login information or the URL needs to stay flexible, you can put the command string together beforehand (see the sketch at the end of this answer), since system() expects a string with the command to execute.
system("curl -u $GISAID_USERNAME:$GISAID_PASSWORD --retry 4 \
https://www.epicov.org/epi3/3p/$GISAID_FEED/export/provision.json.xz \
| xz -d -T8 > results/gisaid.json")
Afterwards just read in the (hopefully) created file.
Couldn't test with your setup, but for me e.g. this small example successfully creates a file:
system("curl https://raw.githubusercontent.com/SteffenMoritz/imputeTS/master/pkgdown/favicon/favicon.ico > /Users/Steve/Downloads/x.ico")

Related

R translating curl commands to post multipart form - problem

I have the following curl command that, when run from command line, works perfectly:
curl -X POST -u "myusername|myemail@domain.com:myPassword" \
  -H "Content-Type: multipart/form-data" \
  --form file=@MyFileForUploading.csv \
  https://mysite-data.herokuapp.com/api/mymarket/setups/uploads
[Apologies: this is not a working example as I cannot provide the real url and credentials. I am hoping you can help me with the translation from curl to httr without running the example yourselves.]
Here's my attempt to translate the above to the language of R's httr, which did NOT work:
library(httr)
POST("https://mysite-data.herokuapp.com/api/mymarket/setups/uploads",
config = authenticate("myusername|myemail#domain.com", "myPassword"),
body = upload_file("MyFileForUploading.csv", type = "text/csv"),
encode = "multipart")
The curl command serves to upload a csv file being used as setup for a web-based trading interface. Setup includes things like trader initial holdings of objects, trader permissions (to buy and sell), etc. All this is simply stored as a csv file (columns = setup parameters; rows = traders).
Can anyone see an obvious mis-translation? I am very ignorant about both curl and httr. My translation is based on learning from examples and I wouldn't be surprised if there's an obvious failure, for example, with the content-type part of the command.
Thanks!
You're really close. This works with environment variables set up in "~/.Renviron":
library("httr")
post_url <- Sys.getenv("POST_URL")
username <- Sys.getenv("USERNAME")
password <- Sys.getenv("PASSWORD")
csv_file <- Sys.getenv("CSV_FILE")
POST(
url = post_url,
config = authenticate(username, password),
body = list(file = upload_file(csv_file)),
encode = "multipart",
verbose()
) -> response
The key is wrapping the upload in a named list (file = ...), matching the file field you used in your curl command.
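For a quick test without editing ~/.Renviron, the same variables could also be set from R itself with Sys.setenv(); the values below are placeholders taken from the question, not real credentials:
Sys.setenv(
  POST_URL = "https://mysite-data.herokuapp.com/api/mymarket/setups/uploads",
  USERNAME = "myusername|myemail@domain.com",
  PASSWORD = "myPassword",
  CSV_FILE = "MyFileForUploading.csv"
)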

Uploading files to SharePoint from R

I am trying to upload a file from R to SharePoint. I have found similar questions like Saving files to SharePoint folder from R, Saving a file to Sharepoint with R and Copying file to sharepoint library in R but I haven't been able to get it working for myself.
The things I tried so far:
system("curl --ntlm --user username:password --upload-file file.xslx https://companyname.sharepoint.com/sites/sitename/Shared%20documents/file.xlsx")
system("curl --ntlm --user username:password --upload-file file.xslx https://companyname.sharepoint.com/sites/sitename/Documents/file.xlsx")
system("curl --ntlm --user username:password --upload-file file.xslx https://companyname.sharepoint.com/sites/sitename/Shared documents/file.xlsx")
system("curl --ntlm --user username:password --upload-file file.xslx \\\\companyname.sharepoint.com#SSL\\sites\\sitename\\Shared%20documents\\file.xlsx")
Note that our SharePoint is in our native language (Dutch), so the folder "Shared documents" is "Gedeelde documenten". I tried both languages but without success. I'm not sure whether I am supposed to use the English names or the Dutch ones.
My guess is that the url I use is not in the correct format or such, so I have played around with that but can't come up with the correct way myself.
Any help is much appreciated.
EDIT:
This is what the page and folder in SharePoint look like.
Full URL (I'm guessing the part from /Forms to the end is not needed in the command): https://companyname.sharepoint.com/sites/SiteName/Gedeelde%20documenten/Forms/AllItems.aspx?id=%2Fsites%2FOfficeSFMT%2FGedeelde%20documenten%2FGeneral%2FTest
[Screenshot of the folder]
My best guess was trying "--upload-file C:/Users/UserName/Documents/Test.txt" with "companyname.sharepoint.com/sites/SiteName/Documenten/General/Test/Test.txt".
I just tested the following code and it worked:
cmd <- paste("curl --max-time 7200 --connect-timeout 7200 --ntlm --user", "username:password", "--upload-file Book1.xlsx","teamsites.companyname.com/sites/SandBox/Documents/UserDocumentation/Test/Book1.xlsx", sep = " ")
system(cmd)
I routinely use the following function. The only issue is that the transferred file will remain 'checked out' until it is manually 'checked in' for others to use.
saveToSharePoint <- function(fileName) {
  cmd <- paste("curl --max-time 7200 --connect-timeout 7200 --ntlm --user",
               "username:password",
               "--upload-file", paste0("/home/username/FolderNameWhereTheFileToTransferExists/", fileName),
               paste0("teamsites.OrganizationName.com/sites/PageTitle/Documents/UserDocumentation/FolderNameWhereTheFileNeedsToBeCopied/", fileName),
               sep = " ")
  system(cmd)
}
saveToSharePoint("SomeFileName.Ext")

R - curl Github API to access private repos

All I'm trying to do is read all the repos and issues in my organization's private repos. From my Windows 7 cmd.exe I can execute
curl -u "user:pass" https://api.github.com/orgs/:org/repos
and I get back all of my repositories. I can pipe this to a file:
curl -u "user:pass" https://api.github.com/orgs/:org/repos > "C:\Users\Location\file.txt"
and this saves the JSON output. I can replicate this in R but in what seems like a terrible way.
fullRepos <- system('curl -s -u "user:pass" -G https://api.github.com/orgs/:org/repos',
                    intern = TRUE, show.output.on.console = FALSE)
This captures the output (intern = TRUE), and -s suppresses the progress bar so I can collapse the lines and turn them into a data frame. This gets back all the repositories, public and private.
I tried using RCurl to do the same thing, but the code below only returns the public repositories. The httpheader is there because otherwise the API rejects my call.
RCurl::getURL(url="https://api.github.com/orgs/:org/repos",userpwd ="user:pass",
httpheader = c('User-Agent' = "A user agent"))
I also tried httr and it also only provides the public repositories.
httr::GET(url="https://api.github.com/orgs/:org/repos",userpwd="user:pass")
What am I doing wrong with RCurl and httr? I'd rather have a workflow that doesn't make a system command and then paste the lines together.
We can use the authenticate() helper function in httr to build the authentication header for us without having to create it manually. Also, verbose() can be used to debug HTTP issues:
httr::GET(url="https://api.github.com/orgs/:‌​org/repos",
httr::authenticate("user", "pass"),
httr::verbose())
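As a hedged follow-up sketch (the :org placeholder and the GITHUB_USER/GITHUB_PAT environment variable names are assumptions, not part of the original answer), the authenticated response can then be parsed into a data frame with jsonlite:
library(httr)
library(jsonlite)

resp <- GET("https://api.github.com/orgs/:org/repos",
            authenticate(Sys.getenv("GITHUB_USER"), Sys.getenv("GITHUB_PAT")))
stop_for_status(resp)  # fail loudly on a 4xx/5xx response
repos <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
repos$full_name        # repository names returned for the authenticated user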

download.file in R including pre-requisites

I'm trying to use download.file to get some webpages including embedded images, etc. I think with wget it's the equivalent of the -p -k options, but I can't see how to do this...
If I do:
download.file("http://guardian.co.uk", "test.html")
that obviously works. But when I run:
download.file("http://guardian.co.uk", "test.html", method = "wget", extra = "-p -k") # no recursion (-r), but get pre-requisites, and (-k) convert for local viewing
I get this error:
Warning messages:
1: running command 'wget -p -k "http://guardian.co.uk" -O "test.html"' had status 1
2: In download.file("http://guardian.co.uk", "test.html", method = "wget", :
  download had nonzero exit status
I've done Sys.which("wget") & the path is set (and I'm not trying to access https which I think can cause issues).
Once I've done this I actually want to put it into a loop where I download a set of urls (& their embedded content) to create a single html output...
Easy solution: just use system to call wget directly:
system("wget http://guardian.co.uk -p -k")
I think the issue is that passing an output file ('test.html') means the -O option is specified, so you can't also use -p -k, whereas calling wget directly saves the files separately.
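Since the question mentions looping over a set of URLs, here is a minimal sketch of that loop, assuming wget is on the PATH; the urls vector holds placeholder addresses:
# Fetch each page plus its prerequisites (-p) and rewrite links for local viewing (-k).
urls <- c("http://guardian.co.uk", "http://example.com")
for (u in urls) {
  system(paste("wget -p -k", shQuote(u)))
}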

Post to Twitter using Terminal with CURL

I got this far:
:~ curl -u username:password -d status="new_status" http://twitter.com/statuses/update.xml
Now, how can I alias this with variables so I can easily tweet from Terminal? And how can I make the alias persist across sessions (when I close Terminal, aliases reset)?
Thanks!
Basic Authentication is no longer supported by Twitter. Please use OAuth.
You clearly have the alias command: stick it in your ~/.bashrc and it will be set up when your bash shell starts. (.shrc should also work for sh-like shells.)
If you stick it in a script file as the previous answer suggests:
(a) add the line
#!/bin/sh
at the top;
(b) make sure it's on your path or you'll have to type the whole path to the script when you want to run it.
(c) make it executable:
chmod +x tweet.sh
What about putting it in a file and using the first argument as $1:
# tweet.sh "post my status, moron!":
curl -u username:password -d status="$1" http://twitter.com/statuses/update.xml
will that work?
You need to create a file in your home directory that will get referenced each time a new terminal opens.
Do a bit of research as to what to name the file, according to what type of shell you are using (tcsh looks for a file called .tcshrc while bash looks for .bashrc).
Once you have that file, make it executable by running:
chmod +x name_of_file
Then, in that file, create your alias (again, you'll need to research how to do this depending on what type of shell you are using). For tcsh, my alias looks like this:
alias tw 'curl -u username:password -d status=\!^ http://twitter.com/statuses/update.xml'
Bash aliases use an equals sign. A bash alias would look something more like this:
alias tw='curl -u username:password -d status=\!^ http://twitter.com/statuses/update.xml'
Note the change in the command after "status=". The \!^ tells the line of code to insert the first argument passed after the alias itself.
Save your file.
You could then run an update to Twitter by typing the following in a new terminal:
tw 'my first post to twitter via the terminal, using aliases'
Don't forget to escape 'special' characters (like exclamations) with the escape character, \ (i.e. \!)
Since Basic Authentication is no longer supported by Twitter, you have to use OAuth to achieve your goal.
But if you just want to post to Twitter from the terminal, there are many applications that can do it.
Take a look at Rainbowstream or t.
With Rainbowstream, the following lines will let you tweet from the console:
$ sudo pip install rainbowstream
$ rainbowstream
[#yourscreenname]t whatever you want
