I am trying to download a data package from KNB and keep getting a "cannot open URL" error and "InternetOpenUrl failed: 'A certificate is required to complete client authentication'". I have checked my GitHub credentials and everything seems to be in order, but I just updated git and set up a PAT. Code below, but note that you will have to set your own directory.
This worked literally two weeks ago. Not sure what changed.
download.file("https://knb.ecoinformatics.org/knb/d1/mn/v2/packages/application%2Fbagit-097/resource_map_urn%3Auuid%3A14644b19-6e53-4063-aad9-fc823a45ac50", destfile = #your dir#, method = "wininet")
The suggestion to add mode = "wb" to download.file() works. I have been able to download, unzip and access the various data types in the downloaded folder.
Thank you!
download.file("https://knb.ecoinformatics.org/knb/d1/mn/v2/packages/application%2Fbagit-097/resource_map_urn%3Auuid%3A14644b19-6e53-4063-aad9-fc823a45ac50", destfile = #your dir#, method = "libcurl", mode = "wb")
unzip(zipfile = "./data.zip", exdir = "./data")
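As a minimal sketch of the fix above, the download and extraction steps can be wrapped in one helper. The name fetch_and_unzip and its arguments are hypothetical (not from any package); it assumes the URL points at a zip archive and that writing in binary mode is what keeps the archive intact on Windows.

```r
# Hypothetical helper: download a zip archive in binary mode, then extract it.
fetch_and_unzip <- function(url, zip_path, exdir, method = "libcurl") {
  # mode = "wb" writes the bytes unchanged; text mode can corrupt archives on Windows
  status <- download.file(url, destfile = zip_path, method = method, mode = "wb")
  if (status != 0) stop("download failed with status ", status)
  # Create the target directory up front rather than relying on unzip() to do it
  dir.create(exdir, recursive = TRUE, showWarnings = FALSE)
  # unzip() returns the paths of the extracted files (invisibly)
  unzip(zipfile = zip_path, exdir = exdir)
}
```

It would be called as fetch_and_unzip(url, "./data.zip", "./data"), with url set to the package address used above.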
I am downloading a large xlsx file as part of a function. The file is removed by file.remove() on Linux and Mac, but I get "Permission denied" on Windows machines. Below is the code for my function.
download.file(
'http://mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_MTI.xlsx',
'miRTarBase.xlsx', mode = "wb")
readxl::read_excel('miRTarBase.xlsx') -> miRTarBase
write.csv(miRTarBase, 'miRTarBase.csv')
read.csv('miRTarBase.csv', row.names = 1) -> miRTarBase
file.remove("miRTarBase.xlsx")
I get the following error message in my console
Warning message:
In file.remove("miRTarBase.xlsx") :
cannot remove file 'miRTarBase.xlsx', reason 'Permission denied'.
Again, this warning only appears on Windows.
Furthermore, after checking the properties of the file itself, the 'Read-only' attribute is unchecked.
The following code also works perfectly fine, so I do not think the issue is with the folder either.
file.remove("miRTarBase.csv")
I believe the issue lies in how .xlsx files are treated on Windows.
When I try to delete the .xlsx file while RStudio is still running, I get a "File in use" warning message. After closing the R session, the .xlsx file can be deleted with no hassle.
This has confused me because I am not used to working with Windows. Has anyone had this issue before? I would appreciate any help that can be given. Many thanks.
Have you tried saving it as a temporary file on Windows?
tmp <- tempfile()
download.file(
'http://mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_MTI.xlsx', tmp, mode = "wb")
readxl::read_excel(tmp) -> miRTarBase
write.csv(miRTarBase, 'miRTarBase.csv')
read.csv('miRTarBase.csv', row.names = 1) -> miRTarBase
file.remove(tmp)
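A minimal sketch of the same temporary-file idea, wrapped so that cleanup happens even if reading fails. The helper name read_downloaded and its reader argument are hypothetical. Note that unlink() can still fail silently on Windows if something else holds the file open, so this is a tidier pattern rather than a guaranteed fix for the lock.

```r
# Hypothetical helper: download to a temp file, read it, and always clean up.
read_downloaded <- function(url, reader, fileext = "") {
  tmp <- tempfile(fileext = fileext)
  # Remove the temp file when the function exits, even if reader() fails
  on.exit(unlink(tmp), add = TRUE)
  # Binary mode keeps files such as .xlsx intact on Windows
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  reader(tmp)
}

# Usage (assumes the readxl package is installed):
# miRTarBase <- read_downloaded(
#   "http://mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_MTI.xlsx",
#   readxl::read_excel, fileext = ".xlsx")
```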
OS: Win 7 64 bit
RStudio Version 1.1.463
As per Getting and Cleaning Data course, I attempted to download a csv file with method = curl:
fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
download.file(fileUrl, destfile = "./cameras.csv", method = "curl")
Error in download.file(fileUrl, destfile = "./cameras.csv", method =
"curl") : 'curl' call had nonzero exit status
However, method = libcurl resulted in a successful download:
download.file(fileUrl, destfile = "./cameras.csv", method = "libcurl")
trying URL
'https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD'
downloaded 9443 bytes
Changing from https to http produced exactly the same results for curl and libcurl, respectively.
Is there any way to make this download work via method = curl as per the course?
Thanks
As you can see from ?download.file:
For methods "wget" and "curl" a system call is made to the tool given
by method, and the respective program must be installed on your system
and be in the search path for executables. They will block all other
activity on the R process until they complete: this may make a GUI
unresponsive.
Therefore, you should install curl first.
See this How do I install and use curl on Windows? to learn how.
Best!
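Whether curl is actually on the search path can be checked from R before choosing a method. The following is a rough sketch; pick_download_method is a hypothetical name, and the fallback order (curl, then libcurl, then "auto") is an assumption rather than anything the documentation prescribes.

```r
# Hypothetical helper: choose a download.file() method that should work here.
pick_download_method <- function() {
  # Sys.which() returns "" when the executable is not on the search path
  if (nzchar(Sys.which("curl"))) {
    "curl"
  } else if (capabilities("libcurl")[["libcurl"]]) {
    # R was built with libcurl support, so no external program is needed
    "libcurl"
  } else {
    "auto"
  }
}

# Usage:
# download.file(fileUrl, destfile = "./cameras.csv", method = pick_download_method())
```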
I believe there were a few issues here:
Followed the steps in the link quoted by #JonnyCrunch
a) Reinstalled Git for windows;
b) added C:\Program Files\Git\mingw64\bin\ to the 'PATH' variable;
c) Disabled Use Internet Explorer library/proxy for HTTP in RStudio in: Tools > Options > Packages
d) Attempted steps in 'e)' below and added the data.baltimorecity.gov website to exclusions as per Kaspersky anti-virus' prompt;
e) Then in RStudio:
options(download.file.method = "curl")
download.file(fileUrl, destfile="./data/cameras.csv")
Success!
Thank you
I'm hosting my first Shiny app on www.shinyapps.io. My R script uses a glm I created locally that I have stored as a .RDS file.
How can I read this file into my application directly using a free file host such as dropbox or google drive? (or another better alternative?)
test<-readRDS(gzcon(url("https://www.dropbox.com/s/p3bk57sqvlra1ze/strModel.RDS?dl=0")))
However, I get the error:
Error in readRDS(gzcon(url("https://www.dropbox.com/s/p3bk57sqvlra1ze/strModel.RDS?dl=0"))) :
unknown input format
I assume this is because the URL doesn't lead directly to the file but rather a landing page for dropbox?
That being said, I can't seem to find any free file hosting sites that have that functionality.
As always, I'm sure the solution is very obvious, any help is appreciated.
I figured it out. Hosted the file in a GitHub repository. From there I was able to copy the link to the raw file and placed that link in the readRDS(gzcon(url())) wrappers.
Remotely reading using readRDS() can be disappointing. You might want to try this wrapper that saves the data set to a temporary location before reading it locally:
readRDS_remote <- function(file, quiet = TRUE) {
if (grepl("^http", file, ignore.case = TRUE)) {
# temp location
file_local <- file.path(tempdir(), basename(file))
# download the data set
download.file(file, file_local, quiet = quiet, mode = "wb")
file <- file_local
}
readRDS(file)
}
I am trying to download spreadsheets from AQR data library into R directly.
I have this link: http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx which prompts a download. However, when trying the following code:
> url1<-"http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx"
> download.file(url1,destfile="example.xlsx")
I get this error
trying URL 'http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx'
Error in download.file(url1, destfile = "example.xlsx") : cannot open URL 'http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx'
https://www.aqr.com/library/data-sets/value-and-momentum-everywhere-portfolios-monthly is the page from which I am trying to download data(under full set data link).
Could you provide some guidance?
It looks like that link redirects to https, which download.file does not support by default. If you have wget or curl installed you can use
download.file("https://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx",
"example.xlsx",
method = "wget")
or
download.file("https://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx",
"example.xlsx",
method = "curl")
These and other options are discussed at Download a file from HTTPS using download.file()
I'm not quite sure what is causing the problem for you, but the following worked for me:
library(XLConnect)
##
con <- "http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx"
download.file(con,"xlsxFile.xlsx",mode="wb")
##
newWB <- loadWorkbook(
file="xlsxFile.xlsx",
create=F)
##
R> getSheets(newWB)
[1] "VME Portfolios" "Definitions" "Data Sources" "Disclosures"
and here's a screenshot of the downloaded file:
I'm trying to download and extract a zip file using R. Whenever I do so I get the error message
Error in unzip(temp, list = TRUE) : 'exdir' does not exist
I'm using code based on the Stack Overflow question Using R to download zipped data file, extract, and import data
To give a simplified example:
# Create a temporary file
temp <- tempfile()
# Download ZIP archive into temporary file
download.file("http://cran.r-project.org/bin/windows/contrib/r-release/ggmap_2.2.zip",temp)
# ZIP is downloaded successfully:
# trying URL 'http://cran.r-project.org/bin/windows/contrib/r-release/ggmap_2.2.zip'
# Content type 'application/zip' length 4533970 bytes (4.3 Mb)
# opened URL
# downloaded 4.3 Mb
# Try to do something with the downloaded file
unzip(temp,list=TRUE)
# Error in unzip(temp, list = TRUE) : 'exdir' does not exist
What I've tried so far:
Accessing the temp file manually and unzipping it with 7zip: Can do this no problem, file is there and accessible.
Changing the temp directory to c:\temp. Again, the file is downloaded successfully, I can access it and unzip it with 7zip but R throws the exdir error message when it tries to access it.
R version 2.15.2
R-Studio version 0.97.306
Edit: The code works if I use unz instead of unzip but I haven't been able to figure out why one works and the other doesn't. From CRAN guidance:
unz reads (only) single files within zip files...
unzip extracts files from or list a zip archive
On a windows setup:
I had this error when I had exdir specified as a path. For me the solution was removing the trailing / or \\ in the path name.
Here's an example and it did create the new folder if it didn't already exist
locFile <- pathOfMyZipFile
outPath <- "Y:/Folders/MyFolder"
# OR
outPath <- "Y:\\Folders\\MyFolder"
unzip(locFile, exdir=outPath)
This can manifest another way, and the documentation doesn't make clear the cause. Your exdir cannot end in a "/", it must be just the name of the target folder.
For example, this was failing with 'exdir' does not exist:
unzip(temp, overwrite = F, exdir = "data_raw/system-data/")
And this worked fine:
unzip(temp, overwrite = F, exdir = "data_raw/system-data")
Presumably when unzip sees the "/" at the end of the exdir path it keeps looking; whereas omitting the "/" tells unzip "you've found it, unzip here".
A couple of years late but I still get this error when trying to use unzip(). It appears to be a bug because the man pages for unzip state if exdir is specified it will be created:
exdir The directory to extract files to (the equivalent of unzip -d).
It will be created if necessary.
A workaround I've been using is to manually create the necessary directory:
dir.create("directory")
unzip("file-to-unzip.zip", exdir = "directory/")
A pain, but it seems to work, at least for me.
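The two workarounds above (dropping the trailing separator and creating the directory first) can be combined in a small wrapper. This is only a sketch: strip_trailing_sep and safe_unzip are hypothetical names, and drive roots such as "C:/" are not handled specially.

```r
# Hypothetical helper: drop trailing "/" or "\" from a path.
strip_trailing_sep <- function(path) {
  stripped <- sub("[/\\\\]+$", "", path)
  # Keep the original if stripping would leave an empty string (e.g. "/")
  if (nzchar(stripped)) stripped else path
}

# Hypothetical wrapper around unzip() applying both workarounds.
safe_unzip <- function(zipfile, exdir) {
  exdir <- strip_trailing_sep(exdir)
  # Create the target directory explicitly instead of relying on unzip()
  dir.create(exdir, recursive = TRUE, showWarnings = FALSE)
  unzip(zipfile, exdir = exdir)
}

# Usage:
# safe_unzip(temp, exdir = "data_raw/system-data/")
```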
I am using R 3.2.1 on a Windows 7 machine.
The way I found to address this issue takes a few steps, but it works for me:
Create a vector that contains the name of the url from where you are downloading the file, e.g.
file_url <- "http://your.file.com/file_name.zip"
Use download.file to specify the url where you are downloading the file from (using your newly created vector), followed by the file name of the zipped file (that should be the last part of the url name). It will be saved as such in your working directory*, e.g.
download.file(file_url, "file_name.zip")
*If you are not sure of your working directory, you can use getwd() to check it. If you want to change your working directory, you can use setwd("C:/users/username/...") to set it to what you want.
Use "unzip" to unzip the file into your working directory, with the name you will set using exdir, e.g.
unzip("file_name.zip", exdir = "file_name")
To check your work, you can use list.files, e.g.
list.files("file_name")
Hope this helps!