download.file from url without specifying filename using R - r

Am trying to download a file from a url. If I manually download and unzip it works fine, however, using download.file the zip is corrupt. I wonder if it is something to do with manually specifying the file name. Why is this neccesary when using the download.file function? Is it not possible to simply specify a url and folder and it download the file as it is named on the server?
##using R
##download url
url <- 'https://www.eea.europa.eu/data-and-maps/data/eea-reference-grids-2/gis-files/denmark-shapefile/at_download/file'
download.file(url, destfile ='Denmark_shapefile.zip')
unzip(zipfile = 'Denmark_shapefile.zip', exdir = '.')
Unzip fails as the file is corrupt

destfile ='Denmark_shapefile.zip' is only specifying the name the downloaded file should have. You can write anything you want in there.
Your code works for me exactly as it is, but you could try using this download statement instead of yours which is specified for writing and is binary safe.
download.file(url, destfile ='Denmark_shapefile.zip', mode='wb', cacheOK=FALSE)

One way to do is,
url <- 'https://www.eea.europa.eu/data-and-maps/data/eea-reference-grids-2/gis-
files/denmark-shapefile/at_download/file'
Your working directory is were files are downloaded
library(stringr)
fold = str_replace_all(getwd(), '/', '\\\\')
A random file name is assigned to zip file.
fil = tempfile(pattern = "file", tmpdir = fold, fileext = '.zip')
download.file(url, fil, mode = 'wb')

Related

Getting a zip file with R httr package: file is empty

I am downloading a zip file from a site that uses basic authentication. The download seems to be going OK but when I try the unzip the file, it turns out to be empty. When I download the file by hand, it has several folders and files in it.
Here's what I am doing:
library(httr)
dest <- paste0(getwd(), "/data/weekly_2017-02-18.zip")
GET("https://www.example.com/weeklydata/weekly_2017-02-18.zip",
authenticate("myemail", "mypassword", "basic"),
write_disk(dest, overwrite = TRUE))
unzip(dest) # <-- THIS FAILS
What am I doing wrong?
unzip(dest)) <-- there is an extra closing bracket here
use:
unzipped <- unzip(dest)

R download.file, downloading excel file does not work

I try to download an excel file using download.file().
If I go directly to the link using the browser, I can download the file without problems.
However, using download.file does only download a broken file with Excel error: "The file you are trying to open is in a different format than specified by the file extension."
Here is my code:
url <- "http://obieebr.banrep.gov.co/analytics/saw.dll?Download&Format=excel2007&Extension=.xlsx&BypassCache=true&path=%2Fshared%2fSeries%20Estad%c3%adsticas%2F1.%20Tasa%20Interbancaria%20%28TIB%29%2F1.1.TIB_Serie%20hist%C3%B3rica%20IQY&lang=es&NQUser=publico&NQPassword=publico&SyncOperation=1"
download.file(url, destfile = paste0(base_dir, "test.xls"), mode = "wb", method="libcurl")
Any ideas how to download this file?
Many thanks for your help!
Try this, it works for me:
download.file(url,destfile = "./second.xlsx",mode = "wb")
The file you are trying to download is simply not an excel file. Actually what you obtain is an html file (try to change the file extension to '.html', then open in your browser). So your code is not the problem.

save zipped file directly from the web in r

I am trying to save a zip file from the internet onto my computer. I can download the content straight into R with:
sfile <- "http://xweb.geos.ed.ac.uk/~smaccal1/ARCLake/v3_0/PL/ALID0001.zip"
temp <- tempfile()
download.file(sfile,temp)
From here, how can I then save that zipped file on my computer without having to open it in R by unzipping the folder and then using read.table
data <- read.table(unz(temp, "a1.dat"))
unlink(temp)
and then save that data. Essentially I would like to save the files directly from the web (still zipped). How can this be done?
You can use download.file to save the file in a specified location:
sfile <- "http://xweb.geos.ed.ac.uk/~smaccal1/ARCLake/v3_0/PL/ALID0001.zip"
download.file(sfile, destfile = "/path/to/myfile.zip")

How to download file from internet via R

I have an url, and I want to download the file via R, I notice that download.file would be helpful, but my problem seems different:
url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile)
It doesn't work! I notice that if my url is in the form of xxx.pdf, then the code above is no problem, otherwise the file that is downloaded is corrupt.
Does anyone know how to solve this problem?
Setting the mode might be required to treat the file as binary data while saving it. If I leave that argument out, I get a blank file, but this way works for me:
url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?
attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile, mode="wb")
I am trying to download an nc file with R. It downloads well but I get this error when trying to open it:
Error in R_nc4_open: NetCDF: Unknown file format Error in
nc_open("SM_D2010323_Map_SATSSS_data_1day.nc") : Error in nc_open
trying to open file SM_D2010323_Map_SATSSS_data_1day.nc
(return_on_error= FALSE )
url <- "https://www.star.nesdis.noaa.gov/data/socd1/coastwatch/products/miras/nc/SM_D2010323_Map_SATSSS_data_1day.nc"
destfile <- "***/SM_D2010323_Map_SATSSS_data_1day.nc"
download.file(url, destfile)
nc_data <- nc_open('SM_D2010323_Map_SATSSS_data_1day.nc')
But when I use the same URL on my web browser, I can open the file without any problems with R.

R exdir does not exist error

I'm trying to download and extract a zip file using R. Whenever I do so I get the error message
Error in unzip(temp, list = TRUE) : 'exdir' does not exist
I'm using code based on the Stack Overflow question Using R to download zipped data file, extract, and import data
To give a simplified example:
# Create a temporary file
temp <- tempfile()
# Download ZIP archive into temporary file
download.file("http://cran.r-project.org/bin/windows/contrib/r-release/ggmap_2.2.zip",temp)
# ZIP is downloaded successfully:
# trying URL 'http://cran.r-project.org/bin/windows/contrib/r-release/ggmap_2.2.zip'
# Content type 'application/zip' length 4533970 bytes (4.3 Mb)
# opened URL
# downloaded 4.3 Mb
# Try to do something with the downloaded file
unzip(temp,list=TRUE)
# Error in unzip(temp, list = TRUE) : 'exdir' does not exist
What I've tried so far:
Accessing the temp file manually and unzipping it with 7zip: Can do this no problem, file is there and accessible.
Changing the temp directory to c:\temp. Again, the file is downloaded successfully, I can access it and unzip it with 7zip but R throws the exdir error message when it tries to access it.
R version 2.15.2
R-Studio version 0.97.306
Edit: The code works if I use unz instead of unzip but I haven't been able to figure out why one works and the other doesn't. From CRAN guidance:
unz reads (only) single files within zip files...
unzip extracts files from or list a zip archive
On a windows setup:
I had this error when I had exdir specified as a path. For me the solution was removing the trailing / or \\ in the path name.
Here's an example and it did create the new folder if it didn't already exist
locFile <- pathOfMyZipFile
outPath <- "Y:/Folders/MyFolder"
# OR
outPath <- "Y:\\Folders\\MyFolder"
unzip(locFile, exdir=outPath)
This can manifest another way, and the documentation doesn't make clear the cause. Your exdir cannot end in a "/", it must be just the name of the target folder.
For example, this was failing with 'exdir' does not exist:
unzip(temp, overwrite = F, exdir = "data_raw/system-data/")
And this worked fine:
unzip(temp, overwrite = F, exdir = "data_raw/system-data")
Presumably when unzip sees the "/" at the end of the exdir path it keeps looking; whereas omitting the "/" tells unzip "you've found it, unzip here".
A couple of years late but I still get this error when trying to use unzip(). It appears to be a bug because the man pages for unzip state if exdir is specified it will be created:
exdir The directory to extract files to (the equivalent of unzip -d).
It will be created if necessary.
A workaround I've been using is to manually create the necessary directory:
dir.create("directory")
unzip("file-to-unzip.zip", exdir = "directory/")
A pain, but it seems to work, at least for me.
I am using R3.2.1 on a Windows 7 machine.
The way I found to address this issue takes a few steps, but it works for me:
Create a vector that contains the name of the url from where you are downloading the file, e.g.
file_url <- "http://your.file.com/file_name.zip"
Use download.file to specify the url where you are downloading the file from (using your newly created vector), followed by the file name of the zipped file (that should be the last part of the url name). It will be saved as such in your working directory*, e.g.
download.file(file_url, "file_name.zip")
*If you are not sure of your working directory, you can use getwd() to check it. If you want to change your working directory, you can use setwd("C:users/username/...") to set it to what you want.
Use "unzip" to unzip the file into your working directory, with the name you will set using exdir, e.g.
unzip("file_name.zip", exdir = "file_name")
To check your work, you can use list.files, e.g.
list.files("file_name")
Hope this helps!

Resources