R shell.exec requires full file path

I installed R and RStudio this week on a new Windows 10 machine. I want to use this R code to launch Excel and open a CSV file that is in a subdirectory of the current working directory:
file <- "example.csv"
sub_dir <- "subdirectory"
shell.exec(file.path(sub_dir, file))
But I get this error:
Error in shell.exec(file.path(sub_dir, file)) :
'subdirectory/example.csv' not found
However, if I provide shell.exec with the full file path, this code works as expected:
shell.exec(file.path(getwd(), sub_dir, file))
The documentation for shell.exec states:
The path in file is interpreted relative to the current working
directory.
R versions 2.13.0 and earlier interpreted file relative to the R home
directory, so a complete path was usually needed.
Why doesn't my original code (without getwd) work? Thanks.

It looks to be related to the path separator in some odd way. Below, I specify the file path separator as \ and the command executes as expected. As another option, you could keep your call to file.path() and simply wrap it in normalizePath().
file <- "example.csv"
sub_dir <- "subdirectory"
dir.create(sub_dir)
writeLines("myfile",file.path(sub_dir, file))
# Works
shell.exec(file.path(sub_dir, file, fsep = "\\"))
shell.exec(file.path(sub_dir, file))
#> Error in shell.exec(file.path(sub_dir, file)): 'subdirectory/example.csv' not found
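The normalizePath() option mentioned above could look like the following. This is a minimal sketch, assuming the same example file and sub-directory names as in the question. The guard on .Platform$OS.type and interactive() is an addition made here, because shell.exec() only exists on Windows builds of R.

```r
file <- "example.csv"
sub_dir <- "subdirectory"

# Create the example file so the sketch is self-contained
dir.create(sub_dir, showWarnings = FALSE)
writeLines("myfile", file.path(sub_dir, file))

# normalizePath() turns the relative path into an absolute one; on Windows
# it uses backslashes as separators by default (winslash = "\\")
full_path <- normalizePath(file.path(sub_dir, file), mustWork = TRUE)

# shell.exec() is only available on Windows, so guard the call
if (.Platform$OS.type == "windows" && interactive()) {
  shell.exec(full_path)
}
```

Because the path is absolute, this should behave like the getwd() version in the question.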

Related

Downloading and unzipping GitHub zipped files directly in R

I am trying to download and unzip a folder of files from GitHub into R. I can manually download the file at https://github.com/dylangomes/SO/blob/main/Shape.zip and then extract all files in working directory, but I'd like to work directly from R.
utils::unzip("https://github.com/dylangomes/SO/blob/main/Shape.zip")
# Warning message:
# In utils::unzip("https://github.com/dylangomes/SO/blob/main/Shape.zip", :
# error 1 in extracting from zip file
It says it is a warning message, although nothing has been downloaded or unzipped into my wd.
I can download the file to my machine:
utils::download.file("https://github.com/dylangomes/SO/blob/main/Shape.zip", "Shape.zip")
But I get the same message with the unzip function:
utils::unzip("Shape.zip")
And the downloaded file cannot manually be extracted. Here, I get the error that the compressed folder is empty. The unzip line works on the manually downloaded .zip file, which tells me something is wrong with the download.file line.
So if I add raw=TRUE to the end (which can make a difference in downloading data from GitHub):
utils::download.file("https://github.com/dylangomes/SO/blob/main/Shape.zip?raw=TRUE","Shape.zip")
utils::unzip("Shape.zip")
I get a different warning with, similarly, nothing being executed:
Warning message:
In utils::unzip("Shape.zip") : internal error in 'unz' code
I have tried most of the answers at Using R to download zipped data file, extract, and import data, but they appear to be for single files that are zipped and aren't helping here. I've tried the answers at r function unzip error 1 in extracting from zip file, which mentions the same warning message I am getting, but none of the solutions work in this case.
Any idea of what I am doing wrong?
You need to use:
download.file(
"https://github.com/dylangomes/SO/blob/main/Shape.zip?raw=TRUE",
"Shape.zip",
mode = "wb"
)
Without the query string ?raw=TRUE you are downloading the webpage and not the file.
(For Windows) R will use mode = "wb" by default when it detects from the end of the URL that certain file formats, including .zip, are being downloaded. However, the URL finishing with a query string instead of a file format means the check fails so you need to set the mode explicitly.
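One way to tell whether a download produced a real archive or a web page is to look at the first bytes of the file: zip archives start with the two characters PK. The helper name looks_like_zip below is made up for this illustration, the two files are local stand-ins so that no network access is needed, and the check is a quick heuristic, not a full validation.

```r
# Hypothetical helper: TRUE if the file starts with the zip signature "PK"
looks_like_zip <- function(path) {
  first_bytes <- readBin(path, what = "raw", n = 2L)
  identical(first_bytes, charToRaw("PK"))
}

# Stand-in for a correctly downloaded archive: "PK" plus two header bytes
zip_like <- tempfile(fileext = ".zip")
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04)), zip_like)

# Stand-in for what arrives without ?raw=TRUE: an HTML page saved as .zip
html_like <- tempfile(fileext = ".zip")
writeLines("<!DOCTYPE html>", html_like)

looks_like_zip(zip_like)   # TRUE
looks_like_zip(html_like)  # FALSE
```

Running looks_like_zip("Shape.zip") after download.file() and before unzip() gives an early hint about which of the two cases you are in.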

Download a large zipped CSV file, unzip and read into R on Linux

I wish to read into my environment a large CSV (~ 8Gb) but I am having issues.
My data is a publicly available dataset:
# CREATE A TEMP FILE TO STORE THE DOWNLOADED DATA
temp <- tempfile()
# DOWNLOAD THE FILE FROM THE CMS
download.file("https://download.cms.gov/nppes/NPPES_Data_Dissemination_February_2022.zip",
destfile = temp)
This is where I'm running into difficulty: I am unfamiliar with Linux working directories and where temp folders are created.
When I use list.dirs() or list.files() I don't see any reference to this temp file.
I am working in an R project and my working directory is as follows:
getwd()
[1] "/home/myName/myProjectName"
I'm able to read in the first part of the file but my system crashes after about 4Gb.
# UNZIP THE NPI FILE
npi <- unz(temp, "npidata_pfile_20050523-20220213.csv")
I then came across this post, which has a function for decompressing large zip files using the system2 unzip functionality. However, due to my limited R knowledge and Linux experience, I couldn't get the function to point to the downloaded file in the temp folder.
Checking the path for temp above, I get the following path:
temp
[1] "/tmp/Rtmpl6SHIJ/file7e5e6c1fc693"
Using the system2 function from the link above I tried the following:
x <- decompress_file(directory = temp,
file = "NPPES_Data_Dissemination_February_2022.zip")
But get the following error about setting the working directory:
Any pointers on how I can get this file unzipped, given its size, and read it into memory would be much appreciated.
It might be a file permission issue. To get around it, work in a directory you're already in, or one you know you have access to.
# DOWNLOAD THE FILE
# to a directory you can access, and name the file. No need to overcomplicate this.
download.file("https://download.cms.gov/nppes/NPPES_Data_Dissemination_February_2022.zip",
destfile = "/home/myName/myProjectName/npi.zip")
# use the decompress function if you need to, though unzip might work
x <- decompress_file(directory = "/home/myName/myProjectName/",
file = "npi.zip")
# remove .zip file if you need the space back
file.remove("/home/myName/myProjectName/npi.zip")
temp is the path to the file, not just the directory. By default, tempfile does not add a file extension; one can be added with tempfile(fileext = ".zip").
Consequently, decompress_file cannot set the working directory to temp, because it is a file. Try this:
x <- decompress_file(directory = dirname(temp), file = basename(temp))
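To see how the pieces of a tempfile() path fit together, here is a small sketch. It only inspects the path and does not download anything; the variable names zip_dir and zip_name are chosen for this illustration.

```r
# tempfile() returns a full file path; fileext adds the extension
temp <- tempfile(fileext = ".zip")

zip_dir  <- dirname(temp)   # the directory part of the path
zip_name <- basename(temp)  # the random file name, ending in ".zip"

# The temporary directory exists, but the file itself is only created once
# something (for example download.file()) writes to it
dir.exists(zip_dir)
file.exists(temp)

# list.files() on the working directory will not show it; look here instead
list.files(zip_dir)
```

This is why list.files() in the project directory shows no trace of the download: the file lives under the session's temporary directory, not under getwd().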

Can't move file after download and unzip

I'm trying to download a zip file from a source, unzip it, and then move one of the files to another directory.
First the download:
if (!file.exists("inst/extdata/sp_resultados_universo")) {
tmp <- tempfile(fileext = ".zip")
download.file("ftp://ftp.ibge.gov.br/Censos/Censo_Demografico_2010/Resultados_do_Universo/Agregados_por_Setores_Censitarios/SP_Capital_20180416.zip", tmp, quiet = TRUE)
unzip(tmp, exdir = "inst/extdata/sp_resultados_universo", junkpaths=T)
unlink(tmp)
}
The file I want is in this directory: inst/extdata/sp_resultados_universo/SP Capital/Base informa�oes setores2010 universo SP_Capital (codificação inválida)/CSV/ ("codificação inválida" means "invalid encoding"), so when I try to move it to inst/extdata/sp_resultados_universo/ I get an error:
file.rename("inst/extdata/sp_resultados_universo/SP%20Capital/Base%20informa%87oes%20setores2010%20universo%20SP_Capital(condificação inválida)/CSV/Domicilio02_SP1.csv",
"inst/extdata/sp_resultados_universo/Domicilio02_SP1.csv")
Warning message:
In file.rename("inst/extdata/sp_resultados_universo/SP%20Capital/Base%20informa%87oes%20setores2010%20universo%20SP_Capital(condificação inválida)/CSV/Domicilio02_SP1.csv", :
it was not possible to rename file 'inst/extdata/sp_resultados_universo/SP%20Capital/Base%20informa%87oes%20setores2010%20universo%20SP_Capital(condificação inválida)/CSV/Domicilio02_SP1.csv'
for 'inst/extdata/sp_resultados_universo/Domicilio02_SP1.csv',
reason 'File or directory not found'
I'm translating the error message, so it may not match the English message exactly.
I can change the directory name or move the file manually, but that breaks the flow and is not good for reproducibility. How can I handle it inside R?
My system info:
Sys.info()
sysname
"Linux"
release
"4.9.0-6-amd64"
version
"#1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07)"
machine
"x86_64"
Many thanks in advance for any help.
When using R, you can interact with the Linux shell (or the Windows command line) through a call to system(), passing the quoted command just as you would type it in the shell.
For instance:
system("pwd") # prints the current working directory
system("date") # prints the current date and time
system("ls | grep '\\.R$'") # prints a list of R scripts in the current working directory
system("mv file.txt /home/new_directory/file.txt") # moves your file to another directory
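An alternative that stays inside R, offered here as a sketch and not as part of the answer above: search the extracted tree with list.files(), so the directory with the badly encoded name never has to be typed, then move the file with file.rename(). The directory layout below is a stand-in built in a temporary folder; only the file name Domicilio02_SP1.csv comes from the question.

```r
# Build a stand-in for the extracted archive in a temporary folder
root <- file.path(tempdir(), "sp_resultados_universo")
nested <- file.path(root, "SP Capital", "some sub folder", "CSV")
dir.create(nested, recursive = TRUE, showWarnings = FALSE)
writeLines("a;b", file.path(nested, "Domicilio02_SP1.csv"))

# Find the file by name wherever it sits below root
found <- list.files(root, pattern = "^Domicilio02_SP1\\.csv$",
                    recursive = TRUE, full.names = TRUE)

# Move it to the top of the extraction directory
target <- file.path(root, basename(found[1]))
moved <- file.rename(found[1], target)
```

file.rename() returns TRUE on success, so moved can be checked before the rest of a script relies on the new location.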

R exdir does not exist error

I'm trying to download and extract a zip file using R. Whenever I do so I get the error message
Error in unzip(temp, list = TRUE) : 'exdir' does not exist
I'm using code based on the Stack Overflow question "Using R to download zipped data file, extract, and import data".
To give a simplified example:
# Create a temporary file
temp <- tempfile()
# Download ZIP archive into temporary file
download.file("http://cran.r-project.org/bin/windows/contrib/r-release/ggmap_2.2.zip",temp)
# ZIP is downloaded successfully:
# trying URL 'http://cran.r-project.org/bin/windows/contrib/r-release/ggmap_2.2.zip'
# Content type 'application/zip' length 4533970 bytes (4.3 Mb)
# opened URL
# downloaded 4.3 Mb
# Try to do something with the downloaded file
unzip(temp,list=TRUE)
# Error in unzip(temp, list = TRUE) : 'exdir' does not exist
What I've tried so far:
Accessing the temp file manually and unzipping it with 7zip: Can do this no problem, file is there and accessible.
Changing the temp directory to c:\temp. Again, the file is downloaded successfully, I can access it and unzip it with 7zip but R throws the exdir error message when it tries to access it.
R version 2.15.2
R-Studio version 0.97.306
Edit: The code works if I use unz instead of unzip but I haven't been able to figure out why one works and the other doesn't. From CRAN guidance:
unz reads (only) single files within zip files...
unzip extracts files from or list a zip archive
On a windows setup:
I had this error when I had exdir specified as a path. For me the solution was removing the trailing / or \\ in the path name.
Here's an example and it did create the new folder if it didn't already exist
locFile <- pathOfMyZipFile
outPath <- "Y:/Folders/MyFolder"
# OR
outPath <- "Y:\\Folders\\MyFolder"
unzip(locFile, exdir=outPath)
This can manifest another way, and the documentation doesn't make the cause clear. Your exdir cannot end in a "/"; it must be just the name of the target folder.
For example, this was failing with 'exdir' does not exist:
unzip(temp, overwrite = F, exdir = "data_raw/system-data/")
And this worked fine:
unzip(temp, overwrite = F, exdir = "data_raw/system-data")
Presumably when unzip sees the "/" at the end of the exdir path it keeps looking; whereas omitting the "/" tells unzip "you've found it, unzip here".
A couple of years late but I still get this error when trying to use unzip(). It appears to be a bug because the man pages for unzip state if exdir is specified it will be created:
exdir The directory to extract files to (the equivalent of unzip -d).
It will be created if necessary.
A workaround I've been using is to manually create the necessary directory:
dir.create("directory")
unzip("file-to-unzip.zip", exdir = "directory/")
A pain, but it seems to work, at least for me.
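The two workarounds described above (dropping the trailing separator and creating the directory first) can be combined. This is a sketch under assumptions: strip_trailing_sep is a hypothetical helper name, the target folder is placed under tempdir() so the example leaves the working directory alone, and the unzip() call is commented out because no archive is downloaded here.

```r
# Hypothetical helper: remove any trailing "/" or "\" from a path
strip_trailing_sep <- function(path) {
  sub("[/\\\\]+$", "", path)
}

# Clean the target path, then make sure the folder exists
exdir <- strip_trailing_sep(file.path(tempdir(), "data_raw", "system-data/"))
dir.create(exdir, recursive = TRUE, showWarnings = FALSE)

# temp would be the path of the downloaded archive:
# unzip(temp, exdir = exdir)
```

Cleaning the path in one place means the rest of a script does not have to care which form of the path the caller supplied.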
I am using R 3.2.1 on a Windows 7 machine.
The way I found to address this issue takes a few steps, but it works for me:
1. Create a variable that holds the URL you are downloading the file from, e.g.
file_url <- "http://your.file.com/file_name.zip"
2. Call download.file with that URL, followed by the file name for the zipped file (normally the last part of the URL). The file is saved under that name in your working directory*, e.g.
download.file(file_url, "file_name.zip")
*If you are not sure of your working directory, you can use getwd() to check it. If you want to change it, you can use setwd("C:/users/username/...").
3. Use unzip to extract the file into your working directory, into the folder you name with exdir, e.g.
unzip("file_name.zip", exdir = "file_name")
4. To check your work, you can use list.files, e.g.
list.files("file_name")
Hope this helps!

Download a file into my working directory

I would like to download a file directly into my working directory
I can do this to a temp directory:
download.file("http://www.abc.com/abc.zip",temp)
but what do I have to replace temp with to get it to download to the working directory?
If your url is in a variable, you can use basename to get the "filename" part out of it:
u <- "http://www.abc.com/abc.zip"
basename(u) # "abc.zip"
# downloads to current directory:
download.file(u, basename(u))
# downloads to subdirectory "foo":
download.file(u, file.path("foo", basename(u)))
The second argument of download.file() is destfile and it must be specified. I don't have a Windows machine to test this on, but both of these work on my Linux box and I can't see why at least the second wouldn't work on Windows too:
download.file("http://www.abc.com/abc.zip", "./abc.zip")
download.file("http://www.abc.com/abc.zip", "abc.zip")
The second of those shows that if you just give a file name, the file will be downloaded to the current working directory and saved under that name.
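If the URL carries a query string, basename() alone would keep it in the file name. A small sketch of one way around that, assuming a hypothetical helper url_file_name; the download itself is commented out because www.abc.com is only a placeholder.

```r
# Hypothetical helper: file name part of a URL, ignoring any ?query or #fragment
url_file_name <- function(url) {
  basename(sub("[?#].*$", "", url))
}

u <- "http://www.abc.com/abc.zip"

# Full destination path inside the current working directory
dest <- file.path(getwd(), url_file_name(u))

# Not run here, since the URL is a placeholder:
# download.file(u, dest, mode = "wb")
```

Passing mode = "wb" explicitly is harmless on Linux and avoids corrupted binary downloads on Windows.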
