How to find correct executable with Sys.which on Windows - r

What are the workarounds on Windows to make it so Sys.which finds the proper executables? Two cases that are reoccuring problems:
convert.exe which is both a windows program and the ImageMagik program, but Sys.which only finds the windows one which is never wanted from R no matter how I seem to arrange things on my PATH.
tar.exe is packaged along with various things like git or mingw or whatever, and even when I have Rtools and Rbuildtools first in my path, the tar program from Rtools is never found, for example when installing a package from source.
So, I have resorted to writing a wrapper that calls 7-zip instead whenever I am on windows. This can't be the thing to do can it?
Edit
Actually just adding an environment variable to .Renviron: TAR=path/to/tar.exe works fine for the install.packages example, and I am having trouble remembering where else the tar.exe was biting me, but Josh answered the main one, convert.exe.

I asked a +/- identical question earlier this year over on R-devel. Among the replies was this one, by Henrik Bengtsson, who kindly provided the following useful function:
Sys.which2 <- function(cmd) {
stopifnot(length(cmd) == 1)
if (.Platform$OS.type == "windows") {
suppressWarnings({
pathname <- shell(sprintf("where %s 2> NUL", cmd), intern=TRUE)[1]
})
if (!is.na(pathname)) return(setNames(pathname, cmd))
}
Sys.which(cmd)
}
## Trying out Sys.which & Sys.which2 on my Windows box gives the following:
Sys.which("convert")
# convert
# "C:\\Windows\\system32\\convert.exe"
Sys.which2("convert")
# convert
# "C:\\Program Files\\ImageMagick-6.8.8-Q16\\convert.exe"
I'm really not sure why R-core don't just fix Sys.which() to make it actually portable, but they at least do document root cause of this behavior in ?system (whose functionality is afflicted by the same problem):
The search path for 'command' may be system-dependent: it will
include the R 'bin' directory, the working directory and the
Windows system directories before 'PATH'.

Because Sys.which() is vectorized–and since I think that is useful–I've modified the Henrik Bengtsson code in the following function sys_which(), which should be a more robust and more similar version of Sys.which():
## check if unix
is_unix <- function() grepl("unix", .Platform$OS.type, ignore.case = TRUE)
## robust version of Sys.which
sys_which <- function(x) {
if (is_unix()) {
return(Sys.which(x))
}
sys_win <- function(x) {
if (grepl("\\S", path <- Sys.which(x))) {
return(path)
}
path <- tryCatch(
suppressWarnings(system(sprintf("where %s", x), intern = TRUE)[1]),
warning = function(w) "",
error = function(e) "")
if (!grepl("\\S", path)) {
return(`names<-`("", x))
}
`names<-`(path, x)
}
vapply(x, sys_win, character(1))
}
This has the following advantages:
It's vectorized–sys_which() can handle a vector (one or more input values) e.g., sys_which(c("python", "python3"))
It's more error resistant–use of tryCatch(...) ensures any inputs that result in an error from the system call get passed on to the normal Sys.which() function.
Example:
> sys_which(c("python", "python2.7", "python3", "asdf"))
#> python python2.7 python3 asdf
#> "/usr/bin/python" "/usr/bin/python2.7" "/usr/local/bin/python3" ""

Related

Issue when creating zip file from directory

I have trouble with creating zip file from R. The same code worked perfectly at work with R version 3.4.2, 32 bit computer.
Now I am trying to run the same thing on R version 3.5.1, 64 bit computer, and the zip() command does not seem to work. What is going on?
zip(zipfile = "test.zip",files=list.files(getwd()))
#create zip from whole directory, on 1st machine it works, now nothing happens
I checked the source code for zip() and when I debug it, I found out that system2 command does nothing.
zip <- function (zipfile, files, flags = "-r9X", extras = "", zip = Sys.getenv("R_ZIPCMD",
"zip"))
{
if (missing(flags) && (!is.character(files) || !length(files)))
stop("'files' must a character vector specifying one or more filepaths")
args <- c(flags, shQuote(path.expand(zipfile)), shQuote(files),
extras)
if (.Platform$OS.type == "windows")
invisible(system2(zip, args))
else invisible(system2(zip, args))
}
# I run this manually when trying to debug, nothing happens;
system2(zip, args) ## zip is a parameter here, not a function
####
Browse[2]> zip
[1] "zip"
Browse[2]> args
[1] "-r9X" "\"bla.zip\""
[3] "\"[Content_Types].xml\"" "\"_rels\""
[5] "\"docProps\"" "\"xl\""
[7] ""
For example absurd call does not give an error.
system2("blablađ",2) ## does nothing but no error or warning either
I am stuck trying to understand how does system2() function works and what do I need to change to create a compressed folder.
Thanks
EDIT: After taking the account the help from comment, I got following error:
Browse[2]> system2(zip, args,stderr = T)
Error in system2(zip, args, stderr = T) : '"zip"' not found
SOLVED: After installing Rtools for version 3.5 it worked.
From the zip help:
zip(zipfile, files, flags = "-r9X", extras = "",
zip = Sys.getenv("R_ZIPCMD", "zip"))
zip A character string specifying the external command to be used.
As you can see, the zip function has an argument zip to specify the external command to be used. On my machine it is:
λ where zip
C:\Oracle\Ora11\BIN\zip.exe
C:\Program Files\Rtools\bin\zip.exe
The zip program is available in Rtools, but it is also available on any (Windows?) machine, usually.
To check whether zip is found by R, type:
> Sys.which("zip")
zip
"C:\\Oracle\\Ora11\\bin\\zip.exe"
If you get "", that means zip is not in the path, and if it is neither in the environment variable R_ZIPCMD, you have to specify its path in the zip argument.

How do you tell programmatically if you are running Architect/StatET?

Different IDEs have quirks, and so it's occasionally useful to be able to know what IDE you are using to run R.
You can test if you are running RStudio by testing for the RSTUDIO environment variable.
is_rstudio <- function()
{
env <- Sys.getenv("RSTUDIO")
!is.null(env) && env == "1"
}
(Or, as Hadley commented, gui <- .Platform$GUI; !is.null(gui) && gui == "RStudio".)
You can test for Revolution R by check for a list named Revo.version in the base environment.
is_revo_r <- function()
{
exists("Revo.version", "package:base", inherits = FALSE) && is.list(Revo.version)
}
Is there a similar check that can be done to see if you are running Architect or StatET?
The closest thing I've found is that by default Architect prepends the path to its embedded copy of Rtools to the PATH environment variable.
strsplit(Sys.getenv("PATH"), ";")[[1]][1]
## [1] "D:\\Program Files\\Architect\\plugins\\eu.openanalytics.architect.rtools.win32.win32_0.9.3.201307232256\\rtools\\bin"
It isn't clear to me how to make a reliable cross-platform test out of this. Can you find a better test?
I haven't found any really nice tests, but there are a couple more signs of tweaking by Architect.
Firstly, it loads a package named rj. We can test for this using
"package:rj" %in% search()
Secondly, it overrides the default graphics device (take a look at getOption("device")). This is an anonymous function, so we can't test by name, but I think the value of the name argument should distinguish it from other devices like windows or png.
device_name <- formals(getOption("device"))$name
!is.null(device_name) && device_name == "rj.gd"
Combining these two tests should be reasonably accurate for checking if you are running Architect.
is_architect <- function()
{
"package:rj" %in% search() &&
!is.null(device_name <- formals(getOption("device"))$name) &&
device_name == "rj.gd"
}

Read system TMP dir in R

What is a cross-platform method to find the OS temporary directory from within R? I currently use:
dirname(tempdir())
Which did the job both on Ubuntu and Windows from within an interactive R session. However, then it failed when called from inside RApache. In RApache the value of tempdir() is always /tmp, so dirname(tempdir()) results in /, which is obviously wrong. I also tried:
Sys.getenv("TMP")
Sys.getenv("TEMP")
Sys.getenv("TMPDIR")
as suggested by ?"environment variables" but none of these were set in Ubuntu. It also doesn't seem to be set in any of the files in /etc/R/* so I don't quite understand how R detects this value.
The environment variables "TMPDIR", "TMP", and "TEMP" can be used to modify the value returned by tempdir() if the C variable R_TempDir isn't set (although I'm not sure how that is done). If you want a cross-platform function that will return the path of a reasonable tmp directory, and aren't interested in the value of R_TempDir, you could use something like this:
gettmpdir <- function() {
tm <- Sys.getenv(c('TMPDIR', 'TMP', 'TEMP'))
d <- which(file.info(tm)$isdir & file.access(tm, 2) == 0)
if (length(d) > 0)
tm[[d[1]]]
else if (.Platform$OS.type == 'windows')
Sys.getenv('R_USER')
else
'/tmp'
}
This is based on the function InitTempDir in the file src/main/sysutils.c from the R source distribution, translated from C to R.

How to make "resident folder" to be the working directory? [duplicate]

Is there a way to programmatically find the path of an R script inside the script itself?
I am asking this because I have several scripts that use RGtk2 and load a GUI from a .glade file.
In these scripts I am obliged to put a setwd("path/to/the/script") instruction at the beginning, otherwise the .glade file (which is in the same directory) will not be found.
This is fine, but if I move the script in a different directory or to another computer I have to change the path. I know, it's not a big deal, but it would be nice to have something like:
setwd(getScriptPath())
So, does a similar function exist?
This works for me:
getSrcDirectory(function(x) {x})
This defines an anonymous function (that does nothing) inside the script, and then determines the source directory of that function, which is the directory where the script is.
For RStudio only:
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
This works when Running or Sourceing your file.
Use source("yourfile.R", chdir = T)
Exploit the implicit "--file" argument of Rscript
When calling the script using "Rscript" (Rscript doc) the full path of the script is given as a system parameter. The following function exploits this to extract the script directory:
getScriptPath <- function(){
cmd.args <- commandArgs()
m <- regexpr("(?<=^--file=).+", cmd.args, perl=TRUE)
script.dir <- dirname(regmatches(cmd.args, m))
if(length(script.dir) == 0) stop("can't determine script dir: please call the script with Rscript")
if(length(script.dir) > 1) stop("can't determine script dir: more than one '--file' argument detected")
return(script.dir)
}
If you wrap your code in a package, you can always query parts of the package directory.
Here is an example from the RGtk2 package:
> system.file("ui", "demo.ui", package="RGtk2")
[1] "C:/opt/R/library/RGtk2/ui/demo.ui"
>
You can do the same with a directory inst/glade/ in your sources which will become a directory glade/ in the installed package -- and system.file() will compute the path for you when installed, irrespective of the OS.
This answer works fine to me:
script.dir <- dirname(sys.frame(1)$ofile)
Note: script must be sourced in order to return correct path
I found it in: https://support.rstudio.com/hc/communities/public/questions/200895567-can-user-obtain-the-path-of-current-Project-s-directory-
But I still don´t understand what is sys.frame(1)$ofile. I didn´t find anything about that in R Documentation. Someone can explain it?
#' current script dir
#' #param
#' #return
#' #examples
#' works with source() or in RStudio Run selection
#' #export
z.csd <- function() {
# http://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
# must work with source()
if (!is.null(res <- .thisfile_source())) res
else if (!is.null(res <- .thisfile_rscript())) dirname(res)
# http://stackoverflow.com/a/35842176/2292993
# RStudio only, can work without source()
else dirname(rstudioapi::getActiveDocumentContext()$path)
}
# Helper functions
.thisfile_source <- function() {
for (i in -(1:sys.nframe())) {
if (identical(sys.function(i), base::source))
return (normalizePath(sys.frame(i)$ofile))
}
NULL
}
.thisfile_rscript <- function() {
cmdArgs <- commandArgs(trailingOnly = FALSE)
cmdArgsTrailing <- commandArgs(trailingOnly = TRUE)
cmdArgs <- cmdArgs[seq.int(from=1, length.out=length(cmdArgs) - length(cmdArgsTrailing))]
res <- gsub("^(?:--file=(.*)|.*)$", "\\1", cmdArgs)
# If multiple --file arguments are given, R uses the last one
res <- tail(res[res != ""], 1)
if (length(res) > 0)
return (res)
NULL
}
A lot of these solutions are several years old. While some may still work, there are good reasons against utilizing each of them (see linked source below). I have the best solution (also from source): use the here library.
Original example code:
library(ggplot2)
setwd("/Users/jenny/cuddly_broccoli/verbose_funicular/foofy/data")
df <- read.delim("raw_foofy_data.csv")
Revised code
library(ggplot2)
library(here)
df <- read.delim(here("data", "raw_foofy_data.csv"))
This solution is the most dynamic and robust because it works regardless of whether you are using the command line, RStudio, calling from an R script, etc. It is also extremely simple to use and is succinct.
Source: https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
I have found something that works for me.
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
How about using system and shell commands? With the windows one, I think when you open the script in RStudio it sets the current shell directory to the directory of the script. You might have to add cd C:\ e.g or whatever drive you want to search (e.g. shell('dir C:\\*file_name /s', intern = TRUE) - \\ to escape escape character). Will only work for uniquely named files unless you further specify subdirectories (for Linux I started searching from /). In any case, if you know how to find something in the shell, this provides a layout to find it within R and return the directory. Should work whether you are sourcing or running the script but I haven't fully explored the potential bugs.
#Get operating system
OS<-Sys.info()
win<-length(grep("Windows",OS))
lin<-length(grep("Linux",OS))
#Find path of data directory
#Linux Bash Commands
if(lin==1){
file_path<-system("find / -name 'file_name'", intern = TRUE)
data_directory<-gsub('/file_name',"",file_path)
}
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name /s', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Directory of ","",file_path)
filepath<-gsub("\\\\","/",file_path)
data_directory<-file_path
}
#Change working directory to location of data and sources
setwd(data_directory)
Thank you for the function, though I had to adjust it a Little as following for me (W10):
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Verzeichnis von ","",file_path)
file_path<-chartr("\\","/",file_path)
data_directory<-file_path
}
In my case, I needed a way to copy the executing file to back up the original script together with its outputs. This is relatively important in research. What worked for me while running my script on the command line, was a mixure of other solutions presented here, that looks like this:
library(scriptName)
file_dir <- gsub("\\", "/", fileSnapshot()$path, fixed=TRUE)
file.copy(from = file.path(file_dir, scriptName::current_filename()) ,
to = file.path(new_dir, scriptName::current_filename()))
Alternatively, one can add to the file name the date and our to help in distinguishing that file from the source like this:
file.copy(from = file.path(current_dir, current_filename()) ,
to = file.path(new_dir, subDir, paste0(current_filename(),"_", Sys.time(), ".R")))
None of the solutions given so far work in all circumstances. Worse, many solutions use setwd, and thus break code that expects the working directory to be, well, the working directory — i.e. the code that the user of the code chose (I realise that the question asks about setwd() but this doesn’t change the fact that this is generally a bad idea).
R simply has no built-in way to determine the path of the currently running piece of code.
A clean solution requires a systematic way of managing non-package code. That’s what ‘box’ does. With ‘box’, the directory relative to the currently executing code can be found trivially:
box::file()
However, that isn’t the purpose of ‘box’; it’s just a side-effect of what it actually does: it implements a proper, modern module system for R. This includes organising code in (nested) modules, and hence the ability to load code from modules relative to the currently running code.
To load code with ‘box’ you wouldn’t use e.g. source(file.path(box::file(), 'foo.r')). Instead, you’d use
box::use(./foo)
However, box::file() is still useful for locating data (i.e. OP’s use-case). So, for instance, to locate a file mygui.glade from the current module’s path, you would write.
glade_path = box::file('mygui.glade')
And (as long as you’re using ‘box’ modules) this always works, doesn’t require any hacks, and doesn’t use setwd.

Getting path of an R script

Is there a way to programmatically find the path of an R script inside the script itself?
I am asking this because I have several scripts that use RGtk2 and load a GUI from a .glade file.
In these scripts I am obliged to put a setwd("path/to/the/script") instruction at the beginning, otherwise the .glade file (which is in the same directory) will not be found.
This is fine, but if I move the script in a different directory or to another computer I have to change the path. I know, it's not a big deal, but it would be nice to have something like:
setwd(getScriptPath())
So, does a similar function exist?
This works for me:
getSrcDirectory(function(x) {x})
This defines an anonymous function (that does nothing) inside the script, and then determines the source directory of that function, which is the directory where the script is.
For RStudio only:
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
This works when Running or Sourceing your file.
Use source("yourfile.R", chdir = T)
Exploit the implicit "--file" argument of Rscript
When calling the script using "Rscript" (Rscript doc) the full path of the script is given as a system parameter. The following function exploits this to extract the script directory:
getScriptPath <- function(){
cmd.args <- commandArgs()
m <- regexpr("(?<=^--file=).+", cmd.args, perl=TRUE)
script.dir <- dirname(regmatches(cmd.args, m))
if(length(script.dir) == 0) stop("can't determine script dir: please call the script with Rscript")
if(length(script.dir) > 1) stop("can't determine script dir: more than one '--file' argument detected")
return(script.dir)
}
If you wrap your code in a package, you can always query parts of the package directory.
Here is an example from the RGtk2 package:
> system.file("ui", "demo.ui", package="RGtk2")
[1] "C:/opt/R/library/RGtk2/ui/demo.ui"
>
You can do the same with a directory inst/glade/ in your sources which will become a directory glade/ in the installed package -- and system.file() will compute the path for you when installed, irrespective of the OS.
This answer works fine to me:
script.dir <- dirname(sys.frame(1)$ofile)
Note: script must be sourced in order to return correct path
I found it in: https://support.rstudio.com/hc/communities/public/questions/200895567-can-user-obtain-the-path-of-current-Project-s-directory-
But I still don´t understand what is sys.frame(1)$ofile. I didn´t find anything about that in R Documentation. Someone can explain it?
#' current script dir
#' #param
#' #return
#' #examples
#' works with source() or in RStudio Run selection
#' #export
z.csd <- function() {
# http://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
# must work with source()
if (!is.null(res <- .thisfile_source())) res
else if (!is.null(res <- .thisfile_rscript())) dirname(res)
# http://stackoverflow.com/a/35842176/2292993
# RStudio only, can work without source()
else dirname(rstudioapi::getActiveDocumentContext()$path)
}
# Helper functions
.thisfile_source <- function() {
for (i in -(1:sys.nframe())) {
if (identical(sys.function(i), base::source))
return (normalizePath(sys.frame(i)$ofile))
}
NULL
}
.thisfile_rscript <- function() {
cmdArgs <- commandArgs(trailingOnly = FALSE)
cmdArgsTrailing <- commandArgs(trailingOnly = TRUE)
cmdArgs <- cmdArgs[seq.int(from=1, length.out=length(cmdArgs) - length(cmdArgsTrailing))]
res <- gsub("^(?:--file=(.*)|.*)$", "\\1", cmdArgs)
# If multiple --file arguments are given, R uses the last one
res <- tail(res[res != ""], 1)
if (length(res) > 0)
return (res)
NULL
}
A lot of these solutions are several years old. While some may still work, there are good reasons against utilizing each of them (see linked source below). I have the best solution (also from source): use the here library.
Original example code:
library(ggplot2)
setwd("/Users/jenny/cuddly_broccoli/verbose_funicular/foofy/data")
df <- read.delim("raw_foofy_data.csv")
Revised code
library(ggplot2)
library(here)
df <- read.delim(here("data", "raw_foofy_data.csv"))
This solution is the most dynamic and robust because it works regardless of whether you are using the command line, RStudio, calling from an R script, etc. It is also extremely simple to use and is succinct.
Source: https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
I have found something that works for me.
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
How about using system and shell commands? With the windows one, I think when you open the script in RStudio it sets the current shell directory to the directory of the script. You might have to add cd C:\ e.g or whatever drive you want to search (e.g. shell('dir C:\\*file_name /s', intern = TRUE) - \\ to escape escape character). Will only work for uniquely named files unless you further specify subdirectories (for Linux I started searching from /). In any case, if you know how to find something in the shell, this provides a layout to find it within R and return the directory. Should work whether you are sourcing or running the script but I haven't fully explored the potential bugs.
#Get operating system
OS<-Sys.info()
win<-length(grep("Windows",OS))
lin<-length(grep("Linux",OS))
#Find path of data directory
#Linux Bash Commands
if(lin==1){
file_path<-system("find / -name 'file_name'", intern = TRUE)
data_directory<-gsub('/file_name',"",file_path)
}
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name /s', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Directory of ","",file_path)
filepath<-gsub("\\\\","/",file_path)
data_directory<-file_path
}
#Change working directory to location of data and sources
setwd(data_directory)
Thank you for the function, though I had to adjust it a Little as following for me (W10):
#Windows Command Prompt Commands
if(win==1){
file_path<-shell('dir file_name', intern = TRUE)
file_path<-file_path[4]
file_path<-gsub(" Verzeichnis von ","",file_path)
file_path<-chartr("\\","/",file_path)
data_directory<-file_path
}
In my case, I needed a way to copy the executing file to back up the original script together with its outputs. This is relatively important in research. What worked for me while running my script on the command line, was a mixure of other solutions presented here, that looks like this:
library(scriptName)
file_dir <- gsub("\\", "/", fileSnapshot()$path, fixed=TRUE)
file.copy(from = file.path(file_dir, scriptName::current_filename()) ,
to = file.path(new_dir, scriptName::current_filename()))
Alternatively, one can add to the file name the date and our to help in distinguishing that file from the source like this:
file.copy(from = file.path(current_dir, current_filename()) ,
to = file.path(new_dir, subDir, paste0(current_filename(),"_", Sys.time(), ".R")))
None of the solutions given so far work in all circumstances. Worse, many solutions use setwd, and thus break code that expects the working directory to be, well, the working directory — i.e. the code that the user of the code chose (I realise that the question asks about setwd() but this doesn’t change the fact that this is generally a bad idea).
R simply has no built-in way to determine the path of the currently running piece of code.
A clean solution requires a systematic way of managing non-package code. That’s what ‘box’ does. With ‘box’, the directory relative to the currently executing code can be found trivially:
box::file()
However, that isn’t the purpose of ‘box’; it’s just a side-effect of what it actually does: it implements a proper, modern module system for R. This includes organising code in (nested) modules, and hence the ability to load code from modules relative to the currently running code.
To load code with ‘box’ you wouldn’t use e.g. source(file.path(box::file(), 'foo.r')). Instead, you’d use
box::use(./foo)
However, box::file() is still useful for locating data (i.e. OP’s use-case). So, for instance, to locate a file mygui.glade from the current module’s path, you would write.
glade_path = box::file('mygui.glade')
And (as long as you’re using ‘box’ modules) this always works, doesn’t require any hacks, and doesn’t use setwd.

Resources