Get the path of current script - r

I would like to set the working directory to the path of the current script programmatically, but first I need to get the path of the current script.
So I would like to be able to do:
current_path = ...retrieve the path of current script ...
setwd(current_path)
Just like the RStudio menu does:
So far I tried:
initial.options <- commandArgs(trailingOnly = FALSE)
file.arg.name <- "--file="
script.name <- sub(file.arg.name, "", initial.options[grep(file.arg.name, initial.options)])
script.basename <- dirname(script.name)
script.name returns NULL
source("script.R", chdir = TRUE)
Returns:
Error in file(filename, "r", encoding = encoding) : cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) : cannot open file '/script.R': No such file or directory
dirname(parent.frame(2)$ofile)
Returns: Error in dirname(parent.frame(2)$ofile) : a character vector argument expected
...because parent.frame is null
frame_files <- lapply(sys.frames(), function(x) x$ofile)
frame_files <- Filter(Negate(is.null), frame_files)
PATH <- dirname(frame_files[[length(frame_files)]])
Returns: Null because frame_files is a list of 0
thisFile <- function() {
  cmdArgs <- commandArgs(trailingOnly = FALSE)
  needle <- "--file="
  match <- grep(needle, cmdArgs)
  if (length(match) > 0) {
    # Rscript
    return(normalizePath(sub(needle, "", cmdArgs[match])))
  } else {
    # 'source'd via R console
    return(normalizePath(sys.frames()[[1]]$ofile))
  }
}
Returns: Error in path.expand(path) : invalid 'path' argument
Also I saw all answers from here, here, here and here.
No joy.
Working with RStudio 1.1.383
EDIT: It would be great if there was no need for an external library to achieve this.

In RStudio, you can get the path to the file currently shown in the source pane using
rstudioapi::getSourceEditorContext()$path
If you only want the directory, use
dirname(rstudioapi::getSourceEditorContext()$path)
If you want the name of the file that's been run by source(filename), that's a little harder. You need to look for the variable srcfile somewhere back in the stack. How far back depends on how you write things, but it's around 4 steps back: for example,
fi <- tempfile()
writeLines("f()", fi)
f <- function() print(sys.frame(-4)$srcfile)
source(fi)
fi
should print the same thing on the last two lines.
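Because the number of steps back depends on how the code is called, a variant that scans every frame can be less fragile. The sketch below is an assumption-laden illustration, not part of the answer above: it relies on source() keeping the path it was given in a local variable named ofile (the same variable several snippets in the question look for), and the helper name sourced_file is made up here.

```r
# Return the path of the innermost file currently being source()'d,
# or NULL when no source() call is on the stack.
# Assumption: source() stores its 'file' argument in a local variable 'ofile'.
sourced_file <- function() {
  # Walk from the innermost frame outwards
  for (i in rev(seq_len(sys.nframe()))) {
    candidate <- sys.frame(i)$ofile
    # 'ofile' can also be a connection; only accept character paths
    if (is.character(candidate)) {
      return(normalizePath(candidate))
    }
  }
  NULL
}

# Demo: a temporary script records where it was sourced from
fi <- tempfile(fileext = ".R")
writeLines("detected <- sourced_file()", fi)
source(fi)
print(detected)
print(normalizePath(fi))
```

The last two lines should print the same path. This only covers scripts run with source(); it does not help for code run line by line in an editor.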

Update March 2019
Based on the answers by Alexis Lucattini and user2554330, this version works both on the command line and in RStudio. It also avoids the "as_tibble" deprecation message.
library(tidyverse)
getCurrentFileLocation <- function()
{
    this_file <- commandArgs() %>%
        tibble::enframe(name = NULL) %>%
        tidyr::separate(col = value, into = c("key", "value"), sep = "=", fill = "right") %>%
        dplyr::filter(key == "--file") %>%
        dplyr::pull(value)
    if (length(this_file) == 0)
    {
        this_file <- rstudioapi::getSourceEditorContext()$path
    }
    return(dirname(this_file))
}

TLDR: The here package (available on CRAN) helps you build a path from a project's root directory. R projects configured with here() can be shared with colleagues working on different laptops or servers and paths built relative to the project's root directory will still work. The development version is at github.com/r-lib/here.
With git
You certainly store your R code in a directory. This directory is probably part of a git repository and/or an RStudio project. I would recommend building all paths relative to that project's root directory. For example, let's say that you have an R script that creates reusable plotting functions and an R Markdown notebook that loads that script and plots graphs in a nice (so nice) document. The project tree would look something like this:
├── notebooks
│   ├── analysis.Rmd
├── R
│   ├── prepare_data.R
│   ├── prepare_figures.R
From the analysis.Rmd notebook, you would import the plotting functions with here() like this:
source(file.path(here::here("R"), "prepare_figures.R"))
Why?
Hadley Wickham in a Stack Overflow comment:
"You should never use setwd() in R code - it basically defeats the idea of
using a working directory because you can no longer easily move your code
between computers. – hadley Nov 20 '10 at 23:44 "
From the Ode to the here package:
Do you:
Have setwd() in your scripts? PLEASE STOP DOING THAT.
This makes your script very fragile, hard-wired to exactly one time and place. As soon as you rename or move directories, it breaks. Or maybe you get a new computer? Or maybe someone else needs to run your code?
[...]
Classic problem presentation: Awkwardness around building paths and/or setting working directory in projects with subdirectories. Especially if you use R Markdown and knitr, which trips up alot of people with its default behavior of “working directory = directory where this file lives”. [...]
Install the here package:
install.packages("here")
library(here)
here()
here("construct","a","path")
Documentation of the here() function:
Starting with the current working directory during package load time,
here will walk the directory hierarchy upwards until it finds
a directory that satisfies at least one of the following conditions:
contains a file matching [.]Rproj$ with contents matching ^Version: in
the first line
[... other options ...]
contains a directory .git
Once established, the root directory doesn't change during the active
R session. here() then appends the arguments to the root directory.
The development version of the here package is available on github.
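The upward search described in the documentation excerpt can be illustrated in base R. This is a simplified sketch, not the here package's actual implementation: it only checks two of the listed conditions (an .Rproj file, without inspecting its contents, or a .git directory), and the name find_root is hypothetical.

```r
# Walk up from 'start' until a directory looks like a project root.
# Simplified criteria (assumption): any *.Rproj file or a .git directory.
find_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/")
  repeat {
    has_rproj <- length(list.files(dir, pattern = "[.]Rproj$")) > 0
    has_git <- dir.exists(file.path(dir, ".git"))
    if (has_rproj || has_git) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      # Reached the filesystem root without finding a marker
      stop("No project root found above ", start, call. = FALSE)
    }
    dir <- parent
  }
}

# Demo with a throwaway directory tree: root/.git and root/a/b
root <- file.path(tempdir(), "find_root_demo")
nested <- file.path(root, "a", "b")
dir.create(nested, recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, ".git"), showWarnings = FALSE)
found <- find_root(nested)
print(found)
```

Starting from the nested directory, the search should stop at the demo root because that is the first ancestor containing a .git directory.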
What about files outside the project directory?
If you are loading or sourcing files outside the project directory, the recommended way is to use an environment variable at the Operating System level. Other users of your R code on different laptops or servers would need to set the same environment variable. The advantage is that it is portable.
data_path <- Sys.getenv("PROJECT_DATA")
df <- read.csv(file.path(data_path, "file_name.csv"))
Note: There is a long list of environmental variables which can affect an R session.
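If the variable is missing on a colleague's machine, Sys.getenv() silently returns an empty string and the error only shows up later as a confusing "file not found". A small wrapper can fail early instead. This is a minimal sketch; the helper name get_data_path is hypothetical, and PROJECT_DATA is the variable name used in the example above.

```r
# Read a required environment variable, stopping with a clear message
# when it is unset or empty.
get_data_path <- function(var = "PROJECT_DATA") {
  value <- Sys.getenv(var, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  value
}

# Demo: set the variable for this session only, then read it back
Sys.setenv(PROJECT_DATA = tempdir())
data_path <- get_data_path()
print(file.path(data_path, "file_name.csv"))
```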
What about many projects sourcing each other?
It's time to create an R package.

If you're running a script with Rscript from the command line, e.g.
Rscript /path/to/script.R
the function below will assign /path/to/script.R to this_file:
library(tidyverse)
get_this_file <- function() {
  commandArgs() %>%
    tibble::enframe(name = NULL) %>%
    tidyr::separate(
      col = value, into = c("key", "value"), sep = "=", fill = "right"
    ) %>%
    dplyr::filter(key == "--file") %>%
    dplyr::pull(value)
}
this_file <- get_this_file()
print(this_file)
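The same extraction can be done without loading tidyverse. The sketch below is a base R variant; the function name script_from_args and the example argument vector are made up for illustration, and the argument vector is passed in explicitly so the function can be tried outside Rscript.

```r
# Extract the value of the --file= argument from a vector of command-line
# arguments; returns NULL when the argument is absent.
script_from_args <- function(args = commandArgs(trailingOnly = FALSE)) {
  prefix <- "--file="
  hits <- args[startsWith(args, prefix)]
  if (length(hits) == 0) {
    return(NULL)
  }
  # Strip only the prefix, so an '=' inside the path is left untouched
  substring(hits[1], nchar(prefix) + 1)
}

# Demo with a hand-written argument vector (illustrative values)
example_args <- c("R", "--no-restore", "--file=/path/to/script.R")
print(script_from_args(example_args))
```

Stripping the prefix instead of splitting on "=" means a path that itself contains "=" is returned whole.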

Here is a custom function to obtain the path of a file in R, RStudio, or from an Rscript:
stub <- function() {}
thisPath <- function() {
  cmdArgs <- commandArgs(trailingOnly = FALSE)
  if (length(grep("^-f$", cmdArgs)) > 0) {
    # R console option
    normalizePath(dirname(cmdArgs[grep("^-f$", cmdArgs) + 1]))[1]
  } else if (length(grep("^--file=", cmdArgs)) > 0) {
    # Rscript/R console option
    normalizePath(dirname(sub("^--file=", "", cmdArgs[grep("^--file=", cmdArgs)])))[1]
  } else if (Sys.getenv("RSTUDIO") == "1") {
    # RStudio
    dirname(rstudioapi::getSourceEditorContext()$path)
  } else if (!is.null(attr(stub, "srcref"))) {
    # 'source'd via R console
    dirname(normalizePath(attr(attr(stub, "srcref"), "srcfile")$filename))
  } else {
    stop("Cannot find file path")
  }
}
https://gist.github.com/jasonsychau/ff6bc78a33bf3fd1c6bd4fa78bbf42e7

Another option for getting the current script path is funr::get_script_path(), which does not require running your script from RStudio.

I had trouble with all of these because they rely on libraries that I couldn't use (because of packrat) until after setting the working directory (which was why I needed to get the path to begin with).
So, here's an approach that just uses base R. (EDITED to handle windows \ characters in addition to / in paths)
args = commandArgs()
scriptName = args[substr(args,1,7) == '--file=']
if (length(scriptName) == 0) {
  scriptName <- rstudioapi::getSourceEditorContext()$path
} else {
  scriptName <- substr(scriptName, 8, nchar(scriptName))
}
pathName = substr(
  scriptName,
  1,
  nchar(scriptName) - nchar(strsplit(scriptName, '.*[/|\\]')[[1]][2])
)

If you don't want to use (or have to remember) code, simply hover over the script and the path will appear

The following solves the problem for three cases: RStudio source Button, RStudio R console (source(...), if the file is still in the Source pane) or the OS console via Rscript:
this_file = gsub("--file=", "", commandArgs()[grepl("--file", commandArgs())])
if (length(this_file) > 0) {
  wd <- paste(head(strsplit(this_file, '[/|\\]')[[1]], -1), collapse = .Platform$file.sep)
} else {
  wd <- dirname(rstudioapi::getSourceEditorContext()$path)
}
print(wd)

The following code sets the working directory to the directory of the active script when run from RStudio; when run from the command line with the Rscript command, it keeps the current working directory:
# Install rstudioapi if it is missing, before calling any of its functions
if (!requireNamespace("rstudioapi", quietly = TRUE)) {
  install.packages("rstudioapi")
}
if (rstudioapi::isAvailable()) {
  wdir <- dirname(rstudioapi::getActiveDocumentContext()$path)
} else {
  # Outside RStudio: fall back to the current working directory
  wdir <- getwd()
}
setwd(wdir)

Related

Is it possible to stop `Rscript` cleaning up its `tempdir`?

I'm using R, via Rscript and H2O, but H2O is crashing. I want to review the logs, but the R tempdir that contains them seems to be removed when the R session ends (i.e. when the Rscript finishes).
Is it possible to tell R/Rscript not to remove the tmp folder it uses?
A work around for this would be to use on.exit to get the temporary files and save them in a different directory. An example function would be like this:
ranfunction <- function() {
  # Directory that receives the copies
  dir1 <- file.path("G:", "testdir")
  on.exit({
    # Get the list of files in tempdir
    templist <- list.files(tempdir(), full.names = TRUE, pattern = "^file")
    # Create the new directory for the files to go to on exit
    dir.create(dir1)
    # Copy each file in templist to the new directory
    file.copy(templist, dir1)
  }, add = TRUE)
}
ranfunction()
One thing this function does not take into account is that if you rerun it, dir.create() will emit a warning because the new directory dir1 already exists. You would have to delete dir1 before re-running the script, or pass showWarnings = FALSE to dir.create().

Use the filename or filepath in R programs

Does anyone know if it's possible to derive the filename/filepath of an R program? I'm looking for something similar to "%sysfunc(GetOption(SYSIN))" in SAS which will return the filepath of a SAS program (running in batch mode). Can I do anything similar in R?
The best I've been able to come up with so far is to add the filename and current directory using shortcut keys in the text editor I use (PSPad). Is there an easier way to do this?
Here's my example:
progname<-"Iris data listing"
# You must use either double-backslashes or forward slashes in pathnames
progdir<-"F:\\R Programming\\Word output\\"
# Set the working directory to the program location
setwd(progdir)
# Make the ReporteRs package available for creating Word output
library(ReporteRs)
# Load the "Iris" provided with R
data("iris")
options('ReporteRs-fontsize'=8, 'ReporteRs-default-font'='Arial')
# Initialize the Word output object
doc <- docx()
# Add a title
doc <- addTitle(doc,"A sample listing",level=1)
# Create a nicely formatted listing, style similar to Journal
listing<-vanilla.table(iris)
# Add the listing to the Word output
doc <- addFlexTable(doc, listing)
# Create the Word output file
writeDoc( doc, file = paste0(progdir,progname,".docx"))
This works fairly well, both in batch and in RStudio. I'd really appreciate a better solution though
The link to "Rscript: Determine path of the executing script" provided by @Juan Bosco contained most of the information I needed. One problem it didn't address was running an R program in RStudio (sourcing in RStudio was discussed and solved). I found that this problem could be dealt with using rstudioapi::getActiveDocumentContext()$path.
It's also noteworthy that the solutions for batch mode won't work using
Rterm.exe --no-restore --no-save < %1 > %1.out 2>&1
The solutions require that the --file= option be used, e.g.
D:\R\R-3.3.2\bin\x64\Rterm.exe --no-restore --no-save --file="%~1.R" > "%~1.out" 2>&1 R_LIBS=D:/R/library
Here's a new version of the get_script_path function posted by @aprstar. This has been modified to also work in RStudio (note that it requires the rstudioapi library).
# Based on "get_script_path" function by aprstar, Aug 14 '15 at 18:46
# https://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script
# That solution didn't work for programs executed directly in RStudio
# Requires the rstudioapi package
# Assumes programs executed in batch have used the "--file=" option
GetProgramPath <- function() {
  cmdArgs = commandArgs(trailingOnly = FALSE)
  needle = "--file="
  match = grep(needle, cmdArgs)
  if (cmdArgs[1] == "RStudio") {
    # An interactive session in RStudio
    # Requires rstudioapi::getActiveDocumentContext
    return(normalizePath(rstudioapi::getActiveDocumentContext()$path))
  } else if (length(match) > 0) {
    # Batch mode using Rscript or rterm.exe with the "--file=" option
    return(normalizePath(sub(needle, "", cmdArgs[match])))
  } else {
    ls_vars = ls(sys.frames()[[1]])
    if ("fileName" %in% ls_vars) {
      # Source'd via RStudio
      return(normalizePath(sys.frames()[[1]]$fileName))
    } else {
      # Source'd via R console
      return(normalizePath(sys.frames()[[1]]$ofile))
    }
  }
}
I placed this in my .Rprofile file. Now I can get the file information in either batch mode or in RStudio using the following code. I haven't tried it using source() but that should work too.
# "GetProgramPath()" returns the full path name of the file being executed
progpath<-GetProgramPath()
# Get the filename without the ".R" extension
progname<-tools::file_path_sans_ext(basename(progpath))
# Get the file directory
progdir<-dirname(progpath)
# Set the working directory to the program location
setwd(progdir)

How to find file path to load table?

I downloaded a .csv file and saved it on my desktop. Now, to work with it, I am supposed to use the read.table() or read.csv() functions to load the file into R. How do I find the file path for input into a line like this:
yy_2 <- read.csv(file =....., header = TRUE, stringsAsFactors = FALSE)
I use a MacBook Pro, if that helps.
On MacOS, this is most likely to be
fdir <- file.path("~/Desktop")
(~ is Unix shorthand for your home directory.) You can try list.files(fdir) to see if the files are there. Alternatively, you could try file.choose() as suggested in the comments above, although that can only select a file, not a directory; this seems to be a long-standing gap in R (see e.g. this mailing list post from 2012, which suggests dirname(file.choose()) or this function):
choose.dir <- function() {
  system("osascript -e 'tell app \"R\" to POSIX path of (choose folder with prompt \"Choose Folder:\")' > /tmp/R_folder",
         intern = FALSE, ignore.stderr = TRUE)
  p <- system("cat /tmp/R_folder && rm -f /tmp/R_folder", intern = TRUE)
  return(ifelse(length(p), p, NA))
}
which appears to crash RStudio (!) but works in the R console on MacOS for me ...
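Putting the pieces together for the original question, the full path can be built with file.path() and checked before reading. This is only a sketch: the file name my_data.csv is a placeholder for whatever the downloaded file is actually called.

```r
# Build the path to a file on the Desktop; path.expand() turns "~" into
# the home directory. "my_data.csv" is a hypothetical file name.
csv_path <- file.path(path.expand("~"), "Desktop", "my_data.csv")

if (file.exists(csv_path)) {
  yy_2 <- read.csv(file = csv_path, header = TRUE, stringsAsFactors = FALSE)
} else {
  message("File not found: ", csv_path)
}
```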

How do I get the absolute path of an input file in R

I am using Rscript to plot some figures from a given CSV file in some directory, which is not necessarily my current working directory. I can call it as follows:
./script.r ../some_directory/inputfile.csv
Now I want to output my figures in the same directory (../some_directory), but I have no idea how to do that. I tried to get the absolute path for the input file because from this I could construct the output path, but I couldn't find out how to do that.
normalizePath() #Converts file paths to canonical user-understandable form
or
library(tools)
file_path_as_absolute()
The question is very old but it still misses a working solution. So here is my answer:
Use normalizePath(dirname(f)).
The example below lists all the files and directories in the current directory.
dir <- "."
allFiles <- list.files(dir)
for(f in allFiles){
  print(paste(normalizePath(dirname(f)), .Platform$file.sep, f, sep = ""))
}
Where:
normalizePath(dirname(f)) gives the absolute path of the parent directory. So the individual file names should be added to the path.
.Platform is used to have an OS-portable code. (here)
file.sep gives "the file separator used on your platform: "/" on both Unix-alikes and on Windows (but not on the former port to Classic Mac OS)." (here)
Warning: This may cause some problems if not used with caution. For instance, say this is the path: A/B/a_file and the working directory is set to A. Then the code below:
dir <- "B"
allFiles <- list.files(dir)
for(f in allFiles){
  print(paste(normalizePath(dirname(f)), .Platform$file.sep, f, sep = ""))
}
would give:
> A/a_file
however, it should be:
> A/B/a_file
Here is the solution:
args = commandArgs(TRUE)
results_file = args[1]
output_path = dirname(normalizePath(results_file))
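To go from that directory to an actual output file, the directory can be combined with a file name using file.path(). The sketch below wraps this in a helper; the name output_path_for and the output name figure.pdf are illustrative, and a temporary CSV stands in for the input file passed on the command line.

```r
# Build the path of an output file that lives next to the input file.
output_path_for <- function(input_file, output_name) {
  file.path(dirname(normalizePath(input_file)), output_name)
}

# Demo: a temporary CSV plays the role of args[1]
input_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), input_file, row.names = FALSE)
figure_path <- output_path_for(input_file, "figure.pdf")
print(figure_path)
```

Note that normalizePath() is applied to the input file before dirname(), so a relative argument such as ../some_directory/inputfile.csv is resolved against the working directory at the time of the call.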
To get the absolute path(s) from file(s)
Why not combine the base R function file.path() with the answer that @Marius gave? This appears marginally simpler, will work with a vector of files (files), and takes care of system-specific separators:
file.path(normalizePath(dirname(files)), basename(files))
And wrapped inside a function (abspath):
abspath <- function(files) file.path(normalizePath(dirname(files)), basename(files))
For instance:
> setwd("~/test")
> list.files()
[1] "file1.txt" "file2.txt"
And then:
> abspath(files)
[1] "/home/myself/test/file1.txt" "/home/myself/test/file2.txt"
I see that people gave pieces of the solution, but not all of it.
I have used this:
outputFile = paste(normalizePath(dirname(inputFile)),"\\", "my_file.ext", sep = "")
Hope it helps.
fs::path_abs() is my preferred way. It avoids the backslashes of normalizePath().

R: sourcing files using a relative path

Sourcing files using a relative path is useful when dealing with large codebases. Other programming languages have well-defined mechanisms for sourcing files using a path relative to the directory of the file being sourced into. An example is Ruby's require_relative. What is a good way to implement relative path sourcing in R?
Below is what I pieced together a while back using various recipes and R forum posts. It's worked well for me for straight development but is not robust. For example, it breaks when the files are loaded via the testthat library, specifically auto_test(). rscript_stack() returns character(0).
# Returns the stack of RScript files
rscript_stack <- function() {
  Filter(Negate(is.null), lapply(sys.frames(), function(x) x$ofile))
}
# Returns the current RScript file path
rscript_current <- function() {
  stack <- rscript_stack()
  r <- as.character(stack[length(stack)])
  first_char <- substring(r, 1, 1)
  if (first_char != '~' && first_char != .Platform$file.sep) {
    r <- file.path(getwd(), r)
  }
  r
}
# Sources relative to the current script
source_relative <- function(relative_path, ...) {
  source(file.path(dirname(rscript_current()), relative_path), ...)
}
Do you know of a better source_relative implementation?
After a discussion with @hadley on GitHub, I realized that my question goes against the common development patterns in R.
It seems that, in R, files that are sourced often assume that the working directory (getwd()) is set to the directory they are in. To make this work, source has a chdir argument whose default value is FALSE. When set to TRUE, it will change the working directory to the directory of the file being sourced.
In summary:
Assume that source is always relative because the working directory of the file being sourced is set to the directory where the file is.
To make this work, always set chdir=T when you source files from another directory, e.g., source('lib/stats/big_stats.R', chdir=T).
For convenient sourcing of entire directories in a predictable way I wrote sourceDir, which sources files in a directory in alphabetical order.
sourceDir <- function (path, pattern = "\\.[rR]$", env = NULL, chdir = TRUE)
{
  files <- sort(dir(path, pattern, full.names = TRUE))
  lapply(files, source, chdir = chdir)
}
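The effect of chdir = TRUE described above can be checked with two throwaway scripts. This is a small demonstration under made-up names (chdir_demo, main.R, helper.R, helper_value): main.R sources helper.R by a relative path, which only resolves if the working directory is the folder containing both files.

```r
# Create a directory with a main script that sources a sibling by relative path
demo_dir <- file.path(tempdir(), "chdir_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("helper_value <- 42", file.path(demo_dir, "helper.R"))
writeLines("source('helper.R')", file.path(demo_dir, "main.R"))

old_wd <- getwd()
# With chdir = TRUE, source() temporarily switches to demo_dir,
# so the relative path inside main.R resolves
source(file.path(demo_dir, "main.R"), chdir = TRUE)
print(helper_value)
# The working directory is restored once source() returns
print(identical(getwd(), old_wd))
```

Without chdir = TRUE the inner source('helper.R') would be looked up in the caller's working directory and would normally fail.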
