How to get the absolute path from the context of the current running R script? - r

Using python, if I need the absolute path from the context of the current running script all I need to do is to add the following in the code of that script:
import os
os.path.abspath(__file__)
This is very useful as having the absolute path I can then use os.path.join to form new absolute paths for my project components (inside the project directory tree) and more interesting is that everything will continue to work without any problem no matter where the package directory is moved.
I need to achieve the very same thing using R programming, that is obtaining the absolute path of the current running R script ( = the absolute path of its file on the disk). But trying to do the same in R turns out to be quite challenging, at least for me as a rather beginner in R.
After a lot of googling, I tried to use the reticulate package to call Python from R but __file__ is not available there, then I found a few threads on Stackoverflow suggesting to play with the running Stack and others suggesting the use of normalizePath. However none of these worked for me when the entire project package is transferred from one directory to another.
Therefore, I would like to know if for example you have the following file/directory tree
base_dir ( = /home/usr1/apps/R/base_dir)
|
|
|___ myscript.R (this is my R script to be run)
|___ data (this is a directory)
|___ sql (this is a directory)
Is there any solution allowing to add something in the code of myscript.R so that inside the script the program can always know that the base directory is /home/usr1/apps/R/base_dir and if later this base directory is moved to another directory then there is no need to change the code and the program would be able to find correctly the new base directory?

R has in general no way of finding this path, because there is no equivalent to Python’s __file__ in R.
The closest you can get is to look at commandArgs() and laboriously extract the script filename (which requires different handling depending on how the script was launched!). But this will fail if the script was executed in RStudio, and it will fail after calling setwd().
Other solutions (such as the ‘here’ package) rely on heuristics and specific project structures.
But luckily there’s actually a solution that will always work: use ‘box’ modules.
With modules, you’ll always be able to get the path of the current script/module via box::file(). This is the closest equivalent to Python’s __file__ you’ll get in R, and it always works — as long as you’re using ‘box’ modules consistently.
(Internally the ‘box’ package requires complex logic to determine the value of the file() function in all circumstances; I don’t recommend replicating it, it’s too complex. For the curious, the bulk of the relevant logic is in R/loaded.r.)

If you are running the script using Rscript you can use getwd().
#!/usr/bin/Rscript
getwd()
# or assign it to a variable
base_dir = getwd()
you can run it from the command line using one of the following
./yourscript.R
# or
Rscript yourscript.R
Note however, this only works if you run the script from inside the folder, the file is in.
cd ~
./script.R
# "/home/usr1"
cd /
/home/usr1/script.R
# "/"
For a more elaborate option you could consider https://stackoverflow.com/a/55322344/3250126

Related

R reads the path to a .app file as a folder on macOS

Currently I am trialing a package that requires defining a path to where an external application is located. This application (for anyone curious) does not have a formal install and is simply a .app file to be placed in any given folder.
In R, I need to define a path to this file location, e.g.,
program_path <- ~/Desktop/Folder/dsi_studio.app
However, R interprets this as a folder, and thus when I try to run a few functions within the package, it claims the program_path I fed it is a directory and not a program/application.
Is there any way to force R to read this path as an application and not as a folder? I even went as far as defining program_path as the .app's Unix Executable File (i.e., dsi_studio.app/Contents/MacOS/dsi_studio), but no dice. I must be missing something here.
Thanks for any help!
With that code you might be getting a result that is a formula rather than a text/character object. Try putting quotation marks around that file path. The ~ is actually a function to create an R formula object. You also might need path.expand since the ~ in MacOS is a shortcut for /Users/user_name/. Try instead:
program_path <- "~/Desktop/Folder/dsi_studio.app"

How can I set `path.expand` to begin at my working directory?

I'm using a Mac. The path.expand function is several folders removed from my desired working directory. For example:
path.expand('~')
[1] "/Users/my.name"
I'd like to change it to something like this:
path.expand('~')
[1] "/Users/my.name/drive/R/project/sub.folder"
How can I go about this?
Thank you.
The tilde is, in all unix-sen (including macos), special in that it refers to what the operating system considers the home directory (via the env var HOME).
There are two types of answers to this. Can it be done? Perhaps, sure even. Should it be done? There will likely be unintended consequences (that may be hard to troubleshoot and/or workaround), so likely not.
This works on my ubuntu box:
me#mybox:/some/path$ Rscript -e 'Sys.getenv("HOME")'
[1] "/home/me"
me#mybox:/some/path$ HOME=/tmp/ Rscript -e 'Sys.getenv("HOME")'
[1] "/tmp/"
me#mybox:/some/path$ Rscript -e 'Sys.setenv(HOME="/tmp/");Sys.getenv("HOME")'
[1] "/tmp/"
(This notably does not work as well on Windows ... which is not very unix-y of it!)
So you can try overriding it with either:
Sys.setenv(HOME = "/Users/my.name/drive/R/project/sub.folder"), or
Set the HOME variable in your working environment before starting R.
This might have unintended consequences. For instance, R looks for ~/.Rprofile, and git and commands look for ~/.gitconfig and such.
My recommended way-ahead would be to define a variable and change there. If you use RStudio, then its "Projects" can always start you in the correct directory. If not and you still want this "special directory" available to you, perhaps add this to your /Users/username/.Rprofile (in your "actual" homedir)
.specialdir <- "/Users/my.name/drive/R/project/sub.folder"
and, whenever you need to go there, use file.expand(.specialdir). One side-effect of this is that any of your code, functions, reports, whatever that use this will no longer be reproducible.
A way to easily reference your files without needing to change the HOME directory is to use the here package. This basically uses a heuristic to find the right working directory based on where your script is. Normally it looks for RStudio Project files (.rproj) or for a .git file if your working directory is a git repository. It's easy to use and robust to moving machines or accidental use of setwd, or even forgetting to set HOME on a different machine/profile.
If your data file some_data.csv above is stored in /Users/my.name/drive/R/project/sub.folder/some_data.csv, where project is the root folder for the project:
here::here()
[1] "/Users/my.name/drive/R/project"
here::here("sub.folder", "some_data.csv")
[1] "/Users/my.name/drive/R/project/sub.folder/some_data.csv"
and you can use it as a drop in replacement for the path, as in:
data <- read_csv(here::here("sub.folder", "some_data.csv"))

Is there a way to run julia script with arguments from REPL?

I can run julia script with arguments from Powershell as > julia test.jl 'a' 'b'. I can run a script from REPL with include("test.jl") but include accepts just one argument - the path to the script.
From playing around with include it seems that it runs a script as a code block with all the variables referencing the current(?) scope so if I explicitly redefine ARGS variable in REPL it catches on and displays corresponding script results:
>ARGS="c","d"
>include("test.jl") # prints its arguments in a loop
c
d
This however gives a warning for redefining ARGS and doesn't seem the intended way of doing that. Is there another way to run a script from REPL (or from another script) while stating its arguments explicitly?
You probably don't want to run a self-contained script by includeing it. There are two options:
If the script isn't in your control and calling it from the command-line is the canonical interface, just call it in a separate Julia process. run(`$JULIA_HOME/julia path/to/script.jl arg1 arg2`). See running external commands for more details.
If you have control over the script, it'd probably make more sense to split it up into two parts: a library-like file that just defines Julia functions (but doesn't run any analyses) and a command-line file that parses the arguments and calls the functions defined by the library. Both command-line interface and the second script your writing now can include the library — or better yet make the library-like file a full-fledged package.
This solution is not clean or Julia style of doing things. But if you insist:
To avoid the warning when messing with ARGS use the original ARGS but mutate its contents. Like the following:
empty!(ARGS)
push!(ARGS,"argument1")
push!(ARGS,"argument2")
include("file.jl")
And this question is also a duplicate, or related to: juliapassing-argument-to-the-includefile-jl as #AlexanderMorley pointed to.
Not sure if it helps, but it took me a while to figure this:
On your path "C:\Users\\.julia\config\" there may be a .jl file called startup.jl
The trick is that not always Julia setup will create this. So, if neither the directory nor the .jl file exists, create them.
Julia will treat this .jl file as a command list to be executed every time you run REPL. It is very handy in order to set the directory of your projects (i.e. C:\MyJuliaProject\MyJuliaScript.jl using cd("")) and frequent used libs (like using Pkg, using LinearAlgebra, etc)
I wanted to share this as I didn't find anyone explicit saying this directory might not exist in your Julia device's installation. It took me more than it should to figure this thing out.

how can i get the current file directory in R

I have seen many related answers here,but i didn't get a proper way to solve my problem under windows system...
I know the link the similar question
I got that setwd() can locate the directory what i want,however,my R script may move to another directory without any modification,so I want to know the current file directory,becase there are expression like source(...),this called source file and the execution file under the same parent directory in a R project,how I can do?
any help appreciated.
You can get your current directory using the getwd() function and give it a name, say:
cpath = getwd()
Another useful function is the file.path, which can help you specify new directories with simple syntax. For example, you want to get the directory that is one level "above" the current directory, you can use:
upp.dir = file.path("..", "cpath")
This gives upp.dir as "../Your_Current_Dir". How about changing to another folder (called Folder_A) in current directory? Use:
folderA = file.path("cpath", "Folder_A")
These may help easy navigate the file system.
Basically, if you write scripts and those scripts depend on where they are, then you are Doing It Wrong.
Write code in packages. Parameterise functions to make them generally applicable. If you have folders with data in, then make one of those parameters a folder.
A script called with source() cannot reliably locate itself, but that shouldn't be a problem, because WHATEVER CALLED THE SCRIPT knows where the script is (it has to, or how else can it call it?) so it could pass that as a parameter. Something like:
> youarehere = "C:\foo\"
> source("C:\foo\bar.R")
and now bar.R can do setwd(youarehere) and it will work, even if it is badly written such that it relies on sourcing other code in its containing folder.
Or you can do:
> setwd(youarehere)
> source("bar.R")
in your calling function.
But really, its a fail, its a sign of badly written code. Use functions, write packages, use devtools, its really not that hard, then your code will work anywhere and you wont be writing stupid scripts that are a twisty turny maze of source() calls.
Stay classy.

get filename and path of `source`d file

How can a sourced or Sweaved file find out its own path?
Background:
I work a lot with .R scripts or .Rnw files.
My projects are organized in a directory structure, but the path of the project's base directory frequently varies between different computers (e.g. because I just do parts of data analysis for someone else, and their directory structure is different from mine: I have projects base directories ~/Projects/StudentName/ or ~/Projects/Studentname/Projectname and most students who have just their one Project usually have it under ~/Measurements/ or ~/DataAnalysis/ or something the like - which wouldn't work for me).
So a line like
setwd (my.own.path ())
would be incredibly useful as it would allow to ensure the working directory is the base path of the project regardless of where that project actually is. Without the need that the user must think of setting the working directory.
Let me clarify: I look for a solution that works with pressing the editor's/IDE's source or Sweave Keyboard shortcut of the unthinking user.
Just FYI, knitr will setwd() to the dir of the input file when (and only when) evaluating the code chunks, i.e. if you call knit('path/to/input.Rnw'), the working dir will be temporarily switched to path/to/. If you want to know the input dir in code chunks, currently you can call an unexported function knitr:::input_dir() (I may export it in the future).
Starting from gsk3's Seb's suggestions, here's an idea:
the combination of username (login) and IP or name of the computer could be used to select the right directory.
That leads to something like:
setwd (switch (paste (Sys.info () [c ("user", "nodename")], collapse="."),
user.laptop = "~/Messungen",
user2.server = "~/Projekte/Projekt/",
))
So there is an automatic solution, that
works with source
works with Sweave
even works for interactive sessions where the commands are sent line by line
the combination of user and nodename of course needs to be specific
the paths need to be edited by hand, though.
Improvements welcome!
Update:
Gabor Grothendieck answered the following to a related question on r-help today:
this.dir <- dirname(parent.frame(2)$ofile)
setwd(this.dir)
which will work for source.
Another update: I now do most of the data analysis work in RStudio. RStudio's projects basically solve the problem: RStudio changes the working directory to the project root directory every time I switch between projects.
I can therefore put the project directory as far down my directory tree as I want (and the students can also put their copy wherever they want) and sync the data files and scripts/.Rnws via version control (We use a private git server). The RStudio project files are kept out of the version control, i.e. .gitignore contains .Rproj.user.
Obviously, within the project, the directory structure needs to be synchronized.
You can use sys.calls() to get the command used to source the file. Then you need a bit of trickery using regular expressions to get the pathname, bearing in mind that source("something/filename") could have used either the absolute or relative path. Here's a first attempt at putting all the pieces together: try inserting the following lines at the top of a source file.
whereFrom=sys.calls()[[1]]
# This should be an expression that looks something like
# source("pathname/myfilename.R")
whereFrom=as.character(whereFrom[2]) # get the pathname/filename
whereFrom=paste(getwd(),whereFrom,sep="/") # prefix it with the current working directory
pathnameIndex=gregexpr(".*/",whereFrom) # we want the string up to the final '/'
pathnameLength=attr(pathnameIndex[[1]],"match.length")
whereFrom=substr(whereFrom,1,pathnameLength-1)
print(whereFrom) # or "setwd(whereFrom)" to set the working directory
It's not very robust—for instance, it will fail on windows with source("pathname\\filename"), and I haven't tested what happens if you have one file sourcing another file—but you might be able to build a solution on top of this.
I have no direct solution how to obtain the directory of the file itself but if you have a limited range of directories and directory structures you can probably use
if(file.exists("c:/somedir")==TRUE){setwd("c:/somedir")}
You could check out the pattern of the directory in question and then set the dir. Does this help you?
An additional problem is that the working directory is a global variable, which can be changed by any script, so if your script calls another script, it will have to set the wd back. In RStudio I use Session -> Set Working Directory -> To Source File Location (I know, it's not ideal), and then my script does
wd = getwd ()
...
source ("mySubDir/myOtherScript.R", chdir=TRUE); setwd (wd)
...
source ("anotherSubDir/anotherScript.R", chdir=TRUE); setwd (wd)
In this way one can maintain a stack of working directories. I would love to see this implemented in the language itself.
This answer works for source and also inside nvim-R - I have no idea if it works with knitr and similar things. Any feedback appreciated.
If you have multiple scripts source-ing each other, it is important to get the correct one. That is, the largest i for which sys.frame(i)$ofile exists.
get.full.path.to.this.sourced.script = function() {
for(i in sys.nframe():1) { # Go through all the call frames,
# in *reverse* order.
x = sys.frame(i)$ofile
if(!is.null(x)) # if $ofile exists,
return(normalizePath(x)) # then return the full absolute path
}
}

Resources