Setting up sane Julia environment variable paths on UNIX-like systems using the XDG spec to avoid ~/.julia

As a brand-new Julia programmer, I am trying to tidy up my $HOME directory so that it is not bloated with tons of dotfiles and, at the same time, to define sane paths for Julia using the XDG Base Directory Specification. Following the official instructions generates the directory ~/.julia, which is rather unpleasant. So I am looking for a way around this problem that follows sane UNIX path conventions.
After reading about Julia's environment variables and some discussions about them, I bit the bullet and tried to organize things in the following way (probably not optimal, so please help me):
First, I decided to put the extracted julia/ directory inside $XDG_DATA_HOME, since it holds user-specific data files, right? So I ran
mv julia-1.7.2 $XDG_DATA_HOME
Julia's documentation asks me to define the path to Julia's bin/ directory as the environment variable $JULIA_BINDIR. To do so, I defined in ~/.profile:
# The absolute path of the directory containing the Julia executable -> ~/.local/share/julia-1.7.2/bin
export JULIA_BINDIR="$XDG_DATA_HOME/julia-1.7.2/bin"
The documentation also asks me to put the julia binary in $PATH. Hence, I symlinked
ln -s $XDG_DATA_HOME/julia-1.7.2/bin/julia ~/.local/bin/julia
because user-specific executable files should be stored in $HOME/.local/bin. This solves the problem of putting julia on the $PATH, as long as ~/.local/bin is in it.
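The "as long as ~/.local/bin is in it" condition can be checked from ~/.profile itself. The following is only a minimal sketch; the function name path_contains is hypothetical and not part of any tool mentioned above.

```shell
# Return success if directory $1 is one of the colon-separated entries in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prepend ~/.local/bin only when it is missing, so that re-sourcing
# ~/.profile does not keep growing $PATH.
if ! path_contains "$HOME/.local/bin"; then
    PATH="$HOME/.local/bin:$PATH"
    export PATH
fi
```

Wrapping $PATH in colons before matching avoids false positives on entries that merely start or end with the same text.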
I would also like to put Julia's data and config files into $XDG_DATA_HOME and $XDG_CONFIG_HOME, respectively, in line with the XDG Base Directory Specification. So I added the following environment variables to my .profile:
# relative julia's data directory -> $JULIA_BINDIR/$DATAROOTDIR/julia/base -> ~/.local/share/julia/base
export DATAROOTDIR="../.."
# relative julia's config files -> $JULIA_BINDIR/$SYSCONFDIR/julia/startup.jl -> ~/.config/julia/startup.jl
export SYSCONFDIR="../../../../.config"
Both $DATAROOTDIR and $SYSCONFDIR are relative paths, to the data directory and the configuration directory, respectively. With these environment variables, I am pointing the data directory at $XDG_DATA_HOME/julia and the config directory at $XDG_CONFIG_HOME/julia. To have such directories at these paths, I symlinked again
ln -s $XDG_DATA_HOME/julia-1.7.2/share/julia $XDG_DATA_HOME/julia
ln -s $XDG_DATA_HOME/julia-1.7.2/etc/julia $XDG_CONFIG_HOME/julia
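To see where the two relative paths actually land, they can be resolved against a throwaway copy of the layout. This is only a sketch: it assumes the default XDG locations (~/.local/share and ~/.config), uses a temporary directory in place of $HOME, and the helper name resolve_from is hypothetical.

```shell
# Resolve relative path $2 against base directory $1 and print the
# physical (symlink-free) result.
resolve_from() {
    ( cd "$1/$2" 2>/dev/null && pwd -P )
}

# Throwaway layout mirroring the one above; $demo_home stands in for $HOME.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/.local/share/julia-1.7.2/bin" "$demo_home/.config"
bindir="$demo_home/.local/share/julia-1.7.2/bin"

# Two levels up from bin/ is the data home (path ends in /.local/share).
resolve_from "$bindir" "../.."
# Four levels up is the home directory, then down into .config.
resolve_from "$bindir" "../../../../.config"
```

If $XDG_DATA_HOME or $XDG_CONFIG_HOME point somewhere other than the defaults, the hard-coded number of ".." components would no longer match.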
Moreover, I put the Julia history into $XDG_STATE_HOME, since that path should be used to store "logs, history, recently used files", although I rarely see UNIX users using this directory. I just added the following line to .profile
export JULIA_HISTORY="$XDG_STATE_HOME/julia"
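All of the lines above rely on the XDG variables being set, which many systems do not do by default. A hedged sketch of fallbacks for ~/.profile follows, using the defaults named in the XDG Base Directory Specification. Julia's documentation describes JULIA_HISTORY as the path of the REPL history file, so the sketch points it at a file; the julia/repl_history.jl location under $XDG_STATE_HOME is my own assumption, not something the documentation prescribes.

```shell
# Fall back to the XDG Base Directory Specification defaults when a
# variable is unset or empty.
: "${XDG_DATA_HOME:=$HOME/.local/share}"
: "${XDG_CONFIG_HOME:=$HOME/.config}"
: "${XDG_STATE_HOME:=$HOME/.local/state}"
export XDG_DATA_HOME XDG_CONFIG_HOME XDG_STATE_HOME

# Assumption: keep the history in a file inside a julia/ subdirectory.
# The directory and file names here are illustrative.
JULIA_HISTORY="$XDG_STATE_HOME/julia/repl_history.jl"
export JULIA_HISTORY
```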
At the moment, when I open Julia in a terminal in REPL mode, my computer does not generate ~/.julia/ (thank God), but I bet my approach has flaws, so any contribution is welcome.
Thanks in advance.

Related

Different local and remote organisation R Project and GitHub

I want to version control my R scripts, so I've created an R project and a GitHub repo. My scripts are scattered across several directories inside the directory where the R project lives.
I would like my GitHub repository to hold only the scripts, regardless of the folders they are stored in locally. However, when I run the commands below:
git add folder/file.R
git commit -m "my_message"
git push -u origin master
A directory named folder is created containing file.R, but I'd like to see just file.R without the folder. Do you know how I can do this? Also, would it be good practice? My local folders are organized so that each directory contains its own scripts and results; that's why the scripts are separated.
Thank you very much
Is there a way to add file.R without specifying the path?
Not using git add, no. The design constraint for git add is that it should store the file's name exactly as it appears, including the forward slashes, so if the file's name is folder/file.R, that's the file's name.
You have some options here though:
You can make a parallel directory where you put the files with the names you want them to have. Run git init in that directory, then copy folder/file.R to file.R in that directory. Then cd ../gitdir (or whatever is appropriate to get there) and run git add file.R.
This method is probably the best because it's the simplest.
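As a rough sketch of this first option, run in throwaway directories so nothing real is touched (the names project and gitdir are placeholders):

```shell
# Throwaway directories standing in for the project and the parallel repository.
work=$(mktemp -d)
mkdir -p "$work/project/folder" "$work/gitdir"
echo 'print("hello")' > "$work/project/folder/file.R"

# Create the repository next to the project and copy the script in
# under a flat name.
git init -q "$work/gitdir"
cp "$work/project/folder/file.R" "$work/gitdir/file.R"

# Stage it; the index records the path "file.R", with no folder component.
git -C "$work/gitdir" add file.R
git -C "$work/gitdir" ls-files
```

The cost of this approach is that the copy has to be refreshed by hand (or by a small script) whenever the original changes.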
You can write your own programs using git hash-object -w and git update-index, which are two of Git's plumbing commands. A plumbing command, in Git, is basically a command that exists so that you can build user-facing commands: they're not meant to be run by humans but rather by other programs. So you write a program (in whatever language you like) that uses these plumbing programs to achieve whatever you want.
In particular, you can create or find a Git blob object holding the contents of file.R as read from anywhere you like, then use git update-index to create an index entry holding whatever path you like and referring to the blob object you created (or found) with git hash-object with the -w flag.
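A minimal sketch of that plumbing sequence, again in a throwaway repository; the file contents and directory names are made up for illustration:

```shell
# Throwaway repository with the script stored under folder/.
work=$(mktemp -d)
mkdir -p "$work/repo/folder"
git init -q "$work/repo"
echo 'print("hello")' > "$work/repo/folder/file.R"

# Write the file contents into the object database and capture the blob ID.
blob=$(git -C "$work/repo" hash-object -w folder/file.R)

# Add an index entry named file.R that points at that blob
# (100644 is the mode for a regular, non-executable file).
git -C "$work/repo" update-index --add --cacheinfo "100644,$blob,file.R"

# The index now lists file.R; folder/file.R itself stays untracked.
git -C "$work/repo" ls-files
```

Note that git status will then report file.R as deleted in the working tree, since no file of that name exists there, which is one reason this route is more awkward than the parallel directory.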
Since Git is a suite of tools, not a solution, you can come up with your own method. The tools in Git are made with particular approaches in mind, but they are flexible enough to be repurposed.

Where is the REQUIRE file situated in Julia?

I was looking for the REQUIRE file using mate from the shell but couldn't find it.
It's related to Pkg.
shell> mate ~/.julia/
compiled/ clones/ prefs/ registries/
environments/ conda/ logs/ packages/
You are using Julia 0.7+.
Which means there are no REQUIRE files anywhere.
You may be looking for the Project.toml for the global (or other shared) environment.
You will find that (and its matching Manifest.toml) in each subdirectory of the environments directory.
See the Julia Docs for further reading on this topic.
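To locate those files from a shell, something like the following sketch should work. It assumes the default depot at ~/.julia, as in the listing above; the function name list_env_projects is hypothetical.

```shell
# Print every Project.toml found under <depot>/environments.
# $1 is the depot directory (~/.julia by default).
list_env_projects() {
    find "$1/environments" -name Project.toml 2>/dev/null
}

# Only search when the default depot actually exists on this machine.
if [ -d "$HOME/.julia/environments" ]; then
    list_env_projects "$HOME/.julia"
fi
```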

How do I use setwd in a relative way?

Our team uses R scripts in git repos that are shared between several people, across both Mac and Windows (and occasionally Linux) machines. This tends to lead to a bunch of really annoying lines at the top of scripts that look like this:
#path <- 'C:/data-work/project-a/data'
#path <- 'D:/my-stuff/project-a/data'
path = "~/projects/project-a/data"
#path = 'N:/work-projects/project-a/data'
#path <- "/work/project-a/data"
setwd(path)
To run the script, we have to comment/uncomment the correct path variable or the scripts won't run. This is annoying, untidy, and tends to be a bit of a mess in the commit history too.
In the past I've got round this by using shell scripts to set directories relative to the script's location and skipping setwd entirely (and then using ./run-scripts.sh instead of Rscript process.R), but as we've got Windows users here, that won't work. Is there a better way to simplify these messy setwd() boilerplates in R?
(side note: in Python, I solve this by using the path library to get the location of the script file itself, and then build relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?)
The answer is to not use setwd() at all, ever. R does things a bit differently than Python, for sure, but this is one thing they have in common.
Instead, any scripts you're executing should assume they're being run from a common, top-level, root folder. When you launch a new R process, its working directory (i.e., what getwd() gives) is set to the same folder as the process was spawned from.
As an example, if you had this layout:
.
├── data
│   └── mydata.csv
└── scripts
    └── analysis.R
You would run analysis.R from . and analysis.R would reference data/mydata.csv as "data/mydata.csv" (e.g., read.csv("data/mydata.csv", stringsAsFactors = FALSE)).
I would keep your shell scripts or Makefiles that run your R scripts and have the R scripts assume they're being run from the top level of the git repo.
This might look like:
cd . # Wherever `.` above is
Rscript scripts/analysis.R
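If you want the "always run from the top level" rule enforced rather than remembered, a small wrapper can do the cd for you. This is only a sketch; run_from_root is a hypothetical helper name, and it assumes a POSIX shell is available.

```shell
# run_from_root ROOT CMD [ARGS...]
# Run CMD with ROOT as its working directory. The subshell keeps the
# caller's own working directory unchanged.
run_from_root() {
    ( cd "$1" && shift && "$@" )
}

# Example call (commented out because it needs Rscript and a real project):
# run_from_root /path/to/project Rscript scripts/analysis.R
```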
Further reading:
https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
https://github.com/jennybc/here_here
1) If you are looking for a way to find the path of the currently running script then see:
Rscript: Determine path of the executing script
2) Another approach is to require that users put an option of a prearranged name in their .Rprofile file. Then the script can setwd to that. An attractive aspect of this system is that over time one can forget where various projects are located and with this system one can just look at the .Rprofile file to remind oneself. For example, for projectA each person running the project would put this in their .Rprofile
options(projectA = "...whatever...")
and then the script would start off with:
proj <- getOption("projectA")
if (!is.null(proj)) setwd(proj) else stop("Set option 'projectA' to its directory")
One variation of this is to assume the current directory if projectA is not defined. Although this may seem to be more flexible I personally find the documenting feature of the above code to be a big advantage.
proj <- getOption("projectA")
if (!is.null(proj)) setwd(proj) else cat("Using", getwd(), "\n")
in Python, I solve this by using the path library to get the location of the script file itself, and then build relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?
R itself unfortunately doesn’t have a way to do this. But you can achieve the same result in either of two ways:
Use packages instead of scripts where you include code via source. Then you can use the solution outlined in amoeba’s answer. This works because the real issue is that R has no way of telling the source function where to look for scripts.
Use box::use instead of source. The ‘box’ package provides a module system that allows relative imports of code modules. A nice side-effect of this is that the package provides a function that tells you the path of the current script, just like in Python (and, just like in Python, you normally don’t need to use this function directly).

How do I create a directory with a file in it, in one step?

In the terminal, is there a way to create a directory with a file in it in one step?
Currently I do this in 2 steps:
1. mkdir foo
2. touch foo/bar.txt
Apparently, touch foo/bar.txt on its own doesn't work.
With only standard unix tools, the most direct way to create a directory and a file in this directory is
mkdir foo && touch foo/bar.txt
Unix is built around the philosophy of simple, single-purpose tools with the shell as a glue to combine them. So to create a directory and a file, you instruct a shell to run the directory creation utility then the file creation utility.
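If you do this often, the two tools can be glued together once in a shell function. A minimal sketch, where the name mkfile is made up:

```shell
# mkfile PATH: create any missing parent directories of PATH, then create
# PATH itself as an empty file (or update its timestamp if it exists).
mkfile() {
    mkdir -p -- "$(dirname -- "$1")" && touch -- "$1"
}

# Try it in a throwaway directory.
demo=$(mktemp -d)
mkfile "$demo/foo/bar.txt"
ls "$demo/foo"
```

Using mkdir -p means the call also succeeds when foo already exists, which plain mkdir foo && touch foo/bar.txt does not.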
I won't swear that there isn't some bizarre way of using a standard tool that lets you do it with a single command. (In fact, there is: unpack an archive — except that you'll need to provide that archive as a file, with predefined owner, date and other metadata, or else use another command to build an archive.) But whatever it is would be convoluted.

Is there a way to wrap arbitrary commands located under a subdirectory in a shell script

I have a bunch of customizations and would like to run my test program in a pristine environment.
Sure, I could use a tiny shell script to wrap the program and pass on arguments, but it would be cool and useful if I could invoke a pre-script, and possibly a post-script, only for commands located under certain subdirectories. The shell I'm using is zsh.
I don't know what you include in your “pristine environment”.
If you want to isolate yourself from the whole system, then maybe chroot is what you're after. You can set up a complete new system, with its own /etc, /bin and so on, but sharing the kernel, networking and other non-filesystem stuff with your running system. Root's cooperation is required (the chroot system call is reserved to root).
If you want to isolate yourself from your dot files, run the program with a different value for the HOME environment variable:
HOME=~/test-environment /path/to/test-program
HOME=~/test-environment zsh
If this is specifically about zsh's configuration files, you can set the ZDOTDIR environment variable before starting it to tell zsh to run its own dot files from a directory other than $HOME (or zsh --no-rcs to not load any dot file).
If by pristine environment you mean a fully controlled set of environment variables, then the env program does this.
env -i PATH="$PATH" HOME="$HOME" program args
will run program args with only the environment variables you specified.
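A quick way to convince yourself of the difference, as a sketch: the variable name SECRET, its value, and the /tmp/test-environment path are made up for the demonstration.

```shell
# Export a variable that child processes would normally inherit.
SECRET=hunter2
export SECRET

# Ordinary child process: SECRET is inherited.
sh -c 'echo "${SECRET-unset}"'                       # prints: hunter2

# Child started through env -i: only PATH is passed on, so SECRET is gone.
env -i PATH="$PATH" sh -c 'echo "${SECRET-unset}"'   # prints: unset

# Overriding HOME for a single command leaves the rest of the environment alone.
HOME=/tmp/test-environment sh -c 'echo "$HOME"'      # prints: /tmp/test-environment
```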

Resources