Load file implicitly from Path - julia

I am trying to get my head around programming with multiple modules (in different files). I don't want to explicitly load the files with include in the right order.
I am using the Atom IDE as my development platform, so I don't run julia explicitly.
When I just use importall Datastructures (where Datastructures is the name of the module), julia complains:
LoadError: ArgumentError: Module Datastructures not found in current path.
Run `Pkg.add("Datastructures")` to install the Datastructures package.
while loading F:\dev\ai\Interpreter.jl, in expression starting on line 8

There are two ways to build a package or module in julia:
1) Use the tools in PkgDev. You can get them with Pkg.add("PkgDev"); using PkgDev. Now you can use PkgDev.generate("MyPackageName", "MIT") (or whatever license you prefer) to build your package folder. By default, julia will build this folder in the same directory as all your other external packages. On Linux, this would be ~/.julia/v0.6/ (or whatever version you are running). Also by default, this folder will be on the julia path, so you can just type using MyPackageName at the REPL to load it.
Note that julia essentially loads the package by looking for the file ~/.julia/v0.6/MyPackageName/src/MyPackageName.jl and then running it. If your module consists of multiple files, you should have all of them in the ~/.julia/v0.6/MyPackageName/src/ directory, and then have a line of code in the MyPackageName.jl file that says include("MyOtherFileOfCode.jl").
2) If you don't want to keep your package in ~/.julia/v0.6/ for some reason, or you don't want to build your package using PkgDev.generate(), you can of course just set the files up yourself.
Let's assume you want MyPackageName to be stored in the ~/MyCode directory. First, create the directory ~/MyCode/MyPackageName/. Within this directory, I strongly recommend using the same structure that julia and github use, i.e. store all your code in a directory called ~/MyCode/MyPackageName/src/.
At a minimum, you will need a file in this directory called ~/MyCode/MyPackageName/src/MyPackageName.jl (just like in the method above). This file should begin with module MyPackageName and finish with end. Then, put whatever you want in-between (including include calls to other files in the src directory if you wish).
The final step is to make sure that julia can find MyPackageName. For that, ~/MyCode needs to be on the julia path, which you can arrange with: push!(LOAD_PATH, "~/MyCode") or push!(LOAD_PATH, "~/MyCode/MyPackageName").
Maybe you don't want to have to run this command every time you want to access MyPackageName. No problem, you just need to add this line to your .juliarc.jl file, which is automatically run every time you start julia. On Linux, your .juliarc.jl file should be in your home directory, i.e. ~/.juliarc.jl. If it isn't there, you can just create it and put whatever code you want in there. If you're on a different OS, you'll have to google where to put your .juliarc.jl.
This answer turned out longer than I planned...

Related

How to run R projects / use their relative paths from the terminal without setwd() resp. cd

I'm kinda lost on that one:
I have set up an R project, let's call it "Test Project.Rproj". The beauty of R projects is the possibility to use relative paths (relative to the .Rproj file). My project consists of a "main.R" script, which is saved on the same level as the .Rproj file.
Additionally I have a directory called 'Output', where I want my plots and exported data to be saved. My "main.R" file looks like the following:
my_df <- data.frame(A = 1:10, B = 11:20)
my_df |>
  writexl::write_xlsx(here::here("Output",
                                 paste0("my_df_",
                                        stringr::str_replace_all(as.character(Sys.time()), ":", ""),
                                        ".xlsx")))
My final goal is to automate the execution of the 'main.R' file using the Windows Task Scheduler. But in order to do so, I have to be able to run the script from the terminal. The problem here is the working directory. When opening an R project, all the paths are relative to the .Rproj file. But in the terminal the current working directory is <C:\Users\my_name>. Of course I could manually set the working directory via cd "path\to\my\project", but I would like to avoid that.
My current call for the execution of the main.R file in the terminal is the following:
"C:\Program Files\R\R-4.1.0\bin\Rscript" -e "source('C:/Users/my_name/path/to/my/project/main.R')"
My two ideas for a solution are the following, but I am happy for other suggestions as well.
1) In order to replicate the usual use of a project: Is there a way to execute the .Rproj file from the terminal, so as to create an environment similar to RStudio's, where all the relative paths work when executing scripts from the project afterwards?
2) There are two packages addressing the problem of relative paths: rprojroot and here, where the former is the basis for the latter. I am pretty sure that here does not provide the needed functionality. I tried adding here::i_am("main.R") to my main.R file, but the project root directory still is not found when executing in the terminal from a working directory outside the project.
For rprojroot to work, I think it is also necessary to have your current working directory somewhere within the project. But this package offers a lot of functionality, so I am not sure whether I am overlooking something.
So I would be happy about any help. Maybe it is impossible and I have to change the working directory manually - then I would be glad to know that as well.
Some links I used in my research:
https://www.tidyverse.org/blog/2017/12/workflow-vs-script/
https://malco.io/2018/11/05/why-should-i-use-the-here-package-when-i-m-already-using-projects/
http://jenrichmond.rbind.io/post/how-to-use-the-here-package/
Thanks a lot!
Edit: My current implementation is an additional R script, where I manually set the working directory via setwd() and source the main.R file. However it is always suggested to avoid setwd, which is why this whole question exists.

How do I use setwd in a relative way?

Our team uses R scripts in git repos that are shared between several people, across both Mac and Windows (and occasionally Linux) machines. This tends to lead to a bunch of really annoying lines at the top of scripts that look like this:
#path <- 'C:/data-work/project-a/data'
#path <- 'D:/my-stuff/project-a/data'
path = "~/projects/project-a/data"
#path = 'N:/work-projects/project-a/data'
#path <- "/work/project-a/data"
setwd(path)
To run the script, we have to comment/uncomment the correct path variable or the scripts won't run. This is annoying, untidy, and tends to be a bit of a mess in the commit history too.
In the past I've got round this by using shell scripts to set directories relative to the script's location and skipping setwd entirely (and then using ./run-scripts.sh instead of Rscript process.R), but as we've got Windows users here, that won't work. Is there a better way to simplify these messy setwd() boilerplates in R?
(side note: in Python, I solve this by using the path library to get the location of the script file itself, and then build relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?)
The answer is to not use setwd() at all, ever. R does things a bit differently from Python, for sure, but this is one thing they have in common.
Instead, any scripts you're executing should assume they're being run from a common, top-level, root folder. When you launch a new R process, its working directory (i.e., what getwd() gives) is set to the same folder as the process was spawned from.
As an example, if you had this layout:
.
├── data
│   └── mydata.csv
└── scripts
    └── analysis.R
You would run analysis.R from . and analysis.R would reference data/mydata.csv as "data/mydata.csv" (e.g., read.csv("data/mydata.csv", stringsAsFactors = FALSE)).
I would keep your shell scripts or Makefiles that run your R scripts and have the R scripts assume they're being run from the top level of the git repo.
This might look like:
cd . # Wherever `.` above is
Rscript scripts/analysis.R
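As a rough sketch of what the top of such a script could look like, assuming it is always launched from the repository root: the helper name read_from_root is made up for illustration, and the only thing it adds over a plain read.csv() call is an early, explicit error that reports the working directory R was actually started in.

```r
# Hypothetical helper (not part of base R): read a CSV given path components
# relative to the working directory, which is assumed to be the project root.
read_from_root <- function(...) {
  path <- file.path(...)
  if (!file.exists(path)) {
    # Fail early and show where R was launched from, instead of calling setwd()
    stop("Cannot find '", path, "' from working directory '", getwd(),
         "'; run the script from the project root.")
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Example use inside scripts/analysis.R (commented out because the file
# only exists in the layout shown above):
# mydata <- read_from_root("data", "mydata.csv")
```

file.path() builds the path with forward slashes, which R accepts on Windows as well, so the same script should run unchanged for Mac, Windows and Linux users.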
Further reading:
https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
https://github.com/jennybc/here_here
1) If you are looking for a way to find the path of the currently running script then see:
Rscript: Determine path of the executing script
2) Another approach is to require that users put an option of a prearranged name in their .Rprofile file. Then the script can setwd to that. An attractive aspect of this system is that over time one can forget where various projects are located and with this system one can just look at the .Rprofile file to remind oneself. For example, for projectA each person running the project would put this in their .Rprofile
options(projectA = "...whatever...")
and then the script would start off with:
proj <- getOption("projectA")
if (!is.null(proj)) setwd(proj) else stop("Set option 'projectA' to its directory")
One variation of this is to assume the current directory if projectA is not defined. Although this may seem to be more flexible I personally find the documenting feature of the above code to be a big advantage.
proj <- getOption("projectA")
if (!is.null(proj)) setwd(proj) else cat("Using", getwd(), "\n")
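For option 1), one commonly used approach is to read the --file= argument that Rscript adds to the command line. The following is only a sketch under that assumption: script_path_from_args is a made-up name, and it returns NA when there is no --file= argument (for example in an interactive session, or when the code is pulled in with source()).

```r
# Hypothetical helper: extract the path of the running script from the
# command-line arguments. Rscript passes the script as "--file=<path>".
script_path_from_args <- function(args = commandArgs(trailingOnly = FALSE)) {
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)  # not started via Rscript / R --file=
  }
  sub("^--file=", "", file_arg[1])
}

script_path <- script_path_from_args()
if (is.na(script_path)) {
  cat("No --file= argument found; falling back to", getwd(), "\n")
} else {
  cat("Script directory:", dirname(normalizePath(script_path)), "\n")
}
```

Paths to data can then be built from the script's directory with file.path(), without calling setwd().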
in Python, I solve this by using the path library to get the location of the script file itself, and then build relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?
R itself unfortunately doesn’t have a way for this. But you can achieve the same result in either of two ways:
Use packages instead of scripts where you include code via source. Then you can use the solution outlined in amoeba’s answer. This works because the real issue is that R has no way of telling the source function where to look for scripts.
Use box::use instead of source. The ‘box’ package provides a module system that allows relative imports of code modules. A nice side-effect of this is that the package provides a function that tells you the path of the current script, just like in Python (and, just like in Python, you normally don’t need to use this function directly).

Unable to load data file in Julia

There is a CSV file called orders_data stored on my system, but when I try to load this file in Julia using the readdlm command in Jupyter Notebook (running in my browser), it says "No such file or directory".
I'm not sure why this happens. Is there a specific location where files need to be stored to be accessed using a Julia command? Or do I need to install some packages first to load the file using the browser version of Jupyter?
//Error information
SystemError: opening file orders_data.csv: No such file or directory
Your working directory is set to your current location when you start a Julia session. You can see what it is by calling the pwd() function. You can change it by calling the cd() function. Unless you specify otherwise, or provide a more complete pathname, Julia looks for files in your current working directory (although it's different for modules).

Where should I set the variable PATH in R?

I constantly need to call TeX Live binaries for compilation in R. However, after upgrading the TeX Live distribution, the path to the current binaries needed to be updated manually in the PATH variable (Sys.getenv("PATH")).
As a single user on an Ubuntu system, which file should I update the value in, so that R gets the PATH correctly irrespective of the directory R is launched from?
One point I still don't understand is where R gets its site-wide PATH variable (I mean for all users, even if that is not quite the right term), because no variable named "PATH" occurs in any of the files (Renviron, Renviron.site, Rprofile.site) in either "R_HOME/etc/" or the user's home directory. I also haven't set the Sys.getenv("R_ENVIRON") and Sys.getenv("R_ENVIRON_USER") values.
I'd appreciate anybody's input here.
@JeffreyGoldberg's solution was close, but not quite right.
Rprofile files are interpreted as R code
Renviron files can only contain name value pairs, and are not interpreted as R code
From the help for Startup:
Note that there are two sorts of files used in startup: environment files which contain lists of environment variables to be set, and profile files which contain R code.
I'm not sure if this question is asking specifically how one can set the site wide value of PATH, rather than PATH for one specific user, but there are three locations you can put these files.
1) A project directory (i.e., a directory you choose to launch R from)
2) HOME
3) R_HOME/etc
These locations are searched in the order numbered above. The first location can contain configurations specific to a project, the second contains those specific to a user, and the third, site wide configuration settings. When a file is found it is used, so local takes precedence over global. Don't think you can create a more specific version that simply updates what you've done in a more general configuration file. R_HOME/etc/Renviron is created on installation and should not be edited. You may create a file called R_HOME/etc/Renviron.site, but do not edit R_HOME/etc/Renviron.
To create a site wide value of PATH, you will want to set it in a file in R_HOME/etc. Here you can use either Renviron.site or Rprofile.site for the file name. For a file in R_HOME/etc, do not use Renviron, Rprofile, .Renviron, or .Rprofile for the name of a profile or environment file. You can find out what R_HOME is in an R session using R.home() or Sys.getenv("R_HOME").
To create a PATH value for a single user, set it in a file in HOME, which you can find in your R session using Sys.getenv("HOME") or path.expand("~"). You can also just use "~" to refer to HOME. Here, an Renviron file should be ~/.Renviron and an Rprofile file ~/.Rprofile. Take note of the difference between how profile and environment files are named in your HOME directory vs. R_HOME/etc.
To create a PATH for a single project, set it in a file in that project's top level directory. Name the files as you would in your home directory (.Rprofile or .Renviron).
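To see which of these files actually exist on a given machine, something like the following sketch can help. It only checks the default file names in the three locations described above; it assumes the R_ENVIRON, R_ENVIRON_USER, R_PROFILE and R_PROFILE_USER environment variables are not set to point somewhere else.

```r
# Candidate startup files in the three locations discussed above:
# the project (current) directory, HOME, and R_HOME/etc.
startup_files <- c(
  project_environ = file.path(getwd(), ".Renviron"),
  project_profile = file.path(getwd(), ".Rprofile"),
  user_environ    = path.expand("~/.Renviron"),
  user_profile    = path.expand("~/.Rprofile"),
  site_environ    = file.path(R.home("etc"), "Renviron.site"),
  site_profile    = file.path(R.home("etc"), "Rprofile.site")
)

# Report which of them are present on this machine
found <- data.frame(file = startup_files, exists = file.exists(startup_files))
print(found)
```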
If you are creating an Renviron file, the file should include the following line:
PATH=<your path>
< and > should not be included. An example would be:
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
If you are creating an Rprofile file, the file should include the following line:
Sys.setenv(PATH = "<your path>")
Again, don't include "<" or ">". An example would be:
Sys.setenv(PATH = "/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin")
There are various ways of doing this that get and edit a PATH variable (e.g., tack on a new path at the end, or the beginning). You can also use the strategy of setting an environment variable if it doesn't already exist and/or doesn't contain something you want it to. I've come to prefer just setting up my path simply, and coding it directly.
One final note, if you run R from a command line interface, environment variables may be inherited from your shell. RStudio also has its own startup sequence and may modify the end of your PATH variable. It should start as it is defined in your Rprofile or Renviron files. The R Console app itself has the fewest quirks with system environment variables, and should accept your path exactly as it is set with an Rprofile or Renviron file.
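For the alternative mentioned above of editing the existing PATH rather than hard-coding it, a sketch for an Rprofile file might look like this. The helper name prepend_to_path and the directory "/path/to/texlive/bin" are placeholders for illustration, not real locations; substitute the directory that holds your own binaries.

```r
# Hypothetical helper: put `dir` at the front of a PATH-style string,
# unless it is already one of the entries.
prepend_to_path <- function(dir, path = Sys.getenv("PATH")) {
  sep <- .Platform$path.sep  # ":" on Unix-alikes, ";" on Windows
  parts <- strsplit(path, sep, fixed = TRUE)[[1]]
  if (dir %in% parts) {
    return(path)  # already present, leave PATH unchanged
  }
  paste(c(dir, parts), collapse = sep)
}

# Placeholder directory (an assumption, not a real install location)
Sys.setenv(PATH = prepend_to_path("/path/to/texlive/bin"))
```

Because the helper checks for an existing entry first, running the profile more than once should not keep growing PATH.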
Edit: I should have tested before posting. What I describe below did not work. (Down voting my own answer is a strange thing.)
On my system (macOS, bash), R.app is not picking up my $PATH from my shell environment or .profile. However RStudio is picking it up. I do not understand the different behaviors.
One way to get consistent behavior would be to specify this in an Renviron file.
If you create a file named .Renviron in your home directory with a line like
Sys.setenv(PATH="/opt/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/Library/TeX/texbin")
(but of course with the path elements you need) that should give you consistent behavior.
The downside is that you need to manually maintain this. I suppose you could run a script from one of your other start up scripts that generated the .Renviron file. But either way, I consider this whole thing a work around in place of actually understanding where R picks up its environment from.

R package development: possible to create subfolders within the /R directory?

I'm trying to create an R package. I've used roxygen and devtools to help create all the necessary files, and it's working.
Among others I have the folders /man, /R and /tests. Now I would like to create some subfolders in the /R directory, but once I do this and move any scripts inside, I get an Error in namespaceExport(ns, exports) when trying to rebuild the package.
Can I only have script files directly within the /R directory, and is there any solution to this other than putting the script files in other folders one level up (such as old scripts that one may use in the future)?
Thanks
