Speeding up Julia on terminal for the second run

Running a Julia file in REPL
julia> include("foo.jl")
julia> include("foo.jl")
gives different running times for the first and second call, but that doesn't seem to be the case when I run it from the terminal:
$ julia foo.jl
$ julia foo.jl
Is there a standard method for saving compiled files outside of Julia?

Normally Julia compiles functions the first time they're used within a given Julia instance, so each time you call julia foo.jl from the command line it will need to re-compile whatever code is called in foo.jl.
If you want to store a compiled version of foo.jl, you can use the PackageCompiler package (https://github.com/JuliaLang/PackageCompiler.jl), which builds a new Julia system image (sysimage) in which the code used by foo.jl has already been compiled, rather than the default image in which it has not.
Note that you probably don't want to do this if you're actively developing foo.jl, since building each new sysimage takes some time. In that case, a common compromise is to build the sysimage from a small script that loads the same packages and calls the same kinds of functions as foo.jl; once that sysimage is built, you can keep editing foo.jl and still import those packages and call those functions with little additional compilation time.
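For concreteness, here is a minimal sketch using PackageCompiler's create_sysimage; the package names and file names below are placeholders, and the exact API may differ between PackageCompiler versions:
using PackageCompiler
# Build a sysimage in which the packages used by foo.jl (assumed here to be
# CSV and DataFrames) are compiled. foo.jl is executed during the build, so
# the functions it calls get compiled into the image as well.
create_sysimage(["CSV", "DataFrames"];
                sysimage_path = "foo_sysimage.so",
                precompile_execution_file = "foo.jl")
Afterwards, run the script against the new image from the shell:
$ julia --sysimage foo_sysimage.so foo.jl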

If you're re-running it to test changes to the underlying code, also consider using Revise.

There is a Julia package for running Julia as a daemon: https://github.com/dmolina/DaemonMode.jl, which is worth checking out. It doesn't exactly answer the question of how to save compiled Julia code, but it could well improve the workflow you are after.
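The basic usage from the package's README is roughly as follows (the exact flags may differ between versions): keep one long-running Julia server, and send each script run to it as a client, so code is only compiled once.
$ julia --startup-file=no -e 'using DaemonMode; serve()' &
$ julia --startup-file=no -e 'using DaemonMode; runargs()' foo.jl arg1 arg2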

Related

How to run a julia script through terminal

I have a script main.jl which prints a simple "Hello world" string:
println("Hello world!")
However, when trying to run the script through the terminal like this:
julia> main.jl
I get the error:
ERROR: type #main has no field jl
All the information I can find online suggests calling the script like I do to run it. I have assured that I'm in the correct directory - what am I doing wrong?
You are trying to run the file from the Julia REPL (as indicated by the julia> prompt at the beginning of the line). There, you have to include the file, as @AndreWildberg mentions. This will run the commands from the file as if you had typed them into the REPL.
The information you found online might have been about running Julia from "normal" terminal, aka a console shell like bash on Linux. There, running julia main.jl will run the program, although the REPL method above is usually preferred for working with Julia.
(Regarding the follow-up question in the comments about calling the script with arguments:)
First of all, I'll mention that this is not the usual workflow with Julia scripts. I've been writing Julia code for years and had to look up how to handle command-line arguments, because I've never once used them in Julia. What's usually done instead is to define the functions you want in the file, perhaps including a main function, and after doing an include, call main (or whichever function you want to try out) with the arguments directly.
Now, if your script already uses command-line arguments (and you don't want to change that), what you can do is assign to the variable that holds them, ARGS, before the include statement:
julia> push!(empty!(ARGS), "arg1")
1-element Vector{String}:
"arg1"
julia> include("main.jl")
Here we empty the ARGS to make sure any previous values are gone, then push the argument (or arguments) we want into it. You can try this out for educational purposes, but if you are new to the language, I would suggest learning and getting used to the more Julian workflow involving function calls that I've mentioned above.
The julia> prompt means your terminal is in Julia REPL mode and is expecting valid Julia code as input. The Julia code main.jl would mean that you want to return the value of a field named jl inside a variable named main. That is why you get that error. The Julia code you would use to run the contents of a file is include("main.jl"). Here the function include is passed the name of your file as a String. This is how you should run files from the REPL.
The instructions you read online are assuming your terminal is in shell mode. This is usually indicated by a $ prompt. Here your terminal is expecting code in whatever language your shell is using e.g. Bash, PowerShell, Zsh. When Julia is installed, it will (usually) add a julia command which works in any shell. The julia command by itself will transform your terminal from shell mode to REPL mode. This julia command can also take additional arguments like filenames. If you use the command julia main.jl in this environment, it will run the file using the Julia program and then exit you back to your terminal shell. This is how you should run files from the terminal shell.
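To make the two modes concrete:
$ julia main.jl              # shell mode: runs the file with Julia, then returns you to the shell
julia> include("main.jl")    # REPL mode: runs the file inside the current Julia session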
A third way to run Julia files would be to install an IDE like VSCode. Then you can run code from a file with keyboard shortcuts rather than by typing commands.
See the Getting Started Documentation for more information.
Adding to Sundar R's answer, if you want to run script which takes commandline arguments from REPL, you can check this package: https://github.com/v-i-s-h/Runner.jl
It allows you to run your script from the REPL with args like:
julia> @runit "main.jl arg1 arg2"
See the README.md for detailed examples.

Solutions to the gargantuan Julia interpreter startup time

Currently, it takes a few seconds for the Julia interpreter to start on my machine when running any .jl file.
I'm wondering if there is a simple solution to this, such as a way to have a background pool of interpreters ready to execute scripts, or a way to make the Julia REPL, once opened, execute a .jl file (and possibly do so with the -p argument to properly handle parallel scripts)?
I'm wondering if there is a simple solution to this, such as [...] a way to make the Julia repl, once opened, execute a .jl file [...].
You can execute a .jl file in a running Julia REPL with the include() function. For example, to execute a file foo.jl, enter the Julia REPL and do:
julia> include("test.jl")
The file will then be executed within the REPL. However, this is unlikely to solve your problem, since executing the file will probably take multiple seconds as well: the REPL itself starts quickly, and the long execution time stems from Julia compiling the code (and the packages it loads) the first time it is run.
You can partially address this issue with Revise.jl. Revise.jl is a Julia package that automatically and quickly reloads your imported files and packages when they are edited. Thus, you could mitigate your issue by only having to load the .jl file once at startup. Here is a quick example of using Revise.jl:
julia> using Pkg
julia> Pkg.add("Example")
julia> using Revise # importantly, this must come before `using Example`
julia> using Example
julia> hello("world")
"Hello, world"

Why do which and Sys.which return different paths?

I tried to run a Python script from R with:
system('python script.py arg1 arg2')
And got an error:
ImportError: No module named pandas
This was a bit of a surprise since the script was working from the terminal as expected. Having encountered this type of issue before (with knitr, whence the engine.path chunk option), I know to check:
Sys.which('python')
# python
# "/usr/bin/python"
And compare it to the command line:
$ which python
# /Users/michael.chirico/anaconda2/bin/python
(i.e., the error arises because I have pandas installed for the anaconda distribution, though TBH I don't know why I have a different distribution)
Hence I can fix my issue by running:
system('/Users/michael.chirico/anaconda2/bin/python script.py arg1 arg2')
My question is two-fold:
How does R's system/Sys.which find a different python than my terminal?
How can I fix this besides writing out the full binary path each time?
I read ?Sys.which for some hints, but to no avail. In particular, ?Sys.which suggests Sys.which is using which:
This is an interface to the system command which
This is clearly (?) untrue; to be sure, I checked Sys.which('which') and which which to confirm both are pointing to /usr/bin/which (goaded on by this tidbit):
On a Unix-alike the full path to which (usually /usr/bin/which) is found when R is installed.
To the latter, on a whim I tried Sys.setenv(python = '/Users/michael.chirico/anaconda2/bin/python') to no avail.
As some of the comments hint, this is a problem that arises because the PATH environment variable is different for programs launched by Finder (or the Dock) than it is in the Terminal. There are ways to set the PATH for Dock-launched applications, but they aren't pretty. Here's a place to start looking if you want to go that route:
https://apple.stackexchange.com/questions/51677/how-to-set-path-for-finder-launched-applications
The other thing you can do, which is probably more straightforward, is tell R to set the PATH variable when it starts up, using Sys.setenv to add the path to your desired Python instance. You can do that for just one project, for your whole user account, or for the whole system, by placing the command in a .Rprofile file in the corresponding location. More information on how to do this here:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/Startup.html
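For example, a minimal sketch of the per-user approach: in ~/.Rprofile, prepend the Anaconda bin directory (the path below is the one from the question) so that system() and Sys.which() see the same python as your interactive shell:
# ~/.Rprofile
Sys.setenv(PATH = paste("/Users/michael.chirico/anaconda2/bin",
                        Sys.getenv("PATH"), sep = ":"))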

Run R command without entering R and without a script

I want to run an R command from command line (actually, from within a Makefile). The command is roxygen2::roxygenise(), if it is relevant. I don't want to create a new file and run that as a script - that will just clutter my directory.
In python, this is simple - you write python -c "import antigravity".
I use the Makefile to build, install and test a (Rcpp) package I'm working on.
This is generally done with so-called 'shebang scripts'.
Historically, littler was there first, about a decade or so ago. It is still widely used, and contains a number of helper scripts, such as roxy.r, which does just what you desire: run roxygen2::roxygenize(). I use this all the time.
Next, Rscript started to ship with R. It is similar to littler but automatically available wherever R is, which is a plus. On the minus side, it starts more slowly and fails to load the methods package, which has been the source of a number of bug reports and SO questions.
Much more recently, R itself added the ability to run expressions following the -e ... switch.
So you have plenty of choices. You can also study the many src/Makevars files in existing packages, a lot of which use Rscript.
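For the roxygen2 case specifically, the three approaches look roughly like this (littler's binary is r; use whichever tools you have installed):
$ r -e 'roxygen2::roxygenize()'          # littler
$ Rscript -e 'roxygen2::roxygenise()'    # Rscript
$ R -q -e 'roxygen2::roxygenise()'       # R itself, via the -e switch
Inside a Makefile you could wrap one of them in a rule, e.g. (the target name is arbitrary; the recipe line must be indented with a tab):
doc:
	Rscript -e 'roxygen2::roxygenise()'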

How to write tests for cpp functions in R package?

To speed up some functions in an R package, I have re-coded them as C++ functions using Rcpp and successfully embedded those C++ functions in the package. The next step is to test whether the C++ functions produce the same results as the original R functions, so writing tests is necessary.
However, I am stuck on this step. I have read some links:
Testing, R package by Hadley Wickham
and CRAN:testthat, page 11.
What I have done is that I ran devtools::use_testthat() to create a tests/testthat directory. Then I ran use_catch(dir = getwd()) to add a test file tests/testthat/test-cpp.R. At this point, I think expect_cpp_tests_pass() might work, but I am stuck on it. If the original function is called add_inflow and the C++ version add_inflow_Cpp, how can I test that these two functions are equal?
The documentation for ?use_catch attempts to describe exactly how this testing infrastructure works, so I'll just copy that as an answer:
Calling use_catch() will: create a file src/test-runner.cpp, which ensures that the testthat package will understand how to run your package's unit tests; create an example test file src/test-example.cpp, which showcases how you might use Catch to write a unit test; and add a test file tests/testthat/test-cpp.R, which ensures that testthat will run your compiled tests during invocations of devtools::test() or R CMD check.
C++ unit tests can be added to C++ source files within the src/ directory of your package, with a format similar to R code tested with testthat. When your package is compiled, these unit tests, alongside a harness for running them, will be compiled into your R package, with the C entry point run_testthat_tests(). testthat will use that entry point to run your unit tests when detected.
In short, if you want to write your own C++ unit tests using Catch, you can follow the example of the auto-generated test-example.cpp file. testthat will automatically run your tests, and report failures during the regular devtools::test() process.
Note that the use of Catch is specifically for writing unit tests at the C++ level. If you want to write R test code, then Catch won't be relevant for your use case.
One package you might look at as motivation is the icd package -- see this file for one example of how you might write Catch unit tests with the testthat wrappers.
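For the concrete question of checking that add_inflow and add_inflow_Cpp agree, you don't need Catch at all: since both are callable from R, a plain testthat test is enough. A minimal sketch, assuming both functions take the same arguments (here a single numeric vector; adjust to your actual signatures):
# tests/testthat/test-add-inflow.R
test_that("add_inflow_Cpp matches add_inflow", {
  x <- runif(100)
  expect_equal(add_inflow_Cpp(x), add_inflow(x))
})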
