How do I effectively utilize the "build" step in Julia? - julia

Per the Julia Docs:
The build step is executed the first time a package is installed or when explicitly invoked with build. A package is built by executing the file deps/build.jl.
Why would I want to make a build.jl file and how do I effectively utilize the benefits it allows for?

Typically (historically) the build step and the build.jl file is used for downloading/installing binary dependencies with e.g. BinaryProvider (or its predecessor BinDeps)1. The build step can also be used to install configuration files. For example, IJulia uses build.jl in order to install a Julia kernel for Jupyter.
If your package is pure Julia code you typically don't have a build.jl file since there is no need for it.
1 With Julia 1.3 we have Artifacts which is meant to replace the BinaryProvider workflow and make build.jl obsolete for the purpose of downloading and installing prebuilt binaries.

Related

How to check which dependency has errored while building a package?

I am a new Julia user and am building my own package. When I build a package I created on my own I get this
(VFitApproximation) pkg> build VFitApproximation
Building MathLink → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/653c35640db592dfddd8c46dbb43d561cbef7862/build.log`
Building Conda ───→ `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/299304989a5e6473d985212c28928899c74e9421/build.log`
Building PyCall ──→ `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/169bb8ea6b1b143c5cf57df6d34d022a7b60c6db/build.log`
Progress [========================================>] 4/4
✗ VFitApproximation
3 dependencies successfully precompiled in 9 seconds (38 already precompiled)
1 dependency errored
I don't know which dependency has errored and how to fix that. How do I ask Julia what has errored?
It's VFitApproximation itself that has the error (it's the only one with an ✗ next to it). You should try starting a session and typing using VFitApproximation; if that causes an error, the message will tell you much more about the origin than build. If that doesn't directly trigger an error, then you can try the verbose mode of build as suggested by #sundar above. Julia's package manager runs a lot of its system-wide operations in parallel, which is wonderful when you have to build dozens or hundreds of packages, but under those circumstances you only get general summaries rather than the level of detail you can get from operations focused on a single package.
More generally, most packages don't need manual build: it's typically used for packages that require special configuration at the time of installation. Examples might include downloading a data set from the internet (though this now better handled through Artifacts), or saving configuration data about the user's hardware to a file, etc. For reference, on my system, out of 418 packages I have deved, only 20 have deps/build.jl scripts, and many of those only because they haven't yet been updated to use Artifacts.
Bottom line: for most code, you never need Pkg.build, and you should just use Pkg.precompile or using directly.

How do I add a local project to the import path in Julia?

I want to be able to import or using a package that I'm writing in a directory ~/projects/ExamplePkg from my main Julia REPL / from another project or environment.
By ]foo I mean "use the foo command at the Julia Pkg REPL". Type ] at the Julia REPL to enter the Pkg REPL. Use ]help <command name> for more info or check the link below.
Ensure that your package has a Project.toml that gives it a UUID and names it (generate one with ]generate from the Julia REPL or with the PkgTemplates package) and that it is in a git repo with at least one commit including all the relevant files.
Then choose how you would like to use the package.
You probably want to run ]dev ~/projects/ExamplePkg:
If dev is used on a local path, that path to that package is recorded and used when loading that package. The path will be recorded relative to the project file, unless it is given as an absolute path.
If you use dev and you change the dependencies in the dev'd package, then you should probably run ]resolve in all environments that depend on the package.
Or you can run ]add ~/projects/ExamplePkg:
Instead of giving a URL of a git repo to add we could instead have given a local path to a git repo. This works similarly to adding a URL. The local repository will be tracked (at some branch) and updates from that local repo are pulled when packages are updated. Note that changes to files in the local package repository will not immediately be reflected when loading that package. The changes would have to be committed and the packages updated in order to pull in the changes.
In Julia versions <1.4: If you accidentally ]add a package before the git repo is set up correctly then you might get ERROR: GitError(Code:EUNBORNBRANCH, Class:Reference, reference 'refs/heads/master' not found). Unfortunately, Julia will probably have cached the bad repo, and you will need to remove that from ~/.julia/clones/<gibberish>/. You can find the dir to remove with grep: $ grep ExamplePkg ~/.julia/clones/*/config.
Documentation: https://julialang.github.io/Pkg.jl/v1/managing-packages/
you can try
path_to_package = "~/projects/ExamplePkg"
push!(LOAD_PATH,path_to_package)
# then use it, ExamplePkg is the package's name
using ExamplePkg
But you have to run codes above whenever you restart Julia.
reference is Workflow tips-Julia Documentation

When should a Julia project have a Manifest AND Project file?

I am trying to understand when a Julia Project needs a Manifest AND Project file vs when it just needs a project file. What are the different situations that warrant each case? I am trying to make sure my own project is set up correctly(It has both files currently).
The Manifest.toml is a snapshot of the exact state of a Julia environment. It specifies all packages that are installed in the environment with version numbers - not just the ones that have been ] added but the entire dependency graph!
The Project.toml on the other hand just lists the direct dependencies, that is the packages that have been ] added explicitly, potentially with version bounds specified in a [compat] section.
By checking in both files (specifically the Manifest.toml), you make your project reproducible. Another user only has to ] instantiate and will have the exact same environment that you had when working on the project. This is great for application projects which might consist of multiple Julia scripts which are not intended for use by other Julia projects.
If you only check in the Project.toml you are specifying the dependency information more loosely and will leave room for Julias resolver to find appropriate package versions for all dependencies. This is what you should do when working on a Julia package since people will likely want to install your package next to other packages and overly restricting versions of dependencies will make your package incompatible.
Hence, I'd summarize as follows:
Application / "Project" -> Project.toml + Manifest.toml
Julia Package -> Only Project.toml
For more on applications and packages checkout the glossary of the Pkg.jl documentation.
(Note that there are exceptional cases (unregistered dependencies, for example) where you might have to check in a Manifest.toml for a Julia package.)
In Julia 1.2 and above, you can have nested Project.toml files to express test-specific dependencies. Since you may have a Project.toml in your test folder, which you would need to activate, I would also suggest including a Manifest.toml, as a record of under which environment you for-sure know your package's test are passing.
In other words, I believe in the package/application dichotomy mentioned in crstnbr's answer, and the recommendation to include Manifest.toml with applications, and I would further say that the tests within a package are like an application. The same goes for performance benchmarks that you might have in your package.
I haven't practiced this myself, but it seems like it would be nice to have the CI tests run under both the "frozen" version of the test/Manifest.toml, and the latest versions that the package manager can find of each package. If the tests start failing, it would be easier to tease apart whether the breakage is caused by a change in a dependency.

Compiling haskell module Network on win32/cygwin

I am trying to compile Network.HTTP (http://hackage.haskell.org/package/network) on win32/cygwin. However, it does fail with following message:
Setup.hs: Missing dependency on a foreign library:
* Missing (or bad) header file: HsNet.h
This problem can usually be solved by installing the system package that
provides this library (you may need the "-dev" version). If the library is
already installed but in a non-standard location then you can use the flags
--extra-include-dirs= and --extra-lib-dirs= to specify where it is.
If the header file does exist, it may contain errors that are caught by the C
compiler at the preprocessing stage. In this case you can re-run configure
with the verbosity flag -v3 to see the error messages.
Unfortuntely it does not give more clues. The HsNet.h includes sys/uio.h which, actually should not be included, and should be configurered correctly.
Don't use cygwin, instead follow Johan Tibells way
Installing MSYS
Install the latest Haskell Platform. Use the default settings.
Download version 1.0.11 of MSYS. You'll need the following files:
MSYS-1.0.11.exe
msysDTK-1.0.1.exe
msysCORE-1.0.11-bin.tar.gz
The files are all hosted on haskell.org as they're quite hard to find in the official MinGW/MSYS repo.
Run MSYS-1.0.11.exe followed by msysDTK-1.0.1.exe. The former asks you if you want to run a normalization step. You can skip that.
Unpack msysCORE-1.0.11-bin.tar.gz into C:\msys\1.0. Note that you can't do that using an MSYS shell, because you can't overwrite the files in use, so make a copy of C:\msys\1.0, unpack it there, and then rename the copy back to C:\msys\1.0.
Add C:\Program Files\Haskell Platform\VERSION\mingw\bin to your PATH. This is neccesary if you ever want to build packages that use a configure script, like network, as configure scripts need access to a C compiler.
These steps are what Tibell uses to compile the Network package for win and I have used this myself successfully several times on most of the haskell platform releases.
It is possible to build network on win32/cygwin. And the above steps, though useful (by Jonke) may not be necessary.
While doing the configuration step, specify
runghc Setup.hs configure --configure-option="--build=mingw32"
So that the library is configured for mingw32, else you will get link or "undefined references" if you try to link or use network library.
This combined with #Yogesh Sajanikar's answer made it work for me (on win64/cygwin):
Make sure the gcc on your path is NOT the Mingw/Cygwin one, but the
C:\ghc\ghc-6.12.1\mingw\bin\gcc.exe
(Run
export PATH="/cygdrive/.../ghc-7.8.2/mingw/bin:$PATH"
before running cabal install network in the Cygwin shell)

Dependency management in R

Does R have a dependency management tool to facilitate project-specific dependencies? I'm looking for something akin to Java's maven, Ruby's bundler, Python's virtualenv, Node's npm, etc.
I'm aware of the "Depends" clause in the DESCRIPTION file, as well as the R_LIBS facility, but these don't seem to work in concert to provide a solution to some very common workflows.
I'd essentially like to be able to check out a project and run a single command to build and test the project. The command should install any required packages into a project-specific library without affecting the global R installation. E.g.:
my_project/.Rlibs/*
Unfortunately, Depends: within the DESCRIPTION: file is all you get for the following reasons:
R itself is reasonably cross-platform, but that means we need this to work across platforms and OSs
Encoding Depends: beyond R packages requires encoding the Depends in a portable manner across operating systems---good luck encoding even something simple such as 'a PNG graphics library' in a way that can be resolved unambiguously across systems
Windows does not have a package manager
AFAIK OS X does not have a package manager that mixes what Apple ships and what other Open Source projects provide
Even among Linux distributions, you do not get consistency: just take RStudio as an example which comes in two packages (which all provide their dependencies!) for RedHat/Fedora and Debian/Ubuntu
This is a hard problem.
The packrat package is precisely meant to achieve the following:
install any required packages into a project-specific library without affecting the global R installation
It allows installing different versions of the same packages in different project-local package libraries.
I am adding this answer even though this question is 5 years old, because this solution apparently didn't exist yet at the time the question was asked (as far as I can tell, packrat first appeared on CRAN in 2014).
Update (November 2019)
The new R package renv replaced packrat.
As a stop-gap, I've written a new rbundler package. It installs project dependencies into a project-specific subdirectory (e.g. <PROJECT>/.Rbundle), allowing the user to avoid using global libraries.
rbundler on Github
rbundler on CRAN
We've been using rbundler at Opower for a few months now and have seen a huge improvement in developer workflow, testability, and maintainability of internal packages. Combined with our internal package repository, we have been able to stabilize development of a dozen or so packages for use in production applications.
A common workflow:
Check out a project from github
cd into the project directory
Fire up R
From the R console:
library(rbundler)
bundle('.')
All dependencies will be installed into ./.Rbundle, and an .Renviron file will be created with the following contents:
R_LIBS_USER='.Rbundle'
Any R operations run from within this project directory will adhere to the project-speciic library and package dependencies. Note that, while this method uses the package DESCRIPTION to define dependencies, it needn't have an actual package structure. Thus, rbundler becomes a general tool for managing an R project, whether it be a simple script or a full-blown package.
You could use the following workflow:
1) create a script file, which contains everything you want to setup and store it in your projectd directory as e.g. projectInit.R
2) source this script from your .Rprofile (or any other file executed by R at startup) with a try statement
try(source("./projectInit.R"), silent=TRUE)
This will guarantee that even when no projectInit.R is found, R starts without error message
3) if you start R in your project directory, the projectInit.R file will be sourced if present in the directory and you are ready to go
This is from a Linux perspective, but should work in the same way under windows and Mac as well.

Resources