How to check which dependency has errored while building a package? - julia

I am a new Julia user and am building my own package. When I build a package I created on my own I get this
(VFitApproximation) pkg> build VFitApproximation
Building MathLink → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/653c35640db592dfddd8c46dbb43d561cbef7862/build.log`
Building Conda ───→ `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/299304989a5e6473d985212c28928899c74e9421/build.log`
Building PyCall ──→ `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/169bb8ea6b1b143c5cf57df6d34d022a7b60c6db/build.log`
Progress [========================================>] 4/4
✗ VFitApproximation
3 dependencies successfully precompiled in 9 seconds (38 already precompiled)
1 dependency errored
I don't know which dependency has errored and how to fix that. How do I ask Julia what has errored?

It's VFitApproximation itself that has the error (it's the only one with an ✗ next to it). You should try starting a session and typing using VFitApproximation; if that causes an error, the message will tell you much more about the origin than build. If that doesn't directly trigger an error, then you can try the verbose mode of build as suggested by #sundar above. Julia's package manager runs a lot of its system-wide operations in parallel, which is wonderful when you have to build dozens or hundreds of packages, but under those circumstances you only get general summaries rather than the level of detail you can get from operations focused on a single package.
More generally, most packages don't need manual build: it's typically used for packages that require special configuration at the time of installation. Examples might include downloading a data set from the internet (though this now better handled through Artifacts), or saving configuration data about the user's hardware to a file, etc. For reference, on my system, out of 418 packages I have deved, only 20 have deps/build.jl scripts, and many of those only because they haven't yet been updated to use Artifacts.
Bottom line: for most code, you never need Pkg.build, and you should just use Pkg.precompile or using directly.

Related

Ignore a dependency during R CMD check

I have a package that I developed to enable my team (and perhaps other interested users) to install and use a particular R package (RQDA) that was archived on CRAN. I have hosted this package on GitHub and am trying to set up GitHub Actions so that I have a CI workflow in place.
Whenever I run R CMD check locally everything is fine, but when I push to GitHub the build fails. This is because, by default, Actions tries to install that same (archived) package. Expectedly, this fails.
So, my question is this: is there a way I can disable the check for a specific package dependency? There are no plans to ever send this package to CRAN, so I am happy to bypass their package policy in this instance.
2 possible ways:
Upload the source for RQDA to a Github repo, or other publicly accessible location, and put a Remotes: line in your DESCRIPTION file
Save the package to cloud storage, eg an S3 bucket or Azure storage container, and download it from there as a separate workflow step prior to checking
This is how I was able to deal with the problem:
The changes made were entirely in the workflow file at ./.github/workflows/. One of the jobs there is for installing R package dependencies for the project:
- name: Install dependencies
run: |
remotes::install_deps(dependencies = TRUE)
remotes::install_cran("rcmdcheck")
shell: Rscript {0}
The first thing I did was to change the dependencies argument to NA so that only packages listed in Depends and Imports are installed. (The RQDA dependency that was giving me trouble is under Suggests).
There was still an error but this time with some guidance that involved going to the job Check and setting the environment variable _R_CHECK_FORCE_SUGGESTS_ to false.
The check now works as expected.

Virtual environment in R?

I've found several posts about best practice, reproducibility and workflow in R, for example:
How to increase longer term reproducibility of research (particularly using R and Sweave)
Complete substantive examples of reproducible research using R
One of the major preoccupations is ensuring portability of code, in the sense that moving it to a new machine (possibly running a different OS) is relatively straightforward and gives the same results.
Coming from a Python background, I'm used to the concept of a virtual environment. When coupled with a simple list of required packages, this goes some way to ensuring that the installed packages and libraries are available on any machine without too much fuss. Sure, it's no guarantee - different OSes have their own foibles and peculiarities - but it gets you 95% of the way there.
Does such a thing exist within R? Even if it's not as sophisticated. For example simply maintaining a plain text list of required packages and a script that will install any that are missing?
I'm about to start using R in earnest for the first time, probably in conjunction with Sweave, and would ideally like to start in the best way possible! Thanks for your thoughts.
I'm going to use the comment posted by #cboettig in order to resolve this question.
Packrat
Packrat is a dependency management system for R. Gives you three important advantages (all of them focused in your portability needs)
Isolated : Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because packrat gives each project its own private package library.
Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.
What's next?
Walkthrough guide: http://rstudio.github.io/packrat/walkthrough.html
Most common commands: http://rstudio.github.io/packrat/commands.html
Using Packrat with RStudio: http://rstudio.github.io/packrat/rstudio.html
Limitations and caveats: http://rstudio.github.io/packrat/limitations.html
Update: Packrat has been soft-deprecated and is now superseded by renv, so you might want to check this package instead.
The Anaconda package manager conda supports creating R environments.
conda create -n r-environment r-essentials r-base
conda activate r-environment
I have had a great experience using conda to maintain different Python installations, both user specific and several versions for the same user. I have tested R with conda and the jupyter-notebook and it works great. At least for my needs, which includes RNA-sequencing analyses using the DEseq2 and related packages, as well as data.table and dplyr. There are many bioconductor packages available in conda via bioconda and according to the comments on this SO question, it seems like install.packages() might work as well.
It looks like there is another option from RStudio devs, renv. It's available on CRAN and supersedes Packrat.
In short, you use renv::init() to initialize your project library, and use renv::snapshot() / renv::restore() to save and load the state of your library.
I prefer this option to conda r-enviroments because here everything is stored in the file renv.lock, which can be committed to a Git repo and distributed to the team.
To add to this:
Note:
1. Have Anaconda installed already
2. Assumed your working directory is "C:"
To create desired environment -> "r_environment_name"
C:\>conda create -n "r_environment_name" r-essentials r-base
To see available environments
C:\>conda info --envs
.
..
...
To activate environment
C:\>conda activate "r_environment_name"
(r_environment_name) C:\>
Launch Jupyter Notebook and let the party begins
(r_environment_name) C:\> jupyter notebook
For a similar "requirements.txt", perhaps this link will help -> Is there something like requirements.txt for R?
Check out roveR, the R container management solution. For details, see https://www.slideshare.net/DavidKunFF/ownr-technical-introduction, in particular slide 12.
To install roveR, execute the following command in R:
install.packages("rover", repos = c("https://lair.functionalfinances.com/repos/shared", "https://lair.functionalfinances.com/repos/cran"))
To make full use of the power of roveR (including installing specific versions of packages for reproducibility), you will need access to a laiR - for CRAN, you can use our laiR instance at https://lair.ownr.io, for uploading your own packages and sharing them with your organization you will need a laiR license. You can contact us on the email address in the presentation linked above.

how to make sure a debian package does not have a dependency

I am building a debian package using dpkg.
The package has a dependency on libvirt which is not desired.
The rules file does not specify this dependency, but it is added by dpkg, I suppose due to some calls to libvirt-dev at build time.
However my package works fine without libvirt. As such, libvirt is a "Recommended" package but not "Required". How do I override this dependency and make sure it is not present in my final deb file ?
Hard to know without seeing your actual package, but I'd guess that you have a binary or shared library which is linked against libvirt. That would cause dh_shlibdeps to include libvirt in the ${shlibs:Depends} substvar.
If that's your problem, then the right fix depends on what's getting linked to libvirt. It should be straightforward to determine; just run ldd on each binary or shared library object in your package, and grep for "libvirt".
If the thing linked against libvirt is only incidental to the package, and isn't part of the main functionality, then using Recommends: would indeed be the right thing. To make dh_shlibdeps exclude that object from its dependency scanning, give it a -X option. Example target for debian/rules, assuming debhelper7-style packaging:
override_dh_shlibdeps:
dh_shlibdeps -Xname_of_your_object_to_exclude
If the thing(s) linked to libvirt actually are an important part of the package functionality, then the generated libvirt dependency is appropriate. If you still don't want it, you'll need to work out how to avoid linking against libvirt during your build.

Compiling haskell module Network on win32/cygwin

I am trying to compile Network.HTTP (http://hackage.haskell.org/package/network) on win32/cygwin. However, it does fail with following message:
Setup.hs: Missing dependency on a foreign library:
* Missing (or bad) header file: HsNet.h
This problem can usually be solved by installing the system package that
provides this library (you may need the "-dev" version). If the library is
already installed but in a non-standard location then you can use the flags
--extra-include-dirs= and --extra-lib-dirs= to specify where it is.
If the header file does exist, it may contain errors that are caught by the C
compiler at the preprocessing stage. In this case you can re-run configure
with the verbosity flag -v3 to see the error messages.
Unfortuntely it does not give more clues. The HsNet.h includes sys/uio.h which, actually should not be included, and should be configurered correctly.
Don't use cygwin, instead follow Johan Tibells way
Installing MSYS
Install the latest Haskell Platform. Use the default settings.
Download version 1.0.11 of MSYS. You'll need the following files:
MSYS-1.0.11.exe
msysDTK-1.0.1.exe
msysCORE-1.0.11-bin.tar.gz
The files are all hosted on haskell.org as they're quite hard to find in the official MinGW/MSYS repo.
Run MSYS-1.0.11.exe followed by msysDTK-1.0.1.exe. The former asks you if you want to run a normalization step. You can skip that.
Unpack msysCORE-1.0.11-bin.tar.gz into C:\msys\1.0. Note that you can't do that using an MSYS shell, because you can't overwrite the files in use, so make a copy of C:\msys\1.0, unpack it there, and then rename the copy back to C:\msys\1.0.
Add C:\Program Files\Haskell Platform\VERSION\mingw\bin to your PATH. This is neccesary if you ever want to build packages that use a configure script, like network, as configure scripts need access to a C compiler.
These steps are what Tibell uses to compile the Network package for win and I have used this myself successfully several times on most of the haskell platform releases.
It is possible to build network on win32/cygwin. And the above steps, though useful (by Jonke) may not be necessary.
While doing the configuration step, specify
runghc Setup.hs configure --configure-option="--build=mingw32"
So that the library is configured for mingw32, else you will get link or "undefined references" if you try to link or use network library.
This combined with #Yogesh Sajanikar's answer made it work for me (on win64/cygwin):
Make sure the gcc on your path is NOT the Mingw/Cygwin one, but the
C:\ghc\ghc-6.12.1\mingw\bin\gcc.exe
(Run
export PATH="/cygdrive/.../ghc-7.8.2/mingw/bin:$PATH"
before running cabal install network in the Cygwin shell)

Dependency management in R

Does R have a dependency management tool to facilitate project-specific dependencies? I'm looking for something akin to Java's maven, Ruby's bundler, Python's virtualenv, Node's npm, etc.
I'm aware of the "Depends" clause in the DESCRIPTION file, as well as the R_LIBS facility, but these don't seem to work in concert to provide a solution to some very common workflows.
I'd essentially like to be able to check out a project and run a single command to build and test the project. The command should install any required packages into a project-specific library without affecting the global R installation. E.g.:
my_project/.Rlibs/*
Unfortunately, Depends: within the DESCRIPTION: file is all you get for the following reasons:
R itself is reasonably cross-platform, but that means we need this to work across platforms and OSs
Encoding Depends: beyond R packages requires encoding the Depends in a portable manner across operating systems---good luck encoding even something simple such as 'a PNG graphics library' in a way that can be resolved unambiguously across systems
Windows does not have a package manager
AFAIK OS X does not have a package manager that mixes what Apple ships and what other Open Source projects provide
Even among Linux distributions, you do not get consistency: just take RStudio as an example which comes in two packages (which all provide their dependencies!) for RedHat/Fedora and Debian/Ubuntu
This is a hard problem.
The packrat package is precisely meant to achieve the following:
install any required packages into a project-specific library without affecting the global R installation
It allows installing different versions of the same packages in different project-local package libraries.
I am adding this answer even though this question is 5 years old, because this solution apparently didn't exist yet at the time the question was asked (as far as I can tell, packrat first appeared on CRAN in 2014).
Update (November 2019)
The new R package renv replaced packrat.
As a stop-gap, I've written a new rbundler package. It installs project dependencies into a project-specific subdirectory (e.g. <PROJECT>/.Rbundle), allowing the user to avoid using global libraries.
rbundler on Github
rbundler on CRAN
We've been using rbundler at Opower for a few months now and have seen a huge improvement in developer workflow, testability, and maintainability of internal packages. Combined with our internal package repository, we have been able to stabilize development of a dozen or so packages for use in production applications.
A common workflow:
Check out a project from github
cd into the project directory
Fire up R
From the R console:
library(rbundler)
bundle('.')
All dependencies will be installed into ./.Rbundle, and an .Renviron file will be created with the following contents:
R_LIBS_USER='.Rbundle'
Any R operations run from within this project directory will adhere to the project-speciic library and package dependencies. Note that, while this method uses the package DESCRIPTION to define dependencies, it needn't have an actual package structure. Thus, rbundler becomes a general tool for managing an R project, whether it be a simple script or a full-blown package.
You could use the following workflow:
1) create a script file, which contains everything you want to setup and store it in your projectd directory as e.g. projectInit.R
2) source this script from your .Rprofile (or any other file executed by R at startup) with a try statement
try(source("./projectInit.R"), silent=TRUE)
This will guarantee that even when no projectInit.R is found, R starts without error message
3) if you start R in your project directory, the projectInit.R file will be sourced if present in the directory and you are ready to go
This is from a Linux perspective, but should work in the same way under windows and Mac as well.

Resources