JVM memory settings for specs2 - sbt

SBT keeps running out of memory on some of my bigger acceptance-style tests using specs2 and spray-testkit. I have 10 GB of RAM available, and I currently start SBT (using the sbt-extras script) with MaxPermSize at 512m, Xms at 1024m, and Xmx at 2g.
The acceptance test runs through a client's entire business process in a specific sequence, so it's not easy to split it into multiple smaller tests.
Any ideas on how I can configure my environment better, or gotchas I should look out for, would be appreciated.
For what it's worth, I'm using Oracle Java under Ubuntu, and the project uses Scala 2.10, sbt 0.12.2, spray 1.1-M7 with specs2 1.14.
When running the system outside of test, or when using smaller tests, everything runs like clockwork. It's only during larger tests that things go nutty.

One thing you can do is fork your tests; you can then set the memory settings directly in build.sbt:
fork in Test := true
javaOptions in Test += "-Xmx2048m" // we need lots of heap space
This means that the tests don't depend on you running with the SBT extras script, and the settings don't affect sbt itself. You can also set various other options (see Forking), including changing the working directory, and even the JRE to use.
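As a minimal sketch (the flag values are just the ones quoted in the question, not recommendations), the forked-test configuration in build.sbt might look like this:
// Fork the tests into a separate JVM so these flags don't affect sbt itself
fork in Test := true

// Same heap/PermGen flags previously passed to sbt via the sbt-extras script
javaOptions in Test ++= Seq(
  "-Xms1024m",
  "-Xmx2048m",
  "-XX:MaxPermSize=512m"
)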

I suspect you're hitting the exponential problem with the specs2 immutable style. The solution is simply to add more memory or bust your tests up into smaller chunks. More info is here:
http://www.artima.com/articles/compile_time.html

Related

Run hydra configured project with SLURM and Horovod

Right now, I am using Horovod to run distributed training of my PyTorch models. I would like to start using Hydra config for the --multirun feature and enqueue all jobs with SLURM. I know there is the Submitit plugin, but I am not sure how the whole pipeline would work with Horovod. Right now, my command for training looks as follows:
CUDA_VISIBLE_DEVICES=2,3 horovodrun -np 2 python training_script.py \
--batch_size 30 \
...
Say I want to use Hydra's --multirun to run several multi-GPU experiments. I want to enqueue the runs with SLURM, since my resources are limited and the jobs would mostly have to run sequentially, and I want to use Horovod to synchronize the gradients of my networks. Would this setup run out of the box? Would I still need to specify CUDA_VISIBLE_DEVICES if SLURM took care of the resources? How would I need to adjust my run command or other settings to make this setup work? I am especially interested in how the multirun feature handles GPU resources. Any recommendations are welcome.
The Submitit plugin does support GPU allocation, but I am not familiar with Horovod and have no idea if this can work in conjunction with it.
One new feature of Hydra 1.0 is the ability to set or copy environment variables from the launching process.
This might come in handy if Horovod relies on certain environment variables being set. See the docs for details.
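As a sketch, assuming the env_set / env_copy keys described in Hydra 1.0's job-configuration docs, the primary config could pass the GPU selection through like this:
hydra:
  job:
    env_set:                       # set variables in the job's environment
      CUDA_VISIBLE_DEVICES: "2,3"  # value taken from the question's command
    env_copy:                      # copy variables from the launching process
      - PATH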

What are RUN_HAVE_STD_REGEX, RUN_HAVE_POSIX_REGEX and RUN_HAVE_STEADY_CLOCK for?

In gRPC, when building for arm, I need to disable those three variables:
-DRUN_HAVE_STD_REGEX=OFF
-DRUN_HAVE_POSIX_REGEX=OFF
-DRUN_HAVE_STEADY_CLOCK=OFF
It is not super clear to me what they do, so I wonder:
Why is it that CMake cannot detect them automatically when cross-compiling?
What is the impact of disabling them, say on a system that does support them? Will it sometimes crash? Reduce performance in some situations?
Because they are not auto-detected by CMake, it would be easier for me to always disable them, if that works everywhere without major issues for my use case.
gRPC uses CMake's try_run to automatically detect if the platform supports a feature when cross-compiling. However, some variables need to be supplied manually. From the documentation (emphasis added):
When cross compiling, the executable compiled in the first step usually cannot be run on the build host. The try_run command checks the CMAKE_CROSSCOMPILING variable to detect whether CMake is in cross-compiling mode. If that is the case, it will still try to compile the executable, but it will not try to run the executable unless the CMAKE_CROSSCOMPILING_EMULATOR variable is set. Instead it will create cache variables which must be filled by the user or by presetting them in some CMake script file to the values the executable would have produced if it had been run on its actual target platform.
Basically, it's saying that CMake won't try to run the compiled executable on the build machine, so some test results must be specified manually (the results those tests would have produced on the target machine). The tests below will usually cause problems:
-DRUN_HAVE_STD_REGEX
-DRUN_HAVE_GNU_POSIX_REGEX
-DRUN_HAVE_POSIX_REGEX
-DRUN_HAVE_STEADY_CLOCK
Hopefully that answers your first question. I do not know how to answer your second question, as I have always just set those variables manually to match the features of whatever system I've compiled for.
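For reference, here is a hedged sketch of presetting those results on the command line when cross-compiling (the toolchain-file path is a placeholder; OFF means "the test program would not have succeeded on the target"):
# Preset the try_run results so CMake never attempts to run target binaries
cmake ../grpc \
  -DCMAKE_TOOLCHAIN_FILE=/path/to/arm-toolchain.cmake \
  -DRUN_HAVE_STD_REGEX=OFF \
  -DRUN_HAVE_GNU_POSIX_REGEX=OFF \
  -DRUN_HAVE_POSIX_REGEX=OFF \
  -DRUN_HAVE_STEADY_CLOCK=OFF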

SBT Builds with Bamboo

I was wondering if anyone could recommend best practice for SBT builds using Bamboo. I see that there is a Bamboo plugin for SBT; however, it is a) unsupported and b) not compatible with later versions of Bamboo. This combination would almost certainly be a blocker for us, as using it could lead to a position where we couldn't take a Bamboo update (potentially one fixing a security issue) because it would break all of our SBT builds.
Presumably you can just set up Bamboo to build SBT projects as a script task, but I'm a bit worried about the experience here, as it's not clear to me how things like failing tests and code coverage will be represented.
Is it possible to have a reasonably slick SBT and Bamboo setup without using the plugin or is Bamboo not a suitable CI system to use with SBT?
We rely heavily on Bamboo in our sbt workflows. The plugin works fine, but its only benefit over a short inline script is the parsing of test results, which is also available as a separate task.
We love having portable build scripts in the projects themselves, which can also be used by Bamboo.
So here is the starter guide:
have a good portable build script in your project (presumably a bash script), as sketched below
call this script from an inline script task in Bamboo (so you can do some other stuff as well, e.g. check out submodules, choose the Docker host, ...)
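A minimal sketch of such a script (the coverage step assumes the sbt-scoverage plugin; swap in whatever your build actually uses):
#!/usr/bin/env bash
# build.sh -- portable build entry point, runnable locally or from a Bamboo script task
set -euo pipefail

# Example extra step mentioned above: make sure submodules are present
git submodule update --init --recursive

# Clean build with tests; coverage/coverageReport come from sbt-scoverage (assumption)
sbt clean coverage test coverageReport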

Very slow debugging

I've cross-compiled Qt, created an SD card image, and mounted it using losetup. Compilation is much faster now compared to a direct sshfs mount. The application runs OK. Now I want to debug, which is dead slow; it appears that the files are being copied back to the dev machine for debugging. I see this suggestion:
File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
I'm using gdb-multiarch and have got gdbserver (on target board).
I'm kind of lost here. Where do I set this option? I've supplied the --sysroot argument to the binary, but to no avail. Any help is really appreciated.
Update: I'm using Qt Creator for development.
sysroot is a gdb setting. You can set it in gdb with the set sysroot command. For example:
(gdb) help set sysroot
Set an alternate system root.
The system root is used to load absolute shared library symbol files.
For other (relative) files, you can add directories using
`set solib-search-path'.
This setting controls how gdb tries to find various files it needs, and in particular the executable and shared libraries that you are debugging.
Recent versions of gdb default sysroot to target:, which means "fetch the files from the target". If you're debugging locally, this is just local filesystem access; but if you are debugging remotely and have a slow connection, this can be a bit painful. In order to make this faster, the idea is to keep a local copy of all the files you'll need, and then use set sysroot to point gdb at this local copy.
The main issue with this approach is that if your local copy is out of sync with the remote, you can end up confusing gdb and getting nonsense results. I am not certain but maybe enabling build-ids alleviates this problem somewhat (certainly in theory gdb can detect build-id mismatches and warn, I just don't recall whether it actually does).
As Tom Tromey suggested, adding set sysroot {my local sysroot path} as a startup command in the debugger has worked for me.
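For a plain gdb session (outside Qt Creator), a sketch might look like this; the sysroot path and target address are placeholders, and setting sysroot before connecting keeps gdb from pulling files over the remote protocol:
(gdb) set sysroot /home/dev/sdcard-image/rootfs
(gdb) target remote 192.168.0.10:2345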

Can one disable the sbt 1.x server?

Some of my builds and plugins make use of private deployment credentials (sometimes read from the file system, sometimes entered and retained in memory via the InteractionService).
Though perhaps it is overparanoid, I try to be careful to minimize the attack surface of software that uses private information, and it feels like bad hygiene to run a server, even on localhost or a UNIX socket, unnecessarily in these builds.
I've looked for a setting I could set in a plugin that would disable server startup unless overridden by the build. So far have not found anything like this. Is there such a setting?
Many thanks!
Update: With the help of Eugene Yokota, as of sbt 1.1.1 there is now a boolean autoStartServer setting. Builds and plugins can prevent the server from starting up automatically by setting autoStartServer := false. (Users can still manually start up the server by running startServer if they wish.)
As of sbt 1.1.0 at least, the server won't start unless you start the sbt shell, which means that if you're running sbt in batch mode (for example, sbt test) in a CI environment, it won't start a server.
To stop the server from starting automatically even in the shell, I've added a JVM flag, sbt.server.autostart. So running sbt as sbt -Dsbt.server.autostart=false would do it. You can set that globally by putting it into your SBT_OPTS.
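For example (where exactly you export this depends on your shell setup):
export SBT_OPTS="-Dsbt.server.autostart=false"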
To manually opt-in for server, you can then run:
> startServer
Update: Now that autoStartServer is a setting, you can write the following in ~/.sbt/1.0/global.sbt:
// This is so it works on sbt 1.x prior to 1.1.1
SettingKey[Boolean]("autoStartServer", "") := false