How to do file overrides in Hydra? - fb-hydra

I have a main config file, let's say config.yaml:
num_layers: 4
embedding_size: 512
learning_rate: 0.2
max_steps: 200000
I'd like to be able to override this, on the command-line, with another file, like say big_model.yaml, which I'd use conceptually like:
python my_script.py --override big_model.yaml
and big_model.yaml might look like:
num_layers: 8
embedding_size: 1024
I'd like to be able to override with an arbitrary number of such files, each one taking priority over the last. Let's say I also have fast_learn.yaml
learning_rate: 2.0
And so I'd then want to conceptually do something like:
python my_script.py --override big_model.yaml --override fast_learn.yaml
What is the easiest/most standard way to do this in hydra? (or potentially in omegaconf perhaps?)
(Note that I'd ideally like these override files to be standard YAML files that override the earlier YAML files; though if I have to write them using the override DSL instead, I can do that, if that's the easiest/best/most standard way.)

It sounds like package overrides might be a good solution for you.
The documentation can be found here: https://hydra.cc/docs/next/advanced/overriding_packages
An example application can be found here:
https://github.com/facebookresearch/hydra/tree/master/examples/advanced/package_overrides
Using that example application, you can achieve the override by doing something like:
$ python simple.py db=postgresql db.pass=helloworld
db:
  driver: postgresql
  user: postgre_user
  pass: helloworld
  timeout: 10

Refer to the basic tutorial and read about config groups.
You can create arbitrary config groups and select one option from each (as of Hydra 1.0, config group options are mutually exclusive). You will need two config groups here:
one can be model, with normal, small and big options, and another can be trainer, with perhaps normal and fast options.
Config groups can also override things in other config groups.
You can also always append to the defaults list from the command line, so you can add additional config groups that are only used from the command line.
An example of that is an 'experiment' config group. You can use it as:
$ python train.py +experiment=exp1
In config groups like this, which override things across the entire config, you should use the global package (read more about packages in the docs).
# @package _global_
num_layers: 8
embedding_size: 1024
learning_rate: 2.0
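
The "each file takes priority over the last" behaviour asked for in the question can be illustrated in plain Python. This is only a sketch of the layering semantics, not Hydra's or OmegaConf's implementation: the dicts stand in for the parsed contents of config.yaml, big_model.yaml and fast_learn.yaml, and merge_layers is a hypothetical helper name. On real config objects, OmegaConf.merge offers similar layering.

```python
from copy import deepcopy


def merge_layers(base, *overrides):
    """Return a new dict in which each later layer overrides the earlier ones."""
    merged = deepcopy(base)
    for layer in overrides:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Nested sections are merged key by key rather than replaced.
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = deepcopy(value)
    return merged


# Dicts standing in for the parsed contents of the three YAML files.
config = {
    "num_layers": 4,
    "embedding_size": 512,
    "learning_rate": 0.2,
    "max_steps": 200000,
}
big_model = {"num_layers": 8, "embedding_size": 1024}
fast_learn = {"learning_rate": 2.0}

final = merge_layers(config, big_model, fast_learn)
print(final)
# {'num_layers': 8, 'embedding_size': 1024, 'learning_rate': 2.0, 'max_steps': 200000}
```

Keys that no override mentions (max_steps here) keep their value from the base config, and the base dict itself is left unmodified.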


Why does a Julia module have to be prefixed with a dot?

Why does the module in using .A have to be prefixed with a dot? It doesn't work if you omit the dot.
File ./A.jl
module A
export sayHi
function sayHi()
    println("hi")
end
end
File ./Main.jl
include("./A.jl")
using .A # <= Why it has to be prefixed with dot?
sayHi()
To run it, start the REPL and type
include("./Main.jl")
Part 2
And if you move the file A.jl to a different location, like ../some-dir/A.jl, it has to be prefixed with two dots, as in using ..A. Why?
Because you define module A inside your current module. The dot means "look inside the current module for this". https://docs.julialang.org/en/v1/manual/modules/#Relative-and-absolute-module-paths-1
After digging into it more deeply, it seems like the answer is: don't use modules.
The documentation is wrong, it says
In reality, module usage is heavily tied to the location of files: it could be using Foo, using .Foo, using ..Foo or using Main.Foo, depending on the location of the Foo module relative to the file that imports it. In my personal opinion, something is very wrong with that design.
There is no support in the VSCode editor: it doesn't understand using ..Foo. There are other ways to use modules, including altering startup.jl or JULIA_LOAD_PATH, but none of them work either. I assume nobody notices these problems because nobody actually uses modules.
The top answer on YCombinator gives the same advice: the best way to use modules in Julia is to not use them at all. https://news.ycombinator.com/item?id=19232824

What *is* a salt formula, really?

I am trying to work through the Salt Formulas documentation and seem to be having a fundamental misunderstanding of what a salt formula really is.
Understandably, this question may seem like a duplicate of these questions, but due to my failing to grasp the basic concepts I'm also struggling to make use of the answers to these questions.
I thought, that a salt formula is basically just a package that implements extra functions, a lot like
#include <string.h>
in C, or
import numpy as np
in Python. Thus, I thought, I could download the salt-formula-linux to /srv/formulas/salt-formula-linux/, add that to file_roots, restart the master (all as per the docs), and then write a file like swapoff.sls containing
disable_swap:
  linux:
    storage:
      swap:
        file:
          enabled: False
(the above is somewhat similar to the examples in the repo's root) in hope that the formula would then handle removing the swap entry from /etc/fstab and running swapoff -a for me. Needless to say, this didn't work, clearly because I'm not understanding what a salt formula is meant to be.
So, what is a salt formula and how do I use it? Can I make use of it as a library of functions too?
This answer might not be fully correct in all technicalities, but this is what solved my problem.
A salt formula is not a library of functions. It is, rather, a collection of state files. While a state file can often be very simple, such as some of my user-defined ones
--> top.sls <--
base:
  '*':
    - docker
--> docker.sls <--
install_docker_1703:
  pkgrepo.managed:
    # stuff
  pkg.installed:
    - name: docker-ce
creating a state file like
--> swapoff.sls <--
disable_swap:
  linux.storage.swap: # and so on
is, perhaps, not the way to go. Well, at least, maybe not for a beginner with limited knowledge.
Instead, add an item to top.sls:
- linux.storage.swap
This is not enough, however. Most formulas (or the state files within them, if you will) are highly parametrizable, i.e. they're full of placeholders with variable names, such as {{ swap.device }}. If there's nothing to fill this gap, the state file will not be able to do anything. These gaps are filled from pillars.
All that remains is to create a file like swap.sls in /srv/pillar/ that would contain something like (as per the examples of that formula)
linux:
  storage:
    enabled: true
    swap:
      file:
        enabled: true
        engine: file
        device: /swapfile
        size: 1024
and also /srv/pillar/top.sls with
base:
  '*':
    - swap
Perhaps /srv/pillar should also be included in pillar_roots in /etc/salt/master.
So now /srv/salt/top.sls runs /srv/formulas/salt-formula-linux/linux/storage/swap.sls, which, guided by /srv/pillar/top.sls, pulls its parameters from /srv/pillar/swap.sls and enables a swapfile.
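
To make the "placeholders filled from pillars" idea more concrete, here is a rough plain-Python illustration. It is not how Salt actually renders states (Salt uses a full template engine). The lookup and render helpers, the one-line template and the full dotted paths in it are all hypothetical; only the pillar data is taken from the example above.

```python
import re


def lookup(data, dotted_path):
    """Walk a nested dict along a dotted path such as 'a.b.c'."""
    value = data
    for part in dotted_path.split("."):
        value = value[part]  # raises KeyError if the pillar lacks this key
    return value


def render(template, pillar):
    """Replace each {{ dotted.path }} placeholder with the matching pillar value."""
    pattern = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")
    return pattern.sub(lambda m: str(lookup(pillar, m.group(1))), template)


# The pillar data from /srv/pillar/swap.sls, written as a Python dict.
pillar = {
    "linux": {
        "storage": {
            "enabled": True,
            "swap": {
                "file": {
                    "enabled": True,
                    "engine": "file",
                    "device": "/swapfile",
                    "size": 1024,
                }
            },
        }
    }
}

# Hypothetical template line; real formula templates look different.
template = "device={{ linux.storage.swap.file.device }} size={{ linux.storage.swap.file.size }}"
rendered = render(template, pillar)
print(rendered)  # device=/swapfile size=1024
```

With an empty pillar the lookup fails with a KeyError, which mirrors the point that a state full of placeholders cannot do anything until the pillar supplies the values.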

Gulp — how to get lazy, ‘make’-like building?

I am using gulp for CSS and JS processing. Sometimes I miss the good old laziness of the Unix make command:
only generate transformed files (by whatever means, e.g. compilation) from original files that have actually changed (based on timestamps).
this holds from stage 1 to 2 (.cpp -> .o), stage 2 to 3 (linking or other steps), and so on, whatever your dependency graph gives.
Make is not limited to source code: you can do image manipulation in several steps (efficiently ‘lazy’ generation of downscaled thumbs, for example) or much else. All based on one fairly simple rule: "is at least one of the source files newer than the current output file(s)?"
Unlike gulp, every step generates (more or less temporary) files, not a continuous pipe.
Is there a way to get the same kind of laziness in gulp, e.g. when generating CSS?
only transform those (less|sass|stylus) files➝css if something changed (on the very respective file)
same for adding in browser prefixes, concat, minify
Admittedly, beyond the first one or two steps, the output is most likely already a single stream, so any change means ‘touched’. Still, when playing with minify options, for example, I'd rather be lazy about the early transpile, prefixing and concat stages (drawing prior results from a temp file). The same goes for the JavaScript side (TypeScript, ...).
lazypipe and gulp-cache sound tempting but are something else, if I understand correctly. Saying .watch() is also only a partial answer, for the very first stage.
Is there a more generic approach?
If you're set on using Gulp, then the gulp-cached and gulp-remember plugins would seem to be the way to do it.
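
For reference, the make-style rule quoted in the question ("is at least one of the source files newer than the output?") is small enough to sketch directly. The following is a plain-Python illustration of that timestamp check only, not of how gulp-cached or gulp-remember work internally; needs_rebuild and the file names are hypothetical.

```python
import os
import tempfile


def needs_rebuild(sources, output):
    """Return True if output is missing or older than at least one source."""
    if not os.path.exists(output):
        return True
    output_mtime = os.path.getmtime(output)
    return any(os.path.getmtime(src) > output_mtime for src in sources)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "style.less")
    out = os.path.join(tmp, "style.css")

    with open(src, "w") as f:
        f.write("a { color: red; }\n")
    missing_output = needs_rebuild([src], out)  # no output file yet

    with open(out, "w") as f:
        f.write("a{color:red}\n")
    # Set explicit timestamps so the comparison does not depend on clock resolution.
    os.utime(src, (1000, 1000))
    os.utime(out, (2000, 2000))
    up_to_date = needs_rebuild([src], out)  # output newer than source

    os.utime(src, (3000, 3000))
    stale = needs_rebuild([src], out)  # source touched after the last build

print(missing_output, up_to_date, stale)  # True False True
```

Each build stage would apply the same check to its own inputs and outputs, which is what requires intermediate files on disk rather than one continuous pipe.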

Is it possible to specify which tests to choose from?

We have a vast amount of tests. We would like infinitest only to choose between tests that have been included in an .xml-file (i.e. a TestNG suite).
We do not want to put the annotation groups = { "shouldbetested" } in every testcase but rather feed the info from our .xml file into infinitest.
Is this possible?
Is there another tool that could do that for us?
You can use a regular expression to "not" skip a certain test:
(?!.*YourTest)
Infinitest can filter out the tests you don't want to run using regular expressions in the infinitest.filters file.
The infinitest.filters contains regular expressions (one per line) that match the test classes you want to filter. Put this file in the root of your project (a.k.a. the working directory), and Infinitest will filter those tests out.
Note that the class names include package names, so use .* in front to match any package.
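
The effect of the negative lookahead can be checked with a short Python sketch. It assumes the filter pattern is matched from the start of the fully qualified class name, which is how Python's re.match behaves; whether Infinitest applies exactly this matching mode is an assumption, and the class names are hypothetical.

```python
import re

# The filter from above: matches any name that does NOT contain "YourTest".
filter_pattern = re.compile(r"(?!.*YourTest)")

# Hypothetical fully qualified test class names.
test_classes = [
    "com.example.YourTest",
    "com.example.OtherTest",
    "com.example.util.HelperTest",
]

# A class whose name matches a filter line is filtered out (not run).
filtered_out = [name for name in test_classes if filter_pattern.match(name)]
still_run = [name for name in test_classes if not filter_pattern.match(name)]

print(filtered_out)  # ['com.example.OtherTest', 'com.example.util.HelperTest']
print(still_run)     # ['com.example.YourTest']
```

Since the filter file lists what to exclude, excluding everything except one name is what turns it into a whitelist.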

Easiest way to specify alternate transmogrifier _path?

I'm doing a content migration with collective.transmogrifier and I'm reading files off the file system with transmogrify.filesystem. Instead of importing the files "as is", I'd like to import them to a sub directory in Plone. What is the easiest way to modify the _path?
For example, if the following exists:
/var/www/html/bar/index.html
I'd like to import to:
/Plone/foo/bar/index.html
In other words, import the contents of /var/www/html into a subdirectory "foo". I see two options:
Use some blueprint in collective.transmogrifier to mangle _path.
Write some blueprint to mangle _path.
Am I missing anything easier?
Use the standard inserter blueprint to generate the paths; it accepts python expressions and can replace keys in-place:
[manglepath]
blueprint = collective.transmogrifier.sections.inserter
key = string:_path
value = python:item['_path'].replace('/var/www/html', '/Plone/foo')
This takes the output of the value Python expression (which uses the item's _path) and stores it back under the same key.
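
What the section does to a single item can be sketched in plain Python. This is only an illustration of the expression above, not collective.transmogrifier's code; the mangle_path helper is hypothetical, and it assumes the item's _path holds the full filesystem path, as the value expression implies.

```python
def mangle_path(item, old_prefix="/var/www/html", new_prefix="/Plone/foo"):
    """Rewrite the item's _path in place, as the inserter section would."""
    item["_path"] = item["_path"].replace(old_prefix, new_prefix)
    return item


# A plain dict standing in for one pipeline item.
item = {"_path": "/var/www/html/bar/index.html"}
print(mangle_path(item)["_path"])  # /Plone/foo/bar/index.html
```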
