Hydra: how to use variable interpolation in packaged configs - fb-hydra

I have some config file, model/foo.yaml:
# @package _global_
# foo.yaml
MODEL:
  BACKBONE:
    OUT_FEATURES: [c4, c5]
  HEAD:
    IN_FEATURES: ${MODEL.BACKBONE.OUT_FEATURES}
There are no issues with variable interpolation when I point to this config in the defaults-list of another config, eg buzz.yaml, except when I also override the package like so:
# buzz.yaml
defaults:
  - model@foo_head: foo
Attempting to compose buzz.yaml, you will get an error like:
omegaconf.errors.InterpolationKeyError: Interpolation key 'MODEL.BACKBONE.OUT_FEATURES' not found
Can variable interpolation not be used in configs when packaging?

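Yes, it can be used. Interpolation keys are absolute by default, so once the package is overridden the key MODEL.BACKBONE.OUT_FEATURES no longer exists at the config root. OmegaConf also supports relative interpolation: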
MODEL:
  BACKBONE:
    OUT_FEATURES: [c4, c5]
  HEAD:
    IN_FEATURES: ${..BACKBONE.OUT_FEATURES}
I strongly recommend that you read the docs of OmegaConf.
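To see why the relative form survives a package override, here is a toy model of how a dotted interpolation key maps to a location in the composed config. It is a plain-Python sketch, not OmegaConf's implementation: `resolve_key` and `lookup` are hypothetical helpers, and the `composed` dict is an assumed picture of the config after `model@foo_head: foo` is applied.

```python
def resolve_key(node_path: str, key: str) -> str:
    """Return the absolute dotted path that `key` points to when written at `node_path`."""
    dots = len(key) - len(key.lstrip("."))
    if dots == 0:
        return key  # no leading dot: looked up from the config root
    parts = node_path.split(".")
    if dots > len(parts):
        raise KeyError(f"{key!r} climbs above the config root")
    # One leading dot means "sibling of this node"; each extra dot climbs one more level.
    return ".".join(parts[: len(parts) - dots] + key[dots:].split("."))


def lookup(cfg: dict, dotted: str):
    """Walk a nested dict along a dotted path."""
    node = cfg
    for part in dotted.split("."):
        node = node[part]
    return node


# Assumed shape of the composed config: everything from foo.yaml sits under foo_head.
composed = {
    "foo_head": {
        "MODEL": {
            "BACKBONE": {"OUT_FEATURES": ["c4", "c5"]},
            "HEAD": {"IN_FEATURES": None},
        }
    }
}
node = "foo_head.MODEL.HEAD.IN_FEATURES"

relative = resolve_key(node, "..BACKBONE.OUT_FEATURES")
print(relative)                    # foo_head.MODEL.BACKBONE.OUT_FEATURES
print(lookup(composed, relative))  # ['c4', 'c5']

absolute = resolve_key(node, "MODEL.BACKBONE.OUT_FEATURES")
try:
    lookup(composed, absolute)
except KeyError:
    print("absolute key not found at the root")
```

The relative key follows the node wherever the package places it, while the absolute key keeps pointing at the root, where MODEL no longer lives.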

Related

How to use a config group multiple times, while overriding each instance

Here is my current config structure
hydra/
  pipeline/
    common/
      feature.yaml
  stage/
    train.yaml
with the following files:
train.yaml
# @package _global_
defaults:
  - _self_
  - ../pipeline/common@train: feature
  - ../pipeline/common@val: feature
train:
  conf:
    split: train
val:
  conf:
    split: val
pipeline:
  - ${oc.dict.values: train.steps}
  - ${oc.dict.values: val.steps}
feature.yaml
conf:
  split: train
steps:
  tabular:
    name: "${conf.split}-tabular"
    class: FeatureGeneration
    dataset:
      datasources: [ "${conf.split}_split" ]
What I've accomplished:
I've been able to figure out how to use the config group multiple times utilizing the defaults in train.yaml.
What I'm stuck on:
I'm getting an error: InterpolationKeyError 'conf.split' not found
I do realize that imports are absolute. If I put #package common.feature at the beginning of feature.yaml I can import conf.split via common.feature.conf.split, but is there not a cleaner way? I tried relative imports but got the same error.
I can't seem to override conf.split from train.yaml. You can see where I set train.conf.split and val.conf.split but these do not get propagated. What I need to be able to do is have each instance of the config group utilize a different conf.split value. This is the biggest issue I'm facing.
What I've referenced so far:
The following resources have gotten me to where I am so far, but I am still having trouble with what's listed above.
Hydra : how to assign config files from same group to two different fields
https://hydra.cc/docs/advanced/overriding_packages/
https://hydra.cc/docs/patterns/extending_configs/
Interpolation is not an import; it is evaluated when you access the config node. At that point your config is already composed, so it should be straightforward to use either absolute interpolation (the default) or relative interpolation, based on the structure of your final config.
Hard to be 100% sure, but I suspect this problem is because your defaults list has _self_ at the beginning. This means that the content of the config containing the defaults list is overridden by what comes after it in the defaults list.
Try to move _self_ to the end:
# @package _global_
defaults:
  - ../pipeline/common@train: feature
  - ../pipeline/common@val: feature
  - _self_
#...
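The effect of the position of _self_ can be sketched with plain dictionaries. This is a toy model of "later entries in the defaults list win", not Hydra's composition code: `deep_merge` and `compose` are hypothetical helpers, and the dicts are trimmed-down stand-ins for train.yaml and feature.yaml.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def compose(entries: list) -> dict:
    """Merge entries in defaults-list order; later entries take priority."""
    result = {}
    for entry in entries:
        result = deep_merge(result, entry)
    return result


# Stand-in for feature.yaml, whose own default is split: train.
feature = {"conf": {"split": "train"}, "steps": {"tabular": {"class": "FeatureGeneration"}}}
# The same option placed under two packages, as with @train and @val.
train_pkg = {"train": feature}
val_pkg = {"val": feature}
# Stand-in for the body of train.yaml (the _self_ entry).
self_cfg = {"train": {"conf": {"split": "train"}}, "val": {"conf": {"split": "val"}}}

self_first = compose([self_cfg, train_pkg, val_pkg])
self_last = compose([train_pkg, val_pkg, self_cfg])

print(self_first["val"]["conf"]["split"])  # train (feature.yaml's default overwrote it)
print(self_last["val"]["conf"]["split"])   # val
```

With _self_ first, the val instance falls back to the default from feature.yaml; with _self_ last, the values set in train.yaml survive.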

hydra composition for ML models

I have some configurations for ML components as follows:
ml/encoder.yaml
hidden_layers_sizes: [2000, 1000, 300]
z_dim: 50
ml/decoder.yaml
hidden_layers_sizes: [300, 1000, 2000]
z_dim: 50
Now I have another configuration file as models/vae.yaml which I want to define as having these encoder and decoder configurations.
So the whole thing is structured as:
- conf
  - ml
    - encoder.yaml
    - decoder.yaml
  - models
    - vae.yaml
How should I define vae.yaml so that the configuration of the encoders and decoders can be passed down to the underlying object (and be overridden if possible by the command line)?
I tried something like:
# @package _global_
defaults:
  - override /ml/encoder: encoder
  - override /ml/decoder: decoder
However, this results in Could not override 'ml/encoder'. No match in the defaults list.
I managed to get it working as:
defaults:
  - encoder: vae_encoder
  - decoder: vae_decoder
I changed the config to look as:
- conf
  - models
    - encoder
      - encoder.yaml
    - decoder
      - decoder.yaml
    - vae.yaml

Set list of config nodes as value entries in yaml contrasting with structured configs in Hydra

I would like to get a list of configs as a (default) value entry
and use a structured schema to validate the input list.
E.g., in trainer.yaml:
defaults:
  - callbacks:
    - checkpointer
    - early_stopping
In callbacks/checkpointer.yaml and callbacks/early_stopping.yaml I have a link to appropriate structured configs as default values, e.g.:
# callbacks/checkpointer.yaml
defaults:
  - /trainer_lib/callbacks/base_checkpointer@_here_
The structured schema:
from dataclasses import dataclass
from typing import Any, List

from omegaconf import MISSING

@dataclass
class CheckpointerConfig:
    _target_: str = "some_library_class"
    data_dir: str = "folder"

@dataclass
class TrainerConfig:
    callbacks: List[Any] = MISSING
and config store:
from hydra.core.config_store import ConfigStore

cs = ConfigStore.instance()
cs.store(group="trainer_lib/callbacks", name="base_checkpointer", node=CheckpointerConfig)
I am not sure what is the correct syntax (what I tried fails) to accomplish this. I get an omegaconf.errors.ConfigTypeError: Cannot merge DictConfig with ListConfig.
Is there a way to accomplish this? Thanks.
Discussion on this topic in this Hydra issue.
Are you on Hydra 1.0? This is actually supported in Hydra 1.1. Here is the documentation: https://hydra.cc/docs/next/patterns/select_multiple_configs_from_config_group

How to do file over-rides in hydra?

I have a main config file, let's say config.yaml:
num_layers: 4
embedding_size: 512
learning_rate: 0.2
max_steps: 200000
I'd like to be able to override this, on the command-line, with another file, like say big_model.yaml, which I'd use conceptually like:
python my_script.py --override big_model.yaml
and big_model.yaml might look like:
num_layers: 8
embedding_size: 1024
I'd like to be able to override with an arbitrary number of such files, each one taking priority over the last. Let's say I also have fast_learn.yaml
learning_rate: 2.0
And so I'd then want to conceptually do something like:
python my_script.py --override big_model.yaml --override fast_learn.yaml
What is the easiest/most standard way to do this in hydra? (or potentially in omegaconf perhaps?)
(note that I'd like these override files to ideally just be standard yaml files, that override the earlier yaml files, ideally; though if I have to write using override DSL instead, I can do that, if that's the easiest/best/most standard way)
It sounds like a package override might be a good solution for you.
The documentation can be found here: https://hydra.cc/docs/next/advanced/overriding_packages
An example application can be found here:
https://github.com/facebookresearch/hydra/tree/master/examples/advanced/package_overrides
Using the example application, you can achieve the override by doing something like:
$ python simple.py db=postgresql db.pass=helloworld
db:
  driver: postgresql
  user: postgre_user
  pass: helloworld
  timeout: 10
Refer to the basic tutorial and read about config groups.
You can create arbitrary config groups and select one option from each (as of Hydra 1.0, config group options are mutually exclusive). You will need two config groups here:
one can be model, with normal, small and big options, and another can be trainer, with maybe normal and fast options.
Config groups can also override things in other config groups.
You can also always append to the defaults list from the command line - so you can also add additional config groups that are only used in the command line.
An example of that can be an 'experiment' config group. You can use it as:
$ python train.py +experiment=exp1
In such config groups that are overriding things across the entire config you should use the global package (read more about packages in the docs).
# @package _global_
num_layers: 8
embedding_size: 1024
learning_rate: 2.0
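The idea of two config groups with one selected option each can be modelled in a few lines of plain Python. This is only a sketch of the selection-and-merge behaviour described above, not how Hydra is implemented: `compose` is a hypothetical helper, and the group contents (model/big, trainer/fast) are assumed to hold the values from big_model.yaml and fast_learn.yaml in the question.

```python
def compose(base: dict, groups: dict, cli_args: list) -> dict:
    """Merge one selected option per config group into `base`, in command-line order."""
    cfg = dict(base)
    for arg in cli_args:
        # "+model=big" -> group "model", option "big" (the "+" appends to the defaults list).
        group, _, option = arg.lstrip("+").partition("=")
        # The options use the global package, so their keys land at the config root.
        cfg.update(groups[group][option])
    return cfg


# Stand-in for config.yaml from the question.
base = {"num_layers": 4, "embedding_size": 512, "learning_rate": 0.2, "max_steps": 200000}

# Assumed config groups: each option is what the matching YAML file would contain.
groups = {
    "model": {"big": {"num_layers": 8, "embedding_size": 1024}},
    "trainer": {"fast": {"learning_rate": 2.0}},
}

cfg = compose(base, groups, ["+model=big", "+trainer=fast"])
print(cfg)
# {'num_layers': 8, 'embedding_size': 1024, 'learning_rate': 2.0, 'max_steps': 200000}
```

Keys that no selected option touches, such as max_steps, keep their value from the main config.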

What *is* a salt formula, really?

I am trying to work through the Salt Formulas documentation and seem to be having a fundamental misunderstanding of what a salt formula really is.
Understandably, this question may seem like a duplicate of these questions, but due to my failing to grasp the basic concepts I'm also struggling to make use of the answers to these questions.
I thought, that a salt formula is basically just a package that implements extra functions, a lot like
#include <string.h>
in C, or
import numpy as np
in Python. Thus, I thought, I could download the salt-formula-linux to /srv/formulas/salt-formula-linux/, add that to file_roots, restart the master (all as per the docs), and then write a file like swapoff.sls containing
disable_swap:
  linux:
    storage:
      swap:
        file:
          enabled: False
(the above is somewhat similar to the examples in the repo's root) in hope that the formula would then handle removing the swap entry from /etc/fstab and running swapoff -a for me. Needless to say, this didn't work, clearly because I'm not understanding what a salt formula is meant to be.
So, what is a salt formula and how do I use it? Can I make use of it as a library of functions too?
This answer might not be fully correct in all technicalities, but this is what solved my problem.
A salt formula is not a library of functions. It is, rather, a collection of state files. While often a state file can be very simple, such as some of my user-defined ones,
--> top.sls <--
base:
  '*':
    - docker
--> docker.sls <--
install_docker_1703:
  pkgrepo.managed:
    # stuff
  pkg.installed:
    - name: docker-ce
creating a state file like
--> swapoff.sls <--
disable_swap:
  linux.storage.swap: # and so on
is, perhaps, not the way to go. At least, not for a beginner who lacks the background knowledge.
Instead, add an item to top.sls:
- linux.storage.swap
This is not enough, however. Most formulas (or the state files within them, if you will) are highly parametrizable, i.e. they're full of placeholders with variable names, such as {{ swap.device }}. If there's nothing to fill this gap, the state file will not be able to do anything. These gaps are filled from pillars.
All that remains, is to create a file like swap.sls in /srv/pillar/ that would contain something like (as per the examples of that formula)
linux:
  storage:
    enabled: true
    swap:
      file:
        enabled: true
        engine: file
        device: /swapfile
        size: 1024
and also /srv/pillar/top.sls with
base:
  '*':
    - swap
Perhaps /srv/pillar should also be included in pillar_roots in /etc/salt/master.
So now /srv/salt/top.sls runs /srv/formulas/salt-formula-linux/linux/storage/swap.sls, which, guided by /srv/pillar/top.sls, pulls some parameters from /srv/pillar/swap.sls and enables a swapfile.
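The way pillar data fills the placeholders of a state file can be pictured with a toy substitution routine. Salt itself renders state files with Jinja; the regex-based `render` below is only a stand-in for that step, and `state_template` is a made-up one-line template, not a line from the formula.

```python
import re


def lookup(data: dict, dotted: str):
    """Walk a nested dict along a dotted path such as 'swap.device'."""
    node = data
    for part in dotted.split("."):
        node = node[part]
    return node


def render(template: str, context: dict) -> str:
    """Replace each {{ dotted.name }} placeholder with the matching context value."""
    pattern = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")
    return pattern.sub(lambda match: str(lookup(context, match.group(1))), template)


# Pillar data as in /srv/pillar/swap.sls above.
pillar = {
    "linux": {
        "storage": {
            "enabled": True,
            "swap": {
                "file": {"enabled": True, "engine": "file", "device": "/swapfile", "size": 1024}
            },
        }
    }
}

# Hypothetical template line; the real formula's state files look different.
state_template = "device={{ swap.device }} size={{ swap.size }}"

swap = pillar["linux"]["storage"]["swap"]["file"]
print(render(state_template, {"swap": swap}))  # device=/swapfile size=1024
```

Without the pillar entry, the lookup fails and there is nothing to put in place of the placeholder, which is why the state file alone does nothing useful.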
