MLOps study - Raviraja Week 2: Hydra
13 Oct 2022
Week 2 is about Hydra. Where PyTorch Lightning handled building the model and dataset, and WandB logged the results, Hydra handles configuration management. Configuration has to be recorded precisely for experiments to be reproducible. Until now I have always tracked configuration with argparse, so let's take a look at Hydra.
Start Hydra
When you develop in a .py file, the hydra.main decorator is the most convenient way to load the configuration.
import hydra
from omegaconf import OmegaConf

@hydra.main(config_path="configs", config_name="config.yaml", version_base="1.2")
def main(cfg):
    # Print the composed configuration with interpolations resolved.
    print(OmegaConf.to_yaml(cfg, resolve=True))

if __name__ == "__main__":
    main()
With the decorator in place, you can override configuration values from the command line at runtime, so it can take over the role of argparse.
python main.py preferences.trait=i_like_stars
If you apply the Hydra decorator to the code we worked on in Weeks 0 and 1, the hard-coded settings can simply be replaced with fields of the configuration, as follows.
# example (DataModule, ColaModel, and SamplesVisualisationLogger come from the Week 0 and 1 code)
@hydra.main(config_path="./configs", config_name="config")
def main(cfg):
    # print(OmegaConf.to_yaml(cfg))
    cola_data = DataModule(
        cfg.model.tokenizer, cfg.processing.batch_size, cfg.processing.max_length
    )
    cola_model = ColaModel(cfg.model.name)

    checkpoint_callback = ModelCheckpoint(
        dirpath="./models",
        filename="best-checkpoint.ckpt",
        monitor="valid/loss",
        mode="min",
    )
    wandb_logger = WandbLogger(project="MLOps Basics", entity="raviraja")

    trainer = pl.Trainer(
        max_epochs=cfg.training.max_epochs,
        logger=wandb_logger,
        callbacks=[checkpoint_callback, SamplesVisualisationLogger(cola_data)],
        log_every_n_steps=cfg.training.log_every_n_steps,
        deterministic=cfg.training.deterministic,
        limit_train_batches=cfg.training.limit_train_batches,
        limit_val_batches=cfg.training.limit_val_batches,
    )
    trainer.fit(cola_model, cola_data)
However, the decorator approach doesn't work in a Jupyter notebook. There, you can build the configuration with the compose approach instead.
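import hydra
from omegaconf import OmegaConf

# Clear any previous Hydra state so the cell can be re-run.
hydra.core.global_hydra.GlobalHydra.instance().clear()
hydra.initialize(config_path="./configs", version_base=None)
cfg = hydra.compose(config_name="config.yaml")
print(OmegaConf.to_yaml(cfg))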
Multiple configuration files
Sometimes it's convenient to split the configuration into several files. But it would be a pain to load each file separately every time, right? Hydra's defaults list lets you load every config just by loading a single yaml file.
├── configs
│   ├── config.yaml
│   ├── model
│   │   └── default.yaml
│   └── data
│       └── default.yaml
With the configuration folder laid out as above, edit config.yaml as follows.
defaults:
  - model: default
  - data: default
Now loading config.yaml alone pulls in model/default.yaml and data/default.yaml as well, each under its group's key (cfg.model and cfg.data).
Building on this, we can keep several options per group in a folder structure like the following and switch between them just by editing config.yaml!! (amazing, amazing)
├── configs
│   ├── config.yaml
│   ├── model
│   │   ├── default.yaml
│   │   ├── bert.yaml
│   │   └── transformers.yaml
│   └── database
│       ├── default.yaml
│       ├── mongoDB.yaml
│       └── AmazonDB.yaml
# in config.yaml
defaults:
  - model: default
  - database: mongoDB
What about when one value depends on another? It's very simple: reference the other value with ${...} interpolation.
max_epochs: 1
log_every_n_steps: 10
deterministic: true
limit_train_batches: 0.25
limit_val_batches: ${training.limit_train_batches}
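Here limit_val_batches is bound to training.limit_train_batches, so the two always hold the same value.
Does the printed config still show the raw ${...} expression? That's because OmegaConf.to_yaml leaves interpolations unresolved by default. Just add resolve=True and the resolved values are printed!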
OmegaConf.to_yaml(cfg, resolve=True)
Running multiple jobs
python train.py -m training.max_epochs=1,2 processing.batch_size=32,64,128
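If you run it like this, a single command launches all 6 combinations (2 values of max_epochs × 3 values of batch_size)!!! By default the jobs run one after another. And what if you want to run them concurrently? If you install the Joblib launcher plugin, parallel execution is possible.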
pip install hydra-joblib-launcher --upgrade
python train.py -m training.max_epochs=1,2 processing.batch_size=32,64,128 hydra/launcher=joblib
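The count of 6 comes from the Cartesian product of the swept values. The sketch below enumerates those combinations with the standard library only. It does not call Hydra, and the order in which Hydra actually schedules the jobs is not something this sketch asserts.

```python
from itertools import product

# The comma-separated values from the multirun command line.
sweep = {
    "training.max_epochs": [1, 2],
    "processing.batch_size": [32, 64, 128],
}

keys = list(sweep)
# One job per combination of values: 2 x 3 = 6 jobs.
jobs = [dict(zip(keys, values)) for values in product(*sweep.values())]

for index, job in enumerate(jobs):
    print(index, " ".join(f"{key}={value}" for key, value in job.items()))
```

Adding a third swept parameter multiplies the job count again, so it's worth checking the total before launching a large sweep.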