MLOps study - Raviraja Week 2: Hydra

13 Oct 2022

Week 2 is about Hydra. Where PyTorch Lightning handled building the model and dataset and WandB logged the runs, Hydra handles configuration management. Configuration needs to be recorded precisely for reproducibility. Until now I have always handled configuration with argparse, so let’s take a look at Hydra.

Start hydra
Multiple configuration file
Running multiple jobs

Start hydra

When you work in a .py script, the most convenient way to load the configuration is the @hydra.main decorator.

import hydra
from omegaconf import OmegaConf


# Loads configs/config.yaml and passes it to main() as cfg.
@hydra.main(config_path="configs", config_name="config.yaml", version_base="1.2")
def main(cfg):
    print(OmegaConf.to_yaml(cfg, resolve=True))


if __name__ == "__main__":
    main()

When you use the decorator, you can override configuration values at runtime as follows (so it can take over the role of argparse).

python main.py preferences.trait=i_like_stars

If you use the hydra decorator on the code we worked on in Weeks 0 and 1, you can simply turn the configuration into variables as follows.

# example 
@hydra.main(config_path="./configs", config_name="config")
def main(cfg):
    # print(OmegaConf.to_yaml(cfg))
    cola_data = DataModule(
        cfg.model.tokenizer, cfg.processing.batch_size, cfg.processing.max_length
    )
    cola_model = ColaModel(cfg.model.name)

    checkpoint_callback = ModelCheckpoint(
        dirpath="./models",
        filename="best-checkpoint",  # Lightning appends ".ckpt" itself
        monitor="valid/loss",
        mode="min",
    )

    wandb_logger = WandbLogger(project="MLOps Basics", entity="raviraja")
    trainer = pl.Trainer(
        max_epochs=cfg.training.max_epochs,
        logger=wandb_logger,
        callbacks=[checkpoint_callback, SamplesVisualisationLogger(cola_data)],
        log_every_n_steps=cfg.training.log_every_n_steps,
        deterministic=cfg.training.deterministic,
        limit_train_batches=cfg.training.limit_train_batches,
        limit_val_batches=cfg.training.limit_val_batches,
    )
    trainer.fit(cola_model, cola_data)

However, the decorator approach doesn’t work in a Jupyter notebook, so there you can build the configuration with the Compose API instead.

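import hydra
from hydra.core.global_hydra import GlobalHydra
from omegaconf import OmegaConf

GlobalHydra.instance().clear()  # reset earlier state so the cell can be re-run
hydra.initialize("./configs", version_base=None)
cfg = hydra.compose(config_name="config.yaml")
print(OmegaConf.to_yaml(cfg))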

Multiple configuration file

There are times when it’s convenient to manage the configuration by splitting it into several files. But it would be inconvenient if you had to load each configuration file separately every time, right? Hydra has a defaults list feature that lets you load all the configs just by loading a single yaml file.

├── configs
│   ├── config.yaml
│   ├── model
│   │   └── default.yaml
│   └── data
│       └── default.yaml

When the configuration folder exists in a structure like the above, let’s modify config.yaml as follows.

defaults:
  - model: default
  - data: default

Surprisingly, just loading config.yaml now loads model/default.yaml and data/default.yaml as well, merged under the model and data keys.

If we apply this, we can create a folder structure like the following and just modify config.yaml as needed!! (amazing, amazing)

├── configs
│   ├── config.yaml
│   ├── model
│   │   ├── default.yaml
│   │   ├── bert.yaml
│   │   └── transformers.yaml
│   └── database
│       ├── default.yaml
│       ├── mongoDB.yaml
│       └── AmazonDB.yaml

# in config.yaml
defaults:
  - model: default
  - database: mongoDB

What about when there’s a variable dependency? It’s very simple.

max_epochs: 1
log_every_n_steps: 10
deterministic: true
limit_train_batches: 0.25
limit_val_batches: ${training.limit_train_batches}

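You just reference the other value with the ${...} interpolation syntax, as in the last line above. Is the interpolation showing up unresolved when you print the config with OmegaConf? By default to_yaml prints the raw ${...} expression; just add resolve=True and it prints the resolved value instead.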

OmegaConf.to_yaml(cfg, resolve=True)

Running multiple jobs

python train.py -m training.max_epochs=1,2 processing.batch_size=32,64,128

If you run it like this, all 2 × 3 = 6 combinations are run from a single command!!! By default Hydra runs them one after another. And what if you want to run them concurrently? If you use the Joblib launcher plugin, parallel execution is possible.

pip install hydra-joblib-launcher --upgrade
python train.py -m training.max_epochs=1,2 processing.batch_size=32,64,128 hydra/launcher=joblib
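
The six jobs come from taking every combination of the two comma-separated lists. The sketch below reproduces that expansion with itertools.product; it only illustrates how the combinations are counted and is not Hydra's own sweeper code.

```python
from itertools import product

# The sweep from the command above: each key lists the values to try.
sweep = {
    "training.max_epochs": [1, 2],
    "processing.batch_size": [32, 64, 128],
}

# One job per combination, written as the overrides that job receives.
jobs = [
    [f"{key}={value}" for key, value in zip(sweep, combo)]
    for combo in product(*sweep.values())
]

for i, overrides in enumerate(jobs):
    print(i, " ".join(overrides))
```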


Jae-Kyung Cho Being unique is better than being perfect
