The hotcb CLI provides a single entrypoint for controlling all hotcb modules. It writes commands to `hotcb.commands.jsonl` (or directly to state files for freeze management).

All commands accept `--dir <run_dir>` to specify the run directory (defaults to `.`).
Show the current state of a run: freeze mode and the last applied entries per handle.
```bash
hotcb --dir runs/exp-001 status
```

Output shows freeze mode, recipe/adjust paths if set, and a summary of the latest applied ledger entry for each `module:handle` combination.
Bootstrap a run directory with all required files.
```bash
hotcb --dir runs/exp-001 init
```

Creates:

- `hotcb.yaml` (with `version: 1` if not present)
- `hotcb.commands.jsonl`
- `hotcb.applied.jsonl`
- `hotcb.recipe.jsonl`
- `hotcb.freeze.json`
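As a quick sanity check, the presence of these files can be verified from Python. This is a sketch; only the file names above are taken from the documentation:

```python
from pathlib import Path

# File names as listed for `hotcb init`.
REQUIRED_FILES = [
    "hotcb.yaml",
    "hotcb.commands.jsonl",
    "hotcb.applied.jsonl",
    "hotcb.recipe.jsonl",
    "hotcb.freeze.json",
]

def check_run_dir(run_dir):
    """Return the required files missing from a run directory."""
    root = Path(run_dir)
    return [name for name in REQUIRED_FILES if not (root / name).exists()]
```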
Enable a callback by ID.
```bash
hotcb --dir runs/exp-001 cb enable timing
```

Disable a callback by ID.

```bash
hotcb --dir runs/exp-001 cb disable feat_viz
```

Load a callback dynamically from a Python file or module.
```bash
# From a Python file
hotcb --dir runs/exp-001 cb load feat_viz \
  --file /tmp/feat_viz.py \
  --symbol FeatureVizCallback \
  --enabled \
  --init every=50 layer=conv3

# From an importable module
hotcb --dir runs/exp-001 cb load timing \
  --path hotcb.modules.cb.callbacks.timing \
  --symbol TimingCallback \
  --enabled
```

Options:

- `--file <path>` -- Python file path (sets `target.kind=python_file`)
- `--path <module>` -- importable module path (sets `target.kind=module`)
- `--symbol <name>` -- class name inside the file/module (required)
- `--enabled` / `--no-enabled` -- initial enabled state
- `--init k=v ...` -- constructor keyword arguments
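The `--file` vs `--path` distinction maps naturally onto Python's two import mechanisms. A minimal sketch of how such a loader could resolve the symbol (this mirrors the `target.kind` semantics above; the actual hotcb implementation may differ):

```python
import importlib
import importlib.util

def load_symbol(kind, target, symbol):
    """Resolve a class from either a file path or an importable module.

    kind: "python_file" (as set by --file) or "module" (as set by --path).
    """
    if kind == "python_file":
        # Load directly from a file path, without requiring it on sys.path.
        spec = importlib.util.spec_from_file_location("hotcb_dynamic", target)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    else:
        # Standard import of a dotted module path.
        module = importlib.import_module(target)
    return getattr(module, symbol)
```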
Update callback parameters at runtime.
```bash
hotcb --dir runs/exp-001 cb set_params timing every=100
```

Unload a callback.

```bash
hotcb --dir runs/exp-001 cb unload feat_viz
```

Enable the optimizer controller handle (default ID: `main`).

```bash
hotcb --dir runs/exp-001 opt enable
```

Disable the optimizer controller handle.

```bash
hotcb --dir runs/exp-001 opt disable
```

Update optimizer parameters.
```bash
# Set global learning rate and weight decay
hotcb --dir runs/exp-001 opt set_params lr=3e-5 weight_decay=0.01

# Set clip norm
hotcb --dir runs/exp-001 opt set_params clip_norm=1.0

# Scheduler scale (multiplicative)
hotcb --dir runs/exp-001 opt set_params scheduler_scale=0.5

# Scheduler one-shot drop
hotcb --dir runs/exp-001 opt set_params scheduler_drop=0.1
```

Enable the loss controller handle (default ID: `main`).
```bash
hotcb --dir runs/exp-001 loss enable
```

Disable the loss controller handle.

```bash
hotcb --dir runs/exp-001 loss disable
```

Update loss parameters.
```bash
# Set loss weights (suffix _w maps to the weights dict)
hotcb --dir runs/exp-001 loss set_params distill_w=0.2 depth_w=1.5

# Toggle loss terms
hotcb --dir runs/exp-001 loss set_params terms.aux_depth=false terms.aux_heatmap=true

# Set ramp config (JSON value)
hotcb --dir runs/exp-001 loss set_params \
  ramps.depth='{"type":"linear","warmup_frac":0.2,"end":2.0}'
```

Write freeze state directly to `hotcb.freeze.json`. The running kernel picks it up on the next poll.
```bash
# Production lock -- block all external mutations
hotcb --dir runs/exp-001 freeze --mode prod

# Replay mode -- replay a saved recipe
hotcb --dir runs/exp-001 freeze --mode replay \
  --recipe hotcb.recipe.jsonl \
  --policy best_effort

# Replay with adjustments
hotcb --dir runs/exp-001 freeze --mode replay_adjusted \
  --recipe hotcb.recipe.jsonl \
  --adjust hotcb.adjust.yaml \
  --policy strict \
  --step-offset 100

# Unfreeze
hotcb --dir runs/exp-001 freeze --mode off
```

Options:

- `--mode {off,prod,replay,replay_adjusted}` -- freeze mode (required)
- `--recipe <path>` -- recipe file for replay modes
- `--adjust <path>` -- adjustment overlay for `replay_adjusted`
- `--policy {best_effort,strict}` -- replay policy (default: `best_effort`)
- `--step-offset <int>` -- global step offset for replay (default: `0`)
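Since the CLI writes this state directly to `hotcb.freeze.json`, the file can in principle also be produced by hand. The field names below are assumptions inferred from the CLI flags, not a documented schema:

```python
import json

def freeze_payload(mode, recipe=None, adjust=None,
                   policy="best_effort", step_offset=0):
    """Build a hypothetical freeze-state document from the CLI flags.

    NOTE: field names here are guesses mirroring --mode/--recipe/--adjust/
    --policy/--step-offset; the real hotcb.freeze.json schema may differ.
    """
    state = {"mode": mode, "policy": policy, "step_offset": step_offset}
    if recipe:
        state["recipe"] = recipe
    if adjust:
        state["adjust"] = adjust
    return json.dumps(state)

# e.g. write freeze_payload("replay", recipe="hotcb.recipe.jsonl")
# to <run_dir>/hotcb.freeze.json for the kernel to pick up on its next poll
```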
Export a recipe from the applied ledger. Includes only entries with `decision == "applied"` from modules `cb`, `opt`, `loss`.
```bash
hotcb --dir runs/exp-001 recipe export
hotcb --dir runs/exp-001 recipe export --out /tmp/my_recipe.jsonl
```

Validate a recipe file for schema correctness.

```bash
hotcb --dir runs/exp-001 recipe validate --recipe hotcb.recipe.jsonl
```

Checks each line for valid JSON, required fields (`at`, `module`, `op`), valid module names, and `at.step` presence.
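The checks above can be sketched in a few lines of Python. Only the requirements stated here (valid JSON, `at`/`module`/`op` fields, known module names, `at.step`) are taken from the documentation; the real validator may check more:

```python
import json

REQUIRED = ("at", "module", "op")
VALID_MODULES = {"cb", "opt", "loss"}

def validate_recipe_line(line):
    """Validate one JSONL recipe line per the documented checks."""
    entry = json.loads(line)  # raises on invalid JSON
    for field in REQUIRED:
        if field not in entry:
            raise ValueError(f"missing field: {field}")
    if entry["module"] not in VALID_MODULES:
        raise ValueError(f"unknown module: {entry['module']}")
    at = entry["at"]
    if not isinstance(at, dict) or "step" not in at:
        raise ValueError("missing at.step")
    return entry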
Generate a YAML adjustment overlay template from a recipe file. Produces one stub patch entry per unique (module, op, id) combination found in the recipe.
```bash
hotcb --dir runs/exp-001 recipe patch-template \
  --recipe hotcb.recipe.jsonl \
  --output hotcb.adjust.yaml
```

Options:

- `--recipe <path>` -- recipe file to read (default: `hotcb.recipe.jsonl`)
- `--output <path>` -- output path for the generated template (default: `hotcb.adjust.yaml`)
The generated file contains commented-out patch fields for each unique operation, ready to fill in:
```yaml
# Generated from hotcb.recipe.jsonl
patches:
  - match:
      module: opt
      op: set_params
      id: main
    # replace_params: {}
    # shift_step: 0
    # drop: false
```

Enable online tuning. Modes: `active` (default), `observe`, `suggest`.
```bash
hotcb --dir runs/exp-001 tune enable
hotcb --dir runs/exp-001 tune enable --mode observe
```

Disable tuning.

```bash
hotcb --dir runs/exp-001 tune disable
```

Show tune summary: recipe presence, mutation count, accept rate.

```bash
hotcb --dir runs/exp-001 tune status
```

Override tune recipe parameters using dotted paths.

```bash
hotcb --dir runs/exp-001 tune set acceptance.epsilon=0.002
hotcb --dir runs/exp-001 tune set safety.max_global_reject_streak=5
```

Export the tune run summary.

```bash
hotcb --dir runs/exp-001 tune export-recipe --out evolved_summary.json
```

Shortcut commands that auto-route to the right module.
Defaults to `cb enable`. Queues a callback enable command.

```bash
hotcb --dir runs/exp-001 enable timing
```

Defaults to `cb disable`. Queues a callback disable command.

```bash
hotcb --dir runs/exp-001 disable timing
```

Auto-routes to `opt` or `loss` based on key patterns:
- Keys `lr`, `weight_decay`, `clip_norm`, `scheduler_scale`, `scheduler_drop`, `group`, `groups` route to `opt`
- Keys ending in `_w`, or starting with `terms.` or `ramps.`, route to `loss`
- Ambiguous keys produce an error; use explicit subcommands instead
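The routing rules can be sketched as a small dispatch function. The key lists come from the rules above; the actual CLI logic may differ in details:

```python
OPT_KEYS = {
    "lr", "weight_decay", "clip_norm",
    "scheduler_scale", "scheduler_drop", "group", "groups",
}

def route_key(key):
    """Route a bare `set` key to 'opt' or 'loss', erroring on ambiguity."""
    if key in OPT_KEYS:
        return "opt"
    if key.endswith("_w") or key.startswith(("terms.", "ramps.")):
        return "loss"
    raise ValueError(
        f"ambiguous key {key!r}: use 'opt set_params' or 'loss set_params'"
    )
```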
```bash
hotcb --dir runs/exp-001 set lr=5e-5         # -> opt set_params
hotcb --dir runs/exp-001 set distill_w=0.25  # -> loss set_params
```

Start the dashboard server, optionally with autopilot.
```bash
hotcb serve --dir runs/exp1
hotcb serve --dir runs/exp1 --port 9000
hotcb serve --dir runs/exp1 --autopilot ai_suggest --key-metric val_loss
```

Options:

- `--host <addr>` -- bind address (default: `127.0.0.1`)
- `--port <int>` -- port (default: `8421`)
- `--autopilot {off,suggest,auto,ai_suggest,ai_auto}` -- autopilot mode (default: `off`)
- `--key-metric <name>` -- primary metric for AI optimization (default: `val_loss`)
Dashboard URL: http://localhost:8421
Launch synthetic training with the dashboard.
```bash
hotcb demo                          # simple synthetic training
hotcb demo --golden                 # multi-task demo with rich metrics
hotcb demo --autopilot ai_suggest   # with AI autopilot
hotcb demo --autopilot suggest --key-metric val_loss
```

Options:

- `--golden` -- use the multi-task golden demo (classification + reconstruction)
- `--port <int>` -- dashboard port (default: `8421`)
- `--autopilot {off,suggest,auto,ai_suggest,ai_auto}` -- autopilot mode (default: `off`)
- `--key-metric <name>` -- primary metric for AI optimization (default: `val_loss`)
Start training + dashboard + autopilot in one command.
```bash
# Built-in configs
hotcb launch --config simple --max-steps 500
hotcb launch --config multitask --autopilot ai_suggest --key-metric val_loss

# Time-bounded run (5 minutes, regardless of GPU speed)
hotcb launch --config multitask --max-time 300 --autopilot ai_suggest

# Custom training function
hotcb launch --train-fn my_module:train --max-steps 1000

# Full options
hotcb launch --config multitask \
  --autopilot ai_auto \
  --key-metric val_loss \
  --ai-model gpt-4o-mini \
  --ai-budget 5.0 \
  --ai-cadence 50 \
  --max-steps 2000 \
  --seed 42
```

Options:

- `--config {simple,multitask,finetune}` -- built-in training config (default: `multitask`)
- `--train-fn <module:fn>` -- custom training function (overrides `--config`)
- `--run-dir <path>` -- run directory (default: auto-created temp dir)
- `--host <addr>` -- dashboard bind address (default: `127.0.0.1`)
- `--port <int>` -- dashboard port (default: `8421`)
- `--max-steps <int>` -- override max training steps
- `--max-time <float>` -- wall-clock time limit in seconds (stops training when reached, regardless of step count)
- `--step-delay <float>` -- seconds between steps
- `--autopilot {off,suggest,auto,ai_suggest,ai_auto}` -- autopilot mode (default: `off`)
- `--key-metric <name>` -- primary metric for AI optimization (default: `val_loss`)
- `--ai-model <name>` -- LLM model name (default: `gpt-4o-mini`)
- `--ai-budget <float>` -- max USD for LLM calls (default: `5.0`)
- `--ai-cadence <int>` -- steps between AI check-ins (default: `50`)
- `--seed <int>` -- random seed
`--max-steps` and `--max-time` can be combined; whichever limit is hit first wins. This is useful when step duration varies across hardware (e.g. `--max-time 300` for a 5-minute run regardless of GPU speed).
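The "whichever limit is hit first" semantics can be expressed as a small stop check. This is an illustrative sketch of the behavior described, not hotcb's actual implementation:

```python
import time

def should_stop(step, start_time, max_steps=None, max_time=None):
    """True once either limit is reached; whichever is hit first wins."""
    if max_steps is not None and step >= max_steps:
        return True
    if max_time is not None and time.monotonic() - start_time >= max_time:
        return True
    return False
```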
The training function must conform to the hotcb contract:
```python
def train_fn(run_dir, max_steps, step_delay, stop_event):
    # Write metrics to {run_dir}/hotcb.metrics.jsonl
    # Read commands from {run_dir}/hotcb.commands.jsonl
    ...
```

Run synthetic benchmarks or CIFAR-10 autopilot evaluation.
```bash
hotcb bench run    # synthetic benchmarks
hotcb bench eval   # CIFAR-10 autopilot evaluation
```

All `k=v` arguments support automatic type inference:
| Input | Parsed as |
|---|---|
| `lr=3e-5` | float |
| `every=50` | int |
| `enabled=true` | bool |
| `ramps.depth={"type":"linear"}` | dict (JSON) |
| `name=foo` | str |
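A plausible implementation of this inference follows the table's precedence (bool, then int, then float, then JSON, then string). The actual CLI parser may differ; the examples are the ones from the table:

```python
import json

def parse_value(raw):
    """Infer the type of a k=v value per the table above."""
    low = raw.lower()
    if low in ("true", "false"):
        return low == "true"
    for cast in (int, float):  # int first, so "50" stays an int
        try:
            return cast(raw)
        except ValueError:
            pass
    if raw.startswith(("{", "[")):
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            pass
    return raw  # fall back to a plain string
```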