
stability(queen): long-term training stability study (10000+ episodes) #581

Merged
gHashTag merged 1 commit into main from stability/423-long-term-training-study on Apr 30, 2026


Summary

Training stability monitor for long-term Queen training runs (10,000+ episodes).

New file

  • src/b2t/stability_monitor.zig — 223 LOC

Features

  • Per-episode metrics: loss, reward, success, memory, CPU, elapsed
  • Divergence detection: a loss spike above the threshold increments the divergence count
  • Memory budget checking: alert on overrun
  • Log every 100 episodes, checkpoint every 1000
  • Convergence check: loss < 50% of the initial loss, divergences < 1% of episodes
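
The features above can be illustrated with a small sketch. It is written in Python rather than the Zig of `src/b2t/stability_monitor.zig`, and it is not the PR's implementation. The names `MonitorConfig`, `spike_factor`, `record`, `should_log`, `should_checkpoint`, and `converged` are hypothetical. The spike rule (loss greater than a multiple of the previous episode's loss), the reading of the two convergence conditions as both required, and the reading of 4GB as 4 GiB are assumptions, since the PR does not spell them out.

```python
from dataclasses import dataclass, field


@dataclass
class MonitorConfig:
    # Episode count, intervals, and memory budget follow the values named in
    # the PR. spike_factor is an assumption: the PR gives no threshold value.
    total_episodes: int = 10_000
    log_interval: int = 100
    checkpoint_interval: int = 1_000
    memory_budget_bytes: int = 4 * 1024**3  # "4GB" read as 4 GiB (assumed)
    spike_factor: float = 2.0  # hypothetical threshold


@dataclass
class StabilityMonitor:
    config: MonitorConfig = field(default_factory=MonitorConfig)
    losses: list = field(default_factory=list)
    divergences: int = 0
    memory_alerts: int = 0

    def record(self, loss: float, memory_bytes: int) -> None:
        # Assumed spike rule: compare against the previous episode's loss.
        if self.losses and loss > self.losses[-1] * self.config.spike_factor:
            self.divergences += 1
        # Count an alert whenever the episode exceeds the memory budget.
        if memory_bytes > self.config.memory_budget_bytes:
            self.memory_alerts += 1
        self.losses.append(loss)

    def should_log(self) -> bool:
        n = len(self.losses)
        return n > 0 and n % self.config.log_interval == 0

    def should_checkpoint(self) -> bool:
        n = len(self.losses)
        return n > 0 and n % self.config.checkpoint_interval == 0

    def converged(self) -> bool:
        # Both conditions from the feature list are treated as required.
        if not self.losses:
            return False
        loss_ok = self.losses[-1] < 0.5 * self.losses[0]
        divergence_ok = self.divergences < 0.01 * len(self.losses)
        return loss_ok and divergence_ok
```

In a training loop, `record` would be called once per episode, with `should_log` and `should_checkpoint` consulted right after it.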

Report

  • Loss drift, range, mean ± std, success rate, max memory, convergence
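
As a rough illustration of how the report fields could be derived from recorded episodes, here is a Python sketch (not the Zig code in this PR). `StabilityReport` is named in the commit message, but its fields as written here, the `build_report` helper, the definition of drift as last loss minus first loss, and the use of population standard deviation are all assumptions.

```python
import statistics
from dataclasses import dataclass


@dataclass
class StabilityReport:
    loss_drift: float
    loss_min: float
    loss_max: float
    loss_mean: float
    loss_std: float
    success_rate: float
    max_memory_bytes: int


def build_report(losses, successes, memory_bytes):
    """Summarize per-episode records; all three inputs must be non-empty."""
    return StabilityReport(
        # Assumed definition: drift is the last loss minus the first loss.
        loss_drift=losses[-1] - losses[0],
        loss_min=min(losses),
        loss_max=max(losses),
        loss_mean=statistics.fmean(losses),
        # Population standard deviation (assumed; the PR does not say which).
        loss_std=statistics.pstdev(losses),
        # Fraction of episodes flagged as successful.
        success_rate=sum(successes) / len(successes),
        max_memory_bytes=max(memory_bytes),
    )
```

A negative `loss_drift` under this definition means the loss fell over the run.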

Tests (4)

  • Episode recording, divergence detection, memory budget, convergence

Closes #423

- Add src/b2t/stability_monitor.zig
- StabilityMonitor: per-episode metrics recording, divergence detection (loss spike > threshold), memory budget checking, log/checkpoint scheduling
- StabilityReport: full analysis with loss drift, convergence check, success rate, memory tracking
- Configurable: 10K episodes, log every 100, checkpoint every 1000, memory budget 4GB
- 4 tests: episode recording, divergence detection, memory budget, convergence check

Closes #423
gHashTag merged commit 4a2db38 into main on Apr 30, 2026
8 of 16 checks passed
gHashTag deleted the stability/423-long-term-training-study branch on April 30, 2026 at 01:38