This guide is for users who want to install the repository, prepare local data, and run the main paper-facing entrypoints without first reading the full implementation audit.
Recommended baseline:
python -m pip install -U pip
python -m pip install -r requirements.txt --no-build-isolation
python -m pip check

Notes:
- Python 3.11 is the reference version.
- The repository is Linux-first and expects a CUDA-compatible PyTorch stack for the full training path.
- Reproduce scripts export the repo root on PYTHONPATH automatically.
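
The notes above can be checked from Python before starting a long run. This is a minimal sketch and not something the repository ships: the helper name `check_environment` and the keys it returns are hypothetical. It only reports whether the interpreter matches the reference version, whether the platform is Linux, and whether an importable `torch` sees a CUDA device.

```python
import importlib.util
import sys


def check_environment(reference=(3, 11)):
    """Compare the current interpreter with the reference setup.

    Hypothetical helper for illustration; the repository does not provide it.
    """
    current = tuple(sys.version_info[:2])
    report = {
        "python": current,
        "python_matches_reference": current == tuple(reference),
        "linux": sys.platform.startswith("linux"),
        # find_spec avoids importing torch when it is not installed.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken torch install is reported as "no CUDA", not raised.
            report["cuda_available"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `False` for `cuda_available` does not block lightweight checks; per the notes, it matters for the full training path.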
For lightweight local development and CI-style checks, an editable install is also supported:
python -m pip install -e .[test]

The public repository does not ship prepared datasets. Generate them locally:
bash scripts/data/prepare_all.sh --out_dir data

Strict verification:
bash scripts/data/prepare_all.sh --out_dir data --strict 1

See DATA_SOURCES.md for provenance and SHA256 pinning.
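
To illustrate what SHA256 pinning means in general, the sketch below hashes a local file and compares it with an expected digest. It is an assumption-laden example, not the logic of `prepare_all.sh`: the function names `sha256_of` and `matches_pin` are hypothetical, and the pinned digests themselves should be taken from DATA_SOURCES.md.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Chunked reads keep memory use flat for large dataset files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hex):
    """Check a file against a pinned digest (hypothetical helper)."""
    return sha256_of(path) == pinned_hex.strip().lower()
```

A mismatch means the local file differs from the pinned version, so results produced from it may not be comparable.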
The smoke test checks task loading, prompt rendering, and evaluator wiring. It is not a model-quality regression test.
bash scripts/reproduce/smoke.sh --task math --limit 1 --print_example 0

You can also run:
bash scripts/reproduce/smoke.sh --task code --limit 1 --print_example 0

bash scripts/reproduce/paper_main_results.sh sweep \
  --registry configs/main_results_registry.yaml \
  --only_methods SFT

export PRETRAIN='Qwen/Qwen2.5-3B-Instruct'
bash scripts/reproduce/paper_train.sh

bash scripts/reproduce/paper_main_results.sh sweep \
  --registry configs/main_results_registry.yaml

bash scripts/reproduce/paper_analysis_figs.sh fig2 \
  --suite math \
  --run_c3 ckpt/_runs/<C3_run_dir> \
  --run_mappo ckpt/_runs/<MAPPO_run_dir> \
  --run_magrpo ckpt/_runs/<MAGRPO_run_dir> \
  --run_sft ckpt/_runs/_sft_main_results/<SFT_dir> \
  --mappo_critic_ckpt <PATH_TO_MAPPO_CRITIC>

- quick repository navigation: CODE_MAP.md
- paper-to-code mapping and invariants: IMPLEMENTATION_AUDIT.md
Before publishing the repository, run:
bash scripts/audit/pre_release.sh

Single-command preflight:
bash scripts/reproduce/preflight_repro.sh --task math

The release surface must not include local generated directories such as data/, artifacts/, ckpt/, runs/, wandb/, or models/. See RELEASE_POLICY.md.
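
As a quick local sanity check of the directories listed above, the following sketch reports which of them exist at the repository root. It is a hypothetical helper (`find_forbidden_dirs` is not part of the repository) and it only detects presence on disk. It does not tell you whether a directory is tracked or ignored by version control, and it does not replace the audit scripts.

```python
from pathlib import Path

# Directory names taken from the release note above.
FORBIDDEN_DIRS = ("data", "artifacts", "ckpt", "runs", "wandb", "models")


def find_forbidden_dirs(repo_root="."):
    """Return the generated directories that exist directly under repo_root."""
    root = Path(repo_root)
    return sorted(name for name in FORBIDDEN_DIRS if (root / name).is_dir())


if __name__ == "__main__":
    found = find_forbidden_dirs()
    if found:
        print("Present locally (keep out of the release):", ", ".join(found))
    else:
        print("No generated directories found at the repository root.")
```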
For the full local release gate, use:
bash scripts/audit/release_gate.sh