Main CLI: `python generator/generate.py`
Train an AR model and generate peptides for specific alleles:
```bash
python generator/generate.py \
  --data data/mhc_ab.npz \
  --shap_json explainer/shap_results.json \
  --out_dir outputs/run1 \
  --alleles "HLA-A*02:01,HLA-B*07:02" \
  --num_final 50 \
  --device mps
```

Note that allele names contain `*`, so they should be quoted to prevent shell glob expansion.
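If you drive runs from Python (for allele sweeps or notebooks), a thin wrapper over the same flags can help. The flags below are copied from the command above; the helper itself (`run_generation`) is purely an illustrative sketch, not part of the repo:

```python
import subprocess

def run_generation(alleles, out_dir, num_final=50, device="mps"):
    """Hypothetical helper: shell out to the generator CLI with the flags shown above."""
    cmd = [
        "python", "generator/generate.py",
        "--data", "data/mhc_ab.npz",
        "--shap_json", "explainer/shap_results.json",
        "--out_dir", out_dir,
        "--alleles", ",".join(alleles),  # comma-separated allele list, as in the example
        "--num_final", str(num_final),
        "--device", device,
    ]
    subprocess.run(cmd, check=True)  # raise if the CLI exits non-zero

run_generation(["HLA-A*02:01", "HLA-B*07:02"], "outputs/run1")
```

Passing the command as an argument list avoids the shell entirely, so the `*` in allele names needs no quoting here.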
Run with a fuller configuration:

```bash
python generator/generate.py \
  --data data/mhc_ab.npz \
  --shap_json explainer/shap_results.json \
  --out_dir outputs/run1 \
  --alleles "HLA-A*02:01" \
  --num_final 50 \
  --ar_epochs 50 \
  --temperature 1.0 \
  --top_p 0.9 \
  --enable_refinement \
  --refine_steps 15 \
  --length_quotas \
  --seed 42 \
  --device mps
```
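`--temperature` and `--top_p` control standard temperature scaling and nucleus (top-p) sampling. As a reference for what those two knobs mean (a minimal NumPy sketch of the standard technique, not the repo's exact implementation):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_p=0.9, rng=None):
    """Temperature scaling followed by nucleus (top-p) truncation; returns a token id."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())  # stable softmax
    probs /= probs.sum()
    # Keep the smallest set of tokens whose cumulative probability reaches top_p.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]
    return rng.choice(keep, p=probs[keep] / probs[keep].sum())
```

Lower temperature sharpens the distribution toward high-probability residues; lower `top_p` truncates the low-probability tail before sampling.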
Skip AR training and reuse an existing checkpoint:

```bash
python generator/generate.py \
  --data data/mhc_ab.npz \
  --ar_ckpt generator/checkpoints/ar_transformer_latest.pt \
  --out_dir outputs/run2 \
  --alleles all \
  --num_final 50
```
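Before pointing `--ar_ckpt` at a checkpoint, you can sanity-check that it deserializes. The exact structure of the checkpoint and the metadata fields are repo-specific, so treat this purely as a sketch:

```python
import json
from pathlib import Path

import torch

ckpt_path = Path("generator/checkpoints/ar_transformer_latest.pt")
meta_path = ckpt_path.parent / (ckpt_path.stem + ".meta.json")

# map_location="cpu" keeps the check device-agnostic
# (the actual run can still use --device mps).
state = torch.load(ckpt_path, map_location="cpu")
print(type(state))  # state_dict vs. wrapper dict depends on how it was saved

if meta_path.exists():
    meta = json.loads(meta_path.read_text())
    print(json.dumps(meta, indent=2))  # metadata fields are repo-specific
```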
Features:

- Transformer-based autoregressive (AR) peptide generator
- NetMHCpan 4.2-based scoring with no surrogate model
- SHAP-guided local refinement using position-importance weights
- Diversity filtering with edit distance and k-mer Jaccard similarity (see the sketch after this list)
- Length quotas (8: 10%, 9: 40%, 10: 30%, 11: 20%) or proportional selection
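The diversity filter combines two standard sequence-similarity measures. A self-contained sketch of the idea, where the thresholds (`min_edit`, `max_jaccard`) and the greedy combination rule are assumptions, not the repo's exact settings:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def kmer_jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity between the k-mer sets of two peptides."""
    ka = {a[i:i + k] for i in range(len(a) - k + 1)}
    kb = {b[i:i + k] for i in range(len(b) - k + 1)}
    return len(ka & kb) / len(ka | kb) if ka | kb else 0.0

def diversity_filter(peptides, min_edit=3, max_jaccard=0.5):
    """Greedily keep peptides that are dissimilar (by both metrics) to all kept so far."""
    kept = []
    for p in peptides:  # assumes peptides arrive sorted best-score-first
        if all(edit_distance(p, q) >= min_edit and kmer_jaccard(p, q) <= max_jaccard
               for q in kept):
            kept.append(p)
    return kept
```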
Per-allele outputs:

```
outputs/<run_id>/<allele>/
  final_peptides.csv   - Generated peptides with scores
  summary.json         - Statistics and motif analysis
```

Run-level outputs:

```
outputs/<run_id>/
  run_summary.json     - Overall run configuration and results
  ar_model.pt          - Trained AR model (if training was performed)
```

Reusable checkpoints:

```
generator/checkpoints/
  ar_transformer_latest.pt        - Reusable checkpoint preset
  ar_transformer_latest.meta.json - Metadata for the reusable checkpoint
```
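To collect results after a run, the run summary and the per-allele CSVs can be read back directly. File names come from the listing above; the column names inside `final_peptides.csv` are repo-specific, so this sketch only prints what it finds:

```python
import json
from pathlib import Path

import pandas as pd

run_dir = Path("outputs/run1")  # <run_id> from the listing above

summary = json.loads((run_dir / "run_summary.json").read_text())
print("run summary keys:", list(summary))

# Each allele subdirectory holds its own final_peptides.csv.
for allele_dir in sorted(d for d in run_dir.iterdir() if d.is_dir()):
    csv_path = allele_dir / "final_peptides.csv"
    if csv_path.exists():
        df = pd.read_csv(csv_path)
        print(f"{allele_dir.name}: {len(df)} peptides, columns={list(df.columns)}")
```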