
NOT READY FOR MERGE: add attention looping verification harness #40

Draft
booth-algo wants to merge 2 commits into main from asm-count-verification
booth-algo commented on May 18, 2026

Summary

Adds testbench/report artifacts for ATen MHA attention-looping work:

  • Adds attention_looping_compare.py for looped-vs-unrolled static/dynamic/cycle comparison.
  • Adds flash_attention_mha_test.py to run ATen per-head MHA through the transactional emulator against PyTorch SDPA golden.
  • Adds doc/ATTENTION_LOOPING_STEP2_REPORT.md with commands, branch hashes, estimates, and golden-check output.

The actual compiler implementation is in the companion PLENA_Compiler PR from feat/codegen-addr-reg-init.
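As a rough illustration of what a looped-vs-unrolled static/dynamic comparison measures, here is a minimal sketch. It is not taken from attention_looping_compare.py. The `Loop` class, the instruction names, and the accounting rule (one start marker executed once, one end marker executed once per trip) are all assumptions made for the example; the real ISA and harness may count loop overhead differently.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    """A counted loop; body items are instruction strings or nested Loops."""
    trips: int
    body: list = field(default_factory=list)


def static_count(items) -> int:
    """Count instruction lines as they would appear in the source file."""
    total = 0
    for item in items:
        if isinstance(item, Loop):
            total += 2 + static_count(item.body)  # assumed start + end markers
        else:
            total += 1
    return total


def dynamic_count(items) -> int:
    """Count executed instructions after expanding every loop."""
    total = 0
    for item in items:
        if isinstance(item, Loop):
            # Assumption: the start marker runs once, the end marker once per trip.
            total += 1 + item.trips * (dynamic_count(item.body) + 1)
        else:
            total += 1
    return total


def unroll(items) -> list:
    """Flatten all loops into a straight-line instruction list."""
    flat = []
    for item in items:
        if isinstance(item, Loop):
            flat.extend(unroll(item.body) * item.trips)
        else:
            flat.append(item)
    return flat


# Hypothetical per-head attention body repeated for 6 heads.
looped = [Loop(6, ["load_q", "qk_matmul", "softmax", "pv_matmul"])]
unrolled = unroll(looped)

print("looped:  ", static_count(looped), "static,", dynamic_count(looped), "dynamic")
print("unrolled:", static_count(unrolled), "static,", dynamic_count(unrolled), "dynamic")
```

Under these assumptions the looped form is much smaller statically while executing slightly more instructions, because the loop-control markers are extra work. That is the same direction as the CLM-60M numbers reported in this PR.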

CLM-60M Native Layer 0 Counts

Rerun locally with compile_hf_model(model, seq_len=64, hidden_size=None, inter_dim=None, num_layers=1). Native dims: hidden=384, inter=1408, heads=6, kv_heads=2, head_dim=64.
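The native dims imply grouped-query attention, since kv_heads is smaller than heads. The following sketch checks that the dims are mutually consistent and shows one common query-to-KV head mapping. The mapping (consecutive query heads share a KV head) is the usual GQA convention and is an assumption here; the PR does not state how the per-head ATen MHA path assigns KV heads.

```python
# Native CLM-60M dims quoted in the PR description.
hidden, inter, heads, kv_heads, head_dim = 384, 1408, 6, 2, 64

# The query projection width should equal the hidden size.
assert heads * head_dim == hidden

# Each KV head is shared by a group of query heads.
assert heads % kv_heads == 0
group_size = heads // kv_heads


def kv_head_for(query_head: int) -> int:
    """Assumed mapping: consecutive query heads share one KV head."""
    return query_head // group_size


mapping = {h: kv_head_for(h) for h in range(heads)}
kv_width = kv_heads * head_dim

print("query heads per KV head:", group_size)
print("query head -> KV head:", mapping)
print("Q width:", heads * head_dim, "K/V width:", kv_width)
```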

| Metric | Previous | Current | Change |
| --- | --- | --- | --- |
| Total ASM source lines | 35,479 | 15,367 | -20,112 (-56.7%) |
| Actual static instruction lines | 34,041 | 14,403 | -19,638 (-57.7%, 2.36x smaller) |
| Comment / metadata lines | 1,438 | 964 | -474 (-33.0%) |
| Loop-expanded dynamic instructions | 645,334 | 649,762 | +4,428 (+0.69%) |
| Estimated cycles | 8,915,366 | 8,919,794 | +4,428 (+0.05%) |
| Estimated ms @ 1GHz | 8.915366 | 8.919794 | +0.004428 (+0.05%) |
| C_LOOP_START static lines | 248 | 296 | +48 |

Previous file analyzed: /tmp/clm60m_native_1layer/generated_asm_code.asm. Current file analyzed: /tmp/clm60m_native_1layer_after_attention_looping/generated_asm_code.asm.
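The Change column can be rechecked from the Previous and Current values alone. The sketch below does that arithmetic; it uses only the numbers in the table, and the helper names `change` and `cycles_to_ms` are made up for this example rather than taken from the harness.

```python
# Values copied from the table: (previous, current) per metric.
METRICS = {
    "total_asm_source_lines": (35_479, 15_367),
    "static_instruction_lines": (34_041, 14_403),
    "comment_metadata_lines": (1_438, 964),
    "dynamic_instructions": (645_334, 649_762),
    "estimated_cycles": (8_915_366, 8_919_794),
}


def change(previous: int, current: int) -> tuple[int, float]:
    """Return (absolute delta, percent change relative to previous)."""
    delta = current - previous
    return delta, 100.0 * delta / previous


def cycles_to_ms(cycles: int, clock_hz: float = 1e9) -> float:
    """Convert a cycle count to milliseconds at the given clock rate."""
    return cycles / clock_hz * 1e3


for name, (prev, cur) in METRICS.items():
    delta, pct = change(prev, cur)
    print(f"{name:28s} {prev:>10,} {cur:>10,} {delta:>+8,} ({pct:+.2f}%)")

# Static instruction lines plus comment/metadata lines add up to total lines.
for idx in (0, 1):
    assert (METRICS["static_instruction_lines"][idx]
            + METRICS["comment_metadata_lines"][idx]
            == METRICS["total_asm_source_lines"][idx])

prev_static, cur_static = METRICS["static_instruction_lines"]
print(f"static shrink factor: {prev_static / cur_static:.2f}x")
print(f"estimated ms @ 1GHz: {cycles_to_ms(METRICS['estimated_cycles'][1]):.6f}")
```

Note that the dynamic-instruction delta and the estimated-cycle delta are both +4,428, which is consistent with the estimate charging one cycle per added instruction, though the PR does not describe the cycle model.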

Verification

  • py_compile for attention helper and harness files
  • attention_looping_compare.py for head_dim=64
  • attention_looping_compare.py for head_dim=128
  • attention_codegen_compare.py after attention looping
  • flash_attention_mha_test.py transactional emulator golden check

Status

Not ready for merge. Branch has been replayed onto current main; opened for review/context while the related compiler branch is still being validated.

booth-algo force-pushed the asm-count-verification branch from 5060af4 to 6242156 on May 18, 2026 23:03