fix: validate final layer in VRAM stage compare #41

Merged
booth-algo merged 3 commits into main from chore/sliced-emulator-entrypoints on May 14, 2026

@booth-algo
Collaborator

Summary

Make aten/vram_stage_compare.py validate the requested decoder layer instead of always reading layer 0 weights and O_full_0.

Changes:

  • add optional layer_idx argument to compare_stages
  • default layer_idx to the last W_o_<idx>.pt found in the build directory
  • read W_o, W_gate, W_up, and W_down for that layer
  • parse O_full_<layer_idx> from generated ASM comments
  • expose optional CLI arg 3 for manual layer selection
  • include layer_idx in the returned result dict
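
As a rough illustration of the default described above, the last layer index could be inferred by scanning the build directory for `W_o_<idx>.pt` files and taking the numerically largest index. This is a minimal sketch, not the code from the PR: the helper names `infer_last_layer_idx` and `layer_weight_paths` are hypothetical, and the `W_gate_<idx>.pt`, `W_up_<idx>.pt`, and `W_down_<idx>.pt` filenames are assumed to follow the same pattern as `W_o_<idx>.pt`.

```python
import re
from pathlib import Path
from typing import Dict, Optional

# Matches the W_o_<idx>.pt naming mentioned in the PR description.
_W_O_PATTERN = re.compile(r"^W_o_(\d+)\.pt$")


def infer_last_layer_idx(build_dir) -> Optional[int]:
    """Return the largest <idx> among W_o_<idx>.pt files, or None if none exist.

    Hypothetical helper: the real compare_stages may infer the default differently.
    """
    indices = []
    for path in Path(build_dir).iterdir():
        match = _W_O_PATTERN.match(path.name)
        if match:
            indices.append(int(match.group(1)))
    # Compare as integers so that index 10 sorts after index 9.
    return max(indices) if indices else None


def layer_weight_paths(build_dir, layer_idx: int) -> Dict[str, Path]:
    """Build per-layer weight paths (filenames other than W_o are an assumption)."""
    names = ("W_o", "W_gate", "W_up", "W_down")
    return {name: Path(build_dir) / f"{name}_{layer_idx}.pt" for name in names}
```

Comparing the parsed indices as integers rather than sorting filenames as strings avoids picking `W_o_9.pt` over `W_o_10.pt` in builds with ten or more layers.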

This supports multi-layer native emulator validation, where the final-layer VRAM stage comparison is the authoritative check.

Validation

  • py_compile passed for aten/vram_stage_compare.py
  • Native 2-layer SmolVLM2 run validated the final layer with inferred layer_idx=1
  • O_proj + residual: 100.00% allclose, MSE 4.18e-06
  • norm + FFN + residual + final_norm: 100.00% allclose, MSE 2.61e-05
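
For readers unfamiliar with the two metrics reported above, the sketch below shows one common way to compute an elementwise "percent allclose" figure and a mean squared error over flat sequences of floats. The PR does not state the tolerances or the exact formula it uses, so the `rtol` and `atol` defaults and the function name `compare_metrics` are assumptions; the closeness test follows the usual `|actual - expected| <= atol + rtol * |expected|` convention.

```python
from typing import Sequence, Tuple


def compare_metrics(
    actual: Sequence[float],
    expected: Sequence[float],
    rtol: float = 1e-2,  # assumed tolerance, not taken from the PR
    atol: float = 1e-2,  # assumed tolerance, not taken from the PR
) -> Tuple[float, float]:
    """Return (percent of elements within tolerance, mean squared error)."""
    if len(actual) != len(expected) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(actual, expected))
    # Count elements that satisfy the elementwise closeness criterion.
    close = sum(1 for a, e in pairs if abs(a - e) <= atol + rtol * abs(e))
    mse = sum((a - e) ** 2 for a, e in pairs) / len(pairs)
    return 100.0 * close / len(pairs), mse
```

Reporting both numbers is useful because the percentage shows how many elements fall within tolerance, while the MSE summarizes the overall size of the error.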

@booth-algo booth-algo merged commit 9c12f0b into main May 14, 2026
3 checks passed
@booth-algo booth-algo deleted the chore/sliced-emulator-entrypoints branch May 14, 2026 19:01