NOT READY FOR MERGE: parameterize ATen MRAM tile capacity #46

Draft
booth-algo wants to merge 1 commit into main from fix/aten-mram-runtime-config

@booth-algo (Collaborator)

Summary

Implements part 3 of the large-tile plan for the ATen compiler:

  • parameterizes MRAM capacity/alignment by runtime mlen and mram_tile_capacity
  • threads mram_tile_capacity through PlenaCompiler -> IsaCompiler -> MemoryStateMixin -> MRAMAllocator
  • makes VRAMAllocator alignment use runtime mlen
  • replaces the hardcoded linear MAX_K_TILES path with self.mram_tile_capacity
  • threads mram_tile_capacity into native decoder CPU golden K-split reference
  • adds focused allocator and codegen regression tests
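
The threading described above can be pictured with a minimal sketch. Only the four class names and the `mlen` / `mram_tile_capacity` parameters come from this PR; the constructor signatures, the `init_memory_state` helper, the `mlen=64` default, and the assumption that one MRAM tile holds an `mlen x mlen` block are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MRAMAllocator:
    """Hypothetical allocator whose capacity follows runtime geometry."""
    mlen: int = 64                # assumed default, not taken from the PR
    mram_tile_capacity: int = 4   # default stated in the PR notes
    next_free: int = 0

    @property
    def tile_elems(self) -> int:
        # Assumption: one MRAM tile stores an mlen x mlen block of elements.
        return self.mlen * self.mlen

    @property
    def capacity_elems(self) -> int:
        # Capacity scales with both runtime parameters instead of a constant.
        return self.mram_tile_capacity * self.tile_elems

    def allocate_tile(self) -> int:
        """Return the tile-aligned base address of the next free tile."""
        if self.next_free + self.tile_elems > self.capacity_elems:
            raise MemoryError("MRAM tile capacity exceeded")
        base = self.next_free
        self.next_free += self.tile_elems
        return base


class MemoryStateMixin:
    def init_memory_state(self, mlen: int, mram_tile_capacity: int) -> None:
        # Hypothetical hook: the mixin owns the allocator instance.
        self.mram = MRAMAllocator(mlen=mlen, mram_tile_capacity=mram_tile_capacity)


class IsaCompiler(MemoryStateMixin):
    def __init__(self, mlen: int = 64, mram_tile_capacity: int = 4) -> None:
        self.mlen = mlen
        self.mram_tile_capacity = mram_tile_capacity
        self.init_memory_state(mlen, mram_tile_capacity)


class PlenaCompiler:
    def __init__(self, mlen: int = 64, mram_tile_capacity: int = 4) -> None:
        # The top-level compiler only forwards the runtime geometry.
        self.isa = IsaCompiler(mlen=mlen, mram_tile_capacity=mram_tile_capacity)
```

Under these assumptions, `PlenaCompiler(mlen=256, mram_tile_capacity=8).isa.mram` reports a capacity of eight 256 x 256 tiles, and successive `allocate_tile()` calls return bases that are multiples of the runtime tile size.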

Verification

  • py_compile on changed compiler/reference/test files
  • targeted allocator/codegen tests:
    • test_mram_allocator_scales_with_runtime_mlen
    • test_compiler_threads_runtime_memory_geometry
    • test_linear_projection_uses_runtime_mram_tile_capacity
  • a lightweight subset of the existing ATen compiler unit tests passed
  • smoke: mlen=256, blen=64, M=256 K=256 N=256 compiles
  • smoke: mlen=256, blen=64, M=256 K=1280 N=256 compiles with K-split accumulation
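
To illustrate why the second smoke case needs K-split accumulation while the first does not, here is a rough pure-Python sketch. It assumes that a split is triggered when the number of `mlen`-wide K tiles exceeds `mram_tile_capacity`; the function names `plan_k_chunks` and `k_split_matmul` are hypothetical and are not the compiler's or the CPU golden reference's actual API.

```python
def plan_k_chunks(K, mlen, mram_tile_capacity=4):
    """Split the K dimension into chunks of at most mram_tile_capacity tiles.

    Returns a list of (start, end) index pairs covering range(K).
    """
    if K % mlen != 0:
        # Mirrors the PR notes: no padding is added for non-multiples of mlen.
        raise ValueError("K must be a multiple of mlen")
    max_k = mlen * mram_tile_capacity
    return [(start, min(start + max_k, K)) for start in range(0, K, max_k)]


def k_split_matmul(a, b, mlen, mram_tile_capacity=4):
    """Reference matmul that accumulates partial products chunk by chunk."""
    M, K, N = len(a), len(b), len(b[0])
    out = [[0] * N for _ in range(M)]
    for k0, k1 in plan_k_chunks(K, mlen, mram_tile_capacity):
        for i in range(M):
            for j in range(N):
                # Each chunk contributes a partial sum into the same output.
                out[i][j] += sum(a[i][k] * b[k][j] for k in range(k0, k1))
    return out
```

With `mlen=256` and the default capacity of 4, `plan_k_chunks(256, 256)` yields a single chunk, whereas `plan_k_chunks(1280, 256)` yields two chunks, `(0, 1024)` and `(1024, 1280)`, whose partial products are accumulated into one output.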

Notes

The default remains mram_tile_capacity=4, so existing behavior is unchanged. This PR removes the hardcoded value from the allocator and codegen, but it does not add padding for native model dimensions that are not multiples of a larger mlen.
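
As a small illustration of the padding gap mentioned above, the following sketch computes how many elements a dimension would need to be padded by to reach the next multiple of `mlen`. The helper name `padding_needed` is hypothetical and is not part of this PR, which does not perform any padding.

```python
def padding_needed(dim, mlen):
    """Elements required to round dim up to the next multiple of mlen."""
    if dim <= 0 or mlen <= 0:
        raise ValueError("dim and mlen must be positive")
    # (-dim) % mlen is 0 when dim is already a multiple of mlen.
    return (-dim) % mlen


def unaligned_dims(dims, mlen):
    """Return {name: padding} for every dimension that is not a multiple of mlen."""
    return {
        name: padding_needed(size, mlen)
        for name, size in dims.items()
        if padding_needed(size, mlen) != 0
    }
```

For example, a dimension of 1280 is a multiple of 256 and needs no padding, but with `mlen=512` it would need 256 extra elements to reach 1536.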
