You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Refactor: introduce tiered profiling levels for a2a3 tensormap_and_ringbuffer swimlane export
Replace the boolean enable_profiling flag with an integer perf_level
throughout the profiling pipeline (CLI → Python bindings → C++ runtime →
AICPU executor → PerformanceCollector → JSON export).
Profiling levels:
0 = off
1 = AICore task start/end timing only (JSON version 0)
2 = + dispatch timestamps, finish timestamps, fanout edges (JSON version 1)
3 = + AICPU scheduler/orchestrator phase buffers (JSON version 2)
Key changes:
- ChipCallConfig: bool enable_profiling → int perf_level
- CLI --enable-profiling: store_true → optional int (bare flag defaults to 3)
- nanobind property: backward-compatible bool→3 coercion for legacy callers
- AICPU executor: split into task_recording_enabled (>0) vs
phase_recording_enabled (>=3) to skip phase overhead at lower levels
- PerformanceCollector: skip phase buffer allocation when perf_level < 3;
version selection based on perf_level and presence of phase data
- swimlane_converter.py: accept version 0, tolerate missing fanout field
- Fix scene_test.py: `val and cond` truncated int to bool; use ternary
0 commit comments