Refactor: unify two-phase dispatch and fix sync_start spin loop by zhusy54 · Pull Request #553 · hw-native-sys/simpler

zhusy54 · 2026-04-14T10:51:39Z

Summary

Refactor idle and pending dispatch into a single unified dispatch_shape() helper, replacing two separate hand-unrolled loops in resolve_and_dispatch_pto2
Extend pending dispatch from AIC-only to all three shapes (AIC, AIV, MIX)
Fix latent bug: get_idle_cluster_offset_states incorrectly applied AIC pending_occupied filter to AIV idle dispatch, causing spurious idle-slot blocks
Fix sync_start spin loop: tasks that cannot be dispatched in pending phase are now requeued immediately instead of spinning
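
The idle-filter fix in the third item can be pictured with a small bitmask sketch. Everything below is an assumption made for illustration: the `CoreMasks` struct, the one-bit-per-cluster layout, and the `idle_candidates_old` / `idle_candidates_new` names are hypothetical and are not taken from the real `CoreTracker`. The sketch only shows the general idea that an AIC-specific `pending_occupied` mask should not remove AIV candidates.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical shapes and masks; the real CoreTracker layout may differ.
enum class Shape { AIC, AIV, MIX };

struct CoreMasks {
    std::uint32_t aic_idle;              // bit c set: the AIC core of cluster c is idle
    std::uint32_t aiv_idle;              // bit c set: cluster c can accept an AIV block
    std::uint32_t aic_pending_occupied;  // bit c set: the AIC pending slot of cluster c is taken
};

// Behaviour described as buggy: the AIC filter is applied to every shape.
std::uint32_t idle_candidates_old(const CoreMasks& m, Shape shape) {
    std::uint32_t idle = (shape == Shape::AIC) ? m.aic_idle
                       : (shape == Shape::AIV) ? m.aiv_idle
                                               : (m.aic_idle & m.aiv_idle);
    return idle & ~m.aic_pending_occupied;
}

// Behaviour after the fix: only AIC candidates are filtered.
std::uint32_t idle_candidates_new(const CoreMasks& m, Shape shape) {
    switch (shape) {
        case Shape::AIC: return m.aic_idle & ~m.aic_pending_occupied;
        case Shape::AIV: return m.aiv_idle;
        case Shape::MIX: return m.aic_idle & m.aiv_idle;  // assumed: all cores idle
    }
    return 0;
}
```

With cluster 0 holding a running AIC core whose pending slot is taken, the old query hides cluster 0 from AIV idle dispatch even though its AIV side is idle, while the new query keeps it available.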

Key Changes

get_idle_cluster_offset_states → get_idle_core_offset_states: AIC-only filter for pending_occupied; AIV/MIX no longer filtered (invariant guarantees idle cores always have pending_occupied=0)
get_pending_only_cluster_offset_states → get_pending_core_offset_states: extended to support AIV (per-core) and MIX (all 3 cores per cluster must be running+free)
Added DispatchPhase enum and get_dispatchable_cores(shape, phase) unified query
Added dispatch_shape(): encapsulates all per-shape dispatch logic for one phase, including sync_start gating, drain-mode entry, multi-block do-while, and lazy refresh
dispatch_block_to_cluster renamed to dispatch_block, with an explicit to_pending flag; MIX dispatch_mix_block_to_cluster now forwards the flag to all three subtask slots
resolve_and_dispatch_pto2 main loop collapses to two nested loops: for phase in {IDLE, PENDING}: for shape in dispatch_order: dispatch_shape(...)
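
As a rough illustration of the collapsed main loop, the sketch below walks every phase and shape and records the visit order. It is not the code from aicpu_executor.cpp: the `DispatchLog` type, the `name` helpers, the simplified `dispatch_shape` signature, and the chosen `dispatch_order` are assumptions, and the stub does none of the sync_start gating, drain-mode entry, or lazy refresh that the real helper is described as handling.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-ins for the enums named in this PR.
enum class DispatchPhase { IDLE, PENDING };
enum class Shape { AIC, AIV, MIX };

// Records visits so the loop structure can be inspected.
struct DispatchLog {
    std::vector<std::string> visits;
};

inline const char* name(DispatchPhase p) {
    return p == DispatchPhase::IDLE ? "IDLE" : "PENDING";
}

inline const char* name(Shape s) {
    switch (s) {
        case Shape::AIC: return "AIC";
        case Shape::AIV: return "AIV";
        case Shape::MIX: return "MIX";
    }
    return "?";
}

// Stub for the per-shape helper: it only logs which (phase, shape) ran.
void dispatch_shape(Shape shape, DispatchPhase phase, DispatchLog& log) {
    log.visits.push_back(std::string(name(phase)) + ":" + name(shape));
}

// Two nested loops: all shapes in the IDLE phase, then all in PENDING.
DispatchLog resolve_and_dispatch(const std::vector<Shape>& dispatch_order) {
    assert(!dispatch_order.empty());
    DispatchLog log;
    for (DispatchPhase phase : {DispatchPhase::IDLE, DispatchPhase::PENDING}) {
        for (Shape shape : dispatch_order) {
            dispatch_shape(shape, phase, log);
        }
    }
    return log;
}
```

Putting the phase in the outer loop means every shape gets a chance at idle cores before any shape falls back to pending slots.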

Testing

All simulation tests pass (./ci.sh -p a2a3sim)

🤖 Generated with Claude Code

gemini-code-assist

Code Review

This pull request refactors the AICPU executor to implement a two-phase dispatch mechanism (IDLE and PENDING) for AIC, AIV, and MIX resource shapes. The changes introduce a unified dispatch_shape function and update the CoreTracker to manage bit offsets for both phases, which also addresses a bug where AIC core states could incorrectly interfere with AIV idle dispatch. Feedback was provided regarding an inaccuracy in the code comments describing the bit representation for AIV idle dispatch, noting that it incorrectly claims a per-core bit representation when it is actually per-cluster for the idle phase.

Replace the separate idle-dispatch and AIC-only pending-dispatch loops with a single dispatch_shape() that iterates IDLE then PENDING phases for every resource shape (AIC, AIV, MIX).

Key changes:
- Add CoreTracker::DispatchPhase enum and get_dispatchable_cores() as a unified query for both phases
- Rename get_idle_cluster_offset_states -> get_idle_core_offset_states; skip AIC-centric pending_occupied filter for AIV/MIX (fixes a latent bug where running+pending AIC blocked AIV idle dispatch)
- Consolidate get_pending_only_cluster_offset_states into get_pending_core_offset_states, extending pending dispatch to AIV and MIX shapes
- MIX pending dispatch: require at least one running core per cluster but allow idle cores to participate; dispatch_mix_block_to_cluster resolves per-core to_pending (idle cores use to_pending=false to trigger change_core_state, running cores use to_pending=true)
- Rename dispatch_block_to_cluster -> dispatch_block with a to_pending parameter; AIV pending dispatch passes resolved core bit offset
- Fix infinite spin when all tasks in a pending-phase batch are sync_start (requeued without dispatching); dispatch_shape now breaks when dispatched_any is false
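
The spin-loop fix in the last item can be sketched in isolation. This is a simplified model, not the PR's code: `Task`, `Result`, `dispatch_pending_phase`, the ready queue, and the slot counter are hypothetical, and it assumes (following the commit message) that sync_start tasks are requeued rather than dispatched in the pending phase. The point is the `dispatched_any` check: without it, a queue holding only sync_start tasks would be popped and requeued forever.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical task record; only the sync_start flag matters here.
struct Task {
    int id;
    bool sync_start;
};

struct Result {
    int dispatched;  // tasks placed into a free slot
    int rounds;      // batches processed before the loop exited
};

Result dispatch_pending_phase(std::deque<Task>& ready, int free_slots, int batch_size) {
    assert(batch_size > 0);
    Result r{0, 0};
    while (free_slots > 0 && !ready.empty()) {
        ++r.rounds;
        bool dispatched_any = false;
        std::vector<Task> requeue;
        const int n = std::min<int>(batch_size, static_cast<int>(ready.size()));
        for (int i = 0; i < n; ++i) {
            Task t = ready.front();
            ready.pop_front();
            if (t.sync_start || free_slots == 0) {
                requeue.push_back(t);  // cannot go out in this phase
                continue;
            }
            --free_slots;
            ++r.dispatched;
            dispatched_any = true;
        }
        for (const Task& t : requeue) {
            ready.push_back(t);
        }
        if (!dispatched_any) {
            break;  // nothing progressed: stop instead of spinning
        }
    }
    return r;
}
```

In this model, a queue containing only sync_start tasks exits after a single round with the tasks still queued for a later scheduling pass.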

gemini-code-assist bot reviewed Apr 14, 2026

Comment thread src/a2a3/runtime/tensormap_and_ringbuffer/aicpu/aicpu_executor.cpp

zhusy54 force-pushed the dual-sched-MIX branch from 4b58add to 4af4a86 Compare April 14, 2026 13:50

zhusy54 marked this pull request as draft April 15, 2026 00:39

zhusy54 force-pushed the dual-sched-MIX branch from 55b653d to 7472fae Compare April 15, 2026 07:54

zhusy54 force-pushed the dual-sched-MIX branch from 7472fae to 9951499 Compare April 15, 2026 09:15

zhusy54 marked this pull request as ready for review April 15, 2026 09:16

poursoul approved these changes Apr 15, 2026

poursoul merged commit 5ddc630 into hw-native-sys:main Apr 15, 2026
15 checks passed

zhusy54 mentioned this pull request Apr 15, 2026

[Performance] Runtime performance optimization tracking #545

Open

poursoul merged 1 commit into hw-native-sys:main from zhusy54:dual-sched-MIX
