-
Notifications
You must be signed in to change notification settings - Fork 14
Description
TL;DR: I think converting EBB to regular BB would simplify our current code and also enable us to do more advanced compiler passes more easily
Background
HIR currently uses extended basic blocks (EBBs). An EBB allows branches (e.g. IfTrue, IfFalse) to appear anywhere within a block, not just at the end. The LIR, by contrast, uses conventional basic blocks where only the last 0, 1, or 2 instructions can be terminators.
This mismatch creates unnecessary complexity and I'd like to propose converting HIR to use traditional basic blocks.
Why
1. Unify the block model between HIR and LIR
LIR already uses traditional basic blocks. The codegen lowering pass must split every HIR EBB at each mid-block branch, creating a new fall-through LIR block.
Multiple LIR blocks end up sharing the same hir_block_id, which complicates any mapping between the two IRs.
If HIR used the same block model, lowering would be a straightforward 1:1 block mapping.
2. Reuse data flow algorithms between HIR and LIR
LIR has liveness analysis, critical edge splitting , and successor/predecessor computation that all assume traditional basic blocks. HIR has its own ControlFlowInfo that must work around EBBs.
With a shared block model, these algorithms could be written once and reused, or at minimum follow the same patterns and assumptions.
3. Simplify CFG analysis
With EBBs, computing successors requires scanning every instruction in a block:
// Since ZJIT uses extended basic blocks, one must check all instructions
// for their ability to jump to another basic block, rather than just
// the instructions at the end of a given basic block.
This affects every pass that needs CFG information:
ControlFlowInfo::new(): successor/predecessor mapspo_from(): reverse post-order traversalclean_cfg(): CFG cleanup / dead block eliminationfold_constants(): must break on mid-block terminators
With conventional basic blocks, successors are determined solely by the last 2 instructions.
4. Enable standard compiler optimizations
Many textbook optimizations assume traditional basic blocks:
- Dominator tree construction
- Loop detection and LICM
- GVN/GCM (Global Code Motion)
- Tail merging / block deduplication
These are simpler to implement and reason about when each block has exactly one entry point and one exit point.
How hard is it?
I think we mostly need to:
- Change HIR construction to create new blocks as we're emitting (when we emit branching instructions like
IfTrue/IfFalse). - Eliminate the block splitting code in LIR.
- Update
ControlFlowInfo - Fix tests
I think everything else in HIR would stay mostly the same (BranchEdge, SSA param passing, etc).