
Commit 09f9850

Andre Ferreira and claude committed
chore(tasks): Phase 15 IR pipeline tasks — 4 tasks, 89 items
ERA 1, Phase 3: Port HIR + THIR + MIR from C++ (~40K LOC) to TML (~26K LOC):

- phase15a: HIR lowering — desugaring, monomorphization, closure capture (24 items)
- phase15b: THIR lowering — coercion insertion, exhaustiveness checking (16 items)
- phase15c: MIR builder — SSA construction, basic blocks, 40+ instruction types (24 items)
- phase15d: MIR passes — 52 optimization passes in tiered order (25 items)

All 4 tasks include proposals (66-128 lines each).
Order: 15a → 15b → 15c → 15d (sequential)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 832f372 commit 09f9850

13 files changed

Lines changed: 647 additions & 1 deletion


.rulebook/tasks/TASKS-INDEX.md

Lines changed: 14 additions & 1 deletion
@@ -1,7 +1,7 @@
11
# TML Project — Task Index
22

33
**Last updated**: 2026-04-05
4-
**Active tasks**: 35 | **Archived**: 5+
4+
**Active tasks**: 39 | **Archived**: 5+
55

66
---
77

@@ -158,6 +158,19 @@ Port the type checker (~21K LOC C++) to TML. Highest-risk, longest phase (8 mont
158158

159159
**Order**: 14a → 14b → 14c → 14d (sequential, 14a/14b can partially overlap)
160160

161+
## Phase 15 — IR Pipeline in TML (ERA 1, Phase 3)
162+
163+
Port HIR, THIR, MIR builder, and 52 MIR optimization passes from C++ to TML.
164+
165+
| ID | Task | Status | Priority | Progress |
|----|------|--------|----------|----------|
| 15a | [HIR Lowering](phase15a_hir-lowering/) | Planned | P0 | 0/24 |
| 15b | [THIR Lowering](phase15b_thir-lowering/) | Planned | P0 | 0/16 |
| 15c | [MIR Builder](phase15c_mir-builder/) | Planned | P0 | 0/24 |
| 15d | [MIR Passes (52 passes)](phase15d_mir-passes/) | Planned | P0 | 0/25 |
171+
172+
**Order**: 15a → 15b → 15c → 15d (sequential)
173+
161174
## Research
162175

163176
| ID | Task | Status | Priority | Progress |
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"status": "pending",
3+
"createdAt": "2026-04-06T01:20:11.686Z",
4+
"updatedAt": "2026-04-06T01:20:11.686Z"
5+
}
Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
# Proposal: HIR Lowering — Rewrite in TML (Phase 15a)
2+
3+
## Why
4+
5+
HIR (High-level Intermediate Representation) is the essential bridge between the parsed AST and
6+
the MIR builder. It takes the raw syntax tree and a fully-resolved TypeEnv from the type checker
7+
(phase14d) and produces a typed, desugared tree where every expression node carries its concrete
8+
type, all syntactic sugar has been eliminated, and generic functions have been monomorphized into
9+
concrete specializations. Without HIR, the MIR builder would need to re-implement type resolution,
10+
desugaring, and monomorphization — the entire point of the pipeline split is that HIR does this
11+
work once, cleanly, so MIR can focus on control flow and SSA construction.
12+
13+
This phase ports ~15,207 LOC of C++ across 8 source files and 4 serialization files to ~9,900
14+
lines of TML, completing the first major pass of the ERA 1 compiler port.
15+
16+
## What Changes
17+
18+
Port the following C++ files to TML:
19+
20+
- `compiler/src/hir/hir_builder.cpp` (1,511 LOC) — main HIR builder, entry point
21+
- `compiler/src/hir/hir_builder_expr.cpp` (1,485 LOC) — expression lowering (largest single file)
22+
- `compiler/src/hir/hir_builder_stmt.cpp` (433 LOC) — statement lowering
23+
- `compiler/src/hir/hir_builder_pattern.cpp` (362 LOC) — pattern lowering
24+
- `compiler/src/hir/hir_pass.cpp` (728 LOC) — HIR optimization passes
25+
- `compiler/src/hir/hir_pass_inline.cpp` (1,292 LOC) — HIR inlining pass
26+
- `compiler/src/hir/hir_printer.cpp` (619 LOC) — HIR pretty-printer
27+
- `compiler/src/hir/hir_expr.cpp` (225 LOC), `hir_pattern.cpp` (142 LOC),
28+
`hir_module.cpp` (90 LOC), `hir_stmt.cpp` (76 LOC) — HIR node type definitions
29+
- `compiler/src/hir/serializer/` (4 files, ~3,561 LOC) — binary serialization for pipeline bridge
30+
31+
New TML modules produced:
32+
33+
- `compiler-tml/src/hir/mod.tml` — module root
34+
- `compiler-tml/src/hir/expr.tml` — `HirExpr` enum (~40 variants, each carrying resolved type)
35+
- `compiler-tml/src/hir/stmt.tml` — `HirStmt` enum
36+
- `compiler-tml/src/hir/pattern.tml` — `HirPattern` enum
37+
- `compiler-tml/src/hir/module.tml` — `HirModule` struct
38+
- `compiler-tml/src/hir/builder.tml` — `HirBuilder` struct and `lower_module` entry point
39+
- `compiler-tml/src/hir/lower_expr.tml` — expression lowering (largest module)
40+
- `compiler-tml/src/hir/monomorph.tml` — generic instantiation engine
41+
- `compiler-tml/src/hir/printer.tml` — HIR pretty-printer
42+
43+
## Key Decisions
44+
45+
**HirExpr as enum with resolved types on every node.** Every `HirExpr` variant carries its
46+
concrete `Type` as a field. This is the central design decision of the HIR: type resolution is
47+
done once here, and every downstream pass (THIR, MIR, codegen) can read the type directly from
48+
the node without consulting the TypeEnv. This matches the C++ implementation where `HirExpr`
49+
nodes carry a `resolved_type` field populated during lowering.
50+
51+
**Monomorphization as a separate pass, not interleaved.** When expression lowering encounters a
52+
call to `foo[I32](x)`, it does not immediately generate the specialized `foo$I32$` function.
53+
Instead, it records the instantiation in a queue and lowers the call using the mangled name as
54+
a placeholder. A separate monomorphization pass then drains the queue, generating each
55+
specialization exactly once (handling recursive generics by checking the queue before adding).
56+
This matches the C++ design and avoids infinite recursion on recursive generic types like
57+
`List[List[T]]`.
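
The queue-then-drain scheme described above can be sketched as follows. This is a minimal illustration in Python, not the C++ or TML implementation: `MonomorphQueue`, `drain_queue`, and `mangle` borrow names from this proposal, but their signatures, the `seen` set, and the callback-based "lowering" are assumptions made for the example.

```python
from collections import deque


def mangle(name, type_args):
    # Follows the `foo$I32$` shape quoted in the text; the full scheme is assumed.
    return name + "$" + "".join(arg + "$" for arg in type_args)


class MonomorphQueue:
    """Hypothetical worklist: records each instantiation at most once."""

    def __init__(self):
        self.pending = deque()
        self.seen = set()

    def request(self, name, type_args):
        key = (name, tuple(type_args))
        if key not in self.seen:  # check before adding, so recursion terminates
            self.seen.add(key)
            self.pending.append(key)
        # The caller lowers the call site against this placeholder name.
        return mangle(name, type_args)


def drain_queue(queue, generics):
    """generics maps a generic name to a callback (type_args, queue) -> body.

    A callback may request further instantiations while it runs; the loop
    continues until no new requests appear.
    """
    specialized = {}
    while queue.pending:
        name, type_args = queue.pending.popleft()
        specialized[mangle(name, type_args)] = generics[name](type_args, queue)
    return specialized


def lower_count(type_args, queue):
    # A recursive generic: the body calls itself at the same instantiation.
    return ["call " + queue.request("count", type_args)]


def lower_wrap(type_args, queue):
    return ["call " + queue.request("count", type_args)]


queue = MonomorphQueue()
placeholder = queue.request("wrap", ["I32"])
result = drain_queue(queue, {"count": lower_count, "wrap": lower_wrap})
```

Because `request` consults `seen` before enqueueing, the recursive call inside `lower_count` returns the mangled name without scheduling a second copy of `count$I32$`.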
58+
59+
**Closure capture analysis generates closure struct types.** When a closure `do(x) expr` is
60+
encountered, capture analysis walks the body and identifies all free variables. For each closure,
61+
a synthetic struct type is generated (e.g., `__Closure_42`) with one field per captured variable,
62+
using the capture mode (ref, value, or move) determined by how the variable is used. The closure
63+
body is lowered as a separate `HirFunc` with the closure struct as its first parameter. This
64+
matches the C++ `hir_builder.cpp` approach and produces the struct layout that MIR codegen
65+
expects.
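
As a rough illustration of the capture step, the sketch below computes free variables over a toy tuple-based AST and builds a synthetic struct description. The node shapes (`"var"`, `"lit"`, `"binop"`, `"closure"`), the `closure_struct` helper, and the dictionary layout are all hypothetical, and capture-mode selection (ref, value, move) is left out entirely.

```python
def merge(first, second):
    """Concatenate two name lists, keeping first occurrences in order."""
    out = list(first)
    for name in second:
        if name not in out:
            out.append(name)
    return out


def free_vars(expr, bound=frozenset()):
    """Return variables used in expr that are not bound inside it."""
    kind = expr[0]
    if kind == "lit":
        return []
    if kind == "var":
        return [] if expr[1] in bound else [expr[1]]
    if kind == "binop":
        _, _op, lhs, rhs = expr
        return merge(free_vars(lhs, bound), free_vars(rhs, bound))
    if kind == "closure":
        _, params, body = expr
        # Parameters are bound inside the body, so they are never captured.
        return free_vars(body, bound | frozenset(params))
    raise ValueError(f"unknown node kind: {kind}")


def closure_struct(closure_id, closure, var_types):
    """Build a struct description with one field per captured variable."""
    fields = [(name, var_types[name]) for name in free_vars(closure)]
    return {"name": f"__Closure_{closure_id}", "fields": fields}


# Toy equivalent of `do(x) x + y * z`, where y and z come from the outer scope.
closure = ("closure", ["x"],
           ("binop", "+", ("var", "x"),
            ("binop", "*", ("var", "y"), ("var", "z"))))
layout = closure_struct(42, closure, {"y": "I32", "z": "I64"})
```

Field order here follows first use in the body; whether the C++ builder orders fields the same way is not stated in this proposal and would need to be checked against `hir_builder.cpp`.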
66+
67+
**Desugaring is exhaustive and irreversible.** After HIR lowering, `for`/`while`/`var` syntax
68+
does not exist in the output. All `for x in iter` loops become explicit iterator protocol calls
69+
(`iter.next()` in a loop). All `var` declarations become `let mut`. All `if let Just(x) = e`
70+
patterns become explicit `when` expressions. Downstream passes never need to handle these sugar
71+
forms.
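
A small sketch of the two statement rewrites named above, again over a toy tuple AST. The node shapes and the `Nothing` arm are assumptions for illustration (the text only names `Just`), and the iterable is assumed to already be a variable holding the iterator, so no temporary is introduced.

```python
def desugar(stmt):
    """Rewrite `var` and `for` statements into their core forms."""
    kind = stmt[0]
    if kind == "var":
        _, name, value = stmt
        return ("let_mut", name, value)
    if kind == "for":
        _, binder, iterable, body = stmt
        # Each iteration calls next() and matches on the result.
        step = ("call_method", iterable, "next", [])
        arms = [
            (("Just", binder), [desugar(inner) for inner in body]),
            (("Nothing",), [("break",)]),
        ]
        return ("loop", [("when", step, arms)])
    return stmt  # already in core form


lowered = desugar(("for", "x", ("var", "it"), [("var", "y", ("var", "x"))]))
```

After this pass the output contains only `loop`, `when`, and `let_mut` nodes, which is the property downstream passes rely on.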
72+
73+
## Architecture
74+
75+
```
compiler-tml/src/hir/
  mod.tml         -- module root, re-exports HirModule, HirExpr, HirBuilder
  expr.tml        -- HirExpr enum: Lit, Var, Field, Index, Call, When, Loop, Closure, ...
  stmt.tml        -- HirStmt enum: Let, Expr, Return, ...
  pattern.tml     -- HirPattern enum: Wildcard, Bind, Struct, Enum, Tuple, ...
  module.tml      -- HirModule: List[HirFunc] + List[HirTypeDef] + List[HirImpl]
  builder.tml     -- HirBuilder + lower_module() entry point
  lower_expr.tml  -- lower_expr(), lower_call(), lower_closure(), lower_when(), ...
  monomorph.tml   -- MonomorphQueue, drain_queue(), mangle_name()
  printer.tml     -- print_module(), print_func(), print_expr() for debug output
```
87+
88+
## Pipeline Integration
89+
90+
After phase15a completes, the pipeline is:
91+
92+
```
AST + TypeEnv (from phase14d)
        | phase15a: type resolution, desugaring, capture analysis, monomorphization
        v
HirModule → THIR lowerer (Phase 15b)
```
98+
99+
The `HirModule` output is a complete, typed, desugared representation of the source. Every
100+
expression carries its concrete type. No generic functions remain — only their specializations.
101+
No syntactic sugar remains. This is what THIR consumes.
102+
103+
## Success Criteria
104+
105+
Differential HIR comparison: run the TML HIR builder on all 1,700+ test files and all stdlib
106+
modules. Serialize the resulting `HirModule` (using the binary serializer from phase 5) and
107+
compare field-by-field with the C++ HIR builder output. Zero diffs required before phase 15b
108+
begins.
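
One way to picture the field-by-field comparison is a recursive diff over two decoded modules. The sketch below assumes both serialized outputs have already been decoded into nested dicts and lists; `diff_fields` and the sample field names are hypothetical and not part of either compiler.

```python
def diff_fields(expected, actual, path="module"):
    """Return a list of human-readable differences between two decoded trees."""
    if type(expected) is not type(actual):
        return [f"{path}: type {type(expected).__name__} != {type(actual).__name__}"]
    if isinstance(expected, dict):
        diffs = []
        for key in sorted(set(expected) | set(actual)):
            if key not in expected:
                diffs.append(f"{path}.{key}: unexpected field")
            elif key not in actual:
                diffs.append(f"{path}.{key}: missing field")
            else:
                diffs += diff_fields(expected[key], actual[key], f"{path}.{key}")
        return diffs
    if isinstance(expected, list):
        diffs = []
        if len(expected) != len(actual):
            diffs.append(f"{path}: length {len(expected)} != {len(actual)}")
        for index, (exp, act) in enumerate(zip(expected, actual)):
            diffs += diff_fields(exp, act, f"{path}[{index}]")
        return diffs
    return [] if expected == actual else [f"{path}: {expected!r} != {actual!r}"]


cpp_hir = {"funcs": [{"name": "main", "ret": "I32"}]}
tml_hir = {"funcs": [{"name": "main", "ret": "I64"}]}
report = diff_fields(cpp_hir, tml_hir)
```

An empty report corresponds to the "zero diffs" gate; reporting the path of each mismatch makes it easier to trace a difference back to the lowering step that produced it.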
109+
110+
## Risk Assessment
111+
112+
High. Monomorphization has well-known edge cases: recursive generics (`List[List[T]]`),
113+
mutually recursive generic functions, generic types appearing only in associated type positions,
114+
and trait object erasure (where monomorphization does not apply). The C++ implementation handles
115+
these via a worklist algorithm with cycle detection; the TML port must replicate this exactly.
116+
117+
Closure capture analysis is also subtle: the capture mode (ref vs value vs move) affects the
118+
generated closure struct layout and must match what the MIR builder expects. A mismatch here
119+
produces runtime use-after-free bugs that are difficult to trace.
120+
121+
Plan: implement data types and builder core first, test against simple non-generic modules,
122+
then add monomorphization and test with generic stdlib modules, then add closure support last.
123+
124+
## Dependencies
125+
126+
- **Requires**: phase14d complete (full TypeEnv with behavior dispatch and coercion annotations)
127+
- **Blocks**: phase15b (THIR lowerer consumes HirModule)
128+
- **Blocks**: phase15c (MIR builder HIR→MIR path consumes HirModule directly)
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
# Tasks: HIR Lowering — Rewrite in TML
2+
3+
**Status**: Planned (0/24)
4+
**Depends on**: phase14d (type checker complete, TypeEnv available)
5+
**Blocks**: phase15b (THIR needs HIR output)
6+
**Duration**: 8–10 weeks
7+
**Risk**: High — desugaring and monomorphization have subtle semantics
8+
**C++ reference**: ~15,207 LOC → ~9,900 TML
9+
10+
---
11+
12+
## Phase 1: HIR Data Types (4 items)
13+
14+
- [ ] 1.1 Create `compiler-tml/src/hir/mod.tml` — module root
15+
- [ ] 1.2 Create `compiler-tml/src/hir/expr.tml` — `HirExpr` enum: all expression variants with resolved types (mirrors hir_expr.hpp, ~40 variants)
16+
- [ ] 1.3 Create `compiler-tml/src/hir/stmt.tml` — `HirStmt` enum, `compiler-tml/src/hir/pattern.tml` — `HirPattern` enum
17+
- [ ] 1.4 Create `compiler-tml/src/hir/module.tml` — `HirModule` struct: functions, types, impls, all with resolved type annotations
18+
19+
## Phase 2: HIR Builder Core (6 items)
20+
21+
- [ ] 2.1 Create `compiler-tml/src/hir/builder.tml` — `HirBuilder` struct: TypeEnv ref, current module, monomorphization queue
22+
- [ ] 2.2 Implement `lower_module(ast: Module, env: ref TypeEnv) -> HirModule` — entry point
23+
- [ ] 2.3 Implement function lowering: resolve param types, return type, lower body → `HirFunc`
24+
- [ ] 2.4 Implement struct/enum lowering: resolve field types, compute layout, assign field indices
25+
- [ ] 2.5 Implement desugaring: `var x = v` → `let mut x = v`, `for x in iter` → loop over iterator protocol
26+
- [ ] 2.6 Implement closure capture analysis: identify captured variables, determine capture mode (by ref, by value, by move)
27+
28+
## Phase 3: Expression Lowering (5 items)
29+
30+
- [ ] 3.1 Create `compiler-tml/src/hir/lower_expr.tml` — expression lowering
31+
- [ ] 3.2 Implement literal/variable/field/index expressions — attach resolved types
32+
- [ ] 3.3 Implement call expressions: resolve callee, lower args, attach return type
33+
- [ ] 3.4 Implement control flow: lower if/else, when (match), loop — with type-annotated branches
34+
- [ ] 3.5 Implement closures: create closure struct with captured fields, lower body as separate function
35+
36+
## Phase 4: Monomorphization (4 items)
37+
38+
- [ ] 4.1 Create `compiler-tml/src/hir/monomorph.tml` — generic instantiation engine
39+
- [ ] 4.2 Implement: when generic function/type is used with concrete types, generate specialized version
40+
- [ ] 4.3 Implement monomorphization queue: collect all needed instantiations, process iteratively until fixpoint
41+
- [ ] 4.4 Implement name mangling: `List[I32]` → `List$I32$`, `HashMap[Str, I64]` → `HashMap$Str$I64$`
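
The two mangling examples in item 4.4 are consistent with a scheme that appends `$` after the base name and after every type argument. The sketch below implements that reading; the recursive handling of nested arguments is an assumption, since the task list gives no nested example.

```python
def mangle_name(name, args=()):
    """Mangle a type name; each arg is a plain name or a (name, args) pair."""
    if not args:
        return name
    parts = []
    for arg in args:
        if isinstance(arg, str):
            parts.append(arg)
        else:
            inner_name, inner_args = arg
            # Assumed behaviour: nested generics are mangled recursively.
            parts.append(mangle_name(inner_name, inner_args))
    return name + "$" + "".join(part + "$" for part in parts)


list_i32 = mangle_name("List", ["I32"])
hash_map = mangle_name("HashMap", ["Str", "I64"])
```

The real rule for nested generics such as `List[List[T]]` should be taken from the C++ mangler, because the differential tests compare mangled names exactly.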
42+
43+
## Phase 5: HIR Printer + Serialization (3 items)
44+
45+
- [ ] 5.1 Create `compiler-tml/src/hir/printer.tml` — pretty-print HIR for debugging
46+
- [ ] 5.2 Implement HIR serialization (binary) for hybrid pipeline bridge
47+
- [ ] 5.3 Test: round-trip HIR through serialize/deserialize, verify identical
48+
49+
## Phase 6: Differential Testing (2 items)
50+
51+
- [ ] 6.1 Lower 20 stdlib modules to HIR → compare with C++ HIR output (field by field)
52+
- [ ] 6.2 Lower full test suite → verify zero diffs against C++ HIR builder
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"status": "pending",
3+
"createdAt": "2026-04-06T01:20:11.738Z",
4+
"updatedAt": "2026-04-06T01:20:11.738Z"
5+
}
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
1+
# Proposal: THIR Lowering — Rewrite in TML (Phase 15b)
2+
3+
## Why
4+
5+
THIR (Typed High-level Intermediate Representation) is a thin transformation over HIR that
6+
inserts the remaining implicit semantic content that the parser and type checker left out:
7+
coercions, concrete method resolution, operator desugaring, and pattern exhaustiveness
8+
verification. It exists as a distinct pass because these transformations require the full
9+
TypeEnv (available after phase14d) and the typed HIR (available after phase15a), but they
10+
must happen before MIR construction, which assumes all method calls are already resolved to
11+
concrete functions and all operators are already desugared to their trait method equivalents.
12+
13+
Without THIR, the MIR builder would need to re-run method resolution and coercion insertion
14+
mid-construction, tangling two fundamentally different concerns. The separation is documented
15+
as a hard invariant from phase12c.
16+
17+
## What Changes
18+
19+
Port the following C++ files to TML:
20+
21+
- `compiler/src/thir/thir_lower.cpp` (1,138 LOC) — main THIR lowering pass
22+
- `compiler/src/thir/exhaustiveness.cpp` (623 LOC) — pattern exhaustiveness analysis
23+
- `compiler/src/thir/thir_module.cpp` (112 LOC) — THIR module and node type definitions
24+
- 6 header files (1,169 LOC combined) — `ThirExpr`, `ThirModule`, `ThirLower`, exhaustiveness
25+
26+
New TML modules produced:
27+
28+
- `compiler-tml/src/thir/mod.tml` — module root
29+
- `compiler-tml/src/thir/expr.tml` — `ThirExpr` enum extending `HirExpr` with coercion nodes
30+
- `compiler-tml/src/thir/module.tml` — `ThirModule` struct
31+
- `compiler-tml/src/thir/lower.tml` — `ThirLower` struct and `lower` entry point
32+
- `compiler-tml/src/thir/exhaustiveness.tml` — exhaustiveness checker
33+
34+
## Key Decisions
35+
36+
**Coercions as explicit nodes in THIR, not implicit annotations.** When an `I8` value is used
37+
where `I32` is expected, THIR inserts a `ThirExpr::Coerce { inner, from_ty, to_ty }` node
38+
wrapping the original expression. Downstream passes (MIR builder) see a concrete coercion
39+
instruction and generate the appropriate `sext` or `zext` LLVM instruction. Making coercions
40+
implicit (e.g., via a side table or metadata annotation) would require the MIR builder to
41+
re-inspect types at every use site — the explicit node model is simpler and matches the C++
42+
`ThirLower::insert_coercion` design.
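
To make the explicit-node model concrete, here is a minimal Python sketch of integer widening. The `Coerce` field names come from the text above; the `Var` node, the `INT_BITS` table, the unsigned type names, and the rule that widening keeps signedness are assumptions for illustration only.

```python
from dataclasses import dataclass

# Assumed integer types; only I8, I32 and I64 appear in this proposal.
INT_BITS = {"I8": 8, "I16": 16, "I32": 32, "I64": 64,
            "U8": 8, "U16": 16, "U32": 32, "U64": 64}


@dataclass(frozen=True)
class Var:
    name: str
    ty: str


@dataclass(frozen=True)
class Coerce:
    inner: object
    from_ty: str
    to_ty: str

    @property
    def ty(self):
        # A coercion node has the type it converts to.
        return self.to_ty


def insert_coercion(expr, expected_ty):
    """Wrap expr in a Coerce node when a widening is needed."""
    if expr.ty == expected_ty:
        return expr
    widening = (
        expr.ty in INT_BITS
        and expected_ty in INT_BITS
        and expr.ty[0] == expected_ty[0]  # assumed: same signedness only
        and INT_BITS[expr.ty] < INT_BITS[expected_ty]
    )
    if not widening:
        raise TypeError(f"no implicit coercion from {expr.ty} to {expected_ty}")
    return Coerce(expr, expr.ty, expected_ty)


def extension_op(node):
    """Pick the LLVM extension a later stage would emit for this node."""
    return "sext" if node.from_ty.startswith("I") else "zext"


widened = insert_coercion(Var("x", "I8"), "I32")
```

With the node in place, a consumer only needs to inspect `from_ty` and `to_ty` on the `Coerce` itself, which is the simplification the paragraph above argues for.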
43+
44+
**Method resolution uses the trait solver from phase14d, not a local lookup.** When THIR
45+
encounters `obj.method(args)` in the HIR, it calls the `BehaviorSolver` from phase14d to
46+
resolve the concrete impl. The result is a `ThirExpr::MethodCall { recv, method_path, args }`
47+
node where `method_path` is the fully-qualified path to the impl method (e.g.,
48+
`core::iter::Iterator::next` rather than just `next`). This eliminates all ambiguity before
49+
MIR construction, which cannot perform method resolution itself.
50+
51+
**Exhaustiveness uses matrix decomposition (usefulness algorithm).** The exhaustiveness
52+
checker works by constructing a pattern matrix from all `when` arms and computing whether the
53+
set of patterns is exhaustive via the standard "usefulness" algorithm (the same approach as
54+
rustc's `check_match`). For each constructor (enum variant, literal range, struct), it expands
55+
the matrix and checks recursively. This is the algorithm used in `exhaustiveness.cpp` and
56+
must be replicated rather than approximated — the C++ test suite includes edge cases for
57+
or-patterns, range patterns, and nested struct patterns that simpler approaches fail on.
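
A stripped-down version of the usefulness check, restricted to wildcards and constructor patterns, might look like the sketch below. It is an illustration of the general technique, not a port of `exhaustiveness.cpp`: or-patterns, ranges, literals, and guards are omitted, and the signature table `SIGS`, the pattern encoding, and `unreachable_arms` are assumptions made for the example.

```python
WILD = "_"

# Assumed signatures: type name -> constructor name -> argument type names.
SIGS = {
    "Bool": {"true": [], "false": []},
    "Maybe": {"Just": ["Bool"], "Nothing": []},
}


def specialize(matrix, ctor, arity):
    """Keep rows whose head matches ctor, expanding its sub-patterns."""
    rows = []
    for row in matrix:
        if row[0] == WILD:
            rows.append([WILD] * arity + row[1:])
        elif row[0][0] == ctor:
            rows.append(list(row[0][1]) + row[1:])
    return rows


def useful(matrix, vector, col_types):
    """True if vector matches some value that no row of matrix matches."""
    if not vector:
        return not matrix  # no columns left: useful only if no row remains
    head, rest = vector[0], vector[1:]
    ctors = SIGS[col_types[0]]
    if head != WILD:
        name, subs = head
        return useful(specialize(matrix, name, len(subs)),
                      list(subs) + rest, ctors[name] + col_types[1:])
    used = {row[0][0] for row in matrix if row[0] != WILD}
    if used == set(ctors):  # every constructor appears: split on each one
        return any(
            useful(specialize(matrix, name, len(args)),
                   [WILD] * len(args) + rest, args + col_types[1:])
            for name, args in ctors.items()
        )
    default = [row[1:] for row in matrix if row[0] == WILD]
    return useful(default, rest, col_types[1:])


def is_exhaustive(arms, ty):
    # The arms are exhaustive when a trailing wildcard would add nothing.
    return not useful([[arm] for arm in arms], [WILD], [ty])


def unreachable_arms(arms, ty):
    # An arm is unreachable when it is not useful against the arms before it.
    return [i for i, arm in enumerate(arms)
            if not useful([[prev] for prev in arms[:i]], [arm], [ty])]


partial = [("Just", [("true", [])]), ("Nothing", [])]
complete = partial + [("Just", [WILD])]
```

Here `partial` misses `Just(false)`, so `is_exhaustive(partial, "Maybe")` is false, while adding `Just(_)` closes the gap. The same `useful` function drives both the non-exhaustive error and the unreachable-pattern warning.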
58+
59+
**Operator desugaring is finalized here, not in phase14d.** Phase14d partially desugars
60+
operators via the coercion pass (resolving the trait impl), but THIR rewrites the AST node
61+
itself. After THIR, no `BinOp(Add, a, b)` nodes exist — only
62+
`MethodCall(a, "std::ops::Add::add", [b])`. This two-step design is a documented phase12c
63+
invariant: inference uses the syntactic form, THIR rewrites to the semantic form.
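
The rewrite from `BinOp` to `MethodCall` can be sketched as a recursive tree transform. The node shapes mirror the two forms quoted above; the tuple encoding and the lookup table are assumptions, and the table holds only the `Add` path because that is the only fully-qualified path this proposal gives.

```python
# Only the Add path appears in the proposal; other operators would need
# their own entries taken from the C++ implementation.
OP_METHODS = {"Add": "std::ops::Add::add"}


def desugar_op(expr):
    """Rewrite BinOp nodes into MethodCall nodes, innermost operands first."""
    if not isinstance(expr, tuple):
        return expr  # leaf: a variable name or literal
    if expr[0] == "BinOp":
        _, op, lhs, rhs = expr
        return ("MethodCall", desugar_op(lhs), OP_METHODS[op], [desugar_op(rhs)])
    return expr


rewritten = desugar_op(("BinOp", "Add", "a", ("BinOp", "Add", "b", "c")))
```

An operator missing from the table raises `KeyError`, which in this sketch stands in for the compiler error a real lowering would report.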
64+
65+
## Architecture
66+
67+
```
compiler-tml/src/thir/
  mod.tml             -- module root, re-exports ThirModule, ThirExpr, ThirLower
  expr.tml            -- ThirExpr enum: all HirExpr variants + Coerce, MethodCall (resolved)
  module.tml          -- ThirModule: List[ThirFunc] + List[ThirTypeDef]
  lower.tml           -- ThirLower { solver: ref BehaviorSolver, type_env: ref TypeEnv }
                      -- lower(hir: HirModule) -> ThirModule
                      -- lower_expr(), insert_coercion(), resolve_method(), desugar_op()
  exhaustiveness.tml  -- PatternMatrix, is_exhaustive(), find_missing_patterns()
```
77+
78+
## Pipeline Integration
79+
80+
After phase15b completes, the pipeline is:
81+
82+
```
HirModule (from phase15a)
        | phase15b: coercion insertion, method resolution, operator desugaring, exhaustiveness check
        v
ThirModule → MIR builder THIR path (Phase 15c)
```
88+
89+
The `ThirModule` output is consumed exclusively by the THIR→MIR builder path in phase15c.
90+
The HIR→MIR legacy path in phase15c consumes `HirModule` directly and does not use `ThirModule`.
91+
Both paths must produce equivalent MIR — differential testing in phase15c verifies this.
92+
93+
## Success Criteria
94+
95+
Two-level differential testing:
96+
97+
1. THIR node comparison: lower 20 stdlib modules through the TML THIR pass. Serialize the
98+
`ThirModule` and compare field-by-field with C++ THIR output. Zero diffs required on method
99+
resolution, coercion insertion, and operator desugaring.
100+
101+
2. Exhaustiveness parity: run the TML exhaustiveness checker on all `when` expressions in the
102+
full 1,700+ file test suite. Compare reported exhaustiveness errors and unreachable pattern
103+
warnings with C++ output. Zero disagreements required.
104+
105+
## Risk Assessment
106+
107+
Medium. The trait solver (phase14d) already handles the most complex resolution logic. THIR
108+
wiring is largely mechanical: call the solver, wrap results in THIR nodes, recurse through the
109+
tree. The main risk is the exhaustiveness checker — matrix decomposition is a well-specified
110+
algorithm but has many edge cases (or-patterns, guard interaction, range patterns on integers).
111+
The C++ `exhaustiveness.cpp` (623 LOC) is the reference; implement against it directly.
112+
113+
A secondary risk is coercion double-insertion: phase14d already partially inserts coercions via
114+
its coercion pass. THIR must detect and skip coercions already inserted rather than wrapping
115+
them again. The C++ implementation checks for existing `CoercionExpr` nodes before inserting.
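
The guard against double insertion amounts to making the insertion step idempotent. The sketch below shows that property on dictionary-based nodes; the node layout and the `coerce_once` name are hypothetical, and the real check on `CoercionExpr` nodes may differ in detail.

```python
def coerce_once(expr, expected_ty):
    """Wrap expr in a Coerce node unless it already has the expected type."""
    already_coerced = (
        expr.get("kind") == "Coerce" and expr.get("to_ty") == expected_ty
    )
    if already_coerced or expr["ty"] == expected_ty:
        return expr  # skip: an earlier pass has already done the work
    return {
        "kind": "Coerce",
        "inner": expr,
        "from_ty": expr["ty"],
        "to_ty": expected_ty,
        "ty": expected_ty,
    }


value = {"kind": "Var", "name": "x", "ty": "I8"}
first = coerce_once(value, "I32")
second = coerce_once(first, "I32")
```

Running the step twice returns the node produced by the first run, so a coercion inserted upstream is not wrapped again.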
116+
117+
## Dependencies
118+
119+
- **Requires**: phase15a complete (HirModule is the input to THIR lowering)
120+
- **Requires**: phase14d complete (BehaviorSolver is called during method resolution)
121+
- **Blocks**: phase15c (MIR builder THIR path consumes ThirModule)
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
# Tasks: THIR Lowering — Rewrite in TML
2+
3+
**Status**: Planned (0/16)
4+
**Depends on**: phase15a (HIR output available)
5+
**Blocks**: phase15c (MIR builder needs THIR)
6+
**Duration**: 3–4 weeks
7+
**Risk**: Medium — thin pass, most complexity already handled by type checker (phase14d)
8+
**C++ reference**: ~3,042 LOC → ~2,000 TML
9+
10+
---
11+
12+
## Phase 1: THIR Data Types (3 items)
13+
14+
- [ ] 1.1 Create `compiler-tml/src/thir/mod.tml` — module root
15+
- [ ] 1.2 Create `compiler-tml/src/thir/expr.tml` — `ThirExpr` enum: HIR expressions + coercion nodes + resolved method calls
16+
- [ ] 1.3 Create `compiler-tml/src/thir/module.tml` — `ThirModule` struct
17+
18+
## Phase 2: THIR Lowering Pass (6 items)
19+
20+
- [ ] 2.1 Create `compiler-tml/src/thir/lower.tml` — `ThirLower` struct and `lower(hir: HirModule) -> ThirModule` entry point
21+
- [ ] 2.2 Implement coercion insertion: insert implicit type conversions (integer widening, ref coercion, deref coercion)
22+
- [ ] 2.3 Implement method resolution: resolve `obj.method()` to concrete impl method via trait solver
23+
- [ ] 2.4 Implement operator desugaring: `a + b` → `a.add(b)`, `a[i]` → `a.index(i)`, `a == b` → `a.eq(b)`
24+
- [ ] 2.5 Implement associated type normalization: replace `<T as Iterator>::Item` with concrete type
25+
- [ ] 2.6 Implement `when` arm processing: lower each arm pattern + guard + body
26+
27+
## Phase 3: Exhaustiveness Checker (4 items)
28+
29+
- [ ] 3.1 Create `compiler-tml/src/thir/exhaustiveness.tml` — pattern exhaustiveness analysis
30+
- [ ] 3.2 Implement: for each `when` expression, compute whether all possible values are covered
31+
- [ ] 3.3 Implement: enum exhaustiveness — all variants present, or wildcard covers remainder
32+
- [ ] 3.4 Implement: emit warning for unreachable patterns, error for non-exhaustive match
33+
34+
## Phase 4: Differential Testing (3 items)
35+
36+
- [ ] 4.1 Lower 20 stdlib modules through HIR→THIR → compare with C++ THIR output
37+
- [ ] 4.2 Lower full test suite → verify zero diffs against C++ THIR lowerer
38+
- [ ] 4.3 Specifically test: operator desugaring, coercion insertion, exhaustiveness on 10 edge-case files
