
Commit 832f372

Andre Ferreira and claude committed
chore(tasks): Phase 14 type checker tasks — 4 tasks, 90 items, CRITICAL PATH
ERA 1, Phase 2: Port type checker from C++ (~21K LOC) to TML (~13.6K LOC):

- phase14a: Type registration - Type repr, TypeEnv, builtins, AST walk (22 items)
- phase14b: Module resolution - imports, visibility, re-exports, binary cache (20 items)
- phase14c: Type inference - Hindley-Milner, unification, call resolution (26 items)
- phase14d: Behavior dispatch - trait solver, coherence, coercion insertion (22 items)

All 4 tasks include real proposals (67-141 lines each).
Dependency: 14a → 14b → 14c → 14d (sequential, 14a/14b partially overlap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 79cd053 commit 832f372

13 files changed

Lines changed: 624 additions & 1 deletion


.rulebook/tasks/TASKS-INDEX.md

Lines changed: 14 additions & 1 deletion
@@ -1,7 +1,7 @@
11
# TML Project — Task Index
22

33
**Last updated**: 2026-04-05
4-
**Active tasks**: 31 | **Archived**: 5+
4+
**Active tasks**: 35 | **Archived**: 5+
55

66
---
77

@@ -145,6 +145,19 @@ Port lexer and parser from C++ to TML — first compiler subsystems in TML.
145145

146146
**Order**: 13a → 13b → 13c → 13d (sequential)
147147

148+
## Phase 14 — Type Checker in TML (ERA 1, Phase 2) ⚠️ CRITICAL PATH
149+
150+
Port the type checker (~21K LOC C++) to TML. Highest-risk, longest phase (8 months). See [invariant doc](phase12c_typechecker-invariants/).
151+
152+
| ID | Task | Status | Priority | Progress |
153+
|----|------|--------|----------|----------|
154+
| 14a | [Type Registration](phase14a_typechecker-registration/) | Planned | P0 | 0/22 |
155+
| 14b | [Module Resolution](phase14b_typechecker-module-resolution/) | Planned | P0 | 0/20 |
156+
| 14c | [Type Inference](phase14c_typechecker-inference/) | Planned | P0 | 0/26 |
157+
| 14d | [Behavior Dispatch](phase14d_typechecker-behavior-dispatch/) | Planned | P0 | 0/22 |
158+
159+
**Order**: 14a → 14b → 14c → 14d (sequential, 14a/14b can partially overlap)
160+
148161
## Research
149162

150163
| ID | Task | Status | Priority | Progress |
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"status": "pending",
3+
"createdAt": "2026-04-06T01:13:57.380Z",
4+
"updatedAt": "2026-04-06T01:13:57.380Z"
5+
}
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
# Proposal: Type Checker — Type Registration (Sub-phase 2a)
2+
3+
## Problem
4+
5+
The TML type checker's first phase populates `TypeEnv` with all type declarations before any
6+
inference or checking begins. This phase is currently implemented in C++ across `type.cpp` (965
7+
LOC), `env_core.cpp`, `env_definitions.cpp`, `env_scope.cpp`, `builtins_cache.cpp`, and ten
8+
`builtins/*.cpp` files (~5,302 LOC total). Porting this phase to TML is the entry point for
9+
self-hosting the type checker and unblocks all subsequent checking sub-phases (14b–14d).
10+
11+
## Solution
12+
13+
Port the type representation and environment initialization code to four TML modules under
14+
`compiler-tml/src/types/`:
15+
16+
- `type.tml`: `Type` as a TML enum replacing the C++ class hierarchy in `type.hpp`
17+
- `env.tml`: `TypeEnv` struct replacing `TypeEnv` fields and methods from `env.hpp`
18+
- `builtins.tml` — builtin type/behavior registration replacing `builtins/*.cpp`
19+
- `register.tml` — AST declaration walker replacing `env_definitions.cpp` and checker `core.cpp`
20+
phase-1 logic
21+
22+
## Key Design Decisions
23+
24+
**Type as enum, not class hierarchy.** The C++ code uses a `Type` base class with ~15 derived
25+
classes. TML enums with payload variants express the same structure more concisely and enable
26+
exhaustive `when` matching throughout the checker, eliminating virtual dispatch and downcasts.
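
To make the enum-with-payload idea concrete, here is a minimal sketch. TML syntax does not appear in this commit, so Python dataclasses stand in for enum variants. The variant names come from task 1.1; the fields, the `ref` display prefix, and the `to_str` body are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Primitive:
    name: str


@dataclass(frozen=True)
class Ref:
    inner: "Type"


@dataclass(frozen=True)
class Func:
    params: Tuple["Type", ...]
    ret: "Type"


@dataclass(frozen=True)
class Unit:
    pass


# Only four of the planned variants are modelled in this sketch.
Type = Union[Primitive, Ref, Func, Unit]


def to_str(ty: Type) -> str:
    """Render a type for diagnostics by branching on its variant."""
    if isinstance(ty, Primitive):
        return ty.name
    if isinstance(ty, Ref):
        return "ref " + to_str(ty.inner)  # display syntax is assumed
    if isinstance(ty, Func):
        args = ", ".join(to_str(p) for p in ty.params)
        return f"func({args}) -> {to_str(ty.ret)}"
    if isinstance(ty, Unit):
        return "Unit"
    raise TypeError(f"unhandled variant: {ty!r}")
```

Frozen dataclasses also give structural equality for free, which is roughly what task 1.2 asks of `Type.equals`. Python cannot check exhaustiveness at compile time, so the final `raise` only approximates what a TML `when` would guarantee.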
27+
28+
**HashMap-backed TypeEnv.** `TypeEnv` holds three `HashMap[Str, ...]` maps (types, functions,
29+
behaviors). Lookup is O(1) average. The C++ implementation uses `std::unordered_map` with the
30+
same semantics, so the port is direct.
31+
32+
**Scope chain via List[Scope].** `TypeEnv` keeps its scopes in a `List[Scope]`; each `Scope` also holds a `Maybe[ref Scope]` parent pointer and a
33+
local `HashMap[Str, Type]`. `lookup` walks the chain from innermost scope to module root. This
34+
matches the C++ `env_scope.cpp` algorithm exactly.
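
The lookup walk described above can be sketched as follows. This is an illustrative Python stand-in, not the TML code: `Optional` plays the role of `Maybe`, and type names are plain strings to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Scope:
    parent: Optional["Scope"] = None
    bindings: Dict[str, str] = field(default_factory=dict)  # name -> type name


def lookup(scope: Optional[Scope], name: str) -> Optional[str]:
    """Walk from the innermost scope outward; return None if nothing binds the name."""
    while scope is not None:
        if name in scope.bindings:
            return scope.bindings[name]
        scope = scope.parent
    return None
```

An inner binding shadows an outer one because the walk stops at the first match.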
35+
36+
**Builtin registration at env construction.** `TypeEnv.new()` calls `register_builtins(self)`
37+
immediately, matching `env_core.cpp`'s constructor behavior. All 14 primitives, 13 behaviors,
38+
5 memory types, and 4 collection types are registered before any user code is processed.
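
A minimal sketch of registration at construction time, written in Python as a stand-in for TML. The builtin names are taken from tasks 3.1 to 3.4; the string tags (`"primitive"`, `"memory"`, and so on) and the `ValueError` on duplicates are assumptions made for this example.

```python
PRIMITIVES = ["I8", "I16", "I32", "I64", "U8", "U16", "U32", "U64",
              "F32", "F64", "Bool", "Str", "Unit", "Never"]
BEHAVIORS = ["Add", "Sub", "Mul", "Div", "Eq", "Ord", "Hash", "Display",
             "Debug", "Clone", "Copy", "Drop", "Iterator"]
MEMORY = ["Heap", "Shared", "Sync", "Weak", "RawPtr"]
COLLECTIONS = ["List", "HashMap", "Maybe", "Outcome"]


def register_builtins(env):
    """Populate a fresh environment before any user code is seen."""
    for name in PRIMITIVES:
        env.register_type(name, "primitive")
    for name in MEMORY:
        env.register_type(name, "memory")
    for name in COLLECTIONS:
        env.register_type(name, "collection")
    for name in BEHAVIORS:
        env.behaviors[name] = "behavior"


class TypeEnv:
    def __init__(self):
        self.types = {}
        self.functions = {}
        self.behaviors = {}
        register_builtins(self)  # mirrors the constructor-time call described above

    def register_type(self, name, kind):
        # Duplicate names are rejected, as task 2.3 requires.
        if name in self.types:
            raise ValueError(f"duplicate type: {name}")
        self.types[name] = kind
```

Because registration happens inside the constructor, no caller can observe an environment that lacks the builtins.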
39+
40+
**Generic substitution in-tree.** `Type.substitute(params)` performs deep substitution on
41+
`Type::Generic` variants, enabling monomorphization without a separate pass. This replaces the
42+
C++ `type.cpp` `substitute` method.
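
Deep substitution can be sketched in a few lines. This is a Python illustration under stated assumptions: `Applied` is a hypothetical variant standing for a generic type applied to arguments (for example `List[T]`), and is not one of the variants listed in the task file.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Primitive:
    name: str


@dataclass(frozen=True)
class Generic:
    name: str  # a type parameter such as T


@dataclass(frozen=True)
class Applied:
    # Hypothetical variant: a named generic type with its type arguments.
    name: str
    args: Tuple["Type", ...]


Type = Union[Primitive, Generic, Applied]


def substitute(ty: Type, params: Dict[str, Type]) -> Type:
    """Replace every bound type parameter, recursing into type arguments."""
    if isinstance(ty, Generic):
        return params.get(ty.name, ty)  # unbound parameters are left as-is
    if isinstance(ty, Applied):
        return Applied(ty.name, tuple(substitute(a, params) for a in ty.args))
    return ty
```

The function returns a new value rather than mutating its input, so a generic declaration can be instantiated many times with different `params`.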
43+
44+
## Files Changed
45+
46+
| File | Purpose |
47+
|------|---------|
48+
| `compiler-tml/src/types/type.tml` | Type enum, equality, display, substitution, size/align |
49+
| `compiler-tml/src/types/env.tml` | TypeEnv struct, Scope, register/lookup operations |
50+
| `compiler-tml/src/types/builtins.tml` | All builtin type and behavior registration |
51+
| `compiler-tml/src/types/register.tml` | AST declaration walker — structs, enums, functions |
52+
53+
## Success Criteria
54+
55+
Differential testing: serialize the TML `TypeEnv` after registration and compare field-by-field
56+
with C++ `TypeEnv` output on the same input. Zero differences on all 20 stdlib modules and the
57+
full test suite constitutes a passing port. The C++ registration phase remains the reference
58+
until phase14d completes.
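
One way to picture the field-by-field comparison is a small diff over two serialized environments. The sketch below is in Python and assumes each environment has already been dumped to a flat mapping from declaration name to a type string; the real serialization format is not specified in this proposal.

```python
from typing import Dict, List


def diff_envs(reference: Dict[str, str], candidate: Dict[str, str]) -> List[str]:
    """Report every name whose entry differs between the C++ and TML dumps."""
    diffs = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in candidate:
            diffs.append(f"missing in TML: {name}")
        elif name not in reference:
            diffs.append(f"extra in TML: {name}")
        elif reference[name] != candidate[name]:
            diffs.append(f"mismatch {name}: C++={reference[name]} TML={candidate[name]}")
    return diffs
```

An empty result for every input module would correspond to the "zero differences" criterion.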
59+
60+
## Dependencies
61+
62+
- **Requires**: phase13d (TML frontend parsed and AST available in TML), phase12c (type system
63+
invariant document for correctness reference)
64+
- **Blocks**: phase14b (module resolution reads the populated TypeEnv to resolve imports)
65+
- **Duration**: 6–8 weeks
66+
- **Risk**: Medium — type representation and builtin registration are well-scoped with no
67+
constraint solving or inference logic in scope
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
# Tasks: Type Checker — Type Registration (Sub-phase 2a)
2+
3+
**Status**: Planned (0/22)
4+
**Depends on**: phase13d (TML frontend integrated), phase12c (invariant document)
5+
**Blocks**: phase14b (module resolution needs TypeEnv populated)
6+
**Duration**: 6–8 weeks
7+
**Risk**: Medium — well-scoped, no inference yet
8+
**C++ reference**: ~5,302 LOC → ~3,400 TML
9+
10+
---
11+
12+
## Phase 1: Type Representation (6 items)
13+
14+
- [ ] 1.1 Create `compiler-tml/src/types/type.tml` defining the `Type` enum: Primitive, Struct, Enum, Func, Generic, Ref, Ptr, Array, Tuple, Never, Unit, Unknown
15+
- [ ] 1.2 Implement type equality: `Type.equals(other: Type) -> Bool` with structural comparison
16+
- [ ] 1.3 Implement type display: `Type.to_str() -> Str` for error messages
17+
- [ ] 1.4 Implement generic type substitution: `Type.substitute(params: HashMap[Str, Type]) -> Type`
18+
- [ ] 1.5 Implement type size/alignment calculation: `Type.size() -> I64`, `Type.align() -> I64`
19+
- [ ] 1.6 Test: round-trip all Type variants through serialize/deserialize
20+
21+
## Phase 2: TypeEnv Core (5 items)
22+
23+
- [ ] 2.1 Create `compiler-tml/src/types/env.tml` defining the `TypeEnv` struct: types (HashMap), functions (HashMap), behaviors (HashMap), scopes (List[Scope])
24+
- [ ] 2.2 Implement `Scope` type: parent scope ref, local bindings, scope kind (Module/Function/Block)
25+
- [ ] 2.3 Implement `TypeEnv.register_type(name: Str, ty: Type)` — add to types map, error on duplicate
26+
- [ ] 2.4 Implement `TypeEnv.register_function(name: Str, sig: FuncSig)` — add to functions map
27+
- [ ] 2.5 Implement `TypeEnv.push_scope()`, `pop_scope()`, `lookup(name: Str) -> Maybe[Type]` with scope chain
28+
29+
## Phase 3: Builtin Types (5 items)
30+
31+
- [ ] 3.1 Create `compiler-tml/src/types/builtins.tml` — register all primitive types: I8, I16, I32, I64, U8, U16, U32, U64, F32, F64, Bool, Str, Unit, Never
32+
- [ ] 3.2 Register builtin behaviors: Add, Sub, Mul, Div, Eq, Ord, Hash, Display, Debug, Clone, Copy, Drop, Iterator
33+
- [ ] 3.3 Register memory builtins: Heap[T], Shared[T], Sync[T], Weak[T], RawPtr
34+
- [ ] 3.4 Register collection builtins: List[T], HashMap[K,V], Maybe[T], Outcome[T,E]
35+
- [ ] 3.5 Test: TypeEnv after builtin registration has all expected types and behaviors
36+
37+
## Phase 4: Declaration Registration from AST (4 items)
38+
39+
- [ ] 4.1 Create `compiler-tml/src/types/register.tml` — walk AST Module, register all declarations
40+
- [ ] 4.2 Implement struct registration: extract fields, generic params, where clauses → register Type::Struct
41+
- [ ] 4.3 Implement enum registration: extract variants with payload types → register Type::Enum
42+
- [ ] 4.4 Implement function registration: extract params, return type, generic params → register FuncSig
43+
44+
## Phase 5: Differential Testing (2 items)
45+
46+
- [ ] 5.1 Register types from 20 stdlib modules → serialize TypeEnv → compare with C++ TypeEnv output
47+
- [ ] 5.2 Register types from full test suite → verify zero diffs against C++ registration phase
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"status": "pending",
3+
"createdAt": "2026-04-06T01:13:58.346Z",
4+
"updatedAt": "2026-04-06T01:13:58.346Z"
5+
}
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
1+
# Proposal: Type Checker — Module Resolution (Sub-phase 2b)
2+
3+
## Problem
4+
5+
Before the type checker can check any expression, it must resolve all `use` imports and make
6+
the declared types visible in the correct scopes. This is the second phase of checking in the
7+
C++ compiler, spanning `env_module_loading.cpp` (875 LOC), `env_module_load.cpp` (508 LOC),
8+
`env_module_load_decls.cpp` (1,253 LOC), `module.cpp` (549 LOC), `module_metadata.cpp` (740
9+
LOC), `module_binary.cpp` (799 LOC), and `module_binary_read.cpp` (1,409 LOC) — ~7,211 LOC
10+
total. Until this phase is ported to TML, the self-hosted type checker cannot process any
11+
multi-file program.
12+
13+
## Solution
14+
15+
Port module resolution to four TML modules under `compiler-tml/src/types/`:
16+
17+
- `module.tml`: `Module` struct and `ModulePath`, `Visibility`, `ModuleMetadata` types
18+
- `imports.tml`: `use` statement resolver covering single, glob, renamed, and re-export forms
19+
- `module_loader.tml` — file-system search, load-parse-register pipeline, circular import guard
20+
- `module_binary.tml` — binary serialization and cache read/write for incremental compilation
21+
22+
## Key Design Decisions
23+
24+
**Module path as List[Str].** `std::collections::HashMap` is represented as
25+
`List["std", "collections", "HashMap"]`. Display joins with `"::"`. This is simpler than a
26+
custom struct and integrates cleanly with `HashMap[ModulePath, Module]` keyed lookups.
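
A small sketch of the path representation, in Python as a stand-in for TML. Python lists are not hashable, so a tuple of strings is used here to model `List[Str]` as a map key; `parse_path` and `display` are hypothetical helper names.

```python
from typing import Tuple

ModulePath = Tuple[str, ...]


def parse_path(text: str) -> ModulePath:
    """Split a `::`-separated path into its segments."""
    return tuple(text.split("::"))


def display(path: ModulePath) -> str:
    """Join segments back into the `::` form used in diagnostics."""
    return "::".join(path)
```

Because the path is a plain sequence, prefix operations such as taking the parent module (`path[:-1]`) need no extra API.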
27+
28+
**File-system-based module search.** Given `use std::json`, the loader searches `lib/std/src/json/mod.tml`
29+
and `lib/std/src/json.tml` in order, matching the C++ `env_module_loading.cpp` search algorithm.
30+
The search roots are passed in at loader construction time, so tests can point the loader at a
31+
fixture directory instead of the real library tree.
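
The search order can be sketched as below. This is a Python illustration with assumptions: the `roots` mapping from the first path segment to a source directory, and the injected `exists` predicate, are hypothetical choices made so the example runs without any files on disk.

```python
from typing import Callable, Dict, List, Optional, Sequence


def candidates(path: Sequence[str], roots: Dict[str, str]) -> List[str]:
    """Candidate files for a module path: directory form first, then file form."""
    base = "/".join([roots[path[0]], *path[1:]])
    return [base + "/mod.tml", base + ".tml"]


def find_module(path: Sequence[str], roots: Dict[str, str],
                exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that exists, or None if the module is not found."""
    for candidate in candidates(path, roots):
        if exists(candidate):
            return candidate
    return None
```

Passing `exists` as a parameter lets a test supply a set of fake file names, while production code could pass `os.path.exists`.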
32+
33+
**Circular import detection via loading stack.** The loader maintains a `List[ModulePath]` of
34+
modules currently being loaded. Before loading a module, it checks whether the path is already
35+
in this stack. If so, it emits a `E0201: circular import` diagnostic and returns an error. This
36+
matches the C++ `loading_stack_` guard in `env_module_loading.cpp`.
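
The loading-stack guard can be sketched as a depth-first load. This Python version uses plain strings for module names and a dictionary for the import graph; both are simplifications, and the message text after the `E0201` code is an assumption.

```python
from typing import Dict, List, Optional, Set


class CircularImport(Exception):
    pass


def load(path: str, imports_of: Dict[str, List[str]],
         loading: Optional[List[str]] = None,
         loaded: Optional[Set[str]] = None) -> Set[str]:
    """Load a module and its imports, failing if a module is re-entered while in progress."""
    loading = [] if loading is None else loading
    loaded = set() if loaded is None else loaded
    if path in loading:
        cycle = loading[loading.index(path):] + [path]
        raise CircularImport("E0201: circular import: " + " -> ".join(cycle))
    if path in loaded:
        return loaded  # already fully loaded through another importer
    loading.append(path)
    try:
        for dep in imports_of.get(path, []):
            load(dep, imports_of, loading, loaded)
    finally:
        loading.pop()
    loaded.add(path)
    return loaded
```

Keeping `loaded` separate from `loading` matters: a diamond-shaped import graph revisits a finished module, which is not a cycle.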
37+
38+
**Binary cache for incremental compilation.** Each resolved module serializes to a compact
39+
binary format (header + fingerprint + declaration table). On reload, the loader reads the
40+
fingerprint, compares with the source file hash, and skips re-parsing if unchanged. The format
41+
is identical to the C++ `module_binary.cpp` output so cached modules produced by C++ can be
42+
read by the TML loader during the transition period.
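
The fingerprint check can be illustrated with an in-memory cache. In this Python sketch a dictionary stands in for the on-disk binary format, and SHA-256 is an assumed hash; the proposal does not say which hash the C++ implementation uses.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(source: bytes) -> str:
    """Hash the source content (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(source).hexdigest()


def load_module(name: str, source: bytes, cache: Dict[str, dict],
                parse: Callable[[bytes], List[str]]) -> List[str]:
    """Return cached declarations when the fingerprint matches, else re-parse."""
    fp = fingerprint(source)
    entry = cache.get(name)
    if entry is not None and entry["fingerprint"] == fp:
        return entry["decls"]  # cache hit: parsing is skipped
    decls = parse(source)
    cache[name] = {"fingerprint": fp, "decls": decls}
    return decls
```

This covers only the source-changed case; invalidation when a dependency changes (task 4.4) would need the fingerprints of imported modules as well.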
43+
44+
**Re-export propagation.** `pub use inner::Type` adds the item to the current module's public
45+
visibility map. When another module imports the current module, the resolver checks this map
46+
and registers the re-exported item under the importer's scope. This matches the two-pass
47+
approach in `env_module_load.cpp` (collect re-exports first, then resolve importers).
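
The two passes can be sketched as follows. This Python illustration handles only single-level re-exports, and the module layout (`pub_items`, `pub_uses`) is a hypothetical shape chosen for the example, not the structure used by the C++ code.

```python
from typing import Dict, List, Tuple


def collect_public(modules: Dict[str, dict]) -> Dict[str, Dict[str, str]]:
    """Pass 1: build each module's public map of item name -> module it comes from."""
    public = {}
    for name, mod in modules.items():
        table = {item: name for item in mod.get("pub_items", [])}
        for source, item in mod.get("pub_uses", []):
            table[item] = source  # `pub use source::item` re-exports the item
        public[name] = table
    return public


def resolve_import(scope: Dict[str, str], public: Dict[str, Dict[str, str]],
                   module: str, item: str) -> None:
    """Pass 2: register an imported item in the importer's scope."""
    if item not in public[module]:
        raise KeyError(f"{module} does not export {item}")
    scope[item] = public[module][item]
```

Collecting every public map before resolving any importer means the result does not depend on the order in which importers are processed.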
48+
49+
## Files Changed
50+
51+
| File | Purpose |
52+
|------|---------|
53+
| `compiler-tml/src/types/module.tml` | Module struct, ModulePath, Visibility, ModuleMetadata |
54+
| `compiler-tml/src/types/imports.tml` | use statement resolution — single, glob, renamed, pub use |
55+
| `compiler-tml/src/types/module_loader.tml` | File search, load-parse-register, circular import guard |
56+
| `compiler-tml/src/types/module_binary.tml` | Binary cache serialize/deserialize, fingerprinting |
57+
58+
## Success Criteria
59+
60+
Differential testing: run import resolution on 20 stdlib modules and compare the resulting
61+
`TypeEnv` (all registered names, their types, and visibility) against C++ output on the same
62+
inputs. Zero differences on stdlib and the full test suite constitutes a passing port. The C++
63+
resolver remains the reference until phase14d completes.
64+
65+
## Dependencies
66+
67+
- **Requires**: phase14a (TypeEnv must be populated with builtin types and declaration skeletons
68+
before imports can be resolved against it)
69+
- **Blocks**: phase14c (Hindley-Milner inference walks import-resolved TypeEnv to infer
70+
expression types — unresolved imports produce spurious type errors)
71+
- **Duration**: 4–6 weeks
72+
- **Risk**: Medium — the algorithm is a well-defined graph traversal with no constraint solving;
73+
the main complexity is the binary cache format compatibility requirement during the C++/TML
74+
transition period
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
1+
# Tasks: Type Checker — Module Resolution (Sub-phase 2b)
2+
3+
**Status**: Planned (0/20)
4+
**Depends on**: phase14a (TypeEnv populated with declarations)
5+
**Blocks**: phase14c (inference needs resolved imports)
6+
**Duration**: 4–6 weeks
7+
**Risk**: Medium — mechanical graph traversal, well-defined algorithm
8+
**C++ reference**: ~7,211 LOC → ~4,700 TML
9+
10+
---
11+
12+
## Phase 1: Module Representation (4 items)
13+
14+
- [ ] 1.1 Create `compiler-tml/src/types/module.tml` defining the `Module` struct: name, path, declarations, imports, visibility map, metadata
15+
- [ ] 1.2 Implement `ModulePath` type: `List[Str]` with display as `"std::collections::HashMap"`
16+
- [ ] 1.3 Implement `Visibility` checking: pub, pub(crate), private — resolve against module tree
17+
- [ ] 1.4 Implement `ModuleMetadata`: file path, last modified, fingerprint for incremental compilation
18+
19+
## Phase 2: Import Resolution (5 items)
20+
21+
- [ ] 2.1 Create `compiler-tml/src/types/imports.tml` containing the `use` statement resolver
22+
- [ ] 2.2 Implement single import: `use std::collections::HashMap` → resolve path, register alias in scope
23+
- [ ] 2.3 Implement glob import: `use std::collections::*` → resolve all pub items, register each
24+
- [ ] 2.4 Implement renamed import: `use std::collections::HashMap as Map` → register under alias
25+
- [ ] 2.5 Implement re-export: `pub use inner::Type` → make visible to importers of this module
26+
27+
## Phase 3: Module Loading (5 items)
28+
29+
- [ ] 3.1 Create `compiler-tml/src/types/module_loader.tml` — load module from file path, parse, register
30+
- [ ] 3.2 Implement module search: given `use std::json`, try `lib/std/src/json/mod.tml`, then `lib/std/src/json.tml`
31+
- [ ] 3.3 Implement circular import detection: track loading stack, error on cycle
32+
- [ ] 3.4 Implement declaration loading: walk loaded module AST, register types/functions into TypeEnv
33+
- [ ] 3.5 Implement transitive import resolution: if A imports B and B imports C, A sees B's `pub use` re-exports
34+
35+
## Phase 4: Module Binary Cache (4 items)
36+
37+
- [ ] 4.1 Create `compiler-tml/src/types/module_binary.tml` — serialize resolved module to binary cache
38+
- [ ] 4.2 Implement module fingerprinting: hash(source content) for cache invalidation
39+
- [ ] 4.3 Implement cache read: load previously resolved module from binary, skip re-parsing
40+
- [ ] 4.4 Implement cache invalidation: source changed or dependency changed → re-resolve
41+
42+
## Phase 5: Differential Testing (2 items)
43+
44+
- [ ] 5.1 Resolve imports for 20 stdlib modules → compare resolved TypeEnv with C++ output
45+
- [ ] 5.2 Resolve imports for full test suite → verify zero diffs against C++ module resolution
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
{
2+
"status": "pending",
3+
"createdAt": "2026-04-06T01:13:59.010Z",
4+
"updatedAt": "2026-04-06T01:13:59.010Z"
5+
}
