3 changes: 2 additions & 1 deletion .agents/skills/ln-consult/SKILL.md
@@ -53,7 +53,7 @@ Presume **structural** on a fresh thread when the work touches workflow closure,

Default rule:

`ln-grill → ln-spec → ln-plan → [ln-design] → [ln-oracles] → ln-scope → [ln-spike] → ln-build → ln-review → [ln-refactor] → [ln-sync]`
`ln-grill` or `ln-disambiguate` → `ln-spec` → `ln-plan` → optional `ln-design` / `ln-oracles` → `ln-scope` → optional `ln-spike` → `ln-build` → `ln-review` → optional `ln-refactor` / `ln-sync`

Bounded exception:

@@ -80,6 +80,7 @@ Only recommend the bounded serial exception when those same conditions hold and
| Situation | Work type | Suggest |
| --- | --- | --- |
| Idea is vague, needs fleshing out | structural | `ln-grill` |
| Plausible interpretations diverge; examples would clarify faster than open-ended questioning | structural | `ln-disambiguate` |
| Understanding exists, needs a written spec | structural | `ln-spec` |
| Spec exists, needs work sequencing | structural | `ln-plan` |
| Verification strategy is the main uncertainty | structural | `ln-oracles` |
14 changes: 10 additions & 4 deletions .agents/skills/ln-design/SKILL.md
@@ -6,7 +6,9 @@ argument-hint: "[module or API boundary to explore]"

# Ln Design

Apply Ousterhout's "Design It Twice": generate **3+ radically different module shapes**, compare on depth, and synthesize. The goal is deep modules — small API surfaces hiding significant complexity. Do not implement; this is purely about the shape of the boundary.
Apply Ousterhout's "Design It Twice": generate **3+ radically different module shapes**, compare on depth, and synthesize. The goal is deep modules — small interfaces hiding significant complexity. Do not implement; this is purely about the shape of the seam.

Use `ln-design` as the deepening pathway from `ln-review`: when review surfaces a shallow module or weak seam, explore alternative deepened module shapes here before routing to `ln-scope` or `ln-refactor`.

## Input

@@ -16,7 +18,9 @@ The module or API boundary: $ARGUMENTS

### 1. Gather requirements

Understand the problem, the callers, the key operations, constraints, and — crucially — what complexity should be hidden inside vs exposed. Skip steps you already know the answer to.
Understand the problem, the callers, the key operations, constraints, and — crucially — what complexity should be hidden inside vs exposed. If this design follows an `ln-review` deepening candidate, start from that candidate's files, problem, possible direction, and benefits. Skip steps you already know the answer to.

Read `memory/SPEC.md` first when it exists. Use its lexicon for domain terms and respect its live assumptions, decisions, and invariants. Read `memory/PLAN.md` when the seam touches active or near-horizon work.

### 2. Generate designs (parallel sub-agents)

@@ -27,13 +31,15 @@ Spawn 3+ sub-agents simultaneously. Each must produce a **radically different**
- "Optimize for the most common case"
- "Take inspiration from [specific paradigm or library]"

Each agent returns: **API signature** (types, methods, params), **usage example**, **what it hides**, and **trade-offs**.
Each agent returns: **interface** (types, methods, params, invariants, ordering constraints, error modes, required configuration, and performance characteristics), **usage example**, **what it hides**, **seam / adapter strategy** where relevant, and **trade-offs**.

### 3. Present and compare

Show each design sequentially, then compare in prose on:

- **Depth** (Ousterhout's depth test): small surface hiding significant complexity (good) vs large surface with thin implementation (bad)
- **Depth** (Ousterhout's depth test): small interface hiding significant complexity (good) vs large interface with thin implementation (bad)
- **Locality**: whether change, bugs, knowledge, and verification concentrate behind the seam
- **Leverage**: what callers get per fact they must learn about the interface
- **Ease of correct use** vs ease of misuse
- **General-purpose vs specialized**: flexibility vs focus
- **Implementation efficiency**: does the shape allow efficient internals?
99 changes: 99 additions & 0 deletions .agents/skills/ln-disambiguate/SKILL.md
@@ -0,0 +1,99 @@
---
name: ln-disambiguate
description: "Collapse meaningful ambiguity with contrastive examples. Use when a plan/design has several plausible meanings, requirements feel vague, examples would clarify intent faster than grilling, or the user asks to disambiguate, find ambiguity, use behavioral kernels, or ask contrastive questions."
---

# Ln Disambiguate

Generate cases where plausible interpretations diverge; ask the user to classify the case.

Users recognize intent in concrete examples faster than they author abstract predicates. Use the TiCoder move generalized beyond tests: produce examples, counterexamples, edge cases, and candidate outcomes that separate meanings. Then translate the answer into typed conclusions.

Use this instead of `ln-grill` when the work has enough shape that ambiguity collapse is the next move. Use `ln-grill` when the idea still needs broad Socratic pressure.

If local context can answer the question, inspect it instead of asking. Read only the context needed to form precise contrasts: `memory/SPEC.md`, `memory/PLAN.md`, and files they explicitly point to. Use the current lexicon; when terms are overloaded, name the competing meanings.

For each ambiguity:

1. Name the ambiguous claim.
2. Generate 2–4 competing interpretations.
3. Find the smallest scenario where they produce different outcomes.
4. Ask one contrastive classification question.
5. Translate the answer into candidate durable conclusions:
- `decision`
- `invariant`
- `constraint`
- `assumption`
- `example`
- `counterexample`
- `criterion`
- `unresolved ambiguity`

Prefer one high-yield question at a time. Multiple choice is useful when options are real; forced choice is harmful when it hides the likely fifth answer. Always allow “other / depends — explain.”

Ask questions like:

- “In this exact case, which outcome is correct?”
- “Is this inside or outside the commitment?”
- “Would this count as a bug?”
- “Which option should be rejected?”
- “Does this example witness the rule, contradict it, or sit outside scope?”

Include your recommended answer when you have enough context, and explain why.

Use behavioral kernels as hidden interviewer machinery. Activate at most the top 2–3 relevant kernels:

| Kernel | Looks for | Typical artifact |
| --- | --- | --- |
| Identity & reference | ids, references, links, uniqueness | entity / reference invariant |
| Containment & topology | parent/child, folders, ordering, graphs | membership / topology invariant |
| Validation & normalization | valid/invalid input, canonical forms | parser or validation contract |
| State & lifecycle | states, transitions, terminal states | state-machine invariant |
| Temporal history | undo, redo, audit, expiration | history / timeline invariant |
| Optimization & preference | best, preferred, tie-breaks | ranking or objective rule |
| Authority & capability | roles, permissions, delegation | authorization predicate |
| Concurrency & collaboration | offline, stale, conflict, merge | conflict-resolution semantics |
| Transactions & atomicity | all-or-nothing multi-object updates | transaction invariant |
| Resource accounting | balances, quotas, capacity, limits | conservation / bounds invariant |
| Derived data & views | counts, filters, projections, caches | view consistency invariant |
| Error & recovery | retry, rollback, compensation | failure / recovery contract |
| External effects | APIs, queues, webhooks, clocks | boundary / adapter contract |
| Change & migration | legacy, compatibility, upgrade | migration / refinement invariant |
| Observability & evidence | logs, traces, explanations, audit | trace / provenance invariant |

Example:

> A project is deleted while it still has tasks. Which behavior is correct?
>
> A. Delete the tasks too.
> B. Archive the tasks and keep them readable.
> C. Move tasks to an unassigned pool.
> D. Block deletion until tasks are reassigned or deleted.
> E. Other / depends.
>
> My recommendation is B if historical traceability matters more than cleanup, because it preserves references and gives us a clear data-integrity invariant.

Translate the answer:

- decision: “Deleted projects archive their tasks rather than deleting or reassigning them.”
- invariant: “Archived tasks retain a tombstone reference to the deleted project.”
- positive example: “Deleting a project with open tasks makes those tasks archived and readable.”
- counterexample: “Tasks silently disappearing after project deletion is rejected.”
- criterion: “A deletion test verifies task archival and readable tombstone references.”

Stop when the ambiguity is collapsed, explicitly deferred, or ready for `ln-spec`.

Do not create or edit planning artifacts here. Durable conclusions promote into `memory/SPEC.md` or `memory/PLAN.md` through the next routed skill.

## Routing

When the ambiguity pass is complete, present these options to the user. If `tool-ask-question` is available, use it; otherwise use a numbered list.

| # | Label | Target | Why |
| --- | --- | --- | --- |
| 1 | Write/update spec | `ln-spec` | Durable conclusions should enter `memory/SPEC.md` |
| 2 | Plan frontier | `ln-plan` | The meaning is clear but work needs sequencing |
| 3 | Scope one slice | `ln-scope` | One implementation slice is now obvious |
| 4 | Grill further | `ln-grill` | The ambiguity pass exposed broader design uncertainty |

Recommended: choose `ln-spec` when decisions, invariants, assumptions, lexicon, examples, or criteria changed.
31 changes: 23 additions & 8 deletions .agents/skills/ln-review/SKILL.md
@@ -18,14 +18,18 @@ If "recent" or unspecified, focus on recently modified files.

## What to look for

Read `memory/SPEC.md` first when it exists. Use its lexicon for domain terms, and treat the live architecture register as the current decision record. Read `memory/PLAN.md` for active frontier context when the reviewed area touches active or near-horizon work. If ADRs or design docs exist in the touched area, respect them as supporting context, but do not introduce ADRs or sidecar decision logs by default; durable updates reconcile through `memory/SPEC.md` / `memory/PLAN.md`.

Apply Ousterhout's depth test: modules should have small interfaces hiding significant complexity. Modules that move together should live together — clusters of small files always used in concert are a single deep module waiting to be extracted.

Use the deletion test for suspected shallow modules: if deleting the module makes complexity vanish, it was pass-through structure; if the same complexity reappears across multiple callers, the module was earning its keep. Prefer depth as leverage/locality, not line-count ratio.
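The deletion test can be sketched in code. A minimal TypeScript illustration (hypothetical names, not from this codebase): the wrapper fails the test because deleting it makes the complexity vanish, while the registry passes because its normalization and collision handling would otherwise reappear in every caller.

```typescript
// Shallow: deleting this wrapper makes the complexity vanish,
// since every method just forwards to the underlying store.
class UserStoreWrapper {
  constructor(private store: Map<string, string>) {}
  get(id: string) { return this.store.get(id); }
  set(id: string, name: string) { this.store.set(id, name); }
}

// Deep: deleting this module would scatter id normalization and
// collision handling into every caller, so it passes the test.
class UserRegistry {
  private store = new Map<string, string>();

  register(name: string): string {
    // Hide the canonical-id rule behind the seam.
    const id = name.trim().toLowerCase().replace(/\s+/g, "-");
    if (this.store.has(id)) throw new Error(`duplicate user: ${id}`);
    this.store.set(id, name.trim());
    return id;
  }

  lookup(id: string): string | undefined { return this.store.get(id); }
}
```

Note the leverage framing: callers of `UserRegistry` learn one fact (names map to canonical ids) and get normalization, uniqueness, and storage for free.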

Treat the interface as the test surface. If callers or tests must reach past the interface to verify important behavior, the module shape is probably wrong. A good seam lets tests and callers cross the same public boundary.
Treat the interface as the test surface. The interface is everything callers must know to use the module correctly: types, invariants, ordering constraints, error modes, required configuration, and performance characteristics. If callers or tests must reach past the interface to verify important behavior, the module shape is probably wrong. A good seam lets tests and callers cross the same public boundary.

Apply seam discipline: one adapter usually means a hypothetical seam; two adapters make a real seam. Flag indirection introduced only for imagined future variation, especially when it spreads configuration, mocks, or ordering knowledge into callers.

When a finding is a deepening opportunity, present it as a candidate rather than a detailed design. Name the current shallow module shape, the deepened module that might replace it, what complexity would move behind the seam, and why that would improve locality, leverage, and the test surface. Do **not** propose detailed interfaces in `ln-review`; route selected deepening candidates to `ln-design` before scoping or refactoring.

Check the functional core / imperative shell boundary (Gary Bernhardt, "Boundaries"). Pure functions should stay pure. Flag when a pure function has acquired side effects or a growing parameter list — it has drifted into shell territory.
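A minimal sketch of that boundary, with a hypothetical pricing domain (none of these names come from the repo): the core makes every decision purely, and the shell only gathers inputs and performs effects.

```typescript
// Functional core: pure and decision-only. No I/O, no clocks.
function applyDiscount(totalCents: number, loyaltyYears: number): number {
  const rate = loyaltyYears >= 5 ? 0.1 : loyaltyYears >= 1 ? 0.05 : 0;
  return Math.round(totalCents * (1 - rate));
}

// Imperative shell: performs effects, delegates decisions to the core.
async function checkout(
  fetchCart: () => Promise<number>,
  fetchLoyalty: () => Promise<number>,
  charge: (cents: number) => Promise<void>,
): Promise<number> {
  const [total, years] = await Promise.all([fetchCart(), fetchLoyalty()]);
  const due = applyDiscount(total, years);
  await charge(due);
  return due;
}
```

The drift this section flags would show up here as `applyDiscount` growing a `logger` or `now` parameter: a sign it is sliding into shell territory.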

Make invalid states unrepresentable (Yaron Minsky). Split optional fields into distinct types. Use branded types for domain-distinct values.
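A short TypeScript sketch of both moves, with illustrative names: brands keep domain-distinct strings from mixing, and splitting the optional field into distinct states makes "approved order with no approver" unrepresentable.

```typescript
// Branded types: structurally both strings, nominally distinct.
type UserId = string & { readonly __brand: "UserId" };
type OrderId = string & { readonly __brand: "OrderId" };

const asUserId = (raw: string): UserId => raw as UserId;

// Distinct states instead of an optional approvedBy field.
type Order =
  | { state: "draft"; id: OrderId }
  | { state: "approved"; id: OrderId; approvedBy: UserId };

function approver(o: Order): UserId | null {
  // Narrowing on the discriminant is the only way to reach approvedBy.
  return o.state === "approved" ? o.approvedBy : null;
}
```

With the optional-field shape, every reader must re-prove that `approvedBy` is set whenever `state === "approved"`; with the union, the compiler carries that invariant.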
@@ -54,7 +58,7 @@ Collect misalignments as numbered findings (category: `naming`) with the canonic

## Output

Present findings as numbered candidates:
Present findings as numbered candidates. Use the compact form for ordinary findings:

```md
## Review: [area]
@@ -65,19 +69,30 @@ Present findings as numbered candidates:
2. ...
```

Use the deepening form when the finding is a shallow-module or weak-seam opportunity:

```md
1. **[Deepening candidate]** — [category: depth|seam|coupling|testability] — [impact: low|medium|high]
**Files** — [modules/files involved]
**Problem** — [why the current module shape causes friction]
**Possible direction** — [plain English target shape; no detailed interface yet]
**Benefits** — [locality, leverage, and test-surface improvement]
```

Recommend the highest-impact improvement.

## Routing

After presenting findings, present these options to the user (use `tool-ask-question`):

| # | Label | Target | Why |
| --- | --------------- | ------------- | ------------------------------------------------ |
| 1 | Scope a fix | `ln-scope` | A finding warrants a planned slice |
| 2 | Plan a refactor | `ln-refactor` | Multiple findings need coordinated restructuring |
| 3 | Back to triage | `ln-consult` | Review complete, no immediate action needed |
| # | Label | Target | Why |
| --- | -------------------------- | ------------- | ------------------------------------------------ |
| 1 | Scope a fix | `ln-scope` | A finding warrants a planned slice |
| 2 | Explore a deepening design | `ln-design` | A selected candidate needs seam/interface design before scoping or refactoring |
| 3 | Plan a refactor | `ln-refactor` | Multiple findings need coordinated restructuring |
| 4 | Back to triage | `ln-consult` | Review complete, no immediate action needed |

Recommended: **1** if high-impact findings exist, **3** otherwise.
Recommended: **2** if the highest-impact finding is a deepening candidate, **1** if high-impact findings are concrete fixes, **4** otherwise.

---
*Draws from [mattpocock/skills/improve-codebase-architecture](https://github.com/mattpocock/skills/tree/main/improve-codebase-architecture) and [theswerd/aicode/skills/self-documenting-code](https://github.com/theswerd/aicode/blob/main/skills/self-documenting-code/SKILL.md).*