From 7fa88c36ba2326670dd7f6c0f618bfc24ef72013 Mon Sep 17 00:00:00 2001
From: Bartosz Burda <bartoszburda93@gmail.com>
Date: Mon, 18 May 2026 17:56:58 +0200
Subject: [PATCH 1/3] feat: pharaoh-sdd orchestrator skill for gated V-model
 SDD

Signed-off-by: Bartosz Burda <bartoszburda93@gmail.com>
---
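A minimal sketch of the second tier-order source described in Phase 1
(topologically sorting the `required_links` chain pairs). It assumes the
pairs have already been read into `(child_type, parent_type)` tuples; that
tuple shape, the function name `derive_tier_order`, and the type names used
below are hypothetical, since the layout of `[pharaoh.traceability]` is not
shown in this series.

```python
from graphlib import CycleError, TopologicalSorter


def derive_tier_order(chain_pairs):
    """Order artefact types so that each parent tier precedes its children.

    chain_pairs: iterable of (child_type, parent_type) tuples. This is an
    assumed in-memory form of the required_links chain pairs, not the
    project's actual config schema.
    """
    sorter = TopologicalSorter()
    for child, parent in chain_pairs:
        # The parent tier must be drafted before the child tier.
        sorter.add(child, parent)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"required_links chain is cyclic: {exc.args[1]}") from exc
```

For a linear chain such as `[("spec", "req"), ("impl", "spec"), ("test",
"impl")]` this returns `["req", "spec", "impl", "test"]`. A cyclic chain
raises `ValueError`, which is a natural point to fall back to the third
source (catalog types plus observed links) or to ask the developer.
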
 skills/pharaoh-sdd/SKILL.md | 140 ++++++++++++++++++++++++++++++++++++
 1 file changed, 140 insertions(+)
 create mode 100644 skills/pharaoh-sdd/SKILL.md

diff --git a/skills/pharaoh-sdd/SKILL.md b/skills/pharaoh-sdd/SKILL.md
new file mode 100644
index 0000000..6cfee3d
--- /dev/null
+++ b/skills/pharaoh-sdd/SKILL.md
@@ -0,0 +1,140 @@
+---
+name: pharaoh-sdd
+description: Use when a developer wants to add a feature, capability, or behaviour change to a project that uses sphinx-needs and a V-model artefact structure, before any requirement or code is written. Triggers on "add a feature", "implement X", "let's build Y", or "do spec-driven development" in a project with .pharaoh/project/ tailoring.
+chains_from: []
+chains_to: []
+---
+
+# pharaoh-sdd
+
+## Overview
+
+The graph is not the hard part. An agent will happily produce one. The hard part is not
+fabricating it. Requirements come from the human, every artefact carries its review, and
+every tier builds clean before the next. This skill enforces that contract.
+
+## Composition
+
+pharaoh-sdd is a non-atomic orchestrator. It does not draft or review artefacts directly.
+It dispatches the following atomic skills:
+
+- `pharaoh-req-draft`: drafts requirement-shaped artefacts at any catalog-declared level
+- `pharaoh-arch-draft`: drafts architecture artefacts
+- `pharaoh-vplan-draft`: drafts verification plans and test cases
+- `pharaoh-req-codelink-annotate`: links implemented code back into the requirement graph
+- `pharaoh-quality-gate`: verifies review coverage and trace completeness at the end
+
+Each `*-draft` skill self-invokes its matching `*-review` as its last step and returns
+`{artefact, review}`. The orchestrator reads that review. It does NOT call `*-review`
+again for a freshly drafted artefact.
+
+## The Iron Gate
+
+Do NOT draft tier N+1 until tier N is drafted, reviewed, validated with `sphinx-build -W`,
+and approved by the human. This holds for every project no matter how small the feature.
+
+Violating the letter of this gate is violating the spirit of it.
+
+## When to use
+
+- A developer asks to add a feature, capability, or behaviour change.
+- The project has `.pharaoh/project/` tailoring and a sphinx-needs V-model.
+- Use even for tiny features. Small features are where requirements get silently invented.
+
+When NOT to use: pure refactors, bug fixes against an existing requirement, or projects
+with no sphinx-needs structure.
+
+## Input
+
+- The developer's feature intent in prose (what, why, measurable success criteria).
+- The project root path.
+- `.pharaoh/project/` tailoring: `artefact-catalog.yaml`, `id-conventions.yaml`, lifecycle
+  config, and any domain-specific checklists.
+- `pharaoh.toml` at the project root and `.pharaoh/project/` tailoring (inputs for deriving
+  tier order, described in Phase 1 below).
+
+## Output
+
+- A set of linked sphinx-needs artefacts forming the V-model graph, one per tier.
+- A clean `sphinx-build -W` after each tier.
+- A pharaoh-quality-gate verdict with review coverage confirmed for every drafted artefact.
+
+## Process
+
+### Phase 0: Elicitation
+
+Before drafting anything. This is a dialogue, not a form.
+
+1. Read project context: `.pharaoh/project/` tailoring (types, tiers, lifecycle),
+   `pharaoh.toml`, and a sample of recent needs from the corpus.
+2. Ask the developer clarifying questions ONE at a time: purpose, constraints, success
+   criteria, and every parameter value that would otherwise be guessed.
+3. Decompose the feature into individual requirements and present that decomposition.
+4. Gate: the developer approves the requirement list. No invented value survives into
+   a requirement.
+
+### Phase 1 onward: walk the V-model tiers
+
+Create a run directory at `.pharaoh/runs/<timestamp>/` before the first tier. Use it for
+all review JSON written during the tier loop. The Terminal step passes this directory to
+`pharaoh-quality-gate`.
+
+Derive the V-model tier order by checking these sources in priority order. First, look for
+an explicit tier or chain declaration in `pharaoh.toml` or `.pharaoh/project/`. If none
+exists, topologically sort the `required_links` chain pairs from `[pharaoh.traceability]`
+when they cover all tiers in use. If that is still incomplete, infer the order from the
+artefact-catalog types together with the link structure observed in the existing corpus
+(`needs.json`). Present the derived tier order to the developer and get confirmation before
+any drafting begins.
+
+Do not hardcode tier depth. For each tier, in order, run this loop:
+
+| step | action |
+|------|--------|
+| draft | Dispatch the tier's atomic draft skill once per artefact. The draft skill self-invokes its review and returns `{artefact, review}`. |
+| evaluate | Read the attached review. If `overall: fail`, `overall: needs_work`, or any binary axis has `score: 0`, re-dispatch the draft skill with the review action items folded into the description. Use `pharaoh-req-regenerate` for requirements, re-invoke the draft skill directly for arch and vplan. |
+| normalise | If the project traces with a generic link field (read from `ubproject.toml [needs.links]` and the existing corpus), rewrite the drafted directive's typed link option (`:satisfies:` or `:verifies:`) to that field. If `ubproject.toml` is absent or has no `[needs.links]` table, keep the typed link as-is. |
+| persist | Write the artefact into the docs tree. Write its review JSON into the run directory so `pharaoh-quality-gate` can confirm review coverage. |
+| rebuild | Run `sphinx-build -W` on the docs. The tier is not done until the build is clean and `needs.json` regenerates with the new needs. |
+| checkpoint | Present the tier's artefacts to the developer. Get approval before the next tier. |
+
+At the implementation tier the agent writes the code directly, following test-driven
+development practice (write the failing test first, then make it pass, then refactor). No
+Pharaoh atomic skill governs this tier. Once the implementation is complete, run
+`pharaoh-req-codelink-annotate` with `file_path`, `anchor`, `project_root`, and `mode`
+supplied to link the finished code back into the requirement graph.
+
+### Terminal
+
+Aggregate the persisted review JSONs into the summary YAML that `pharaoh-quality-gate`
+expects. Run `pharaoh-quality-gate`. The deliverable is a V-model graph where every tier
+traces to the next, every artefact has a review on disk, and the build is clean.
+
+## The baseline this skill exists to stop
+
+Handed "add feature X, do spec-driven development" without this skill, an agent will:
+
+- Invent the requirements itself, with fabricated thresholds, parameters, and acceptance
+  criteria, instead of eliciting them from the developer.
+- Run requirement, design, test, and code in one unattended pass with no human in the loop.
+- Draft every artefact and review none.
+- Build once at the end without `-W`, accepting warnings silently.
+
+Each of those is a defect. The process above closes them.
+
+## Rationalisations: STOP
+
+| Excuse | Reality |
+|--------|---------|
+| "I can infer the parameters myself." | An inferred parameter is a fabricated requirement. Elicit it. |
+| "The build succeeded, it is fine." | A build without `-W` passes on warnings. Run `-W`. |
+| "The draft already passed, skip the review." | The draft skill attaches a review. Read it. `overall: needs_work` triggers a re-dispatch just as `fail` does. |
+| "It is a small feature, skip the gate." | Small features are where requirements get invented. The gate holds. |
+| "Running the whole chain is faster." | An unattended chain bakes in decisions the developer never approved. Stop at every checkpoint. |
+
+## Red flags: STOP and return to the last passed gate
+
+- About to write a requirement value the developer never gave you.
+- About to draft tier N+1 while tier N has a `fail`, `needs_work`, or `score: 0` review.
+- About to move past a tier with `sphinx-build` warnings.
+- Ran the requirement, design, and test tiers with no human pause between them.

From fd6562d4b13131392a6933f2858940a573ddb8ee Mon Sep 17 00:00:00 2001
From: Bartosz Burda <bartoszburda93@gmail.com>
Date: Mon, 18 May 2026 18:13:20 +0200
Subject: [PATCH 2/3] docs: register pharaoh-sdd skill (agent, README,
 copilot-instructions)

Signed-off-by: Bartosz Burda <bartoszburda93@gmail.com>
---
 .github/agents/pharaoh.sdd.agent.md | 12 ++++++++++++
 .github/copilot-instructions.md     |  1 +
 README.md                           | 10 ++++++----
 3 files changed, 19 insertions(+), 4 deletions(-)
 create mode 100644 .github/agents/pharaoh.sdd.agent.md

diff --git a/.github/agents/pharaoh.sdd.agent.md b/.github/agents/pharaoh.sdd.agent.md
new file mode 100644
index 0000000..8076593
--- /dev/null
+++ b/.github/agents/pharaoh.sdd.agent.md
@@ -0,0 +1,12 @@
+---
+description: Use when a developer wants to add a feature, capability, or behaviour change to a project that uses sphinx-needs and a V-model artefact structure, before any requirement or code is written. Triggers on "add a feature", "implement X", "let's build Y", or "do spec-driven development" in a project with `.pharaoh/project/` tailoring. Non-atomic orchestrator: elicits requirements from the developer (Phase 0), then walks the V-model tiers one at a time, dispatching atomic draft and review skills, running `sphinx-build -W` after each tier, and waiting for human approval before advancing. Ends with a `pharaoh-quality-gate` pass confirming review coverage and trace completeness.
+handoffs: []
+---
+
+# @pharaoh.sdd
+
+Use when a developer wants to add a feature, capability, or behaviour change to a project that uses sphinx-needs and a V-model artefact structure, before any requirement or code is written. Triggers on "add a feature", "implement X", "let's build Y", or "do spec-driven development" in a project with `.pharaoh/project/` tailoring.
+
+Non-atomic orchestrator: elicits requirements from the developer (Phase 0), then walks the V-model tiers one at a time, dispatching atomic draft and review skills, running `sphinx-build -W` after each tier, and waiting for human approval before advancing. Ends with a `pharaoh-quality-gate` pass confirming review coverage and trace completeness.
+
+See [`skills/pharaoh-sdd/SKILL.md`](../../skills/pharaoh-sdd/SKILL.md) for the full specification -- inputs, outputs, the iron gate contract, and the tier loop detail.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index ff9f389..125f853 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -13,6 +13,7 @@ Pharaoh is a skill-based AI assistant framework for sphinx-needs projects. It he
 | Agent | Purpose |
 |-------|---------|
 | `@pharaoh.setup` | Scaffold Pharaoh into a project -- detect structure, generate `pharaoh.toml` |
+| `@pharaoh.sdd` | Non-atomic V-model SDD orchestrator -- elicit requirements, walk tiers with human approval at every checkpoint, end with quality-gate |
 | `@pharaoh.change` | Analyze impact of a change -- trace through needs links and codelinks, produce a Change Document |
 | `@pharaoh.trace` | Navigate traceability in any direction -- show everything linked to a need across all levels |
 | `@pharaoh.mece` | Gap and redundancy analysis -- find orphans, missing links, MECE violations |
diff --git a/README.md b/README.md
index 75fd6b4..7339043 100644
--- a/README.md
+++ b/README.md
@@ -55,16 +55,18 @@ copilot plugin install pharaoh@pharaoh-dev
 
 ## Skills / Agents
 
-71 atomic skills, organised by purpose. Names below are the Claude Code
-slash form `pharaoh:pharaoh-<name>` (the `pharaoh-` prefix is part of each
-skill's own name). The GitHub Copilot equivalent strips the redundant
-prefix: `@pharaoh.<name>`.
+Skills organised by purpose. Names below are the Claude Code slash form
+`pharaoh:pharaoh-<name>` (the `pharaoh-` prefix is part of each skill's
+own name). The GitHub Copilot equivalent strips the redundant prefix:
+`@pharaoh.<name>`. Most skills are atomic (one artefact × one phase).
+`pharaoh-sdd` is the non-atomic V-model SDD entry point.
 
 **Core workflow:**
 
 | Skill | Purpose |
 |-------|---------|
 | `pharaoh:pharaoh-setup` | Set up Pharaoh in a sphinx-needs project -- detect structure, scaffold Copilot agents |
+| `pharaoh:pharaoh-sdd` | Non-atomic V-model SDD orchestrator -- elicit requirements, walk tiers with human approval at every checkpoint, end with quality-gate |
 | `pharaoh:pharaoh-change` | Analyze the impact of a requirement change, including traceability to code via codelinks |
 | `pharaoh:pharaoh-trace` | Navigate traceability links across requirements, specs, implementations, tests, and code |
 | `pharaoh:pharaoh-mece` | Gap and redundancy analysis -- orphans, missing links, MECE violations |

From 4e6c01ceff91816126b0c0cdbde20010bd44b00d Mon Sep 17 00:00:00 2001
From: Bartosz Burda <bartoszburda93@gmail.com>
Date: Mon, 18 May 2026 18:22:32 +0200
Subject: [PATCH 3/3] fix: correct pharaoh-sdd terminal and persist contracts
 per final review

- Terminal step now invokes pharaoh-quality-gate with project_root and
  run directory via self_review_coverage invariant; artefacts_summary_path
  marked optional because pharaoh-sdd runs no mece or coverage-gap tasks
- persist step names exact review JSON filename conventions matched by
  pharaoh-self-review-coverage-check (<id>_review.json, <id>_arch_review.json,
  <id>_vplan_review.json)
- ubproject.toml added to Input section with note that [needs.links] is
  read from it for link-field convention; [pharaoh.traceability] references
  now clearly attributed to pharaoh.toml
- data-access and strictness note added: concerns are delegated to atomic skills
- needs.json output location clarified with typical demo path
- implementation tier gains explicit sphinx-build -W rebuild and developer
  checkpoint after pharaoh-req-codelink-annotate, consistent with other tiers

Signed-off-by: Bartosz Burda <bartoszburda93@gmail.com>
---
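A minimal sketch of the kind of check the Terminal step relies on: every
drafted artefact id should have a matching review JSON in the run directory.
This is not the implementation of `pharaoh-self-review-coverage-check`; the
function name `missing_reviews` and the artefact ids in the usage are
hypothetical, and the suffixes are copied from the persist step in the diff
below and kept configurable.

```python
from pathlib import Path

# Suffixes as named in the persist step; treated as an overridable default.
REVIEW_SUFFIXES = ("_review.json", "_arch_review.json", "_vplan_review.json")


def missing_reviews(run_dir, artefact_ids, suffixes=REVIEW_SUFFIXES):
    """Return the artefact ids that have no review JSON in run_dir."""
    present = {path.name for path in Path(run_dir).glob("*.json")}
    return [
        art_id
        for art_id in artefact_ids
        # An id is covered if any of the known review filenames exists.
        if not any(f"{art_id}{suffix}" in present for suffix in suffixes)
    ]
```

Called as `missing_reviews(".pharaoh/runs/<timestamp>", drafted_ids)`, an
empty result means review coverage holds for the run; any returned id is an
artefact that was drafted but has no review on disk.
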
 skills/pharaoh-sdd/SKILL.md | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/skills/pharaoh-sdd/SKILL.md b/skills/pharaoh-sdd/SKILL.md
index 6cfee3d..d01f62a 100644
--- a/skills/pharaoh-sdd/SKILL.md
+++ b/skills/pharaoh-sdd/SKILL.md
@@ -52,6 +52,11 @@ with no sphinx-needs structure.
   config, and any domain-specific checklists.
 - `pharaoh.toml` at the project root and `.pharaoh/project/` tailoring (inputs for deriving
   tier order, described in Phase 1 below).
+- `ubproject.toml` at the project root: `[needs.links]` is read from it for the project's
+  link-field convention (used in the `normalise` step of the tier loop).
+
+Data-access and strictness concerns are handled by the dispatched atomic skills. pharaoh-sdd
+does not restate them.
 
 ## Output
 
@@ -81,8 +86,8 @@ all review JSON written during the tier loop. The Terminal step passes this dire
 
 Derive the V-model tier order by checking these sources in priority order. First, look for
 an explicit tier or chain declaration in `pharaoh.toml` or `.pharaoh/project/`. If none
-exists, topologically sort the `required_links` chain pairs from `[pharaoh.traceability]`
-when they cover all tiers in use. If that is still incomplete, infer the order from the
+exists, topologically sort the `required_links` chain pairs from the `[pharaoh.traceability]`
+table in `pharaoh.toml` when they cover all tiers in use. If that is still incomplete, infer the order from the
 artefact-catalog types together with the link structure observed in the existing corpus
 (`needs.json`). Present the derived tier order to the developer and get confirmation before
 any drafting begins.
@@ -94,21 +99,28 @@ Do not hardcode tier depth. For each tier, in order, run this loop:
 | draft | Dispatch the tier's atomic draft skill once per artefact. The draft skill self-invokes its review and returns `{artefact, review}`. |
 | evaluate | Read the attached review. If `overall: fail`, `overall: needs_work`, or any binary axis has `score: 0`, re-dispatch the draft skill with the review action items folded into the description. Use `pharaoh-req-regenerate` for requirements, re-invoke the draft skill directly for arch and vplan. |
 | normalise | If the project traces with a generic link field (read from `ubproject.toml [needs.links]` and the existing corpus), rewrite the drafted directive's typed link option (`:satisfies:` or `:verifies:`) to that field. If `ubproject.toml` is absent or has no `[needs.links]` table, keep the typed link as-is. |
-| persist | Write the artefact into the docs tree. Write its review JSON into the run directory so `pharaoh-quality-gate` can confirm review coverage. |
-| rebuild | Run `sphinx-build -W` on the docs. The tier is not done until the build is clean and `needs.json` regenerates with the new needs. |
+| persist | Write the artefact into the docs tree. Write the review JSON into the run directory using the filename convention `<id>_review.json` (for req-review), `<id>_arch_review.json` (for arch-review), and `<id>_vplan_review.json` (for vplan-review). These names are matched by `pharaoh-self-review-coverage-check`. |
+| rebuild | Run `sphinx-build -W` on the docs. The tier is not done until the build is clean and `needs.json` regenerates with the new needs. The `needs.json` output location is the project's configured path, resolvable from `conf.py` or `ubproject.toml` (on a typical demo it is `docs/_build/html/needs.json`). |
 | checkpoint | Present the tier's artefacts to the developer. Get approval before the next tier. |
 
 At the implementation tier the agent writes the code directly, following test-driven
 development practice (write the failing test first, then make it pass, then refactor). No
 Pharaoh atomic skill governs this tier. Once the implementation is complete, run
 `pharaoh-req-codelink-annotate` with `file_path`, `anchor`, `project_root`, and `mode`
-supplied to link the finished code back into the requirement graph.
+supplied to link the finished code back into the requirement graph. After annotation, run
+`sphinx-build -W` to confirm the build is clean with the code links in place. Present the
+completed implementation and annotation to the developer and get approval before proceeding
+to the Terminal step.
 
 ### Terminal
 
-Aggregate the persisted review JSONs into the summary YAML that `pharaoh-quality-gate`
-expects. Run `pharaoh-quality-gate`. The deliverable is a V-model graph where every tier
-traces to the next, every artefact has a review on disk, and the build is clean.
+Run `pharaoh-quality-gate` passing `project_root` and the run directory (via
+`gate_spec.invariants.self_review_coverage`). The `self_review_coverage` invariant reads the
+run directory directly and confirms every drafted artefact has a matching review JSON on
+disk. `artefacts_summary_path` is optional and may be omitted here because the pharaoh-sdd
+chain runs draft-and-review per tier but does not run `pharaoh-mece` or
+`pharaoh-coverage-gap`. The deliverable is a V-model graph where every tier traces to the
+next, every artefact has a review on disk, and the build is clean.
 
 ## The baseline this skill exists to stop