From 7fa88c36ba2326670dd7f6c0f618bfc24ef72013 Mon Sep 17 00:00:00 2001
From: Bartosz Burda
Date: Mon, 18 May 2026 17:56:58 +0200
Subject: [PATCH 1/3] feat: pharaoh-sdd orchestrator skill for gated V-model SDD

Signed-off-by: Bartosz Burda
---
 skills/pharaoh-sdd/SKILL.md | 140 ++++++++++++++++++++++++++++++++++++
 1 file changed, 140 insertions(+)
 create mode 100644 skills/pharaoh-sdd/SKILL.md

diff --git a/skills/pharaoh-sdd/SKILL.md b/skills/pharaoh-sdd/SKILL.md
new file mode 100644
index 0000000..6cfee3d
--- /dev/null
+++ b/skills/pharaoh-sdd/SKILL.md
@@ -0,0 +1,140 @@
+---
+name: pharaoh-sdd
+description: Use when a developer wants to add a feature, capability, or behaviour change to a project that uses sphinx-needs and a V-model artefact structure, before any requirement or code is written. Triggers on "add a feature", "implement X", "let's build Y", or "do spec-driven development" in a project with .pharaoh/project/ tailoring.
+chains_from: []
+chains_to: []
+---
+
+# pharaoh-sdd
+
+## Overview
+
+The graph is not the hard part. An agent will happily produce one. The hard part is not
+fabricating it. Requirements come from the human, every artefact carries its review, and
+every tier builds clean before the next. This skill enforces that contract.
+
+## Composition
+
+pharaoh-sdd is a non-atomic orchestrator. It does not draft or review artefacts directly.
+It dispatches the following atomic skills:
+
+- `pharaoh-req-draft`: drafts requirement-shaped artefacts at any catalog-declared level
+- `pharaoh-arch-draft`: drafts architecture artefacts
+- `pharaoh-vplan-draft`: drafts verification plans and test cases
+- `pharaoh-req-codelink-annotate`: links implemented code back into the requirement graph
+- `pharaoh-quality-gate`: verifies review coverage and trace completeness at the end
+
+Each `*-draft` skill self-invokes its matching `*-review` as its last step and returns
+`{artefact, review}`. The orchestrator reads that review.
It does NOT call `*-review`
+again for a freshly drafted artefact.
+
+## The Iron Gate
+
+Do NOT draft tier N+1 until tier N is drafted, reviewed, validated with `sphinx-build -W`,
+and approved by the human. This holds for every project no matter how small the feature.
+
+Violating the letter of this gate is violating the spirit of it.
+
+## When to use
+
+- A developer asks to add a feature, capability, or behaviour change.
+- The project has `.pharaoh/project/` tailoring and a sphinx-needs V-model.
+- Use even for tiny features. Small features are where requirements get silently invented.
+
+When NOT to use: pure refactors, bug fixes against an existing requirement, or projects
+with no sphinx-needs structure.
+
+## Input
+
+- The developer's feature intent in prose (what, why, measurable success criteria).
+- The project root path.
+- `.pharaoh/project/` tailoring: `artefact-catalog.yaml`, `id-conventions.yaml`, lifecycle
+  config, and any domain-specific checklists.
+- `pharaoh.toml` at the project root and `.pharaoh/project/` tailoring (inputs for deriving
+  tier order, described in Phase 1 below).
+
+## Output
+
+- A set of linked sphinx-needs artefacts forming the V-model graph, one per tier.
+- A clean `sphinx-build -W` after each tier.
+- A pharaoh-quality-gate verdict with review coverage confirmed for every drafted artefact.
+
+## Process
+
+### Phase 0: Elicitation
+
+Before drafting anything. This is a dialogue, not a form.
+
+1. Read project context: `.pharaoh/project/` tailoring (types, tiers, lifecycle),
+   `pharaoh.toml`, and a sample of recent needs from the corpus.
+2. Ask the developer clarifying questions ONE at a time: purpose, constraints, success
+   criteria, and every parameter value that would otherwise be guessed.
+3. Decompose the feature into individual requirements and present that decomposition.
+4. Gate: the developer approves the requirement list. No invented value survives into
+   a requirement.
+
+### Phase 1 onward: walk the V-model tiers
+
+Create a run directory at `.pharaoh/runs//` before the first tier. Use it for
+all review JSON written during the tier loop. The Terminal step passes this directory to
+`pharaoh-quality-gate`.
+
+Derive the V-model tier order by checking these sources in priority order. First, look for
+an explicit tier or chain declaration in `pharaoh.toml` or `.pharaoh/project/`. If none
+exists, topologically sort the `required_links` chain pairs from `[pharaoh.traceability]`
+when they cover all tiers in use. If that is still incomplete, infer the order from the
+artefact-catalog types together with the link structure observed in the existing corpus
+(`needs.json`). Present the derived tier order to the developer and get confirmation before
+any drafting begins.
+
+Do not hardcode tier depth. For each tier, in order, run this loop:
+
+| step | action |
+|------|--------|
+| draft | Dispatch the tier's atomic draft skill once per artefact. The draft skill self-invokes its review and returns `{artefact, review}`. |
+| evaluate | Read the attached review. If `overall: fail`, `overall: needs_work`, or any binary axis has `score: 0`, re-dispatch the draft skill with the review action items folded into the description. Use `pharaoh-req-regenerate` for requirements, re-invoke the draft skill directly for arch and vplan. |
+| normalise | If the project traces with a generic link field (read from `ubproject.toml [needs.links]` and the existing corpus), rewrite the drafted directive's typed link option (`:satisfies:` or `:verifies:`) to that field. If `ubproject.toml` is absent or has no `[needs.links]` table, keep the typed link as-is. |
+| persist | Write the artefact into the docs tree. Write its review JSON into the run directory so `pharaoh-quality-gate` can confirm review coverage. |
+| rebuild | Run `sphinx-build -W` on the docs. The tier is not done until the build is clean and `needs.json` regenerates with the new needs.
|
+| checkpoint | Present the tier's artefacts to the developer. Get approval before the next tier. |
+
+At the implementation tier the agent writes the code directly, following test-driven
+development practice (write the failing test first, then make it pass, then refactor). No
+Pharaoh atomic skill governs this tier. Once the implementation is complete, run
+`pharaoh-req-codelink-annotate` with `file_path`, `anchor`, `project_root`, and `mode`
+supplied to link the finished code back into the requirement graph.
+
+### Terminal
+
+Aggregate the persisted review JSONs into the summary YAML that `pharaoh-quality-gate`
+expects. Run `pharaoh-quality-gate`. The deliverable is a V-model graph where every tier
+traces to the next, every artefact has a review on disk, and the build is clean.
+
+## The baseline this skill exists to stop
+
+Handed "add feature X, do spec-driven development" without this skill, an agent will:
+
+- Invent the requirements itself, with fabricated thresholds, parameters, and acceptance
+  criteria, instead of eliciting them from the developer.
+- Run requirement, design, test, and code in one unattended pass with no human in the loop.
+- Draft every artefact and review none.
+- Build once at the end without `-W`, accepting warnings silently.
+
+Each of those is a defect. The process above closes them.
+
+## Rationalisations: STOP
+
+| Excuse | Reality |
+|--------|---------|
+| "I can infer the parameters myself." | An inferred parameter is a fabricated requirement. Elicit it. |
+| "The build succeeded, it is fine." | A build without `-W` passes on warnings. Run `-W`. |
+| "The draft already passed, skip the review." | The draft skill attaches a review. Read it. `overall: needs_work` triggers a re-dispatch just as `fail` does. |
+| "It is a small feature, skip the gate." | Small features are where requirements get invented. The gate holds. |
+| "Running the whole chain is faster."
| An unattended chain bakes in decisions the developer never approved. Stop at every checkpoint. |
+
+## Red flags: STOP and return to the last passed gate
+
+- About to write a requirement value the developer never gave you.
+- About to draft tier N+1 while tier N has a `fail`, `needs_work`, or `score: 0` review.
+- About to move past a tier with `sphinx-build` warnings.
+- Ran draft, design, and test with no human pause between them.

From fd6562d4b13131392a6933f2858940a573ddb8ee Mon Sep 17 00:00:00 2001
From: Bartosz Burda
Date: Mon, 18 May 2026 18:13:20 +0200
Subject: [PATCH 2/3] docs: register pharaoh-sdd skill (agent, README, copilot-instructions)

Signed-off-by: Bartosz Burda
---
 .github/agents/pharaoh.sdd.agent.md | 12 ++++++++++++
 .github/copilot-instructions.md | 1 +
 README.md | 10 ++++++----
 3 files changed, 19 insertions(+), 4 deletions(-)
 create mode 100644 .github/agents/pharaoh.sdd.agent.md

diff --git a/.github/agents/pharaoh.sdd.agent.md b/.github/agents/pharaoh.sdd.agent.md
new file mode 100644
index 0000000..8076593
--- /dev/null
+++ b/.github/agents/pharaoh.sdd.agent.md
@@ -0,0 +1,12 @@
+---
+description: Use when a developer wants to add a feature, capability, or behaviour change to a project that uses sphinx-needs and a V-model artefact structure, before any requirement or code is written. Triggers on "add a feature", "implement X", "let's build Y", or "do spec-driven development" in a project with `.pharaoh/project/` tailoring. Non-atomic orchestrator: elicits requirements from the developer (Phase 0), then walks the V-model tiers one at a time, dispatching atomic draft and review skills, running `sphinx-build -W` after each tier, and waiting for human approval before advancing. Ends with a `pharaoh-quality-gate` pass confirming review coverage and trace completeness.
+handoffs: []
+---
+
+# @pharaoh.sdd
+
+Use when a developer wants to add a feature, capability, or behaviour change to a project that uses sphinx-needs and a V-model artefact structure, before any requirement or code is written. Triggers on "add a feature", "implement X", "let's build Y", or "do spec-driven development" in a project with `.pharaoh/project/` tailoring.
+
+Non-atomic orchestrator: elicits requirements from the developer (Phase 0), then walks the V-model tiers one at a time, dispatching atomic draft and review skills, running `sphinx-build -W` after each tier, and waiting for human approval before advancing. Ends with a `pharaoh-quality-gate` pass confirming review coverage and trace completeness.
+
+See [`skills/pharaoh-sdd/SKILL.md`](../../skills/pharaoh-sdd/SKILL.md) for the full specification -- inputs, outputs, the iron gate contract, and the tier loop detail.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index ff9f389..125f853 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -13,6 +13,7 @@ Pharaoh is a skill-based AI assistant framework for sphinx-needs projects.
It he
 | Agent | Purpose |
 |-------|---------|
 | `@pharaoh.setup` | Scaffold Pharaoh into a project -- detect structure, generate `pharaoh.toml` |
+| `@pharaoh.sdd` | Non-atomic V-model SDD orchestrator -- elicit requirements, walk tiers with human approval at every checkpoint, end with quality-gate |
 | `@pharaoh.change` | Analyze impact of a change -- trace through needs links and codelinks, produce a Change Document |
 | `@pharaoh.trace` | Navigate traceability in any direction -- show everything linked to a need across all levels |
 | `@pharaoh.mece` | Gap and redundancy analysis -- find orphans, missing links, MECE violations |
diff --git a/README.md b/README.md
index 75fd6b4..7339043 100644
--- a/README.md
+++ b/README.md
@@ -55,16 +55,18 @@ copilot plugin install pharaoh@pharaoh-dev
 
 ## Skills / Agents
 
-71 atomic skills, organised by purpose. Names below are the Claude Code
-slash form `pharaoh:pharaoh-` (the `pharaoh-` prefix is part of each
-skill's own name). The GitHub Copilot equivalent strips the redundant
-prefix: `@pharaoh.`.
+Skills organised by purpose. Names below are the Claude Code slash form
+`pharaoh:pharaoh-` (the `pharaoh-` prefix is part of each skill's
+own name). The GitHub Copilot equivalent strips the redundant prefix:
+`@pharaoh.`. Most skills are atomic (one artefact × one phase).
+`pharaoh-sdd` is the non-atomic V-model SDD entry point.
 
 **Core workflow:**
 
 | Skill | Purpose |
 |-------|---------|
 | `pharaoh:pharaoh-setup` | Set up Pharaoh in a sphinx-needs project -- detect structure, scaffold Copilot agents |
+| `pharaoh:pharaoh-sdd` | Non-atomic V-model SDD orchestrator -- elicit requirements, walk tiers with human approval at every checkpoint, end with quality-gate |
 | `pharaoh:pharaoh-change` | Analyze the impact of a requirement change, including traceability to code via codelinks |
 | `pharaoh:pharaoh-trace` | Navigate traceability links across requirements, specs, implementations, tests, and code |
 | `pharaoh:pharaoh-mece` | Gap and redundancy analysis -- orphans, missing links, MECE violations |

From 4e6c01ceff91816126b0c0cdbde20010bd44b00d Mon Sep 17 00:00:00 2001
From: Bartosz Burda
Date: Mon, 18 May 2026 18:22:32 +0200
Subject: [PATCH 3/3] fix: correct pharaoh-sdd terminal and persist contracts per final review

- Terminal step now invokes pharaoh-quality-gate with project_root and run directory via self_review_coverage invariant; artefacts_summary_path marked optional because pharaoh-sdd runs no mece or coverage-gap tasks
- persist step names exact review JSON filename conventions matched by pharaoh-self-review-coverage-check (_review.json, _diagram_review.json, _code_grounding.json)
- ubproject.toml added to Input section with note that [needs.links] is read from it for link-field convention; [pharaoh.traceability] references now clearly attributed to pharaoh.toml
- data-access and strictness note added: concerns are delegated to atomic skills
- needs.json output location clarified with typical demo path
- implementation tier gains explicit sphinx-build -W rebuild and developer checkpoint after pharaoh-req-codelink-annotate, consistent with other tiers

Signed-off-by: Bartosz Burda
---
 skills/pharaoh-sdd/SKILL.md | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/skills/pharaoh-sdd/SKILL.md b/skills/pharaoh-sdd/SKILL.md
index 6cfee3d..d01f62a
100644
--- a/skills/pharaoh-sdd/SKILL.md
+++ b/skills/pharaoh-sdd/SKILL.md
@@ -52,6 +52,11 @@ with no sphinx-needs structure.
   config, and any domain-specific checklists.
 - `pharaoh.toml` at the project root and `.pharaoh/project/` tailoring (inputs for deriving
   tier order, described in Phase 1 below).
+- `ubproject.toml` at the project root: `[needs.links]` is read from it for the project's
+  link-field convention (used in the `normalise` step of the tier loop).
+
+Data-access and strictness concerns are handled by the dispatched atomic skills. pharaoh-sdd
+does not restate them.
 
 ## Output
 
@@ -81,8 +86,8 @@ all review JSON written during the tier loop. The Terminal step passes this dire
 
 Derive the V-model tier order by checking these sources in priority order. First, look for
 an explicit tier or chain declaration in `pharaoh.toml` or `.pharaoh/project/`. If none
-exists, topologically sort the `required_links` chain pairs from `[pharaoh.traceability]`
-when they cover all tiers in use. If that is still incomplete, infer the order from the
+exists, topologically sort the `required_links` chain pairs from the `[pharaoh.traceability]`
+table in `pharaoh.toml` when they cover all tiers in use. If that is still incomplete, infer the order from the
 artefact-catalog types together with the link structure observed in the existing corpus
 (`needs.json`). Present the derived tier order to the developer and get confirmation before
 any drafting begins.
@@ -94,21 +99,28 @@ Do not hardcode tier depth. For each tier, in order, run this loop:
 | draft | Dispatch the tier's atomic draft skill once per artefact. The draft skill self-invokes its review and returns `{artefact, review}`. |
 | evaluate | Read the attached review. If `overall: fail`, `overall: needs_work`, or any binary axis has `score: 0`, re-dispatch the draft skill with the review action items folded into the description.
Use `pharaoh-req-regenerate` for requirements, re-invoke the draft skill directly for arch and vplan. |
 | normalise | If the project traces with a generic link field (read from `ubproject.toml [needs.links]` and the existing corpus), rewrite the drafted directive's typed link option (`:satisfies:` or `:verifies:`) to that field. If `ubproject.toml` is absent or has no `[needs.links]` table, keep the typed link as-is. |
-| persist | Write the artefact into the docs tree. Write its review JSON into the run directory so `pharaoh-quality-gate` can confirm review coverage. |
-| rebuild | Run `sphinx-build -W` on the docs. The tier is not done until the build is clean and `needs.json` regenerates with the new needs. |
+| persist | Write the artefact into the docs tree. Write the review JSON into the run directory using the filename convention `_review.json` (for req-review), `_arch_review.json` (for arch-review), and `_vplan_review.json` (for vplan-review). These names are matched by `pharaoh-self-review-coverage-check`. |
+| rebuild | Run `sphinx-build -W` on the docs. The tier is not done until the build is clean and `needs.json` regenerates with the new needs. The `needs.json` output location is the project's configured path, resolvable from `conf.py` or `ubproject.toml` (on a typical demo it is `docs/_build/html/needs.json`). |
 | checkpoint | Present the tier's artefacts to the developer. Get approval before the next tier. |
 
 At the implementation tier the agent writes the code directly, following test-driven
 development practice (write the failing test first, then make it pass, then refactor). No
 Pharaoh atomic skill governs this tier. Once the implementation is complete, run
 `pharaoh-req-codelink-annotate` with `file_path`, `anchor`, `project_root`, and `mode`
-supplied to link the finished code back into the requirement graph.
+supplied to link the finished code back into the requirement graph.
After annotation, run
+`sphinx-build -W` to confirm the build is clean with the code links in place. Present the
+completed implementation and annotation to the developer and get approval before proceeding
+to the Terminal step.
 
 ### Terminal
 
-Aggregate the persisted review JSONs into the summary YAML that `pharaoh-quality-gate`
-expects. Run `pharaoh-quality-gate`. The deliverable is a V-model graph where every tier
-traces to the next, every artefact has a review on disk, and the build is clean.
+Run `pharaoh-quality-gate` passing `project_root` and the run directory (via
+`gate_spec.invariants.self_review_coverage`). The `self_review_coverage` invariant reads the
+run directory directly and confirms every drafted artefact has a matching review JSON on
+disk. `artefacts_summary_path` is optional and may be omitted here because the pharaoh-sdd
+chain runs draft-and-review per tier but does not run `pharaoh-mece` or
+`pharaoh-coverage-gap`. The deliverable is a V-model graph where every tier traces to the
+next, every artefact has a review on disk, and the build is clean.
 
 ## The baseline this skill exists to stop
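
A note on the tier-order derivation described in the Phase 1 text of `skills/pharaoh-sdd/SKILL.md`: the second source in the priority list (topologically sorting the `required_links` chain pairs when they cover all tiers in use) could be sketched roughly as below. This is only an illustration under stated assumptions. The patch does not show how `required_links` is laid out in `pharaoh.toml`, so the `(child, parent)` tuples, the tier names, and the helper name `derive_tier_order` are all hypothetical. Only Python's standard `graphlib` module is taken as given.

```python
from graphlib import CycleError, TopologicalSorter


def derive_tier_order(chain_pairs, tiers_in_use):
    """Order tiers parent-first, or return None when the pairs are incomplete.

    chain_pairs: iterable of (child_tier, parent_tier) tuples. This shape is an
    assumption; the real `required_links` layout is not shown in the patch.
    tiers_in_use: set of tier names the project actually drafts.
    """
    pairs = list(chain_pairs)
    covered = {tier for pair in pairs for tier in pair}
    if not set(tiers_in_use) <= covered:
        # The pairs do not cover every tier in use, so the caller should fall
        # back to inferring the order from the catalog and the existing corpus.
        return None

    sorter = TopologicalSorter()
    for child, parent in pairs:
        sorter.add(child, parent)  # the parent tier is drafted before the child
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # graphlib stores the detected cycle as the second element of args.
        raise ValueError(f"required_links contains a cycle: {exc.args[1]}") from exc


# Hypothetical tier names, chosen only for this example.
pairs = [("arch", "req"), ("impl", "arch"), ("test", "impl")]
print(derive_tier_order(pairs, {"req", "arch", "impl", "test"}))
```

Returning `None` instead of a partial order keeps the fallback to corpus inference explicit, and the result would still be presented to the developer for confirmation before any drafting, as the skill text requires.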