diff --git a/docs/brownfield-workflow.adoc b/docs/brownfield-workflow.adoc index 428dba6..13b8fbf 100644 --- a/docs/brownfield-workflow.adoc +++ b/docs/brownfield-workflow.adoc @@ -163,7 +163,7 @@ The LLM synthesizes the answered questions plus the code evidence from Phase 1 i * *arc42* with all 12 chapters from Q-3 branch * *Nygard ADRs* with Pugh Matrix from Q-3.9 branch -Every claim references a Question ID and marks team-provided information with `(team answer)`. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt. +Code-derived claims carry the `file:line` evidence from their `[ANSWERED]` leaf — a reference to the code, the only durable artifact; team-provided information is marked `(team answer)`. The Question Tree is temporary scaffolding, so Q-IDs are not written into the final documents — during synthesis every claim is traced back to a leaf as a build-time check. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt. === Establish Baseline Tests @@ -261,7 +261,7 @@ Stable code that nobody touches does not need specs. |{empty}-- |Theory Recovery (Phase 2) -|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID. Mark team input with (team answer).` +|`Synthesize self-contained documentation from the Question Tree and team answers. Cite file:line evidence for code-derived claims, mark team input with (team answer), keep deferred questions as explicit gaps. 
Q-IDs stay out of the output.` |link:#/spec-driven-development[Spec-Driven Workflow] |Baseline Tests diff --git a/docs/brownfield-workflow.de.adoc b/docs/brownfield-workflow.de.adoc index d82e728..6d38eea 100644 --- a/docs/brownfield-workflow.de.adoc +++ b/docs/brownfield-workflow.de.adoc @@ -161,7 +161,7 @@ Das LLM synthetisiert die beantworteten Fragen plus die Code-Evidenz aus Phase 1 * *arc42* mit allen 12 Kapiteln aus dem Q3-Ast * *Nygard-ADRs* mit Pugh-Matrix aus dem Q3.9-Ast -Jede Aussage referenziert eine Question-ID und markiert teamgegebene Information mit `(team answer)`. Diese doppelte Rückverfolgbarkeit (Code-Evidenz + Team-Input) ist der entscheidende Unterschied zu einem einfachen Reverse-Engineering-Prompt. +Code-basierte Aussagen tragen die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf — eine Referenz auf den Code, das einzige dauerhafte Artefakt; teamgegebene Information wird mit `(team answer)` markiert. Der Question Tree ist temporäres Gerüst, daher landen Q-IDs nicht in den finalen Dokumenten — beim Synthetisieren wird jede Aussage als Build-Time-Prüfung auf ein Leaf zurückgeführt. Diese doppelte Rückverfolgbarkeit (Code-Evidenz + Team-Input) ist der entscheidende Unterschied zu einem einfachen Reverse-Engineering-Prompt. === Basis-Tests aufbauen @@ -259,7 +259,7 @@ Stabiler Code, den niemand anfasst, braucht keine Specs. |{empty}-- |Theory Recovery (Phase 2) -|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID. Mark team input with (team answer).` +|`Synthesize self-contained documentation from the Question Tree and team answers. Cite file:line evidence for code-derived claims, mark team input with (team answer), keep deferred questions as explicit gaps. 
Q-IDs stay out of the output.` |link:#/spec-driven-development[Spec-Driven Workflow] |Basis-Tests diff --git a/docs/socratic-recovery-skill.adoc b/docs/socratic-recovery-skill.adoc index 7700d23..0ee80b0 100644 --- a/docs/socratic-recovery-skill.adoc +++ b/docs/socratic-recovery-skill.adoc @@ -22,7 +22,7 @@ Outputs two AsciiDoc files: `QUESTION_TREE.adoc` (full reasoning trace) and `OPE === Phase 2 — Synthesize documentation -The skill takes the answered tree and produces a PRD, Cockburn use cases, an arc42 architecture document, and Nygard ADRs with Pugh matrices. Every claim cites a Q-ID; team-supplied facts are marked `(team answer)`. +The skill takes the answered tree and produces a PRD, Cockburn use cases, an arc42 architecture document, and Nygard ADRs with Pugh matrices. Code-derived claims cite the `file:line` evidence from their `[ANSWERED]` leaf, and team-supplied facts are marked `(team answer)`. The Question Tree is temporary scaffolding, so Q-IDs stay out of the final documents. == When to use it @@ -87,7 +87,8 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the The skill enforces a two-phase workflow: build a Question Tree first ([ANSWERED] with code evidence vs [OPEN] with role), let the team answer -the OPEN leaves, then synthesize documentation with full Q-ID traceability. +the OPEN leaves, then synthesize self-contained documentation that traces +every claim to code evidence or a team answer. ---- === link:https://github.com/google-gemini/gemini-cli[Gemini CLI] @@ -105,7 +106,9 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the Build a Question Tree before writing any documentation. Mark each leaf [ANSWERED] (with file:line evidence) or [OPEN] (with Category and Ask role). Synthesize docs from the answered tree only after the team has filled in -the OPEN leaves. Cite Q-IDs in every claim. +the OPEN leaves. 
The docs must be self-contained: cite file:line evidence +for code-derived claims, mark team input with (team answer). Q-IDs stay +out of the output. ---- === link:https://docs.cursor.com/[Cursor] @@ -138,8 +141,9 @@ Recovery workflow at https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-theory-recovery Two phases: first a Question Tree separating code-derivable facts from -open questions routed by role; second, synthesis with Q-ID traceability -after the team fills the gaps. +open questions routed by role; second, synthesis into self-contained +documentation — code-evidenced or team-answered — after the team fills +the gaps. ---- === link:https://kiro.dev/[Amazon Kiro] diff --git a/docs/socratic-recovery-skill.de.adoc b/docs/socratic-recovery-skill.de.adoc index 3adb01f..65c9ad5 100644 --- a/docs/socratic-recovery-skill.de.adoc +++ b/docs/socratic-recovery-skill.de.adoc @@ -22,7 +22,7 @@ Output sind zwei AsciiDoc-Dateien: `QUESTION_TREE.adoc` (vollständige Begründu === Phase 2 — Dokumentation synthetisieren -Der Skill nimmt den beantworteten Baum und erzeugt ein PRD, Cockburn Use Cases, eine arc42-Architekturbeschreibung und Nygard-ADRs mit Pugh-Matrix. Jede Aussage zitiert eine Q-ID; team-gegebene Fakten sind mit `(team answer)` markiert. +Der Skill nimmt den beantworteten Baum und erzeugt ein PRD, Cockburn Use Cases, eine arc42-Architekturbeschreibung und Nygard-ADRs mit Pugh-Matrix. Code-basierte Aussagen zitieren die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf, team-gegebene Fakten sind mit `(team answer)` markiert. Der Question Tree ist temporäres Gerüst, daher landen Q-IDs nicht in den finalen Dokumenten. 
== Wann zu verwenden @@ -87,7 +87,8 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the The skill enforces a two-phase workflow: build a Question Tree first ([ANSWERED] with code evidence vs [OPEN] with role), let the team answer -the OPEN leaves, then synthesize documentation with full Q-ID traceability. +the OPEN leaves, then synthesize self-contained documentation that traces +every claim to code evidence or a team answer. ---- === link:https://github.com/google-gemini/gemini-cli[Gemini CLI] @@ -105,7 +106,9 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the Build a Question Tree before writing any documentation. Mark each leaf [ANSWERED] (with file:line evidence) or [OPEN] (with Category and Ask role). Synthesize docs from the answered tree only after the team has filled in -the OPEN leaves. Cite Q-IDs in every claim. +the OPEN leaves. The docs must be self-contained: cite file:line evidence +for code-derived claims, mark team input with (team answer). Q-IDs stay +out of the output. ---- === link:https://docs.cursor.com/[Cursor] @@ -138,8 +141,9 @@ Recovery workflow at https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-theory-recovery Two phases: first a Question Tree separating code-derivable facts from -open questions routed by role; second, synthesis with Q-ID traceability -after the team fills the gaps. +open questions routed by role; second, synthesis into self-contained +documentation — code-evidenced or team-answered — after the team fills +the gaps. 
---- === link:https://kiro.dev/[Amazon Kiro] diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md index 29b0bd9..c1ef585 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md @@ -65,7 +65,7 @@ The fix: model the gaps explicitly. Every question about the system is either `[ ┌────────────────────────────────┐ Phase 2 │ Answered tree ──► Docs │ │ PRD · Cockburn UCs · arc42 · │ - │ Nygard ADRs (every claim Q-ID) │ + │ Nygard ADRs (claims cite code) │ └────────────────────────────────┘ ``` @@ -104,7 +104,7 @@ Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 - **arc42** with all 12 chapters from the Q3 branch - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch -Every claim references a Q-ID. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. +Code-derived claims cite the `file:line` evidence from their `[ANSWERED]` leaf — a reference to the code, the only durable, canonical artifact. Team-supplied information is marked `(team answer)`. The Question Tree is temporary scaffolding, so its Q-IDs are not written into the final documents; during synthesis every claim is still traced back to a leaf as a build-time check. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. 
## What the LLM can and cannot recover diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md index a875d16..a8370a9 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md @@ -50,12 +50,23 @@ Produce four artifacts: - Anchor: ADR according to Nygard Rules for traceability: -- Every paragraph references the Q-IDs that support it, in square brackets: - "The system uses Hexagonal Architecture [Q3.5]." -- Team-supplied facts get an inline marker: "Sessions expire after 24 hours - (team answer, Q3.4.2)." +- The synthesized documentation must be self-contained. The Question Tree + is temporary scaffolding — it is renumbered on every re-run — so Q-IDs + must NOT appear in the output. While synthesizing, trace every claim + back to a leaf: each claim must come from an [ANSWERED] leaf or an + answered [OPEN] leaf. This tracing is a build-time check, not something + written into the documents. +- A claim backed by an [ANSWERED] leaf cites the code evidence from that + leaf — the reference to the code, the only durable, canonical artifact: + "The system uses Hexagonal Architecture [src/app/Ports.java, + src/adapter/JpaOrderRepository.java:30]." + Copy the Evidence line verbatim from the leaf; do not invent, shorten, + or re-derive file paths. A leaf with no Evidence line is not [ANSWERED] + and must not be cited as fact. +- Team-supplied facts have no code evidence — mark them (team answer): + "Sessions expire after 24 hours (team answer)." - Deferred questions stay as explicit gaps: "Quality-goal priorities are - deferred (Q4.1.deferred) and must be resolved before the next release." + deferred and must be resolved before the next release." 
- Do not introduce facts that do not appear in QUESTION_TREE.adoc or OPEN_QUESTIONS.adoc. If a Section feels under-specified, leave it under-specified — that is signal, not a defect. diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md index 5b2e600..96e394a 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md @@ -141,13 +141,19 @@ _(write here)_ ## Phase 2 traceability -After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID: +The synthesized documentation must be self-contained. The Question Tree is temporary scaffolding — it is renumbered on every re-run — so its Q-IDs are NOT carried into the final documents. During Phase 2, every claim is traced back to a leaf as a build-time check; what gets *written* is the durable reference only: ``` -The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture]. Sessions -expire after 24 hours (team answer, Q3.8.Security.SessionLifetime). -Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved +The system uses Hexagonal Architecture [src/app/Ports.java, +src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours +(team answer). Quality-goal priorities are deferred and must be resolved before the next release. ``` -This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention. +The three forms are deliberate: + +- `[file:line, ...]` — code-derived fact. Copied verbatim from the `Evidence` line of the `[ANSWERED]` leaf; it points at the code, the only canonical, persistent artifact. +- `(team answer)` — team-supplied fact. 
No code evidence exists; the marker tells the reader a human asserted this and it must be re-verified with a human, not derived from code. +- `deferred` — a known gap, stated explicitly, not a fact. + +This is the auditable trace: a code-derived claim without its `file:line` evidence is incomplete; a fact that is neither code-evidenced nor marked `(team answer)` is invention. diff --git a/skill/socratic-code-theory-recovery/SKILL.md b/skill/socratic-code-theory-recovery/SKILL.md index 29b0bd9..c1ef585 100644 --- a/skill/socratic-code-theory-recovery/SKILL.md +++ b/skill/socratic-code-theory-recovery/SKILL.md @@ -65,7 +65,7 @@ The fix: model the gaps explicitly. Every question about the system is either `[ ┌────────────────────────────────┐ Phase 2 │ Answered tree ──► Docs │ │ PRD · Cockburn UCs · arc42 · │ - │ Nygard ADRs (every claim Q-ID) │ + │ Nygard ADRs (claims cite code) │ └────────────────────────────────┘ ``` @@ -104,7 +104,7 @@ Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 - **arc42** with all 12 chapters from the Q3 branch - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch -Every claim references a Q-ID. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. +Code-derived claims cite the `file:line` evidence from their `[ANSWERED]` leaf — a reference to the code, the only durable, canonical artifact. Team-supplied information is marked `(team answer)`. The Question Tree is temporary scaffolding, so its Q-IDs are not written into the final documents; during synthesis every claim is still traced back to a leaf as a build-time check. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. 
## What the LLM can and cannot recover diff --git a/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md b/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md index a875d16..a8370a9 100644 --- a/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md +++ b/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md @@ -50,12 +50,23 @@ Produce four artifacts: - Anchor: ADR according to Nygard Rules for traceability: -- Every paragraph references the Q-IDs that support it, in square brackets: - "The system uses Hexagonal Architecture [Q3.5]." -- Team-supplied facts get an inline marker: "Sessions expire after 24 hours - (team answer, Q3.4.2)." +- The synthesized documentation must be self-contained. The Question Tree + is temporary scaffolding — it is renumbered on every re-run — so Q-IDs + must NOT appear in the output. While synthesizing, trace every claim + back to a leaf: each claim must come from an [ANSWERED] leaf or an + answered [OPEN] leaf. This tracing is a build-time check, not something + written into the documents. +- A claim backed by an [ANSWERED] leaf cites the code evidence from that + leaf — the reference to the code, the only durable, canonical artifact: + "The system uses Hexagonal Architecture [src/app/Ports.java, + src/adapter/JpaOrderRepository.java:30]." + Copy the Evidence line verbatim from the leaf; do not invent, shorten, + or re-derive file paths. A leaf with no Evidence line is not [ANSWERED] + and must not be cited as fact. +- Team-supplied facts have no code evidence — mark them (team answer): + "Sessions expire after 24 hours (team answer)." - Deferred questions stay as explicit gaps: "Quality-goal priorities are - deferred (Q4.1.deferred) and must be resolved before the next release." + deferred and must be resolved before the next release." - Do not introduce facts that do not appear in QUESTION_TREE.adoc or OPEN_QUESTIONS.adoc. 
If a Section feels under-specified, leave it under-specified — that is signal, not a defect. diff --git a/skill/socratic-code-theory-recovery/references/output-schema.md b/skill/socratic-code-theory-recovery/references/output-schema.md index 5b2e600..96e394a 100644 --- a/skill/socratic-code-theory-recovery/references/output-schema.md +++ b/skill/socratic-code-theory-recovery/references/output-schema.md @@ -141,13 +141,19 @@ _(write here)_ ## Phase 2 traceability -After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID: +The synthesized documentation must be self-contained. The Question Tree is temporary scaffolding — it is renumbered on every re-run — so its Q-IDs are NOT carried into the final documents. During Phase 2, every claim is traced back to a leaf as a build-time check; what gets *written* is the durable reference only: ``` -The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture]. Sessions -expire after 24 hours (team answer, Q3.8.Security.SessionLifetime). -Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved +The system uses Hexagonal Architecture [src/app/Ports.java, +src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours +(team answer). Quality-goal priorities are deferred and must be resolved before the next release. ``` -This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention. +The three forms are deliberate: + +- `[file:line, ...]` — code-derived fact. Copied verbatim from the `Evidence` line of the `[ANSWERED]` leaf; it points at the code, the only canonical, persistent artifact. +- `(team answer)` — team-supplied fact. No code evidence exists; the marker tells the reader a human asserted this and it must be re-verified with a human, not derived from code. +- `deferred` — a known gap, stated explicitly, not a fact. 
+ +This is the auditable trace: a code-derived claim without its `file:line` evidence is incomplete; a fact that is neither code-evidenced nor marked `(team answer)` is invention.
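
Reviewer note: the traceability rules introduced in this change (no Q-IDs in the output, `[file:line, ...]` for code-derived facts, `(team answer)` for team-supplied facts) are mechanical enough to lint. The following is a minimal sketch of such a check, not part of the skill itself. The function names and the regular expressions are assumptions made for illustration; the Q-ID pattern is inferred only from the examples that appear in this diff (`Q3.5`, `Q-3.9`, `Q4.1.deferred`, `Q3.9.HexagonalArchitecture`).

```python
import re

# Assumed Q-ID shape, inferred from the examples in this diff:
# "Q" or "Q-", a number, then optional dot-separated segments.
Q_ID = re.compile(r"\bQ-?\d+(?:\.\w+)*")

# Any bracketed group, e.g. "[src/app/Ports.java, src/adapter/Foo.java:30]".
BRACKETED = re.compile(r"\[([^\[\]]+)\]")

# One evidence item: a path with a file extension and an optional ":line".
EVIDENCE_ITEM = re.compile(r"[\w./-]+\.\w+(?::\d+)?")


def leaked_q_ids(text: str) -> list[str]:
    """Return every Q-ID that leaked into the synthesized text."""
    return Q_ID.findall(text)


def evidence_citations(text: str) -> list[str]:
    """Return the file[:line] items from bracketed evidence citations."""
    citations: list[str] = []
    for group in BRACKETED.findall(text):
        items = [item.strip() for item in group.split(",")]
        is_evidence = all(
            EVIDENCE_ITEM.fullmatch(item) and not Q_ID.fullmatch(item)
            for item in items
        )
        if is_evidence:
            citations.extend(items)
    return citations


def team_answer_count(text: str) -> int:
    """Count the bare (team answer) markers."""
    return text.count("(team answer)")
```

Run against the old and new example paragraphs from `output-schema.md`, the old form reports three leaked Q-IDs and no evidence citations, while the new form reports no leaks, two evidence items, and one team answer. A check like this only validates the form of a citation; it does not confirm that the cited file and line exist or that they support the claim.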