From 16e6dd804dd4a251a96411e7acfdfea786d4e2f8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?R=7BAI=7Df=20D=2E=20M=C3=BCller?=
Date: Sun, 17 May 2026 12:37:08 +0200
Subject: [PATCH 1/2] fix(socratic-recovery): carry code evidence into Phase 2 citations
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 2 synthesized documentation cited only the Q-ID (e.g. [Q3.5]). The
file:line code evidence sits in the [ANSWERED] leaf of the Question Tree,
so a reader had to open the tree to find where a claim came from.

Phase 2 now copies the Evidence line from the [ANSWERED] leaf into the
citation alongside the Q-ID — [Q3.5; src/app/Ports.java:12] — so the
source location is visible in the documentation itself. Team answers keep
the (team answer, Q-ID) form (no code evidence exists); deferred questions
stay explicit gaps.

Updated the skill (phase-2-synthesize prompt, output-schema, SKILL.md) and
the website brownfield-workflow / socratic-recovery-skill docs including
the copy-paste Phase 2 cheat-sheet prompt, EN and DE.

Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/brownfield-workflow.adoc | 4 ++-- docs/brownfield-workflow.de.adoc | 4 ++-- docs/socratic-recovery-skill.adoc | 5 +++-- docs/socratic-recovery-skill.de.adoc | 5 +++-- .../socratic-code-theory-recovery/SKILL.md | 2 +- .../prompts/phase-2-synthesize.md | 16 ++++++++++++---- .../references/output-schema.md | 13 ++++++++++--- skill/socratic-code-theory-recovery/SKILL.md | 2 +- .../prompts/phase-2-synthesize.md | 16 ++++++++++++---- .../references/output-schema.md | 13 ++++++++++--- 10 files changed, 56 insertions(+), 24 deletions(-) diff --git a/docs/brownfield-workflow.adoc b/docs/brownfield-workflow.adoc index 428dba6..b451701 100644 --- a/docs/brownfield-workflow.adoc +++ b/docs/brownfield-workflow.adoc @@ -163,7 +163,7 @@ The LLM synthesizes the answered questions plus the code evidence from Phase 1 i * *arc42* with all 12 chapters from Q-3 branch * *Nygard ADRs* with Pugh Matrix from Q-3.9 branch -Every claim references a Question ID and marks team-provided information with `(team answer)`. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt. +Every claim references a Question ID. A code-derived claim also carries the `file:line` evidence from its `[ANSWERED]` leaf, copied into the citation so the source is visible without opening the Question Tree; team-provided information is marked `(team answer)`. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt. === Establish Baseline Tests @@ -261,7 +261,7 @@ Stable code that nobody touches does not need specs. |{empty}-- |Theory Recovery (Phase 2) -|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID. Mark team input with (team answer).` +|`Synthesize documentation from the Question Tree and team answers. 
Every claim references a Q-ID; for code-derived claims, also cite the file:line evidence from the ANSWERED leaf. Mark team input with (team answer).` |link:#/spec-driven-development[Spec-Driven Workflow] |Baseline Tests diff --git a/docs/brownfield-workflow.de.adoc b/docs/brownfield-workflow.de.adoc index d82e728..22043e8 100644 --- a/docs/brownfield-workflow.de.adoc +++ b/docs/brownfield-workflow.de.adoc @@ -161,7 +161,7 @@ Das LLM synthetisiert die beantworteten Fragen plus die Code-Evidenz aus Phase 1 * *arc42* mit allen 12 Kapiteln aus dem Q3-Ast * *Nygard-ADRs* mit Pugh-Matrix aus dem Q3.9-Ast -Jede Aussage referenziert eine Question-ID und markiert teamgegebene Information mit `(team answer)`. Diese doppelte Rückverfolgbarkeit (Code-Evidenz + Team-Input) ist der entscheidende Unterschied zu einem einfachen Reverse-Engineering-Prompt. +Jede Aussage referenziert eine Question-ID. Eine code-basierte Aussage trägt zusätzlich die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf — in die Zitation kopiert, damit die Quelle sichtbar ist, ohne den Question Tree zu öffnen; teamgegebene Information wird mit `(team answer)` markiert. Diese doppelte Rückverfolgbarkeit (Code-Evidenz + Team-Input) ist der entscheidende Unterschied zu einem einfachen Reverse-Engineering-Prompt. === Basis-Tests aufbauen @@ -259,7 +259,7 @@ Stabiler Code, den niemand anfasst, braucht keine Specs. |{empty}-- |Theory Recovery (Phase 2) -|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID. Mark team input with (team answer).` +|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID; for code-derived claims, also cite the file:line evidence from the ANSWERED leaf. 
Mark team input with (team answer).` |link:#/spec-driven-development[Spec-Driven Workflow] |Basis-Tests diff --git a/docs/socratic-recovery-skill.adoc b/docs/socratic-recovery-skill.adoc index 7700d23..8686ee4 100644 --- a/docs/socratic-recovery-skill.adoc +++ b/docs/socratic-recovery-skill.adoc @@ -22,7 +22,7 @@ Outputs two AsciiDoc files: `QUESTION_TREE.adoc` (full reasoning trace) and `OPE === Phase 2 — Synthesize documentation -The skill takes the answered tree and produces a PRD, Cockburn use cases, an arc42 architecture document, and Nygard ADRs with Pugh matrices. Every claim cites a Q-ID; team-supplied facts are marked `(team answer)`. +The skill takes the answered tree and produces a PRD, Cockburn use cases, an arc42 architecture document, and Nygard ADRs with Pugh matrices. Every claim cites a Q-ID; code-derived claims also carry the `file:line` evidence from their `[ANSWERED]` leaf, and team-supplied facts are marked `(team answer)`. == When to use it @@ -105,7 +105,8 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the Build a Question Tree before writing any documentation. Mark each leaf [ANSWERED] (with file:line evidence) or [OPEN] (with Category and Ask role). Synthesize docs from the answered tree only after the team has filled in -the OPEN leaves. Cite Q-IDs in every claim. +the OPEN leaves. Cite a Q-ID in every claim; for code-derived claims, +also cite the file:line evidence from the ANSWERED leaf. 
---- === link:https://docs.cursor.com/[Cursor] diff --git a/docs/socratic-recovery-skill.de.adoc b/docs/socratic-recovery-skill.de.adoc index 3adb01f..1aec0b7 100644 --- a/docs/socratic-recovery-skill.de.adoc +++ b/docs/socratic-recovery-skill.de.adoc @@ -22,7 +22,7 @@ Output sind zwei AsciiDoc-Dateien: `QUESTION_TREE.adoc` (vollständige Begründu === Phase 2 — Dokumentation synthetisieren -Der Skill nimmt den beantworteten Baum und erzeugt ein PRD, Cockburn Use Cases, eine arc42-Architekturbeschreibung und Nygard-ADRs mit Pugh-Matrix. Jede Aussage zitiert eine Q-ID; team-gegebene Fakten sind mit `(team answer)` markiert. +Der Skill nimmt den beantworteten Baum und erzeugt ein PRD, Cockburn Use Cases, eine arc42-Architekturbeschreibung und Nygard-ADRs mit Pugh-Matrix. Jede Aussage zitiert eine Q-ID; code-basierte Aussagen tragen zusätzlich die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf, und team-gegebene Fakten sind mit `(team answer)` markiert. == Wann zu verwenden @@ -105,7 +105,8 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the Build a Question Tree before writing any documentation. Mark each leaf [ANSWERED] (with file:line evidence) or [OPEN] (with Category and Ask role). Synthesize docs from the answered tree only after the team has filled in -the OPEN leaves. Cite Q-IDs in every claim. +the OPEN leaves. Cite a Q-ID in every claim; for code-derived claims, +also cite the file:line evidence from the ANSWERED leaf. ---- === link:https://docs.cursor.com/[Cursor] diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md index 29b0bd9..ade6407 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md @@ -104,7 +104,7 @@ Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). 
The Phase 2 - **arc42** with all 12 chapters from the Q3 branch - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch -Every claim references a Q-ID. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. +Every claim references a Q-ID. A code-derived claim also carries the `file:line` evidence from its `[ANSWERED]` leaf, copied into the citation so the source is visible without opening the Question Tree. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. ## What the LLM can and cannot recover diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md index a875d16..755a815 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md @@ -50,10 +50,18 @@ Produce four artifacts: - Anchor: ADR according to Nygard Rules for traceability: -- Every paragraph references the Q-IDs that support it, in square brackets: - "The system uses Hexagonal Architecture [Q3.5]." -- Team-supplied facts get an inline marker: "Sessions expire after 24 hours - (team answer, Q3.4.2)." +- Every paragraph references the Q-IDs that support it, in square brackets. +- For a claim backed by an [ANSWERED] leaf, carry the code evidence from + that leaf into the citation alongside the Q-ID — so the reader sees the + source location without opening the Question Tree: + "The system uses Hexagonal Architecture [Q3.5; src/app/Ports.java, + src/adapter/JpaOrderRepository.java:30]." 
+ Copy the Evidence line verbatim from the leaf; do not invent, shorten, + or re-derive file paths. If a leaf has no Evidence line it is not + [ANSWERED] and must not be cited as fact. +- Team-supplied facts have no code evidence — mark them (team answer) + with the Q-ID only: "Sessions expire after 24 hours (team answer, + Q3.4.2)." - Deferred questions stay as explicit gaps: "Quality-goal priorities are deferred (Q4.1.deferred) and must be resolved before the next release." - Do not introduce facts that do not appear in QUESTION_TREE.adoc or diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md index 5b2e600..9b93c33 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md @@ -141,13 +141,20 @@ _(write here)_ ## Phase 2 traceability -After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID: +After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID. For a claim backed by an `[ANSWERED]` leaf, the citation also carries the code evidence copied from that leaf, so the reader sees the source location without opening the Question Tree: ``` -The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture]. Sessions +The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture; +src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours (team answer, Q3.8.Security.SessionLifetime). Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved before the next release. ``` -This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention. 
+The three citation forms are deliberate: + +- `[Q-ID; file:line, ...]` — code-derived fact. The `file:line` part is the `Evidence` line of the `[ANSWERED]` leaf, copied verbatim. +- `(team answer, Q-ID)` — team-supplied fact. No code evidence exists; the Q-ID points to the answered `[OPEN]` leaf. +- `(Q-ID.deferred)` — an explicit gap, not a fact. + +This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention; a code-derived claim without its `file:line` evidence is incomplete. diff --git a/skill/socratic-code-theory-recovery/SKILL.md b/skill/socratic-code-theory-recovery/SKILL.md index 29b0bd9..ade6407 100644 --- a/skill/socratic-code-theory-recovery/SKILL.md +++ b/skill/socratic-code-theory-recovery/SKILL.md @@ -104,7 +104,7 @@ Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 - **arc42** with all 12 chapters from the Q3 branch - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch -Every claim references a Q-ID. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. +Every claim references a Q-ID. A code-derived claim also carries the `file:line` evidence from its `[ANSWERED]` leaf, copied into the citation so the source is visible without opening the Question Tree. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. 
## What the LLM can and cannot recover diff --git a/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md b/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md index a875d16..755a815 100644 --- a/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md +++ b/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md @@ -50,10 +50,18 @@ Produce four artifacts: - Anchor: ADR according to Nygard Rules for traceability: -- Every paragraph references the Q-IDs that support it, in square brackets: - "The system uses Hexagonal Architecture [Q3.5]." -- Team-supplied facts get an inline marker: "Sessions expire after 24 hours - (team answer, Q3.4.2)." +- Every paragraph references the Q-IDs that support it, in square brackets. +- For a claim backed by an [ANSWERED] leaf, carry the code evidence from + that leaf into the citation alongside the Q-ID — so the reader sees the + source location without opening the Question Tree: + "The system uses Hexagonal Architecture [Q3.5; src/app/Ports.java, + src/adapter/JpaOrderRepository.java:30]." + Copy the Evidence line verbatim from the leaf; do not invent, shorten, + or re-derive file paths. If a leaf has no Evidence line it is not + [ANSWERED] and must not be cited as fact. +- Team-supplied facts have no code evidence — mark them (team answer) + with the Q-ID only: "Sessions expire after 24 hours (team answer, + Q3.4.2)." - Deferred questions stay as explicit gaps: "Quality-goal priorities are deferred (Q4.1.deferred) and must be resolved before the next release." 
- Do not introduce facts that do not appear in QUESTION_TREE.adoc or diff --git a/skill/socratic-code-theory-recovery/references/output-schema.md b/skill/socratic-code-theory-recovery/references/output-schema.md index 5b2e600..9b93c33 100644 --- a/skill/socratic-code-theory-recovery/references/output-schema.md +++ b/skill/socratic-code-theory-recovery/references/output-schema.md @@ -141,13 +141,20 @@ _(write here)_ ## Phase 2 traceability -After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID: +After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID. For a claim backed by an `[ANSWERED]` leaf, the citation also carries the code evidence copied from that leaf, so the reader sees the source location without opening the Question Tree: ``` -The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture]. Sessions +The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture; +src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours (team answer, Q3.8.Security.SessionLifetime). Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved before the next release. ``` -This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention. +The three citation forms are deliberate: + +- `[Q-ID; file:line, ...]` — code-derived fact. The `file:line` part is the `Evidence` line of the `[ANSWERED]` leaf, copied verbatim. +- `(team answer, Q-ID)` — team-supplied fact. No code evidence exists; the Q-ID points to the answered `[OPEN]` leaf. +- `(Q-ID.deferred)` — an explicit gap, not a fact. + +This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention; a code-derived claim without its `file:line` evidence is incomplete. 
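
Illustration only, not part of the patch: the three citation forms that this commit (1/2) describes can be sketched as a small rendering function. The `Leaf` class, its field names, and the status strings are hypothetical; the skill defines the forms in prose and does not ship any such code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    """Hypothetical in-memory view of one Question Tree leaf."""
    q_id: str                        # e.g. "Q3.5"
    status: str                      # "ANSWERED", "OPEN" (answered by the team), or "DEFERRED"
    evidence: Optional[str] = None   # Evidence line; only [ANSWERED] leaves carry one


def cite(leaf: Leaf) -> str:
    """Render the citation for a claim, following the three forms of patch 1/2."""
    if leaf.status == "ANSWERED":
        if not leaf.evidence:
            # The prompt states that a leaf without an Evidence line is not
            # [ANSWERED] and must not be cited as fact.
            raise ValueError(f"{leaf.q_id}: [ANSWERED] leaf has no Evidence line")
        # Evidence is copied verbatim, never shortened or re-derived.
        return f"[{leaf.q_id}; {leaf.evidence}]"
    if leaf.status == "OPEN":
        return f"(team answer, {leaf.q_id})"
    if leaf.status == "DEFERRED":
        return f"({leaf.q_id}.deferred)"
    raise ValueError(f"{leaf.q_id}: unknown status {leaf.status!r}")
```

For example, `cite(Leaf("Q3.5", "ANSWERED", "src/app/Ports.java:12"))` yields `[Q3.5; src/app/Ports.java:12]`, the form quoted in the commit message.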
From 7e79cefaca05f731fc0a2864d67e04a0bcb186d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?R=7BAI=7Df=20D=2E=20M=C3=BCller?=
Date: Sun, 17 May 2026 13:09:08 +0200
Subject: [PATCH 2/2] fix(socratic-recovery): keep Q-IDs out of synthesized docs, cite code only
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The earlier change in this PR put both the Q-ID and the file:line evidence
into the final documentation. But the Question Tree is temporary
scaffolding — it is renumbered on every Phase 1 re-run — so a Q-ID baked
into permanent documentation is a reference into a discarded, renumbered
artifact: dead at best, wrong at worst.

The synthesized documentation must be self-contained. It cites only
durable references:

- code-derived claim: the file:line evidence, copied verbatim from the
  [ANSWERED] leaf — a pointer at the code, the only canonical artifact.
- team-supplied fact: marked (team answer), no external reference needed.
- deferred question: an explicit gap.

The Q-ID stays a Phase 2 build-time device: during synthesis every claim
must trace back to a leaf, but the Q-ID is not emitted into the output.
This also makes the re-run/diff workflow more robust — claims correlate by
code location, not by Q-IDs that change every run.

Updates the skill (phase-2-synthesize, output-schema, SKILL.md incl. the
workflow diagram) and the website brownfield-workflow /
socratic-recovery-skill docs, EN and DE.

Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/brownfield-workflow.adoc | 4 +-- docs/brownfield-workflow.de.adoc | 4 +-- docs/socratic-recovery-skill.adoc | 15 ++++++----- docs/socratic-recovery-skill.de.adoc | 15 ++++++----- .../socratic-code-theory-recovery/SKILL.md | 4 +-- .../prompts/phase-2-synthesize.md | 25 +++++++++++-------- .../references/output-schema.md | 19 +++++++------- skill/socratic-code-theory-recovery/SKILL.md | 4 +-- .../prompts/phase-2-synthesize.md | 25 +++++++++++-------- .../references/output-schema.md | 19 +++++++------- 10 files changed, 72 insertions(+), 62 deletions(-) diff --git a/docs/brownfield-workflow.adoc b/docs/brownfield-workflow.adoc index b451701..13b8fbf 100644 --- a/docs/brownfield-workflow.adoc +++ b/docs/brownfield-workflow.adoc @@ -163,7 +163,7 @@ The LLM synthesizes the answered questions plus the code evidence from Phase 1 i * *arc42* with all 12 chapters from Q-3 branch * *Nygard ADRs* with Pugh Matrix from Q-3.9 branch -Every claim references a Question ID. A code-derived claim also carries the `file:line` evidence from its `[ANSWERED]` leaf, copied into the citation so the source is visible without opening the Question Tree; team-provided information is marked `(team answer)`. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt. +Code-derived claims carry the `file:line` evidence from their `[ANSWERED]` leaf — a reference to the code, the only durable artifact; team-provided information is marked `(team answer)`. The Question Tree is temporary scaffolding, so Q-IDs are not written into the final documents — during synthesis every claim is traced back to a leaf as a build-time check. This dual traceability (code evidence + team input) is the key difference from a simple reverse-engineering prompt. === Establish Baseline Tests @@ -261,7 +261,7 @@ Stable code that nobody touches does not need specs. 
|{empty}-- |Theory Recovery (Phase 2) -|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID; for code-derived claims, also cite the file:line evidence from the ANSWERED leaf. Mark team input with (team answer).` +|`Synthesize self-contained documentation from the Question Tree and team answers. Cite file:line evidence for code-derived claims, mark team input with (team answer), keep deferred questions as explicit gaps. Q-IDs stay out of the output.` |link:#/spec-driven-development[Spec-Driven Workflow] |Baseline Tests diff --git a/docs/brownfield-workflow.de.adoc b/docs/brownfield-workflow.de.adoc index 22043e8..6d38eea 100644 --- a/docs/brownfield-workflow.de.adoc +++ b/docs/brownfield-workflow.de.adoc @@ -161,7 +161,7 @@ Das LLM synthetisiert die beantworteten Fragen plus die Code-Evidenz aus Phase 1 * *arc42* mit allen 12 Kapiteln aus dem Q3-Ast * *Nygard-ADRs* mit Pugh-Matrix aus dem Q3.9-Ast -Jede Aussage referenziert eine Question-ID. Eine code-basierte Aussage trägt zusätzlich die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf — in die Zitation kopiert, damit die Quelle sichtbar ist, ohne den Question Tree zu öffnen; teamgegebene Information wird mit `(team answer)` markiert. Diese doppelte Rückverfolgbarkeit (Code-Evidenz + Team-Input) ist der entscheidende Unterschied zu einem einfachen Reverse-Engineering-Prompt. +Code-basierte Aussagen tragen die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf — eine Referenz auf den Code, das einzige dauerhafte Artefakt; teamgegebene Information wird mit `(team answer)` markiert. Der Question Tree ist temporäres Gerüst, daher landen Q-IDs nicht in den finalen Dokumenten — beim Synthetisieren wird jede Aussage als Build-Time-Prüfung auf ein Leaf zurückgeführt. Diese doppelte Rückverfolgbarkeit (Code-Evidenz + Team-Input) ist der entscheidende Unterschied zu einem einfachen Reverse-Engineering-Prompt. 
=== Basis-Tests aufbauen @@ -259,7 +259,7 @@ Stabiler Code, den niemand anfasst, braucht keine Specs. |{empty}-- |Theory Recovery (Phase 2) -|`Synthesize documentation from the Question Tree and team answers. Every claim references a Q-ID; for code-derived claims, also cite the file:line evidence from the ANSWERED leaf. Mark team input with (team answer).` +|`Synthesize self-contained documentation from the Question Tree and team answers. Cite file:line evidence for code-derived claims, mark team input with (team answer), keep deferred questions as explicit gaps. Q-IDs stay out of the output.` |link:#/spec-driven-development[Spec-Driven Workflow] |Basis-Tests diff --git a/docs/socratic-recovery-skill.adoc b/docs/socratic-recovery-skill.adoc index 8686ee4..0ee80b0 100644 --- a/docs/socratic-recovery-skill.adoc +++ b/docs/socratic-recovery-skill.adoc @@ -22,7 +22,7 @@ Outputs two AsciiDoc files: `QUESTION_TREE.adoc` (full reasoning trace) and `OPE === Phase 2 — Synthesize documentation -The skill takes the answered tree and produces a PRD, Cockburn use cases, an arc42 architecture document, and Nygard ADRs with Pugh matrices. Every claim cites a Q-ID; code-derived claims also carry the `file:line` evidence from their `[ANSWERED]` leaf, and team-supplied facts are marked `(team answer)`. +The skill takes the answered tree and produces a PRD, Cockburn use cases, an arc42 architecture document, and Nygard ADRs with Pugh matrices. Code-derived claims cite the `file:line` evidence from their `[ANSWERED]` leaf, and team-supplied facts are marked `(team answer)`. The Question Tree is temporary scaffolding, so Q-IDs stay out of the final documents. 
== When to use it @@ -87,7 +87,8 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the The skill enforces a two-phase workflow: build a Question Tree first ([ANSWERED] with code evidence vs [OPEN] with role), let the team answer -the OPEN leaves, then synthesize documentation with full Q-ID traceability. +the OPEN leaves, then synthesize self-contained documentation that traces +every claim to code evidence or a team answer. ---- === link:https://github.com/google-gemini/gemini-cli[Gemini CLI] @@ -105,8 +106,9 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the Build a Question Tree before writing any documentation. Mark each leaf [ANSWERED] (with file:line evidence) or [OPEN] (with Category and Ask role). Synthesize docs from the answered tree only after the team has filled in -the OPEN leaves. Cite a Q-ID in every claim; for code-derived claims, -also cite the file:line evidence from the ANSWERED leaf. +the OPEN leaves. The docs must be self-contained: cite file:line evidence +for code-derived claims, mark team input with (team answer). Q-IDs stay +out of the output. ---- === link:https://docs.cursor.com/[Cursor] @@ -139,8 +141,9 @@ Recovery workflow at https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-theory-recovery Two phases: first a Question Tree separating code-derivable facts from -open questions routed by role; second, synthesis with Q-ID traceability -after the team fills the gaps. +open questions routed by role; second, synthesis into self-contained +documentation — code-evidenced or team-answered — after the team fills +the gaps. 
---- === link:https://kiro.dev/[Amazon Kiro] diff --git a/docs/socratic-recovery-skill.de.adoc b/docs/socratic-recovery-skill.de.adoc index 1aec0b7..65c9ad5 100644 --- a/docs/socratic-recovery-skill.de.adoc +++ b/docs/socratic-recovery-skill.de.adoc @@ -22,7 +22,7 @@ Output sind zwei AsciiDoc-Dateien: `QUESTION_TREE.adoc` (vollständige Begründu === Phase 2 — Dokumentation synthetisieren -Der Skill nimmt den beantworteten Baum und erzeugt ein PRD, Cockburn Use Cases, eine arc42-Architekturbeschreibung und Nygard-ADRs mit Pugh-Matrix. Jede Aussage zitiert eine Q-ID; code-basierte Aussagen tragen zusätzlich die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf, und team-gegebene Fakten sind mit `(team answer)` markiert. +Der Skill nimmt den beantworteten Baum und erzeugt ein PRD, Cockburn Use Cases, eine arc42-Architekturbeschreibung und Nygard-ADRs mit Pugh-Matrix. Code-basierte Aussagen zitieren die `file:line`-Evidenz aus ihrem `[ANSWERED]`-Leaf, team-gegebene Fakten sind mit `(team answer)` markiert. Der Question Tree ist temporäres Gerüst, daher landen Q-IDs nicht in den finalen Dokumenten. == Wann zu verwenden @@ -87,7 +87,8 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the The skill enforces a two-phase workflow: build a Question Tree first ([ANSWERED] with code evidence vs [OPEN] with role), let the team answer -the OPEN leaves, then synthesize documentation with full Q-ID traceability. +the OPEN leaves, then synthesize self-contained documentation that traces +every claim to code evidence or a team answer. ---- === link:https://github.com/google-gemini/gemini-cli[Gemini CLI] @@ -105,8 +106,9 @@ https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-the Build a Question Tree before writing any documentation. Mark each leaf [ANSWERED] (with file:line evidence) or [OPEN] (with Category and Ask role). Synthesize docs from the answered tree only after the team has filled in -the OPEN leaves. 
Cite a Q-ID in every claim; for code-derived claims, -also cite the file:line evidence from the ANSWERED leaf. +the OPEN leaves. The docs must be self-contained: cite file:line evidence +for code-derived claims, mark team input with (team answer). Q-IDs stay +out of the output. ---- === link:https://docs.cursor.com/[Cursor] @@ -139,8 +141,9 @@ Recovery workflow at https://github.com/LLM-Coding/Semantic-Anchors/tree/main/skill/socratic-code-theory-recovery Two phases: first a Question Tree separating code-derivable facts from -open questions routed by role; second, synthesis with Q-ID traceability -after the team fills the gaps. +open questions routed by role; second, synthesis into self-contained +documentation — code-evidenced or team-answered — after the team fills +the gaps. ---- === link:https://kiro.dev/[Amazon Kiro] diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md index ade6407..c1ef585 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md @@ -65,7 +65,7 @@ The fix: model the gaps explicitly. Every question about the system is either `[ ┌────────────────────────────────┐ Phase 2 │ Answered tree ──► Docs │ │ PRD · Cockburn UCs · arc42 · │ - │ Nygard ADRs (every claim Q-ID) │ + │ Nygard ADRs (claims cite code) │ └────────────────────────────────┘ ``` @@ -104,7 +104,7 @@ Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 - **arc42** with all 12 chapters from the Q3 branch - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch -Every claim references a Q-ID. A code-derived claim also carries the `file:line` evidence from its `[ANSWERED]` leaf, copied into the citation so the source is visible without opening the Question Tree. Team-supplied information is marked `(team answer)`. 
This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. +Code-derived claims cite the `file:line` evidence from their `[ANSWERED]` leaf — a reference to the code, the only durable, canonical artifact. Team-supplied information is marked `(team answer)`. The Question Tree is temporary scaffolding, so its Q-IDs are not written into the final documents; during synthesis every claim is still traced back to a leaf as a build-time check. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. ## What the LLM can and cannot recover diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md index 755a815..a8370a9 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md @@ -50,20 +50,23 @@ Produce four artifacts: - Anchor: ADR according to Nygard Rules for traceability: -- Every paragraph references the Q-IDs that support it, in square brackets. -- For a claim backed by an [ANSWERED] leaf, carry the code evidence from - that leaf into the citation alongside the Q-ID — so the reader sees the - source location without opening the Question Tree: - "The system uses Hexagonal Architecture [Q3.5; src/app/Ports.java, +- The synthesized documentation must be self-contained. The Question Tree + is temporary scaffolding — it is renumbered on every re-run — so Q-IDs + must NOT appear in the output. While synthesizing, trace every claim + back to a leaf: each claim must come from an [ANSWERED] leaf or an + answered [OPEN] leaf. This tracing is a build-time check, not something + written into the documents. 
+- A claim backed by an [ANSWERED] leaf cites the code evidence from that + leaf — the reference to the code, the only durable, canonical artifact: + "The system uses Hexagonal Architecture [src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]." Copy the Evidence line verbatim from the leaf; do not invent, shorten, - or re-derive file paths. If a leaf has no Evidence line it is not - [ANSWERED] and must not be cited as fact. -- Team-supplied facts have no code evidence — mark them (team answer) - with the Q-ID only: "Sessions expire after 24 hours (team answer, - Q3.4.2)." + or re-derive file paths. A leaf with no Evidence line is not [ANSWERED] + and must not be cited as fact. +- Team-supplied facts have no code evidence — mark them (team answer): + "Sessions expire after 24 hours (team answer)." - Deferred questions stay as explicit gaps: "Quality-goal priorities are - deferred (Q4.1.deferred) and must be resolved before the next release." + deferred and must be resolved before the next release." - Do not introduce facts that do not appear in QUESTION_TREE.adoc or OPEN_QUESTIONS.adoc. If a Section feels under-specified, leave it under-specified — that is signal, not a defect. diff --git a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md index 9b93c33..96e394a 100644 --- a/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md +++ b/plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md @@ -141,20 +141,19 @@ _(write here)_ ## Phase 2 traceability -After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID. 
For a claim backed by an `[ANSWERED]` leaf, the citation also carries the code evidence copied from that leaf, so the reader sees the source location without opening the Question Tree: +The synthesized documentation must be self-contained. The Question Tree is temporary scaffolding — it is renumbered on every re-run — so its Q-IDs are NOT carried into the final documents. During Phase 2, every claim is traced back to a leaf as a build-time check; what gets *written* is the durable reference only: ``` -The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture; -src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions -expire after 24 hours (team answer, Q3.8.Security.SessionLifetime). -Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved +The system uses Hexagonal Architecture [src/app/Ports.java, +src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours +(team answer). Quality-goal priorities are deferred and must be resolved before the next release. ``` -The three citation forms are deliberate: +The three forms are deliberate: -- `[Q-ID; file:line, ...]` — code-derived fact. The `file:line` part is the `Evidence` line of the `[ANSWERED]` leaf, copied verbatim. -- `(team answer, Q-ID)` — team-supplied fact. No code evidence exists; the Q-ID points to the answered `[OPEN]` leaf. -- `(Q-ID.deferred)` — an explicit gap, not a fact. +- `[file:line, ...]` — code-derived fact. Copied verbatim from the `Evidence` line of the `[ANSWERED]` leaf; it points at the code, the only canonical, persistent artifact. +- `(team answer)` — team-supplied fact. No code evidence exists; the marker tells the reader a human asserted this and it must be re-verified with a human, not derived from code. +- `deferred` — a known gap, stated explicitly, not a fact. -This is the auditable trace from documentation back to either code evidence or a team answer. 
Anything without a Q-ID is invention; a code-derived claim without its `file:line` evidence is incomplete. +This is the auditable trace: a code-derived claim without its `file:line` evidence is incomplete; a fact that is neither code-evidenced nor marked `(team answer)` is invention. diff --git a/skill/socratic-code-theory-recovery/SKILL.md b/skill/socratic-code-theory-recovery/SKILL.md index ade6407..c1ef585 100644 --- a/skill/socratic-code-theory-recovery/SKILL.md +++ b/skill/socratic-code-theory-recovery/SKILL.md @@ -65,7 +65,7 @@ The fix: model the gaps explicitly. Every question about the system is either `[ ┌────────────────────────────────┐ Phase 2 │ Answered tree ──► Docs │ │ PRD · Cockburn UCs · arc42 · │ - │ Nygard ADRs (every claim Q-ID) │ + │ Nygard ADRs (claims cite code) │ └────────────────────────────────┘ ``` @@ -104,7 +104,7 @@ Use [prompts/phase-2-synthesize.md](prompts/phase-2-synthesize.md). The Phase 2 - **arc42** with all 12 chapters from the Q3 branch - **Nygard ADRs** with Pugh Matrix from the Q3.9 branch -Every claim references a Q-ID. A code-derived claim also carries the `file:line` evidence from its `[ANSWERED]` leaf, copied into the citation so the source is visible without opening the Question Tree. Team-supplied information is marked `(team answer)`. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. +Code-derived claims cite the `file:line` evidence from their `[ANSWERED]` leaf — a reference to the code, the only durable, canonical artifact. Team-supplied information is marked `(team answer)`. The Question Tree is temporary scaffolding, so its Q-IDs are not written into the final documents; during synthesis every claim is still traced back to a leaf as a build-time check. This dual traceability — code evidence plus team input — is the difference from a simple reverse-engineering prompt that fills in gaps silently. 
## What the LLM can and cannot recover diff --git a/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md b/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md index 755a815..a8370a9 100644 --- a/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md +++ b/skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md @@ -50,20 +50,23 @@ Produce four artifacts: - Anchor: ADR according to Nygard Rules for traceability: -- Every paragraph references the Q-IDs that support it, in square brackets. -- For a claim backed by an [ANSWERED] leaf, carry the code evidence from - that leaf into the citation alongside the Q-ID — so the reader sees the - source location without opening the Question Tree: - "The system uses Hexagonal Architecture [Q3.5; src/app/Ports.java, +- The synthesized documentation must be self-contained. The Question Tree + is temporary scaffolding — it is renumbered on every re-run — so Q-IDs + must NOT appear in the output. While synthesizing, trace every claim + back to a leaf: each claim must come from an [ANSWERED] leaf or an + answered [OPEN] leaf. This tracing is a build-time check, not something + written into the documents. +- A claim backed by an [ANSWERED] leaf cites the code evidence from that + leaf — the reference to the code, the only durable, canonical artifact: + "The system uses Hexagonal Architecture [src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]." Copy the Evidence line verbatim from the leaf; do not invent, shorten, - or re-derive file paths. If a leaf has no Evidence line it is not - [ANSWERED] and must not be cited as fact. -- Team-supplied facts have no code evidence — mark them (team answer) - with the Q-ID only: "Sessions expire after 24 hours (team answer, - Q3.4.2)." + or re-derive file paths. A leaf with no Evidence line is not [ANSWERED] + and must not be cited as fact. 
+- Team-supplied facts have no code evidence — mark them (team answer): + "Sessions expire after 24 hours (team answer)." - Deferred questions stay as explicit gaps: "Quality-goal priorities are - deferred (Q4.1.deferred) and must be resolved before the next release." + deferred and must be resolved before the next release." - Do not introduce facts that do not appear in QUESTION_TREE.adoc or OPEN_QUESTIONS.adoc. If a Section feels under-specified, leave it under-specified — that is signal, not a defect. diff --git a/skill/socratic-code-theory-recovery/references/output-schema.md b/skill/socratic-code-theory-recovery/references/output-schema.md index 9b93c33..96e394a 100644 --- a/skill/socratic-code-theory-recovery/references/output-schema.md +++ b/skill/socratic-code-theory-recovery/references/output-schema.md @@ -141,20 +141,19 @@ _(write here)_ ## Phase 2 traceability -After Phase 2, every paragraph in the synthesized documentation cites at least one Q-ID. For a claim backed by an `[ANSWERED]` leaf, the citation also carries the code evidence copied from that leaf, so the reader sees the source location without opening the Question Tree: +The synthesized documentation must be self-contained. The Question Tree is temporary scaffolding — it is renumbered on every re-run — so its Q-IDs are NOT carried into the final documents. During Phase 2, every claim is traced back to a leaf as a build-time check; what gets *written* is the durable reference only: ``` -The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture; -src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions -expire after 24 hours (team answer, Q3.8.Security.SessionLifetime). -Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved +The system uses Hexagonal Architecture [src/app/Ports.java, +src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours +(team answer). Quality-goal priorities are deferred and must be resolved before the next release. 
``` -The three citation forms are deliberate: +The three forms are deliberate: -- `[Q-ID; file:line, ...]` — code-derived fact. The `file:line` part is the `Evidence` line of the `[ANSWERED]` leaf, copied verbatim. -- `(team answer, Q-ID)` — team-supplied fact. No code evidence exists; the Q-ID points to the answered `[OPEN]` leaf. -- `(Q-ID.deferred)` — an explicit gap, not a fact. +- `[file:line, ...]` — code-derived fact. Copied verbatim from the `Evidence` line of the `[ANSWERED]` leaf; it points at the code, the only canonical, persistent artifact. +- `(team answer)` — team-supplied fact. No code evidence exists; the marker tells the reader a human asserted this and it must be re-verified with a human, not derived from code. +- `deferred` — a known gap, stated explicitly, not a fact. -This is the auditable trace from documentation back to either code evidence or a team answer. Anything without a Q-ID is invention; a code-derived claim without its `file:line` evidence is incomplete. +This is the auditable trace: a code-derived claim without its `file:line` evidence is incomplete; a fact that is neither code-evidenced nor marked `(team answer)` is invention.
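The build-time check described in the patched `output-schema.md` could be approximated by a small lint over the synthesized sentences. The following is a minimal sketch, not part of the skill: the function name `classify`, the regular expressions, and the label strings are all hypothetical, and the Q-ID shape is only inferred from the examples in the removed lines (`Q3.5`, `Q3.4.2`, `Q4.1.deferred`).

```python
import re

# Assumption: Q-IDs look like Q3.5, Q3.4.2, Q4.1.deferred or Q-3.
# This shape is inferred from the old citation examples, not from a spec.
Q_ID = re.compile(r"\bQ-?\d+(?:\.\w+)*")

# Assumption: a code citation is a bracketed span that contains at least
# one file-like token such as Ports.java (optionally followed by :line).
CODE_CITATION = re.compile(r"\[[^\[\]]*\w\.\w+[^\[\]]*\]")

TEAM_ANSWER = "(team answer)"


def classify(sentence: str) -> str:
    """Return the traceability form of one synthesized sentence."""
    # A leftover Q-ID is checked first: it must not reach the output,
    # whatever other marker the sentence carries.
    if Q_ID.search(sentence):
        return "leaked-q-id"
    if CODE_CITATION.search(sentence):
        return "code"
    if TEAM_ANSWER in sentence:
        return "team"
    if "deferred" in sentence:
        return "deferred"
    # Neither code-evidenced, team-marked, nor an explicit gap.
    return "untraced"


if __name__ == "__main__":
    samples = [
        "The system uses Hexagonal Architecture [src/app/Ports.java, "
        "src/adapter/JpaOrderRepository.java:30].",
        "Sessions expire after 24 hours (team answer).",
        "Sessions expire after 24 hours (team answer, Q3.4.2).",
    ]
    for sample in samples:
        print(classify(sample), "<-", sample)
```

A sentence classified as `untraced` corresponds to what the schema calls invention, and `leaked-q-id` flags output still written in the pre-patch citation style. The sketch only inspects the surface form; it does not verify that the cited `file:line` matches the `Evidence` line of an `[ANSWERED]` leaf, which would require reading the Question Tree.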