fix(socratic-recovery): make synthesized docs self-contained, cite code not Q-IDs#491

Merged
rdmueller merged 2 commits into LLM-Coding:main from raifdmueller:fix/phase2-code-evidence-citations
May 17, 2026

@raifdmueller raifdmueller commented May 17, 2026

Summary

In the brownfield Socratic Code-Theory Recovery workflow, Phase 2 synthesized documentation cited the Q-ID (e.g. [Q3.5]). The Q-ID points into the Question Tree — which is temporary scaffolding, renumbered on every Phase 1 re-run. A Q-ID baked into permanent documentation is a reference into a discarded, renumbered artifact: dead at best, wrong after a re-run.

The synthesized documentation is now self-contained — it cites only durable references:

The system uses Hexagonal Architecture [src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions expire after 24 hours (team answer). Quality-goal priorities are deferred and must be resolved before the next release.

  • code-derived claim → file:line evidence, copied verbatim from the [ANSWERED] leaf — a pointer at the code, the only canonical, persistent artifact.
  • team-supplied fact → (team answer), no external reference needed.
  • deferred question → an explicit gap.

The Q-ID stays a Phase 2 build-time device: during synthesis every claim must trace back to a leaf, but the Q-ID is not emitted into the output. This also makes the re-run/diff workflow more robust — claims correlate by code location, not by Q-IDs that change every run.
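
The three citation rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Leaf` and `Kind` types and the `render` function are hypothetical names, not part of the skill, and the real synthesis is done by a prompt rather than by code. The point is that the Q-ID is available while rendering but never reaches the output.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    CODE = "code-derived"
    TEAM = "team-supplied"
    DEFERRED = "deferred"


@dataclass
class Leaf:
    """One Question Tree leaf (hypothetical shape, for illustration only)."""
    q_id: str    # build-time handle; used for error messages, never emitted
    kind: Kind
    claim: str   # claim sentence without its trailing period
    evidence: list[str] = field(default_factory=list)  # verbatim file:line entries


def render(leaf: Leaf) -> str:
    """Render one claim using only durable references."""
    if leaf.kind is Kind.CODE:
        if not leaf.evidence:
            # A code-derived claim without evidence cannot be cited at all.
            raise ValueError(f"{leaf.q_id}: code-derived claim without evidence")
        return f"{leaf.claim} [{', '.join(leaf.evidence)}]."
    if leaf.kind is Kind.TEAM:
        return f"{leaf.claim} (team answer)."
    # Deferred: the claim text itself states the gap, no reference is attached.
    return f"{leaf.claim}."


leaves = [
    Leaf("Q3.9", Kind.CODE, "The system uses Hexagonal Architecture",
         ["src/app/Ports.java", "src/adapter/JpaOrderRepository.java:30"]),
    Leaf("Q3.8", Kind.TEAM, "Sessions expire after 24 hours"),
    Leaf("Q4.0", Kind.DEFERRED,
         "Quality-goal priorities are deferred and must be resolved "
         "before the next release"),
]
text = " ".join(render(leaf) for leaf in leaves)
print(text)
```

With the sample leaves, the printed text reproduces the example paragraph from the summary, and no Q-ID appears in it.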

Changes

  • Skill: phase-2-synthesize.md, output-schema.md, SKILL.md (incl. the workflow diagram); plugin mirror auto-synced by the pre-commit hook.
  • Website docs: brownfield-workflow.adoc / .de.adoc (incl. the copy-paste Phase 2 cheat-sheet prompt), socratic-recovery-skill.adoc / .de.adoc.
  • The brownfield-fair-comparison experiment report is left as-is — it records a past experiment run.

Test plan

  • npm run build succeeds; brownfield + socratic-recovery-skill pages render
  • Phase 2 cheat-sheet prompt on the live brownfield page says Q-IDs stay out of the output
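
The "Q-IDs stay out of the output" rule could also be checked mechanically on synthesized documentation. The sketch below is not part of this PR; `find_q_ids` is a hypothetical helper, and the regular expression assumes Q-IDs look like the ones quoted in this thread (`Q3.5`, `Q3.8.Security.SessionLifetime`).

```python
import re

# Assumed Q-ID shape: "Q", digits, ".", digits, then optional dotted name
# segments (e.g. Q3.5 or Q3.8.Security.SessionLifetime).
Q_ID = re.compile(r"\bQ\d+\.\d+(?:\.[A-Za-z]\w*)*")


def find_q_ids(text: str) -> list[str]:
    """Return every Q-ID-like token found in synthesized documentation."""
    return Q_ID.findall(text)


old_style = ("The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture; "
             "src/app/Ports.java]. Sessions expire after 24 hours "
             "(team answer, Q3.8.Security.SessionLifetime).")
new_style = ("The system uses Hexagonal Architecture [src/app/Ports.java]. "
             "Sessions expire after 24 hours (team answer).")

print(find_q_ids(old_style))  # Q-IDs leaked into the output
print(find_q_ids(new_style))  # nothing to report
```

An empty result means the document cites only code locations and team answers.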

🤖 Generated with Claude Code

Phase 2 synthesized documentation cited only the Q-ID (e.g. [Q3.5]).
The file:line code evidence sits in the [ANSWERED] leaf of the Question
Tree, so a reader had to open the tree to find where a claim came from.

Phase 2 now copies the Evidence line from the [ANSWERED] leaf into the
citation alongside the Q-ID — [Q3.5; src/app/Ports.java:12] — so the
source location is visible in the documentation itself. Team answers
keep the (team answer, Q-ID) form (no code evidence exists); deferred
questions stay explicit gaps.

Updated the skill (phase-2-synthesize prompt, output-schema, SKILL.md)
and the website brownfield-workflow / socratic-recovery-skill docs
including the copy-paste Phase 2 cheat-sheet prompt, EN and DE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented May 17, 2026

Walkthrough

The pull request consistently updates the documentation and prompts for the Socratic Code-Theory Recovery skill to define the Phase 2 synthesis rules more precisely. All changes describe the same traceability requirement: code-derived claims must cite Q-IDs and include verbatim file:line evidence from [ANSWERED] leaves; team inputs are to be marked as (team answer).

Changes

Phase 2 traceability requirements

| Layer / File(s) | Summary |
| --- | --- |
| Brownfield workflow documentation: docs/brownfield-workflow.adoc, docs/brownfield-workflow.de.adoc | The Phase 2 description and prompt cheat sheet, in both the English and German versions, are updated to clarify that code-based claims must carry, in addition to the Q-ID, the file:line evidence from the corresponding [ANSWERED] leaf; team inputs remain marked as (team answer). |
| Socratic recovery skill documentation: docs/socratic-recovery-skill.adoc, docs/socratic-recovery-skill.de.adoc | The English and German skill documentation tightens the traceability requirements: the Phase 2 synthesis description and the Codex setup instructions require a Q-ID citation in every claim plus file:line evidence from [ANSWERED] leaves for code-based statements. |
| Plugin skill implementation, Phase 2 made concrete: plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md, plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md, plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md | Detailed specification of Phase 2 traceability: SKILL.md clarifies the code-evidence requirement; the Phase 2 synthesis prompt is extended with rules for copying evidence verbatim, claims without evidence must not be cited, and team answers get a formalized marker; the output schema defines three permitted citation formats (code-derived with evidence, team-supplied without evidence, deferred gaps). |
| Standalone skill implementation, Phase 2 made concrete: skill/socratic-code-theory-recovery/SKILL.md, skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md, skill/socratic-code-theory-recovery/references/output-schema.md | Same updates as the plugin version: SKILL.md documentation, the Phase 2 synthesis prompt with tightened traceability rules and detailed evidence handling, and the output schema with explicitly defined citation formats and a requirement for auditable traces. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • LLM-Coding/Semantic-Anchors#455: Also modifies the brownfield workflow documentation on Phase 2 traceability rules, focusing on Q-ID and evidence requirements.
  • LLM-Coding/Semantic-Anchors#478: Updates the Socratic Code-Theory Recovery skill's Phase 2 prompts and output schema with identical traceability refinements.
  • LLM-Coding/Semantic-Anchors#479: Overlaps on Phase 2 traceability documentation and prompts, focusing on Q-ID citation and verbatim [ANSWERED] file:line evidence.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title refers to a real aspect of the changes (code evidence in Phase 2 citations) but is too vague and partly inaccurate for the main change. | The title should be more precise: e.g. 'fix(socratic-recovery): copy code evidence into Phase 2 citations' would be more descriptive and would state the central change more clearly. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md`:
- Around line 146-152: The fenced code block in the markdown lacks a language
specifier (triggering MD040); update the opening fence from ``` to include a
language (e.g., ```text) so the block is explicitly declared (modify the fenced
block that contains "The system uses Hexagonal Architecture ..." to start with
```text).

In `@skill/socratic-code-theory-recovery/references/output-schema.md`:
- Around line 146-152: The Markdown fenced code block in the section containing
the text "The system uses Hexagonal Architecture..." lacks a language specifier,
which triggers MD040; fix: add a language (e.g. "text") immediately after the
three backticks so the fence reads ```text and the block is tagged correctly;
find the triple-backtick fence around the paragraph (the existing ``` without a
language) and replace it with ```text (keep the inner content unchanged).
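
The MD040 fix requested in both inline comments can be sketched as a small script. This is an illustrative helper, not something the PR or CodeRabbit ships: `tag_bare_fences` is a hypothetical name, and it assumes fences are always three backticks and that every fence line encountered inside a block closes it.

```python
FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def tag_bare_fences(markdown: str, lang: str = "text") -> str:
    """Add `lang` to opening fences that declare no language."""
    out = []
    inside = False  # True while between an opening and a closing fence
    for line in markdown.split("\n"):
        stripped = line.strip()
        if stripped.startswith(FENCE):
            if not inside and stripped == FENCE:
                # Bare opening fence: append the language, keep indentation.
                line = line.rstrip() + lang
            inside = not inside
        out.append(line)
    return "\n".join(out)


sample = "\n".join([
    "Example:",
    FENCE,
    "The system uses Hexagonal Architecture ...",
    FENCE,
    "",
])
print(tag_bare_fences(sample))
```

Fences that already declare a language are left unchanged, so running the helper twice gives the same result as running it once.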

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: b2cd45b7-6202-4ab5-9ec5-a8ce87328a25

📥 Commits

Reviewing files that changed from the base of the PR and between e9b2066 and 16e6dd8.

📒 Files selected for processing (10)
  • docs/brownfield-workflow.adoc
  • docs/brownfield-workflow.de.adoc
  • docs/socratic-recovery-skill.adoc
  • docs/socratic-recovery-skill.de.adoc
  • plugins/semantic-anchors/skills/socratic-code-theory-recovery/SKILL.md
  • plugins/semantic-anchors/skills/socratic-code-theory-recovery/prompts/phase-2-synthesize.md
  • plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md
  • skill/socratic-code-theory-recovery/SKILL.md
  • skill/socratic-code-theory-recovery/prompts/phase-2-synthesize.md
  • skill/socratic-code-theory-recovery/references/output-schema.md

Comment on lines 146 to 152
```
-The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture]. Sessions
+The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture;
+src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions
 expire after 24 hours (team answer, Q3.8.Security.SessionLifetime).
 Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved
 before the next release.
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Declare a language on the fenced code block

The code block should declare a language so that MD040 no longer fires.

Suggested fix
-```
+```text
 The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture;
 src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions
 expire after 24 hours (team answer, Q3.8.Security.SessionLifetime).
 Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved
 before the next release.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 146-146: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
@plugins/semantic-anchors/skills/socratic-code-theory-recovery/references/output-schema.md
around lines 146 - 152, the fenced code block in the markdown lacks a language
specifier (triggering MD040); update the opening fence from ``` to include a
language (e.g., ```text) so the block is explicitly declared (modify the fenced
block that contains "The system uses Hexagonal Architecture ..." to start
with ```text).

Comment on lines 146 to 152
```
-The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture]. Sessions
+The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture;
+src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions
 expire after 24 hours (team answer, Q3.8.Security.SessionLifetime).
 Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved
 before the next release.
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Declare a language on the fenced code block

The fence is missing a language specifier here; that triggers MD040.

Suggested fix
-```
+```text
 The system uses Hexagonal Architecture [Q3.9.HexagonalArchitecture;
 src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. Sessions
 expire after 24 hours (team answer, Q3.8.Security.SessionLifetime).
 Quality-goal priorities are deferred (Q4.0.deferred) and must be resolved
 before the next release.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 146-146: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @skill/socratic-code-theory-recovery/references/output-schema.md around lines
146 - 152, the Markdown fenced code block in the section containing the text
"The system uses Hexagonal Architecture..." lacks a language specifier, which
triggers MD040; fix: add a language (e.g. "text") immediately after the three
backticks so the fence reads ```text and the block is tagged correctly; find the
triple-backtick fence around the paragraph (the existing ``` without a language)
and replace it with ```text (keep the inner content unchanged).

… only

The earlier change in this PR put both the Q-ID and the file:line
evidence into the final documentation. But the Question Tree is
temporary scaffolding — it is renumbered on every Phase 1 re-run — so a
Q-ID baked into permanent documentation is a reference into a discarded,
renumbered artifact: dead at best, wrong at worst.

The synthesized documentation must be self-contained. It cites only
durable references:
- code-derived claim: the file:line evidence, copied verbatim from the
  [ANSWERED] leaf — a pointer at the code, the only canonical artifact.
- team-supplied fact: marked (team answer), no external reference needed.
- deferred question: an explicit gap.

The Q-ID stays a Phase 2 build-time device: during synthesis every
claim must trace back to a leaf, but the Q-ID is not emitted into the
output. This also makes the re-run/diff workflow more robust — claims
correlate by code location, not by Q-IDs that change every run.
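
How claims might be correlated by code location across two synthesis runs is sketched below. The commit does not prescribe a diff tool; `claims_by_location` and `diff_runs` are hypothetical helpers, and the code assumes each code-derived sentence ends with a bracketed, comma-separated list of locations as in the example output.

```python
import re

# Assumed citation shape: [path, path:line, ...] inside one sentence.
CITATION = re.compile(r"\[([^\[\]]+)\]")


def claims_by_location(doc: str) -> dict[frozenset[str], str]:
    """Map each cited set of code locations to the sentence that cites it."""
    claims = {}
    # Split after a period that is followed by whitespace; periods inside
    # file names such as Ports.java are not followed by whitespace.
    for sentence in re.split(r"(?<=\.)\s+", doc.strip()):
        match = CITATION.search(sentence)
        if match:
            key = frozenset(p.strip() for p in match.group(1).split(","))
            claims[key] = sentence
    return claims


def diff_runs(old: str, new: str) -> dict[str, list[str]]:
    """Compare two runs by cited locations instead of by Q-ID."""
    a, b = claims_by_location(old), claims_by_location(new)
    return {
        "added": [b[k] for k in b.keys() - a.keys()],
        "removed": [a[k] for k in a.keys() - b.keys()],
        "reworded": [b[k] for k in a.keys() & b.keys() if a[k] != b[k]],
    }


run1 = ("The system uses Hexagonal Architecture "
        "[src/app/Ports.java, src/adapter/JpaOrderRepository.java:30]. "
        "Sessions expire after 24 hours (team answer).")
run2 = ("The system follows a ports-and-adapters layout "
        "[src/adapter/JpaOrderRepository.java:30, src/app/Ports.java]. "
        "Orders are persisted via JPA [src/adapter/JpaOrderRepository.java:12].")
result = diff_runs(run1, run2)
print(result)
```

Because the key is the set of cited locations, the first claim is recognised as the same claim in both runs even though its wording and the order of its citations changed. Line numbers are part of the key here, so a claim whose evidence moved within a file shows up as removed plus added; a real tool would probably need a looser match.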

Updates the skill (phase-2-synthesize, output-schema, SKILL.md incl.
the workflow diagram) and the website brownfield-workflow /
socratic-recovery-skill docs, EN and DE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@raifdmueller raifdmueller changed the title fix(socratic-recovery): carry code evidence into Phase 2 citations fix(socratic-recovery): make synthesized docs self-contained, cite code not Q-IDs May 17, 2026
@rdmueller rdmueller merged commit 00aa614 into LLM-Coding:main May 17, 2026
7 checks passed