diff --git a/.agents/plugins/marketplace.json b/.agents/plugins/marketplace.json index dd9982d..e3a7d03 100644 --- a/.agents/plugins/marketplace.json +++ b/.agents/plugins/marketplace.json @@ -1,5 +1,5 @@ { - "name": "llmdoc-local", + "name": "llmdoc-cc-plugin", "interface": { "displayName": "llmdoc Local Plugins" }, diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 63cdc5c..3e877f3 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -4,7 +4,9 @@ "name": "TokenRoll", "email": "shuaiqijianaho@qq.com" }, - "description": "Marketplace for the minimal llmdoc Claude Code workflow", + "metadata": { + "description": "Marketplace for the minimal llmdoc Claude Code workflow" + }, "plugins": [ { "name": "llmdoc", diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 6ffecce..2d3df29 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "llmdoc", "description": "llmdoc Claude Code plugin with a minimal workflow: init, update, and use", - "version": "2.0.0", + "version": "3.0.0", "author": { "name": "DJJ & Danniel" } diff --git a/.claude/settings.local.json b/.claude/settings.local.json new file mode 100644 index 0000000..0967ef4 --- /dev/null +++ b/.claude/settings.local.json @@ -0,0 +1 @@ +{} diff --git a/.codex/agents/llmdoc-investigator.toml b/.codex/agents/llmdoc-investigator.toml index cac80c8..a8ffcc1 100644 --- a/.codex/agents/llmdoc-investigator.toml +++ b/.codex/agents/llmdoc-investigator.toml @@ -17,7 +17,14 @@ Then follow this protocol: - Investigate source code to fill gaps left by the docs. - Prefer file-level and symbol-level references. - Only add line numbers when needed to prove subtle or disputed behavior. -- If asked to write findings to disk, write temporary scratch reports under .llmdoc-tmp/investigations/. +- Brief budget: for depth=deep, limit each brief to ≤5 questions and ≤15 specific files or symbols. 
If the scope exceeds this, cover only the highest-priority questions and report the rest as gaps for follow-up. Do not attempt a single pass that would exhaust the context window. +- For file-sink runs, require TOPIC and OUTPUT_PATH from the caller. Treat OUTPUT_PATH as the canonical persisted artifact for the topic. +- If asked to write findings to disk, first draft the full markdown report and append as the very last line. This sentinel lets the main agent detect truncation: a file without the sentinel is treated as context_overflow, not persisted. +- First attempt to Write the full markdown report to . Only after that primary write attempt, try a best-effort sidecar Write of the same markdown to .sidecar.md. The sidecar is a recovery lane for cases where the tool-framework transport loses the return payload; it must never replace the primary artifact or block the run if it fails. +- When the primary write succeeds, return STATUS: persisted with TOPIC, OUTPUT_PATH, and SIDECAR_PATH. +- When the primary write fails, return STATUS: write_failed_fallback_ready with TOPIC, OUTPUT_PATH, SIDECAR_PATH, FAILURE_TYPE, FAILURE_MESSAGE, and the full REPORT_MARKDOWN inside a fenced markdown block so the main agent can persist it. +- Report SIDECAR_PATH: none only when the sidecar write also failed; otherwise name the sidecar path explicitly. +- Do not claim persistence unless the primary write to OUTPUT_PATH actually succeeded. A sidecar-only write is not persisted. - Do not silently redesign the system. Return evidence, findings, gaps, and recommended next reads. """ nickname_candidates = ["Scout", "Trace", "Atlas"] diff --git a/AGENTS.example.md b/AGENTS.example.md index 56c6236..4c64d0e 100644 --- a/AGENTS.example.md +++ b/AGENTS.example.md @@ -1,5 +1,7 @@ # Load The `llmdoc` Skill First +Always answer in 简体中文 + Before broad source-code exploration, planning, or documentation work, load the `llmdoc` skill. 
The main assistant should align with the user before non-trivial plans or edits. diff --git a/README.md b/README.md index 5060b6b..1e1b832 100644 --- a/README.md +++ b/README.md @@ -54,9 +54,38 @@ The command: 1. Inspects the repo 2. Creates the llmdoc directory structure -3. Runs multi-investigator temporary scratch work with explicit coverage checks, then a follow-up gap-check pass -4. Generates initial MUST, overview, architecture, and reference docs -5. Synchronizes `llmdoc/index.md` +3. Runs a short pre-investigation user calibration; pressing Enter with no extra reply continues with repository evidence +4. Runs size-aware, theme-driven multi-investigator temporary scratch work with explicit persistence checks, subagent-to-main-agent write fallback, and targeted follow-up passes instead of rerunning the whole repo +5. Shows a required post-investigation concept list so the user can generate docs now or add terms, emphasis, or conventions +6. Generates initial MUST, overview, architecture, and reference docs +7. Synchronizes `llmdoc/index.md` +8. Removes `.llmdoc-tmp/` after the stable docs are complete + +Repository size is estimated from first-party source files and tests after excluding dependency, generated, cache, and VCS directories such as `node_modules/`, `dist/`, `build/`, `.next/`, `coverage/`, `vendor/`, and `.git/`. Lockfiles, generated artifacts, vendored code, and cache directories do not count toward the thresholds. The current size bands are: small `<= 1000 LOC`, medium `1001-5000 LOC`, large `> 5000 LOC`. + +The first stable pass stays depth-first, but the core-doc target is now size-aware: small and medium repositories usually start with `2-3` deep architecture or reference docs, while large repositories can start with `3-5` when they have distinct invariant clusters worth documenting separately. 
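The size bands and first-pass core-doc targets described above can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the function names `is_excluded`, `classify_repo_size`, and `core_doc_target` are hypothetical, and the sketch assumes the LOC count has already been computed. It does not handle the lockfile or generated-artifact exclusions, which the README states but does not specify by pattern.

```python
from pathlib import PurePosixPath
from typing import Tuple

# Directory names the README lists as excluded from the size estimate.
EXCLUDED_DIRS = {"node_modules", "dist", "build", ".next", "coverage", "vendor", ".git"}

# Size band -> (min, max) deep core docs for the first stable pass.
CORE_DOC_TARGETS = {"small": (2, 3), "medium": (2, 3), "large": (3, 5)}


def is_excluded(relative_path: str) -> bool:
    """Return True when any directory component is an excluded directory.

    Hypothetical helper: only directory components are checked, so a file
    that merely has 'build' in its name is still counted.
    """
    parts = PurePosixPath(relative_path).parts
    return any(part in EXCLUDED_DIRS for part in parts[:-1])


def classify_repo_size(loc: int) -> str:
    """Map a first-party LOC estimate to the size bands stated in the README."""
    if loc < 0:
        raise ValueError("loc must be non-negative")
    if loc <= 1000:
        return "small"
    if loc <= 5000:
        return "medium"
    return "large"


def core_doc_target(loc: int) -> Tuple[int, int]:
    """Return the (min, max) number of deep core docs for the first pass."""
    return CORE_DOC_TARGETS[classify_repo_size(loc)]
```

Keeping the thresholds in one table makes the band boundaries (1000 and 5000 LOC) easy to audit against the prose.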
+ +If an investigator can return a report but cannot write its scratch file, init now falls back to having the coordinating assistant persist the returned markdown to the same `.llmdoc-tmp/investigations/` path. Investigators also perform a best-effort sidecar write to `.sidecar.md`, so that even if the tool-framework transport drops the return payload (for example `Tool result missing due to internal error`), the coordinating assistant can recover the report from disk without re-materializing the markdown through the model. A sidecar is only a recovery source: if `output_path` is missing but the sidecar is complete, the coordinating assistant copies the sidecar back to `output_path` and verifies the restored canonical file before continuing. It should ask the user for write authorization only if the coordinating assistant's own fallback write or restore copy also fails. + +#### Investigator failure handling + +Init tracks four investigator result states and applies a different recovery path for each: + +| State | Trigger | Recovery | +|-------|---------|----------| +| `persisted` | Report written; `` sentinel present | Verify file and continue | +| `write_failed_fallback_ready` | Write failed; full markdown in return payload | Coordinating assistant writes to same path, verifies sentinel | +| `transport_failure` | Tool call returns internal error; no payload | Check `output_path`, then sidecar; if only the sidecar is complete, copy it back to `output_path`, verify, and rerun only if neither path is restorable | +| `context_overflow` | File present but sentinel missing (truncated) | Split brief into ≤3 narrower sub-briefs; route via follow-up slot | + +Each investigator report ends with the sentinel ``. A file without the sentinel is treated as truncated (`context_overflow`), not complete. The sentinel lives only in `.llmdoc-tmp/investigations/` scratch files and is removed with the directory at the end of init. 
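The `transport_failure` recovery order in the table above (check `output_path`, then the sidecar, copy back, verify, rerun only if neither is restorable) can be sketched roughly as below. This is an illustration under assumptions: the sentinel string and the sidecar path are passed in as parameters because their exact values are not visible in this diff, and the names `report_is_complete` and `recover_after_transport_failure`, along with the returned labels, are hypothetical.

```python
import shutil
from pathlib import Path


def report_is_complete(path: Path, sentinel: str) -> bool:
    """A report counts as complete only if it exists, is non-empty,
    and its last non-blank line is exactly the sentinel."""
    if not path.is_file():
        return False
    lines = path.read_text(encoding="utf-8").rstrip().splitlines()
    return bool(lines) and lines[-1].strip() == sentinel


def recover_after_transport_failure(
    output_path: Path, sidecar_path: Path, sentinel: str
) -> str:
    """Return 'verified', 'restored_from_sidecar', or 'rerun'.

    The labels are hypothetical. The sidecar is only a recovery source:
    success is always judged against the canonical output_path. An OSError
    from the copy is left to propagate, which corresponds to the case where
    the coordinating assistant must ask the user for write authorization.
    """
    if report_is_complete(output_path, sentinel):
        return "verified"
    if report_is_complete(sidecar_path, sentinel):
        shutil.copyfile(sidecar_path, output_path)
        if report_is_complete(output_path, sentinel):
            return "restored_from_sidecar"
    return "rerun"
```

Verifying the restored canonical file, rather than trusting the copy, keeps a sidecar-only state from ever being treated as persisted.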
+ +Platform concurrency limits shape how recovery fan-out works: + +- **Claude Code**: hard cap of 10 concurrent subagents; excess requests are queued automatically. Recovery sub-briefs can be sent concurrently within this cap. +- **Codex**: bounded by `max_threads` and `max_depth` in `.codex/config.toml`. Recovery sub-briefs must stay within remaining budget; prefer sequential when budget is tight. + +Context overflow recovery does not rerun the same brief scope — that would overflow again. Instead, the topic is split and processed through the existing follow-up slot. ### `/llmdoc:update` @@ -69,11 +98,13 @@ The command: 1. Rebuilds context from llmdoc and the current working tree 2. Proactively reads relevant guides and reflections -3. Investigates impacted concepts +3. Recreates `.llmdoc-tmp/investigations/` on demand and investigates impacted concepts with the same persistence checks and fallback recovery used by init when file-sink scratch reports are needed 4. Writes a reflection under `llmdoc/memory/reflections/` 5. Updates stable docs 6. Synchronizes `llmdoc/index.md` +If `/llmdoc:update` needs file-sink scratch work after a previous init cleaned up `.llmdoc-tmp/`, it recreates `.llmdoc-tmp/investigations/` before launching the investigation or writing fallback artifacts. If the coordinating assistant cannot create that directory, cannot write a fallback report, or cannot restore `output_path` from a valid sidecar, it must pause and ask the user for write authorization instead of silently stalling. + In normal use, the main assistant should proactively ask whether to run `/llmdoc:update` when the task produced durable knowledge or a useful reflection. 
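The on-demand recreation of `.llmdoc-tmp/investigations/` described above might look like the following sketch. The exception type `WriteAuthorizationRequired` and the function `ensure_scratch_dir` are hypothetical names used for illustration; the actual workflow is driven by prompts, not by this code.

```python
from pathlib import Path


class WriteAuthorizationRequired(Exception):
    """Hypothetical signal that the run must pause and ask the user
    for write authorization instead of silently stalling."""


def ensure_scratch_dir(repo_root: Path) -> Path:
    """Create .llmdoc-tmp/investigations/ if a previous init removed it.

    Idempotent: calling it when the directory already exists is a no-op.
    """
    scratch = repo_root / ".llmdoc-tmp" / "investigations"
    try:
        scratch.mkdir(parents=True, exist_ok=True)
    except OSError as exc:
        # Surface the failure explicitly so the caller can pause and ask.
        raise WriteAuthorizationRequired(
            f"cannot create scratch directory: {scratch}"
        ) from exc
    return scratch
```

Raising a dedicated error keeps the "pause and ask" path distinct from ordinary investigation failures.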
## llmdoc Layout @@ -130,6 +161,15 @@ Then install this plugin marketplace and plugin: /plugin install llmdoc@llmdoc-cc-plugin ``` +Compatibility note: if Claude Code still has an older cached marketplace entry such as `tokenroll-cc-plugin`, reset the marketplace and reload plugins with: + +```bash +/plugin marketplace remove tokenroll-cc-plugin +/plugin marketplace add https://github.com/TokenRollAI/llmdoc +/plugin install llmdoc@llmdoc-cc-plugin +/reload-plugins +``` + After installation: 1. Copy [`CLAUDE.example.md`](CLAUDE.example.md) into `~/.claude/CLAUDE.md`. diff --git a/README.zh-CN.md b/README.zh-CN.md index 2408dad..bbffc93 100644 --- a/README.zh-CN.md +++ b/README.zh-CN.md @@ -54,9 +54,38 @@ 1. 检查仓库结构 2. 创建 llmdoc 目录骨架 -3. 启动多个 investigator 生成临时调查草稿,显式检查覆盖面,并补做一轮查缺补漏 -4. 生成初始 MUST、overview、architecture、reference 文档 -5. 同步 `llmdoc/index.md` +3. 进行一轮简短的调研前用户确认;如果无需补充,直接回车即可继续并按仓库证据推进 +4. 按项目体量和主题驱动 investigator 调查,显式检查调查报告是否已落盘,在 subagent 写入失败时由主 assistant 补写,并只做定向补查,不重跑整仓 +5. 展示一份必选的调查后概念列表,让用户直接生成文档,或补充术语、重点、约定 +6. 生成初始 MUST、overview、architecture、reference 文档 +7. 同步 `llmdoc/index.md` +8. 
在稳定文档完成后移除 `.llmdoc-tmp/` + +项目体量的估算基于“排除依赖、生成物、缓存和 VCS 目录之后”的第一方源码与测试文件。至少会忽略 `node_modules/`、`dist/`、`build/`、`.next/`、`coverage/`、`vendor/` 和 `.git/`;lockfile、生成产物、vendored code 和缓存目录也不计入阈值。当前分档为:小型 `<= 1000 LOC`,中型 `1001-5000 LOC`,大型 `> 5000 LOC`。 + +首轮稳定文档仍然坚持“先深后广”,但核心文档数量现在按仓库规模放宽:小型和中型仓库通常先产出 `2-3` 篇深度 architecture / reference 文档,大型仓库如果确实存在多个应独立成文的不变量簇,可以先产出 `3-5` 篇。 + +如果 investigator 能返回调研结果但无法把草稿写入 `.llmdoc-tmp/investigations/`,init 现在会先由主 assistant 把返回的 markdown 补写到同一路径。investigator 在返回前还会尽量把同一份 markdown 再写入 `.sidecar.md`,这样即便工具框架传输层丢掉整个返回(例如 `Tool result missing due to internal error`),主 assistant 也能直接从磁盘恢复报告,而不用通过模型重新展开这份 markdown。sidecar 只是恢复来源:如果 `output_path` 缺失但 sidecar 完整,主 assistant 会先把 sidecar 复制回 `output_path`,验证恢复后的 canonical 文件,再继续后续流程。只有在主 assistant 的 fallback 写入或这次恢复 copy 也失败时,才会请求用户授权继续。 + +#### Investigator 失败处理 + +Init 追踪四种 investigator 结果状态,每种有不同的恢复路径: + +| 状态 | 触发条件 | 恢复路径 | +|------|---------|---------| +| `persisted` | 报告已写盘且含 `` 哨兵 | 验证文件后继续 | +| `write_failed_fallback_ready` | 写盘失败,返回完整 markdown | 主 assistant 写到同一路径并验证哨兵 | +| `transport_failure` | 工具调用返回 internal error,无 payload | 先查 `output_path`,再查 sidecar;如果只有 sidecar 完整,则先复制回 `output_path` 并验证;只有两条路径都不可恢复时才重跑 | +| `context_overflow` | 文件存在但哨兵缺失(被截断) | 拆分 brief 为 ≤3 个子 brief,走 follow-up 槽 | + +每个 investigator 报告的最后一行写入哨兵 ``。没有哨兵的文件视为截断(`context_overflow`),不会被当作已完成的报告。哨兵只存在于 `.llmdoc-tmp/investigations/` 临时文件中,随目录在 init 结束时一并清除。 + +平台并发限制决定恢复时的扇出方式: + +- **Claude Code**:并发上限 10,超出自动排队。溢出恢复的子 brief 可以并发发出,超出上限会排队等待。 +- **Codex**:受 `.codex/config.toml` 中 `max_threads` 和 `max_depth` 约束。恢复时的子 brief 必须控制在剩余预算内,线程或深度紧张时优先串行执行。 + +context overflow 的恢复路径不会用同样的 scope 重跑——会再次溢出。正确做法是将该 topic 拆分,走现有的 follow-up 槽位串行处理。 ### `/llmdoc:update` @@ -69,11 +98,13 @@ 1. 基于 llmdoc 和当前 working tree 重建上下文 2. 主动阅读相关 guides 和 reflection -3. 调研受影响的概念 +3. 在需要 file-sink 临时报告时按需重建 `.llmdoc-tmp/investigations/`,并用和 init 一致的落盘检查与 fallback 恢复机制调研受影响的概念 4. 在 `llmdoc/memory/reflections/` 下写 reflection 5. 
更新稳定文档 6. 同步 `llmdoc/index.md` +如果 `/llmdoc:update` 发生在一次已经清理掉 `.llmdoc-tmp/` 的 init 之后,而这次 update 又需要 file-sink 临时调查报告,它会先重建 `.llmdoc-tmp/investigations/`,再启动调查或补写 fallback 文件。如果主 assistant 无法创建该目录、无法写入 fallback 报告,或无法从有效 sidecar 恢复 `output_path`,就必须暂停并向用户请求写入授权,而不是无声卡住。 + 在日常使用里,如果任务产生了值得长期保留的知识或反思,主 assistant 应该主动询问是否现在运行 `/llmdoc:update`。 ## llmdoc 结构 @@ -130,6 +161,15 @@ llmdoc/ /plugin install llmdoc@llmdoc-cc-plugin ``` +兼容性说明:如果 Claude Code 本地还缓存着旧的 marketplace 条目,比如 `tokenroll-cc-plugin`,可以按下面的命令重置 marketplace 并重新加载插件: + +```bash +/plugin marketplace remove tokenroll-cc-plugin +/plugin marketplace add https://github.com/TokenRollAI/llmdoc +/plugin install llmdoc@llmdoc-cc-plugin +/reload-plugins +``` + 安装后: 1. 把 [`CLAUDE.example.md`](CLAUDE.example.md) 复制到 `~/.claude/CLAUDE.md` diff --git a/agents/investigator.md b/agents/investigator.md index 64d533e..052a646 100644 --- a/agents/investigator.md +++ b/agents/investigator.md @@ -25,6 +25,8 @@ Key practices: - **Use line numbers sparingly:** Add line numbers only when they are required to prove a disputed or non-obvious behavior. - **Objective:** Report facts and evidence, not design opinions. - **Split by sink:** `sink=chat` is for direct answers. `sink=file` is for temporary scratch artifacts, usually under `.llmdoc-tmp/investigations/`. +- **Brief budget:** For `sink=file` with `depth=deep`, limit each brief to ≤5 questions and ≤15 specific files or symbols. If the caller's scope exceeds this, investigate only the highest-priority questions in this pass and report the remainder as gaps for follow-up. Do not attempt a single pass that would exhaust the context window. +- **Persist with `Write`:** For `sink=file`, assemble the full markdown report first, then try to persist it with `Write`. Do not rely on `Bash` as the primary write path. - **No long code pastes:** The reader can open source files directly. @@ -32,7 +34,8 @@ Key practices: - **Questions**: The concrete questions to answer. 
- **Depth**: `quick` or `deep`. - **Sink**: `chat` or `file`. -- **Output Path**: Required when `sink=file` unless the caller explicitly asks you to choose a path. +- **Topic**: Required when `sink=file`. Use a stable, human-readable label for the scratch artifact, not an ephemeral sentence. +- **Output Path**: Required when `sink=file`. @@ -65,7 +68,33 @@ Key practices: -Write a markdown file using the same section layout as ``, then return the absolute file path. +When `sink=file`: + +1. Require both `Topic` and `Output Path` from the caller. +2. Draft the full markdown report using the same section layout as ``. Append `` as the very last line of the markdown. This sentinel allows the coordinating agent to detect truncation: a report file that exists but lacks this sentinel is treated as `context_overflow`, not `persisted`. +3. Try to write that markdown to `Output Path` with `Write`. Treat `Output Path` as the canonical persisted artifact for the topic. +4. After the primary write attempt, always attempt a best-effort sidecar write of the same markdown to `.sidecar.md` using `Write`. This is a recovery lane for the case where the tool-framework transport loses the return payload. Do not fail the run if the sidecar write fails; silently continue. +5. If the primary write to `Output Path` succeeds, return: + +STATUS: persisted +TOPIC: +OUTPUT_PATH: +SIDECAR_PATH: .sidecar.md | none + +6. If the primary write to `Output Path` fails, return: + +STATUS: write_failed_fallback_ready +TOPIC: +OUTPUT_PATH: +SIDECAR_PATH: .sidecar.md | none +FAILURE_TYPE: write_permission_denied | tool_refused | shell_write_failed | unknown_write_failure +FAILURE_MESSAGE: +REPORT_MARKDOWN: +```markdown + +``` + +Do not claim persistence unless the primary write to `Output Path` actually succeeded. A sidecar-only write is not `persisted`. Report `SIDECAR_PATH: none` only when the sidecar write also failed. 
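For a coordinating agent that consumes the return format above, a minimal parsing sketch could look like this. It assumes the header fields appear one per line as `KEY: value` and that `REPORT_MARKDOWN:` is followed by a fenced `markdown` block; the function names are hypothetical, and a `persisted` status is treated only as a claim that still needs on-disk verification.

```python
import re
from typing import Dict, Optional

FENCE = "`" * 3  # built programmatically to keep this example fence-safe
HEADER = re.compile(r"^([A-Z_]+):\s*(.*)$")


def parse_investigator_return(text: str) -> Dict[str, Optional[str]]:
    """Parse STATUS/TOPIC/... header lines and the optional report body."""
    fields: Dict[str, Optional[str]] = {}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        match = HEADER.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "REPORT_MARKDOWN":
            rest = lines[i + 1:]
            opens = [j for j, l in enumerate(rest)
                     if l.strip().startswith(FENCE + "markdown")]
            closes = [j for j, l in enumerate(rest) if l.strip() == FENCE]
            # Use the last bare fence so nested fences inside the report survive.
            if opens and closes and closes[-1] > opens[0]:
                fields[key] = "\n".join(rest[opens[0] + 1:closes[-1]])
            break
        fields[key] = value
    if fields.get("SIDECAR_PATH") == "none":
        fields["SIDECAR_PATH"] = None  # sidecar write also failed
    return fields


def claims_persisted(fields: Dict[str, Optional[str]]) -> bool:
    """True only for STATUS: persisted; the file must still be verified."""
    return fields.get("STATUS") == "persisted"
```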
Always ensure the investigation is specific, factual, and easy for another agent to reuse. diff --git a/agents/recorder.md b/agents/recorder.md index abfaf8e..e90ae8b 100644 --- a/agents/recorder.md +++ b/agents/recorder.md @@ -17,7 +17,7 @@ When invoked: 3. Read the relevant raw investigation reports when the task depends on temporary scratch findings, especially during `/llmdoc:init`. 4. Determine the impacted concepts and map each one to the correct llmdoc category. 5. Keep `llmdoc/index.md` and `llmdoc/startup.md` distinct in purpose and content. -6. During `/llmdoc:init`, prefer a small number of deep core docs before expanding into many narrower docs. +6. During `/llmdoc:init`, prefer a size-aware small number of deep core docs before expanding into many narrower docs. 7. Update the touched documents and synchronize `llmdoc/index.md`. 8. Report every file you created, updated, or deleted. @@ -42,7 +42,7 @@ Split rules: - One concept per document. - One workflow per guide. - One ownership boundary or invariant cluster per architecture doc. -- During init, depth beats premature fragmentation. Prefer 2-3 strong core docs over 10+ shallow ones. +- During init, depth beats premature fragmentation. Prefer `2-3` strong core docs on small and medium repositories, and `3-5` on large repositories, over `10+` shallow ones. - If a document grows large only because it is preserving one coherent execution model, invariant set, or contract cluster, keep it intact until a clean split is obvious. - If a document exceeds roughly 120 lines, covers more than one workflow, or mixes stable facts with transient notes, split it when doing so improves retrieval without discarding essential reasoning flow. - Do not promote content into `/must/` unless it is stable, short, and useful on nearly every task. 
diff --git a/changelog_20260423.md b/changelog_20260423.md new file mode 100644 index 0000000..e2a9bc9 --- /dev/null +++ b/changelog_20260423.md @@ -0,0 +1,192 @@ +# PR 说明:enhance — Init/Update 调查持久化恢复与大型仓库首轮文档阈值放宽 + +## PR 标题 + +```text +feat: sync init/update persistence recovery and relax large-repo first-pass doc targets +``` + +## 概述 + +本 PR 聚焦 llmdoc 的三类协同改进: + +1. **Init 调查编排与四状态恢复协议落地** + `/llmdoc:init` 现在明确区分 `persisted`、`write_failed_fallback_ready`、`transport_failure`、`context_overflow` 四种文件落盘调查结果,并要求以哨兵、canonical `output_path`、sidecar 恢复和定向 follow-up 的顺序处理;首轮 investigation subagent 启动个数也改为按仓库体量阈值分档控制。 + +2. **Update 与 Init 的文件落盘协议对齐** + `/llmdoc:update` 现在在需要 file-sink 调查时复用与 init 相同的持久化检查、fallback 写入、sidecar 恢复和授权升级路径;同时会按需重建 `.llmdoc-tmp/investigations/`,不再假设 init 会把该目录保留到下一次 update。 + +3. **大型仓库首轮稳定文档数量阈值放宽** + 首轮稳定文档仍坚持“先深后广”,但核心 architecture/reference 文档目标改为按仓库规模分档:小型/中型仓库通常 `2-3` 篇,大型仓库可以放宽到 `3-5` 篇,只要这些文档分别承载不同的不变量簇、执行流或契约面。 + +以上改动已经同步到命令契约、agent prompt、Codex helper skill、架构文档、维护指南和中英文 README,避免出现“命令已变、执行提示还停留在旧规则”的偏差。 + +## 背景与动机 + +### 1. Init 的恢复协议需要可验证的持久化定义 + +之前的恢复逻辑默认“子 agent 有返回”就足够推进流程,但这会把三类不同问题混在一起: + +- **真正已持久化**:文件完整写入且可复用 +- **写入失败但 markdown 还在返回载荷里**:主 assistant 可以补写 +- **传输层失败或上下文溢出**:返回通道不可靠,必须先检查磁盘状态 + +如果没有哨兵、canonical 路径和 sidecar 恢复顺序,覆盖率门控会把缺失或截断的调查报告误判为“可用”。 + +### 1.1 Init 的体量分档与首轮 investigation subagent 启动个数 + +Init 的 fan-out 不是固定值,而是先按排除依赖、生成物、缓存和 VCS 目录后的第一方源码与测试文件估算仓库体量,再决定首轮 investigation subagent 个数。 + +当前体量分档为: + +- **小型仓库**:`<= 1000 LOC` +- **中型仓库**:`1001-5000 LOC` +- **大型仓库**:`> 5000 LOC` + +对应的首轮 investigation subagent / investigator 启动个数为: + +- **小型仓库**:`1-2` +- **中型仓库**:`2-3` +- **大型仓库**:`3-5` + +这个阈值的目的不是追求最大并发,而是在限制第一波 fan-out 的同时保持主题覆盖稳定。也就是说: + +- 小仓库避免为了“并行”而过度切碎调查主题 +- 中仓库允许有明确的主题切片,但仍应避免把边缘主题拆成独立 subagent +- 大仓库允许更宽的第一波 fan-out,但仍应以主题切分和覆盖率门控为前提,而不是无约束扩张 + +### 2. 
Update 之前没有补齐与 Init 同级别的恢复路径 + +`/llmdoc:update` 新增 file-sink 调查能力之后,暴露出两个缺口: + +- **目录存在性假设错误**:init 成功后会删除 `.llmdoc-tmp/`,因此 update 不能假定 `.llmdoc-tmp/investigations/` 始终存在 +- **最终失败升级路径缺失**:当主 assistant 也无法写 fallback 文件或无法从 sidecar 恢复 canonical `output_path` 时,流程必须暂停并向用户请求写入授权,而不是卡在未定义状态 + +### 3. 大型仓库首轮核心文档数过紧 + +原先“首轮稳定文档优先 `2-3` 篇”的表述,对小型和中型仓库是合理的,但对大型仓库有点过于保守。 +当仓库确实存在多个应分开记录的不变量簇、运行流和契约面时,把首轮强行压成 `2-3` 篇会造成: + +- 过度聚合,降低检索性 +- 核心文档内部混入过多互不相干的执行模型 +- `recorder` 在实践中倾向于把不该捏在一起的内容塞进同一篇 + +因此本 PR 只放宽“大型仓库首轮核心文档数”,不放宽“浅文档泛滥”的总体约束。 + +## 变更内容 + +### 协议表面:Init / Investigator / Recovery + +| 文件 | 改动内容 | +|------|---------| +| `commands/init.md` | 明确四状态协议;要求 `output_path` 为 canonical artifact;sidecar 只作恢复来源;`transport_failure` 时先查 `output_path` 再查 sidecar,必要时复制还原主路径;`context_overflow` 时拆分为 ≤3 个子 brief;首轮 investigation subagent 启动个数按体量阈值分档(小型 `1-2` / 中型 `2-3` / 大型 `3-5`);大型仓库首轮核心文档数放宽为 `3-5` | +| `agents/investigator.md` | `OutputFormat_File` 明确哨兵为最后一行、主写入优先、sidecar 为 best-effort 恢复通道、`SIDECAR_PATH` 字段和 Brief 预算 | +| `.codex/agents/llmdoc-investigator.toml` | 将 investigator 的持久化协议、哨兵、sidecar、failure payload 与 Claude prompt 保持一致 | +| `skills/llmdoc-init/SKILL.md` | 与 `commands/init.md` 同步 recovery 契约;补充大型仓库首轮核心文档可为 `3-5` | + +### Update 工作流对齐 + +| 文件 | 改动内容 | +|------|---------| +| `commands/update.md` | 在 file-sink 调查或主 assistant fallback 写入前按需重建 `.llmdoc-tmp/investigations/`;复用 init 的 canonical `output_path` / sentinel / sidecar 恢复协议;补齐“无法建目录 / 无法写 fallback / 无法还原 canonical 路径”时的用户授权升级路径 | +| `skills/llmdoc-update/SKILL.md` | 与 `commands/update.md` 同步,确保 Codex helper skill 不落后于命令契约 | + +### 稳定文档生成策略 + +| 文件 | 改动内容 | +|------|---------| +| `commands/init.md` | 首轮调查 fan-out 改为 size-aware:小型仓库 `1-2` 个 investigator,中型 `2-3` 个,大型 `3-5` 个;首轮稳定文档也改为 size-aware:小型/中型仓库通常 `2-3` 篇,大型仓库允许 `3-5` 篇深度 architecture/reference 文档 | +| `agents/recorder.md` | 将“2-3 strong core docs”改为按仓库规模分档,避免 `recorder` 在大仓库上被过度约束 | +| `skills/llmdoc-init/SKILL.md` | 对 helper skill 明确同样的首轮 
investigation subagent 启动阈值和首轮核心文档数量策略 | +| `llmdoc/architecture/init-investigation-orchestration.md` | 将首轮 investigation fan-out 和“small number of deep core docs”都细化为按仓库大小的目标范围 | + +### 架构文档、指南与路由 + +| 文件 | 改动内容 | +|------|---------| +| `llmdoc/architecture/init-investigation-orchestration.md` | 记录 init 的恢复顺序、不变量、Codex 并发限制、sidecar 只作恢复来源,以及按仓库规模放宽的大仓库首轮核心文档策略 | +| `llmdoc/guides/updating-init-investigation-depth.md` | 校验清单与常见失败点同步到最新 recovery 契约;说明大仓库可以拥有更宽的首轮核心文档集 | +| `llmdoc/index.md` | 路由描述更新为包含 recovery / transport-failure / follow-up 等新语义 | +| `llmdoc/must/doc-routing.md` | init 相关路由规则加入“报告持久化回退、传输失败恢复”等修改前置阅读要求 | + +### 公开说明 + +| 文件 | 改动内容 | +|------|---------| +| `README.md` | 更新 init / update 工作流摘要;说明 init 的四状态恢复;新增 update 的按需重建 scratch 目录与授权升级路径;新增大型仓库首轮核心文档 `3-5` 的说明 | +| `README.zh-CN.md` | 与英文 README 同步 | + +### 反思文档 + +| 文件 | 内容 | +|------|------| +| `llmdoc/memory/reflections/2026-04-20-subagent-transport-failure.md` | 传输层失败为什么不能直接等价于“重新跑一遍” | +| `llmdoc/memory/reflections/2026-04-21-context-overflow-recovery.md` | 为什么需要哨兵、context overflow 状态以及平台感知的恢复策略 | + +### 其他修改 + +| 文件 | 改动内容 | +|------|---------| +| `.claude-plugin/plugin.json` | 版本号更新为 `3.0.0` | +| `.claude/settings.local.json` | 增加本仓库本地 Claude 权限白名单:`WebFetch(domain:claude.com)`、`WebSearch` | + +## 文件落盘调查失败处理 + +Init 是四状态协议的主场景;Update 在需要 file-sink scratch 调查时复用同一协议。 + +| 状态 | 触发条件 | 恢复路径 | +|------|---------|---------| +| `persisted` | 报告已写入 canonical `output_path`,且包含 `` 哨兵 | 验证 `output_path` 存在、非空、含哨兵后继续 | +| `write_failed_fallback_ready` | 主写入失败,但完整 markdown 仍在返回载荷中 | 协调 assistant 将 `report_markdown` 写回同一 `output_path`,再验证哨兵 | +| `transport_failure` | 工具调用 internal error / missing result,无返回载荷 | 先查 `output_path`,再查 `.sidecar.md`;若只有 sidecar 完整,则复制还原 canonical `output_path` 并验证;两者都不可恢复才重跑 | +| `context_overflow` | 文件存在但哨兵缺失,说明报告被截断 | 不重跑同样 scope;将 brief 拆为 ≤3 个更窄子 brief,走 follow-up 槽 | + +补充说明: + +- `` 必须是报告文件最后一行 +- sidecar 不是持久化成功状态,只是恢复来源 +- 只有 canonical `output_path` 被验证成功,主题才算完成 +- 
如果主 assistant 无法创建 scratch 目录、无法写 fallback 文件、或无法从有效 sidecar 还原 canonical `output_path`,必须暂停并向用户请求写入授权 + +## 首轮稳定文档策略 + +首轮稳定文档仍然遵循“先深后广”: + +- **小型 / 中型仓库**:通常先产出 `2-3` 篇核心 architecture / reference 文档 +- **大型仓库**:允许先产出 `3-5` 篇,只要这些文档分别覆盖不同的不变量簇、流程或契约边界 + +这不是鼓励首轮铺很多浅文档,而是给大型仓库留下足够的结构表达空间。 +依然不应该把首轮稳定文档扩张成 `10+` 篇浅层摘要。 + +## 首轮 investigation subagent 启动阈值 + +在生成稳定文档之前,`/llmdoc:init` 会先按体量分档决定第一波 investigation subagent 的启动个数: + +- **小型仓库**:`1-2` +- **中型仓库**:`2-3` +- **大型仓库**:`3-5` + +这里的“按阈值启动个数”是首轮调查 fan-out 上限,不是硬性要求必须把配额打满。 +如果主题面较少、某些主题可以合并、或当前平台前台稳定性不足,实际启动个数可以低于上限,但不应该在同等体量下无理由超过该范围。 + +## 非目标 + +- 不修改 `.codex/config.toml` 中的 `max_threads` / `max_depth` +- 不把 sidecar 提升为新的 canonical artifact +- 不放宽“浅文档泛滥”的约束;大型仓库放宽的是核心文档数量,不是允许随意拆碎 +- 不修改单篇文档的拆分原则,例如“一个 workflow 一篇 guide”“一个 ownership / invariant cluster 一篇 architecture” + +## 验证清单 + +- [ ] 在小型仓库运行 `/llmdoc:init`,确认预调查校准提示出现,并显示 `No extra context, continue` +- [ ] 确认小型仓库首轮 investigator 扇出仍为 `1-2` +- [ ] 确认中型仓库首轮 investigator 扇出为 `2-3` +- [ ] 确认大型仓库首轮 investigator 扇出为 `3-5` +- [ ] 确认第一波结束后触发覆盖率门控;仅在有缺口时触发 follow-up +- [ ] 手动检查调查报告最后一行为 `` +- [ ] 模拟 `transport_failure`,确认先查 `output_path` 再查 sidecar;若 sidecar 完整则复制还原 canonical 路径 +- [ ] 模拟 `context_overflow`,确认走 brief 拆分路径而非重跑同样 scope +- [ ] 在“init 已清理 `.llmdoc-tmp/`”之后运行 `/llmdoc:update`,确认会按需重建 `.llmdoc-tmp/investigations/` +- [ ] 模拟 `/llmdoc:update` 的 fallback 写入失败或 sidecar 还原失败,确认会暂停并请求用户授权,而不是无声卡住 +- [ ] 在大型仓库场景下,确认首轮稳定文档允许 `3-5` 篇深度 architecture / reference 文档,而不是被硬压成 `2-3` +- [ ] 验证中英文 README 对 init / update / recovery / 大型仓库首轮文档阈值的描述保持一致 diff --git a/commands/init.md b/commands/init.md index 3320da9..ca25607 100644 --- a/commands/init.md +++ b/commands/init.md @@ -1,5 +1,5 @@ --- -description: "Initialize or re-bootstrap llmdoc using the minimal init workflow." +description: "Initialize or re-bootstrap llmdoc using evidence-first investigation with required user calibration checkpoints." --- # /llmdoc:init @@ -18,7 +18,12 @@ Why: 1. 
Inspect the project root. - Read top-level manifests and README files. - - Avoid dependency and build directories. + - Exclude dependency, generated, cache, and VCS directories throughout init. At minimum, ignore `node_modules/`, `dist/`, `build/`, `.next/`, `coverage/`, `vendor/`, and `.git/`. + - Estimate repository size from first-party source files and tests after those exclusions. Do not count lockfiles, generated artifacts, vendored code, or cache directories toward LOC thresholds. + - Use that LOC estimate to classify the repo as: + - small: `<= 1000 LOC` + - medium: `1001-5000 LOC` + - large: `> 5000 LOC` 2. Create or repair the llmdoc skeleton. - Ensure these directories exist: @@ -32,23 +37,98 @@ Why: - `llmdoc/memory/decisions/` - `.llmdoc-tmp/investigations/` -3. Run investigation. +3. Run the pre-investigation user calibration. + - This step is required, but the user may skip it by pressing Enter with no extra reply. + - If the environment supports explicit options, `No extra context, continue` may still be shown, but a blank reply should be treated the same way. + - If the user presses Enter with no extra reply, continue with repository evidence. + - This is one of the only valid points where init may pause and wait for user input. Make it explicit that init is waiting for calibration, not finished. + - Only ask for four kinds of context: + - who the project is for + - what the core purpose or core functions are + - which internal terms are specific to this team or project rather than generic concepts + - which conventions or boundaries are not obvious from the repo but should affect document structure + - Keep this interaction short. Do not open a broad interview. + - Treat the answers as calibration for investigation scope, terminology, and document structure. + - Persist the current calibration state under `.llmdoc-tmp/investigations/`. + +4. Run investigation. - Use `investigator` for evidence gathering. 
- - Default to multiple focused investigators instead of one broad investigator pass. - - On most non-trivial repositories, start with 3-5 parallel investigators. + - In Claude Code, background investigators are an internal execution detail. Do not treat investigator launch as a completion point. + - Do not hand control back to the user while init is still collecting investigation results, consolidating coverage, running follow-up, generating stable docs, synchronizing the index, or cleaning `.llmdoc-tmp/`, unless explicit user input is required. + - The only valid user-facing pause points during init are: + - the pre-investigation calibration + - the post-investigation confirmation + - the final completion summary after stable docs, index sync, and cleanup are done + - Feed the confirmed calibration context and user hints into the investigation plan when they improve coverage or document structure. + - Before launching each `sink=file` investigator, assign a stable `topic` label and a unique `output_path` under `.llmdoc-tmp/investigations/`. + - Default to multiple focused investigators instead of one broad investigator pass, but cap fan-out by repository size: + - small: `1-2` investigators + - medium: `2-3` investigators + - large: `3-5` investigators - Bias toward coverage, not just speed. `init` should leave a reusable retrieval map, not only enough facts to draft a first document set. - Split by theme, not by random directories. Good slices include repo shape and entrypoints, runtime architecture, feature areas, tests and quality signals, and delivery or ops surfaces when present. - Explicitly cover the major repo surfaces that exist. At minimum, consider public interface docs, command contracts, agent prompts, runtime or tool configuration, quality signals, and integration surfaces. + - Keep theme coverage stable even when fan-out is capped. Merge secondary slices into the main assistant or a quick pass instead of dropping them. 
- Prefer `depth=deep` for the core investigation slices. Use `depth=quick` only for clearly secondary slices. + - Do not inspect excluded dependency, generated, cache, or VCS directories during investigation or follow-up. + - When `sink=file`, the investigator should first assemble the complete markdown report (ending with `` as the last line), then try to persist it to `output_path` with `Write`. Treat `output_path` as the canonical artifact for that topic. Do not rely on `Bash` as the primary persistence path. + - Cap each investigator brief to ≤5 questions and ≤15 specific files or symbols. If a thematic slice exceeds this budget, split it before launch rather than relying on overflow recovery. + - Require the investigator to also attempt a best-effort sidecar `Write` of the same markdown to `.sidecar.md` after the primary write attempt and before returning. The sidecar is a recovery lane for cases where the tool-framework transport loses the return payload. It must never replace a successful primary write, and a failed sidecar write must never block the run. + - Treat each file-sink investigation result as one of four states: + - `persisted`: the report was written to `output_path`, returns `topic`, `output_path`, and `sidecar_path`, and the file contains the `` sentinel + - `write_failed_fallback_ready`: the report could not be written, but returns `topic`, `output_path`, `sidecar_path`, `failure_type`, `failure_message`, and full `report_markdown` so the coordinating assistant can persist it + - `transport_failure`: inferred when the subagent tool call returns an internal error or a missing tool result. No payload is available in the return channel. + - `context_overflow`: inferred when the report file exists on disk but the `` sentinel is missing. The report was truncated before completion. - Persist reports under `.llmdoc-tmp/investigations/`. + - Notification or result text alone is not a persisted report. 
+ - Do not treat a `persisted` response as complete until the coordinating assistant verifies that the file exists, is non-empty, **and** contains the `` sentinel. + - If an investigator returns `write_failed_fallback_ready`, the coordinating assistant must immediately write `report_markdown` to the same `output_path`, then verify the file and sentinel before continuing. + - If the coordinating assistant observes `transport_failure`, it must first check `output_path` on disk. If `output_path` is missing, empty, or lacks the sentinel, it must then check `.sidecar.md`. A valid sidecar is only a recovery source: when the sidecar is complete, copy it back to `output_path`, verify the restored canonical file, and continue only after `output_path` exists, is non-empty, and contains the sentinel. Only rerun when neither path yields a restorable report. + - If the coordinating assistant observes `context_overflow`, do not rerun the same brief scope. Split the topic into ≤3 narrower sub-briefs and route them through the follow-up slot. On Claude Code, sub-briefs can be sent concurrently (they queue at the platform cap of 10). On Codex, prefer sequential sub-briefs to stay within `max_threads` and `max_depth` budget. + - If the coordinating assistant cannot write that fallback report or cannot restore `output_path` from a valid sidecar, pause init and explicitly ask the user for authorization to write the missing report files under `.llmdoc-tmp/investigations/`. Explain which topics are blocked and that init has not finished. + - Do not expand follow-up fan-out while the current required batch still has unpersisted reports. - Do not wait for the repository to be "large enough" before splitting. Split whenever doing so will produce better coverage or clearer retrieval maps. - - After the first wave, run at least one follow-up investigation pass to resolve gaps, conflicts, and cross-cutting relationships discovered by the initial investigators. 
+ - If Claude Code returns foreground control after launching background investigators, immediately continue by waiting for results, checking written investigation reports, and advancing toward the coverage gate. Do not present init as finished. + - If the current fan-out would cause Claude Code to expose an unfinished init as if it were done, reduce investigator count and continue in a more foreground-stable way instead of preserving maximum parallelism. + - While investigators are still running, report status in progress language such as "init is still running" and "waiting for investigator results". Do not imply completion, and do not invite the user to start a new task. + - Run the coverage gate only after every required report from the current batch has been persisted and verified on disk. + - After the first wave, run a coverage gate before deciding on follow-up. The gate should check: + - whether the key themes were covered + - whether investigator conclusions still conflict + - whether user supplements that affect document structure or terminology have been verified + - whether major document-structure ambiguity still remains + - whether any unresolved uncertainty has been downgraded to an explicit gap instead of a hidden assumption + - For small and medium repositories, use the coverage gate to choose one of these outcomes: + - pass: continue to stable-doc generation + - pass with gaps: continue, but preserve the remaining uncertainty as explicit gaps + - targeted follow-up required: run another investigation pass scoped only to the open gaps + - For large repositories, always run the first coverage gate before follow-up. Use it to prepare one targeted follow-up brief by default, then rerun the same gate after that pass to choose the outcomes above. + - Scope every follow-up pass to a brief that contains only `missing_topics`, `conflicts`, `user_supplements_to_verify`, and `doc_structure_risks`. + - Follow-up must only check missing evidence. 
Do not re-run the whole repo, re-open all themes, or revisit already settled conclusions. + - Choose follow-up defaults by repository size: + - small: follow-up is conditional and should use at most `0-1` investigators + - medium: follow-up is conditional and should use at most `1-2` investigators + - large: after the first coverage gate prepares the brief, run one targeted follow-up pass by default, then use the rerun gate to decide whether to continue; use at most `1-3` investigators per follow-up pass - Before handing off to `recorder`, consolidate what was covered, what was intentionally skipped, and what remains uncertain. Missing evidence should become an explicit gap, not a silent omission. - Treat these reports as scratch artifacts for bootstrapping, not stable project memory. -4. Generate the initial stable docs with `recorder`. - - `recorder` should directly read the relevant raw investigation reports under `.llmdoc-tmp/investigations/` before writing stable docs. Do not rely only on second-hand summaries from the coordinating assistant. +5. Run the required post-investigation confirmation. + - Prepare a concise concept list of what is about to enter the record and influence stable docs. + - Include the current understanding of project purpose and audience, core functions, identified internal terms, conventions or boundaries that affect document structure, and the document emphasis the first stable pass is about to prioritize. + - This is one of the only valid points where init may pause and wait for user input. Make it explicit that init is paused for confirmation and has not finished yet. + - Show only two user actions: + - `Generate docs now` + - `I want to add: terms | emphasis | conventions` + - If the user adds information, accept only that scoped supplement instead of reopening a broad interview. + - Route user supplements through the same targeted follow-up and coverage-gate mechanism. 
Only verify the requested terms, emphasis, conventions, and directly related evidence. + - Do not restart the full investigation after user feedback. Follow-up should only fill the open gap and then return to this confirmation step. + - Keep implementation facts evidence-first. User input may refine positioning, terminology, and structure, but it should not override repository evidence about behavior or ownership. + - Do not treat unverified or conflicting claims as stable facts. Keep them as scratch notes or explicit gaps until evidence is strong enough. + +6. Generate the initial stable docs with `recorder`. + - `recorder` should directly read the relevant raw investigation reports under `.llmdoc-tmp/investigations/` after they have been persisted and verified on disk. Do not rely only on second-hand summaries from the coordinating assistant or notification-only fallback text. - Synthesize across all investigation reports, not just the first one that looks complete. + - Use user-confirmed project-positioning information when it improves retrieval quality, terminology, and document structure. - Treat uncovered major areas as documentation gaps to record, not as proof that those areas do not matter. - Create `llmdoc/index.md` as the global documentation map. - Create `llmdoc/startup.md`. @@ -56,13 +136,17 @@ Why: - Ensure `llmdoc/index.md` does not duplicate the ordered startup list in `llmdoc/startup.md`. - Ensure `llmdoc/startup.md` does not duplicate the global category catalog from `llmdoc/index.md`. - Create `llmdoc/overview/project-overview.md`. - - In the first stable pass, prioritize 2-3 core architecture or reference docs that capture the system's deepest invariants, flows, and contracts. Do not spread the first pass across many shallow documents. 
+ - In the first stable pass, prioritize a size-aware core set of architecture or reference docs that capture the system's deepest invariants, flows, and contracts: usually `2-3` on small and medium repositories, and `3-5` on large repositories. Do not spread the first pass across many shallow documents. - Create focused architecture and reference docs based on the investigation reports, then expand into additional smaller docs only after the core docs are deep enough to stand on their own. - Treat document length as a quality tradeoff, not a hard limit. If a core doc needs more space to preserve causal flow, invariants, and terminology, keep it cohesive before splitting. -5. Synchronize `llmdoc/index.md`. +7. Synchronize `llmdoc/index.md`. - Index all stable docs. - Keep `memory/reflections/` and `memory/decisions/` separate from stable docs. - Do not treat `.llmdoc-tmp/` as part of llmdoc. -6. Summarize what was created and where the main startup docs live. +8. Remove `.llmdoc-tmp/`. + - Delete the temporary investigation artifacts after the stable docs and index are complete. + - Do not leave `.llmdoc-tmp/` behind after a successful init run. + +9. Summarize what was created and where the main startup docs live. diff --git a/commands/update.md b/commands/update.md index 7244be7..b8d4cbf 100644 --- a/commands/update.md +++ b/commands/update.md @@ -26,7 +26,18 @@ Why: 2. Investigate the impacted concepts. - Use `investigator`. - Prefer targeted questions over broad repo scans. + - Recreate `.llmdoc-tmp/investigations/` on demand before any file-sink investigation or coordinating-assistant fallback write. Do not assume init left the directory behind. - Persist temporary investigation notes under `.llmdoc-tmp/investigations/` only when they help the current update. + - When using `sink=file`, `topic` and `output_path` are required. Assign a stable `topic` label and a unique `output_path` under `.llmdoc-tmp/investigations/` before launching the investigation. 
+ - File-sink update investigations use the same persistence contract as init: write the full report to `output_path` first, append `` as the last line, then attempt a best-effort sidecar write to `.sidecar.md`. + - Treat `output_path` as the canonical artifact for the update investigation. The sidecar is recovery-only and must never replace the primary artifact. + - Treat each file-sink investigation result as one of the same four states used by init: `persisted`, `write_failed_fallback_ready`, `transport_failure`, or `context_overflow`. + - Do not rely on a `persisted` response until `output_path` exists, is non-empty, and contains the `` sentinel. + - A sidecar-only write is not `persisted`. + - If an investigation returns `write_failed_fallback_ready`, immediately write `report_markdown` to the same `output_path`, then verify the file and sentinel before using it. + - If the investigation transport fails, first check `output_path` on disk. If `output_path` is missing, empty, or lacks the sentinel, then check `.sidecar.md`. A valid sidecar is only a recovery source: when the sidecar is complete, copy it back to `output_path`, verify the restored canonical file, and continue only after `output_path` exists, is non-empty, and contains the sentinel. Only rerun when neither path yields a restorable report. + - If the investigation hits `context_overflow`, do not rerun the same brief scope. Split the topic into narrower follow-up briefs instead. + - If the coordinating assistant cannot create `.llmdoc-tmp/investigations/`, cannot write the fallback report, or cannot restore `output_path` from a valid sidecar, pause update and explicitly ask the user for authorization to write the blocked scratch files under `.llmdoc-tmp/investigations/`. Explain which topics are blocked and that update has not finished. 3. Reflect before editing stable docs. - Use `reflector` to write a task-specific reflection into `llmdoc/memory/reflections/`. 
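Reviewer note: the file-sink persistence contract in this hunk (verify `output_path`, fall back to the sidecar, copy it back, verify again, rerun only as a last resort) can be sketched as a small helper. This is a minimal sketch under stated assumptions, not the project's implementation. The sentinel string is not visible in this diff, so `sentinel` is a caller-supplied placeholder. The function names `is_complete` and `restore_canonical` and the returned labels are hypothetical. The sidecar path is taken from the `sidecar_path` field the investigator returns instead of being derived here.

```python
import shutil
from pathlib import Path
from typing import Optional


def is_complete(path: Path, sentinel: str) -> bool:
    """True if the file exists, is non-empty, and its last non-blank line is the sentinel.

    Assumption: "contains the sentinel" is checked against the final line,
    because the contract says the sentinel is appended as the very last line.
    """
    if not path.is_file():
        return False
    lines = path.read_text(encoding="utf-8").rstrip().splitlines()
    return bool(lines) and lines[-1].strip() == sentinel


def restore_canonical(output_path: Path, sidecar_path: Optional[Path], sentinel: str) -> str:
    """Recovery order after a transport failure (labels are hypothetical)."""
    # 1. The canonical artifact is checked first.
    if is_complete(output_path, sentinel):
        return "verified"
    # 2. A complete sidecar is only a copy source for the canonical file.
    if sidecar_path is not None and is_complete(sidecar_path, sentinel):
        output_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(sidecar_path, output_path)
        # 3. Continue only after the restored canonical file verifies.
        if is_complete(output_path, sentinel):
            return "restored_from_sidecar"
    # 4. Neither path yields a restorable report.
    return "rerun_required"
```

The helper never reports success from a sidecar alone: every successful branch ends with a check of `output_path`, which mirrors the rule that a sidecar-only write is not `persisted`.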
diff --git a/llmdoc/architecture/init-investigation-orchestration.md b/llmdoc/architecture/init-investigation-orchestration.md index 640f20f..10291cf 100644 --- a/llmdoc/architecture/init-investigation-orchestration.md +++ b/llmdoc/architecture/init-investigation-orchestration.md @@ -1,37 +1,83 @@ # Architecture of Init Investigation Orchestration ## Purpose -- Define how `/llmdoc:init` should expand investigation work so the first documentation bootstrap is broad enough to be reusable. +- Define how `/llmdoc:init` should expand investigation work so the first documentation bootstrap stays reusable without imposing large-repo cost on small repositories. - Define the minimum expectation for investigation coverage so bootstrap docs do not inherit blind spots from a narrow first pass. +- Define where required user calibration should shape init without weakening evidence-first behavior. +- Define how init should recover when background investigators can produce a report but cannot persist it directly. ## Core Components - `commands/init.md` (`/llmdoc:init`): The main orchestration contract for repository inspection, investigation, and stable doc generation. - `agents/investigator.md` (`investigator`): The evidence-gathering role used for targeted codebase and doc investigation. - `agents/recorder.md` (`recorder`): The stable-doc writer that must preserve investigation depth instead of flattening it into thin summaries. -- `.codex/config.toml` (`[agents]`): Global Codex agent limits that can silently cap fan-out depth and concurrency. +- `.codex/config.toml` (`[agents]`): Global Codex agent limits that cap fan-out depth and concurrency. Current repo values during development: `max_threads = 8`, `max_depth = 2`. These bound how many investigators can run in parallel and how deep follow-up can nest. Recovery investigators must stay within remaining thread and depth budget. - `README.md` (`/llmdoc:init`): English public summary of the init workflow.
- `README.zh-CN.md` (`/llmdoc:init`): Chinese public summary of the init workflow. ## Flow -- Inspect the repository root and create the llmdoc skeleton. -- Start a first wave of focused investigators, usually 3-5 in parallel for non-trivial repositories. +- Inspect the repository root, exclude dependency or generated directories, estimate relevant LOC, and classify the repo as small (`<= 1000 LOC`), medium (`1001-5000 LOC`), or large (`> 5000 LOC`). +- Create the llmdoc skeleton. +- Run a required pre-investigation calibration step. The assistant must always offer `No extra context, continue` as an explicit option. +- Keep the pre-investigation calibration narrow. It should cover project audience, core purpose or core functions, internal team-specific terms, and conventions or boundaries that should affect document structure. +- Enumerate the thematic investigation slices before assigning subagents. +- Assign each file-sink investigation a stable `topic` label and a unique `.llmdoc-tmp/investigations/...` output path before launching the subagent. +- Start a first wave of focused investigators with size-aware fan-out: + - small: `1-2` + - medium: `2-3` + - large: `3-5` - Split by theme instead of arbitrary directories. Typical slices are repo shape and entrypoints, runtime architecture, feature areas, tests and quality signals, and delivery or ops surfaces when present. - Make coverage explicit. The init flow should deliberately touch the major repo surfaces that exist rather than assuming a few deep slices are enough. +- Keep all major themes in scope even when the size cap is lower than the number of ideal slices. Merge secondary slices into the coordinating assistant or a quick pass instead of dropping them. - Prefer `depth=deep` for core slices and use `depth=quick` only for clearly secondary slices. -- Run a follow-up investigation pass to resolve gaps, conflicts, and cross-cutting relationships discovered by the first wave. 
+- Let investigators try to persist their own file-sink reports first with `Write`, not `Bash`. +- Treat `result returned` and `report persisted` as different states. +- Do not treat a `persisted` result as complete until the coordinating assistant verifies the canonical `output_path` exists, is non-empty, and contains the `` sentinel. +- If an investigator cannot persist its report, require it to return a structured fallback payload containing `topic`, `output_path`, `sidecar_path`, `failure_type`, `failure_message`, and the full markdown report so the coordinating assistant can write it. +- Require the investigator to always attempt a best-effort sidecar `Write` of the same markdown to `.sidecar.md` before returning, regardless of whether the primary write succeeded. Treat the sidecar as a recovery lane that must never block the run if it fails or replace the canonical artifact. +- Let the coordinating assistant immediately persist that fallback report, then verify the file exists, is non-empty, and contains the `` sentinel before treating the topic as complete. +- Handle the third observation state `transport_failure`, inferred when the subagent tool call returns an internal error or a missing tool result. On `transport_failure`, the coordinating assistant must first check `output_path` on disk. If `output_path` is missing, empty, or lacks the sentinel, it must then check `.sidecar.md`. A valid sidecar is only a recovery source: when the sidecar is complete, copy it back to `output_path`, verify the restored canonical file, and continue only after `output_path` exists, is non-empty, and contains the sentinel. Only rerun when neither path yields a restorable report. +- Handle the fourth observation state `context_overflow`, inferred when the report file exists on disk but the `` sentinel is missing. On `context_overflow`, do not rerun the same brief scope. Split the topic into ≤3 narrower sub-briefs and route them through the follow-up slot. 
Claude Code queues sub-briefs at the platform cap of 10; Codex must stay within the remaining `max_threads` and `max_depth` budget and should prefer serial sub-briefs when budget is tight. +- If the coordinating assistant also cannot write the fallback report or cannot restore `output_path` from a valid sidecar, pause init and ask the user for explicit authorization to write the blocked `.llmdoc-tmp/investigations/` files. +- Run the coverage gate only after every required report in the current batch has been persisted and verified on disk. +- Run a coverage gate after the first wave. The gate should check key-theme coverage, unresolved conflicts, unverified user supplements, document-structure risks, and whether remaining uncertainty has been downgraded to explicit gaps. +- For small and medium repositories, let the first coverage gate either pass directly or trigger follow-up only when it fails. +- For large repositories, let the first coverage gate prepare one targeted follow-up brief by default, then use the same gate after that pass to decide whether more follow-up is needed. +- Scope each follow-up pass to a brief containing only `missing_topics`, `conflicts`, `user_supplements_to_verify`, and `doc_structure_risks`. +- Treat follow-up as a targeted repair phase. It should never re-run the whole repo or re-open already settled themes. +- If persistence becomes unstable during a batch, stabilize the current batch before increasing follow-up fan-out. - Before synthesis, consolidate covered slices, intentionally skipped areas, and unresolved uncertainty so `recorder` sees both knowledge and gaps. +- Run a required post-investigation confirmation before stable-doc generation. +- The post-investigation confirmation should show a concise concept list and only two actions: generate now, or add terms, emphasis, or conventions. 
+- If user supplements reveal unresolved ambiguity, send them through the same targeted follow-up and coverage-gate path instead of restarting the whole init flow. - Let `recorder` directly read the raw investigation reports and synthesize across all investigation outputs instead of trusting a single broad report or a second-hand summary. -- In the first stable pass, let `recorder` produce a small number of deep core docs before expanding into narrower retrieval docs. +- In the first stable pass, let `recorder` produce a size-aware small number of deep core docs before expanding into narrower retrieval docs: usually `2-3` on small and medium repositories, and `3-5` on large repositories. ## Invariants +- `/llmdoc:init` should always offer the pre-investigation calibration step and explicitly show `No extra context, continue`. +- `/llmdoc:init` should always require a post-investigation confirmation before stable-doc generation. +- Both user interactions should stay narrow and decision-oriented instead of turning into a broad interview. - `/llmdoc:init` should not default to a single broad investigation pass on non-trivial repositories. - `/llmdoc:init` should not treat partial slice coverage as "comprehensive enough" without an explicit coverage check. +- `/llmdoc:init` should use repository-size thresholds to cap fan-out, not to remove themes from coverage. +- `/llmdoc:init` should exclude dependency, generated, cache, and VCS directories from sizing and investigation. +- `/llmdoc:init` should distinguish `result returned` from `report persisted` and should not treat notification-only results as completed reports. +- `/llmdoc:init` should not treat a missing or internal-error tool result as proof that no report exists. Before rerunning, it must check `output_path` and `.sidecar.md` on disk. +- `/llmdoc:init` should not treat a sidecar-only file as `persisted`. It must restore the canonical `output_path` before continuing from a recovered report. 
+- `/llmdoc:init` should treat a report file without the `` sentinel as `context_overflow`, not `persisted`. A truncated report must never silently enter the coverage gate. +- `/llmdoc:init` should not rerun a `context_overflow` topic with the same brief scope. It must split the brief and use the follow-up slot. +- `/llmdoc:init` should not increase concurrent fan-out beyond what the platform can queue or execute when recovering from `context_overflow`. Claude Code queues at 10; Codex is bounded by `max_threads` and `max_depth`. +- `/llmdoc:init` should prefer investigator self-persist, then coordinating-assistant fallback persist, then explicit user authorization if both writes fail. +- `/llmdoc:init` should not rely on `Bash` as the primary persistence path for investigation scratch files. +- `/llmdoc:init` should treat follow-up as targeted gap resolution, not as a second full-repo pass. - `/llmdoc:init` should not flatten deep investigation into many shallow stable docs during the first pass. +- `/llmdoc:init` should not write unverified or conflicting user claims into stable docs as facts. - Investigation scratch belongs in `.llmdoc-tmp/investigations/`. - Public docs and command contracts should describe the same init behavior. - Codex agent limits should allow at least one level of follow-up investigation rather than capping the workflow at depth `1`. 
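Reviewer note: the size thresholds and first-wave fan-out ranges in the Flow section can be sketched as a lookup. This is a minimal sketch under stated assumptions. The function names and the `FIRST_WAVE_FANOUT` table are hypothetical. Clamping the investigator count to the number of enumerated themes is my own reading of "cap fan-out, not coverage"; the document only gives the ranges.

```python
from typing import Dict, Tuple

# First-wave ranges as stated in the Flow section: (minimum, maximum) investigators.
FIRST_WAVE_FANOUT: Dict[str, Tuple[int, int]] = {
    "small": (1, 2),
    "medium": (2, 3),
    "large": (3, 5),
}


def classify_repo(relevant_loc: int) -> str:
    """Classify by relevant LOC, counted after excluding dependency,
    generated, cache, and VCS directories."""
    if relevant_loc <= 1000:
        return "small"
    if relevant_loc <= 5000:
        return "medium"
    return "large"


def first_wave_investigators(relevant_loc: int, theme_count: int) -> int:
    """Pick an investigator count inside the size range (clamping is an assumption).

    Themes beyond the cap are not dropped; per the Flow section they are merged
    into the coordinating assistant or a quick pass.
    """
    low, high = FIRST_WAVE_FANOUT[classify_repo(relevant_loc)]
    return max(low, min(high, theme_count))
```

For example, a repository with 800 relevant LOC and five enumerated themes gets two investigators here, and the remaining themes stay in scope through merging instead of being removed.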
## Related Docs +- `llmdoc/guides/init-user-calibration.md` - `llmdoc/guides/updating-init-investigation-depth.md` - `llmdoc/reference/repo-surfaces.md` - `llmdoc/memory/reflections/2026-04-05-init-subagent-depth.md` +- `llmdoc/memory/reflections/2026-04-20-subagent-transport-failure.md` diff --git a/llmdoc/guides/init-user-calibration.md b/llmdoc/guides/init-user-calibration.md new file mode 100644 index 0000000..46216f3 --- /dev/null +++ b/llmdoc/guides/init-user-calibration.md @@ -0,0 +1,43 @@ +# How to Update Init User Calibration + +## Preconditions +- Read `llmdoc/architecture/init-investigation-orchestration.md` before editing. +- Confirm whether the change affects the command contract, helper skill, public README summaries, or all three. + +## Main Steps +1. Inspect `commands/init.md` and `skills/llmdoc-init/SKILL.md` to confirm that `/llmdoc:init` still has two user checkpoints: one before investigation and one after investigation. +2. Keep the pre-investigation checkpoint required, but always show `No extra context, continue` as an explicit user-facing option. +3. Keep the pre-investigation questions narrow. They should only cover: + - who the project is for + - what the core purpose or core functions are + - which internal terms are team-specific rather than generic + - which hidden conventions or boundaries should affect document structure +4. Keep the pre-investigation checkpoint short. It should calibrate investigation scope, terminology, and document structure instead of turning into a broad interview. +5. Keep the post-investigation checkpoint required. +6. Make the post-investigation checkpoint show a concise concept list of what is about to influence stable docs. +7. Keep the post-investigation choices narrow: generate now, or add terms, emphasis, or conventions. +8. If the user adds information, route it through the same targeted follow-up and coverage-gate mechanism used by init. 
Only verify the supplemented terms, emphasis, conventions, and directly related evidence. +9. Keep evidence-first behavior intact. User-confirmed project-positioning information may shape stable docs, but user input must not override repository evidence about implementation behavior or ownership. +10. Keep unverified or conflicting claims in scratch notes or explicit gaps until evidence is strong enough. +11. Synchronize `README.md`, `README.zh-CN.md`, `llmdoc/index.md`, and routing docs whenever the interaction design changes. + +## Verification +- `commands/init.md` explicitly includes the pre-investigation checkpoint and requires `No extra context, continue` to be shown as an option. +- `commands/init.md` explicitly includes the required post-investigation confirmation and its two allowed actions. +- User supplements flow into targeted follow-up, not a whole-repo rerun. +- `skills/llmdoc-init/SKILL.md` matches the command contract. +- The README summaries match the actual init flow. +- Unverified or conflicting user claims do not enter stable docs as facts. + +## Common Failure Points +- Turning the pre-investigation checkpoint into a broad interview. +- Letting the skip path exist only as an implied behavior instead of a visible option. +- Adding many post-investigation options that make the decision harder instead of easier. +- Treating user supplements as fact without a targeted evidence check. +- Restarting broad repository investigation when only a narrow follow-up is needed. +- Updating the command contract but forgetting the helper skill or public summaries. 
+ +## Related Docs +- `llmdoc/architecture/init-investigation-orchestration.md` +- `llmdoc/guides/updating-init-investigation-depth.md` +- `llmdoc/reference/repo-surfaces.md` diff --git a/llmdoc/guides/updating-init-investigation-depth.md b/llmdoc/guides/updating-init-investigation-depth.md index f1bb864..1b3269c 100644 --- a/llmdoc/guides/updating-init-investigation-depth.md +++ b/llmdoc/guides/updating-init-investigation-depth.md @@ -1,28 +1,59 @@ -# How to Update Init Investigation Depth, Coverage, and Synthesis +# How to Update Init Investigation Depth, Persistence Fallback, Coverage, Follow-up, and Synthesis ## Preconditions - Confirm whether the weak behavior comes from command guidance, agent prompts, runtime config, or all three. - Read `llmdoc/architecture/init-investigation-orchestration.md` before editing. ## Main Steps -1. Inspect `commands/init.md` to see whether investigation defaults to one broad pass or multiple focused passes, and whether it makes coverage expectations explicit. +1. Inspect `commands/init.md` to see whether investigation still uses thematic splitting, repository-size thresholds, explicit persistence checks, explicit coverage expectations, and targeted follow-up instead of a fixed second full-repo pass. 2. Inspect `.codex/config.toml` to see whether `max_threads` or `max_depth` will cap the intended orchestration. -3. Inspect `agents/recorder.md` to see whether stable-doc synthesis preserves depth or pushes the workflow toward premature fragmentation. -4. Update the init contract so it specifies thematic splitting, a reasonable default investigator count, explicit major-surface coverage, direct recorder access to raw investigation reports, and a follow-up gap-check pass. -5. Update `recorder` rules so init favors a few deep core docs before wider splitting. -6. Update public summaries in `README.md` and `README.zh-CN.md` so they match the actual contract. -7. 
If the change reveals a recurring lesson, record it in a reflection and promote stable parts into architecture or reference docs. +3. Inspect `agents/investigator.md` and `.codex/agents/llmdoc-investigator.toml` to confirm that file-sink investigations: (a) end every report with the `` sentinel, (b) cap each brief to ≤5 questions and ≤15 files, (c) try direct persistence first, (d) perform a best-effort sidecar `Write` to `.sidecar.md`, and (e) return fallback-ready payloads when primary writes fail. +4. Inspect `agents/recorder.md` to see whether stable-doc synthesis preserves depth or pushes the workflow toward premature fragmentation. +5. Update the init contract so it keeps all major themes in scope while capping first-wave fan-out by repository size: + - small: `<= 1000 LOC` + - medium: `1001-5000 LOC` + - large: `> 5000 LOC` +6. Keep dependency, generated, cache, and VCS directories excluded from sizing and investigation. At minimum, the contract should name `node_modules/`, `dist/`, `build/`, `.next/`, `coverage/`, `vendor/`, and `.git/`. +7. Update the init contract so file-sink investigations distinguish `result returned` from `report persisted`, and so the coordinating assistant can persist fallback-ready reports before coverage continues. +8. 
Keep direct investigator persistence on the normal path, but require the fallback order to stay: + - investigator writes successfully (sentinel must be present for `persisted` to be accepted) + - coordinating assistant writes fallback markdown to the same path + - on `transport_failure`, coordinating assistant first checks `output_path`; if only the sidecar is complete, it copies the sidecar back to `output_path` and verifies the restored canonical file before considering a rerun + - on `context_overflow` (file exists, sentinel missing), split the brief into ≤3 sub-briefs and route via follow-up; do not rerun the same scope + - user authorization is requested only if the fallback write fails and no valid sidecar restore copy can produce the canonical `output_path` +9. Update the init contract so follow-up is driven by a coverage gate, with these outcomes: + - `pass` + - `pass with gaps` + - `targeted follow-up required` +10. Keep follow-up scoped to `missing_topics`, `conflicts`, `user_supplements_to_verify`, and `doc_structure_risks`. Do not let it re-run the whole repo. +11. Update `recorder` rules so init favors a few deep core docs before wider splitting. +12. Update public summaries in `README.md` and `README.zh-CN.md` so they match the actual contract. +13. If the change reveals a recurring lesson, record it in a reflection and promote stable parts into architecture or reference docs. ## Verification -- `commands/init.md` explicitly describes multiple focused investigators, coverage expectations, direct recorder reads of raw investigation reports, and a follow-up pass. -- `agents/recorder.md` allows deep core docs during init instead of forcing early fragmentation. +- `commands/init.md` explicitly describes thematic splitting, repository-size fan-out thresholds, investigator-to-main-agent persistence fallback, coverage expectations, direct recorder reads of raw investigation reports, and targeted follow-up behavior. 
+- `agents/investigator.md`, `.codex/agents/llmdoc-investigator.toml`, `commands/init.md`, and `skills/llmdoc-init/SKILL.md` agree on the four file-sink outcomes: `persisted` (file + sentinel present), `write_failed_fallback_ready`, the main-agent-inferred `transport_failure`, and the main-agent-inferred `context_overflow` (file present, sentinel missing). Investigator-side surfaces include the `` sentinel requirement, the ≤5-question/≤15-file brief cap, the best-effort sidecar write, and the `SIDECAR_PATH` field. +- `agents/recorder.md` allows size-aware deep core docs during init instead of forcing early fragmentation, with large repositories allowed a wider first-pass core set than small and medium repositories. +- Recovery rules treat `output_path` as the canonical artifact and use sidecars only as copy sources for restoring that canonical file, never as persisted end state on their own. +- `commands/init.md` excludes dependency and generated directories from both sizing and investigation. +- `commands/init.md` requires coverage to wait for persisted and verified investigation files instead of notification-only results. +- `commands/init.md` keeps follow-up conditional for small and medium repositories, while large repositories default to one targeted follow-up pass. - `.codex/config.toml` no longer blocks the intended depth. - The English and Chinese README summaries match the command behavior. ## Common Failure Points - Fixing only command text while leaving agent limits too tight. -- Raising concurrency without documenting how investigation should be split or what "enough coverage" means. +- Raising concurrency without documenting how investigation should be split, how size thresholds apply, or what "enough coverage" means. +- Letting notification text masquerade as a persisted report, so coverage starts before the scratch files exist. 
+- Treating a missing or internal-error tool return as a rerun trigger without first checking `output_path` and the sidecar on disk, which re-runs investigators whose reports are already persisted. +- Continuing from a sidecar-only file without restoring `output_path`, which leaves the canonical artifact missing and breaks the persisted contract. +- Treating a report file without the `` sentinel as `persisted` — a truncated report silently entering the coverage gate produces shallow or contradictory stable docs. +- Retrying a `context_overflow` investigator with the same brief scope — the same context limit will truncate it again. Split and use follow-up instead. +- Spawning concurrent recovery investigators in Codex when `max_threads` or `max_depth` is already saturated — excess investigators are blocked, not queued, unlike Claude Code. +- Falling back to user authorization too early instead of first letting the coordinating assistant persist the returned markdown. - Improving investigation depth without changing how `recorder` consumes and preserves that depth. +- Letting follow-up drift into a second full-repo pass instead of a targeted repair phase. +- Counting `node_modules/` and other generated directories toward repository size, which inflates fan-out decisions and slows init. - Updating only one README and leaving the other public surface stale. 
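Reviewer note: the brief budget (≤5 questions, ≤15 files or symbols) and the `context_overflow` rule (split into ≤3 narrower sub-briefs, never rerun the same scope) can be sketched as follows. This is a minimal sketch under stated assumptions. The `Brief` type, its field names, and `split_on_overflow` are hypothetical. The guide does not say how a brief should be partitioned, so this sketch narrows only by question and leaves the target list untouched.

```python
import math
from dataclasses import dataclass, field
from typing import List

MAX_QUESTIONS = 5    # per-brief question cap from the guide
MAX_TARGETS = 15     # per-brief cap on specific files or symbols
MAX_SUB_BRIEFS = 3   # a context_overflow topic splits into at most three


@dataclass
class Brief:
    topic: str
    questions: List[str]
    targets: List[str] = field(default_factory=list)

    def within_budget(self) -> bool:
        """Check the pre-launch budget; over-budget slices should be split first."""
        return len(self.questions) <= MAX_QUESTIONS and len(self.targets) <= MAX_TARGETS


def split_on_overflow(brief: Brief) -> List[Brief]:
    """Split a truncated brief into at most three narrower sub-briefs.

    Assumption: questions are divided into contiguous chunks; how targets
    should be divided is unspecified, so each sub-brief keeps the full list.
    """
    if len(brief.questions) < 2:
        raise ValueError("a single-question brief must be narrowed by hand")
    parts = min(MAX_SUB_BRIEFS, len(brief.questions))
    size = math.ceil(len(brief.questions) / parts)
    chunks = [brief.questions[i:i + size] for i in range(0, len(brief.questions), size)]
    return [
        Brief(f"{brief.topic}-part{n}", chunk, list(brief.targets))
        for n, chunk in enumerate(chunks, start=1)
    ]
```

Each sub-brief carries fewer questions than the original, so a rerun never repeats the scope that was truncated. Whether the sub-briefs are launched concurrently or one at a time is a platform decision outside this sketch.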
## Related Docs diff --git a/llmdoc/index.md b/llmdoc/index.md index 7214f8c..d284321 100644 --- a/llmdoc/index.md +++ b/llmdoc/index.md @@ -15,15 +15,17 @@ ## Key Documents - `llmdoc/startup.md`: ordered startup reading list - `llmdoc/overview/project-overview.md`: what this repository is and what belongs here -- `llmdoc/architecture/init-investigation-orchestration.md`: how `/llmdoc:init` investigation is expected to fan out and converge -- `llmdoc/guides/updating-init-investigation-depth.md`: how to change init depth safely when the workflow is too shallow or too broad +- `llmdoc/architecture/init-investigation-orchestration.md`: how `/llmdoc:init` fan-out, report persistence fallback, transport-failure recovery, coverage gates, follow-up, and repo exclusions are expected to work +- `llmdoc/guides/init-user-calibration.md`: how `/llmdoc:init` should collect and use the two required user calibration steps +- `llmdoc/guides/updating-init-investigation-depth.md`: how to change init depth, report persistence fallback, transport-failure recovery, follow-up behavior, and repository-size thresholds safely - `llmdoc/reference/repo-surfaces.md`: stable map of commands, agents, plugin files, and Codex config surfaces ## Routing Rules - Read `startup.md` first on normal work. -- Read `architecture/init-investigation-orchestration.md` before changing `/llmdoc:init`, agent fan-out strategy, or Codex agent limits. -- Read `guides/updating-init-investigation-depth.md` before tuning investigation breadth or follow-up passes. -- Read `reference/repo-surfaces.md` before moving or renaming public repo surfaces such as commands, agents, plugin files, or `.codex/config.toml`. +- Read `architecture/init-investigation-orchestration.md` before changing `/llmdoc:init`, agent fan-out strategy, report persistence fallback, transport-failure recovery, follow-up gates, repository-size thresholds, exclusion rules, or Codex agent limits. 
+- Read `guides/init-user-calibration.md` before changing `/llmdoc:init` user questions, confirmation checkpoints, or evidence-vs-user-input rules. +- Read `guides/updating-init-investigation-depth.md` before tuning investigation breadth, report persistence fallback, transport-failure recovery, follow-up passes, or repository-size thresholds. +- Read `reference/repo-surfaces.md` before moving or renaming public repo surfaces such as commands, agents, plugin files, marketplace manifests, or `.codex/config.toml`. ## Memory - `llmdoc/memory/reflections/`: task-specific lessons and mistakes diff --git a/llmdoc/memory/reflections/2026-04-18-marketplace-identifier-compat.md b/llmdoc/memory/reflections/2026-04-18-marketplace-identifier-compat.md new file mode 100644 index 0000000..16d66d2 --- /dev/null +++ b/llmdoc/memory/reflections/2026-04-18-marketplace-identifier-compat.md @@ -0,0 +1,27 @@ +# Marketplace Identifier Compatibility Reflection + +## Task +- Align Codex and Claude plugin marketplace metadata, renaming the local marketplace from `llmdoc-local` to `llmdoc-cc-plugin` and consolidating description metadata. + +## Expected vs Actual +- Expected: users install `llmdoc@llmdoc-cc-plugin` and it just works. +- Actual: users with cached entries under older names (`llmdoc-local`, `tokenroll-cc-plugin`) see stale marketplace references that produce a reload error until the old entry is removed. + +## What Went Wrong +- The marketplace identifier is effectively a public install contract, but it was treated as internal metadata. +- `llmdoc/reference/repo-surfaces.md` listed `.claude-plugin/plugin.json` but omitted `.claude-plugin/marketplace.json` and `.agents/plugins/marketplace.json`, so the rename had no stable-doc source of truth. +- Public READMEs absorbed the compatibility instructions, but llmdoc internal docs did not record that the identifier itself is stable and user-facing. 
+ +## Root Cause +- The plugin metadata surface was split across Codex (`.agents/plugins/marketplace.json`) and Claude (`.claude-plugin/marketplace.json`), and llmdoc reference docs tracked only the plugin manifests, not the marketplace manifests that publish the install name. + +## Missing Docs or Signals +- No reference entry declaring the current marketplace identifier (`llmdoc-cc-plugin`) and the files that publish it. +- No routing hint that connects plugin install or reload-plugins failures to a marketplace-identifier mismatch. + +## Promotion Candidates +- Add marketplace manifests and the current marketplace identifier to `llmdoc/reference/repo-surfaces.md` as stable surfaces. +- Keep the README compatibility note as the user-facing recovery path, and let `repo-surfaces.md` act as the internal source of truth for which identifier is currently canonical. + +## Follow-up +- If the marketplace identifier changes again, update both marketplace manifests, the README compatibility note in English and Chinese, and `llmdoc/reference/repo-surfaces.md` in the same commit. diff --git a/llmdoc/memory/reflections/2026-04-20-subagent-transport-failure.md b/llmdoc/memory/reflections/2026-04-20-subagent-transport-failure.md new file mode 100644 index 0000000..5099c85 --- /dev/null +++ b/llmdoc/memory/reflections/2026-04-20-subagent-transport-failure.md @@ -0,0 +1,33 @@ +# Subagent Transport Failure Reflection + +## Task +- Extend the init investigation protocol to handle Claude Code tool-framework errors where the entire subagent tool call returns `[Tool result missing due to internal error]`. + +## Expected vs Actual +- Expected: when an investigator fails, the fallback protocol kicks in and the main agent either writes `report_markdown` to disk or degrades cleanly. +- Actual: when the tool framework itself loses the subagent's return payload, the main agent has no STATUS, no `failure_type`, no `failure_message`, and no `report_markdown`. 
It cannot degrade because degradation assumes "markdown in hand". + +## What Went Wrong +- The protocol defined only two return states: `persisted` and `write_failed_fallback_ready`. Both assume the tool transport layer itself is healthy. +- Transport-level failures (missing tool result, internal error) were implicitly treated the same as a missed timeout, so the only recovery path was a full rerun. +- On a `required batch` shard such as a deep architecture slice, a missing subagent result blocks the coverage gate and forces a rerun even when the file may already be on disk. + +## Root Cause +- The fallback taxonomy conflated subagent protocol failure with transport failure. One has markdown to recover from; the other does not. +- Neither side of the protocol wrote a durable artifact outside the return channel, so when the return channel broke, there was nothing to salvage. + +## Missing Docs or Signals +- No documented recovery path for transport-layer failure. +- No instruction to the main agent to check `output_path` on disk before concluding the subagent produced nothing. +- No sidecar or equivalent best-effort artifact that survives a lost tool return. + +## Promotion Candidates +- Add a third observation state `transport_failure` to the init contract. It is inferred by the main agent, not returned by the investigator. +- Require the main agent to verify `output_path` on disk before rerunning on a missing tool result. A report that reached disk is not lost just because the tool return was. +- Require the investigator to mirror `report_markdown` to a sidecar path as a best-effort recovery lane before returning, so even a lost tool result leaves something to degrade from. +- Update `agents/investigator.md` OutputFormat_File so the sidecar write is part of the contract, not an optional tweak. +- Update `llmdoc/guides/updating-init-investigation-depth.md` verification to cover the three-state taxonomy. 
+ +## Follow-up +- When persistence stabilizes, re-evaluate whether the sidecar can be dropped on the success path to avoid double IO. On the failure path, keep it. +- If future framework errors expose a fourth failure mode (for example truncated markdown), extend the taxonomy rather than overloading `transport_failure`. diff --git a/llmdoc/memory/reflections/2026-04-21-context-overflow-recovery.md b/llmdoc/memory/reflections/2026-04-21-context-overflow-recovery.md new file mode 100644 index 0000000..98df3cf --- /dev/null +++ b/llmdoc/memory/reflections/2026-04-21-context-overflow-recovery.md @@ -0,0 +1,45 @@ +# Context Overflow Recovery and Platform Limit Confirmation + +## Task +- Extend the investigator failure-recovery protocol to cover context window overflow, and anchor it to confirmed platform concurrency limits. + +## Confirmed Platform Limits + +| Platform | Concurrency | Depth | Behavior at limit | +|----------|-------------|-------|-------------------| +| Claude Code | Hard 10 | 1 level (subagents cannot spawn subagents) | Auto-queues; does not reject | +| Codex | `max_threads = 8` | `max_depth = 2` | Blocks fan-out; does not queue | + +Key distinction: Claude Code queues gracefully; Codex blocks. Recovery strategies that add concurrent investigators are safe on Claude Code (they queue at 10), but dangerous on Codex (blocked when `max_threads` is saturated). + +## What Went Wrong (Prior Protocol Gap) + +The protocol defined three investigator result states: `persisted`, `write_failed_fallback_ready`, `transport_failure`. None of these covered the case where the subagent's context window overflows mid-run, producing a partial file that exists on disk but was never completed. The coordinating assistant would treat this truncated file as a valid `persisted` report, letting incomplete evidence enter the coverage gate silently. + +## Design Decisions + +**Sentinel ``**: each investigator appends this HTML comment as the last line of every markdown report. 
The coordinating assistant now checks for the sentinel as part of verification. A file without the sentinel is `context_overflow`, not `persisted`. The sentinel is an HTML comment — invisible in rendered markdown if accidentally included in stable docs, and automatically cleaned up with `.llmdoc-tmp/` at the end of init. + +**Fourth observation state `context_overflow`**: inferred by the coordinating assistant when the report file exists but the sentinel is missing. It is not returned by the subagent (the subagent is unaware of its own overflow). + +**Recovery via follow-up slot, not parallel fan-out**: `context_overflow` means the brief was too wide. Retrying with the same scope will overflow again. Instead, split the topic into ≤3 narrower sub-briefs and route them through the existing follow-up slot. On Claude Code sub-briefs can be concurrent (they queue). On Codex, serialize to respect `max_threads` and `max_depth`. + +**Brief size cap**: each investigator brief is capped at ≤5 questions and ≤15 specific files or symbols at launch. This prevents overflow rather than relying on recovery. + +## Root Cause of the Gap + +The original protocol was designed to handle write failures and transport failures — both cases where the report either exists or doesn't. Partial writes (file exists, content truncated) were not considered because the protocol assumed `Write` is atomic. In practice, a context overflow produces a partial file that appears to have "succeeded" from the filesystem's perspective. 
+ +## Promotion Candidates + +All promoted in this update: +- Sentinel requirement and `context_overflow` state → architecture invariants and all four protocol surfaces +- Brief size cap → investigator prompt and Codex TOML +- Platform-aware recovery (queue vs block) → architecture doc and guide +- README sections: investigator failure table and platform limit note + +## Follow-up + +- If the Codex `max_depth` is increased from 2, the recovery strategy for Codex can be relaxed to allow limited concurrent sub-briefs. Update the architecture note and guide at that point. +- If Claude Code ships `maxParallelAgents`, the protocol can surface a recommended cap. For now, 10 is the platform default and cannot be configured lower or higher. +- The brief size cap (≤5 questions / ≤15 files) is a conservative starting point. If experience shows investigators consistently run out of questions before context, the cap can be relaxed. diff --git a/llmdoc/must/doc-routing.md b/llmdoc/must/doc-routing.md index f709f62..e7200f1 100644 --- a/llmdoc/must/doc-routing.md +++ b/llmdoc/must/doc-routing.md @@ -1,7 +1,7 @@ # Doc Routing ## Read Next By Task -- For `/llmdoc:init` changes: read `llmdoc/architecture/init-investigation-orchestration.md` and `llmdoc/guides/updating-init-investigation-depth.md`. +- For `/llmdoc:init` changes: read `llmdoc/architecture/init-investigation-orchestration.md`, `llmdoc/guides/init-user-calibration.md`, and `llmdoc/guides/updating-init-investigation-depth.md`, especially when changing fan-out thresholds, report persistence fallback, transport-failure recovery, follow-up gates, or exclusion rules. - For public interface or repo layout changes: read `llmdoc/overview/project-overview.md` and `llmdoc/reference/repo-surfaces.md`. - For repeated workflow mistakes or regressions: read the relevant files under `llmdoc/memory/reflections/` first. 
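Reviewer note on the context-overflow reflection in `2026-04-21-context-overflow-recovery.md`: the sentinel check and the "split into ≤3 narrower sub-briefs" rule could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from this repository. The function names, the `missing` and `empty` labels, and the round-robin split are hypothetical, and the sentinel is passed in as a parameter because its literal value is not reproduced here. The sketch checks the last non-blank line, which is slightly stricter than "contains the sentinel".

```python
from pathlib import Path


def classify_report_file(output_path: Path, sentinel: str) -> str:
    """Classify a scratch report by inspecting the file on disk.

    `sentinel` is whatever completion marker the investigator was told to
    append as the last line; its literal value is not assumed here.
    """
    if not output_path.is_file():
        return "missing"  # hypothetical label, not a protocol state
    lines = output_path.read_text(encoding="utf-8").splitlines()
    # Ignore trailing blank lines so a final newline does not hide the sentinel.
    while lines and not lines[-1].strip():
        lines.pop()
    if not lines:
        return "empty"  # hypothetical label, not a protocol state
    if lines[-1].strip() != sentinel:
        # The file exists but was never completed: treat it as truncated.
        return "context_overflow"
    return "persisted"


def split_brief(questions: list[str], max_parts: int = 3) -> list[list[str]]:
    """Split an overflowed brief into at most `max_parts` narrower sub-briefs."""
    parts = min(max_parts, len(questions))
    if parts == 0:
        return []
    # Round-robin keeps the sub-briefs balanced; a real split would group by theme.
    return [questions[i::parts] for i in range(parts)]
```

A partial file classified as `context_overflow` would then be routed through `split_brief` and the follow-up slot rather than rerun with the same scope.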
diff --git a/llmdoc/reference/repo-surfaces.md b/llmdoc/reference/repo-surfaces.md index 764dbd9..a0b4d5c 100644 --- a/llmdoc/reference/repo-surfaces.md +++ b/llmdoc/reference/repo-surfaces.md @@ -12,6 +12,7 @@ - `.codex/config.toml`: Codex-wide agent fan-out and depth limits for this repository. - `.codex/agents/*.toml`: Project-scoped Codex custom agents. - `.codex-plugin/plugin.json` and `.claude-plugin/plugin.json`: Plugin metadata for Codex and Claude Code. +- `.agents/plugins/marketplace.json` and `.claude-plugin/marketplace.json`: Marketplace manifests that publish the install identifier. Current identifier is `llmdoc-cc-plugin`; renames break existing caches and require a README compatibility note. - `README.md` and `README.zh-CN.md`: Public summaries that should reflect actual workflow behavior, not aspirational behavior. ## Sources of Truth diff --git a/skills/llmdoc-init/SKILL.md b/skills/llmdoc-init/SKILL.md index 37dc945..009df66 100644 --- a/skills/llmdoc-init/SKILL.md +++ b/skills/llmdoc-init/SKILL.md @@ -25,7 +25,9 @@ Then execute this workflow: 1. Inspect the project root. - Read top-level manifests and README files. - - Avoid dependency and build directories. + - Exclude dependency, generated, cache, and VCS directories throughout init. At minimum, ignore `node_modules/`, `dist/`, `build/`, `.next/`, `coverage/`, `vendor/`, and `.git/`. + - Estimate repository size from first-party source files and tests after those exclusions. Do not count lockfiles, generated artifacts, vendored code, or cache directories toward LOC thresholds. + - Use that LOC estimate to classify the repo as small (`<= 1000 LOC`), medium (`1001-5000 LOC`), or large (`> 5000 LOC`). 2. Create or repair the llmdoc skeleton. - Ensure these paths exist: @@ -39,25 +41,84 @@ Then execute this workflow: - `llmdoc/memory/decisions/` - `.llmdoc-tmp/investigations/` -3. Run investigation. +3. Run the pre-investigation user calibration. 
+ - This step is required, but the user may skip it by pressing Enter with no extra reply. + - If the environment supports explicit options, `No extra context, continue` may still be shown, but a blank reply should be treated the same way. + - Continue with repository evidence when the user presses Enter with no extra reply. + - This is one of the only valid points where init may pause and wait for user input. Make it explicit that init is waiting for calibration, not finished. + - Limit the interaction to four kinds of context: who the project is for, what the core purpose or core functions are, which internal terms are team-specific rather than generic, and which hidden conventions or boundaries should affect document structure. + - Keep the interaction short and use it only to calibrate investigation scope, terminology, and document structure. + - Persist the current calibration state under `.llmdoc-tmp/investigations/`. + +4. Run investigation. + - In Claude Code, background investigators are an internal execution detail. Do not treat investigator launch as a completion point. + - Do not hand control back to the user while init is still collecting investigation results, consolidating coverage, running follow-up, generating stable docs, synchronizing the index, or cleaning `.llmdoc-tmp/`, unless explicit user input is required. + - The only valid user-facing pause points during init are the pre-investigation calibration, the post-investigation confirmation, and the final completion summary after stable docs, index sync, and cleanup are done. + - Feed the confirmed calibration context and user hints into the investigation plan when they improve coverage or document structure. + - Before launching each `sink=file` investigator, assign a stable `topic` label and a unique `output_path` under `.llmdoc-tmp/investigations/`. - Prefer multiple focused investigators over one broad pass. - - On most non-trivial repositories, start with 3-5 focused slices. 
+ - Cap the first-wave fan-out by repository size: + - small: `1-2` investigators + - medium: `2-3` investigators + - large: `3-5` investigators - Split by theme, not by random directories. - - Run at least one follow-up pass for gaps, conflicts, and cross-cutting relationships. + - Keep theme coverage stable even when fan-out is capped. Merge secondary slices instead of dropping them. + - Do not inspect excluded dependency, generated, cache, or VCS directories during investigation or follow-up. + - When `sink=file`, the investigator should first assemble the complete markdown report (ending with `` as the last line), then try to persist it to `output_path` with `Write`. Treat `output_path` as the canonical artifact for that topic. Do not rely on `Bash` as the primary persistence path. + - Cap each investigator brief to ≤5 questions and ≤15 specific files or symbols. If a thematic slice exceeds this budget, split it before launch rather than relying on overflow recovery. + - The investigator must also attempt a best-effort sidecar `Write` of the same markdown to `.sidecar.md` after the primary write attempt and before returning. This recovery lane survives tool-framework transport failures. It must never replace a successful primary write, and a failed sidecar write must never block the run. + - Treat each file-sink investigation result as one of four states: + - `persisted`: the report was written to `output_path`, returns `topic`, `output_path`, and `sidecar_path`, and the file contains the `` sentinel + - `write_failed_fallback_ready`: the report could not be written, but returns `topic`, `output_path`, `sidecar_path`, `failure_type`, `failure_message`, and full `report_markdown` so the main assistant can persist it + - `transport_failure`: inferred when the subagent tool call returns an internal error or a missing tool result. No payload is available in the return channel. 
+ - `context_overflow`: inferred when the report file exists on disk but the `` sentinel is missing. The report was truncated before completion. + - Notification or result text alone is not a persisted report. + - Do not treat a `persisted` response as complete until the main assistant verifies that the file exists, is non-empty, **and** contains the `` sentinel. + - If an investigator returns `write_failed_fallback_ready`, immediately write `report_markdown` to the same `output_path`, then verify the file and sentinel before continuing. + - If the main assistant observes `transport_failure`, first check `output_path` on disk. If `output_path` is missing, empty, or lacks the sentinel, then check `.sidecar.md`. A valid sidecar is only a recovery source: when the sidecar is complete, copy it back to `output_path`, verify the restored canonical file, and continue only after `output_path` exists, is non-empty, and contains the sentinel. Only rerun when neither path yields a restorable report. + - If the main assistant observes `context_overflow`, do not rerun the same brief scope. Split the topic into ≤3 narrower sub-briefs and route them through the follow-up slot. On Claude Code, sub-briefs can be sent concurrently (queue at platform cap of 10). On Codex, prefer sequential sub-briefs to stay within `max_threads` and `max_depth` budget. + - If the main assistant cannot write that fallback report or cannot restore `output_path` from a valid sidecar, pause init and explicitly ask the user for authorization to write the missing report files under `.llmdoc-tmp/investigations/`. Explain which topics are blocked and that init has not finished. + - Do not expand follow-up fan-out while the current required batch still has unpersisted reports. + - If Claude Code returns foreground control after launching background investigators, immediately continue by waiting for results, checking written investigation reports, and advancing toward the coverage gate. 
Do not present init as finished. + - If the current fan-out would cause Claude Code to expose an unfinished init as if it were done, reduce investigator count and continue in a more foreground-stable way instead of preserving maximum parallelism. + - While investigators are still running, report status in progress language such as "init is still running" and "waiting for investigator results". Do not imply completion, and do not invite the user to start a new task. + - Run the coverage gate only after every required report from the current batch has been persisted and verified on disk. + - After the first wave, run a coverage gate that checks for missing topics, unresolved conflicts, unverified user supplements, document-structure risks, and hidden assumptions that should become explicit gaps. + - For small and medium repositories, use the coverage gate to choose `pass`, `pass with gaps`, or `targeted follow-up required`. + - For large repositories, always run the first coverage gate before follow-up. Use it to prepare one targeted follow-up brief by default, then rerun the same gate after that pass. + - Scope every follow-up pass to `missing_topics`, `conflicts`, `user_supplements_to_verify`, and `doc_structure_risks`. + - Follow-up must only fill gaps. Do not re-run the whole repo or reopen already settled themes. + - Choose follow-up defaults by repository size: + - small: conditional, at most `0-1` investigators + - medium: conditional, at most `1-2` investigators + - large: after the first coverage gate prepares the brief, run one targeted follow-up pass by default, then let the rerun gate decide whether to continue; at most `1-3` investigators per follow-up pass - Treat investigation output as scratch material, not stable project memory. -4. Generate the initial stable docs. +5. Run the required post-investigation confirmation. + - Show the concept list that is about to enter the record and influence stable docs. 
+ - This is one of the only valid points where init may pause and wait for user input. Make it explicit that init is paused for confirmation and has not finished yet. + - Offer only `Generate docs now` or `I want to add: terms | emphasis | conventions`. + - If the user adds information, accept only that scoped supplement, route it through the same targeted follow-up and coverage gate, and then repeat the confirmation step. + - Keep implementation facts evidence-first. User input may refine positioning, terminology, and structure, but it should not override repository evidence about behavior or ownership. + - Keep unverified or conflicting claims out of stable docs until evidence is strong enough. + +6. Generate the initial stable docs. - Create `llmdoc/index.md` as the global doc map. - Create `llmdoc/startup.md`. - Create a small set of MUST docs. - Create `llmdoc/overview/project-overview.md`. + - In the first stable pass, keep the core architecture and reference set size-aware: usually `2-3` deep docs on small and medium repositories, and `3-5` on large repositories. - Create focused architecture and reference docs from the strongest investigation slices first. -5. Synchronize `llmdoc/index.md`. +7. Synchronize `llmdoc/index.md`. - Index stable docs. - Keep `memory/reflections/` and `memory/decisions/` separate from stable docs. - Do not treat `.llmdoc-tmp/` as part of llmdoc. -6. Summarize what was created and where the startup docs live. +8. Remove `.llmdoc-tmp/`. + - Delete the temporary investigation artifacts after the stable docs and index are complete. + - Do not leave `.llmdoc-tmp/` behind after a successful init run. + +9. Summarize what was created and where the startup docs live. If the repository already contains `llmdoc/`, read `llmdoc/index.md`, `llmdoc/startup.md`, and the listed MUST docs before making broader changes. 
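Reviewer note on the `skills/llmdoc-init/SKILL.md` changes above: the size thresholds, first-wave caps, follow-up caps, and brief budget can be read as one small lookup table. The sketch below is a minimal illustration of that reading. `FanOutPlan`, `classify_repo`, `plan_for`, and `within_brief_budget` are hypothetical names, and the tuples simply mirror the ranges as written in the skill text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FanOutPlan:
    size: str
    first_wave: tuple[int, int]   # (min, max) investigators in the first wave
    follow_up: tuple[int, int]    # range as written, per follow-up pass
    follow_up_by_default: bool    # large repos run one targeted pass by default


def classify_repo(loc: int) -> str:
    """Classify by first-party LOC, counted after directory exclusions."""
    if loc < 0:
        raise ValueError("loc must be non-negative")
    if loc <= 1000:
        return "small"
    if loc <= 5000:
        return "medium"
    return "large"


_PLANS = {
    "small": FanOutPlan("small", (1, 2), (0, 1), False),
    "medium": FanOutPlan("medium", (2, 3), (1, 2), False),
    "large": FanOutPlan("large", (3, 5), (1, 3), True),
}


def plan_for(loc: int) -> FanOutPlan:
    """Return the fan-out plan for a repository of the given LOC estimate."""
    return _PLANS[classify_repo(loc)]


def within_brief_budget(questions: list[str], targets: list[str]) -> bool:
    """Check the per-brief cap: at most 5 questions and 15 files or symbols."""
    return len(questions) <= 5 and len(targets) <= 15
```

Keeping the caps in one table makes it harder for the command, the skill, and the agent prompts to drift apart when a threshold changes.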
diff --git a/skills/llmdoc-update/SKILL.md b/skills/llmdoc-update/SKILL.md index 6a4eaa7..25ddad6 100644 --- a/skills/llmdoc-update/SKILL.md +++ b/skills/llmdoc-update/SKILL.md @@ -30,7 +30,18 @@ Then execute this workflow: 2. Investigate the impacted concepts. - Use short, evidence-first exploration. + - Recreate `.llmdoc-tmp/investigations/` on demand before any file-sink investigation or main-assistant fallback write. Do not assume init left the directory behind. - Persist temporary investigation notes under `.llmdoc-tmp/investigations/` only when they help the current update. + - When using `sink=file`, `topic` and `output_path` are required. Assign a stable `topic` label and a unique `output_path` under `.llmdoc-tmp/investigations/` before launching the investigation. + - File-sink update investigations use the same persistence contract as init: write the full report to `output_path` first, append `` as the last line, then attempt a best-effort sidecar write to `.sidecar.md`. + - Treat `output_path` as the canonical artifact for the update investigation. The sidecar is recovery-only and must never replace the primary artifact. + - Treat each file-sink investigation result as one of the same four states used by init: `persisted`, `write_failed_fallback_ready`, `transport_failure`, or `context_overflow`. + - Do not rely on a `persisted` response until `output_path` exists, is non-empty, and contains the `` sentinel. + - A sidecar-only write is not `persisted`. + - If an investigation returns `write_failed_fallback_ready`, immediately write `report_markdown` to the same `output_path`, then verify the file and sentinel before using it. + - If the investigation transport fails, first check `output_path` on disk. If `output_path` is missing, empty, or lacks the sentinel, then check `.sidecar.md`. 
A valid sidecar is only a recovery source: when the sidecar is complete, copy it back to `output_path`, verify the restored canonical file, and continue only after `output_path` exists, is non-empty, and contains the sentinel. Only rerun when neither path yields a restorable report. + - If the investigation hits `context_overflow`, do not rerun the same brief scope. Split the topic into narrower follow-up briefs instead. + - If the main assistant cannot create `.llmdoc-tmp/investigations/`, cannot write the fallback report, or cannot restore `output_path` from a valid sidecar, pause update and explicitly ask the user for authorization to write the blocked scratch files under `.llmdoc-tmp/investigations/`. Explain which topics are blocked and that update has not finished. 3. Reflect before editing stable docs. - Write a task-specific reflection under `llmdoc/memory/reflections/`.
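Reviewer note on the transport-failure rule shared by init and update: the "check `output_path`, then the sidecar, then rerun" order might look roughly like the sketch below. It is an illustration under assumptions, not the repository's implementation. The function names and the `restored_from_sidecar` and `rerun` labels are hypothetical, and both the sidecar path and the sentinel are parameters because their exact values are not reproduced here.

```python
import shutil
from pathlib import Path
from typing import Optional


def is_complete(path: Path, sentinel: str) -> bool:
    """True when the file exists, is non-empty, and contains the sentinel."""
    if not sentinel:
        raise ValueError("sentinel must be a non-empty string")
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8")
    return bool(text.strip()) and sentinel in text


def recover_after_transport_failure(
    output_path: Path, sidecar_path: Optional[Path], sentinel: str
) -> str:
    """Decide what to do after a lost tool return, checking the disk first."""
    # 1. The canonical artifact may already be on disk despite the lost return.
    if is_complete(output_path, sentinel):
        return "persisted"
    # 2. A complete sidecar is only a copy source for the canonical file.
    if sidecar_path is not None and is_complete(sidecar_path, sentinel):
        output_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(sidecar_path, output_path)
        # Verify the restored canonical file before continuing.
        if is_complete(output_path, sentinel):
            return "restored_from_sidecar"
    # 3. Neither path yields a restorable report.
    return "rerun"
```

The sidecar is never returned as an end state on its own: the function reports success only after `output_path` itself passes the same check.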