TokenRollAI · itMrBoy · Apr 19, 2026
diff --git a/.agents/plugins/marketplace.json b/.agents/plugins/marketplace.json
@@ -1,5 +1,5 @@
 {
-  "name": "llmdoc-local",
+  "name": "llmdoc-cc-plugin",
   "interface": {
     "displayName": "llmdoc Local Plugins"
   },

diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -4,7 +4,9 @@
     "name": "TokenRoll",
     "email": "shuaiqijianaho@qq.com"
   },
-  "description": "Marketplace for the minimal llmdoc Claude Code workflow",
+  "metadata": {
+    "description": "Marketplace for the minimal llmdoc Claude Code workflow"
+  },
   "plugins": [
     {
       "name": "llmdoc",

diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "llmdoc",
   "description": "llmdoc Claude Code plugin with a minimal workflow: init, update, and use",
-  "version": "2.0.0",
+  "version": "3.0.0",
   "author": {
     "name": "DJJ & Danniel"
   }

diff --git a/.claude/settings.local.json b/.claude/settings.local.json
@@ -0,0 +1 @@
+{}
diff --git a/.codex/agents/llmdoc-investigator.toml b/.codex/agents/llmdoc-investigator.toml
@@ -17,7 +17,14 @@ Then follow this protocol:
 - Investigate source code to fill gaps left by the docs.
 - Prefer file-level and symbol-level references.
 - Only add line numbers when needed to prove subtle or disputed behavior.
-- If asked to write findings to disk, write temporary scratch reports under .llmdoc-tmp/investigations/.
+- Brief budget: for depth=deep, limit each brief to ≤5 questions and ≤15 specific files or symbols. If the scope exceeds this, cover only the highest-priority questions and report the rest as gaps for follow-up. Do not attempt a single pass that would exhaust the context window.
+- For file-sink runs, require TOPIC and OUTPUT_PATH from the caller. Treat OUTPUT_PATH as the canonical persisted artifact for the topic.
+- If asked to write findings to disk, first draft the full markdown report and append <!-- llmdoc:eor --> as the very last line. This sentinel lets the main agent detect truncation: a file without the sentinel is treated as context_overflow, not persisted.
+- First attempt to Write the full markdown report to <OUTPUT_PATH>. Only after that primary write attempt, try a best-effort sidecar Write of the same markdown to <OUTPUT_PATH>.sidecar.md. The sidecar is a recovery lane for cases where the tool-framework transport loses the return payload; it must never replace the primary artifact or block the run if it fails.
+- When the primary write succeeds, return STATUS: persisted with TOPIC, OUTPUT_PATH, and SIDECAR_PATH.
+- When the primary write fails, return STATUS: write_failed_fallback_ready with TOPIC, OUTPUT_PATH, SIDECAR_PATH, FAILURE_TYPE, FAILURE_MESSAGE, and the full REPORT_MARKDOWN inside a fenced markdown block so the main agent can persist it.
+- Report SIDECAR_PATH: none only when the sidecar write also failed; otherwise name the sidecar path explicitly.
+- Do not claim persistence unless the primary write to OUTPUT_PATH actually succeeded. A sidecar-only write is not persisted.
 - Do not silently redesign the system. Return evidence, findings, gaps, and recommended next reads.
 """
 nickname_candidates = ["Scout", "Trace", "Atlas"]
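
The write protocol added to the investigator prompt above can be illustrated with a minimal Python sketch. It is only a stand-in: the real agent uses its `Write` tool, not Python file I/O. `persist_report` and `try_write` are hypothetical helper names, creating the parent directory is an assumption, and the fixed `unknown_write_failure` value is a simplification of the failure types the protocol allows.

```python
from pathlib import Path
from typing import Optional

SENTINEL = "<!-- llmdoc:eor -->"


def try_write(path: Path, text: str) -> Optional[str]:
    """Write text to path; return None on success or a short error summary."""
    try:
        path.parent.mkdir(parents=True, exist_ok=True)  # assumption: create missing dirs
        path.write_text(text, encoding="utf-8")
        return None
    except OSError as exc:
        return f"{type(exc).__name__}: {exc}"


def persist_report(topic: str, output_path: str, report: str) -> dict:
    """Primary write first, then a best-effort sidecar write of the same markdown."""
    # The sentinel is appended as the very last line of the report.
    body = report.rstrip("\n") + "\n" + SENTINEL + "\n"
    primary = Path(output_path)
    sidecar = Path(output_path + ".sidecar.md")

    primary_error = try_write(primary, body)
    sidecar_error = try_write(sidecar, body)  # a sidecar failure never blocks the run

    result = {
        # A sidecar-only write is never reported as persisted.
        "STATUS": "persisted" if primary_error is None else "write_failed_fallback_ready",
        "TOPIC": topic,
        "OUTPUT_PATH": str(primary),
        "SIDECAR_PATH": str(sidecar) if sidecar_error is None else "none",
    }
    if primary_error is not None:
        result["FAILURE_TYPE"] = "unknown_write_failure"
        result["FAILURE_MESSAGE"] = primary_error
        result["REPORT_MARKDOWN"] = body  # lets the main agent persist the report itself
    return result
```

The status depends only on the primary write, which matches the rule that a sidecar-only write is not `persisted`.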
diff --git a/AGENTS.example.md b/AGENTS.example.md
@@ -1,5 +1,7 @@
 # Load The `llmdoc` Skill First
 
+Always answer in Simplified Chinese
+
 Before broad source-code exploration, planning, or documentation work, load the `llmdoc` skill.
 
 The main assistant should align with the user before non-trivial plans or edits.

diff --git a/README.md b/README.md
@@ -54,9 +54,38 @@ The command:
 
 1. Inspects the repo
 2. Creates the llmdoc directory structure
-3. Runs multi-investigator temporary scratch work with explicit coverage checks, then a follow-up gap-check pass
-4. Generates initial MUST, overview, architecture, and reference docs
-5. Synchronizes `llmdoc/index.md`
+3. Runs a short pre-investigation user calibration; pressing Enter with no extra reply continues with repository evidence
+4. Runs size-aware, theme-driven multi-investigator temporary scratch work with explicit persistence checks, subagent-to-main-agent write fallback, and targeted follow-up passes instead of rerunning the whole repo
+5. Shows a required post-investigation concept list so the user can generate docs now or add terms, emphasis, or conventions
+6. Generates initial MUST, overview, architecture, and reference docs
+7. Synchronizes `llmdoc/index.md`
+8. Removes `.llmdoc-tmp/` after the stable docs are complete
+
+Repository size is estimated from first-party source files and tests after excluding dependency, generated, cache, and VCS directories such as `node_modules/`, `dist/`, `build/`, `.next/`, `coverage/`, `vendor/`, and `.git/`. Lockfiles, generated artifacts, vendored code, and cache directories do not count toward the thresholds. The current size bands are: small `<= 1000 LOC`, medium `1001-5000 LOC`, large `> 5000 LOC`.
+
+The first stable pass stays depth-first, but the core-doc target is now size-aware: small and medium repositories usually start with `2-3` deep architecture or reference docs, while large repositories can start with `3-5` when they have distinct invariant clusters worth documenting separately.
+
+If an investigator can return a report but cannot write its scratch file, init now falls back to having the coordinating assistant persist the returned markdown to the same `.llmdoc-tmp/investigations/` path. Investigators also perform a best-effort sidecar write to `<output_path>.sidecar.md`, so that even if the tool-framework transport drops the return payload (for example `Tool result missing due to internal error`), the coordinating assistant can recover the report from disk without re-materializing the markdown through the model. A sidecar is only a recovery source: if `output_path` is missing but the sidecar is complete, the coordinating assistant copies the sidecar back to `output_path` and verifies the restored canonical file before continuing. The coordinating assistant asks the user for write authorization only if its own fallback write or restore copy also fails.
+
+#### Investigator failure handling
+
+Init tracks four investigator result states and applies a different recovery path for each:
+
+| State | Trigger | Recovery |
+|-------|---------|----------|
+| `persisted` | Report written; `<!-- llmdoc:eor -->` sentinel present | Verify file and continue |
+| `write_failed_fallback_ready` | Write failed; full markdown in return payload | Coordinating assistant writes to same path, verifies sentinel |
+| `transport_failure` | Tool call returns internal error; no payload | Check `output_path`, then sidecar; if only the sidecar is complete, copy it back to `output_path`, verify, and rerun only if neither path is restorable |
+| `context_overflow` | File present but sentinel missing (truncated) | Split brief into ≤3 narrower sub-briefs; route via follow-up slot |
+
+Each investigator report ends with the sentinel `<!-- llmdoc:eor -->`. A file without the sentinel is treated as truncated (`context_overflow`), not complete. The sentinel lives only in `.llmdoc-tmp/investigations/` scratch files and is removed with the directory at the end of init.
+
+Platform concurrency limits shape how recovery fan-out works:
+
+- **Claude Code**: hard cap of 10 concurrent subagents; excess requests are queued automatically. Recovery sub-briefs can be sent concurrently within this cap.
+- **Codex**: bounded by `max_threads` and `max_depth` in `.codex/config.toml`. Recovery sub-briefs must stay within remaining budget; prefer sequential when budget is tight.
+
+Context overflow recovery does not rerun the same brief scope, because that would overflow again. Instead, the topic is split and processed through the existing follow-up slot.
 
 ### `/llmdoc:update`
 
@@ -69,11 +98,13 @@ The command:
 
 1. Rebuilds context from llmdoc and the current working tree
 2. Proactively reads relevant guides and reflections
-3. Investigates impacted concepts
+3. Recreates `.llmdoc-tmp/investigations/` on demand and investigates impacted concepts with the same persistence checks and fallback recovery used by init when file-sink scratch reports are needed
 4. Writes a reflection under `llmdoc/memory/reflections/`
 5. Updates stable docs
 6. Synchronizes `llmdoc/index.md`
 
+If `/llmdoc:update` needs file-sink scratch work after a previous init cleaned up `.llmdoc-tmp/`, it recreates `.llmdoc-tmp/investigations/` before launching the investigation or writing fallback artifacts. If the coordinating assistant cannot create that directory, cannot write a fallback report, or cannot restore `output_path` from a valid sidecar, it must pause and ask the user for write authorization instead of silently stalling.
+
 In normal use, the main assistant should proactively ask whether to run `/llmdoc:update` when the task produced durable knowledge or a useful reflection.
 
 ## llmdoc Layout
@@ -130,6 +161,15 @@ Then install this plugin marketplace and plugin:
 /plugin install llmdoc@llmdoc-cc-plugin
 ```
 
+Compatibility note: if Claude Code still has an older cached marketplace entry such as `tokenroll-cc-plugin`, reset the marketplace and reload plugins with:
+
+```bash
+/plugin marketplace remove tokenroll-cc-plugin
+/plugin marketplace add https://github.com/TokenRollAI/llmdoc
+/plugin install llmdoc@llmdoc-cc-plugin
+/reload-plugins
+```
+
 After installation:
 
 1. Copy [`CLAUDE.example.md`](CLAUDE.example.md) into `~/.claude/CLAUDE.md`.
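
The four result states in the README table above amount to a small decision procedure on the coordinating side. The Python sketch below shows one way to read it. `next_action` and `is_complete` are hypothetical names, the returned action strings are illustrative labels and not identifiers from the plugin, and the real coordinator is an LLM agent following prompt instructions, not this code.

```python
import shutil
from pathlib import Path
from typing import Optional

SENTINEL = "<!-- llmdoc:eor -->"


def is_complete(path: Path) -> bool:
    """True only when the file exists and its last non-blank line is the sentinel."""
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return bool(lines) and lines[-1] == SENTINEL


def next_action(output_path: str, returned_markdown: Optional[str] = None) -> str:
    """Pick the recovery step for one investigator result."""
    primary = Path(output_path)
    sidecar = Path(output_path + ".sidecar.md")
    try:
        if is_complete(primary):
            return "continue"                  # persisted
        if primary.is_file():
            return "split_brief"               # context_overflow: file present, no sentinel
        if returned_markdown is not None:      # write_failed_fallback_ready
            primary.parent.mkdir(parents=True, exist_ok=True)
            primary.write_text(returned_markdown, encoding="utf-8")
        elif is_complete(sidecar):             # transport_failure with a usable sidecar
            shutil.copyfile(sidecar, primary)
        else:
            return "rerun"                     # neither path is restorable
    except OSError:
        return "ask_user_for_write_authorization"
    # Verify the restored canonical file before continuing.
    return "continue" if is_complete(primary) else "split_brief"
```

The sidecar is consulted only when the canonical file is missing, and a restored file is verified again before it counts as complete.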

diff --git a/README.zh-CN.md b/README.zh-CN.md
@@ -54,9 +54,38 @@
 
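 1. Inspects the repository structure
 2. Creates the llmdoc directory skeleton
-3. Launches multiple investigators to produce temporary investigation drafts, explicitly checks coverage, and runs one follow-up gap-filling pass
-4. Generates initial MUST, overview, architecture, and reference docs
-5. Synchronizes `llmdoc/index.md`
+3. Runs a short pre-investigation user confirmation; if there is nothing to add, pressing Enter continues and proceeds from repository evidence
+4. Drives investigator work by project size and theme, explicitly checks that investigation reports were written to disk, has the main assistant write the report when a subagent write fails, and runs only targeted follow-ups instead of rerunning the whole repository
+5. Shows a required post-investigation concept list so the user can generate docs directly or add terms, emphasis, or conventions
+6. Generates initial MUST, overview, architecture, and reference docs
+7. Synchronizes `llmdoc/index.md`
+8. Removes `.llmdoc-tmp/` after the stable docs are complete
+
+Project size is estimated from first-party source and test files after excluding dependency, generated, cache, and VCS directories. At minimum, `node_modules/`, `dist/`, `build/`, `.next/`, `coverage/`, `vendor/`, and `.git/` are ignored; lockfiles, generated artifacts, vendored code, and cache directories do not count toward the thresholds either. The current bands are: small `<= 1000 LOC`, medium `1001-5000 LOC`, large `> 5000 LOC`.
+
+The first round of stable docs still follows "depth before breadth", but the number of core docs is now relaxed by repository size: small and medium repositories usually start with `2-3` deep architecture / reference docs, while large repositories can start with `3-5` if they really contain several invariant clusters that deserve separate documents.
+
+If an investigator can return its findings but cannot write the draft to `.llmdoc-tmp/investigations/`, init now first has the main assistant write the returned markdown to the same path. Before returning, the investigator also makes a best-effort write of the same markdown to `<output_path>.sidecar.md`, so that even if the tool-framework transport layer drops the entire return (for example `Tool result missing due to internal error`), the main assistant can recover the report directly from disk without re-expanding the markdown through the model. The sidecar is only a recovery source: if `output_path` is missing but the sidecar is complete, the main assistant first copies the sidecar back to `output_path`, verifies the restored canonical file, and then continues. The main assistant asks the user for authorization to continue only if its fallback write or this restore copy also fails.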
+
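+#### Investigator failure handling
+
+Init tracks four investigator result states, each with its own recovery path:
+
+| State | Trigger | Recovery path |
+|------|---------|---------|
+| `persisted` | Report written to disk with the `<!-- llmdoc:eor -->` sentinel | Verify the file and continue |
+| `write_failed_fallback_ready` | Write failed; full markdown returned | Main assistant writes to the same path and verifies the sentinel |
+| `transport_failure` | Tool call returns an internal error with no payload | Check `output_path` first, then the sidecar; if only the sidecar is complete, copy it back to `output_path` and verify; rerun only when neither path is recoverable |
+| `context_overflow` | File exists but the sentinel is missing (truncated) | Split the brief into ≤3 sub-briefs and route them through the follow-up slot |
+
+Every investigator report writes the sentinel `<!-- llmdoc:eor -->` as its last line. A file without the sentinel is treated as truncated (`context_overflow`) and is not accepted as a completed report. The sentinel exists only in the temporary files under `.llmdoc-tmp/investigations/` and is removed together with the directory when init finishes.
+
+Platform concurrency limits determine how recovery fans out:
+
+- **Claude Code**: concurrency cap of 10, with excess requests queued automatically. Sub-briefs for overflow recovery can be dispatched concurrently; any beyond the cap wait in the queue.
+- **Codex**: constrained by `max_threads` and `max_depth` in `.codex/config.toml`. Recovery sub-briefs must stay within the remaining budget; prefer sequential execution when threads or depth are tight.
+
+Context overflow recovery does not rerun with the same scope, because that would overflow again. The correct approach is to split the topic and process it sequentially through the existing follow-up slot.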
 
 ### `/llmdoc:update`
 
@@ -69,11 +98,13 @@
 
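 1. Rebuilds context from llmdoc and the current working tree
 2. Proactively reads relevant guides and reflections
-3. Investigates impacted concepts
+3. Recreates `.llmdoc-tmp/investigations/` on demand when file-sink temporary reports are needed, and investigates impacted concepts with the same persistence checks and fallback recovery as init
 4. Writes a reflection under `llmdoc/memory/reflections/`
 5. Updates stable docs
 6. Synchronizes `llmdoc/index.md`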
 
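+If `/llmdoc:update` runs after an init that already cleaned up `.llmdoc-tmp/`, and this update needs file-sink temporary investigation reports, it first recreates `.llmdoc-tmp/investigations/` and then launches the investigation or writes fallback files. If the main assistant cannot create the directory, cannot write a fallback report, or cannot restore `output_path` from a valid sidecar, it must pause and ask the user for write authorization instead of stalling silently.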
+
 In everyday use, if the task produced knowledge or reflections worth keeping long term, the main assistant should proactively ask whether to run `/llmdoc:update` now.
 
 ## llmdoc Layout
@@ -130,6 +161,15 @@ llmdoc/
 /plugin install llmdoc@llmdoc-cc-plugin
 ```
 
+Compatibility note: if Claude Code still has an old marketplace entry cached locally, such as `tokenroll-cc-plugin`, reset the marketplace and reload plugins with the commands below:
+
+```bash
+/plugin marketplace remove tokenroll-cc-plugin
+/plugin marketplace add https://github.com/TokenRollAI/llmdoc
+/plugin install llmdoc@llmdoc-cc-plugin
+/reload-plugins
+```
+
 After installation:
 
 1. Copy [`CLAUDE.example.md`](CLAUDE.example.md) into `~/.claude/CLAUDE.md`
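
The size bands and core-doc targets stated in both READMEs can be sketched in Python as follows. The excluded directory names and thresholds come from the README text. The set of source suffixes and the choice to count non-blank lines are assumptions, since the README does not define how LOC is measured, and all three function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Tuple

# Directory names listed in the README as excluded from the size estimate.
EXCLUDED_DIRS = {"node_modules", "dist", "build", ".next", "coverage", "vendor", ".git"}
# Assumption: which suffixes count as first-party source is not specified.
SOURCE_SUFFIXES = {".py", ".ts", ".js", ".go", ".rs"}


def count_loc(root: str) -> int:
    """Count non-blank lines in source files, skipping excluded directories."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if Path(name).suffix in SOURCE_SUFFIXES:
                full = os.path.join(dirpath, name)
                with open(full, encoding="utf-8", errors="ignore") as fh:
                    total += sum(1 for line in fh if line.strip())
    return total


def size_band(loc: int) -> str:
    """Map a LOC estimate to the README's small / medium / large bands."""
    if loc <= 1000:
        return "small"
    if loc <= 5000:
        return "medium"
    return "large"


def core_doc_target(band: str) -> Tuple[int, int]:
    """Return the (min, max) number of deep core docs to start with."""
    return (3, 5) if band == "large" else (2, 3)
```

A repository at exactly 1000 LOC is still small and one at exactly 5000 LOC is still medium, matching the inclusive upper bounds in the README.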

diff --git a/agents/investigator.md b/agents/investigator.md
@@ -25,14 +25,17 @@ Key practices:
 - **Use line numbers sparingly:** Add line numbers only when they are required to prove a disputed or non-obvious behavior.
 - **Objective:** Report facts and evidence, not design opinions.
 - **Split by sink:** `sink=chat` is for direct answers. `sink=file` is for temporary scratch artifacts, usually under `.llmdoc-tmp/investigations/`.
+- **Brief budget:** For `sink=file` with `depth=deep`, limit each brief to ≤5 questions and ≤15 specific files or symbols. If the caller's scope exceeds this, investigate only the highest-priority questions in this pass and report the remainder as gaps for follow-up. Do not attempt a single pass that would exhaust the context window.
+- **Persist with `Write`:** For `sink=file`, assemble the full markdown report first, then try to persist it with `Write`. Do not rely on `Bash` as the primary write path.
 - **No long code pastes:** The reader can open source files directly.
 
 <InputFormat>
 - **Objective**: The investigation goal.
 - **Questions**: The concrete questions to answer.
 - **Depth**: `quick` or `deep`.
 - **Sink**: `chat` or `file`.
-- **Output Path**: Required when `sink=file` unless the caller explicitly asks you to choose a path.
+- **Topic**: Required when `sink=file`. Use a stable, human-readable label for the scratch artifact, not an ephemeral sentence.
+- **Output Path**: Required when `sink=file`.
 </InputFormat>
 
 <OutputFormat_Chat>
@@ -65,7 +68,33 @@ Key practices:
   </OutputFormat_Chat>
 
 <OutputFormat_File>
-Write a markdown file using the same section layout as `<OutputFormat_Chat>`, then return the absolute file path.
+When `sink=file`:
+
+1. Require both `Topic` and `Output Path` from the caller.
+2. Draft the full markdown report using the same section layout as `<OutputFormat_Chat>`. Append `<!-- llmdoc:eor -->` as the very last line of the markdown. This sentinel allows the coordinating agent to detect truncation: a report file that exists but lacks this sentinel is treated as `context_overflow`, not `persisted`.
+3. Try to write that markdown to `Output Path` with `Write`. Treat `Output Path` as the canonical persisted artifact for the topic.
+4. After the primary write attempt, always attempt a best-effort sidecar write of the same markdown to `<Output Path>.sidecar.md` using `Write`. This is a recovery lane for the case where the tool-framework transport loses the return payload. Do not fail the run if the sidecar write fails; continue and report `SIDECAR_PATH: none`.
+5. If the primary write to `Output Path` succeeds, return:
+
+STATUS: persisted
+TOPIC: <topic>
+OUTPUT_PATH: <output path>
+SIDECAR_PATH: <output path>.sidecar.md | none
+
+6. If the primary write to `Output Path` fails, return:
+
+STATUS: write_failed_fallback_ready
+TOPIC: <topic>
+OUTPUT_PATH: <output path>
+SIDECAR_PATH: <output path>.sidecar.md | none
+FAILURE_TYPE: write_permission_denied | tool_refused | shell_write_failed | unknown_write_failure
+FAILURE_MESSAGE: <brief error summary>
+REPORT_MARKDOWN:
+```markdown
+<full markdown report>
+```
+
+Do not claim persistence unless the primary write to `Output Path` actually succeeded. A sidecar-only write is not `persisted`. Report `SIDECAR_PATH: none` only when the sidecar write also failed.
 </OutputFormat_File>
 
 Always ensure the investigation is specific, factual, and easy for another agent to reuse.
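
The return payload defined in `<OutputFormat_File>` above is line-oriented, so the caller can parse it mechanically. The Python sketch below shows one possible parser. `parse_investigator_reply` is a hypothetical helper, and it assumes the reply follows the `KEY: value` layout shown above with `REPORT_MARKDOWN:` as the last field.

```python
import re

FENCE = "`" * 3  # built programmatically to keep this example's own fence intact


def parse_investigator_reply(reply: str) -> dict:
    """Split a file-sink reply into header fields plus the optional report body."""
    fields = {}
    head, found, tail = reply.partition("REPORT_MARKDOWN:")
    for line in head.splitlines():
        match = re.match(r"^([A-Z_]+):\s*(.*)$", line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    if found:
        # Greedy match up to the last closing fence, so fenced blocks that
        # appear inside the report itself are kept as part of the body.
        block = re.search(FENCE + r"markdown\n(.*)\n" + FENCE, tail, re.DOTALL)
        if block:
            fields["REPORT_MARKDOWN"] = block.group(1)
    return fields
```

A caller would treat the result as `persisted` only when `STATUS` says so, and would otherwise write `REPORT_MARKDOWN` to `OUTPUT_PATH` itself.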
diff --git a/agents/recorder.md b/agents/recorder.md
@@ -17,7 +17,7 @@ When invoked:
 3. Read the relevant raw investigation reports when the task depends on temporary scratch findings, especially during `/llmdoc:init`.
 4. Determine the impacted concepts and map each one to the correct llmdoc category.
 5. Keep `llmdoc/index.md` and `llmdoc/startup.md` distinct in purpose and content.
-6. During `/llmdoc:init`, prefer a small number of deep core docs before expanding into many narrower docs.
+6. During `/llmdoc:init`, prefer a small, size-aware number of deep core docs before expanding into many narrower docs.
 7. Update the touched documents and synchronize `llmdoc/index.md`.
 8. Report every file you created, updated, or deleted.
 
@@ -42,7 +42,7 @@ Split rules:
 - One concept per document.
 - One workflow per guide.
 - One ownership boundary or invariant cluster per architecture doc.
-- During init, depth beats premature fragmentation. Prefer 2-3 strong core docs over 10+ shallow ones.
+- During init, depth beats premature fragmentation. Prefer `2-3` strong core docs on small and medium repositories, and `3-5` on large repositories, over `10+` shallow ones.
 - If a document grows large only because it is preserving one coherent execution model, invariant set, or contract cluster, keep it intact until a clean split is obvious.
 - If a document exceeds roughly 120 lines, covers more than one workflow, or mixes stable facts with transient notes, split it when doing so improves retrieval without discarding essential reasoning flow.
 - Do not promote content into `/must/` unless it is stable, short, and useful on nearly every task.