docs: v0.8.53 tool-surface-diet design + north-star direction (#2681, #2680) #2686
Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover (no catalog code in this cycle). Grounded in a verified inventory of the actual tool registry.

- docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states (active / deferred / hidden-compatibility / deprecated / removed) modeled as const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec field), so registration stays untouched and old transcripts always replay. Includes the deprecation manifest (exec_wait/exec_interact/tts → hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are already non-visible dead code → cleanup + guardrail), per-mode/per-provider active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated.
- docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 + symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor invalidation; an explainable tool contract returning reasons[]; and a real CodeWhale query eval set. Complements grep_files/file_search, never replaces.
- docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation syntax (the token IS the skill name), namespaced resolution, ambiguity-suggests-not-guesses, visible activation line, and a smallest-viable slice.
- docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered context-memory stack, tool repair/autoload, the evaluation loop, and the command-surface taxonomy (/memory small · /context dashboard · /rules · /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not committed 0.8.53 work; also records the deferred-not-done diet items.

Targets codex/v0.8.53.
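To make the "const name-sets + an alias table" mechanism concrete, here is a minimal sketch in Rust. It is not the contents of `tool_catalog.rs`: the identifiers `LifecycleState`, `HIDDEN_COMPAT`, `DEPRECATED`, `DEFERRED`, `REMOVED`, `ALIASES`, `resolve`, and `lifecycle_of` are hypothetical, and the `checklist_add`/`checklist_update`/`checklist_list` names are assumed from the "1:1 `checklist_*` counterparts" wording in the PR body. The tool names placed in each set follow the deprecation manifest summarized above.

```rust
/// The five lifecycle states named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleState {
    Active,
    Deferred,
    HiddenCompat,
    Deprecated,
    Removed,
}

// Hypothetical name-sets. A tool's state is decided by set membership,
// so the tool's own registration never has to carry a lifecycle field.
pub const HIDDEN_COMPAT: &[&str] = &["exec_wait", "exec_interact", "tts"];
pub const DEPRECATED: &[&str] = &["todo_add", "todo_update", "todo_list"];
pub const DEFERRED: &[&str] = &[]; // left empty in this sketch
pub const REMOVED: &[&str] = &[]; // left empty in this sketch

// Hypothetical alias table: old name -> canonical name. Old transcripts
// can still replay because the old name keeps resolving to a live tool.
pub const ALIASES: &[(&str, &str)] = &[
    ("exec_wait", "exec_shell_wait"),
    ("exec_interact", "exec_shell_interact"),
    ("todo_add", "checklist_add"),       // assumed 1:1 counterpart
    ("todo_update", "checklist_update"), // assumed 1:1 counterpart
    ("todo_list", "checklist_list"),     // assumed 1:1 counterpart
];

fn in_set(set: &[&str], name: &str) -> bool {
    set.iter().any(|candidate| *candidate == name)
}

/// Map a possibly-legacy tool name to its canonical name.
pub fn resolve(name: &str) -> &str {
    ALIASES
        .iter()
        .find(|(old, _)| *old == name)
        .map(|(_, canonical)| *canonical)
        .unwrap_or(name)
}

/// Look up the lifecycle state; anything not listed counts as active.
pub fn lifecycle_of(name: &str) -> LifecycleState {
    if in_set(REMOVED, name) {
        LifecycleState::Removed
    } else if in_set(DEPRECATED, name) {
        LifecycleState::Deprecated
    } else if in_set(HIDDEN_COMPAT, name) {
        LifecycleState::HiddenCompat
    } else if in_set(DEFERRED, name) {
        LifecycleState::Deferred
    } else {
        LifecycleState::Active
    }
}
```

Under these assumptions, a replayed call to `exec_wait` resolves to `exec_shell_wait` while `lifecycle_of("exec_wait")` reports `HiddenCompat`, so the name can be kept out of the model-visible catalog without breaking old transcripts.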
Code Review
This pull request introduces several design documents outlining the architectural roadmap for CodeWhale v0.9.0+, including local-first semantic code retrieval, inline skill invocation syntax, a tool-surface lifecycle policy, and the overall 'North Star' vision. The review feedback highlights several areas for refinement: correcting a broken section reference and adding SQLite triggers to synchronize the FTS5 index in the codebase search design, expanding the skill invocation scanner to skip inline code backticks to avoid false positives, and correcting factual and mathematical inconsistencies regarding active native tool counts in the lifecycle policy document.
| `<workspace-hash>` is a stable hash of the canonical workspace root, so each checkout/worktree gets its own index and nothing is shared across unrelated projects. Backed by `rusqlite` (existing dep). | ||
| > Migration note (ties to the `/memory doctor` taxonomy in §7): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read). |
The reference to §7 is broken because there is no Section 7 in this document. The /memory doctor subcommand is actually described in Appendix B of this document (and Section 9 of docs/VISION_NORTH_STAR.md). We should update this reference to point to Appendix B.
| > Migration note (ties to the `/memory doctor` taxonomy in §7): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read). | |
| > Migration note (ties to the `/memory doctor` taxonomy in Appendix B): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read). |
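The quoted design text calls `<workspace-hash>` "a stable hash of the canonical workspace root" without naming an algorithm. The sketch below shows one way that could work, assuming FNV-1a (64-bit) over the canonicalized path; the algorithm choice and the names `fnv1a64` and `workspace_hash` are assumptions for illustration, not something the design specifies. A fixed algorithm is used because the hasher behind `std::collections::hash_map::DefaultHasher` is not guaranteed to stay the same across Rust releases.

```rust
use std::io;
use std::path::Path;

/// FNV-1a, 64-bit: a small, fixed, dependency-free hash function.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: hash the canonical form of `root` so that
/// different spellings of the same directory share one index, while
/// separate checkouts or worktrees (different paths) get their own.
pub fn workspace_hash(root: &Path) -> io::Result<String> {
    // Resolves symlinks and relative components; fails if `root` is missing.
    let canonical = std::fs::canonicalize(root)?;
    let digest = fnv1a64(canonical.to_string_lossy().as_bytes());
    Ok(format!("{:016x}", digest))
}
```

With this shape, `workspace_hash(Path::new("."))` and `workspace_hash(Path::new("./."))` yield the same 16-character hex string, which could then name the per-workspace index directory.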
| ); | ||
When using an external content table (content='chunks') in SQLite FTS5, the virtual table does not automatically stay in sync with the underlying content table. To prevent the index from becoming stale or corrupted when chunks are inserted, updated, or deleted, you must define SQLite triggers. It would be highly beneficial to specify these triggers in the design.
| ); | |
| ); | |
| -- Triggers to keep the external content FTS5 index in sync with the chunks table | |
| CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN | |
| INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol); | |
| END; | |
| CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN | |
| INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol) VALUES('delete', old.id, old.text, old.symbol); | |
| END; | |
| CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN | |
| INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol) VALUES('delete', old.id, old.text, old.symbol); | |
| INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol); | |
| END; | |
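On the ranking side of the same codebase-search design, the PR description names an "RRF hybrid" and a tool contract that returns `reasons[]`. A minimal sketch of Reciprocal Rank Fusion with per-hit reasons follows. The `Hit` struct, the `rrf_fuse` function, and the ranker labels are hypothetical; `k = 60` in the usage note is only the value commonly used for RRF, not a parameter the design states.

```rust
use std::collections::HashMap;

/// Hypothetical result shape: a path, its fused score, and the reasons
/// (which ranker placed it at which rank) that explain the score.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub path: String,
    pub score: f64,
    pub reasons: Vec<String>,
}

/// Reciprocal Rank Fusion: each ranker contributes 1 / (k + rank) for
/// every item it returns, with rank counted from 1. Items that several
/// rankers agree on accumulate a higher fused score.
pub fn rrf_fuse(rankings: &[(&str, &[&str])], k: f64) -> Vec<Hit> {
    let mut fused: HashMap<&str, (f64, Vec<String>)> = HashMap::new();
    for (source, ranked) in rankings {
        for (idx, path) in ranked.iter().enumerate() {
            let entry = fused.entry(*path).or_insert((0.0, Vec::new()));
            entry.0 += 1.0 / (k + idx as f64 + 1.0);
            entry.1.push(format!("{} rank {}", source, idx + 1));
        }
    }
    let mut hits: Vec<Hit> = fused
        .into_iter()
        .map(|(path, (score, reasons))| Hit {
            path: path.to_string(),
            score,
            reasons,
        })
        .collect();
    // Highest score first; ties broken by path so the order is deterministic.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score).then_with(|| a.path.cmp(&b.path)));
    hits
}
```

For example, fusing an FTS5/BM25 list `["src/a.rs", "src/b.rs", "src/c.rs"]` with a symbol list `["src/b.rs", "src/d.rs"]` at `k = 60.0` puts `src/b.rs` first, because it is the only path both rankers returned, and its `reasons` record both ranks.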
| resolver. The scanner must skip `$` occurrences inside code fences and inline | ||
| command strings (see Non-goals) so shell `$VAR` references are never treated as | ||
| skill mentions. |
In addition to code fences and inline command strings, the scanner should also explicitly skip $ occurrences inside inline code backticks (e.g., `$VAR`). This will prevent false positives when users or models reference shell variables or other $-prefixed terms in inline code blocks.
| resolver. The scanner must skip `$` occurrences inside code fences and inline | |
| command strings (see Non-goals) so shell `$VAR` references are never treated as | |
| skill mentions. | |
| resolver. The scanner must skip `$` occurrences inside code fences, inline code backticks, and inline | |
| command strings (see Non-goals) so shell `$VAR` references are never treated as | |
| skill mentions. |
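The skipping rule discussed in this thread can be sketched as a small line-oriented scanner. This is an illustration under stated assumptions, not the design's scanner: the function name `find_skill_mentions` is hypothetical, the accepted name characters (ASCII letters, digits, `-`, `_`, and `:` for namespaces) are assumed, and the "inline command strings" case from Non-goals is not handled because its definition is not part of this excerpt.

```rust
/// Collect `$<skill-name>` tokens that appear outside fenced code blocks
/// and outside inline code spans.
pub fn find_skill_mentions(text: &str) -> Vec<String> {
    // Built at run time so this example can itself live inside a fenced block.
    let fence = "`".repeat(3);
    let mut mentions = Vec::new();
    let mut in_fence = false;

    for line in text.lines() {
        // A line starting with three backticks opens or closes a fence.
        if line.trim_start().starts_with(fence.as_str()) {
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }

        let mut in_inline_code = false;
        let mut chars = line.chars().peekable();
        while let Some(c) = chars.next() {
            match c {
                // Each single backtick toggles an inline code span.
                '`' => in_inline_code = !in_inline_code,
                '$' if !in_inline_code => {
                    let mut name = String::new();
                    while let Some(&next) = chars.peek() {
                        // Assumed grammar: a letter first, then letters,
                        // digits, '-', '_' or ':'.
                        let allowed = if name.is_empty() {
                            next.is_ascii_alphabetic()
                        } else {
                            next.is_ascii_alphanumeric() || matches!(next, '-' | '_' | ':')
                        };
                        if !allowed {
                            break;
                        }
                        name.push(next);
                        chars.next();
                    }
                    // Drop separators left dangling by sentence punctuation.
                    let name = name.trim_end_matches(|c: char| c == '-' || c == ':');
                    if !name.is_empty() {
                        mentions.push(name.to_string());
                    }
                }
                _ => {}
            }
        }
    }
    mentions
}
```

A bare `$VAR` in ordinary prose would still be returned as a candidate here; in the design as summarized, such a token would simply fail namespaced resolution rather than activate anything.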
| ## 5. Active-catalog budget (per mode, per provider) | ||
| The active set is the first-turn cost. Current default active set: | ||
| `DEFAULT_ACTIVE_NATIVE_TOOLS` has **25** entries (`tool_catalog.rs:37-64`). |
DEFAULT_ACTIVE_NATIVE_TOOLS actually has 26 entries in crates/tui/src/core/engine/tool_catalog.rs:37-64 (not 25). Let's correct this number to keep the design document factually accurate.
| `DEFAULT_ACTIVE_NATIVE_TOOLS` has **25** entries (`tool_catalog.rs:37-64`). | |
| `DEFAULT_ACTIVE_NATIVE_TOOLS` has **26** entries (`tool_catalog.rs:37-64`). |
| | Provider | First-turn active source | Current count | Target after diet | | ||
| |---|---|---|---| | ||
| | Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 25 | ~22 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) | |
Since DEFAULT_ACTIVE_NATIVE_TOOLS has 26 entries, the current count is 26, and removing exec_wait and exec_interact will bring the target count to 24 (or ~23 if we also account for other changes). Let's update the table row to reflect the correct count.
| | Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 25 | ~22 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) | | |
| | Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 26 | ~23 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) | |
| The default diet removes `exec_wait` and `exec_interact` from the active head | ||
| (they become hidden-compat; their canonical twins `exec_shell_wait` / | ||
| `exec_shell_interact` stay). `tts` and `todo_*` are *already not* in the active | ||
| set, so the active count moves **25 → 23** from the wait/interact removal alone; |
Since the default active set has 26 entries, removing exec_wait and exec_interact (2 entries) moves the active count from 26 → 24 (not 25 → 23). Let's update this text to align with the correct math.
| set, so the active count moves **25 → 23** from the wait/interact removal alone; | |
| set, so the active count moves **26 → 24** from the wait/interact removal alone; |
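The arithmetic in this thread only holds for names that are actually in the active set, which is why `tts` and `todo_*` do not change the count. A small sketch of that check is below. The function name `apply_diet` is hypothetical, and the starting size of 26 is the reviewer's figure (the document under review says 25); neither figure is verified against the tree here.

```rust
/// Return the active set with every name in `moved_out` removed.
/// Names in `moved_out` that are not active leave the count unchanged.
pub fn apply_diet<'a>(active: &[&'a str], moved_out: &[&str]) -> Vec<&'a str> {
    active
        .iter()
        .copied()
        .filter(|name| !moved_out.iter().any(|gone| gone == name))
        .collect()
}
```

Starting from a 26-entry set that contains `exec_wait` and `exec_interact`, passing `["exec_wait", "exec_interact", "tts", "todo_add"]` as `moved_out` leaves 24 entries, since only the first two names were active. A guardrail test of this shape next to the real constant would catch the count in the doc drifting from the code again.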
Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover. No tool-catalog code changes in this cycle; these docs define the policy and contracts so the code PRs can follow with sign-off. Grounded in a verified inventory of the actual tool registry (every cited `file:line` was checked against the tree).

## What the inventory changed about the brief

- The legacy subagent names are `#[allow(dead_code)]`, never registered. Only `agent_open`/`agent_eval`/`agent_close`/`tool_agent` are wired (`registry.rs:1017-1029`). So #2683 (Deprecate legacy subagent tool names and keep only agent_open/eval/close canonical) is cleanup + guardrails, not "hide tools."
- `exec_wait` ≡ `exec_shell_wait`, `exec_interact` ≡ `exec_shell_interact` (same struct, two names, both active), plus `speech`/`tts`. Highest-value weaker-model win.
- `todo_*` is already deferred, so #2682 (Deprecate model-visible todo_* aliases in favor of checklist_*) is "drop from tool-search + deprecation notice," lower-risk than it sounds.

## Docs in this PR

- `docs/TOOL_LIFECYCLE.md`: lifecycle states as `const` name-sets + alias table in `tool_catalog.rs` (not a per-`ToolSpec` field) so registration stays untouched and old transcripts always replay; deprecation manifest; per-mode/provider active-catalog budget (incl. Arcee's 8-tool first-turn set); prefix-cache safety rules; `tool_agent` = canonical but DeepSeek-V4-gated.
- `docs/CODEBASE_SEARCH_DESIGN.md`: `rusqlite` storage; mtime/branch/vendor invalidation; explainable `{path,line,snippet,score,reasons[]}` contract; real-query eval set. Complements `grep_files`/`file_search`.
- `docs/SKILL_INVOCATION_DESIGN.md`: `$<skill-name>` inline invocation syntax (the token is the skill name), namespaced resolution, ambiguity-suggests-not-guesses, visible activation line, smallest-viable slice.
- `docs/VISION_NORTH_STAR.md`: command-surface taxonomy (`/memory` small · `/context` dashboard · `/rules` · `/workflow` · `/overlay` · `$<skill>` · `codebase_search`). Marked DIRECTION, not committed 0.8.53 work; records deferred-not-done diet items.

## Relationship to in-flight PRs

Does not contradict #2684 (subagent role vocab / lifecycle / eval) or #2685 (git history active + RLM/field errors). The planned code PRs (lifecycle mechanism → shell-alias collapse → `todo_*` deprecation → subagent cruft scrub) are described in `TOOL_LIFECYCLE.md §3` and await your sign-off.

## Verification

Docs only, no code. Anchors spot-checked against the tree (`skills/mod.rs:421/553`, `registry.rs:1017-1029`, `tool_catalog.rs:37-64/106-115/169-196`, `tool_routing.rs:1139-1140`, `todo.rs:184-196`, `rlm.rs:26`). `todo_add/update/list` replacement mappings corrected to their 1:1 `checklist_*` counterparts.

🤖 Generated with Claude Code