docs: v0.8.53 tool-surface-diet design + north-star direction (#2681, #2680) by Hmbown · Pull Request #2686 · Hmbown/CodeWhale

Hmbown · 2026-06-03T18:47:52Z

Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover. No tool-catalog code changes in this cycle — these docs define the policy and contracts so the code PRs can follow with sign-off. Grounded in a verified inventory of the actual tool registry (every cited file:line was checked against the tree).

What the inventory changed about the brief

- The 11 legacy subagent names (#2683, "Deprecate legacy subagent tool names and keep only agent_open/eval/close canonical") are already invisible to models: all are #[allow(dead_code)] and never registered. Only agent_open/agent_eval/agent_close/tool_agent are wired (registry.rs:1017-1029). So #2683 is cleanup + guardrails, not "hide tools."
- The real both-active exact duplicates are the shell aliases: exec_wait≡exec_shell_wait, exec_interact≡exec_shell_interact (same struct, two names, both active), plus speech/tts. Collapsing these is the highest-value win for weaker models.
- todo_* is already deferred, so #2682 ("Deprecate model-visible todo_* aliases in favor of checklist_*") reduces to "drop from tool-search + deprecation notice," which is lower-risk than it sounds.

Docs in this PR

| Doc | Issue | Summary |
|---|---|---|
| `docs/TOOL_LIFECYCLE.md` | #2681 | Umbrella policy: 5 lifecycle states modeled as `const` name-sets + alias table in `tool_catalog.rs` (not a per-`ToolSpec` field) so registration stays untouched and old transcripts always replay; deprecation manifest; per-mode/provider active-catalog budget (incl. Arcee's 8-tool first-turn set); prefix-cache safety rules; `tool_agent` = canonical but DeepSeek-V4-gated. |
| `docs/CODEBASE_SEARCH_DESIGN.md` | #2680 (v0.9.0) | Local-first FTS5/BM25 + symbol/path + RRF hybrid; `rusqlite` storage; mtime/branch/vendor invalidation; explainable `{path,line,snippet,score,reasons[]}` contract; real-query eval set. Complements `grep_files`/`file_search`. |
| `docs/SKILL_INVOCATION_DESIGN.md` | (0.9.0) | `$<skill-name>` inline invocation syntax (the token is the skill name), namespaced resolution, ambiguity-suggests-not-guesses, visible activation line, smallest-viable slice. |
| `docs/VISION_NORTH_STAR.md` | (0.9.0+) | Intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, layered context-memory stack, tool repair/autoload, evaluation loop, and the command-surface taxonomy (`/memory` small · `/context` dashboard · `/rules` · `/workflow` · `/overlay` · `$<skill>` · `codebase_search`). Marked DIRECTION, not committed 0.8.53 work; records deferred-not-done diet items. |
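
The "RRF hybrid" named in the `CODEBASE_SEARCH_DESIGN.md` row can be illustrated with a small sketch of reciprocal rank fusion. This is not the design's implementation (the project is Rust and the design doc is not reproduced here); it is a Python illustration that assumes two hypothetical ranked lists (`bm25`, `symbol`), the conventional constant `k = 60`, and a result shape that covers only the `path`, `score`, and `reasons` fields of the contract.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists with reciprocal rank fusion.

    rankings: dict mapping a source name (hypothetical, e.g. "bm25") to a
    list of paths ordered best-first. Each path scores sum(1 / (k + rank)).
    """
    scores = defaultdict(float)
    reasons = defaultdict(list)
    for source, ranked in rankings.items():
        for rank, path in enumerate(ranked, start=1):
            scores[path] += 1.0 / (k + rank)
            # Record why the path surfaced, so the result stays explainable.
            reasons[path].append(f"{source}#{rank}")
    results = [
        {"path": path, "score": score, "reasons": reasons[path]}
        for path, score in scores.items()
    ]
    # Highest fused score first; ties broken by path for determinism.
    results.sort(key=lambda r: (-r["score"], r["path"]))
    return results


if __name__ == "__main__":
    fused = rrf_fuse({"bm25": ["a.rs", "b.rs"], "symbol": ["b.rs", "c.rs"]})
    for row in fused:
        print(row["path"], round(row["score"], 5), row["reasons"])
```

A path that appears in both lists outranks one that tops a single list, which is the property that makes the fusion useful without score normalization across retrievers.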

Relationship to in-flight PRs

Does not contradict #2684 (subagent role vocab / lifecycle / eval) or #2685 (git history active + RLM/field errors). The planned code PRs (lifecycle mechanism → shell-alias collapse → todo_* deprecation → subagent cruft scrub) are described in TOOL_LIFECYCLE.md §3 and await your sign-off.

Verification

Docs only — no code. Anchors spot-checked against the tree (skills/mod.rs:421/553, registry.rs:1017-1029, tool_catalog.rs:37-64/106-115/169-196, tool_routing.rs:1139-1140, todo.rs:184-196, rlm.rs:26). todo_add/update/list replacement mappings corrected to their 1:1 checklist_* counterparts.

🤖 Generated with Claude Code

Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover (no catalog code in this cycle). Grounded in a verified inventory of the actual tool registry.

- docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states (active / deferred / hidden-compatibility / deprecated / removed) modeled as const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec field), so registration stays untouched and old transcripts always replay. Includes the deprecation manifest (exec_wait/exec_interact/tts → hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are already non-visible dead code → cleanup + guardrail), per-mode/per-provider active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated.
- docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 + symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor invalidation; an explainable tool contract returning reasons[]; and a real CodeWhale query eval set. Complements grep_files/file_search, never replaces.
- docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation syntax (the token IS the skill name), namespaced resolution, ambiguity-suggests-not-guesses, visible activation line, and a smallest-viable slice.
- docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered context-memory stack, tool repair/autoload, the evaluation loop, and the command-surface taxonomy (/memory small · /context dashboard · /rules · /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not committed 0.8.53 work; also records the deferred-not-done diet items.

Targets codex/v0.8.53.
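
To make the "name-sets + alias table" idea concrete, here is a minimal sketch of how lifecycle state can live beside the registry instead of on each tool spec. It is written in Python for brevity, whereas the actual design targets `const` sets in `tool_catalog.rs`. The set membership is illustrative only, and the `checklist_add` / `checklist_update` / `checklist_list` names are assumptions inferred from the "1:1 `checklist_*` counterparts" wording; `NAME_SETS`, `ALIASES`, and the helper functions are hypothetical names.

```python
from enum import Enum
from typing import Optional


class Lifecycle(Enum):
    ACTIVE = "active"
    DEFERRED = "deferred"
    HIDDEN_COMPAT = "hidden-compatibility"
    DEPRECATED = "deprecated"
    REMOVED = "removed"


# Illustrative membership only; the real sets are defined by the design doc.
NAME_SETS = {
    Lifecycle.ACTIVE: frozenset({"exec_shell_wait", "exec_shell_interact"}),
    Lifecycle.HIDDEN_COMPAT: frozenset({"exec_wait", "exec_interact", "tts"}),
    Lifecycle.DEPRECATED: frozenset({"todo_add", "todo_update", "todo_list"}),
}

# Alias table: old name -> canonical name. The checklist_* targets are assumed.
ALIASES = {
    "exec_wait": "exec_shell_wait",
    "exec_interact": "exec_shell_interact",
    "tts": "speech",
    "todo_add": "checklist_add",
    "todo_update": "checklist_update",
    "todo_list": "checklist_list",
}


def lifecycle_of(name: str) -> Optional[Lifecycle]:
    """Return the lifecycle state of a tool name, or None if it is unlisted."""
    for state, names in NAME_SETS.items():
        if name in names:
            return state
    return None


def canonical_name(name: str) -> str:
    """Resolve an old name to its canonical twin so old transcripts replay."""
    return ALIASES.get(name, name)


def in_first_turn_catalog(name: str) -> bool:
    """Only ACTIVE names are advertised to the model on the first turn."""
    return lifecycle_of(name) is Lifecycle.ACTIVE
```

Because the state is a lookup over names, registration code does not change when a tool moves between states, and a replayed call to `exec_wait` still resolves to a live handler through `canonical_name`.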

gemini-code-assist

Code Review

This pull request introduces several design documents outlining the architectural roadmap for CodeWhale v0.9.0+, including local-first semantic code retrieval, inline skill invocation syntax, a tool-surface lifecycle policy, and the overall 'North Star' vision. The review feedback highlights several areas for refinement: correcting a broken section reference and adding SQLite triggers to synchronize the FTS5 index in the codebase search design, expanding the skill invocation scanner to skip inline code backticks to avoid false positives, and correcting factual and mathematical inconsistencies regarding active native tool counts in the lifecycle policy document.

gemini-code-assist · 2026-06-03T18:49:25Z

+
+`<workspace-hash>` is a stable hash of the canonical workspace root, so each checkout/worktree gets its own index and nothing is shared across unrelated projects. Backed by `rusqlite` (existing dep).
+
+> Migration note (ties to the `/memory doctor` taxonomy in §7): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read).


The reference to §7 is broken because there is no Section 7 in this document. The /memory doctor subcommand is actually described in Appendix B of this document (and Section 9 of docs/VISION_NORTH_STAR.md). We should update this reference to point to Appendix B.

Suggested change

-> Migration note (ties to the `/memory doctor` taxonomy in §7): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read).
+> Migration note (ties to the `/memory doctor` taxonomy in Appendix B): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read).
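
The per-workspace index location described in the quoted passage could be derived roughly as follows. This is a sketch under stated assumptions: the hash function (SHA-256, truncated to 16 hex characters), the `index/<workspace-hash>/index.sqlite` layout under `~/.codewhale`, and the function names are all hypothetical. The source only states that the hash is stable over the canonical workspace root and that the legacy `~/.deepseek/index` is never read.

```python
import hashlib
from pathlib import Path


def workspace_hash(root) -> str:
    """Stable hash of the canonical (symlink- and '..'-resolved) workspace root."""
    canonical = Path(root).resolve()
    digest = hashlib.sha256(str(canonical).encode("utf-8")).hexdigest()
    return digest[:16]  # truncation length is an arbitrary choice for this sketch


def index_path(root, home=None) -> Path:
    """Index file for one workspace; the directory layout is assumed, not sourced."""
    base = Path(home) if home is not None else Path.home()
    return base / ".codewhale" / "index" / workspace_hash(root) / "index.sqlite"


if __name__ == "__main__":
    print(index_path("."))
```

Canonicalizing before hashing means two spellings of the same checkout map to one index, while separate worktrees at different paths never collide.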

gemini-code-assist · 2026-06-03T18:49:25Z

+);
+


When using an external content table (content='chunks') in SQLite FTS5, the virtual table does not automatically stay in sync with the underlying content table. To prevent the index from becoming stale or corrupted when chunks are inserted, updated, or deleted, you must define SQLite triggers. It would be highly beneficial to specify these triggers in the design.

Suggested change

-);
+);
+-- Triggers to keep the external content FTS5 index in sync with the chunks table
+CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
+  INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol);
+END;
+CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
+  INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol) VALUES('delete', old.id, old.text, old.symbol);
+END;
+CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
+  INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol) VALUES('delete', old.id, old.text, old.symbol);
+  INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol);
+END;
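
The effect of these triggers can be checked with a short, self-contained script. The `chunks` and `chunks_fts` definitions below are assumptions (the design's actual schema is not shown in this thread, only the column names `id`, `text`, and `symbol` used by the triggers), and the script assumes the local SQLite build ships with FTS5. It uses Python's `sqlite3` for convenience rather than `rusqlite`.

```python
import sqlite3

# Assumed schema: only id/text/symbol are implied by the suggested triggers.
SCHEMA = """
CREATE TABLE chunks (
  id INTEGER PRIMARY KEY,
  path TEXT NOT NULL,
  text TEXT NOT NULL,
  symbol TEXT
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(
  text, symbol, content='chunks', content_rowid='id'
);
CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
  INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol);
END;
CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
  INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol)
    VALUES('delete', old.id, old.text, old.symbol);
END;
CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
  INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol)
    VALUES('delete', old.id, old.text, old.symbol);
  INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol);
END;
"""


def search(conn, query):
    """Return matching chunk ids, best match first."""
    rows = conn.execute(
        "SELECT rowid FROM chunks_fts WHERE chunks_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


def demo():
    """Insert, update, and delete one chunk, searching after each step."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO chunks (id, path, text, symbol) VALUES (?, ?, ?, ?)",
        (1, "a.rs", "parse manifest", "parse"),
    )
    after_insert = search(conn, "parse")
    conn.execute(
        "UPDATE chunks SET text = ?, symbol = ? WHERE id = ?",
        ("render manifest", "render", 1),
    )
    old_term_after_update = search(conn, "parse")
    new_term_after_update = search(conn, "render")
    conn.execute("DELETE FROM chunks WHERE id = ?", (1,))
    after_delete = search(conn, "render")
    conn.close()
    return {
        "after_insert": after_insert,
        "old_term_after_update": old_term_after_update,
        "new_term_after_update": new_term_after_update,
        "after_delete": after_delete,
    }


if __name__ == "__main__":
    print(demo())
```

Without the update and delete triggers, the old term would keep matching after the row changed, which is the stale-index failure the review comment warns about.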

gemini-code-assist · 2026-06-03T18:49:25Z

+  resolver. The scanner must skip `$` occurrences inside code fences and inline
+  command strings (see Non-goals) so shell `$VAR` references are never treated as
+  skill mentions.


In addition to code fences and inline command strings, the scanner should also explicitly skip $ occurrences inside inline code backticks (e.g., `$VAR`). This will prevent false positives when users or models reference shell variables or other $-prefixed terms in inline code blocks.

Suggested change

-  resolver. The scanner must skip `$` occurrences inside code fences and inline
-  command strings (see Non-goals) so shell `$VAR` references are never treated as
-  skill mentions.
+  resolver. The scanner must skip `$` occurrences inside code fences, inline code backticks, and inline
+  command strings (see Non-goals) so shell `$VAR` references are never treated as
+  skill mentions.
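
A scanner with the skipping behavior discussed here might look like the sketch below. The token grammar (a letter, then word characters, with `-` or `:` allowed as inner separators for namespaced names) is an assumption, as are the function and pattern names. "Inline command strings" are not handled because the thread does not define them.

```python
import re

# Assumed grammar for a skill token: $name, $name-part, $namespace:name.
SKILL_TOKEN = re.compile(
    r"(?<![\w$])\$([A-Za-z][A-Za-z0-9_]*(?:[-:][A-Za-z0-9_]+)*)"
)
# A fence line opens or closes a fenced code block (backticks or tildes).
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")
# A single-backtick inline code span on one line.
INLINE_CODE = re.compile(r"`[^`\n]*`")


def find_skill_mentions(text):
    """Return skill names mentioned as $name outside fences and inline code."""
    mentions = []
    in_fence = False
    for line in text.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence  # toggle on every fence line
            continue
        if in_fence:
            continue
        # Blank out inline code spans so their contents cannot match.
        visible = INLINE_CODE.sub(lambda m: " " * len(m.group(0)), line)
        mentions.extend(m.group(1) for m in SKILL_TOKEN.finditer(visible))
    return mentions
```

A bare `$HOME` in ordinary prose would still be reported by this sketch; rejecting it would be the resolver's job, in line with the "ambiguity suggests, not guesses" rule.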

gemini-code-assist · 2026-06-03T18:49:25Z

+## 5. Active-catalog budget (per mode, per provider)
+
+The active set is the first-turn cost. Current default active set:
+`DEFAULT_ACTIVE_NATIVE_TOOLS` has **25** entries (`tool_catalog.rs:37-64`).


DEFAULT_ACTIVE_NATIVE_TOOLS actually has 26 entries in crates/tui/src/core/engine/tool_catalog.rs:37-64 (not 25). Let's correct this number to keep the design document factually accurate.

Suggested change

-`DEFAULT_ACTIVE_NATIVE_TOOLS` has **25** entries (`tool_catalog.rs:37-64`).
+`DEFAULT_ACTIVE_NATIVE_TOOLS` has **26** entries (`tool_catalog.rs:37-64`).

gemini-code-assist · 2026-06-03T18:49:25Z

+
+| Provider | First-turn active source | Current count | Target after diet |
+|---|---|---|---|
+| Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 25 | ~22 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) |


Since DEFAULT_ACTIVE_NATIVE_TOOLS has 26 entries, the current count is 26, and removing exec_wait and exec_interact will bring the target count to 24 (or ~23 if we also account for other changes). Let's update the table row to reflect the correct count.

Suggested change

-| Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 25 | ~22 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) |
+| Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 26 | ~23 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) |

gemini-code-assist · 2026-06-03T18:49:25Z

+The default diet removes `exec_wait` and `exec_interact` from the active head
+(they become hidden-compat; their canonical twins `exec_shell_wait` /
+`exec_shell_interact` stay). `tts` and `todo_*` are *already not* in the active
+set, so the active count moves **25 → 23** from the wait/interact removal alone;


Since the default active set has 26 entries, removing exec_wait and exec_interact (2 entries) moves the active count from 26 → 24 (not 25 → 23). Let's update this text to align with the correct math.

Suggested change

-set, so the active count moves **25 → 23** from the wait/interact removal alone;
+set, so the active count moves **26 → 24** from the wait/interact removal alone;
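
The arithmetic behind this correction can be expressed as a small check of the kind that could guard the doc against drifting from the catalog. The 26-entry list below is a placeholder (24 invented `tool_NN` names plus the two aliases), since the real contents of `DEFAULT_ACTIVE_NATIVE_TOOLS` are not shown in this thread; `diet_delta` is a hypothetical helper.

```python
def diet_delta(active, to_hide):
    """Return (count_before, count_after) when `to_hide` names leave the active set."""
    remaining = [name for name in active if name not in to_hide]
    return len(active), len(remaining)


# Placeholder stand-in for the real 26-entry default active set.
placeholder_active = ["exec_wait", "exec_interact"] + [
    f"tool_{i:02d}" for i in range(24)
]

# tts and todo_* are not in the active set, so listing them changes nothing.
hidden = {"exec_wait", "exec_interact", "tts", "todo_add"}

if __name__ == "__main__":
    print(diet_delta(placeholder_active, hidden))
```

Only names that are actually in the active set affect the count, which is why the move is 26 → 24 from the wait/interact removal alone.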

greptile-apps Bot reviewed Jun 3, 2026

gemini-code-assist Bot reviewed Jun 3, 2026

Hmbown mentioned this pull request Jun 3, 2026

feat(project): deprecate WHALE.md; add .codewhale/constitution.json authority layer (v0.8.53) #2688

Merged

Hmbown merged commit 8cb4f94 into codex/v0.8.53 Jun 3, 2026
2 checks passed

Hmbown merged 1 commit into codex/v0.8.53 from codex/v0.8.53-toolsurface-design-docs
