docs: v0.8.53 tool-surface-diet design + north-star direction (#2681, #2680) #2686

Merged
Hmbown merged 1 commit into codex/v0.8.53 from codex/v0.8.53-toolsurface-design-docs on Jun 3, 2026

Conversation

Owner

@Hmbown Hmbown commented Jun 3, 2026

Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover. No tool-catalog code changes in this cycle — these docs define the policy and contracts so the code PRs can follow with sign-off. Grounded in a verified inventory of the actual tool registry (every cited file:line was checked against the tree).

What the inventory changed about the brief

Docs in this PR

| Doc | Issue | Summary |
|---|---|---|
| docs/TOOL_LIFECYCLE.md | #2681 | Umbrella policy: 5 lifecycle states modeled as const name-sets + alias table in tool_catalog.rs (not a per-ToolSpec field) so registration stays untouched and old transcripts always replay; deprecation manifest; per-mode/provider active-catalog budget (incl. Arcee's 8-tool first-turn set); prefix-cache safety rules; tool_agent = canonical but DeepSeek-V4-gated. |
| docs/CODEBASE_SEARCH_DESIGN.md | #2680 (v0.9.0) | Local-first FTS5/BM25 + symbol/path + RRF hybrid; rusqlite storage; mtime/branch/vendor invalidation; explainable {path,line,snippet,score,reasons[]} contract; real-query eval set. Complements grep_files/file_search. |
| docs/SKILL_INVOCATION_DESIGN.md | (0.9.0) | $<skill-name> inline invocation syntax (the token is the skill name), namespaced resolution, ambiguity-suggests-not-guesses, visible activation line, smallest-viable slice. |
| docs/VISION_NORTH_STAR.md | (0.9.0+) | Intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, layered context-memory stack, tool repair/autoload, evaluation loop, and the command-surface taxonomy (/memory small · /context dashboard · /rules · /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not committed 0.8.53 work; records deferred-not-done diet items. |
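
For readers who want a concrete picture of the "const name-sets + alias table" idea in the TOOL_LIFECYCLE.md row, here is a minimal sketch. It is not the actual contents of tool_catalog.rs: the `Lifecycle` enum, the set names, `lifecycle_of`, and `canonical_name` are hypothetical, and the `checklist_add`/`checklist_update`/`checklist_list` names are an assumption inferred from the "1:1 checklist_* counterparts" wording.

```rust
// Hypothetical sketch only; identifiers and set contents are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lifecycle {
    Active,
    Deferred,
    HiddenCompat,
    Deprecated,
    Removed,
}

// Lifecycle membership lives in const name-sets, not in a per-tool field,
// so tool registration itself does not need to change.
const DEFERRED: &[&str] = &[];
const HIDDEN_COMPAT: &[&str] = &["exec_wait", "exec_interact", "tts"];
const DEPRECATED: &[&str] = &["todo_add", "todo_update", "todo_list"];
const REMOVED: &[&str] = &[];

// Alias table: old name -> canonical name, so old transcripts still resolve.
// The checklist_* names are assumed, not confirmed by the PR text.
const ALIASES: &[(&str, &str)] = &[
    ("exec_wait", "exec_shell_wait"),
    ("exec_interact", "exec_shell_interact"),
    ("todo_add", "checklist_add"),
    ("todo_update", "checklist_update"),
    ("todo_list", "checklist_list"),
];

/// Looks a tool name up in the name-sets; anything unlisted is treated as
/// active in this sketch.
fn lifecycle_of(name: &str) -> Lifecycle {
    if REMOVED.contains(&name) {
        Lifecycle::Removed
    } else if DEPRECATED.contains(&name) {
        Lifecycle::Deprecated
    } else if HIDDEN_COMPAT.contains(&name) {
        Lifecycle::HiddenCompat
    } else if DEFERRED.contains(&name) {
        Lifecycle::Deferred
    } else {
        Lifecycle::Active
    }
}

/// Resolves an old name to its canonical replacement, or returns it unchanged.
fn canonical_name(name: &str) -> &str {
    for (old, new) in ALIASES {
        if *old == name {
            return *new;
        }
    }
    name
}

fn main() {
    for name in ["exec_wait", "todo_add", "exec_shell_wait"] {
        println!("{name}: {:?} -> {}", lifecycle_of(name), canonical_name(name));
    }
}
```

Keeping the state in lookup tables rather than on each spec means a replayed transcript that calls an old name can still be routed, while the first-turn catalog only advertises the active set.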

Relationship to in-flight PRs

Does not contradict #2684 (subagent role vocab / lifecycle / eval) or #2685 (git history active + RLM/field errors). The planned code PRs (lifecycle mechanism → shell-alias collapse → todo_* deprecation → subagent cruft scrub) are described in TOOL_LIFECYCLE.md §3 and await your sign-off.

Verification

Docs only — no code. Anchors spot-checked against the tree (skills/mod.rs:421/553, registry.rs:1017-1029, tool_catalog.rs:37-64/106-115/169-196, tool_routing.rs:1139-1140, todo.rs:184-196, rlm.rs:26). todo_add/update/list replacement mappings corrected to their 1:1 checklist_* counterparts.

🤖 Generated with Claude Code

Design-only deliverables for the v0.8.53 "tool surface diet / canonical
surfaces" cutover (no catalog code in this cycle). Grounded in a verified
inventory of the actual tool registry.

- docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states
  (active / deferred / hidden-compatibility / deprecated / removed) modeled as
  const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec
  field), so registration stays untouched and old transcripts always replay.
  Includes the deprecation manifest (exec_wait/exec_interact/tts →
  hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are
  already non-visible dead code → cleanup + guardrail), per-mode/per-provider
  active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache
  safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated.
- docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 +
  symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor
  invalidation; an explainable tool contract returning reasons[]; and a real
  CodeWhale query eval set. Complements grep_files/file_search, never replaces.
- docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation
  syntax (the token IS the skill name), namespaced resolution, ambiguity-
  suggests-not-guesses, visible activation line, and a smallest-viable slice.
- docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase
  intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered
  context-memory stack, tool repair/autoload, the evaluation loop, and the
  command-surface taxonomy (/memory small · /context dashboard · /rules ·
  /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not
  committed 0.8.53 work; also records the deferred-not-done diet items.
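
The "RRF hybrid" item in the CODEBASE_SEARCH_DESIGN.md bullet refers to Reciprocal Rank Fusion, which merges several ranked lists by summing 1 / (k + rank) per document. The following is a rough sketch under stated assumptions, not the design doc's implementation: the `Fused` struct, the `rrf` function, the ranker labels, and k = 60 (a conventional default, not a value given in this PR) are all illustrative, and only the `path`, `score`, and `reasons` fields of the contract are modeled.

```rust
use std::collections::HashMap;

/// One fused hit; `reasons` records which ranker contributed and at what rank.
/// Hypothetical shape, loosely following the {path, score, reasons[]} contract.
#[derive(Debug)]
struct Fused {
    path: String,
    score: f64,
    reasons: Vec<String>,
}

/// Reciprocal Rank Fusion: score(d) = sum over rankers of 1 / (k + rank),
/// where rank is 1-based within each ranker's list.
fn rrf(rankings: &[(&str, Vec<&str>)], k: f64) -> Vec<Fused> {
    let mut acc: HashMap<&str, (f64, Vec<String>)> = HashMap::new();
    for (ranker, docs) in rankings {
        for (i, doc) in docs.iter().enumerate() {
            let rank = (i + 1) as f64;
            let entry = acc.entry(*doc).or_insert((0.0, Vec::new()));
            entry.0 += 1.0 / (k + rank);
            entry.1.push(format!("{ranker}#{}", i + 1));
        }
    }
    let mut out: Vec<Fused> = acc
        .into_iter()
        .map(|(path, (score, reasons))| Fused {
            path: path.to_string(),
            score,
            reasons,
        })
        .collect();
    // Highest score first; ties broken by path so the order is deterministic.
    out.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap()
            .then_with(|| a.path.cmp(&b.path))
    });
    out
}

fn main() {
    // Made-up rankings from three hypothetical rankers.
    let rankings = vec![
        ("bm25", vec!["src/a.rs", "src/b.rs", "src/c.rs"]),
        ("symbol", vec!["src/b.rs", "src/a.rs"]),
        ("path", vec!["src/b.rs"]),
    ];
    for hit in rrf(&rankings, 60.0) {
        println!("{:.4} {} {:?}", hit.score, hit.path, hit.reasons);
    }
}
```

Because RRF only looks at ranks, the BM25, symbol, and path rankers never need comparable raw scores, and the per-ranker entries double as human-readable `reasons`.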

Targets codex/v0.8.53.
Contributor

@greptile-apps greptile-apps Bot left a comment

Hmbown has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces several design documents outlining the architectural roadmap for CodeWhale v0.9.0+, including local-first semantic code retrieval, inline skill invocation syntax, a tool-surface lifecycle policy, and the overall 'North Star' vision. The review feedback highlights several areas for refinement: correcting a broken section reference and adding SQLite triggers to synchronize the FTS5 index in the codebase search design, expanding the skill invocation scanner to skip inline code backticks to avoid false positives, and correcting factual and mathematical inconsistencies regarding active native tool counts in the lifecycle policy document.


`<workspace-hash>` is a stable hash of the canonical workspace root, so each checkout/worktree gets its own index and nothing is shared across unrelated projects. Backed by `rusqlite` (existing dep).

> Migration note (ties to the `/memory doctor` taxonomy in §7): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read).

medium

The reference to §7 is broken because there is no Section 7 in this document. The /memory doctor subcommand is actually described in Appendix B of this document (and Section 9 of docs/VISION_NORTH_STAR.md). We should update this reference to point to Appendix B.

Suggested change
> Migration note (ties to the `/memory doctor` taxonomy in §7): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read).
> Migration note (ties to the `/memory doctor` taxonomy in Appendix B): older builds used `~/.deepseek`. The index path is `~/.codewhale` only; if a legacy `~/.deepseek/index` exists it is ignored (a future `doctor` may offer to migrate, never auto-read).

Comment on lines +111 to +112
);


medium

When using an external content table (content='chunks') in SQLite FTS5, the virtual table does not automatically stay in sync with the underlying content table. To prevent the index from becoming stale or corrupted when chunks are inserted, updated, or deleted, you must define SQLite triggers. It would be highly beneficial to specify these triggers in the design.

Suggested change
);
);
-- Triggers to keep the external content FTS5 index in sync with the chunks table
CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol);
END;
CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol) VALUES('delete', old.id, old.text, old.symbol);
END;
CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
INSERT INTO chunks_fts(chunks_fts, rowid, text, symbol) VALUES('delete', old.id, old.text, old.symbol);
INSERT INTO chunks_fts(rowid, text, symbol) VALUES (new.id, new.text, new.symbol);
END;

Comment on lines +166 to +168
resolver. The scanner must skip `$` occurrences inside code fences and inline
command strings (see Non-goals) so shell `$VAR` references are never treated as
skill mentions.

medium

In addition to code fences and inline command strings, the scanner should also explicitly skip $ occurrences inside inline code backticks (e.g., `$VAR`). This will prevent false positives when users or models reference shell variables or other $-prefixed terms in inline code blocks.

Suggested change
resolver. The scanner must skip `$` occurrences inside code fences and inline
command strings (see Non-goals) so shell `$VAR` references are never treated as
skill mentions.
resolver. The scanner must skip `$` occurrences inside code fences, inline code backticks, and inline
command strings (see Non-goals) so shell `$VAR` references are never treated as
skill mentions.
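
To illustrate the skipping rule discussed in this thread, here is a minimal sketch of a scanner that ignores `$` inside fenced blocks and inline code backticks. It is an assumption-laden illustration, not the design's scanner: `scan_skill_mentions` is a hypothetical name, the set of characters allowed in a skill name (ASCII alphanumerics, `-`, `_`) is assumed, and "inline command strings" are not handled because the excerpt does not define them.

```rust
/// Returns candidate `$name` tokens found outside fenced blocks and inline
/// code spans. Resolution against real skill names would happen afterwards.
fn scan_skill_mentions(text: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut in_fence = false;
    for line in text.lines() {
        // A line starting with three backticks opens or closes a fence.
        if line.trim_start().starts_with("```") {
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }
        let mut in_inline = false;
        let mut chars = line.chars().peekable();
        while let Some(c) = chars.next() {
            match c {
                // A backtick toggles the inline-code state for this line.
                '`' => in_inline = !in_inline,
                '$' if !in_inline => {
                    let mut name = String::new();
                    while let Some(&n) = chars.peek() {
                        if n.is_ascii_alphanumeric() || n == '-' || n == '_' {
                            name.push(n);
                            chars.next();
                        } else {
                            break;
                        }
                    }
                    if !name.is_empty() {
                        found.push(name);
                    }
                }
                _ => {}
            }
        }
    }
    found
}

fn main() {
    let text = "run $review-pr first\n`$HOME` is a shell variable\nthen $lint";
    println!("{:?}", scan_skill_mentions(text));
}
```

A bare `$VAR` in plain prose would still surface as a candidate here; under the ambiguity-suggests-not-guesses rule it would presumably fail to resolve rather than activate anything.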

Comment thread docs/TOOL_LIFECYCLE.md
## 5. Active-catalog budget (per mode, per provider)

The active set is the first-turn cost. Current default active set:
`DEFAULT_ACTIVE_NATIVE_TOOLS` has **25** entries (`tool_catalog.rs:37-64`).

medium

DEFAULT_ACTIVE_NATIVE_TOOLS actually has 26 entries in crates/tui/src/core/engine/tool_catalog.rs:37-64 (not 25). Let's correct this number to keep the design document factually accurate.

Suggested change
`DEFAULT_ACTIVE_NATIVE_TOOLS` has **25** entries (`tool_catalog.rs:37-64`).
`DEFAULT_ACTIVE_NATIVE_TOOLS` has **26** entries (`tool_catalog.rs:37-64`).

Comment thread docs/TOOL_LIFECYCLE.md

| Provider | First-turn active source | Current count | Target after diet |
|---|---|---|---|
| Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 25 | ~22 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) |

medium

Since DEFAULT_ACTIVE_NATIVE_TOOLS has 26 entries, the current count is 26, and removing exec_wait and exec_interact will bring the target count to 24 (or ~23 if we also account for other changes). Let's update the table row to reflect the correct count.

Suggested change
| Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 25 | ~22 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) |
| Default (DeepSeek et al.) | `DEFAULT_ACTIVE_NATIVE_TOOLS` | 26 | ~23 (drop `exec_wait`, `exec_interact`; `todo_*` already not active) |

Comment thread docs/TOOL_LIFECYCLE.md
The default diet removes `exec_wait` and `exec_interact` from the active head
(they become hidden-compat; their canonical twins `exec_shell_wait` /
`exec_shell_interact` stay). `tts` and `todo_*` are *already not* in the active
set, so the active count moves **25 → 23** from the wait/interact removal alone;

medium

Since the default active set has 26 entries, removing exec_wait and exec_interact (2 entries) moves the active count from 26 → 24 (not 25 → 23). Let's update this text to align with the correct math.

Suggested change
set, so the active count moves **25 → 23** from the wait/interact removal alone;
set, so the active count moves **26 → 24** from the wait/interact removal alone;

@Hmbown Hmbown merged commit 8cb4f94 into codex/v0.8.53 Jun 3, 2026
2 checks passed