perf(prefix-cache): cache tool-catalog JSON serialization across checks #2632

Open
HUQIANTAO wants to merge 2 commits into Hmbown:main from HUQIANTAO:perf/tool-catalog-cache

@HUQIANTAO
Contributor

Summary

PrefixFingerprint::compute is called once per turn by the turn-loop prefix-stability check (and again on every mode flip, project-context refresh, or canonical-state overlay). The tool-side work serializes every tool to the chat-API JSON shape, sorts the resulting strings, joins with newlines, and SHA-256s the result. For a 60-tool catalog that is ~25–40 KB of allocation plus a sort, all of which produces a byte-identical output once the tool set is stable across turns — the common case after the first turn of a session.
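The uncached pipeline described above can be sketched as follows. This is a minimal illustration, not the crate's code: the `Tool` struct is reduced to two fields, `tool_to_line` is a hypothetical stand-in for the real chat-API JSON serialization (it does no escaping), and `DefaultHasher` stands in for SHA-256 so the sketch needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified stand-in for the crate's `Tool` type.
#[derive(Clone, Debug)]
pub struct Tool {
    pub name: String,
    pub description: String,
}

/// Hypothetical stand-in for the per-tool serialization step.
/// No JSON escaping is performed; the real code emits the chat-API JSON shape.
pub fn tool_to_line(tool: &Tool) -> String {
    format!(
        "{{\"name\":\"{}\",\"description\":\"{}\"}}",
        tool.name, tool.description
    )
}

/// Serialize every tool, sort the strings, join with newlines, digest the result.
/// ASSUMPTION: `DefaultHasher` replaces SHA-256 purely to keep the sketch std-only.
pub fn fingerprint_uncached(tools: &[Tool]) -> String {
    let mut lines: Vec<String> = tools.iter().map(tool_to_line).collect();
    lines.sort(); // the sort is what makes the fingerprint order-insensitive
    let joined = lines.join("\n");
    let mut hasher = DefaultHasher::new();
    joined.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

Every call allocates one string per tool plus the joined buffer, which is the per-turn cost the PR aims to avoid when the tool set has not changed.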

This PR introduces a process-local ToolCatalogCache that stores the joined+sorted catalog under a content-derived u64 identity. On a hit, the per-tool JSON serialization, sort, and join are skipped entirely; the pre-computed SHA-256 hex digest is returned directly.
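A rough sketch of the hit path, under stated assumptions: the names `CatalogCacheSketch`, `get_or_build`, `digest_hex`, and the `builds` counter are hypothetical and exist only for illustration, and this version is unbounded. It shows the idea that a hit returns the stored entry without running the expensive build step.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical entry type: the joined catalog plus its precomputed digest.
pub struct CachedCatalog {
    pub joined: String,
    pub digest_hex: String,
}

/// Minimal, unbounded sketch of a catalog cache keyed by a u64 identity.
#[derive(Default)]
pub struct CatalogCacheSketch {
    entries: HashMap<u64, Arc<CachedCatalog>>,
    /// Counts how many times the expensive build path ran (illustration only).
    pub builds: usize,
}

impl CatalogCacheSketch {
    /// Return the cached entry for `identity`, running `build` only on a miss.
    pub fn get_or_build<F>(&mut self, identity: u64, build: F) -> Arc<CachedCatalog>
    where
        F: FnOnce() -> CachedCatalog,
    {
        if let Some(hit) = self.entries.get(&identity) {
            // Hit: hand back the same allocation, no serialization or sort.
            return Arc::clone(hit);
        }
        self.builds += 1;
        let entry = Arc::new(build());
        self.entries.insert(identity, Arc::clone(&entry));
        entry
    }
}
```

Returning an `Arc` lets a test confirm a hit by pointer equality, which appears to be what the `miss_then_hit_returns_same_arc` test in this PR checks.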

Why now

The prefix-stability manager was wired into the turn loop in v0.8.50 to surface cache-hit telemetry to the TUI footer. That made the catalog serialization run on every turn instead of once per session boundary. This PR removes that regression: the first turn pays the full cost, and every subsequent turn pays only for the identity hash.

The cache lives on PrefixStabilityManager (per-session ownership) and backs a new PrefixFingerprint::compute_with_tool_cache entry point. check_and_update, PrefixStabilityManager::new, and pin() all use the cached path. The original compute() is kept as a fallback for callers that do not have a cache in hand.

Changes

| File | Change |
| --- | --- |
| crates/tui/src/prefix_cache.rs | New ToolCatalogCache (capacity-bounded LRU) and CachedCatalog entry type. New PrefixFingerprint::compute_with_tool_cache entry point. PrefixStabilityManager gains a tool_catalog_cache field. 8 new unit tests. |

Design notes

  • Identity hash is order-sensitive. Two slices with the same tools in different orders produce two cache entries. The downstream fingerprint itself remains order-insensitive (the sort in fingerprint_for takes care of that). Order-sensitive identity makes a re-registration of the same set in the same order a hit.
  • Eviction policy: insertion-order LRU. Matches the strategy in transcript_cache.rs. Capacity defaults to 8 (sized for "session + 1 or 2 forked subagent catalogs"). invalidate() is exposed for tool-registry hot-reload and MCP attach paths.
  • No new external dependencies. Uses std::collections::HashMap, VecDeque, and std::hash::DefaultHasher.
  • No public API breakage. PrefixFingerprint::compute is unchanged.
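The first two design notes can be illustrated with a small std-only sketch. It assumes a simplified identity over tool names only, and a cache that evicts strictly in insertion order without refreshing entries on a hit; whether the real `ToolCatalogCache` refreshes on hits is not stated in this PR. `identity_of` and `BoundedCache` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};

/// Order-sensitive identity over tool names (a simplified stand-in for
/// hashing the full tool definitions).
pub fn identity_of(names: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    names.len().hash(&mut hasher);
    for name in names {
        name.hash(&mut hasher);
    }
    hasher.finish()
}

/// Bounded map that evicts the oldest inserted key once capacity is exceeded.
pub struct BoundedCache<V> {
    capacity: usize,
    order: VecDeque<u64>,
    entries: HashMap<u64, V>,
}

impl<V> BoundedCache<V> {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), entries: HashMap::new() }
    }

    pub fn get(&self, key: u64) -> Option<&V> {
        self.entries.get(&key)
    }

    pub fn insert(&mut self, key: u64, value: V) {
        if self.capacity == 0 {
            return; // a zero-capacity cache stores nothing
        }
        // Only record insertion order for keys that were not already present.
        if self.entries.insert(key, value).is_none() {
            self.order.push_back(key);
        }
        while self.entries.len() > self.capacity {
            match self.order.pop_front() {
                Some(oldest) => {
                    self.entries.remove(&oldest);
                }
                None => break,
            }
        }
    }

    /// Drop every entry, e.g. after a tool-registry hot reload.
    pub fn invalidate(&mut self) {
        self.order.clear();
        self.entries.clear();
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With capacity 2, inserting keys 1, 2, 3 evicts key 1. The same names in a different order hash to a different identity, so a reordered catalog occupies a second slot even though its sorted fingerprint would be the same.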

Tests

running 30 tests (prefix_cache module)
test prefix_cache::tests::compute_with_tool_cache_matches_compute_uncached ... ok
test prefix_cache::tests::manager_check_and_update_uses_cached_tool_fingerprint ... ok
test prefix_cache::tests::tool_catalog_cache_miss_then_hit_returns_same_arc ... ok
test prefix_cache::tests::tool_catalog_cache_different_tool_sets_dont_collide ... ok
test prefix_cache::tests::tool_catalog_cache_pinned_by_input_order ... ok
test prefix_cache::tests::tool_catalog_cache_detects_schema_change ... ok
test prefix_cache::tests::tool_catalog_cache_respects_capacity ... ok
test prefix_cache::tests::tool_catalog_cache_invalidate_clears_all ... ok
test prefix_cache::tests::tool_catalog_cache_empty_slice_uses_zero_capacity_path ... ok
... (21 prior tests still pass)

test result: ok. 30 passed; 0 failed

The wider prefix-cache contract tests in settings, prompts, and core::engine::tests continue to pass.

@github-actions

github-actions Bot commented Jun 3, 2026

Thanks @HUQIANTAO for taking the time to contribute.

This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in .github/APPROVED_CONTRIBUTORS will be closed automatically.

Please read CONTRIBUTING.md for the expected contribution shape. A maintainer can grant PR access by commenting /lgtm on a pull request.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a ToolCatalogCache to optimize prefix fingerprint computation by caching the serialized JSON representation of tool catalogs, avoiding redundant serialization, sorting, and joining. A review comment identifies a correctness bug where the strict field of a tool is omitted from the cache identity hash, potentially leading to incorrect cache hits. Additionally, the reviewer suggests a performance improvement to hash the serde_json::Value directly instead of serializing it to a string on every check to avoid unnecessary allocations.

Comment on lines +319 to 334
fn tool_set_identity(tools: &[Tool]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tools.len().hash(&mut hasher);
    for tool in tools {
        tool.name.hash(&mut hasher);
        tool.description.hash(&mut hasher);
        // Hash the schema as a canonical JSON string. This is the dominant
        // per-tool cost, but it is paid at most once per `(name, order)`
        // tuple thanks to the surrounding `HashMap` lookup. Tools that
        // mutate their `input_schema` (rare) will simply miss the cache.
        let schema_text = serde_json::to_string(&tool.input_schema)
            .unwrap_or_else(|_| "<unserializable schema>".to_string());
        schema_text.hash(&mut hasher);
    }
    hasher.finish()
}
high

Correctness & Performance Issues in tool_set_identity

  1. Correctness Bug (High Severity): The tool.strict field is not hashed. Since strict directly affects the serialized JSON schema sent to the API (as seen in tool_to_api_json), omitting it from the identity hash can lead to incorrect cache hits if a tool's strict setting changes.
  2. Performance Overhead (Medium/High Severity): Calling serde_json::to_string(&tool.input_schema) on every turn/check (even on cache hits) performs multiple string allocations per tool. For a large tool catalog, this defeats a significant portion of the performance benefits of this cache.

Solution

We can resolve both issues by:

  • Explicitly hashing tool.strict.
  • Implementing a recursive helper function hash_json_value to hash the serde_json::Value directly without any string allocations.
fn tool_set_identity(tools: &[Tool]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tools.len().hash(&mut hasher);
    for tool in tools {
        tool.name.hash(&mut hasher);
        tool.description.hash(&mut hasher);
        tool.strict.hash(&mut hasher);
        hash_json_value(&tool.input_schema, &mut hasher);
    }
    hasher.finish()
}

fn hash_json_value<H: Hasher>(val: &serde_json::Value, state: &mut H) {
    match val {
        serde_json::Value::Null => 0.hash(state),
        serde_json::Value::Bool(b) => {
            1.hash(state);
            b.hash(state);
        }
        serde_json::Value::Number(n) => {
            2.hash(state);
            if let Some(i) = n.as_i64() {
                i.hash(state);
            } else if let Some(u) = n.as_u64() {
                u.hash(state);
            } else if let Some(f) = n.as_f64() {
                f.to_bits().hash(state);
            }
        }
        serde_json::Value::String(s) => {
            3.hash(state);
            s.hash(state);
        }
        serde_json::Value::Array(arr) => {
            4.hash(state);
            arr.len().hash(state);
            for v in arr {
                hash_json_value(v, state);
            }
        }
        serde_json::Value::Object(obj) => {
            5.hash(state);
            obj.len().hash(state);
            for (k, v) in obj {
                k.hash(state);
                hash_json_value(v, state);
            }
        }
    }
}
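
To make the first point concrete, here is a small std-only sketch of how leaving a field out of the identity hash produces a false cache hit. `ToolSketch` is a hypothetical, reduced tool definition; the type of `strict` (`Option<bool>` here) is an assumption, and the `include_strict` flag exists only to compare the two behaviours side by side.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, reduced tool definition; the real `Tool` also carries an
/// `input_schema`, and its `strict` field may have a different type.
pub struct ToolSketch {
    pub name: String,
    pub description: String,
    pub strict: Option<bool>,
}

/// Identity hash over a tool slice. `include_strict` toggles whether the
/// `strict` field participates, to compare the buggy and fixed variants.
pub fn identity(tools: &[ToolSketch], include_strict: bool) -> u64 {
    let mut hasher = DefaultHasher::new();
    tools.len().hash(&mut hasher);
    for tool in tools {
        tool.name.hash(&mut hasher);
        tool.description.hash(&mut hasher);
        if include_strict {
            tool.strict.hash(&mut hasher);
        }
    }
    hasher.finish()
}
```

Two catalogs that differ only in `strict` get the same identity when the field is skipped, so a lookup for the second would return the digest cached for the first. Hashing the field separates them.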

@Hmbown
Owner

Hmbown commented Jun 3, 2026

Thanks @HUQIANTAO. I checked this for the 0.8.52 freeze, and it should wait for a follow-up pass: CI is currently failing with -D warnings dead-code errors in prefix_cache.rs (PrefixFingerprint::compute, CachedCatalog::joined, and unused cache helper methods), which also breaks the macOS/mobile build legs. Not a release blocker, but the idea is useful once the cache API is either wired through or narrowed to only used surface.

HUQIANTAO added 2 commits June 3, 2026 19:43
PrefixFingerprint::compute is called once per turn by the turn loop
prefix-stability check. The tool-side work serializes every tool to the
chat-API JSON shape, sorts the resulting strings, joins with newlines,
and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of
allocation plus a sort, all of which produces a byte-identical output
once the tool set is stable across turns (the common case after the
first turn of a session).

Introduce a process-local ToolCatalogCache that stores the joined+sorted
catalog under a content-derived u64 identity (length + per-tool name +
description + serialized input_schema). On a hit, the per-tool JSON
serialization, sort, and join are skipped entirely — the pre-computed
SHA-256 hex digest is returned directly.

The cache lives on PrefixStabilityManager (per-session ownership) and
backs a new PrefixFingerprint::compute_with_tool_cache entry point.
check_and_update, PrefixStabilityManager::new, and pin() all use the
cached path. The original compute() is kept as a fallback for callers
that do not have a cache in hand (e.g. CLI tools that build a one-shot
fingerprint).

The cache is bounded (default capacity = 8) and uses insertion-order
eviction, matching the eviction strategy already in
transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload
and MCP attach paths.

Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on
hit), identity collisions, schema change detection, capacity eviction,
invalidate, empty slice, and the equivalence between cached and uncached
fingerprints. The full 30-test prefix_cache suite passes; the wider
prefix-cache contract tests in settings, prompts, and
core::engine::tests continue to pass.
…with PrefixFingerprint::compute

Three follow-ups to the previous perf commit:

1. Correctness: tool.strict participates in the wire format emitted by
   tool_to_api_json, so it MUST participate in the cache identity. Two
   catalogs that differ only in strict would otherwise collide and serve
   a stale SHA-256, silently busting prefix-cache stability on the wire.

2. Allocation: replace the per-tool serde_json::to_string in
   tool_set_identity with a hash_json_value helper that walks the JSON
   tree directly. For a 60-tool catalog this drops ~25-40 KB of
   transient allocation per cache miss.

3. Dead code: the previous patch introduced PrefixFingerprint::compute,
   CachedCatalog::joined, ToolCatalogCache::{invalidate,is_empty}, and a
   thread-local cache helper that were not used outside tests. With
   -D warnings in CI all four triggered dead-code errors. The compute
   helper is now only built in cfg(test); the rest are marked
   #[allow(dead_code)] with comments explaining their observability and
   test-only use.
@HUQIANTAO HUQIANTAO force-pushed the perf/tool-catalog-cache branch from 81e7bc7 to a3822e8 June 3, 2026 11:54