perf(prefix-cache): cache tool-catalog JSON serialization across checks#2632
perf(prefix-cache): cache tool-catalog JSON serialization across checks#2632HUQIANTAO wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
|
Thanks @HUQIANTAO for taking the time to contribute. This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in Please read |
There was a problem hiding this comment.
Code Review
This pull request introduces a ToolCatalogCache to optimize prefix fingerprint computation by caching the serialized JSON representation of tool catalogs, avoiding redundant serialization, sorting, and joining. A review comment identifies a correctness bug where the strict field of a tool is omitted from the cache identity hash, potentially leading to incorrect cache hits. Additionally, the reviewer suggests a performance improvement to hash the serde_json::Value directly instead of serializing it to a string on every check to avoid unnecessary allocations.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| fn tool_set_identity(tools: &[Tool]) -> u64 { | ||
| let mut hasher = DefaultHasher::new(); | ||
| tools.len().hash(&mut hasher); | ||
| for tool in tools { | ||
| tool.name.hash(&mut hasher); | ||
| tool.description.hash(&mut hasher); | ||
| // Hash the schema as a canonical JSON string. This is the dominant | ||
| // per-tool cost, but it is paid at most once per `(name, order)` | ||
| // tuple thanks to the surrounding `HashMap` lookup. Tools that | ||
| // mutate their `input_schema` (rare) will simply miss the cache. | ||
| let schema_text = serde_json::to_string(&tool.input_schema) | ||
| .unwrap_or_else(|_| "<unserializable schema>".to_string()); | ||
| schema_text.hash(&mut hasher); | ||
| } | ||
| hasher.finish() | ||
| } |
There was a problem hiding this comment.
Correctness & Performance Issues in tool_set_identity
- Correctness Bug (High Severity): The
tool.strictfield is not hashed. Sincestrictdirectly affects the serialized JSON schema sent to the API (as seen intool_to_api_json), omitting it from the identity hash can lead to incorrect cache hits if a tool'sstrictsetting changes. - Performance Overhead (Medium/High Severity): Calling
serde_json::to_string(&tool.input_schema)on every turn/check (even on cache hits) performs multiple string allocations per tool. For a large tool catalog, this defeats a significant portion of the performance benefits of this cache.
Solution
We can resolve both issues by:
- Explicitly hashing
tool.strict. - Implementing a recursive helper function
hash_json_valueto hash theserde_json::Valuedirectly without any string allocations.
fn tool_set_identity(tools: &[Tool]) -> u64 {
let mut hasher = DefaultHasher::new();
tools.len().hash(&mut hasher);
for tool in tools {
tool.name.hash(&mut hasher);
tool.description.hash(&mut hasher);
tool.strict.hash(&mut hasher);
hash_json_value(&tool.input_schema, &mut hasher);
}
hasher.finish()
}
fn hash_json_value<H: Hasher>(val: &serde_json::Value, state: &mut H) {
match val {
serde_json::Value::Null => 0.hash(state),
serde_json::Value::Bool(b) => {
1.hash(state);
b.hash(state);
}
serde_json::Value::Number(n) => {
2.hash(state);
if let Some(i) = n.as_i64() {
i.hash(state);
} else if let Some(u) = n.as_u64() {
u.hash(state);
} else if let Some(f) = n.as_f64() {
f.to_bits().hash(state);
}
}
serde_json::Value::String(s) => {
3.hash(state);
s.hash(state);
}
serde_json::Value::Array(arr) => {
4.hash(state);
arr.len().hash(state);
for v in arr {
hash_json_value(v, state);
}
}
serde_json::Value::Object(obj) => {
5.hash(state);
obj.len().hash(state);
for (k, v) in obj {
k.hash(state);
hash_json_value(v, state);
}
}
}
}|
Thanks @HUQIANTAO. I checked this for the 0.8.52 freeze, and it should wait for a follow-up pass: CI is currently failing with |
PrefixFingerprint::compute is called once per turn by the turn loop prefix-stability check. The tool-side work serializes every tool to the chat-API JSON shape, sorts the resulting strings, joins with newlines, and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of allocation plus a sort, all of which produces a byte-identical output once the tool set is stable across turns (the common case after the first turn of a session). Introduce a process-local ToolCatalogCache that stores the joined+sorted catalog under a content-derived u64 identity (length + per-tool name + description + serialized input_schema). On a hit, the per-tool JSON serialization, sort, and join are skipped entirely — the pre-computed SHA-256 hex digest is returned directly. The cache lives on PrefixStabilityManager (per-session ownership) and backs a new PrefixFingerprint::compute_with_tool_cache entry point. check_and_update, PrefixStabilityManager::new, and pin() all use the cached path. The original compute() is kept as a fallback for callers that do not have a cache in hand (e.g. CLI tools that build a one-shot fingerprint). The cache is bounded (default capacity = 8) and uses insertion-order eviction, matching the eviction strategy already in transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload and MCP attach paths. Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on hit), identity collisions, schema change detection, capacity eviction, invalidate, empty slice, and the equivalence between cached and uncached fingerprints. The full 30-test prefix_cache suite passes; the wider prefix-cache contract tests in settings, prompts, and core::engine::tests continue to pass.
…with PrefixFingerprint::compute
Three follow-ups to the previous perf commit:
1. Correctness: tool.strict participates in the wire format emitted by
tool_to_api_json, so it MUST participate in the cache identity. Two
catalogs that differ only in strict would otherwise collide and serve
a stale SHA-256, silently busting prefix-cache stability on the wire.
2. Allocation: replace the per-tool serde_json::to_string in
tool_set_identity with a hash_json_value helper that walks the JSON
tree directly. For a 60-tool catalog this drops ~25-40 KB of
transient allocation per cache miss.
3. Dead code: the previous patch introduced PrefixFingerprint::compute,
CachedCatalog::joined, ToolCatalogCache::{invalidate,is_empty}, and a
thread-local cache helper that were not used outside tests. With
-D warnings in CI all four triggered dead-code errors. The compute
helper is now only built in cfg(test); the rest are marked
#[allow(dead_code)] with comments explaining their observability and
test-only use.
81e7bc7 to
a3822e8
Compare
There was a problem hiding this comment.
HUQIANTAO has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
Summary
PrefixFingerprint::computeis called once per turn by the turn-loop prefix-stability check (and again on every mode flip, project-context refresh, or canonical-state overlay). The tool-side work serializes every tool to the chat-API JSON shape, sorts the resulting strings, joins with newlines, and SHA-256s the result. For a 60-tool catalog that is ~25–40 KB of allocation plus a sort, all of which produces a byte-identical output once the tool set is stable across turns — the common case after the first turn of a session.This PR introduces a process-local
ToolCatalogCachethat stores the joined+sorted catalog under a content-derivedu64identity. On a hit, the per-tool JSON serialization, sort, and join are skipped entirely; the pre-computed SHA-256 hex digest is returned directly.Why now
The prefix-stability manager was wired into the turn loop in v0.8.50 to surface cache-hit telemetry to the TUI footer. That made the catalog serialization run on every turn instead of every session boundary. This PR removes that regression: the first turn pays the full cost, every subsequent turn pays only the identity hash.
The cache lives on
PrefixStabilityManager(per-session ownership) and backs a newPrefixFingerprint::compute_with_tool_cacheentry point.check_and_update,PrefixStabilityManager::new, andpin()all use the cached path. The originalcompute()is kept as a fallback for callers that do not have a cache in hand.Changes
crates/tui/src/prefix_cache.rsToolCatalogCache(capacity-bounded LRU) andCachedCatalogentry type. NewPrefixFingerprint::compute_with_tool_cacheentry point.PrefixStabilityManagergains atool_catalog_cachefield. 8 new unit tests.Design notes
fingerprint_fortakes care of that). Order-sensitive identity makes a re-registration of the same set in the same order a hit.transcript_cache.rs. Capacity defaults to 8 (sized for "session + 1 or 2 forked subagent catalogs").invalidate()is exposed for tool-registry hot-reload and MCP attach paths.std::collections::HashMap,VecDeque, andstd::hash::DefaultHasher.PrefixFingerprint::computeis unchanged.Tests
The wider prefix-cache contract tests in
settings,prompts, andcore::engine::testscontinue to pass.