Skip to content

Cache last-successful union branch index per schema instance#1439

Open
heiwen wants to merge 3 commits intoopen-circle:mainfrom
heiwen:perf/union-success-cache
Open

Cache last-successful union branch index per schema instance#1439
heiwen wants to merge 3 commits intoopen-circle:mainfrom
heiwen:perf/union-success-cache

Conversation

@heiwen
Copy link
Copy Markdown

@heiwen heiwen commented Mar 22, 2026

Summary

In typical usage, repeated calls to union with valid input match the same branch every time (e.g. a discriminated union where the input type is stable). This caches the last-successful branch index in a module-level WeakMap<object, number> keyed on the schema instance and tries it first on the next call.

  • On cache hit with valid input: one branch tried instead of N, no issue accumulation
  • On cache miss or invalid input: falls back to the existing linear scan, cache updated on success
  • No behavioral change on output or error reporting

Benchmark

~3.0% improvement on safeParse valid (allErrors) path.

Related to #1408

Summary by CodeRabbit

  • Performance
    • Union schema validation now remembers the last successful option per schema and uses a fast path when that option is first, reducing repeated work and speeding up common validations.
    • Maintains correctness by falling back to full evaluation when needed, ensuring accurate results while improving throughput.

Add a module-level `WeakMap` that records, for each `union` schema
instance, the index of the option that matched on the most recent
successful parse. On the next call the cached index is tried first; if
it matches, the remaining options are skipped entirely. Falls back to
the linear scan when the hint misses or the schema has never matched.

Schemas derived via `omit`/`pick`/`partial`/`required` share the same
object identity so the hint naturally tracks per instance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 22, 2026

📝 Walkthrough

Walkthrough

A module-level WeakMap was added to cache each union schema instance's last successful option index. The ~run method now re-checks the cached option first (bounds-checked), includes a start-of-run fast path for the cached index 0, and updates the cache when an option validates without issues.

Changes

Cohort / File(s) Summary
Union Schema Caching
library/src/schemas/union/union.ts
Added module-level WeakMap _lastSuccessIndex keyed by schema instance. ~run now fast-paths to the cached option (bounds-checked), includes a start-of-run shortcut when the cached index is 0 that re-runs only this.options[0]['~run'], uses an index-based loop over this.options, and updates the cache when an option returns a typed dataset with no issues.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A hop remembered, tucked away tight,
I try the same path that once felt right,
WeakMap whispers which choice to test,
I bounce once more and settle in rest,
Fewer hops now — a cozy night.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Cache last-successful union branch index per schema instance' directly matches the main change: adding a WeakMap to cache the index of the last successful union option per schema instance for performance optimization.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@heiwen heiwen marked this pull request as ready for review March 22, 2026 14:58
@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. enhancement New feature or request performance Its performance related labels Mar 22, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@library/src/schemas/union/union.ts`:
- Around line 129-130: The early-return that returns lastDataset as
SuccessDataset<unknown> (when lastDataset.typed && !lastDataset.issues) breaks
left-to-right winner semantics for overlapping unions; change the fast-path so
it only triggers for unions proven disjoint (e.g., gate on a disjoint
flag/property or add an isDisjoint check) or remove the early return entirely so
earlier branches are still attempted, and apply the same fix at the analogous
check around lines 154-155; add a regression test in
library/src/schemas/union/union.test.ts that exercises union([literal('foo'),
string()]) (parse 'bar' then 'foo') to assert the first branch wins and
transforms/defaults behave correctly.
- Around line 124-128: The cached lastIndex is being dereferenced without
verifying it still points into the current this.options array (UnionOptions is
stored by reference); update the union() logic around lastIndex/lastDataset so
you first verify lastIndex is a valid index (e.g., lastIndex >= 0 && lastIndex <
this.options.length && this.options[lastIndex] && typeof
this.options[lastIndex]['~run'] === 'function') before calling
this.options[lastIndex]['~run'](...); if the guard fails, clear or ignore
lastIndex and fall back to the existing linear scan over this.options to compute
lastDataset as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f75f1f66-48b7-4398-9570-fc39179d53fa

📥 Commits

Reviewing files that changed from the base of the PR and between 2fa2435 and 5a1903f.

📒 Files selected for processing (1)
  • library/src/schemas/union/union.ts

Comment thread library/src/schemas/union/union.ts Outdated
Comment thread library/src/schemas/union/union.ts Outdated
- Guard lastIndex with < this.options.length in case options is mutated
- Only early-return on cache hit when lastIndex === 0; for higher indexes
  the full loop still runs (reusing the cached dataset) so earlier branches
  with higher priority are not skipped

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 23, 2026

@heiwen is attempting to deploy a commit to the Open Circle Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Mar 23, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@library/src/schemas/union/union.ts`:
- Around line 119-143: The current fast-path reuses a cached branch result from
_lastSuccessIndex even when lastIndex > 0, causing speculative execution of
later branches; modify the logic so the cached optionDataset is only reused when
lastIndex === 0 (i.e., only allow the short-circuit fast path for index 0) —
remove/guard the reuse for any other lastIndex to avoid running non-priority
branches, update the conditional around this.options[i]['~run'] to reflect this
change, and add a regression test in library/src/schemas/union/union.test.ts
that demonstrates union([literal('foo'), pipe(string(), transform(sideEffect))])
does not invoke the transform when branch 0 matches.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0a068d3-148a-4f84-82d3-5ab0c0a72c0f

📥 Commits

Reviewing files that changed from the base of the PR and between 5a1903f and 13c27ac.

📒 Files selected for processing (1)
  • library/src/schemas/union/union.ts

Comment thread library/src/schemas/union/union.ts Outdated
Restricts the early-return to lastIndex === 0, eliminating pre-computation
for higher indexes which could run transform side effects out of order.
The full loop still updates the cache so index 0 is reached naturally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Mar 24, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
library/src/schemas/union/union.ts (2)

122-124: ⚠️ Potential issue | 🟡 Minor

Restore the empty-array guard on the fast path.

Line 123 now dereferences this.options[0] whenever the cache holds 0. Because union() keeps the caller's options array by reference, shrinking it to [] after a successful parse turns this into undefined['~run'](...) instead of falling back to the normal scan.

🛡️ Minimal guard
-      if (_lastSuccessIndex.get(this) === 0) {
+      if (_lastSuccessIndex.get(this) === 0 && this.options.length > 0) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@library/src/schemas/union/union.ts` around lines 122 - 124, The fast-path
currently dereferences this.options[0] when _lastSuccessIndex.get(this) === 0
which breaks if the caller mutated options to an empty array; restore an
empty-array guard before calling this.options[0]['~run'] (e.g., check
this.options && this.options.length > 0 or similar) and fall back to the normal
scan path when options is empty so union() does not attempt to call '~run' on
undefined; ensure this change touches the block that computes fastDataset and
retains existing behavior when options is non-empty.

122-128: ⚠️ Potential issue | 🔴 Critical

Reuse the warmed option-0 result on fallback.

If Lines 127-128 do not return, Line 134 executes option 0 again. After the cache is warmed, a failing first branch now runs any transforms or user callbacks twice, which is an observable behavior change from the pre-cache implementation.

🔧 Reuse the first dataset instead of re-running branch 0
+      let firstOptionDataset:
+        | ReturnType<(typeof this.options)[number]['~run']>
+        | undefined;
+
       // Fast path: if the first option succeeded last time, try it first.
       // Restricted to index 0 to preserve left-to-right priority semantics and
       // avoid running side-effecting branches (transforms) out of order.
       if (_lastSuccessIndex.get(this) === 0) {
-        const fastDataset = this.options[0]['~run'](
+        firstOptionDataset = this.options[0]['~run'](
           { value: dataset.value },
           config
         );
-        if (fastDataset.typed && !fastDataset.issues) {
-          return fastDataset as SuccessDataset<unknown>;
+        if (firstOptionDataset.typed && !firstOptionDataset.issues) {
+          return firstOptionDataset as SuccessDataset<unknown>;
         }
       }
 
       // Parse schema of each option and collect datasets
       for (let i = 0; i < this.options.length; i++) {
-        const optionDataset = this.options[i]['~run'](
-          { value: dataset.value },
-          config
-        );
+        const optionDataset =
+          i === 0 && firstOptionDataset
+            ? firstOptionDataset
+            : this.options[i]['~run']({ value: dataset.value }, config);

Also applies to: 133-137

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@library/src/schemas/union/union.ts` around lines 122 - 128, The current logic
re-invokes options[0]['~run'] when falling back after warming the cache, causing
branch-0 transforms/callbacks to run twice; modify the fallback path in the
Union implementation to reuse the previously computed fastDataset (the variable
produced when _lastSuccessIndex.get(this) === 0) instead of calling
this.options[0]['~run'] again — i.e., keep and reuse fastDataset
(typed/issue-checked) for the fallback return/handling (also apply the same
reuse pattern to the symmetric case around the other fallback at lines 133-137)
so option 0 is not executed twice and you still return the cached
SuccessDataset<unknown> when appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@library/src/schemas/union/union.ts`:
- Around line 122-124: The fast-path currently dereferences this.options[0] when
_lastSuccessIndex.get(this) === 0 which breaks if the caller mutated options to
an empty array; restore an empty-array guard before calling
this.options[0]['~run'] (e.g., check this.options && this.options.length > 0 or
similar) and fall back to the normal scan path when options is empty so union()
does not attempt to call '~run' on undefined; ensure this change touches the
block that computes fastDataset and retains existing behavior when options is
non-empty.
- Around line 122-128: The current logic re-invokes options[0]['~run'] when
falling back after warming the cache, causing branch-0 transforms/callbacks to
run twice; modify the fallback path in the Union implementation to reuse the
previously computed fastDataset (the variable produced when
_lastSuccessIndex.get(this) === 0) instead of calling this.options[0]['~run']
again — i.e., keep and reuse fastDataset (typed/issue-checked) for the fallback
return/handling (also apply the same reuse pattern to the symmetric case around
the other fallback at lines 133-137) so option 0 is not executed twice and you
still return the cached SuccessDataset<unknown> when appropriate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 374dee6f-4a55-48d1-aed9-5c5edd6769df

📥 Commits

Reviewing files that changed from the base of the PR and between 13c27ac and 89a1da3.

📒 Files selected for processing (1)
  • library/src/schemas/union/union.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request performance Its performance related size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant