
perf: abstract FormatCache as pluggable trait, optimize format runtime #679

Open
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/format-runtime-optimization


@He-Pin He-Pin commented Apr 5, 2026

Motivation

The % format operator in Jsonnet is commonly used in string-heavy workloads (e.g., realistic2). Each format call re-parses the format string, which is wasteful when the same format pattern is used repeatedly.

Key Design Decision

Abstract format string parsing into a FormatCache trait, allowing parsed format specifications to be cached and reused across calls. The default implementation provides a simple LRU-style cache.
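A minimal sketch of what such a trait could look like, assuming the getOrElseUpdate-style API named in the commit message. `ParsedFormat`, `MapFormatCache`, and the toy `parse` function are hypothetical stand-ins for illustration, not the types used in this PR:

```scala
import scala.collection.mutable

// Hypothetical stand-in for a parsed format specification.
final case class ParsedFormat(literals: Vector[String], conversions: Vector[Char])

// Sketch of a pluggable cache: look up by format string, parse only on a miss.
trait FormatCache {
  def getOrElseUpdate(key: String, parse: => ParsedFormat): ParsedFormat
}

// Unbounded map-backed cache, for illustration only (not thread-safe).
final class MapFormatCache extends FormatCache {
  private val entries = mutable.HashMap.empty[String, ParsedFormat]
  def getOrElseUpdate(key: String, parse: => ParsedFormat): ParsedFormat =
    entries.getOrElseUpdate(key, parse)
}

object FormatCacheSketch {
  // Counts how often the parser actually runs.
  var parseCount = 0

  // Toy parser: handles only single-character conversions such as %s and %d.
  def parse(fmt: String): ParsedFormat = {
    parseCount += 1
    val literals = Vector.newBuilder[String]
    val conversions = Vector.newBuilder[Char]
    val current = new StringBuilder
    var i = 0
    while (i < fmt.length) {
      if (fmt.charAt(i) == '%' && i + 1 < fmt.length) {
        literals += current.toString
        current.clear()
        conversions += fmt.charAt(i + 1)
        i += 2
      } else {
        current.append(fmt.charAt(i))
        i += 1
      }
    }
    literals += current.toString
    ParsedFormat(literals.result(), conversions.result())
  }
}
```

Because the parse argument is passed by name, a second lookup of the same format string returns the stored entry without running the parser again.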

Modification

  • Add FormatCache trait with pluggable cache strategy
  • Cache parsed format specifications keyed by format string
  • Optimize format runtime dispatch for common patterns
  • ~317 lines changed

Benchmark Results

JMH (JVM, 3 warmup iterations + 3 measurement iterations)

Benchmark     Master (ms/op)    This PR (ms/op)    Change
bench.02      50.427 ± 38.9     45.984 ± 2.4       -8.8%
comparison2   85.854 ± 188.7    69.734 ± 19.9      -18.8%
realistic2    73.458 ± 66.7     70.857 ± 3.2       -3.5%
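For reference, the Change column appears to be the relative difference of the mean scores, (PR - master) / master. A small sketch that recomputes it from the means in the table; the object and method names are illustrative only:

```scala
object BenchmarkChange {
  // Relative change in percent; negative means the PR is faster than master.
  def percentChange(master: Double, pr: Double): Double =
    (pr - master) / master * 100.0

  // Round to one decimal place, as in the Change column.
  def round1(x: Double): Double = math.round(x * 10.0) / 10.0

  // Mean scores (ms/op) copied from the table above.
  val scores: Seq[(String, Double, Double)] = Seq(
    ("bench.02", 50.427, 45.984),
    ("comparison2", 85.854, 69.734),
    ("realistic2", 73.458, 70.857)
  )

  def main(args: Array[String]): Unit =
    for ((name, master, pr) <- scores)
      println(s"$name: ${round1(percentChange(master, pr))}%")
}
```

This only uses the means; it does not account for the reported error margins.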

Analysis

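The format cache helps most on comparison2 (-18.8%), which has repeated format patterns. The realistic2 improvement is modest (-3.5%) because its format patterns are more varied. With only 3 measurement iterations, the error margins on master (for example ±188.7 ms/op on comparison2) are larger than the measured differences, so these numbers are indicative rather than conclusive. The infrastructure enables further optimizations on format-heavy workloads.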

References

  • Upstream: jit branch experiment

Result

All 5 tests pass. All three benchmarks improve on their mean scores, with no regressions.

Extract format string cache from static field in Format.scala into a
pluggable FormatCache trait (analogous to ParseCache). This allows users
to supply custom cache implementations (e.g., Caffeine-based) via the
Interpreter/Evaluator constructors.

Key changes:
- New FormatCache trait with getOrElseUpdate API
- DefaultFormatCache: LRU LinkedHashMap (256 entries), thread-safe
- FormatCache.SharedDefault singleton preserves process-wide sharing
- FormatCache.EmptyCache for testing
- CompiledFormat sealed trait for type-safe opaque cache entries
- RuntimeFormat: direct Val dispatch, Long fast path, pre-cached specs
- PartialApplyFmt pre-parses at construction time (no cache needed)
- FormatCache threaded through Interpreter → Evaluator constructors

Upstream: he-pin/sjsonnet jit branch (format optimization commits)
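As a rough illustration of the "LRU LinkedHashMap (256 entries), thread-safe" idea, here is a generic sketch built on java.util.LinkedHashMap in access-order mode. The class name `LruCache`, its type parameter, and the coarse `synchronized` locking are assumptions made for the example; the actual DefaultFormatCache in this PR may be structured differently:

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

// Generic bounded LRU cache; V stands in for the compiled format entry type.
final class LruCache[V <: AnyRef](maxEntries: Int = 256) {
  // accessOrder = true keeps iteration order from least to most recently used.
  private val entries = new JLinkedHashMap[String, V](16, 0.75f, true) {
    // Called by put(); returning true evicts the least recently used entry.
    override def removeEldestEntry(eldest: JMap.Entry[String, V]): Boolean =
      this.size() > maxEntries
  }

  // Coarse-grained locking: lookups, inserts and evictions are serialized.
  def getOrElseUpdate(key: String, compute: => V): V = synchronized {
    val cached = entries.get(key)
    if (cached != null) cached
    else {
      val fresh = compute
      entries.put(key, fresh)
      fresh
    }
  }

  def entryCount: Int = synchronized(entries.size())
}
```

With `maxEntries = 2`, inserting "a" and "b", reading "a" again, and then inserting "c" evicts "b", since "a" was used more recently. A single lock is the simplest way to get thread safety here; a library such as Caffeine, which the commit message mentions as a possible custom implementation, would be the usual choice when contention matters.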
@He-Pin He-Pin force-pushed the perf/format-runtime-optimization branch from 9ffd530 to c0bc815 April 5, 2026 11:06
@He-Pin He-Pin changed the title from "perf: optimize Format runtime with cache, direct Val dispatch, and Long fast path" to "perf: abstract FormatCache as pluggable trait, optimize format runtime" Apr 5, 2026
@He-Pin He-Pin marked this pull request as ready for review April 5, 2026 11:08