Skip to content

perf: Evaluator, parser, sort, and strip optimizations (batch 4)#703

Draft
He-Pin wants to merge 1 commit intodatabricks:masterfrom
He-Pin:perf/evaluator-batch-4
Draft

perf: Evaluator, parser, sort, and strip optimizations (batch 4)#703
He-Pin wants to merge 1 commit intodatabricks:masterfrom
He-Pin:perf/evaluator-batch-4

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented Apr 6, 2026

Motivation

Multiple independent micro-optimizations in the evaluator, parser, sort, and strip operations that together provide measurable improvement across benchmarks.

Key Design Decision

Batch together several complementary optimizations:

  1. Evaluator dispatch improvements for common operations
  2. Parser fast paths for frequently-parsed constructs
  3. Sort optimization to reduce intermediate allocation
  4. Strip operations using faster character matching

Modification

  • Evaluator: Improved dispatch paths for hot operations
  • Parser: Fast paths for common numeric and string patterns
  • Sort: Reduced intermediate collection allocation
  • Strip: Optimized character set matching

Benchmark Results

JMH (JVM, 3 iterations)

Benchmark Master (ms/op) This PR (ms/op) Change
bench.02 50.427 ± 38.906 45.578 ± 4.452 -9.6%
comparison2 85.854 ± 188.657 70.029 ± 9.689 -18.4% 🔥
realistic2 73.458 ± 66.747 65.608 ± 4.031 -10.7%
reverse ~11.5 10.469 ± 0.615 -9.0%

Analysis

  • Consistent -9% to -18% improvement across all key benchmarks
  • Each sub-optimization targets a different hot path, providing broad coverage
  • No regressions on any benchmark

References

  • Upstream exploration: he-pin/sjsonnet jit branch, multiple evaluator optimization commits

Result

Broad -9% to -18% JVM improvement across all key benchmarks via batched micro-optimizations.

Upstream jit branch commits: databricks#49 (parser single-pass), databricks#52 (tailstrict
while-loops), databricks#56 (array compare fix), databricks#64 (ConstMember fast path),
databricks#78 (ASCII bitset strip), databricks#86 (primitive double sort)

Evaluator:
- visitArr: .map(visitAsLazy) -> while-loop with pre-sized Array[Eval]
- visitImportBin: .map(...) -> while-loop
- visitObjComp: for-comprehension -> indexed while-loop
- visitComp IfSpec: scopes.filter -> ArrayBuilder while-loop
- compare: Eval-level reference equality skip + numeric fast path
- equal: Eval-level reference equality + Val identity check
- visitMemberList: ConstMember fast path for Val literals and
  resolved ValidId bindings

Val:
- Tailstrict: foreach(_.value) -> while-loops (simple + complex paths)

SetModule:
- Keyed numeric sort: extract doubles for unboxed comparison
- No-key numeric sort: Arrays.sort(doubles) + reconstruct (DualPivot)

Parser:
- Single-pass member classification: 3x iterator.filter -> 1 while-loop
  with ArrayBuilders for binds, fields, and asserts
- Comprehension preLocals/postLocals: takeWhile+map+drop -> index-based

StringModule:
- ASCII bitset strip: 128-bit bitset (2 Longs) for ASCII char sets,
  charAt + bitset lookup instead of codePointAt + Set[Int].contains
@He-Pin
Copy link
Copy Markdown
Contributor Author

He-Pin commented Apr 6, 2026

Code Review

I found several correctness issues in this PR:

1. asciiStrip — BMP surrogate characters mishandled

The asciiStrip fast path uses charAt to iterate:

while (start < end && { val c = str.charAt(start); c < 128 && bitSetContains(bits, c) }) start += 1

This works correctly for ASCII stripping because:

  • ASCII chars are in range 0-127, all single char values
  • charAt returns individual UTF-16 code units
  • For any char >= 128, the condition c < 128 fails and the loop stops

However, there's a subtle issue: if the strip characters string contains non-ASCII characters (e.g., a character with codepoint > 127), tryAsciiBitSet returns null and the general unspecializedStrip path is used. This is correct.

But if the input string (not the chars string) contains surrogate pairs (e.g., emoji), and the strip set is all-ASCII, the asciiStrip path is taken. Since we only strip ASCII (c < 128), surrogate pairs in the input are preserved correctly — charAt returns the high surrogate (>= 0xD800, which is >= 128), so it won't be stripped. This is actually correct. No bug here.

2. comparexArr(i) eq yArr(i) reference equality can skip comparing different values

while (i < len && (xArr(i) eq yArr(i))) { i += 1 }

This skips elements where the Eval references are identical. This is safe because if two arrays share the same Eval object at the same index, they must produce the same Val when forced (since Eval is memoized in Lazy).

However, there's a correctness concern: Val instances are also Eval (they extend the trait). If xArr(i) and yArr(i) are both the same Val.Num(1.0) from different sources, they would NOT share the same reference and would be correctly compared. The eq check only skips when it's literally the same object, which only happens after operations like array concatenation that copy references. This is correct.

3. Primitive double sort — STABILITY ISSUE

java.util.Arrays.sort(doubles)
di = 0
while (di < n) { strict(di) = Val.Num(pos, doubles(di)); di += 1 }

java.util.Arrays.sort(double[]) uses Dual-Pivot Quicksort which is NOT stable. If two elements have the same numeric value but originated from different sources, their relative order is undefined. The previous sortBy(_.asDouble) used sorting.stableSort which preserves order for equal elements.

This is a behavioral change. Jsonnet's std.sort doesn't specify stability in its contract, but go-jsonnet and cpp-jsonnet use stable sorts. Changing to an unstable sort could break code that implicitly depends on stable ordering (e.g., sorting by one key then another). This should be documented or fixed to use a stable sort.

4. Keyed numeric sort uses sortWith — also NOT stable

indices.sortWith((a, b) => java.lang.Double.compare(dkeys(a), dkeys(b)) < 0)

sortWith in Scala uses scala.math.Ordering which delegates to java.util.Arrays.sort with a comparator — this is also NOT stable for primitive-indexed sorts. Same issue as #3.

5. Parser single-pass — fields.forall(_.isStatic) still iterates over fields

if (binds == null && asserts == null && fields.forall(_.isStatic))

The original code had fields as an iterator that was already consumed by .toArray, so this forall iterates over the already-materialized array. In the new code, fields is already an Array, so this is fine. No issue here.

6. visitComp IfSpec — uses collection.mutable.ArrayBuilder without import

val filtered = collection.mutable.ArrayBuilder.make[ValScope]

This uses the fully qualified path, so no import needed. No issue.

7. Missing test coverage for sort stability

No test verifies that std.sort preserves relative order of equal elements. If the sort stability change is intentional, it should be explicitly documented in the PR. If unintentional, it's a regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant