perf: bulk text block scanner bypasses fastparse per-line overhead (#689)
## Motivation
Text blocks (`|||` syntax) are parsed line by line through fastparse,
which incurs combinator overhead on every line. Programs with large
text blocks (templates, embedded configs) pay this cost unnecessarily.
## Key Design Decision
Implement a bulk scanner that directly scans for the text block
terminator (`|||`) using a simple character loop, bypassing the
fastparse per-line combinator overhead entirely. The scanner processes
the entire text block in a single pass.
## Modification
- Add bulk text block scanning in the parser
- Directly scan for `|||` terminator without per-line fastparse dispatch
- Preserve exact text block semantics (whitespace stripping,
indentation)
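
As an illustration only, a bulk scan of this kind might look like the sketch below. It is not the code from this PR: `TextBlockScan` and `scan` are hypothetical names, and it assumes `start` points just past the newline that follows the opening `|||`. It also simplifies the semantics: only `\n` line endings are handled, and a whitespace-only line shorter than the indentation is rejected.

```scala
object TextBlockScan {
  /** Hypothetical helper, not the actual implementation.
    * `start` must point at the first character of the first content line.
    * Returns the de-indented content and the index just past the closing `|||`,
    * or None if the block is malformed or unterminated.
    */
  def scan(src: String, start: Int): Option[(String, Int)] = {
    // The leading whitespace of the first line is the prefix stripped from every line.
    var i = start
    while (i < src.length && (src.charAt(i) == ' ' || src.charAt(i) == '\t')) i += 1
    val indent = src.substring(start, i)
    if (indent.isEmpty) return None // a text block must be indented

    val out = new java.lang.StringBuilder
    var lineStart = start
    while (lineStart < src.length) {
      if (src.startsWith(indent, lineStart)) {
        // Content line: copy everything after the indent, including the newline.
        val nl = src.indexOf('\n', lineStart)
        if (nl < 0) return None // input ended before a terminator was seen
        out.append(src, lineStart + indent.length, nl + 1)
        lineStart = nl + 1
      } else if (src.charAt(lineStart) == '\n') {
        // Empty lines are allowed inside the block and are kept as-is.
        out.append('\n')
        lineStart += 1
      } else {
        // First line without the indent: it must be optional whitespace then `|||`.
        var j = lineStart
        while (j < src.length && (src.charAt(j) == ' ' || src.charAt(j) == '\t')) j += 1
        return if (src.startsWith("|||", j)) Some((out.toString, j + 3)) else None
      }
    }
    None
  }
}
```

The loop touches each character at most a small constant number of times and performs no parser dispatch per line, which is the property the design relies on.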
## Benchmark Results
### JMH (JVM, 3 warmup + 3 measurement iterations)
| Benchmark | Master (ms/op) | This PR (ms/op) | Change |
|-----------|---------------|-----------------|--------|
| bench.02 | 50.427 ± 38.9 | 45.838 ± 6.9 | **-9.1%** |
| comparison2 | 85.854 ± 188.7 | 70.746 ± 12.3 | **-17.6%** |
| realistic2 | 73.458 ± 66.7 | 69.255 ± 4.0 | **-5.7%** |
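
For reference, the Change column is the relative difference of the mean scores. A minimal sketch of that arithmetic follows; `RelativeChange` and `percentChange` are hypothetical names, and only the means are used, so the error intervals play no part in the result.

```scala
object RelativeChange {
  /** Relative change of `candidate` against `baseline`, in percent.
    * Negative values mean the candidate is faster (lower ms/op).
    */
  def percentChange(baseline: Double, candidate: Double): Double =
    (candidate - baseline) / baseline * 100.0

  def main(args: Array[String]): Unit = {
    // (benchmark, master ms/op, this PR ms/op), copied from the table above.
    val rows = Seq(
      ("bench.02", 50.427, 45.838),
      ("comparison2", 85.854, 70.746),
      ("realistic2", 73.458, 69.255)
    )
    for ((name, master, pr) <- rows)
      println(f"$name%-12s ${percentChange(master, pr)}%+.1f%%")
  }
}
```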
## Analysis
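The improvement is modest, and the mean is lower in all three
benchmarks. The error intervals on master are wide, however (up to
± 188.7 ms/op with only 3 measurement iterations), so the exact
percentages should be read with caution. The benefit should be larger
for programs with many or large text blocks. Since parsing is typically
a small fraction of total eval time, a change in the -5.7% to -17.6%
range is expected.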
## References
- Upstream: jit branch experiment
## Result
All 46 tests pass. All three benchmarks improve; no regressions.