Skip to content

Phase 3: Make script rewriters fragment-safe for streaming#591

Open
aram356 wants to merge 8 commits intofeature/streaming-pipeline-phase2from
feature/streaming-pipeline-phase3
Open

Phase 3: Make script rewriters fragment-safe for streaming#591
aram356 wants to merge 8 commits intofeature/streaming-pipeline-phase2from
feature/streaming-pipeline-phase3

Conversation

@aram356
Copy link
Copy Markdown
Collaborator

@aram356 aram356 commented Mar 26, 2026

Summary

Make all script rewriters fragment-safe so streaming works even with GTM and NextJS active. This removes the buffered fallback introduced in Phase 1, enabling full streaming for all configurations. Also adds the 2xx streaming gate, publisher-level tests, and small-chunk pipeline regression tests.

Closes #586, closes #587, closes #588, closes #589, closes #590.
Part of epic #563. Depends on Phase 2 (#585).

Performance results (staging vs production, median over 5 runs, Chrome 1440x900)

Metric Production (v135, buffered) Staging (v136, streaming) Delta
TTFB 54 ms 35 ms -19 ms (-35%)
First Paint 186 ms 160 ms -26 ms (-14%)
First Contentful Paint 186 ms 160 ms -26 ms (-14%)
DOM Content Loaded 286 ms 282 ms -4 ms (~same)
DOM Complete 1060 ms 663 ms -397 ms (-37%)

Production (v135) buffers the entire response body before sending any bytes to the client. Staging (v136) streams processed chunks incrementally via StreamingBody. The 35% TTFB improvement cascades into earlier paint metrics, and DOM Complete sees the largest absolute gain (-397ms) because the browser can parse/render while still receiving the body.

Metric definitions
  • TTFB (Time to First Byte) — Time from the browser's navigation request until the first byte of the HTML response arrives. Measures server responsiveness.
  • First Paint (FP) — When the browser renders the first pixel to the screen (background color, border, etc.). Indicates the page is starting to load visually.
  • First Contentful Paint (FCP) — When the browser renders the first piece of actual content (text, image, canvas). Core Web Vital.
  • DOM Content Loaded — When the HTML has been fully parsed and all deferred scripts have executed.
  • DOM Complete — When the page and all sub-resources (images, stylesheets, iframes) have finished loading.

Problem

lol_html fragments text nodes across input chunk boundaries. When the streaming HtmlRewriterAdapter feeds chunks incrementally, a text node like "googletagmanager.com/gtm.js" can be split into "google" and "tagmanager.com/gtm.js" — neither fragment matches the full domain string, so the rewrite silently fails.

Phase 1 worked around this with a buffered adapter mode. Phase 3 fixes the root cause.

Solution

Each script rewriter now accumulates text fragments via Mutex<String> until last_in_text_node() is true, then processes the complete text:

  • Intermediate fragments: return RemoveNode (suppress output, accumulate)
  • Last fragment: process full accumulated text, return Replace(rewritten) or Keep
  • Unfragmented text (common case): no accumulation overhead, processes directly

What changed

File Lines What
script_rewriter.rs +106 -3 NextJsNextDataRewriter accumulates fragments
google_tag_manager.rs +143 -5 GoogleTagManagerIntegration accumulates fragments
streaming_processor.rs -97 Remove new_buffered(), buffered flag, accumulated_input, buffered test
html_processor.rs -7 Remove has_script_rewriters check, always use streaming adapter
publisher.rs +211 -5 2xx streaming gate, streaming gate tests, stream_publisher_body gzip test
nextjs/mod.rs +50 Small-chunk (32 byte) __NEXT_DATA__ pipeline regression test
google_tag_manager.rs +45 Small-chunk (32 byte) GTM pipeline regression test

Tests added

  • fragmented_next_data_is_accumulated_and_rewritten — splits __NEXT_DATA__ mid-URL
  • unfragmented_next_data_works_without_accumulation — fast path still works
  • fragmented_next_data_without_rewritable_urls_preserves_content — Keep-after-accumulation bug
  • fragmented_gtm_snippet_is_accumulated_and_rewritten — splits GTM domain mid-string
  • non_gtm_fragmented_script_is_passed_through — non-GTM scripts emitted unchanged
  • small_chunk_next_data_rewrite_survives_fragmentation — 32-byte chunks through full HTML pipeline
  • small_chunk_gtm_rewrite_survives_fragmentation — 32-byte chunks through full HTML pipeline
  • streaming_gate_allows_2xx_html_without_post_processors — gate unit test
  • streaming_gate_blocks_non_2xx_responses — 4xx/5xx stays buffered
  • streaming_gate_blocks_html_with_post_processors — post-processors force buffering
  • streaming_gate_allows_non_html_with_post_processors — non-HTML streams regardless
  • streaming_gate_blocks_non_2xx_json — error JSON stays buffered
  • stream_publisher_body_preserves_gzip_round_trip — public API gzip test

Verification

  • cargo test --workspace — 766 passed, 0 failed
  • cargo clippy --workspace --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean
  • cargo build --release --target wasm32-wasip1 — success

Test plan

  • Fragmented NextJS __NEXT_DATA__ test passes
  • Fragmented GTM inline script test passes
  • Small-chunk (32 byte) pipeline tests pass for both integrations
  • All existing NextJS tests pass (57 tests)
  • All existing GTM tests pass (32 tests)
  • 2xx streaming gate unit tests pass
  • Gzip round-trip publisher test passes
  • Full workspace tests pass (766)
  • WASM build succeeds
  • Staging performance verified (35% TTFB improvement)

Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Phase 3 completes the streaming pipeline by making NextJsNextDataRewriter and GoogleTagManagerIntegration fragment-safe via Mutex<String> accumulation, removing the buffered adapter fallback from Phase 1, and introducing a 2xx streaming gate in publisher.rs. The implementation is correct, WASM-safe (Mutex is a no-op on single-threaded wasm32-wasip1), and the Keep-after-accumulation bug fix has a precisely targeted regression test. No blocking issues.

Non-blocking

🤔 thinking

  • Streaming gate tests verify the formula, not the function — see inline comment at publisher.rs

♻️ refactor

  • GTM uninit-variable patternlet full_content; let text = if... (google_tag_manager.rs:503)
  • GTM indirect accumulation checktext.len() != content.len() (google_tag_manager.rs:522)

📝 note

  • accumulated_text on the shared Arc — google_tag_manager.rs:138

👍 praise

  • Keep-after-accumulation fix + targeted regression test — script_rewriter.rs:89
  • Performance results in the PR description — 35% TTFB / 37% DOM Complete improvements backed by a named methodology, metric definitions, and causal explanation. Well above the usual PR description standard.

CI Status

  • browser integration tests: PASS
  • integration tests: PASS
  • prepare integration artifacts: PASS

Copy link
Copy Markdown
Collaborator

@ChristianPavilonis ChristianPavilonis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This is a well-executed PR. No blocking issues found. The P2 items below are genuine improvements but none are required for merge. CI is all green (3/3 checks passing).

The PR cleanly completes Phase 3 of the streaming pipeline — making NextJsNextDataRewriter and GoogleTagManagerIntegration fragment-safe for lol_html text node splitting, removing the buffered adapter fallback, and adding a 2xx streaming gate.

Highlights

  • Keep-after-accumulation bug fix is precisely handled with a targeted regression test
  • Small-chunk (32-byte) pipeline regression tests catch fragmentation bugs end-to-end
  • Clean surgical removal of buffered mode
  • 2xx streaming gate is a smart defensive measure
  • Excellent engineering documentation of performance methodology and results

P2 — rsc_placeholders.rs fragmentation strategy undocumented

🤔 The RSC placeholder rewriter (crates/trusted-server-core/src/integrations/nextjs/rsc_placeholders.rs around lines 57-64) skips fragmented scripts entirely (returns Keep for intermediate fragments), deferring to the post-processor. This is a valid approach since it has a post-processor fallback. However, it's worth documenting this as a deliberate design choice vs. the accumulation pattern used by the GTM and NextData rewriters in this PR. A brief comment there would help future maintainers understand why this rewriter doesn't use the same accumulation approach.

P3 — Minor

  • Spec still references Phase 1 workaround: docs/superpowers/specs/2026-03-25-streaming-response-design.md around lines 230-231 still references the Phase 1 workaround in the "Files Changed" section — should be updated to reflect that all script rewriters are now fragment-safe.
  • Performance results duplicated in 3 places: PR description, plan doc, and spec doc all contain the same performance numbers. Minor maintenance concern if numbers are updated later.

aram356 added a commit that referenced this pull request Mar 31, 2026
- Extract streaming gate into can_stream_response() function so tests
  call production code instead of reimplementing the formula
- Refactor GTM rewrite() to use Option<String> pattern instead of
  uninit variable, replacing indirect text.len() != content.len()
  accumulation check with explicit full_content.is_some()
- Add cross-element safety doc comment on accumulated_text fields
  in GTM and NextJsNextDataRewriter
- Document RSC placeholder deliberate non-accumulation strategy
- Update spec to reflect script rewriters are now fragment-safe
@aram356
Copy link
Copy Markdown
Collaborator Author

aram356 commented Mar 31, 2026

Addressing review feedback:

Fixed in this push:

  • Extracted can_stream_response() function — tests now call production code instead of reimplementing the formula
  • GTM rewrite() refactored to Option<String> pattern, replacing text.len() != content.len() with full_content.is_some()
  • Added cross-element safety invariant doc comment on accumulated_text in both GTM and NextJsNextDataRewriter
  • Documented RSC placeholder deliberate non-accumulation strategy in rsc_placeholders.rs
  • Updated spec to reflect script rewriters are now fragment-safe (removed stale "currently require buffered mode" text)

Acknowledged (not changed):

  • Performance results in 3 places — intentional for discoverability across spec/plan/PR, accepted minor maintenance cost

Copy link
Copy Markdown
Collaborator

@ChristianPavilonis ChristianPavilonis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Phase 3 — Fragment-Safe Streaming

Good work on the fragment accumulation design and streaming gate. The RemoveNode / Replace(full) pattern is clean, the streaming gate has solid test coverage, and the performance results are impressive (35% TTFB improvement).

Key concerns:

  1. SSRF protection removed — The allowed_domains proxy allowlist and all redirect chain validation were removed. If intentional, the stale docs need updating. If not, this is a security regression.
  2. Accept-Encoding not restricted — Removing restrict_accept_encoding can cause silently corrupted output for unsupported encodings (e.g., zstd).
  3. Synthetic ID validation relaxed — No length bound on the character-only check means unbounded values accepted for cookies and KV store keys.
  4. Fragment accumulation duplication — The pattern is identical in GTM and NextJS rewriters; a shared helper would reduce maintenance risk.

Several doc comments were also lost when # Errors sections were removed — the summary lines should be restored.

See inline comments for details on each finding.


Findings not attached inline (files/lines outside this diff):

🔧 Synthetic ID validation relaxed — no length bound (crates/trusted-server-core/src/synthetic.rs:117)

is_valid_synthetic_id() (enforcing <64-hex>.<6-alphanum>, exactly 71 bytes) was removed. Now any string of [a-zA-Z0-9._-] of any length is accepted as a synthetic ID via validate_existing_synthetic_id. This means arbitrarily long values can be stored in cookies, used as KV store keys for deletion (delete_consent_from_kv), and logged. At minimum, add a length upper-bound check (e.g., synthetic_id.len() <= 256) to prevent abuse.

Missing doc on is_last_in_text_node field (crates/trusted-server-core/src/integrations/registry.rs:94)

This field drives the entire Phase 3 fragment accumulation protocol. Add a doc comment explaining that lol_html may deliver text in multiple fragments and rewriters needing complete text must accumulate until this is true.

aram356 added 8 commits April 8, 2026 18:19
Accumulate text fragments via Mutex<String> until
last_in_text_node is true, then process the complete text.
Intermediate fragments return RemoveNode to suppress output.
Accumulate text fragments via Mutex<String> until
last_in_text_node is true, then match and rewrite on the complete
text. Non-GTM scripts that were fragmented are emitted unchanged.
All script rewriters (NextJS __NEXT_DATA__, GTM) are now
fragment-safe — they accumulate text internally until
last_in_text_node. The buffered adapter workaround is no longer
needed. Always use streaming mode in create_html_processor.
When rewrite_structured returns Keep on accumulated content,
intermediate fragments were already removed via RemoveNode. Emit
the full accumulated content via Replace to prevent silent data
loss. Also updates spec to reflect Phase 3 completion.
- Add response.get_status().is_success() check to streaming gate so
  4xx/5xx error pages stay buffered with complete status codes
- Add streaming gate unit tests covering all gate conditions
- Add stream_publisher_body gzip round-trip test
- Add small-chunk (32 byte) pipeline tests for __NEXT_DATA__ and GTM
  that prove fragmented text nodes survive the real lol_html path
Phase 3 performance results: 35% TTFB improvement, 37% DOM Complete
improvement on getpurpose.ai staging vs production. Phase 4 adds
binary pass-through streaming via PublisherResponse::PassThrough.
- Extract streaming gate into can_stream_response() function so tests
  call production code instead of reimplementing the formula
- Refactor GTM rewrite() to use Option<String> pattern instead of
  uninit variable, replacing indirect text.len() != content.len()
  accumulation check with explicit full_content.is_some()
- Add cross-element safety doc comment on accumulated_text fields
  in GTM and NextJsNextDataRewriter
- Document RSC placeholder deliberate non-accumulation strategy
- Update spec to reflect script rewriters are now fragment-safe
- Document why Mutex<String> is used (Sync bound on trait, not
  concurrent access) in both NextJsNextDataRewriter and
  GoogleTagManagerIntegration
- Add accumulation_buffer_drains_between_consecutive_script_elements
  test proving the buffer doesn't leak between two sequential
  <script> elements (fragmented GTM followed by fragmented non-GTM)
@aram356 aram356 force-pushed the feature/streaming-pipeline-phase3 branch from db597e6 to ff05483 Compare April 9, 2026 01:41
Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment