Phase 3: Make script rewriters fragment-safe for streaming by aram356 · Pull Request #591 · IABTechLab/trusted-server

aram356 · 2026-03-26T22:32:19Z

Summary

Make all script rewriters fragment-safe so streaming works even with GTM and NextJS active. This removes the buffered fallback introduced in Phase 1, enabling full streaming for all configurations. Also adds the 2xx streaming gate, publisher-level tests, and small-chunk pipeline regression tests.

Closes #586, closes #587, closes #588, closes #589, closes #590.
Part of epic #563. Depends on Phase 2 (#585).

Performance results (staging vs production, median over 5 runs, Chrome 1440x900)

| Metric | Production (v135, buffered) | Staging (v136, streaming) | Delta |
| --- | --- | --- | --- |
| TTFB | 54 ms | 35 ms | -19 ms (-35%) |
| First Paint | 186 ms | 160 ms | -26 ms (-14%) |
| First Contentful Paint | 186 ms | 160 ms | -26 ms (-14%) |
| DOM Content Loaded | 286 ms | 282 ms | -4 ms (~same) |
| DOM Complete | 1060 ms | 663 ms | -397 ms (-37%) |

Production (v135) buffers the entire response body before sending any bytes to the client. Staging (v136) streams processed chunks incrementally via StreamingBody. The 35% TTFB improvement cascades into earlier paint metrics, and DOM Complete sees the largest absolute gain (-397ms) because the browser can parse/render while still receiving the body.
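The Delta column can be reproduced from the two medians in each row. The following is a minimal sketch of that arithmetic; the helper names `median_ms` and `delta` are illustrative and do not come from the PR, and the per-run samples behind the medians are not included here.

```rust
/// Median of a non-empty sample of timings in milliseconds.
fn median_ms(samples: &[f64]) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        // Even-sized sample: average the two middle values.
        (sorted[mid - 1] + sorted[mid]) / 2.0
    }
}

/// Absolute change (ms) and relative change (%) from `baseline` to `candidate`.
fn delta(baseline: f64, candidate: f64) -> (f64, f64) {
    let abs = candidate - baseline;
    (abs, abs / baseline * 100.0)
}
```

For the TTFB row, `delta(54.0, 35.0)` gives -19 ms and about -35.2%, which the table rounds to -35%.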

Metric definitions

TTFB (Time to First Byte): Time from the browser's navigation request until the first byte of the HTML response arrives. Measures server responsiveness.
First Paint (FP): When the browser renders the first pixel to the screen (background color, border, etc.). Indicates the page is starting to load visually.
First Contentful Paint (FCP): When the browser renders the first piece of actual content (text, image, canvas). A Web Vitals metric, though not one of the Core Web Vitals (LCP, INP, CLS).
DOM Content Loaded: When the HTML has been fully parsed and all deferred scripts have executed.
DOM Complete: When the page and all sub-resources (images, stylesheets, iframes) have finished loading.

Problem

lol_html fragments text nodes across input chunk boundaries. When the streaming HtmlRewriterAdapter feeds chunks incrementally, a text node like "googletagmanager.com/gtm.js" can be split into "google" and "tagmanager.com/gtm.js" — neither fragment matches the full domain string, so the rewrite silently fails.

Phase 1 worked around this with a buffered adapter mode. Phase 3 fixes the root cause.
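The failure mode can be reproduced without lol_html. The sketch below is a simplified stand-in, not the project's code: `rewrite_fragment` and `rewrite_per_fragment` are hypothetical helpers that apply a plain string replacement to each fragment independently, the way a rewriter without accumulation would.

```rust
/// Hypothetical per-fragment rewriter: it can only replace the GTM domain
/// when the fragment it receives contains the whole domain string.
fn rewrite_fragment(fragment: &str, first_party_host: &str) -> String {
    fragment.replace("googletagmanager.com", first_party_host)
}

/// Rewrites each fragment in isolation and concatenates the results,
/// mimicking a streaming pipeline with no fragment accumulation.
fn rewrite_per_fragment(fragments: &[&str], first_party_host: &str) -> String {
    fragments
        .iter()
        .map(|fragment| rewrite_fragment(fragment, first_party_host))
        .collect()
}
```

With a single fragment `"https://www.googletagmanager.com/gtm.js"` the domain is replaced. With the same text split into `"https://www.google"` and `"tagmanager.com/gtm.js"`, neither piece matches and the original URL comes out unchanged, with no error to signal the miss.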

Solution

Each script rewriter now accumulates text fragments via Mutex<String> until last_in_text_node() is true, then processes the complete text:

Intermediate fragments: return RemoveNode (suppress output, accumulate)
Last fragment: process full accumulated text, return Replace(rewritten) or Keep
Unfragmented text (common case): no accumulation overhead, processes directly
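The three cases above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `ScriptRewriteAction`, `AccumulatingRewriter`, and `rewrite_complete_text` are stand-ins defined here, and the replacement host `first-party.example` is made up. The last branch also reflects the Keep-after-accumulation case: once earlier fragments have been removed, an unchanged result has to be re-emitted in full.

```rust
use std::sync::Mutex;

/// Stand-in for the rewriter action type; variant names follow the PR text,
/// but this definition is an assumption.
#[derive(Debug, PartialEq)]
enum ScriptRewriteAction {
    Keep,
    Replace(String),
    RemoveNode,
}

/// Hypothetical fragment-safe rewriter.
struct AccumulatingRewriter {
    accumulated_text: Mutex<String>,
}

/// Illustrative rewrite rule: returns `Some` only when the text changes.
fn rewrite_complete_text(text: &str) -> Option<String> {
    const NEEDLE: &str = "googletagmanager.com";
    text.contains(NEEDLE)
        .then(|| text.replace(NEEDLE, "first-party.example"))
}

impl AccumulatingRewriter {
    fn new() -> Self {
        Self { accumulated_text: Mutex::new(String::new()) }
    }

    fn rewrite(&self, content: &str, is_last_in_text_node: bool) -> ScriptRewriteAction {
        let mut buffer = self
            .accumulated_text
            .lock()
            .expect("accumulation lock poisoned");

        // Intermediate fragment: stash it and suppress its output.
        if !is_last_in_text_node {
            buffer.push_str(content);
            return ScriptRewriteAction::RemoveNode;
        }

        // Unfragmented text: nothing was buffered, so process directly.
        if buffer.is_empty() {
            return match rewrite_complete_text(content) {
                Some(rewritten) => ScriptRewriteAction::Replace(rewritten),
                None => ScriptRewriteAction::Keep,
            };
        }

        // Last fragment: drain the buffer and process the complete text.
        buffer.push_str(content);
        let full = std::mem::take(&mut *buffer);
        match rewrite_complete_text(&full) {
            Some(rewritten) => ScriptRewriteAction::Replace(rewritten),
            // Earlier fragments were already removed, so returning Keep here
            // would emit only the final fragment. Re-emit the full text.
            None => ScriptRewriteAction::Replace(full),
        }
    }
}
```

Because `std::mem::take` leaves an empty string behind, the buffer is clean when the next text node starts.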

What changed

| File | Lines | What |
| --- | --- | --- |
| `script_rewriter.rs` | +106 -3 | `NextJsNextDataRewriter` accumulates fragments |
| `google_tag_manager.rs` | +143 -5 | `GoogleTagManagerIntegration` accumulates fragments |
| `streaming_processor.rs` | -97 | Remove `new_buffered()`, `buffered` flag, `accumulated_input`, buffered test |
| `html_processor.rs` | -7 | Remove `has_script_rewriters` check, always use streaming adapter |
| `publisher.rs` | +211 -5 | 2xx streaming gate, streaming gate tests, `stream_publisher_body` gzip test |
| `nextjs/mod.rs` | +50 | Small-chunk (32 byte) `__NEXT_DATA__` pipeline regression test |
| `google_tag_manager.rs` | +45 | Small-chunk (32 byte) GTM pipeline regression test |

Tests added

fragmented_next_data_is_accumulated_and_rewritten — splits __NEXT_DATA__ mid-URL
unfragmented_next_data_works_without_accumulation — fast path still works
fragmented_next_data_without_rewritable_urls_preserves_content — Keep-after-accumulation bug
fragmented_gtm_snippet_is_accumulated_and_rewritten — splits GTM domain mid-string
non_gtm_fragmented_script_is_passed_through — non-GTM scripts emitted unchanged
small_chunk_next_data_rewrite_survives_fragmentation — 32-byte chunks through full HTML pipeline
small_chunk_gtm_rewrite_survives_fragmentation — 32-byte chunks through full HTML pipeline
streaming_gate_allows_2xx_html_without_post_processors — gate unit test
streaming_gate_blocks_non_2xx_responses — 4xx/5xx stays buffered
streaming_gate_blocks_html_with_post_processors — post-processors force buffering
streaming_gate_allows_non_html_with_post_processors — non-HTML streams regardless
streaming_gate_blocks_non_2xx_json — error JSON stays buffered
stream_publisher_body_preserves_gzip_round_trip — public API gzip test
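Read together, the five `streaming_gate_*` tests describe a small truth table. The sketch below is one predicate that satisfies all five. The parameter list is an assumption made for illustration; the actual gate in `publisher.rs` may take a response object and derive these values itself.

```rust
/// Hypothetical streaming gate. The three inputs are assumptions:
/// - `status`: HTTP status code of the origin response
/// - `is_html`: whether the response body is HTML
/// - `has_post_processors`: whether any HTML post-processor is registered
fn can_stream_response(status: u16, is_html: bool, has_post_processors: bool) -> bool {
    // Non-2xx responses (including error JSON) stay buffered.
    let is_success = (200..300).contains(&status);
    // HTML with post-processors needs the complete document, so it is
    // buffered; non-HTML bodies stream regardless of post-processors.
    is_success && (!is_html || !has_post_processors)
}
```

Keeping the decision in one pure function means the tests can call it directly rather than restating the condition.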

Verification

cargo test --workspace — 766 passed, 0 failed
cargo clippy --workspace --all-targets --all-features -- -D warnings — clean
cargo fmt --all -- --check — clean
cargo build --release --target wasm32-wasip1 — success

Test plan

prk-Jr

Summary

Phase 3 completes the streaming pipeline by making NextJsNextDataRewriter and GoogleTagManagerIntegration fragment-safe via Mutex<String> accumulation, removing the buffered adapter fallback from Phase 1, and introducing a 2xx streaming gate in publisher.rs. The implementation is correct, WASM-safe (Mutex is a no-op on single-threaded wasm32-wasip1), and the Keep-after-accumulation bug fix has a precisely targeted regression test. No blocking issues.

Non-blocking

🤔 thinking

Streaming gate tests verify the formula, not the function — see inline comment at publisher.rs

♻️ refactor

GTM uninit-variable pattern — let full_content; let text = if... (google_tag_manager.rs:503)
GTM indirect accumulation check — text.len() != content.len() (google_tag_manager.rs:522)
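For readers unfamiliar with the two refactor notes, here is a hedged sketch of the alternative they point toward. The function `take_complete_text` is hypothetical and only illustrates the idea: hold the accumulated text in an `Option<String>` so that "was this accumulated?" becomes an explicit `is_some()` check instead of a deferred `let full_content;` binding plus a length comparison.

```rust
use std::borrow::Cow;

/// Hypothetical helper: returns the complete text for the final fragment and
/// whether it was assembled from earlier buffered fragments.
fn take_complete_text<'a>(buffer: &mut String, content: &'a str) -> (Cow<'a, str>, bool) {
    // `None` means the text node arrived in one piece.
    let full_content: Option<String> = if buffer.is_empty() {
        None
    } else {
        buffer.push_str(content);
        // Drain the buffer so it is empty for the next text node.
        Some(std::mem::take(buffer))
    };

    // Explicit check, replacing the indirect `text.len() != content.len()`.
    let was_accumulated = full_content.is_some();

    let text = match full_content {
        Some(full) => Cow::Owned(full),
        None => Cow::Borrowed(content),
    };
    (text, was_accumulated)
}
```

The unfragmented path borrows `content` and allocates nothing, which matches the stated goal of no accumulation overhead in the common case.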

📝 note

accumulated_text on the shared Arc — google_tag_manager.rs:138

👍 praise

Keep-after-accumulation fix + targeted regression test — script_rewriter.rs:89
Performance results in the PR description — 35% TTFB / 37% DOM Complete improvements backed by a named methodology, metric definitions, and causal explanation. Well above the usual PR description standard.

CI Status

browser integration tests: PASS
integration tests: PASS
prepare integration artifacts: PASS

crates/trusted-server-core/src/publisher.rs

crates/trusted-server-core/src/integrations/google_tag_manager.rs

crates/trusted-server-core/src/integrations/nextjs/script_rewriter.rs

crates/trusted-server-core/src/integrations/google_tag_manager.rs

ChristianPavilonis

Review Summary

This is a well-executed PR. No blocking issues found. The P2 items below are genuine improvements but none are required for merge. CI is all green (3/3 checks passing).

The PR cleanly completes Phase 3 of the streaming pipeline — making NextJsNextDataRewriter and GoogleTagManagerIntegration fragment-safe for lol_html text node splitting, removing the buffered adapter fallback, and adding a 2xx streaming gate.

Highlights

Keep-after-accumulation bug fix is precisely handled with a targeted regression test
Small-chunk (32-byte) pipeline regression tests catch fragmentation bugs end-to-end
Clean surgical removal of buffered mode
2xx streaming gate is a smart defensive measure
Excellent engineering documentation of performance methodology and results

P2 — `rsc_placeholders.rs` fragmentation strategy undocumented

🤔 The RSC placeholder rewriter (crates/trusted-server-core/src/integrations/nextjs/rsc_placeholders.rs around lines 57-64) skips fragmented scripts entirely (returns Keep for intermediate fragments), deferring to the post-processor. This is a valid approach since it has a post-processor fallback. However, it's worth documenting this as a deliberate design choice vs. the accumulation pattern used by the GTM and NextData rewriters in this PR. A brief comment there would help future maintainers understand why this rewriter doesn't use the same accumulation approach.

P3 — Minor

⛏ Spec still references Phase 1 workaround: docs/superpowers/specs/2026-03-25-streaming-response-design.md around lines 230-231 still references the Phase 1 workaround in the "Files Changed" section — should be updated to reflect that all script rewriters are now fragment-safe.
⛏ Performance results duplicated in 3 places: PR description, plan doc, and spec doc all contain the same performance numbers. Minor maintenance concern if numbers are updated later.

crates/trusted-server-core/src/publisher.rs

crates/trusted-server-core/src/integrations/google_tag_manager.rs

- Extract streaming gate into can_stream_response() function so tests call production code instead of reimplementing the formula
- Refactor GTM rewrite() to use Option<String> pattern instead of uninit variable, replacing indirect text.len() != content.len() accumulation check with explicit full_content.is_some()
- Add cross-element safety doc comment on accumulated_text fields in GTM and NextJsNextDataRewriter
- Document RSC placeholder deliberate non-accumulation strategy
- Update spec to reflect script rewriters are now fragment-safe

aram356 · 2026-03-31T23:03:55Z

Addressing review feedback:

Fixed in this push:

Extracted can_stream_response() function — tests now call production code instead of reimplementing the formula
GTM rewrite() refactored to Option<String> pattern, replacing text.len() != content.len() with full_content.is_some()
Added cross-element safety invariant doc comment on accumulated_text in both GTM and NextJsNextDataRewriter
Documented RSC placeholder deliberate non-accumulation strategy in rsc_placeholders.rs
Updated spec to reflect script rewriters are now fragment-safe (removed stale "currently require buffered mode" text)

Acknowledged (not changed):

Performance results in 3 places — intentional for discoverability across spec/plan/PR, accepted minor maintenance cost

ChristianPavilonis

PR Review: Phase 3 — Fragment-Safe Streaming

Good work on the fragment accumulation design and streaming gate. The RemoveNode / Replace(full) pattern is clean, the streaming gate has solid test coverage, and the performance results are impressive (35% TTFB improvement).

Key concerns:

SSRF protection removed — The allowed_domains proxy allowlist and all redirect chain validation were removed. If intentional, the stale docs need updating. If not, this is a security regression.
Accept-Encoding not restricted — Removing restrict_accept_encoding can cause silently corrupted output for unsupported encodings (e.g., zstd).
Synthetic ID validation relaxed — No length bound on the character-only check means unbounded values accepted for cookies and KV store keys.
Fragment accumulation duplication — The pattern is identical in GTM and NextJS rewriters; a shared helper would reduce maintenance risk.

Several doc comments were also lost when # Errors sections were removed — the summary lines should be restored.

See inline comments for details on each finding.

Findings not attached inline (files/lines outside this diff):

🔧 Synthetic ID validation relaxed — no length bound (crates/trusted-server-core/src/synthetic.rs:117)

is_valid_synthetic_id() (enforcing <64-hex>.<6-alphanum>, exactly 71 bytes) was removed. Now any string of [a-zA-Z0-9._-] of any length is accepted as a synthetic ID via validate_existing_synthetic_id. This means arbitrarily long values can be stored in cookies, used as KV store keys for deletion (delete_consent_from_kv), and logged. At minimum, add a length upper-bound check (e.g., synthetic_id.len() <= 256) to prevent abuse.
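The two validation levels discussed in this finding can be contrasted in a short sketch. The function names and the `MAX_SYNTHETIC_ID_LEN` constant are hypothetical; the strict format and the 256-byte bound are taken from the finding's wording, while accepting uppercase hex digits and rejecting the empty string are assumptions.

```rust
/// Assumed upper bound, following the reviewer's example of 256.
const MAX_SYNTHETIC_ID_LEN: usize = 256;

/// Strict format: 64 hex characters, a dot, then 6 alphanumerics (71 bytes).
fn is_strict_synthetic_id(id: &str) -> bool {
    if id.len() != 71 {
        return false;
    }
    match id.split_once('.') {
        Some((hash, suffix)) => {
            hash.len() == 64
                && hash.bytes().all(|b| b.is_ascii_hexdigit())
                && suffix.len() == 6
                && suffix.bytes().all(|b| b.is_ascii_alphanumeric())
        }
        None => false,
    }
}

/// Relaxed character-set check with the suggested length bound added.
fn is_bounded_synthetic_id(id: &str) -> bool {
    !id.is_empty()
        && id.len() <= MAX_SYNTHETIC_ID_LEN
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
}
```

Even the bounded variant accepts many strings the strict check rejects, so the length cap limits cookie and KV key size without restoring the original format guarantee.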

⛏ Missing doc on is_last_in_text_node field (crates/trusted-server-core/src/integrations/registry.rs:94)

This field drives the entire Phase 3 fragment accumulation protocol. Add a doc comment explaining that lol_html may deliver text in multiple fragments and rewriters needing complete text must accumulate until this is true.

crates/trusted-server-core/src/publisher.rs

crates/trusted-server-core/src/proxy.rs

crates/trusted-server-core/src/integrations/adserver_mock.rs

crates/trusted-server-core/src/publisher.rs

crates/trusted-server-core/src/integrations/google_tag_manager.rs

crates/trusted-server-core/src/integrations/nextjs/script_rewriter.rs

crates/trusted-server-core/src/publisher.rs

crates/trusted-server-core/src/integrations/google_tag_manager.rs

When rewrite_structured returns Keep on accumulated content, intermediate fragments were already removed via RemoveNode. Emit the full accumulated content via Replace to prevent silent data loss. Also updates spec to reflect Phase 3 completion.

- Add response.get_status().is_success() check to streaming gate so 4xx/5xx error pages stay buffered with complete status codes
- Add streaming gate unit tests covering all gate conditions
- Add stream_publisher_body gzip round-trip test
- Add small-chunk (32 byte) pipeline tests for __NEXT_DATA__ and GTM that prove fragmented text nodes survive the real lol_html path

- Document why Mutex<String> is used (Sync bound on trait, not concurrent access) in both NextJsNextDataRewriter and GoogleTagManagerIntegration
- Add accumulation_buffer_drains_between_consecutive_script_elements test proving the buffer doesn't leak between two sequential <script> elements (fragmented GTM followed by fragmented non-GTM)
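The reasoning behind the `Mutex<String>` choice can be shown in isolation. In the sketch below, the `ScriptRewriter` trait, `BufferingRewriter`, and `assert_sync` are hypothetical stand-ins rather than the project's types. The point is only that a trait bounded by `Sync` whose methods take `&self` needs interior mutability that is itself `Sync`, which rules out `RefCell<String>` even when no two threads ever touch the buffer.

```rust
use std::sync::Mutex;

/// Hypothetical trait: implementors are shared, so the trait requires
/// `Send + Sync` and its methods only get `&self`.
trait ScriptRewriter: Send + Sync {
    /// Returns the complete text once the last fragment arrives.
    fn push_fragment(&self, fragment: &str, is_last: bool) -> Option<String>;
}

struct BufferingRewriter {
    // `RefCell<String>` would also allow mutation through `&self`, but it is
    // not `Sync`, so this struct could not implement the trait.
    accumulated_text: Mutex<String>,
}

impl BufferingRewriter {
    fn new() -> Self {
        Self { accumulated_text: Mutex::new(String::new()) }
    }
}

impl ScriptRewriter for BufferingRewriter {
    fn push_fragment(&self, fragment: &str, is_last: bool) -> Option<String> {
        let mut buffer = self
            .accumulated_text
            .lock()
            .expect("accumulation lock poisoned");
        buffer.push_str(fragment);
        if is_last {
            // Draining leaves an empty buffer, so nothing leaks into the
            // next <script> element.
            Some(std::mem::take(&mut *buffer))
        } else {
            None
        }
    }
}

/// Compile-time check that a type satisfies the `Sync` bound.
fn assert_sync<T: Sync>() {}
```

Calling `assert_sync::<BufferingRewriter>()` compiles as written; swapping the field to `RefCell<String>` would make both that call and the trait impl fail to compile.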

prk-Jr

👍 Looks good

aram356 self-assigned this Mar 27, 2026

aram356 marked this pull request as draft March 27, 2026 01:47

This was referenced Mar 27, 2026

Implement streaming response optimization for non-Next.js publisher proxy #563

Open

Phase 4: Stream binary pass-through responses via io::copy #594

Open

aram356 marked this pull request as ready for review March 27, 2026 23:54

aram356 requested review from ChristianPavilonis and prk-Jr March 27, 2026 23:54

prk-Jr reviewed Mar 31, 2026

ChristianPavilonis reviewed Mar 31, 2026

aram356 mentioned this pull request Mar 31, 2026

Phase 2: Stream responses to client via StreamingBody #585

Open

5 tasks

aram356 requested review from ChristianPavilonis and prk-Jr March 31, 2026 23:05

ChristianPavilonis reviewed Apr 6, 2026

aram356 added 8 commits April 8, 2026 18:19

Make NextJsNextDataRewriter fragment-safe for streaming

6e6ac7c

Accumulate text fragments via Mutex<String> until last_in_text_node is true, then process the complete text. Intermediate fragments return RemoveNode to suppress output.

Make GoogleTagManagerIntegration rewrite fragment-safe for streaming

2fb546f

Accumulate text fragments via Mutex<String> until last_in_text_node is true, then match and rewrite on the complete text. Non-GTM scripts that were fragmented are emitted unchanged.

Remove buffered mode from HtmlRewriterAdapter

41c6bb3

All script rewriters (NextJS __NEXT_DATA__, GTM) are now fragment-safe — they accumulate text internally until last_in_text_node. The buffered adapter workaround is no longer needed. Always use streaming mode in create_html_processor.

Add Phase 3 results and Phase 4 plan to spec and plan documents

dd2f82e

Phase 3 performance results: 35% TTFB improvement, 37% DOM Complete improvement on getpurpose.ai staging vs production. Phase 4 adds binary pass-through streaming via PublisherResponse::PassThrough.

aram356 force-pushed the feature/streaming-pipeline-phase3 branch from db597e6 to ff05483 Compare April 9, 2026 01:41

aram356 requested a review from ChristianPavilonis April 9, 2026 01:54

prk-Jr approved these changes Apr 9, 2026

ChristianPavilonis approved these changes Apr 9, 2026
