StreamingPipeline {
) -> Result<(), Report> {
use flate2::read::ZlibDecoder;
use flate2::write::ZlibEncoder;
- use flate2::Compression;
let decoder = ZlibDecoder::new(input);
- let encoder = ZlibEncoder::new(output, Compression::default());
-
- self.process_through_compression(decoder, encoder)
+ let mut encoder = ZlibEncoder::new(output, flate2::Compression::default());
+ self.process_through_compression(decoder, &mut encoder)?;
+ encoder.finish().change_context(TrustedServerError::Proxy {
+ message: "Failed to finalize deflate encoder".to_string(),
+ })?;
+ Ok(())
}
/// Process deflate compressed input to uncompressed output (decompression only)
@@ -315,9 +317,11 @@ impl StreamingPipeline {
lgwin: 22,
..Default::default()
};
- let encoder = CompressorWriter::with_params(output, 4096, &params);
-
- self.process_through_compression(decoder, encoder)
+ let mut encoder = CompressorWriter::with_params(output, 4096, &params);
+ self.process_through_compression(decoder, &mut encoder)?;
+ // CompressorWriter finalizes on flush (already called) and into_inner
+ encoder.into_inner();
+ Ok(())
}
/// Process brotli compressed input to uncompressed output (decompression only)
@@ -332,10 +336,14 @@ impl StreamingPipeline {
}
/// Generic processing through compression layers
+ ///
+ /// The caller retains ownership of `encoder` and must call its
+ /// type-specific finalization method (e.g., `finish()` or `into_inner()`)
+ /// after this function returns successfully.
fn process_through_compression<R: Read, W: Write>(
&mut self,
mut decoder: R,
- mut encoder: W,
+ encoder: &mut W,
) -> Result<(), Report> {
let mut buffer = vec![0u8; self.config.chunk_size];
@@ -380,15 +388,11 @@ impl StreamingPipeline {
}
}
- // Flush encoder (this also finishes compression)
encoder.flush().change_context(TrustedServerError::Proxy {
message: "Failed to flush encoder".to_string(),
})?;
- // For GzEncoder and similar, we need to finish() to properly close the stream
- // The flush above might not be enough
- drop(encoder);
-
+ // Caller owns encoder and must call finish() after this returns.
Ok(())
}
}
@@ -646,6 +650,58 @@ mod tests {
);
}
+ #[test]
+ fn test_deflate_round_trip_produces_valid_output() {
+ // Verify that deflate-to-deflate (which uses process_through_compression)
+ // produces valid output that decompresses correctly. This establishes the
+ // correctness contract before we change the finalization path.
+ use flate2::read::ZlibDecoder;
+ use flate2::write::ZlibEncoder;
+ use std::io::{Read as _, Write as _};
+
+ let input_data = b"hello world";
+
+ // Compress input
+ let mut compressed_input = Vec::new();
+ {
+ let mut enc =
+ ZlibEncoder::new(&mut compressed_input, flate2::Compression::default());
+ enc.write_all(input_data)
+ .expect("should compress test input");
+ enc.finish().expect("should finish compression");
+ }
+
+ let replacer = StreamingReplacer::new(vec![Replacement {
+ find: "hello".to_string(),
+ replace_with: "hi".to_string(),
+ }]);
+
+ let config = PipelineConfig {
+ input_compression: Compression::Deflate,
+ output_compression: Compression::Deflate,
+ chunk_size: 8192,
+ };
+
+ let mut pipeline = StreamingPipeline::new(config, replacer);
+ let mut output = Vec::new();
+
+ pipeline
+ .process(&compressed_input[..], &mut output)
+ .expect("should process deflate-to-deflate");
+
+ // Decompress output and verify correctness
+ let mut decompressed = Vec::new();
+ ZlibDecoder::new(&output[..])
+ .read_to_end(&mut decompressed)
+ .expect("should decompress output — implies encoder was finalized correctly");
+
+ assert_eq!(
+ String::from_utf8(decompressed).expect("should be valid UTF-8"),
+ "hi world",
+ "should have replaced content through deflate round-trip"
+ );
+ }
+
#[test]
fn test_streaming_pipeline_with_html_rewriter() {
use lol_html::{element, Settings};
From a4fd5c69568fd815286e7c3946efd97472b62424 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 09:13:06 -0700
Subject: [PATCH 08/29] Convert process_gzip_to_gzip to chunk-based processing
---
.../src/streaming_processor.rs | 85 ++++++++++++-------
1 file changed, 54 insertions(+), 31 deletions(-)
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 50c595d9..accf80e2 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -187,40 +187,13 @@ impl StreamingPipeline {
) -> Result<(), Report> {
use flate2::read::GzDecoder;
use flate2::write::GzEncoder;
- use flate2::Compression;
- // Decompress input
- let mut decoder = GzDecoder::new(input);
- let mut decompressed = Vec::new();
- decoder
- .read_to_end(&mut decompressed)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to decompress gzip".to_string(),
- })?;
-
- log::info!("Decompressed size: {} bytes", decompressed.len());
-
- // Process the decompressed content
- let processed = self
- .processor
- .process_chunk(&decompressed, true)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to process content".to_string(),
- })?;
-
- log::info!("Processed size: {} bytes", processed.len());
-
- // Recompress the output
- let mut encoder = GzEncoder::new(output, Compression::default());
- encoder
- .write_all(&processed)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to write to gzip encoder".to_string(),
- })?;
+ let decoder = GzDecoder::new(input);
+ let mut encoder = GzEncoder::new(output, flate2::Compression::default());
+ self.process_through_compression(decoder, &mut encoder)?;
encoder.finish().change_context(TrustedServerError::Proxy {
- message: "Failed to finish gzip encoder".to_string(),
+ message: "Failed to finalize gzip encoder".to_string(),
})?;
-
Ok(())
}
@@ -702,6 +675,56 @@ mod tests {
);
}
+ #[test]
+ fn test_gzip_to_gzip_produces_correct_output() {
+ use flate2::read::GzDecoder;
+ use flate2::write::GzEncoder;
+ use std::io::{Read as _, Write as _};
+
+ // Arrange
+ let input_data = b"hello world";
+
+ let mut compressed_input = Vec::new();
+ {
+ let mut enc =
+ GzEncoder::new(&mut compressed_input, flate2::Compression::default());
+ enc.write_all(input_data)
+ .expect("should compress test input");
+ enc.finish().expect("should finish compression");
+ }
+
+ let replacer = StreamingReplacer::new(vec![Replacement {
+ find: "hello".to_string(),
+ replace_with: "hi".to_string(),
+ }]);
+
+ let config = PipelineConfig {
+ input_compression: Compression::Gzip,
+ output_compression: Compression::Gzip,
+ chunk_size: 8192,
+ };
+
+ let mut pipeline = StreamingPipeline::new(config, replacer);
+ let mut output = Vec::new();
+
+ // Act
+ pipeline
+ .process(&compressed_input[..], &mut output)
+ .expect("should process gzip-to-gzip");
+
+ // Assert
+ let mut decompressed = Vec::new();
+ GzDecoder::new(&output[..])
+ .read_to_end(&mut decompressed)
+ .expect("should decompress output — implies encoder was finalized correctly");
+
+ assert_eq!(
+ String::from_utf8(decompressed).expect("should be valid UTF-8"),
+ "hi world",
+ "should have replaced content through gzip round-trip"
+ );
+ }
+
#[test]
fn test_streaming_pipeline_with_html_rewriter() {
use lol_html::{element, Settings};
From a4f4a7c189eeeaa5a778eac958e4881c623aa8af Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 09:19:29 -0700
Subject: [PATCH 09/29] Convert decompress_and_process to chunk-based
processing
---
.../src/streaming_processor.rs | 114 ++++++++++++++----
1 file changed, 89 insertions(+), 25 deletions(-)
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index accf80e2..5ea7aa5b 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -197,39 +197,58 @@ impl StreamingPipeline {
Ok(())
}
- /// Decompress input, process content, and write uncompressed output.
+ /// Decompress input, process content in chunks, and write uncompressed output.
fn decompress_and_process<R: Read, W: Write>(
&mut self,
mut decoder: R,
mut output: W,
codec_name: &str,
) -> Result<(), Report> {
- let mut decompressed = Vec::new();
- decoder
- .read_to_end(&mut decompressed)
- .change_context(TrustedServerError::Proxy {
- message: format!("Failed to decompress {codec_name}"),
- })?;
-
- log::info!(
- "{codec_name} decompressed size: {} bytes",
- decompressed.len()
- );
-
- let processed = self
- .processor
- .process_chunk(&decompressed, true)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to process content".to_string(),
- })?;
+ let mut buffer = vec![0u8; self.config.chunk_size];
- log::info!("{codec_name} processed size: {} bytes", processed.len());
+ loop {
+ match decoder.read(&mut buffer) {
+ Ok(0) => {
+ let final_chunk = self.processor.process_chunk(&[], true).change_context(
+ TrustedServerError::Proxy {
+ message: format!("Failed to process final {codec_name} chunk"),
+ },
+ )?;
+ if !final_chunk.is_empty() {
+ output.write_all(&final_chunk).change_context(
+ TrustedServerError::Proxy {
+ message: format!("Failed to write final {codec_name} chunk"),
+ },
+ )?;
+ }
+ break;
+ }
+ Ok(n) => {
+ let processed = self
+ .processor
+ .process_chunk(&buffer[..n], false)
+ .change_context(TrustedServerError::Proxy {
+ message: format!("Failed to process {codec_name} chunk"),
+ })?;
+ if !processed.is_empty() {
+ output.write_all(&processed).change_context(
+ TrustedServerError::Proxy {
+ message: format!("Failed to write {codec_name} chunk"),
+ },
+ )?;
+ }
+ }
+ Err(e) => {
+ return Err(Report::new(TrustedServerError::Proxy {
+ message: format!("Failed to read from {codec_name} decoder: {e}"),
+ }));
+ }
+ }
+ }
- output
- .write_all(&processed)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to write output".to_string(),
- })?;
+ output.flush().change_context(TrustedServerError::Proxy {
+ message: format!("Failed to flush {codec_name} output"),
+ })?;
Ok(())
}
@@ -725,6 +744,51 @@ mod tests {
);
}
+ #[test]
+ fn test_gzip_to_none_produces_correct_output() {
+ use flate2::write::GzEncoder;
+ use std::io::Write as _;
+
+ // Arrange
+ let input_data = b"hello world";
+
+ let mut compressed_input = Vec::new();
+ {
+ let mut enc =
+ GzEncoder::new(&mut compressed_input, flate2::Compression::default());
+ enc.write_all(input_data)
+ .expect("should compress test input");
+ enc.finish().expect("should finish compression");
+ }
+
+ let replacer = StreamingReplacer::new(vec![Replacement {
+ find: "hello".to_string(),
+ replace_with: "hi".to_string(),
+ }]);
+
+ let config = PipelineConfig {
+ input_compression: Compression::Gzip,
+ output_compression: Compression::None,
+ chunk_size: 8192,
+ };
+
+ let mut pipeline = StreamingPipeline::new(config, replacer);
+ let mut output = Vec::new();
+
+ // Act
+ pipeline
+ .process(&compressed_input[..], &mut output)
+ .expect("should process gzip-to-none");
+
+ // Assert
+ let result =
+ String::from_utf8(output).expect("should be valid UTF-8 uncompressed output");
+ assert_eq!(
+ result, "hi world",
+ "should have replaced content after gzip decompression"
+ );
+ }
+
#[test]
fn test_streaming_pipeline_with_html_rewriter() {
use lol_html::{element, Settings};
From 105244c1dab0468c4155c220adf81da04b8c3264 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 09:23:49 -0700
Subject: [PATCH 10/29] Rewrite HtmlRewriterAdapter for incremental lol_html
streaming
---
.../src/streaming_processor.rs | 241 +++++++++++-------
1 file changed, 144 insertions(+), 97 deletions(-)
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 5ea7aa5b..20171b6a 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -6,6 +6,9 @@
//! - Memory-efficient streaming
//! - UTF-8 boundary handling
+use std::cell::RefCell;
+use std::rc::Rc;
+
use error_stack::{Report, ResultExt};
use std::io::{self, Read, Write};
@@ -389,81 +392,70 @@ impl StreamingPipeline {
}
}
-/// Adapter to use `lol_html` `HtmlRewriter` as a `StreamProcessor`
-/// Important: Due to `lol_html`'s ownership model, we must accumulate input
-/// and process it all at once when the stream ends. This is a limitation
-/// of the `lol_html` library's API design.
+/// Shared output buffer used as an [`lol_html::OutputSink`].
+///
+/// The `HtmlRewriter` invokes [`OutputSink::handle_chunk`] synchronously during
+/// each [`HtmlRewriter::write`] call, so the buffer is drained after every
+/// `process_chunk` invocation to emit output incrementally.
+struct RcVecSink(Rc<RefCell<Vec<u8>>>);
+
+impl lol_html::OutputSink for RcVecSink {
+ fn handle_chunk(&mut self, chunk: &[u8]) {
+ self.0.borrow_mut().extend_from_slice(chunk);
+ }
+}
+
+/// Adapter to use `lol_html` [`HtmlRewriter`](lol_html::HtmlRewriter) as a [`StreamProcessor`].
+///
+/// Output is emitted incrementally on every [`StreamProcessor::process_chunk`] call.
+/// The adapter is single-use: one adapter per request. Calling [`StreamProcessor::reset`]
+/// is a no-op because the rewriter consumes its settings on construction.
pub struct HtmlRewriterAdapter {
- settings: lol_html::Settings<'static, 'static>,
- accumulated_input: Vec<u8>,
+ rewriter: Option<lol_html::HtmlRewriter<'static, RcVecSink>>,
+ output: Rc<RefCell<Vec<u8>>>,
}
impl HtmlRewriterAdapter {
- /// Create a new HTML rewriter adapter
+ /// Create a new HTML rewriter adapter that streams output per chunk.
#[must_use]
pub fn new(settings: lol_html::Settings<'static, 'static>) -> Self {
+ let output = Rc::new(RefCell::new(Vec::new()));
+ let sink = RcVecSink(Rc::clone(&output));
+ let rewriter = lol_html::HtmlRewriter::new(settings, sink);
Self {
- settings,
- accumulated_input: Vec::new(),
+ rewriter: Some(rewriter),
+ output,
}
}
}
impl StreamProcessor for HtmlRewriterAdapter {
fn process_chunk(&mut self, chunk: &[u8], is_last: bool) -> Result<Vec<u8>, io::Error> {
- // Accumulate input chunks
- self.accumulated_input.extend_from_slice(chunk);
-
- if !chunk.is_empty() {
- log::debug!(
- "Buffering chunk: {} bytes, total buffered: {} bytes",
- chunk.len(),
- self.accumulated_input.len()
- );
+ if let Some(rewriter) = &mut self.rewriter {
+ if !chunk.is_empty() {
+ rewriter.write(chunk).map_err(|e| {
+ log::error!("Failed to process HTML chunk: {e}");
+ io::Error::other(format!("HTML processing failed: {e}"))
+ })?;
+ }
}
- // Only process when we have all the input
if is_last {
- log::info!(
- "Processing complete document: {} bytes",
- self.accumulated_input.len()
- );
-
- // Process all accumulated input at once
- let mut output = Vec::new();
-
- // Create rewriter with output sink
- let mut rewriter = lol_html::HtmlRewriter::new(
- std::mem::take(&mut self.settings),
- |chunk: &[u8]| {
- output.extend_from_slice(chunk);
- },
- );
-
- // Process the entire document
- rewriter.write(&self.accumulated_input).map_err(|e| {
- log::error!("Failed to process HTML: {}", e);
- io::Error::other(format!("HTML processing failed: {}", e))
- })?;
-
- // Finalize the rewriter
- rewriter.end().map_err(|e| {
- log::error!("Failed to finalize: {}", e);
- io::Error::other(format!("HTML finalization failed: {}", e))
- })?;
-
- log::debug!("Output size: {} bytes", output.len());
- self.accumulated_input.clear();
- Ok(output)
- } else {
- // Return empty until we have all input
- // This is a limitation of lol_html's API
- Ok(Vec::new())
+ if let Some(rewriter) = self.rewriter.take() {
+ rewriter.end().map_err(|e| {
+ log::error!("Failed to finalize HTML: {e}");
+ io::Error::other(format!("HTML finalization failed: {e}"))
+ })?;
+ }
}
+
+ // Drain whatever lol_html produced since the last call
+ Ok(std::mem::take(&mut *self.output.borrow_mut()))
}
fn reset(&mut self) {
- self.accumulated_input.clear();
+ // No-op: the rewriter consumed its Settings on construction.
+ // Single-use by design (one adapter per request).
}
}
@@ -530,7 +522,7 @@ mod tests {
}
#[test]
- fn test_html_rewriter_adapter_accumulates_until_last() {
+ fn test_html_rewriter_adapter_streams_incrementally() {
use lol_html::{element, Settings};
// Create a simple HTML rewriter that replaces text
@@ -544,32 +536,40 @@ mod tests {
let mut adapter = HtmlRewriterAdapter::new(settings);
- // Test that intermediate chunks return empty
let chunk1 = b"<html><body>";
let result1 = adapter
.process_chunk(chunk1, false)
.expect("should process chunk1");
- assert_eq!(result1.len(), 0, "Should return empty for non-last chunk");
let chunk2 = b"<p>original</p>";
let result2 = adapter
.process_chunk(chunk2, false)
.expect("should process chunk2");
- assert_eq!(result2.len(), 0, "Should return empty for non-last chunk");
- // Test that last chunk processes everything
let chunk3 = b"</body></html>";
let result3 = adapter
.process_chunk(chunk3, true)
.expect("should process final chunk");
+
+ // Concatenate all outputs and verify the final HTML is correct
+ let mut all_output = result1;
+ all_output.extend_from_slice(&result2);
+ all_output.extend_from_slice(&result3);
+
assert!(
- !result3.is_empty(),
- "Should return processed content for last chunk"
+ !all_output.is_empty(),
+ "should produce non-empty concatenated output"
);
- let output = String::from_utf8(result3).expect("output should be valid UTF-8");
- assert!(output.contains("replaced"), "Should have replaced content");
- assert!(output.contains(""), "Should have complete HTML");
+ let output = String::from_utf8(all_output).expect("output should be valid UTF-8");
+ assert!(
+ output.contains("replaced"),
+ "should have replaced content in concatenated output"
+ );
+ assert!(
+ output.contains("</html>"),
+ "should have complete HTML in concatenated output"
+ );
}
#[test]
@@ -586,59 +586,59 @@ mod tests {
}
large_html.push_str("");
- // Process in chunks
+ // Process in chunks and collect all output
let chunk_size = 1024;
let bytes = large_html.as_bytes();
- let mut chunks = bytes.chunks(chunk_size);
- let mut last_chunk = chunks.next().unwrap_or(&[]);
+ let mut chunks = bytes.chunks(chunk_size).peekable();
+ let mut all_output = Vec::new();
- for chunk in chunks {
+ while let Some(chunk) = chunks.next() {
+ let is_last = chunks.peek().is_none();
let result = adapter
- .process_chunk(last_chunk, false)
- .expect("should process intermediate chunk");
- assert_eq!(result.len(), 0, "Intermediate chunks should return empty");
- last_chunk = chunk;
+ .process_chunk(chunk, is_last)
+ .expect("should process chunk");
+ all_output.extend_from_slice(&result);
}
- // Process last chunk
- let result = adapter
- .process_chunk(last_chunk, true)
- .expect("should process last chunk");
- assert!(!result.is_empty(), "Last chunk should return content");
+ assert!(
+ !all_output.is_empty(),
+ "should produce non-empty output for large document"
+ );
- let output = String::from_utf8(result).expect("output should be valid UTF-8");
+ let output = String::from_utf8(all_output).expect("output should be valid UTF-8");
assert!(
output.contains("Paragraph 999"),
- "Should contain all content"
+ "should contain all content from large document"
);
}
#[test]
- fn test_html_rewriter_adapter_reset() {
+ fn test_html_rewriter_adapter_reset_is_noop() {
use lol_html::Settings;
let settings = Settings::default();
let mut adapter = HtmlRewriterAdapter::new(settings);
// Process some content
- adapter
- .process_chunk(b"", false)
- .expect("should process html tag");
- adapter
- .process_chunk(b"test", false)
- .expect("should process body");
-
- // Reset should clear accumulated input
+ let result1 = adapter
+ .process_chunk(b"test", false)
+ .expect("should process html");
+
+ // Reset is a no-op — the adapter is single-use by design
adapter.reset();
- // After reset, adapter should be ready for new input
- let result = adapter
- .process_chunk(b"new
", true)
- .expect("should process new content after reset");
- let output = String::from_utf8(result).expect("output should be valid UTF-8");
- assert_eq!(
- output, "new
",
- "Should only contain new input after reset"
+ // The rewriter is still alive; finalize it
+ let result2 = adapter
+ .process_chunk(b"", true)
+ .expect("should finalize after reset");
+
+ let mut all_output = result1;
+ all_output.extend_from_slice(&result2);
+
+ let output = String::from_utf8(all_output).expect("output should be valid UTF-8");
+ assert!(
+ output.contains("test"),
+ "should still produce output after no-op reset"
);
}
@@ -789,6 +789,53 @@ mod tests {
);
}
+ #[test]
+ fn test_html_rewriter_adapter_emits_output_per_chunk() {
+ use lol_html::Settings;
+
+ let settings = Settings::default();
+ let mut adapter = HtmlRewriterAdapter::new(settings);
+
+ // Send three chunks
+ let chunk1 = b"<html>";
+ let result1 = adapter
+ .process_chunk(chunk1, false)
+ .expect("should process chunk1");
+ assert!(
+ !result1.is_empty(),
+ "should emit output for first chunk, got empty"
+ );
+
+ let chunk2 = b"<p>hello</p>";
+ let result2 = adapter
+ .process_chunk(chunk2, false)
+ .expect("should process chunk2");
+
+ let chunk3 = b"</html>";
+ let result3 = adapter
+ .process_chunk(chunk3, true)
+ .expect("should process final chunk");
+
+ // Concatenate all outputs and verify correctness
+ let mut all_output = result1;
+ all_output.extend_from_slice(&result2);
+ all_output.extend_from_slice(&result3);
+
+ let output = String::from_utf8(all_output).expect("output should be valid UTF-8");
+ assert!(
+ output.contains("<html>"),
+ "should contain html tag in concatenated output"
+ );
+ assert!(
+ output.contains("<p>hello</p>"),
+ "should contain paragraph in concatenated output"
+ );
+ assert!(
+ output.contains("</html>"),
+ "should contain closing html tag in concatenated output"
+ );
+ }
+
#[test]
fn test_streaming_pipeline_with_html_rewriter() {
use lol_html::{element, Settings};
From d72669c6c8057c411177692d8f4be4e0ab3d95a4 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 09:28:08 -0700
Subject: [PATCH 11/29] Unify compression paths into single process_chunks
method
---
.../src/streaming_processor.rs | 300 +++++-------------
1 file changed, 73 insertions(+), 227 deletions(-)
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 20171b6a..7062df93 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -94,6 +94,10 @@ impl StreamingPipeline {
/// Process a stream from input to output
///
+ /// Handles all supported compression transformations by wrapping the raw
+ /// reader/writer in the appropriate decoder/encoder, then delegating to
+ /// [`Self::process_chunks`].
+ ///
/// # Errors
///
/// Returns an error if the compression transformation is unsupported or if reading/writing fails.
@@ -106,253 +110,96 @@ impl StreamingPipeline {
self.config.input_compression,
self.config.output_compression,
) {
- (Compression::None, Compression::None) => self.process_uncompressed(input, output),
- (Compression::Gzip, Compression::Gzip) => self.process_gzip_to_gzip(input, output),
- (Compression::Gzip, Compression::None) => self.process_gzip_to_none(input, output),
+ (Compression::None, Compression::None) => self.process_chunks(input, output),
+ (Compression::Gzip, Compression::Gzip) => {
+ use flate2::read::GzDecoder;
+ use flate2::write::GzEncoder;
+
+ let decoder = GzDecoder::new(input);
+ let mut encoder = GzEncoder::new(output, flate2::Compression::default());
+ self.process_chunks(decoder, &mut encoder)?;
+ encoder.finish().change_context(TrustedServerError::Proxy {
+ message: "Failed to finalize gzip encoder".to_string(),
+ })?;
+ Ok(())
+ }
+ (Compression::Gzip, Compression::None) => {
+ use flate2::read::GzDecoder;
+
+ self.process_chunks(GzDecoder::new(input), output)
+ }
(Compression::Deflate, Compression::Deflate) => {
- self.process_deflate_to_deflate(input, output)
+ use flate2::read::ZlibDecoder;
+ use flate2::write::ZlibEncoder;
+
+ let decoder = ZlibDecoder::new(input);
+ let mut encoder = ZlibEncoder::new(output, flate2::Compression::default());
+ self.process_chunks(decoder, &mut encoder)?;
+ encoder.finish().change_context(TrustedServerError::Proxy {
+ message: "Failed to finalize deflate encoder".to_string(),
+ })?;
+ Ok(())
}
(Compression::Deflate, Compression::None) => {
- self.process_deflate_to_none(input, output)
+ use flate2::read::ZlibDecoder;
+
+ self.process_chunks(ZlibDecoder::new(input), output)
}
(Compression::Brotli, Compression::Brotli) => {
- self.process_brotli_to_brotli(input, output)
+ use brotli::enc::writer::CompressorWriter;
+ use brotli::enc::BrotliEncoderParams;
+ use brotli::Decompressor;
+
+ let decoder = Decompressor::new(input, 4096);
+ let params = BrotliEncoderParams {
+ quality: 4,
+ lgwin: 22,
+ ..Default::default()
+ };
+ let mut encoder = CompressorWriter::with_params(output, 4096, &params);
+ self.process_chunks(decoder, &mut encoder)?;
+ // CompressorWriter finalizes on flush (already called) and into_inner
+ encoder.into_inner();
+ Ok(())
+ }
+ (Compression::Brotli, Compression::None) => {
+ use brotli::Decompressor;
+
+ self.process_chunks(Decompressor::new(input, 4096), output)
}
- (Compression::Brotli, Compression::None) => self.process_brotli_to_none(input, output),
_ => Err(Report::new(TrustedServerError::Proxy {
message: "Unsupported compression transformation".to_string(),
})),
}
}
- /// Process uncompressed stream
- fn process_uncompressed(
- &mut self,
- mut input: R,
- mut output: W,
- ) -> Result<(), Report> {
- let mut buffer = vec![0u8; self.config.chunk_size];
-
- loop {
- match input.read(&mut buffer) {
- Ok(0) => {
- // End of stream - process any remaining data
- let final_chunk = self.processor.process_chunk(&[], true).change_context(
- TrustedServerError::Proxy {
- message: "Failed to process final chunk".to_string(),
- },
- )?;
- if !final_chunk.is_empty() {
- output.write_all(&final_chunk).change_context(
- TrustedServerError::Proxy {
- message: "Failed to write final chunk".to_string(),
- },
- )?;
- }
- break;
- }
- Ok(n) => {
- // Process this chunk
- let processed = self
- .processor
- .process_chunk(&buffer[..n], false)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to process chunk".to_string(),
- })?;
- if !processed.is_empty() {
- output
- .write_all(&processed)
- .change_context(TrustedServerError::Proxy {
- message: "Failed to write processed chunk".to_string(),
- })?;
- }
- }
- Err(e) => {
- return Err(Report::new(TrustedServerError::Proxy {
- message: format!("Failed to read from input: {}", e),
- }));
- }
- }
- }
-
- output.flush().change_context(TrustedServerError::Proxy {
- message: "Failed to flush output".to_string(),
- })?;
-
- Ok(())
- }
-
- /// Process gzip compressed stream
- fn process_gzip_to_gzip(
- &mut self,
- input: R,
- output: W,
- ) -> Result<(), Report> {
- use flate2::read::GzDecoder;
- use flate2::write::GzEncoder;
-
- let decoder = GzDecoder::new(input);
- let mut encoder = GzEncoder::new(output, flate2::Compression::default());
- self.process_through_compression(decoder, &mut encoder)?;
- encoder.finish().change_context(TrustedServerError::Proxy {
- message: "Failed to finalize gzip encoder".to_string(),
- })?;
- Ok(())
- }
-
- /// Decompress input, process content in chunks, and write uncompressed output.
- fn decompress_and_process(
- &mut self,
- mut decoder: R,
- mut output: W,
- codec_name: &str,
- ) -> Result<(), Report> {
- let mut buffer = vec![0u8; self.config.chunk_size];
-
- loop {
- match decoder.read(&mut buffer) {
- Ok(0) => {
- let final_chunk = self.processor.process_chunk(&[], true).change_context(
- TrustedServerError::Proxy {
- message: format!("Failed to process final {codec_name} chunk"),
- },
- )?;
- if !final_chunk.is_empty() {
- output.write_all(&final_chunk).change_context(
- TrustedServerError::Proxy {
- message: format!("Failed to write final {codec_name} chunk"),
- },
- )?;
- }
- break;
- }
- Ok(n) => {
- let processed = self
- .processor
- .process_chunk(&buffer[..n], false)
- .change_context(TrustedServerError::Proxy {
- message: format!("Failed to process {codec_name} chunk"),
- })?;
- if !processed.is_empty() {
- output.write_all(&processed).change_context(
- TrustedServerError::Proxy {
- message: format!("Failed to write {codec_name} chunk"),
- },
- )?;
- }
- }
- Err(e) => {
- return Err(Report::new(TrustedServerError::Proxy {
- message: format!("Failed to read from {codec_name} decoder: {e}"),
- }));
- }
- }
- }
-
- output.flush().change_context(TrustedServerError::Proxy {
- message: format!("Failed to flush {codec_name} output"),
- })?;
-
- Ok(())
- }
-
- /// Process gzip compressed input to uncompressed output (decompression only)
- fn process_gzip_to_none(
- &mut self,
- input: R,
- output: W,
- ) -> Result<(), Report> {
- use flate2::read::GzDecoder;
-
- self.decompress_and_process(GzDecoder::new(input), output, "gzip")
- }
-
- /// Process deflate compressed stream
- fn process_deflate_to_deflate(
- &mut self,
- input: R,
- output: W,
- ) -> Result<(), Report> {
- use flate2::read::ZlibDecoder;
- use flate2::write::ZlibEncoder;
-
- let decoder = ZlibDecoder::new(input);
- let mut encoder = ZlibEncoder::new(output, flate2::Compression::default());
- self.process_through_compression(decoder, &mut encoder)?;
- encoder.finish().change_context(TrustedServerError::Proxy {
- message: "Failed to finalize deflate encoder".to_string(),
- })?;
- Ok(())
- }
-
- /// Process deflate compressed input to uncompressed output (decompression only)
- fn process_deflate_to_none(
- &mut self,
- input: R,
- output: W,
- ) -> Result<(), Report> {
- use flate2::read::ZlibDecoder;
-
- self.decompress_and_process(ZlibDecoder::new(input), output, "deflate")
- }
-
- /// Process brotli compressed stream
- fn process_brotli_to_brotli(
- &mut self,
- input: R,
- output: W,
- ) -> Result<(), Report> {
- use brotli::enc::writer::CompressorWriter;
- use brotli::enc::BrotliEncoderParams;
- use brotli::Decompressor;
-
- let decoder = Decompressor::new(input, 4096);
- let params = BrotliEncoderParams {
- quality: 4,
- lgwin: 22,
- ..Default::default()
- };
- let mut encoder = CompressorWriter::with_params(output, 4096, &params);
- self.process_through_compression(decoder, &mut encoder)?;
- // CompressorWriter finalizes on flush (already called) and into_inner
- encoder.into_inner();
- Ok(())
- }
-
- /// Process brotli compressed input to uncompressed output (decompression only)
- fn process_brotli_to_none(
- &mut self,
- input: R,
- output: W,
- ) -> Result<(), Report> {
- use brotli::Decompressor;
-
- self.decompress_and_process(Decompressor::new(input, 4096), output, "brotli")
- }
-
- /// Generic processing through compression layers
+ /// Read chunks from `reader`, pass each through the processor, and write output to `writer`.
///
- /// The caller retains ownership of `encoder` and must call its
- /// type-specific finalization method (e.g., `finish()` or `into_inner()`)
- /// after this function returns successfully.
- fn process_through_compression(
+ /// This is the single unified chunk loop used by all compression paths.
+ /// The caller is responsible for wrapping `reader`/`writer` in the appropriate
+ /// decoder/encoder and for finalizing the encoder (e.g., calling `finish()`)
+ /// after this method returns.
+ ///
+ /// # Errors
+ ///
+ /// Returns an error if reading, processing, or writing any chunk fails.
+ fn process_chunks<R: Read, W: Write>(
&mut self,
- mut decoder: R,
- encoder: &mut W,
+ mut reader: R,
+ mut writer: W,
) -> Result<(), Report> {
let mut buffer = vec![0u8; self.config.chunk_size];
loop {
- match decoder.read(&mut buffer) {
+ match reader.read(&mut buffer) {
Ok(0) => {
- // End of stream
let final_chunk = self.processor.process_chunk(&[], true).change_context(
TrustedServerError::Proxy {
message: "Failed to process final chunk".to_string(),
},
)?;
if !final_chunk.is_empty() {
- encoder.write_all(&final_chunk).change_context(
+ writer.write_all(&final_chunk).change_context(
TrustedServerError::Proxy {
message: "Failed to write final chunk".to_string(),
},
@@ -368,7 +215,7 @@ impl StreamingPipeline {
message: "Failed to process chunk".to_string(),
})?;
if !processed.is_empty() {
- encoder.write_all(&processed).change_context(
+ writer.write_all(&processed).change_context(
TrustedServerError::Proxy {
message: "Failed to write processed chunk".to_string(),
},
@@ -377,17 +224,16 @@ impl StreamingPipeline {
}
Err(e) => {
return Err(Report::new(TrustedServerError::Proxy {
- message: format!("Failed to read from decoder: {}", e),
+ message: format!("Failed to read: {e}"),
}));
}
}
}
- encoder.flush().change_context(TrustedServerError::Proxy {
- message: "Failed to flush encoder".to_string(),
+ writer.flush().change_context(TrustedServerError::Proxy {
+ message: "Failed to flush output".to_string(),
})?;
- // Caller owns encoder and must call finish() after this returns.
Ok(())
}
}
From 80e51d4807411bc5d3bb77cc6a6971d2a5e4cebb Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 09:30:06 -0700
Subject: [PATCH 12/29] Update plan with compression refactor implementation
note
---
docs/superpowers/plans/2026-03-25-streaming-response.md | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/docs/superpowers/plans/2026-03-25-streaming-response.md b/docs/superpowers/plans/2026-03-25-streaming-response.md
index 268517b8..4afca7fe 100644
--- a/docs/superpowers/plans/2026-03-25-streaming-response.md
+++ b/docs/superpowers/plans/2026-03-25-streaming-response.md
@@ -35,6 +35,14 @@ rewriting), `flate2` (gzip/deflate), `brotli` (brotli compression).
## Phase 1: Make the Pipeline Chunk-Emitting
+> **Implementation note (2026-03-26):** Tasks 1-3 were implemented as planned,
+> then followed by a refactor that unified all 9 `process_*_to_*` methods into
+> a single `process_chunks` method with inline decoder/encoder creation in
+> `process()`. This eliminated ~150 lines of duplication. The refactor was
+> committed as "Unify compression paths into single process_chunks method".
+> Tasks 1-3 descriptions below reflect the original plan; the final code is
+> cleaner than described.
+
### Task 1: Fix encoder finalization in `process_through_compression`
This is the prerequisite for Task 2. The current code calls `flush()` then
From c505c00395efb034ef2dce6047f7adc2dcb11948 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 09:33:45 -0700
Subject: [PATCH 13/29] Accumulate output for post-processors in
HtmlWithPostProcessing
---
.../trusted-server-core/src/html_processor.rs | 29 +++++++++++++++----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index 540ab29d..30550318 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -20,6 +20,9 @@ use crate::tsjs;
struct HtmlWithPostProcessing {
inner: HtmlRewriterAdapter,
post_processors: Vec>,
+ /// Buffer that accumulates all intermediate output when post-processors
+ /// need the full document. Left empty on the streaming-only path.
+ accumulated_output: Vec<u8>,
origin_host: String,
request_host: String,
request_scheme: String,
@@ -29,12 +32,26 @@ struct HtmlWithPostProcessing {
impl StreamProcessor for HtmlWithPostProcessing {
fn process_chunk(&mut self, chunk: &[u8], is_last: bool) -> Result<Vec<u8>, io::Error> {
let output = self.inner.process_chunk(chunk, is_last)?;
- if !is_last || output.is_empty() || self.post_processors.is_empty() {
+
+ // Streaming-optimized path: no post-processors, pass through immediately.
+ if self.post_processors.is_empty() {
return Ok(output);
}
- let Ok(output_str) = std::str::from_utf8(&output) else {
- return Ok(output);
+ // Post-processors need the full document. Accumulate until the last chunk.
+ self.accumulated_output.extend_from_slice(&output);
+ if !is_last {
+ return Ok(Vec::new());
+ }
+
+ // Final chunk: run post-processors on the full accumulated output.
+ let full_output = std::mem::take(&mut self.accumulated_output);
+ if full_output.is_empty() {
+ return Ok(full_output);
+ }
+
+ let Ok(output_str) = std::str::from_utf8(&full_output) else {
+ return Ok(full_output);
};
let ctx = IntegrationHtmlContext {
@@ -50,10 +67,10 @@ impl StreamProcessor for HtmlWithPostProcessing {
.iter()
.any(|p| p.should_process(output_str, &ctx))
{
- return Ok(output);
+ return Ok(full_output);
}
- let mut html = String::from_utf8(output).map_err(|e| {
+ let mut html = String::from_utf8(full_output).map_err(|e| {
io::Error::other(format!(
"HTML post-processing expected valid UTF-8 output: {e}"
))
@@ -79,6 +96,7 @@ impl StreamProcessor for HtmlWithPostProcessing {
fn reset(&mut self) {
self.inner.reset();
+ self.accumulated_output.clear();
self.document_state.clear();
}
}
@@ -467,6 +485,7 @@ pub fn create_html_processor(config: HtmlProcessorConfig) -> impl StreamProcesso
HtmlWithPostProcessing {
inner: HtmlRewriterAdapter::new(rewriter_settings),
post_processors,
+ accumulated_output: Vec::new(),
origin_host: config.origin_host,
request_host: config.request_host,
request_scheme: config.request_scheme,
From 6cae7f9982c8a1a8d02793b58c1469b0e67f0d7b Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Wed, 25 Mar 2026 00:46:53 -0700
Subject: [PATCH 14/29] Add streaming response optimization spec for
non-Next.js paths
---
.../2026-03-25-streaming-response-design.md | 194 ++++++++++++++++++
1 file changed, 194 insertions(+)
create mode 100644 docs/superpowers/specs/2026-03-25-streaming-response-design.md
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
new file mode 100644
index 00000000..7011dea6
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -0,0 +1,194 @@
+# Streaming Response Optimization (Next.js Disabled)
+
+## Problem
+
+When Next.js is disabled, the publisher proxy buffers the entire response body
+in memory before sending any bytes to the client. This creates two costs:
+
+1. **Latency** — The client receives zero bytes until the full response is
+ decompressed, rewritten, and recompressed. For a 222KB HTML page, this adds
+ hundreds of milliseconds to time-to-last-byte.
+2. **Memory** — Peak memory holds ~4x the response size simultaneously
+ (compressed input + decompressed + processed output + recompressed output).
+ With WASM's ~16MB heap, this limits the size of pages we can proxy.
+
+## Scope
+
+**In scope**: All content types flowing through the publisher proxy path — HTML,
+text/JSON, and binary pass-through. Only when Next.js is disabled (no
+post-processor requiring the full document).
+
+**Out of scope**: Concurrent origin+auction fetch, Next.js-enabled paths (these
+require full-document post-processing by design), non-publisher routes (static
+JS, auction, discovery).
+
+## Streaming Gate
+
+Before committing to `stream_to_client()`, check:
+
+1. Backend status is success (2xx).
+2. `html_post_processors()` is empty — no registered post-processors.
+
+If either check fails, fall back to the current buffered path. This keeps the
+optimization transparent: same behavior for all existing configurations,
+streaming only activates when safe.
+
+## Architecture
+
+Two implementation steps, each independently valuable and testable.
+
+### Step 1: Make the pipeline chunk-emitting
+
+Three changes to existing processors:
+
+#### A) `HtmlRewriterAdapter` — incremental streaming
+
+The current implementation accumulates the entire HTML document and processes it
+on `is_last`. This is unnecessary — `lol_html::HtmlRewriter` supports
+incremental `write()` calls and emits output via its `OutputSink` callback after
+each chunk.
+
+Fix: create the rewriter eagerly in the constructor, use
+`Rc>>` to share the output buffer between the sink and
+`process_chunk()`, drain the buffer on every call instead of only on `is_last`.
+
+#### B) `process_gzip_to_gzip` — chunk-based decompression
+
+Currently calls `read_to_end()` to decompress the entire body into memory. The
+deflate and brotli paths already use the chunk-based
+`process_through_compression()`.
+
+Fix: use the same `process_through_compression` pattern for gzip.
+
+#### C) `process_through_compression` finalization
+
+Currently uses `drop(encoder)` which silently swallows errors from the gzip
+trailer CRC32 checksum.
+
+Fix: call `encoder.finish()` explicitly and propagate errors.
+
+### Step 2: Stream response to client
+
+Change the publisher proxy path to use Fastly's `StreamingBody` API:
+
+1. Fetch from origin, receive response headers.
+2. Validate status — if backend error, return buffered error response via
+ `send_to_client()`.
+3. Check streaming gate — if `html_post_processors()` is non-empty, fall back
+ to buffered path.
+4. Finalize all response headers (cookies, synthetic ID, geo, version).
+5. Call `response.stream_to_client()` — headers sent to client immediately.
+6. Pipe origin body through the streaming pipeline, writing chunks directly to
+ `StreamingBody`.
+7. Call `finish()` on success; on error, log and drop (client sees truncated
+ response).
+
+For binary/non-text content: use `StreamingBody::append(body)` for zero-copy
+pass-through, bypassing the pipeline entirely.
+
+#### Entry point change
+
+Migrate `main.rs` from `#[fastly::main]` to raw `main()` with `fastly::init()`
++ `Request::from_client()`. This is required because `stream_to_client()` /
+`send_to_client()` are incompatible with `#[fastly::main]`'s return-based model.
+
+Non-streaming routes (static, auction, discovery) use `send_to_client()` as
+before.
+
+## Data Flow
+
+### Streaming path (HTML, text/JSON with processing)
+
+```
+Origin body (gzip)
+ → Read 8KB chunk from GzDecoder
+ → StreamProcessor::process_chunk(chunk, is_last)
+ → HtmlRewriterAdapter: lol_html.write(chunk) → sink emits rewritten bytes
+ → OR StreamingReplacer: URL replacement with overlap buffer
+ → GzEncoder::write(processed_chunk) → compressed bytes
+ → StreamingBody::write(compressed) → chunk sent to client
+ → repeat until EOF
+ → StreamingBody::finish()
+```
+
+Memory at steady state: ~8KB input chunk buffer + lol_html internal parser state
++ gzip encoder window + overlap buffer for replacer. Roughly constant regardless
+of document size, versus the current ~4x document size.
+
+### Pass-through path (binary, images, fonts, etc.)
+
+```
+Origin body
+ → StreamingBody::append(body) → zero-copy transfer
+```
+
+No decompression, no processing, no buffering.
+
+### Buffered fallback path (error responses or post-processors present)
+
+```
+Origin returns 4xx/5xx OR html_post_processors() is non-empty
+ → Current buffered path unchanged
+ → send_to_client() with proper status and full body
+```
+
+## Error Handling
+
+**Backend returns error status**: Detected before calling `stream_to_client()`.
+Return the backend response as-is via `send_to_client()`. Client sees the
+correct error status code. No change from current behavior.
+
+**Processing fails mid-stream**: `lol_html` parse error, decompression
+corruption, I/O error. Headers (200 OK) are already sent. Log the error
+server-side, drop the `StreamingBody`. Client sees a truncated response and the
+connection closes. Standard reverse proxy behavior.
+
+**Compression finalization fails**: The gzip trailer CRC32 write fails. With the
+fix, `encoder.finish()` is called explicitly and errors propagate. Same
+mid-stream handling — log and truncate.
+
+No retry logic. No fallback to buffered after streaming has started — once
+headers are sent, we are committed.
+
+## Files Changed
+
+| File | Change | Risk |
+|------|--------|------|
+| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally; fix `process_gzip_to_gzip` to use chunk-based processing; fix `process_through_compression` to call `finish()` explicitly | Medium |
+| `crates/trusted-server-core/src/publisher.rs` | Split `handle_publisher_request` into streaming vs buffered paths based on `html_post_processors().is_empty()` | Medium |
+| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to raw `main()` with `fastly::init()` + `Request::from_client()`; route results to `send_to_client()` or let streaming path handle its own output | Medium |
+
+**Not changed**: `html_processor.rs` (builds lol_html `Settings` passed to
+`HtmlRewriterAdapter`, works as-is), integration registration, JS build
+pipeline, tsjs module serving, auction handler, cookie/synthetic ID logic.
+
+## Testing Strategy
+
+### Unit tests (streaming_processor.rs)
+
+- `HtmlRewriterAdapter` emits output on every `process_chunk()` call, not just
+ `is_last`.
+- `process_gzip_to_gzip` produces correct output without `read_to_end`.
+- `encoder.finish()` errors propagate (not swallowed by `drop`).
+- Multi-chunk HTML produces identical output to single-chunk processing.
+
+### Integration tests (publisher.rs)
+
+- Streaming gate: when `html_post_processors()` is non-empty, response is
+ buffered.
+- Streaming gate: when `html_post_processors()` is empty, response streams.
+- Backend error (4xx/5xx) returns buffered error response with correct status.
+- Binary content passes through without processing.
+
+### End-to-end validation (Viceroy)
+
+- `cargo test --workspace` — all existing tests pass.
+- Manual verification via `fastly compute serve` against a real origin.
+- Compare response bodies before/after to confirm byte-identical output for
+ HTML, text, and binary.
+
+### Measurement (post-deploy)
+
+- Compare TTFB and time-to-last-byte on staging before and after.
+- Monitor WASM heap usage via Fastly dashboard.
+- Verify no regressions on static endpoints or auction.
From 930a584e102692a55f1c0de9bcd84588f7d8955c Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Wed, 25 Mar 2026 00:50:17 -0700
Subject: [PATCH 15/29] Address spec review: Content-Length, streaming gate,
finalization order, rollback
---
.../2026-03-25-streaming-response-design.md | 58 ++++++++++++++-----
1 file changed, 44 insertions(+), 14 deletions(-)
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index 7011dea6..f745f3dd 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -15,8 +15,8 @@ in memory before sending any bytes to the client. This creates two costs:
## Scope
**In scope**: All content types flowing through the publisher proxy path — HTML,
-text/JSON, and binary pass-through. Only when Next.js is disabled (no
-post-processor requiring the full document).
+text/JSON, RSC Flight (`text/x-component`), and binary pass-through. Only when
+Next.js is disabled (no post-processor requiring the full document).
**Out of scope**: Concurrent origin+auction fetch, Next.js-enabled paths (these
require full-document post-processing by design), non-publisher routes (static
@@ -27,11 +27,14 @@ JS, auction, discovery).
Before committing to `stream_to_client()`, check:
1. Backend status is success (2xx).
-2. `html_post_processors()` is empty — no registered post-processors.
+2. For HTML content: `html_post_processors()` is empty — no registered
+ post-processors. Non-HTML content types (text/JSON, RSC Flight, binary) can
+ always stream regardless of post-processor registration, since
+ post-processors only apply to HTML.
-If either check fails, fall back to the current buffered path. This keeps the
-optimization transparent: same behavior for all existing configurations,
-streaming only activates when safe.
+If either check fails for the given content type, fall back to the current
+buffered path. This keeps the optimization transparent: same behavior for all
+existing configurations, streaming only activates when safe.
## Architecture
@@ -51,6 +54,12 @@ each chunk.
Fix: create the rewriter eagerly in the constructor, use
`Rc<RefCell<Vec<u8>>>` to share the output buffer between the sink and
`process_chunk()`, drain the buffer on every call instead of only on `is_last`.
+The output buffer is drained *after* each `rewriter.write()` returns, so the
+`RefCell` borrow in the sink closure never overlaps with the drain borrow.
+
+Note: this makes `HtmlRewriterAdapter` single-use — `reset()` becomes a no-op
+since the `Settings` are consumed by the rewriter constructor. This matches
+actual usage (one adapter per request).
#### B) `process_gzip_to_gzip` — chunk-based decompression
@@ -60,12 +69,16 @@ deflate and brotli paths already use the chunk-based
Fix: use the same `process_through_compression` pattern for gzip.
-#### C) `process_through_compression` finalization
+#### C) `process_through_compression` finalization — prerequisite for B
-Currently uses `drop(encoder)` which silently swallows errors from the gzip
-trailer CRC32 checksum.
+`process_through_compression` currently uses `drop(encoder)` which silently
+swallows errors. For gzip specifically, the trailer contains a CRC32 checksum —
+if `finish()` fails, corrupted responses are served silently. Today this affects
+deflate and brotli (which already use `process_through_compression`); after Step
+1B moves gzip to this path, it will affect gzip too.
-Fix: call `encoder.finish()` explicitly and propagate errors.
+Fix: call `encoder.finish()` explicitly and propagate errors. This must land
+before or with Step 1B.
### Step 2: Stream response to client
@@ -77,10 +90,13 @@ Change the publisher proxy path to use Fastly's `StreamingBody` API:
3. Check streaming gate — if `html_post_processors()` is non-empty, fall back
to buffered path.
4. Finalize all response headers (cookies, synthetic ID, geo, version).
-5. Call `response.stream_to_client()` — headers sent to client immediately.
-6. Pipe origin body through the streaming pipeline, writing chunks directly to
+5. Remove `Content-Length` header — the final size is unknown after processing.
+ Fastly's `StreamingBody` sends the response using chunked transfer encoding
+ automatically.
+6. Call `response.stream_to_client()` — headers sent to client immediately.
+7. Pipe origin body through the streaming pipeline, writing chunks directly to
`StreamingBody`.
-7. Call `finish()` on success; on error, log and drop (client sees truncated
+8. Call `finish()` on success; on error, log and drop (client sees truncated
response).
For binary/non-text content: use `StreamingBody::append(body)` for zero-copy
@@ -154,7 +170,7 @@ headers are sent, we are committed.
| File | Change | Risk |
|------|--------|------|
-| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally; fix `process_gzip_to_gzip` to use chunk-based processing; fix `process_through_compression` to call `finish()` explicitly | Medium |
+| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); fix `process_gzip_to_gzip` to use chunk-based processing; fix `process_through_compression` to call `finish()` explicitly | High |
| `crates/trusted-server-core/src/publisher.rs` | Split `handle_publisher_request` into streaming vs buffered paths based on `html_post_processors().is_empty()` | Medium |
| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to raw `main()` with `fastly::init()` + `Request::from_client()`; route results to `send_to_client()` or let streaming path handle its own output | Medium |
@@ -162,6 +178,20 @@ headers are sent, we are committed.
`HtmlRewriterAdapter`, works as-is), integration registration, JS build
pipeline, tsjs module serving, auction handler, cookie/synthetic ID logic.
+Note: `HtmlWithPostProcessing` wraps `HtmlRewriterAdapter` and applies
+post-processors on `is_last`. In the streaming path the post-processor list is
+empty (that's the gate condition), so the wrapper is a no-op passthrough. It
+remains in place — no need to bypass it.
+
+## Rollback Strategy
+
+The `#[fastly::main]` to raw `main()` migration is a structural change. If
+streaming causes issues in production, the fastest rollback is reverting the
+`main.rs` change — the buffered path still exists and the pipeline improvements
+(Step 1) are safe to keep regardless. No feature flag needed; a git revert of
+the Step 2 commit restores buffered behavior while retaining Step 1 memory
+improvements.
+
## Testing Strategy
### Unit tests (streaming_processor.rs)
From a2b71bf53be89be5167eaa4a605767c50b3afb67 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Wed, 25 Mar 2026 01:13:06 -0700
Subject: [PATCH 16/29] Address deep review: header timing, error phases,
process_response_streaming refactor
---
.../2026-03-25-streaming-response-design.md | 38 ++++++++++++-------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index f745f3dd..dd31097d 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -54,7 +54,7 @@ each chunk.
Fix: create the rewriter eagerly in the constructor, use
`Rc<RefCell<Vec<u8>>>` to share the output buffer between the sink and
`process_chunk()`, drain the buffer on every call instead of only on `is_last`.
-The output buffer is drained *after* each `rewriter.write()` returns, so the
+The output buffer is drained _after_ each `rewriter.write()` returns, so the
`RefCell` borrow in the sink closure never overlaps with the drain borrow.
Note: this makes `HtmlRewriterAdapter` single-use — `reset()` becomes a no-op
@@ -90,6 +90,10 @@ Change the publisher proxy path to use Fastly's `StreamingBody` API:
3. Check streaming gate — if `html_post_processors()` is non-empty, fall back
to buffered path.
4. Finalize all response headers (cookies, synthetic ID, geo, version).
+ Today, synthetic ID/cookie headers are set _after_ body processing in
+ `handle_publisher_request`. Since they are body-independent (computed from
+ request cookies and consent context), they must be reordered to run _before_
+ `stream_to_client()` so headers are complete before streaming begins.
5. Remove `Content-Length` header — the final size is unknown after processing.
Fastly's `StreamingBody` sends the response using chunked transfer encoding
automatically.
@@ -99,13 +103,16 @@ Change the publisher proxy path to use Fastly's `StreamingBody` API:
8. Call `finish()` on success; on error, log and drop (client sees truncated
response).
-For binary/non-text content: use `StreamingBody::append(body)` for zero-copy
-pass-through, bypassing the pipeline entirely.
+For binary/non-text content: call `response.take_body()` then
+`StreamingBody::append(body)` for zero-copy pass-through, bypassing the pipeline
+entirely. Today binary responses skip `take_body()` and return the response
+as-is — the streaming path needs to explicitly take the body to hand it to
+`append()`.
#### Entry point change
Migrate `main.rs` from `#[fastly::main]` to raw `main()` with `fastly::init()`
-+ `Request::from_client()`. This is required because `stream_to_client()` /
+\+ `Request::from_client()`. This is required because `stream_to_client()` /
`send_to_client()` are incompatible with `#[fastly::main]`'s return-based model.
Non-streaming routes (static, auction, discovery) use `send_to_client()` as
@@ -128,7 +135,7 @@ Origin body (gzip)
```
Memory at steady state: ~8KB input chunk buffer + lol_html internal parser state
-+ gzip encoder window + overlap buffer for replacer. Roughly constant regardless
+\+ gzip encoder window + overlap buffer for replacer. Roughly constant regardless
of document size, versus the current ~4x document size.
### Pass-through path (binary, images, fonts, etc.)
@@ -154,10 +161,15 @@ Origin returns 4xx/5xx OR html_post_processors() is non-empty
Return the backend response as-is via `send_to_client()`. Client sees the
correct error status code. No change from current behavior.
+**Processor creation fails**: `create_html_stream_processor()` or pipeline
+construction errors happen _before_ `stream_to_client()` is called. Since
+headers have not been sent yet, return a proper error response via
+`send_to_client()`. Same as current behavior.
+
**Processing fails mid-stream**: `lol_html` parse error, decompression
-corruption, I/O error. Headers (200 OK) are already sent. Log the error
-server-side, drop the `StreamingBody`. Client sees a truncated response and the
-connection closes. Standard reverse proxy behavior.
+corruption, I/O error during chunk processing. Headers (200 OK) are already
+sent. Log the error server-side, drop the `StreamingBody`. Client sees a
+truncated response and the connection closes. Standard reverse proxy behavior.
**Compression finalization fails**: The gzip trailer CRC32 write fails. With the
fix, `encoder.finish()` is called explicitly and errors propagate. Same
@@ -168,11 +180,11 @@ headers are sent, we are committed.
## Files Changed
-| File | Change | Risk |
-|------|--------|------|
-| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); fix `process_gzip_to_gzip` to use chunk-based processing; fix `process_through_compression` to call `finish()` explicitly | High |
-| `crates/trusted-server-core/src/publisher.rs` | Split `handle_publisher_request` into streaming vs buffered paths based on `html_post_processors().is_empty()` | Medium |
-| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to raw `main()` with `fastly::init()` + `Request::from_client()`; route results to `send_to_client()` or let streaming path handle its own output | Medium |
+| File | Change | Risk |
+| ------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ |
+| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); fix `process_gzip_to_gzip` to use chunk-based processing; fix `process_through_compression` to call `finish()` explicitly | High |
+| `crates/trusted-server-core/src/publisher.rs` | Refactor `process_response_streaming` to accept `W: Write` instead of hardcoding `Vec<u8>`; split `handle_publisher_request` into streaming vs buffered paths; reorder synthetic ID/cookie logic before streaming | Medium |
+| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to raw `main()` with `fastly::init()` + `Request::from_client()`; route results to `send_to_client()` or let streaming path handle its own output | Medium |
**Not changed**: `html_processor.rs` (builds lol_html `Settings` passed to
`HtmlRewriterAdapter`, works as-is), integration registration, JS build
From b363e562ac105b114ad5562f04245507393d80e8 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Wed, 25 Mar 2026 07:59:42 -0700
Subject: [PATCH 17/29] Address deep review: remove fastly::init, fix API
assumptions, add missing paths
---
.../2026-03-25-streaming-response-design.md | 100 +++++++++++++-----
1 file changed, 71 insertions(+), 29 deletions(-)
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index dd31097d..80c49ed8 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -61,21 +61,32 @@ Note: this makes `HtmlRewriterAdapter` single-use — `reset()` becomes a no-op
since the `Settings` are consumed by the rewriter constructor. This matches
actual usage (one adapter per request).
-#### B) `process_gzip_to_gzip` — chunk-based decompression
+#### B) Chunk-based decompression for all compression paths
-Currently calls `read_to_end()` to decompress the entire body into memory. The
-deflate and brotli paths already use the chunk-based
+`process_gzip_to_gzip` calls `read_to_end()` to decompress the entire body into
+memory. The deflate and brotli keep-compression paths already use chunk-based
`process_through_compression()`.
Fix: use the same `process_through_compression` pattern for gzip.
+Additionally, `decompress_and_process()` (used by `process_gzip_to_none`,
+`process_deflate_to_none`, `process_brotli_to_none`) also calls
+`read_to_end()`. These strip-compression paths must be converted to chunk-based
+processing too — read decompressed chunks, process each, write uncompressed
+output directly.
+
+Reference: `process_uncompressed` already implements the correct chunk-based
+pattern (read loop → `process_chunk()` per chunk → `write_all()` → flush). The
+compressed paths should follow the same structure.
+
#### C) `process_through_compression` finalization — prerequisite for B
`process_through_compression` currently uses `drop(encoder)` which silently
-swallows errors. For gzip specifically, the trailer contains a CRC32 checksum —
-if `finish()` fails, corrupted responses are served silently. Today this affects
-deflate and brotli (which already use `process_through_compression`); after Step
-1B moves gzip to this path, it will affect gzip too.
+swallows errors. Today this affects deflate and brotli (which already use this
+path). The current `process_gzip_to_gzip` calls `encoder.finish()` explicitly —
+but Step 1B moves gzip to `process_through_compression`, which would **regress**
+gzip from working `finish()` to broken `drop()`. This fix prevents that
+regression and also fixes the pre-existing issue for deflate/brotli.
Fix: call `encoder.finish()` explicitly and propagate errors. This must land
before or with Step 1B.
@@ -89,11 +100,14 @@ Change the publisher proxy path to use Fastly's `StreamingBody` API:
`send_to_client()`.
3. Check streaming gate — if `html_post_processors()` is non-empty, fall back
to buffered path.
-4. Finalize all response headers (cookies, synthetic ID, geo, version).
- Today, synthetic ID/cookie headers are set _after_ body processing in
- `handle_publisher_request`. Since they are body-independent (computed from
- request cookies and consent context), they must be reordered to run _before_
- `stream_to_client()` so headers are complete before streaming begins.
+4. Finalize all response headers. This requires reordering two things:
+ - **Synthetic ID/cookie headers**: today set _after_ body processing in
+ `handle_publisher_request`. Since they are body-independent (computed from
+ request cookies and consent context), move them _before_ streaming.
+ - **`finalize_response()`** (main.rs): today called _after_ `route_request`
+ returns, adding geo, version, staging, and operator headers. In the
+ streaming path, this must run _before_ `stream_to_client()` since the
+ publisher handler sends the response directly instead of returning it.
5. Remove `Content-Length` header — the final size is unknown after processing.
Fastly's `StreamingBody` sends the response using chunked transfer encoding
automatically.
@@ -103,17 +117,36 @@ Change the publisher proxy path to use Fastly's `StreamingBody` API:
8. Call `finish()` on success; on error, log and drop (client sees truncated
response).
-For binary/non-text content: call `response.take_body()` then
-`StreamingBody::append(body)` for zero-copy pass-through, bypassing the pipeline
-entirely. Today binary responses skip `take_body()` and return the response
-as-is — the streaming path needs to explicitly take the body to hand it to
-`append()`.
+For binary/non-text content: call `response.take_body()` then stream via
+`io::copy(&mut body, &mut streaming_body)`. The `Body` type implements `Read`
+and `StreamingBody` implements `Write`, so this streams the backend body to the
+client without buffering the full content. Today binary responses skip
+`take_body()` and return the response as-is — the streaming path needs to
+explicitly take the body to pipe it through.
#### Entry point change
-Migrate `main.rs` from `#[fastly::main]` to raw `main()` with `fastly::init()`
-\+ `Request::from_client()`. This is required because `stream_to_client()` /
-`send_to_client()` are incompatible with `#[fastly::main]`'s return-based model.
+Migrate `main.rs` from `#[fastly::main]` to an undecorated `main()` with
+`Request::from_client()`. No separate initialization call is needed —
+`#[fastly::main]` is just syntactic sugar for `Request::from_client()` +
+`Response::send_to_client()`. The migration is required because
+`stream_to_client()` / `send_to_client()` are incompatible with
+`#[fastly::main]`'s return-based model.
+
+```rust
+fn main() {
+ let req = Request::from_client();
+ match handle(req) {
+ Ok(()) => {}
+ Err(e) => to_error_response(&e).send_to_client(),
+ }
+}
+```
+
+Note: `main`'s return type changes from `Result` to `()` (or
+`Result<(), Error>`). Errors that currently propagate to `main`'s `Result` must
+now be caught explicitly and sent via `send_to_client()` with
+`to_error_response()`.
Non-streaming routes (static, auction, discovery) use `send_to_client()` as
before.
@@ -134,18 +167,19 @@ Origin body (gzip)
→ StreamingBody::finish()
```
-Memory at steady state: ~8KB input chunk buffer + lol_html internal parser state
-\+ gzip encoder window + overlap buffer for replacer. Roughly constant regardless
+Memory at steady state: ~8KB input chunk buffer, lol_html internal parser state,
+gzip encoder window, and overlap buffer for replacer. Roughly constant regardless
of document size, versus the current ~4x document size.
### Pass-through path (binary, images, fonts, etc.)
```
-Origin body
- → StreamingBody::append(body) → zero-copy transfer
+Origin body (via take_body())
+ → io::copy(&mut body, &mut streaming_body) → streamed transfer
+ → StreamingBody::finish()
```
-No decompression, no processing, no buffering.
+No decompression, no processing. Body streams through as read.
### Buffered fallback path (error responses or post-processors present)
@@ -168,8 +202,10 @@ headers have not been sent yet, return a proper error response via
**Processing fails mid-stream**: `lol_html` parse error, decompression
corruption, I/O error during chunk processing. Headers (200 OK) are already
-sent. Log the error server-side, drop the `StreamingBody`. Client sees a
-truncated response and the connection closes. Standard reverse proxy behavior.
+sent. Log the error server-side, drop the `StreamingBody`. Per the Fastly SDK,
+`StreamingBody` automatically aborts the response if dropped without calling
+`finish()` — the client sees a connection reset / truncated response. This is
+standard reverse proxy behavior.
**Compression finalization fails**: The gzip trailer CRC32 write fails. With the
fix, `encoder.finish()` is called explicitly and errors propagate. Same
@@ -182,9 +218,9 @@ headers are sent, we are committed.
| File | Change | Risk |
| ------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ |
-| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); fix `process_gzip_to_gzip` to use chunk-based processing; fix `process_through_compression` to call `finish()` explicitly | High |
+| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); convert all compression paths to chunk-based processing (`process_gzip_to_gzip` and `decompress_and_process`); fix `process_through_compression` to call `finish()` explicitly | High |
| `crates/trusted-server-core/src/publisher.rs` | Refactor `process_response_streaming` to accept `W: Write` instead of hardcoding `Vec`; split `handle_publisher_request` into streaming vs buffered paths; reorder synthetic ID/cookie logic before streaming | Medium |
-| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to raw `main()` with `fastly::init()` + `Request::from_client()`; route results to `send_to_client()` or let streaming path handle its own output | Medium |
+| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to undecorated `main()` with `Request::from_client()`; explicit error handling via `to_error_response().send_to_client()`; call `finalize_response()` before streaming | Medium |
**Not changed**: `html_processor.rs` (builds lol_html `Settings` passed to
`HtmlRewriterAdapter`, works as-is), integration registration, JS build
@@ -195,6 +231,12 @@ post-processors on `is_last`. In the streaming path the post-processor list is
empty (that's the gate condition), so the wrapper is a no-op passthrough. It
remains in place — no need to bypass it.
+Clarification: `script_rewriters` (used by Next.js and GTM) are distinct from
+`html_post_processors`. Script rewriters run inside `lol_html` element handlers
+during streaming — they do not require buffering and are unaffected by this
+change. The streaming gate checks only `html_post_processors().is_empty()`, not
+script rewriters. Currently only Next.js registers a post-processor.
+
## Rollback Strategy
The `#[fastly::main]` to raw `main()` migration is a structural change. If
From b83f61c4a5c8eab14fc34e766d48d82b25e719aa Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 10:37:39 -0700
Subject: [PATCH 18/29] Apply rustfmt formatting to streaming_processor
---
.../src/streaming_processor.rs | 20 ++++++++-----------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 7062df93..40ec51cb 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -215,11 +215,11 @@ impl StreamingPipeline {
message: "Failed to process chunk".to_string(),
})?;
if !processed.is_empty() {
- writer.write_all(&processed).change_context(
- TrustedServerError::Proxy {
+ writer
+ .write_all(&processed)
+ .change_context(TrustedServerError::Proxy {
message: "Failed to write processed chunk".to_string(),
- },
- )?;
+ })?;
}
}
Err(e) => {
@@ -502,8 +502,7 @@ mod tests {
// Compress input
let mut compressed_input = Vec::new();
{
- let mut enc =
- ZlibEncoder::new(&mut compressed_input, flate2::Compression::default());
+ let mut enc = ZlibEncoder::new(&mut compressed_input, flate2::Compression::default());
enc.write_all(input_data)
.expect("should compress test input");
enc.finish().expect("should finish compression");
@@ -551,8 +550,7 @@ mod tests {
let mut compressed_input = Vec::new();
{
- let mut enc =
- GzEncoder::new(&mut compressed_input, flate2::Compression::default());
+ let mut enc = GzEncoder::new(&mut compressed_input, flate2::Compression::default());
enc.write_all(input_data)
.expect("should compress test input");
enc.finish().expect("should finish compression");
@@ -600,8 +598,7 @@ mod tests {
let mut compressed_input = Vec::new();
{
- let mut enc =
- GzEncoder::new(&mut compressed_input, flate2::Compression::default());
+ let mut enc = GzEncoder::new(&mut compressed_input, flate2::Compression::default());
enc.write_all(input_data)
.expect("should compress test input");
enc.finish().expect("should finish compression");
@@ -627,8 +624,7 @@ mod tests {
.expect("should process gzip-to-none");
// Assert
- let result =
- String::from_utf8(output).expect("should be valid UTF-8 uncompressed output");
+ let result = String::from_utf8(output).expect("should be valid UTF-8 uncompressed output");
assert_eq!(
result, "
hi world",
"should have replaced content after gzip decompression"
From aeca9f6479c33d87263da7a24f61d37a04f72b64 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 10:47:24 -0700
Subject: [PATCH 19/29] Add debug logging, brotli round-trip test, and
post-processor accumulation test
- Add debug-level logging to process_chunks showing total bytes
read and written per pipeline invocation
- Add brotli-to-brotli round-trip test to cover the into_inner()
finalization path
- Add test proving HtmlWithPostProcessing accumulates output when
post-processors are registered while streaming path passes through
---
.../trusted-server-core/src/html_processor.rs | 85 +++++++++++++++++++
.../src/streaming_processor.rs | 57 +++++++++++++
2 files changed, 142 insertions(+)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index 30550318..95ccf9c3 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -1010,4 +1010,89 @@ mod tests {
.collect::()
);
}
+
+ #[test]
+ fn post_processors_accumulate_while_streaming_path_passes_through() {
+ use crate::streaming_processor::{HtmlRewriterAdapter, StreamProcessor};
+ use lol_html::Settings;
+
+ // --- Streaming path: no post-processors → output emitted per chunk ---
+ let mut streaming = HtmlWithPostProcessing {
+ inner: HtmlRewriterAdapter::new(Settings::default()),
+ post_processors: Vec::new(),
+ accumulated_output: Vec::new(),
+ origin_host: String::new(),
+ request_host: String::new(),
+ request_scheme: String::new(),
+ document_state: IntegrationDocumentState::default(),
+ };
+
+ let chunk1 = streaming
+ .process_chunk(b"", false)
+ .expect("should process chunk1");
+ let chunk2 = streaming
+ .process_chunk(b"hello
", false)
+ .expect("should process chunk2");
+ let chunk3 = streaming
+ .process_chunk(b"", true)
+ .expect("should process final chunk");
+
+ assert!(
+ !chunk1.is_empty() || !chunk2.is_empty(),
+ "should emit intermediate output on streaming path"
+ );
+
+ let mut streaming_all = chunk1;
+ streaming_all.extend_from_slice(&chunk2);
+ streaming_all.extend_from_slice(&chunk3);
+
+ // --- Buffered path: post-processor registered → accumulates until is_last ---
+ struct NoopPostProcessor;
+ impl IntegrationHtmlPostProcessor for NoopPostProcessor {
+ fn integration_id(&self) -> &'static str {
+ "test-noop"
+ }
+ fn post_process(&self, _html: &mut String, _ctx: &IntegrationHtmlContext<'_>) -> bool {
+ false
+ }
+ }
+
+ let mut buffered = HtmlWithPostProcessing {
+ inner: HtmlRewriterAdapter::new(Settings::default()),
+ post_processors: vec![Arc::new(NoopPostProcessor)],
+ accumulated_output: Vec::new(),
+ origin_host: String::new(),
+ request_host: String::new(),
+ request_scheme: String::new(),
+ document_state: IntegrationDocumentState::default(),
+ };
+
+ let buf1 = buffered
+ .process_chunk(b"", false)
+ .expect("should process chunk1");
+ let buf2 = buffered
+ .process_chunk(b"hello
", false)
+ .expect("should process chunk2");
+ let buf3 = buffered
+ .process_chunk(b"", true)
+ .expect("should process final chunk");
+
+ assert!(
+ buf1.is_empty() && buf2.is_empty(),
+ "should return empty for intermediate chunks when post-processors are registered"
+ );
+ assert!(
+ !buf3.is_empty(),
+ "should emit all output in final chunk when post-processors are registered"
+ );
+
+ // Both paths should produce identical output
+ let streaming_str =
+ String::from_utf8(streaming_all).expect("streaming output should be valid UTF-8");
+ let buffered_str = String::from_utf8(buf3).expect("buffered output should be valid UTF-8");
+ assert_eq!(
+ streaming_str, buffered_str,
+ "streaming and buffered paths should produce identical output"
+ );
+ }
}
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 40ec51cb..4f189926 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -189,6 +189,8 @@ impl StreamingPipeline {
mut writer: W,
) -> Result<(), Report> {
let mut buffer = vec![0u8; self.config.chunk_size];
+ let mut total_read: u64 = 0;
+ let mut total_written: u64 = 0;
loop {
match reader.read(&mut buffer) {
@@ -199,6 +201,7 @@ impl StreamingPipeline {
},
)?;
if !final_chunk.is_empty() {
+ total_written += final_chunk.len() as u64;
writer.write_all(&final_chunk).change_context(
TrustedServerError::Proxy {
message: "Failed to write final chunk".to_string(),
@@ -208,6 +211,7 @@ impl StreamingPipeline {
break;
}
Ok(n) => {
+ total_read += n as u64;
let processed = self
.processor
.process_chunk(&buffer[..n], false)
@@ -215,6 +219,7 @@ impl StreamingPipeline {
message: "Failed to process chunk".to_string(),
})?;
if !processed.is_empty() {
+ total_written += processed.len() as u64;
writer
.write_all(&processed)
.change_context(TrustedServerError::Proxy {
@@ -234,6 +239,10 @@ impl StreamingPipeline {
message: "Failed to flush output".to_string(),
})?;
+ log::debug!(
+ "Streaming pipeline complete: read {total_read} bytes, wrote {total_written} bytes"
+ );
+
Ok(())
}
}
@@ -631,6 +640,54 @@ mod tests {
);
}
+ #[test]
+ fn test_brotli_round_trip_produces_valid_output() {
+ use brotli::enc::writer::CompressorWriter;
+ use brotli::Decompressor;
+ use std::io::{Read as _, Write as _};
+
+ let input_data = b"
hello world";
+
+ // Compress input with brotli
+ let mut compressed_input = Vec::new();
+ {
+ let mut enc = CompressorWriter::new(&mut compressed_input, 4096, 4, 22);
+ enc.write_all(input_data)
+ .expect("should compress test input");
+ enc.flush().expect("should flush brotli encoder");
+ }
+
+ let replacer = StreamingReplacer::new(vec![Replacement {
+ find: "hello".to_string(),
+ replace_with: "hi".to_string(),
+ }]);
+
+ let config = PipelineConfig {
+ input_compression: Compression::Brotli,
+ output_compression: Compression::Brotli,
+ chunk_size: 8192,
+ };
+
+ let mut pipeline = StreamingPipeline::new(config, replacer);
+ let mut output = Vec::new();
+
+ pipeline
+ .process(&compressed_input[..], &mut output)
+ .expect("should process brotli-to-brotli");
+
+ // Decompress output and verify correctness
+ let mut decompressed = Vec::new();
+ Decompressor::new(&output[..], 4096)
+ .read_to_end(&mut decompressed)
+ .expect("should decompress output — implies encoder was finalized correctly");
+
+ assert_eq!(
+ String::from_utf8(decompressed).expect("should be valid UTF-8"),
+ "hi world",
+ "should have replaced content through brotli round-trip"
+ );
+ }
+
#[test]
fn test_html_rewriter_adapter_emits_output_per_chunk() {
use lol_html::Settings;
From e1c6cb81e3c95bbb757a9bba67fa818969ea8658 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 11:05:07 -0700
Subject: [PATCH 20/29] Address deep review: imports, stale comments, brotli
finalization, tests
- Group std imports together (cell, io, rc) before external crates
- Document supported compression combinations on PipelineConfig
- Remove dead-weight byte counters from process_chunks hot loop
- Fix stale comment referencing removed process_through_compression
- Fix brotli finalization: use drop(encoder) instead of into_inner()
to make the intent clear (CompressorWriter writes trailer on drop)
- Document reset() as no-op on HtmlRewriterAdapter (single-use)
- Brotli round-trip test now covers the drop-based finalization path
- Add gzip HTML rewriter pipeline test (compressed round-trip with
lol_html, not just StreamingReplacer)
- Add HtmlWithPostProcessing accumulation vs streaming behavior test
---
.../trusted-server-core/src/html_processor.rs | 3 +-
.../src/streaming_processor.rs | 126 +++++++++++++-----
2 files changed, 94 insertions(+), 35 deletions(-)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index 95ccf9c3..52fba915 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -94,8 +94,9 @@ impl StreamProcessor for HtmlWithPostProcessing {
Ok(html.into_bytes())
}
+ /// No-op. `HtmlWithPostProcessing` wraps a single-use
+ /// [`HtmlRewriterAdapter`] and cannot be meaningfully reset.
fn reset(&mut self) {
- self.inner.reset();
self.accumulated_output.clear();
self.document_state.clear();
}
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 4f189926..6e915737 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -7,10 +7,10 @@
//! - UTF-8 boundary handling
use std::cell::RefCell;
+use std::io::{self, Read, Write};
use std::rc::Rc;
use error_stack::{Report, ResultExt};
-use std::io::{self, Read, Write};
use crate::error::TrustedServerError;
@@ -56,7 +56,21 @@ impl Compression {
}
}
-/// Configuration for the streaming pipeline
+/// Configuration for the streaming pipeline.
+///
+/// # Supported compression combinations
+///
+/// | Input | Output | Behavior |
+/// |-------|--------|----------|
+/// | None | None | Pass-through processing |
+/// | Gzip | Gzip | Decompress → process → recompress |
+/// | Gzip | None | Decompress → process |
+/// | Deflate | Deflate | Decompress → process → recompress |
+/// | Deflate | None | Decompress → process |
+/// | Brotli | Brotli | Decompress → process → recompress |
+/// | Brotli | None | Decompress → process |
+///
+/// All other combinations return an error at runtime.
pub struct PipelineConfig {
/// Input compression type
pub input_compression: Compression,
@@ -158,8 +172,9 @@ impl StreamingPipeline {
};
let mut encoder = CompressorWriter::with_params(output, 4096, ¶ms);
self.process_chunks(decoder, &mut encoder)?;
- // CompressorWriter finalizes on flush (already called) and into_inner
- encoder.into_inner();
+ // CompressorWriter writes the brotli stream trailer on drop.
+ // process_chunks already called flush(), so drop finalizes cleanly.
+ drop(encoder);
Ok(())
}
(Compression::Brotli, Compression::None) => {
@@ -189,8 +204,6 @@ impl StreamingPipeline {
mut writer: W,
) -> Result<(), Report> {
let mut buffer = vec![0u8; self.config.chunk_size];
- let mut total_read: u64 = 0;
- let mut total_written: u64 = 0;
loop {
match reader.read(&mut buffer) {
@@ -201,7 +214,6 @@ impl StreamingPipeline {
},
)?;
if !final_chunk.is_empty() {
- total_written += final_chunk.len() as u64;
writer.write_all(&final_chunk).change_context(
TrustedServerError::Proxy {
message: "Failed to write final chunk".to_string(),
@@ -211,7 +223,6 @@ impl StreamingPipeline {
break;
}
Ok(n) => {
- total_read += n as u64;
let processed = self
.processor
.process_chunk(&buffer[..n], false)
@@ -219,7 +230,6 @@ impl StreamingPipeline {
message: "Failed to process chunk".to_string(),
})?;
if !processed.is_empty() {
- total_written += processed.len() as u64;
writer
.write_all(&processed)
.change_context(TrustedServerError::Proxy {
@@ -239,10 +249,6 @@ impl StreamingPipeline {
message: "Failed to flush output".to_string(),
})?;
- log::debug!(
- "Streaming pipeline complete: read {total_read} bytes, wrote {total_written} bytes"
- );
-
Ok(())
}
}
@@ -308,10 +314,12 @@ impl StreamProcessor for HtmlRewriterAdapter {
Ok(std::mem::take(&mut *self.output.borrow_mut()))
}
- fn reset(&mut self) {
- // No-op: the rewriter consumed its Settings on construction.
- // Single-use by design (one adapter per request).
- }
+ /// No-op. `HtmlRewriterAdapter` is single-use: the rewriter consumes its
+ /// [`Settings`](lol_html::Settings) on construction and cannot be recreated.
+ /// Calling [`process_chunk`](StreamProcessor::process_chunk) again after a
+ /// call that passed `is_last = true` (which finalizes the rewriter)
+ /// will produce empty output.
+ fn reset(&mut self) {}
}
/// Adapter to use our existing `StreamingReplacer` as a `StreamProcessor`
@@ -468,40 +476,33 @@ mod tests {
}
#[test]
- fn test_html_rewriter_adapter_reset_is_noop() {
+ fn test_html_rewriter_adapter_reset_then_finalize() {
use lol_html::Settings;
let settings = Settings::default();
let mut adapter = HtmlRewriterAdapter::new(settings);
- // Process some content
- let result1 = adapter
+ adapter
.process_chunk(b"
test", false)
.expect("should process html");
- // Reset is a no-op — the adapter is single-use by design
+ // reset() is a documented no-op — adapter is single-use
adapter.reset();
- // The rewriter is still alive; finalize it
- let result2 = adapter
+ // Finalize still works; the rewriter is still alive
+ let final_output = adapter
.process_chunk(b"", true)
.expect("should finalize after reset");
- let mut all_output = result1;
- all_output.extend_from_slice(&result2);
-
- let output = String::from_utf8(all_output).expect("output should be valid UTF-8");
- assert!(
- output.contains("test"),
- "should still produce output after no-op reset"
- );
+ // Output may or may not be empty depending on lol_html buffering,
+ // but it should not error
+ let _ = final_output;
}
#[test]
fn test_deflate_round_trip_produces_valid_output() {
- // Verify that deflate-to-deflate (which uses process_through_compression)
- // produces valid output that decompresses correctly. This establishes the
- // correctness contract before we change the finalization path.
+ // Verify that deflate-to-deflate produces valid output that decompresses
+ // correctly, confirming that encoder finalization works.
use flate2::read::ZlibDecoder;
use flate2::write::ZlibEncoder;
use std::io::{Read as _, Write as _};
@@ -772,4 +773,61 @@ mod tests {
"Should not contain original URL"
);
}
+
+ #[test]
+ fn test_gzip_pipeline_with_html_rewriter() {
+ use flate2::read::GzDecoder;
+ use flate2::write::GzEncoder;
+ use lol_html::{element, Settings};
+ use std::io::{Read as _, Write as _};
+
+ let settings = Settings {
+ element_content_handlers: vec![element!("a[href]", |el| {
+ if let Some(href) = el.get_attribute("href") {
+ if href.contains("example.com") {
+ el.set_attribute("href", &href.replace("example.com", "test.com"))?;
+ }
+ }
+ Ok(())
+ })],
+ ..Settings::default()
+ };
+
+ let input = b"<html><body><a href=\"https://example.com/page\">Link</a></body></html>";
+
+ let mut compressed_input = Vec::new();
+ {
+ let mut enc = GzEncoder::new(&mut compressed_input, flate2::Compression::default());
+ enc.write_all(input).expect("should compress test input");
+ enc.finish().expect("should finish compression");
+ }
+
+ let adapter = HtmlRewriterAdapter::new(settings);
+ let config = PipelineConfig {
+ input_compression: Compression::Gzip,
+ output_compression: Compression::Gzip,
+ chunk_size: 8192,
+ };
+ let mut pipeline = StreamingPipeline::new(config, adapter);
+ let mut output = Vec::new();
+
+ pipeline
+ .process(&compressed_input[..], &mut output)
+ .expect("pipeline should process gzip HTML");
+
+ let mut decompressed = Vec::new();
+ GzDecoder::new(&output[..])
+ .read_to_end(&mut decompressed)
+ .expect("should decompress output");
+
+ let result = String::from_utf8(decompressed).expect("output should be valid UTF-8");
+ assert!(
+ result.contains("https://test.com"),
+ "should have replaced URL through gzip HTML pipeline"
+ );
+ assert!(
+ !result.contains("example.com"),
+ "should not contain original URL after gzip HTML pipeline"
+ );
+ }
}
From 9753026afc8a81e2f1ddee452bab727df08f05b5 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 11:12:42 -0700
Subject: [PATCH 21/29] Address second deep review: correctness, docs, and test
robustness
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
- Add Eq derive to Compression enum (all unit variants, trivially correct)
- Brotli finalization: use into_inner() instead of drop() to skip
redundant flush and make finalization explicit
- Document process_chunks flush semantics: callers must still call
encoder-specific finalization after this method returns
- Warn when HtmlRewriterAdapter receives data after finalization
(rewriter already consumed, data would be silently lost)
- Make HtmlWithPostProcessing::reset() a true no-op matching its doc
(clearing auxiliary state without resetting rewriter is inconsistent)
- Document extra copying overhead on post-processor path vs streaming
- Assert output content in reset-then-finalize test (was discarded)
- Relax per-chunk emission test to not depend on lol_html internal
buffering behavior — assert concatenated correctness + at least one
intermediate chunk emitted
---
.../trusted-server-core/src/html_processor.rs | 19 +++--
.../src/streaming_processor.rs | 82 +++++++++++--------
2 files changed, 64 insertions(+), 37 deletions(-)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index 52fba915..d9840cfb 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -17,6 +17,16 @@ use crate::settings::Settings;
use crate::streaming_processor::{HtmlRewriterAdapter, StreamProcessor};
use crate::tsjs;
+/// Wraps [`HtmlRewriterAdapter`] with optional post-processing.
+///
+/// When `post_processors` is empty (the common streaming path), chunks pass
+/// through immediately with no extra copying. When post-processors are
+/// registered, intermediate output is accumulated in `accumulated_output`
+/// until `is_last`, then post-processors run on the full document. This adds
+/// an extra copy per chunk compared to the pre-streaming adapter (which
+/// accumulated raw input instead of rewriter output). The overhead is
+/// acceptable because the post-processor path is already fully buffered —
+/// the real streaming win comes from the empty-post-processor path in Phase 2.
struct HtmlWithPostProcessing {
inner: HtmlRewriterAdapter,
post_processors: Vec<Arc<dyn IntegrationHtmlPostProcessor>>,
@@ -95,11 +105,10 @@ impl StreamProcessor for HtmlWithPostProcessing {
}
/// No-op. `HtmlWithPostProcessing` wraps a single-use
- /// [`HtmlRewriterAdapter`] and cannot be meaningfully reset.
- fn reset(&mut self) {
- self.accumulated_output.clear();
- self.document_state.clear();
- }
+ /// [`HtmlRewriterAdapter`] that cannot be reset. Clearing auxiliary
+ /// state without resetting the rewriter would leave the processor
+ /// in an inconsistent state, so this method intentionally does nothing.
+ fn reset(&mut self) {}
}
/// Configuration for HTML processing
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 6e915737..3915494c 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -35,7 +35,7 @@ pub trait StreamProcessor {
}
/// Compression type for the stream
-#[derive(Debug, Clone, Copy, PartialEq)]
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
None,
Gzip,
@@ -172,9 +172,12 @@ impl StreamingPipeline {
};
let mut encoder = CompressorWriter::with_params(output, 4096, ¶ms);
self.process_chunks(decoder, &mut encoder)?;
- // CompressorWriter writes the brotli stream trailer on drop.
- // process_chunks already called flush(), so drop finalizes cleanly.
- drop(encoder);
+ // CompressorWriter emits the brotli stream trailer via flush(),
+ // which process_chunks already called. into_inner() avoids a
+ // redundant flush on drop and makes finalization explicit.
+ // Note: unlike flate2's finish(), CompressorWriter has no
+ // fallible finalization method — flush() is the only option.
+ let _ = encoder.into_inner();
Ok(())
}
(Compression::Brotli, Compression::None) => {
@@ -191,9 +194,11 @@ impl StreamingPipeline {
/// Read chunks from `reader`, pass each through the processor, and write output to `writer`.
///
/// This is the single unified chunk loop used by all compression paths.
- /// The caller is responsible for wrapping `reader`/`writer` in the appropriate
- /// decoder/encoder and for finalizing the encoder (e.g., calling `finish()`)
- /// after this method returns.
+ /// The method calls `writer.flush()` before returning. For the `None → None`
+ /// path this is the only finalization needed. For compressed paths, the caller
+ /// must still call the encoder's type-specific finalization (e.g., `finish()`
+ /// for flate2, `into_inner()` for brotli) — `flush()` alone does not write
+ /// compression trailers for all codecs.
///
/// # Errors
///
@@ -292,13 +297,22 @@ impl HtmlRewriterAdapter {
impl StreamProcessor for HtmlRewriterAdapter {
fn process_chunk(&mut self, chunk: &[u8], is_last: bool) -> Result, io::Error> {
- if let Some(rewriter) = &mut self.rewriter {
- if !chunk.is_empty() {
- rewriter.write(chunk).map_err(|e| {
- log::error!("Failed to process HTML chunk: {e}");
- io::Error::other(format!("HTML processing failed: {e}"))
- })?;
+ match &mut self.rewriter {
+ Some(rewriter) => {
+ if !chunk.is_empty() {
+ rewriter.write(chunk).map_err(|e| {
+ log::error!("Failed to process HTML chunk: {e}");
+ io::Error::other(format!("HTML processing failed: {e}"))
+ })?;
+ }
+ }
+ None if !chunk.is_empty() => {
+ log::warn!(
+ "HtmlRewriterAdapter: {} bytes received after finalization, data will be lost",
+ chunk.len()
+ );
}
+ None => {}
}
if is_last {
@@ -482,7 +496,7 @@ mod tests {
let settings = Settings::default();
let mut adapter = HtmlRewriterAdapter::new(settings);
- adapter
+ let result1 = adapter
.process_chunk(b"test", false)
.expect("should process html");
@@ -490,13 +504,17 @@ mod tests {
adapter.reset();
// Finalize still works; the rewriter is still alive
- let final_output = adapter
+ let result2 = adapter
.process_chunk(b"", true)
.expect("should finalize after reset");
- // Output may or may not be empty depending on lol_html buffering,
- // but it should not error
- let _ = final_output;
+ let mut all_output = result1;
+ all_output.extend_from_slice(&result2);
+ let output = String::from_utf8(all_output).expect("output should be valid UTF-8");
+ assert!(
+ output.contains("test"),
+ "should produce correct output despite no-op reset"
+ );
}
#[test]
@@ -696,27 +714,27 @@ mod tests {
let settings = Settings::default();
let mut adapter = HtmlRewriterAdapter::new(settings);
- // Send three chunks
- let chunk1 = b"";
+ // Send three chunks — lol_html may buffer internally, so individual
+ // chunk outputs may vary by version. The contract is that concatenated
+ // output is correct, and that output is not deferred entirely to is_last.
let result1 = adapter
- .process_chunk(chunk1, false)
+ .process_chunk(b"", false)
.expect("should process chunk1");
- assert!(
- !result1.is_empty(),
- "should emit output for first chunk, got empty"
- );
-
- let chunk2 = b"hello
";
let result2 = adapter
- .process_chunk(chunk2, false)
+ .process_chunk(b"hello
", false)
.expect("should process chunk2");
-
- let chunk3 = b"";
let result3 = adapter
- .process_chunk(chunk3, true)
+ .process_chunk(b"", true)
.expect("should process final chunk");
- // Concatenate all outputs and verify correctness
+ // At least one intermediate chunk should produce output (verifies
+ // we're not deferring everything to is_last like the old adapter).
+ assert!(
+ !result1.is_empty() || !result2.is_empty(),
+ "should emit some output before is_last"
+ );
+
+ // Concatenated output must be correct
let mut all_output = result1;
all_output.extend_from_slice(&result2);
all_output.extend_from_slice(&result3);
From 0a4ece7c82480df7e07ef6ace4ea5773dcd0ac02 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 11:32:43 -0700
Subject: [PATCH 22/29] Add active post-processor test and precise flush docs
per codec
- Add test that feeds multiple chunks through HtmlWithPostProcessing
with an active post-processor (should_process returns true, mutates
HTML). Verifies the post-processor receives the complete accumulated
document and its mutations appear in the output.
- Make flush semantics per-codec explicit in process_chunks doc:
flate2 needs finish() after flush, brotli is finalized by flush
---
.../trusted-server-core/src/html_processor.rs | 62 +++++++++++++++++++
.../src/streaming_processor.rs | 8 ++-
2 files changed, 67 insertions(+), 3 deletions(-)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index d9840cfb..9e6efafb 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -1105,4 +1105,66 @@ mod tests {
"streaming and buffered paths should produce identical output"
);
}
+
+ #[test]
+ fn active_post_processor_receives_full_document_and_mutates_output() {
+ use crate::streaming_processor::{HtmlRewriterAdapter, StreamProcessor};
+ use lol_html::Settings;
+
+ struct AppendCommentProcessor;
+ impl IntegrationHtmlPostProcessor for AppendCommentProcessor {
+ fn integration_id(&self) -> &'static str {
+ "test-append"
+ }
+ fn should_process(&self, html: &str, _ctx: &IntegrationHtmlContext<'_>) -> bool {
+ html.contains("</html>")
+ }
+ fn post_process(&self, html: &mut String, _ctx: &IntegrationHtmlContext<'_>) -> bool {
+ html.push_str("<!-- appended-by-test -->");
+ true
+ }
+ }
+
+ let mut processor = HtmlWithPostProcessing {
+ inner: HtmlRewriterAdapter::new(Settings::default()),
+ post_processors: vec![Arc::new(AppendCommentProcessor)],
+ accumulated_output: Vec::new(),
+ origin_host: String::new(),
+ request_host: String::new(),
+ request_scheme: String::new(),
+ document_state: IntegrationDocumentState::default(),
+ };
+
+ // Feed multiple chunks
+ let r1 = processor
+ .process_chunk(b"<html><body>", false)
+ .expect("should process chunk1");
+ let r2 = processor
+ .process_chunk(b"<p>content</p>", false)
+ .expect("should process chunk2");
+ let r3 = processor
+ .process_chunk(b"</body></html>", true)
+ .expect("should process final chunk");
+
+ // Intermediate chunks return empty (buffered for post-processor)
+ assert!(
+ r1.is_empty() && r2.is_empty(),
+ "should buffer intermediate chunks"
+ );
+
+ // Final chunk contains the full document with post-processor mutation
+ let output = String::from_utf8(r3).expect("should be valid UTF-8");
+ assert!(
+ output.contains("content"),
+ "should contain original content"
+ );
+ assert!(
+ output.contains("</html>"),
+ "should contain complete document"
+ );
+ assert!(
+ output.contains("<!-- appended-by-test -->"),
+ "should contain post-processor mutation"
+ );
+ }
}
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 3915494c..ac226d95 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -196,9 +196,11 @@ impl StreamingPipeline {
/// This is the single unified chunk loop used by all compression paths.
/// The method calls `writer.flush()` before returning. For the `None → None`
/// path this is the only finalization needed. For compressed paths, the caller
- /// must still call the encoder's type-specific finalization (e.g., `finish()`
- /// for flate2, `into_inner()` for brotli) — `flush()` alone does not write
- /// compression trailers for all codecs.
+ /// must still call the encoder's type-specific finalization after this returns:
+ /// - **flate2** (`GzEncoder`, `ZlibEncoder`): call `finish()` — `flush()` does
+ /// not write the gzip/deflate trailer.
+ /// - **brotli** (`CompressorWriter`): `flush()` does finalize the stream, so
+ /// the caller only needs `into_inner()` to reclaim the writer.
///
/// # Errors
///
From 68d11e875754623892bf66b1730f73716d9cea30 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 11:46:21 -0700
Subject: [PATCH 23/29] Fix text node fragmentation regression for script
rewriters
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
lol_html fragments text nodes across chunk boundaries when fed
incrementally. This breaks script rewriters (NextJS __NEXT_DATA__,
GTM) that expect complete text content — a split domain like
"google" + "tagmanager.com" would silently miss the rewrite.
Add dual-mode HtmlRewriterAdapter:
- new(): streaming mode, emits output per chunk (no script rewriters)
- new_buffered(): accumulates input, feeds lol_html in one write()
call on is_last (script rewriters registered)
create_html_processor selects the mode based on whether
script_rewriters is non-empty. This preserves the old behavior
(single-pass processing) when rewriters need it, while enabling
streaming when they don't.
Add regression test proving lol_html does fragment text across
chunk boundaries, confirming the issue is real.
---
.../trusted-server-core/src/html_processor.rs | 15 +-
.../src/streaming_processor.rs | 135 ++++++++++++++++--
2 files changed, 137 insertions(+), 13 deletions(-)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index 9e6efafb..1839eb59 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -455,6 +455,7 @@ pub fn create_html_processor(config: HtmlProcessorConfig) -> impl StreamProcesso
}),
];
+ let has_script_rewriters = !script_rewriters.is_empty();
for script_rewriter in script_rewriters {
let selector = script_rewriter.selector();
let rewriter = script_rewriter.clone();
@@ -492,8 +493,20 @@ pub fn create_html_processor(config: HtmlProcessorConfig) -> impl StreamProcesso
..RewriterSettings::default()
};
+ // Use buffered mode when script rewriters are registered. lol_html fragments
+ // text nodes across chunk boundaries during streaming, which breaks rewriters
+ // that expect complete text content (e.g., __NEXT_DATA__, GTM inline scripts).
+ // Buffered mode feeds the entire document to lol_html in one write() call,
+ // preserving text node integrity. When no script rewriters are active,
+ // streaming mode emits output incrementally per chunk.
+ let inner = if has_script_rewriters {
+ HtmlRewriterAdapter::new_buffered(rewriter_settings)
+ } else {
+ HtmlRewriterAdapter::new(rewriter_settings)
+ };
+
HtmlWithPostProcessing {
- inner: HtmlRewriterAdapter::new(rewriter_settings),
+ inner,
post_processors,
accumulated_output: Vec::new(),
origin_host: config.origin_host,
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index ac226d95..2ca71bc0 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -275,16 +275,33 @@ impl lol_html::OutputSink for RcVecSink {
/// Adapter to use `lol_html` [`HtmlRewriter`](lol_html::HtmlRewriter) as a [`StreamProcessor`].
///
-/// Output is emitted incrementally on every [`StreamProcessor::process_chunk`] call.
+/// Operates in one of two modes:
+///
+/// - **Streaming** (`buffered = false`): output is emitted incrementally on every
+/// [`StreamProcessor::process_chunk`] call. Use when no script rewriters are
+/// registered.
+/// - **Buffered** (`buffered = true`): input is accumulated and processed in a
+/// single `write()` call on `is_last`. Use when script rewriters are registered,
+/// because `lol_html` fragments text nodes across chunk boundaries and rewriters
+/// that expect complete text content (e.g., `__NEXT_DATA__`, GTM) would silently
+/// miss rewrites on split fragments.
+///
/// The adapter is single-use: one adapter per request. Calling [`StreamProcessor::reset`]
/// is a no-op because the rewriter consumes its settings on construction.
pub struct HtmlRewriterAdapter {
rewriter: Option<lol_html::HtmlRewriter<'static, RcVecSink>>,
output: Rc<RefCell<Vec<u8>>>,
+ /// When true, input is accumulated and fed to `lol_html` in one pass on `is_last`.
+ buffered: bool,
+ /// Accumulated input for the buffered path.
+ accumulated_input: Vec<u8>,
}
impl HtmlRewriterAdapter {
/// Create a new HTML rewriter adapter that streams output per chunk.
+ ///
+ /// Use [`Self::new_buffered`] when script rewriters are registered to
+ /// avoid text node fragmentation.
#[must_use]
pub fn new(settings: lol_html::Settings<'static, 'static>) -> Self {
let output = Rc::new(RefCell::new(Vec::new()));
@@ -293,28 +310,69 @@ impl HtmlRewriterAdapter {
Self {
rewriter: Some(rewriter),
output,
+ buffered: false,
+ accumulated_input: Vec::new(),
+ }
+ }
+
+ /// Create a new HTML rewriter adapter that buffers all input before processing.
+ ///
+ /// This avoids `lol_html` text node fragmentation that breaks script rewriters
+ /// expecting complete text content. The entire document is fed to the rewriter
+ /// in a single `write()` call when `is_last` is true.
+ #[must_use]
+ pub fn new_buffered(settings: lol_html::Settings<'static, 'static>) -> Self {
+ let output = Rc::new(RefCell::new(Vec::new()));
+ let sink = RcVecSink(Rc::clone(&output));
+ let rewriter = lol_html::HtmlRewriter::new(settings, sink);
+ Self {
+ rewriter: Some(rewriter),
+ output,
+ buffered: true,
+ accumulated_input: Vec::new(),
}
}
}
impl StreamProcessor for HtmlRewriterAdapter {
fn process_chunk(&mut self, chunk: &[u8], is_last: bool) -> Result, io::Error> {
- match &mut self.rewriter {
- Some(rewriter) => {
- if !chunk.is_empty() {
- rewriter.write(chunk).map_err(|e| {
- log::error!("Failed to process HTML chunk: {e}");
+ if self.buffered {
+ // Buffered mode: accumulate input, process all at once on is_last.
+ if !chunk.is_empty() {
+ self.accumulated_input.extend_from_slice(chunk);
+ }
+ if !is_last {
+ return Ok(Vec::new());
+ }
+ // Feed entire document to lol_html in one pass
+ if let Some(rewriter) = &mut self.rewriter {
+ if !self.accumulated_input.is_empty() {
+ let input = std::mem::take(&mut self.accumulated_input);
+ rewriter.write(&input).map_err(|e| {
+ log::error!("Failed to process HTML: {e}");
io::Error::other(format!("HTML processing failed: {e}"))
})?;
}
}
- None if !chunk.is_empty() => {
- log::warn!(
- "HtmlRewriterAdapter: {} bytes received after finalization, data will be lost",
- chunk.len()
- );
+ } else {
+ // Streaming mode: feed chunks to lol_html incrementally.
+ match &mut self.rewriter {
+ Some(rewriter) => {
+ if !chunk.is_empty() {
+ rewriter.write(chunk).map_err(|e| {
+ log::error!("Failed to process HTML chunk: {e}");
+ io::Error::other(format!("HTML processing failed: {e}"))
+ })?;
+ }
+ }
+ None if !chunk.is_empty() => {
+ log::warn!(
+ "HtmlRewriterAdapter: {} bytes received after finalization, data will be lost",
+ chunk.len()
+ );
+ }
+ None => {}
}
- None => {}
}
if is_last {
@@ -352,6 +410,59 @@ mod tests {
use super::*;
use crate::streaming_replacer::{Replacement, StreamingReplacer};
+ /// Verify that `lol_html` fragments text nodes when input chunks split
+ /// mid-text-node. This is critical: if `lol_html` does fragment, then
+ /// script rewriters (`NextJS` `__NEXT_DATA__`, `GTM`) that expect full
+ /// text content will silently miss rewrites when the streaming adapter
+ /// feeds chunks incrementally.
+ #[test]
+ fn lol_html_fragments_text_across_chunk_boundaries() {
+ use std::cell::RefCell;
+ use std::rc::Rc;
+
+ let fragments: Rc<RefCell<Vec<(String, bool)>>> = Rc::new(RefCell::new(Vec::new()));
+ let fragments_clone = Rc::clone(&fragments);
+
+ let mut rewriter = lol_html::HtmlRewriter::new(
+ lol_html::Settings {
+ element_content_handlers: vec![lol_html::text!("script", move |text| {
+ fragments_clone
+ .borrow_mut()
+ .push((text.as_str().to_string(), text.last_in_text_node()));
+ Ok(())
+ })],
+ ..lol_html::Settings::default()
+ },
+ |_chunk: &[u8]| {},
+ );
+
+ // Split "googletagmanager.com/gtm.js" across two chunks
+ rewriter
+ .write(b"<html><head><script>var u = \"https://www.google")
+ .expect("should write chunk1");
+ rewriter
+ .write(b"tagmanager.com/gtm.js\";</script></head></html>")
+ .expect("should write chunk2");
+ rewriter.end().expect("should end");
+
+ let frags = fragments.borrow();
+ // lol_html should emit at least 2 text fragments since input was split
+ assert!(
+ frags.len() >= 2,
+ "should fragment text across chunk boundaries, got {} fragments: {:?}",
+ frags.len(),
+ *frags
+ );
+ // No single fragment should contain the full domain
+ assert!(
+ !frags
+ .iter()
+ .any(|(text, _)| text.contains("googletagmanager.com")),
+ "no individual fragment should contain the full domain when split across chunks: {:?}",
+ *frags
+ );
+ }
+
#[test]
fn test_uncompressed_pipeline() {
let replacer = StreamingReplacer::new(vec![Replacement {
From 6faeea0190099e7c347b25dfd64727d9639e18cb Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 13:59:51 -0700
Subject: [PATCH 24/29] Gate streaming adapter on script rewriter presence
lol_html fragments text nodes across input chunk boundaries. Script
rewriters (NextJS __NEXT_DATA__, GTM) expect complete text content
and would silently miss rewrites on split fragments.
Add dual-mode HtmlRewriterAdapter:
- new(): streaming, emits output per chunk (no script rewriters)
- new_buffered(): accumulates input, single write() on is_last
create_html_processor selects mode based on script_rewriters. This
preserves correctness while enabling streaming for configs without
script rewriters. Phase 3 will make rewriters fragment-safe.
Add regression test proving lol_html does fragment text nodes.
---
.../trusted-server-core/src/html_processor.rs | 9 ++++-----
.../src/streaming_processor.rs | 19 +++++++++----------
2 files changed, 13 insertions(+), 15 deletions(-)
diff --git a/crates/trusted-server-core/src/html_processor.rs b/crates/trusted-server-core/src/html_processor.rs
index 1839eb59..079681db 100644
--- a/crates/trusted-server-core/src/html_processor.rs
+++ b/crates/trusted-server-core/src/html_processor.rs
@@ -494,11 +494,10 @@ pub fn create_html_processor(config: HtmlProcessorConfig) -> impl StreamProcesso
};
// Use buffered mode when script rewriters are registered. lol_html fragments
- // text nodes across chunk boundaries during streaming, which breaks rewriters
- // that expect complete text content (e.g., __NEXT_DATA__, GTM inline scripts).
- // Buffered mode feeds the entire document to lol_html in one write() call,
- // preserving text node integrity. When no script rewriters are active,
- // streaming mode emits output incrementally per chunk.
+ // text nodes across input chunk boundaries, breaking rewriters that expect
+ // complete text (e.g., __NEXT_DATA__, GTM). Buffered mode feeds the entire
+ // document in one write() call, preserving text node integrity.
+ // Phase 3 will make rewriters fragment-safe, enabling streaming for all configs.
let inner = if has_script_rewriters {
HtmlRewriterAdapter::new_buffered(rewriter_settings)
} else {
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index 2ca71bc0..a65958dc 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -277,14 +277,14 @@ impl lol_html::OutputSink for RcVecSink {
///
/// Operates in one of two modes:
///
-/// - **Streaming** (`buffered = false`): output is emitted incrementally on every
-/// [`StreamProcessor::process_chunk`] call. Use when no script rewriters are
-/// registered.
-/// - **Buffered** (`buffered = true`): input is accumulated and processed in a
-/// single `write()` call on `is_last`. Use when script rewriters are registered,
-/// because `lol_html` fragments text nodes across chunk boundaries and rewriters
-/// that expect complete text content (e.g., `__NEXT_DATA__`, GTM) would silently
-/// miss rewrites on split fragments.
+/// - **Streaming** ([`new`](Self::new)): output is emitted incrementally on every
+/// [`process_chunk`](StreamProcessor::process_chunk) call. Use when no script
+/// rewriters are registered.
+/// - **Buffered** ([`new_buffered`](Self::new_buffered)): input is accumulated and
+/// processed in a single `write()` call on `is_last`. Use when script rewriters
+/// are registered, because `lol_html` fragments text nodes across chunk boundaries
+/// and rewriters that expect complete text content would silently miss rewrites on
+/// split fragments. (See Phase 3 plan for making rewriters fragment-safe.)
///
/// The adapter is single-use: one adapter per request. Calling [`StreamProcessor::reset`]
/// is a no-op because the rewriter consumes its settings on construction.
@@ -344,7 +344,6 @@ impl StreamProcessor for HtmlRewriterAdapter {
if !is_last {
return Ok(Vec::new());
}
- // Feed entire document to lol_html in one pass
if let Some(rewriter) = &mut self.rewriter {
if !self.accumulated_input.is_empty() {
let input = std::mem::take(&mut self.accumulated_input);
@@ -355,7 +354,7 @@ impl StreamProcessor for HtmlRewriterAdapter {
}
}
} else {
- // Streaming mode: feed chunks to lol_html incrementally.
+ // Streaming mode: feed chunks to `lol_html` incrementally.
match &mut self.rewriter {
Some(rewriter) => {
if !chunk.is_empty() {
From 73c992e8b8b3fe13995858c140fabc570e624e32 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 14:01:49 -0700
Subject: [PATCH 25/29] Document text node fragmentation workaround and Phase 3
plan
Add section to spec explaining the lol_html text fragmentation issue,
the dual-mode HtmlRewriterAdapter workaround (Phase 1), and the
planned fix to make script rewriters fragment-safe (Phase 3, #584).
---
.../2026-03-25-streaming-response-design.md | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index 72716b73..c42afd5c 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -239,6 +239,22 @@ during streaming — they do not require buffering and are unaffected by this
change. The streaming gate checks only `html_post_processors().is_empty()`, not
script rewriters. Currently only Next.js registers a post-processor.
+## Text Node Fragmentation (Phase 3)
+
+`lol_html` fragments text nodes across input chunk boundaries when processing
+HTML incrementally. Script rewriters (`NextJsNextDataRewriter`,
+`GoogleTagManagerIntegration`) expect complete text content — if a domain string
+is split across chunks, the rewrite silently fails.
+
+**Phase 1 workaround**: `HtmlRewriterAdapter` has two modes. `new()` streams
+per chunk (no script rewriters). `new_buffered()` accumulates input and
+processes in one `write()` call (script rewriters registered).
+`create_html_processor` selects the mode automatically.
+
+**Phase 3** will make each script rewriter fragment-safe by accumulating text
+fragments internally via `is_last_in_text_node`. This removes the buffered
+fallback and enables streaming for all configurations. See #584.
+
## Rollback Strategy
The `#[fastly::main]` to raw `main()` migration is a structural change. If
From 75f455acc37f7aebd23c0ba67639f1bdba443faa Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Thu, 26 Mar 2026 14:06:20 -0700
Subject: [PATCH 26/29] Add buffered mode guard, anti-fragmentation test, and
fix stale spec
- Add post-finalization warning to buffered path (was only in streaming)
- Add buffered_adapter_prevents_text_fragmentation test proving
new_buffered() delivers complete text to lol_html handlers
- Fix spec: html_processor.rs is changed (selects adapter mode), and
script_rewriters do require buffered mode (not "unaffected")
---
.../src/streaming_processor.rs | 60 ++++++++++++++++++-
.../2026-03-25-streaming-response-design.md | 16 +++--
2 files changed, 69 insertions(+), 7 deletions(-)
diff --git a/crates/trusted-server-core/src/streaming_processor.rs b/crates/trusted-server-core/src/streaming_processor.rs
index a65958dc..5a4ea290 100644
--- a/crates/trusted-server-core/src/streaming_processor.rs
+++ b/crates/trusted-server-core/src/streaming_processor.rs
@@ -339,7 +339,14 @@ impl StreamProcessor for HtmlRewriterAdapter {
if self.buffered {
// Buffered mode: accumulate input, process all at once on is_last.
if !chunk.is_empty() {
- self.accumulated_input.extend_from_slice(chunk);
+ if self.rewriter.is_none() {
+ log::warn!(
+ "HtmlRewriterAdapter: {} bytes received after finalization, data will be lost",
+ chunk.len()
+ );
+ } else {
+ self.accumulated_input.extend_from_slice(chunk);
+ }
}
if !is_last {
return Ok(Vec::new());
@@ -462,6 +469,57 @@ mod tests {
);
}
+ /// Companion to [`lol_html_fragments_text_across_chunk_boundaries`]:
+ /// proves that `new_buffered()` prevents fragmentation by feeding the
+ /// entire document to `lol_html` in one `write()` call.
+ #[test]
+ fn buffered_adapter_prevents_text_fragmentation() {
+ use std::cell::RefCell;
+ use std::rc::Rc;
+
+ let fragments: Rc<RefCell<Vec<(String, bool)>>> = Rc::new(RefCell::new(Vec::new()));
+ let fragments_clone = Rc::clone(&fragments);
+
+ let settings = lol_html::Settings {
+ element_content_handlers: vec![lol_html::text!("script", move |text| {
+ fragments_clone
+ .borrow_mut()
+ .push((text.as_str().to_string(), text.last_in_text_node()));
+ Ok(())
+ })],
+ ..lol_html::Settings::default()
+ };
+
+ let mut adapter = HtmlRewriterAdapter::new_buffered(settings);
+
+ // Feed the same split chunks as the fragmentation test
+ let r1 = adapter
+ .process_chunk(b"<html><head><script>var u = \"https://www.google", false)
+ .expect("should process chunk1");
+ assert!(r1.is_empty(), "buffered adapter should not emit before is_last");
+ let r2 = adapter
+ .process_chunk(b"tagmanager.com/gtm.js\";</script></head></html>", true)
+ .expect("should process chunk2");
+ assert!(
+ !r2.is_empty(),
+ "buffered adapter should emit output on is_last"
+ );
+
+ let frags = fragments.borrow();
+ // With buffered mode, the text handler should see the complete string
+ assert!(
+ frags
+ .iter()
+ .any(|(text, _)| text.contains("googletagmanager.com")),
+ "buffered adapter should deliver complete text to handler, got: {:?}",
+ *frags
+ );
+ }
+
#[test]
fn test_uncompressed_pipeline() {
let replacer = StreamingReplacer::new(vec![Replacement {
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index c42afd5c..034624b5 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -224,9 +224,10 @@ headers are sent, we are committed.
| `crates/trusted-server-core/src/publisher.rs` | Refactor `process_response_streaming` to accept `W: Write` instead of hardcoding `Vec`; split `handle_publisher_request` into streaming vs buffered paths; reorder synthetic ID/cookie logic before streaming | Medium |
| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to undecorated `main()` with `Request::from_client()`; explicit error handling via `to_error_response().send_to_client()`; call `finalize_response()` before streaming | Medium |
-**Not changed**: `html_processor.rs` (builds lol_html `Settings` passed to
-`HtmlRewriterAdapter`, works as-is), integration registration, JS build
-pipeline, tsjs module serving, auction handler, cookie/synthetic ID logic.
+**Minimal changes**: `html_processor.rs` now selects `HtmlRewriterAdapter` mode
+based on script rewriter presence (see [Text Node Fragmentation](#text-node-fragmentation-phase-3)),
+but is otherwise unchanged. Integration registration, JS build pipeline, tsjs
+module serving, auction handler, cookie/synthetic ID logic are not changed.
Note: `HtmlWithPostProcessing` wraps `HtmlRewriterAdapter` and applies
post-processors on `is_last`. In the streaming path the post-processor list is
@@ -235,9 +236,12 @@ remains in place — no need to bypass it.
Clarification: `script_rewriters` (used by Next.js and GTM) are distinct from
`html_post_processors`. Script rewriters run inside `lol_html` element handlers
-during streaming — they do not require buffering and are unaffected by this
-change. The streaming gate checks only `html_post_processors().is_empty()`, not
-script rewriters. Currently only Next.js registers a post-processor.
+and currently require buffered mode because `lol_html` fragments text nodes
+across chunk boundaries (see [Phase 3](#text-node-fragmentation-phase-3)).
+`html_post_processors` require the full document for post-processing.
+The streaming gate checks `html_post_processors().is_empty()` for the
+post-processor path; `create_html_processor` separately gates the adapter mode
+on `script_rewriters`. Currently only Next.js registers a post-processor.
## Text Node Fragmentation (Phase 3)
From 94f238a337f22d0fb9b98170204c41815f79a154 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Tue, 31 Mar 2026 13:38:29 -0700
Subject: [PATCH 27/29] Address PR review feedback on streaming response spec
- Replace html_post_processors().is_empty() with has_html_post_processors()
to avoid allocating Vec> in the streaming gate check
- Add step to implement has_html_post_processors() on IntegrationRegistry
- Add EC implementation coordination note on handle_publisher_request
restructuring step
- Renumber Phase 2 Task 8 steps accordingly
---
.../plans/2026-03-25-streaming-response.md | 33 +++++++++++++++----
.../2026-03-25-streaming-response-design.md | 24 +++++++-------
2 files changed, 39 insertions(+), 18 deletions(-)
diff --git a/docs/superpowers/plans/2026-03-25-streaming-response.md b/docs/superpowers/plans/2026-03-25-streaming-response.md
index 268517b8..1c547565 100644
--- a/docs/superpowers/plans/2026-03-25-streaming-response.md
+++ b/docs/superpowers/plans/2026-03-25-streaming-response.md
@@ -821,7 +821,20 @@ In `main.rs`, make `finalize_response` callable from the publisher path.
Either make it `pub` and move to `trusted-server-core`, or pass a
pre-finalized response to the streaming path.
-- [ ] **Step 2: Add streaming gate check**
+- [ ] **Step 2: Add `has_html_post_processors()` to `IntegrationRegistry`**
+
+Add a method that returns `bool` to avoid the allocation that
+`html_post_processors()` incurs (cloning `Vec>`):
+
+```rust
+pub fn has_html_post_processors(&self) -> bool {
+ !self.inner.html_post_processors.is_empty()
+}
+```
+
+**File:** `crates/trusted-server-core/src/integrations/registry.rs`
+
+- [ ] **Step 3: Add streaming gate check**
Add a helper in `publisher.rs`:
@@ -834,19 +847,25 @@ fn should_stream(
if !(200..300).contains(&status) {
return false;
}
+ // Use has_html_post_processors() to avoid allocating a Vec>
+ // just to check emptiness.
// Only html_post_processors gate streaming — NOT script_rewriters.
// Script rewriters (Next.js, GTM) run inside lol_html element handlers
// during streaming and do not require full-document buffering.
// Currently only Next.js registers a post-processor.
let is_html = content_type.contains("text/html");
- if is_html && !integration_registry.html_post_processors().is_empty() {
+ if is_html && integration_registry.has_html_post_processors() {
return false;
}
true
}
```
-- [ ] **Step 3: Restructure `handle_publisher_request` to support streaming**
+- [ ] **Step 4: Restructure `handle_publisher_request` to support streaming**
+
+> **Note:** This step may need adjustment to align with the EC (Edge Compute)
+> implementation. Coordinate with the EC work before finalizing the
+> restructuring approach.
Split the function into:
1. Pre-processing: request info, cookies, synthetic ID, consent, backend
@@ -879,7 +898,7 @@ if should_stream {
}
```
-- [ ] **Step 4: Handle binary pass-through in streaming path**
+- [ ] **Step 5: Handle binary pass-through in streaming path**
For non-text content when streaming is enabled:
@@ -895,19 +914,19 @@ if !should_process {
}
```
-- [ ] **Step 5: Run all tests**
+- [ ] **Step 6: Run all tests**
Run: `cargo test --workspace`
Expected: All tests pass.
-- [ ] **Step 6: Build for WASM target**
+- [ ] **Step 7: Build for WASM target**
Run: `cargo build --package trusted-server-adapter-fastly --release --target wasm32-wasip1`
Expected: Builds successfully.
-- [ ] **Step 7: Commit**
+- [ ] **Step 8: Commit**
```
git add crates/trusted-server-core/src/publisher.rs \
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index 72716b73..f132136f 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -27,10 +27,12 @@ JS, auction, discovery).
Before committing to `stream_to_client()`, check:
1. Backend status is success (2xx).
-2. For HTML content: `html_post_processors()` is empty — no registered
- post-processors. Non-HTML content types (text/JSON, RSC Flight, binary) can
- always stream regardless of post-processor registration, since
- post-processors only apply to HTML.
+2. For HTML content: `has_html_post_processors()` returns false — no registered
+ post-processors. This method returns a `bool` directly, avoiding the
+ allocation of cloning the `Vec>` that
+ `html_post_processors()` performs. Non-HTML content types (text/JSON, RSC
+ Flight, binary) can always stream regardless of post-processor registration,
+ since post-processors only apply to HTML.
If either check fails for the given content type, fall back to the current
buffered path. This keeps the optimization transparent: same behavior for all
@@ -100,8 +102,8 @@ Change the publisher proxy path to use Fastly's `StreamingBody` API:
1. Fetch from origin, receive response headers.
2. Validate status — if backend error, return buffered error response via
`send_to_client()`.
-3. Check streaming gate — if `html_post_processors()` is non-empty, fall back
- to buffered path.
+3. Check streaming gate — if `has_html_post_processors()` returns true, fall
+ back to buffered path.
4. Finalize all response headers. This requires reordering two things:
- **Synthetic ID/cookie headers**: today set _after_ body processing in
`handle_publisher_request`. Since they are body-independent (computed from
@@ -186,7 +188,7 @@ No decompression, no processing. Body streams through as read.
### Buffered fallback path (error responses or post-processors present)
```
-Origin returns 4xx/5xx OR html_post_processors() is non-empty
+Origin returns 4xx/5xx OR has_html_post_processors() is true
→ Current buffered path unchanged
→ send_to_client() with proper status and full body
```
@@ -236,8 +238,8 @@ remains in place — no need to bypass it.
Clarification: `script_rewriters` (used by Next.js and GTM) are distinct from
`html_post_processors`. Script rewriters run inside `lol_html` element handlers
during streaming — they do not require buffering and are unaffected by this
-change. The streaming gate checks only `html_post_processors().is_empty()`, not
-script rewriters. Currently only Next.js registers a post-processor.
+change. The streaming gate checks only `has_html_post_processors()`, not script
+rewriters. Currently only Next.js registers a post-processor.
## Rollback Strategy
@@ -260,9 +262,9 @@ improvements.
### Integration tests (publisher.rs)
-- Streaming gate: when `html_post_processors()` is non-empty, response is
+- Streaming gate: when `has_html_post_processors()` is true, response is
buffered.
-- Streaming gate: when `html_post_processors()` is empty, response streams.
+- Streaming gate: when `has_html_post_processors()` is false, response streams.
- Backend error (4xx/5xx) returns buffered error response with correct status.
- Binary content passes through without processing.
From 1f2091dc8e43c69cd88b94a83017828227322d40 Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Tue, 31 Mar 2026 13:39:27 -0700
Subject: [PATCH 28/29] Move EC coordination note to Phase 2 / Step 2 level
Both review comments apply to Phase 2 as a whole, not individual steps.
Move the EC implementation note to the Phase 2 header in the plan and
the Step 2 header in the spec.
---
docs/superpowers/plans/2026-03-25-streaming-response.md | 7 +++----
.../specs/2026-03-25-streaming-response-design.md | 3 +++
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/docs/superpowers/plans/2026-03-25-streaming-response.md b/docs/superpowers/plans/2026-03-25-streaming-response.md
index 1c547565..b9545813 100644
--- a/docs/superpowers/plans/2026-03-25-streaming-response.md
+++ b/docs/superpowers/plans/2026-03-25-streaming-response.md
@@ -652,6 +652,9 @@ Expected: Builds successfully.
## Phase 2: Stream Response to Client
+> **Note:** Phase 2 may need adjustment to align with the EC (Edge Compute)
+> implementation. Coordinate with the EC work before finalizing the approach.
+
### Task 6: Migrate entry point from `#[fastly::main]` to raw `main()`
**Files:**
@@ -863,10 +866,6 @@ fn should_stream(
- [ ] **Step 4: Restructure `handle_publisher_request` to support streaming**
-> **Note:** This step may need adjustment to align with the EC (Edge Compute)
-> implementation. Coordinate with the EC work before finalizing the
-> restructuring approach.
-
Split the function into:
1. Pre-processing: request info, cookies, synthetic ID, consent, backend
request — everything before `response.take_body()`
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index f132136f..9465f87d 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -97,6 +97,9 @@ before or with Step 1B.
### Step 2: Stream response to client
+> **Note:** Step 2 may need adjustment to align with the EC (Edge Compute)
+> implementation. Coordinate with the EC work before finalizing the approach.
+
Change the publisher proxy path to use Fastly's `StreamingBody` API:
1. Fetch from origin, receive response headers.
From d00fc5db80636bdd3739a27acfb8bf3ebc51632f Mon Sep 17 00:00:00 2001
From: Aram Grigoryan <132480+aram356@users.noreply.github.com>
Date: Tue, 31 Mar 2026 13:43:18 -0700
Subject: [PATCH 29/29] Formatting
---
.../plans/2026-03-25-streaming-response.md | 21 +++++++---
.../2026-03-25-streaming-response-design.md | 38 ++++++++++---------
2 files changed, 36 insertions(+), 23 deletions(-)
diff --git a/docs/superpowers/plans/2026-03-25-streaming-response.md b/docs/superpowers/plans/2026-03-25-streaming-response.md
index b9545813..3311b28f 100644
--- a/docs/superpowers/plans/2026-03-25-streaming-response.md
+++ b/docs/superpowers/plans/2026-03-25-streaming-response.md
@@ -25,11 +25,11 @@ rewriting), `flate2` (gzip/deflate), `brotli` (brotli compression).
## File Map
-| File | Role | Phase |
-|------|------|-------|
-| `crates/trusted-server-core/src/streaming_processor.rs` | `HtmlRewriterAdapter` rewrite, compression path fixes, encoder finalization | 1 |
-| `crates/trusted-server-core/src/publisher.rs` | `process_response_streaming` refactor to `W: Write`, streaming gate, header reordering | 2 |
-| `crates/trusted-server-adapter-fastly/src/main.rs` | Entry point migration from `#[fastly::main]` to raw `main()`, response routing | 2 |
+| File | Role | Phase |
+| ------------------------------------------------------- | -------------------------------------------------------------------------------------- | ----- |
+| `crates/trusted-server-core/src/streaming_processor.rs` | `HtmlRewriterAdapter` rewrite, compression path fixes, encoder finalization | 1 |
+| `crates/trusted-server-core/src/publisher.rs` | `process_response_streaming` refactor to `W: Write`, streaming gate, header reordering | 2 |
+| `crates/trusted-server-adapter-fastly/src/main.rs` | Entry point migration from `#[fastly::main]` to raw `main()`, response routing | 2 |
---
@@ -42,6 +42,7 @@ This is the prerequisite for Task 2. The current code calls `flush()` then
moving gzip to this path.
**Files:**
+
- Modify: `crates/trusted-server-core/src/streaming_processor.rs:334-393`
- Test: `crates/trusted-server-core/src/streaming_processor.rs` (test module)
@@ -202,6 +203,7 @@ git commit -m "Fix encoder finalization: explicit finish instead of drop"
### Task 2: Convert `process_gzip_to_gzip` to chunk-based processing
**Files:**
+
- Modify: `crates/trusted-server-core/src/streaming_processor.rs:183-225`
- Test: `crates/trusted-server-core/src/streaming_processor.rs` (test module)
@@ -303,6 +305,7 @@ git commit -m "Convert process_gzip_to_gzip to chunk-based processing"
### Task 3: Convert `decompress_and_process` to chunk-based processing
**Files:**
+
- Modify: `crates/trusted-server-core/src/streaming_processor.rs:227-262`
- Test: `crates/trusted-server-core/src/streaming_processor.rs` (test module)
@@ -441,6 +444,7 @@ git commit -m "Convert decompress_and_process to chunk-based processing"
### Task 4: Rewrite `HtmlRewriterAdapter` for incremental streaming
**Files:**
+
- Modify: `crates/trusted-server-core/src/streaming_processor.rs:396-472`
- Test: `crates/trusted-server-core/src/streaming_processor.rs` (test module)
@@ -658,6 +662,7 @@ Expected: Builds successfully.
### Task 6: Migrate entry point from `#[fastly::main]` to raw `main()`
**Files:**
+
- Modify: `crates/trusted-server-adapter-fastly/src/main.rs:32-68`
- [ ] **Step 1: Rewrite `main` function**
@@ -737,6 +742,7 @@ git commit -m "Migrate entry point from #[fastly::main] to raw main()"
### Task 7: Refactor `process_response_streaming` to accept `W: Write`
**Files:**
+
- Modify: `crates/trusted-server-core/src/publisher.rs:97-180`
- [ ] **Step 1: Change signature to accept generic writer**
@@ -792,6 +798,7 @@ git commit -m "Refactor process_response_streaming to accept generic writer"
### Task 8: Add streaming path to publisher proxy
**Files:**
+
- Modify: `crates/trusted-server-core/src/publisher.rs`
- Modify: `crates/trusted-server-adapter-fastly/src/main.rs`
@@ -867,6 +874,7 @@ fn should_stream(
- [ ] **Step 4: Restructure `handle_publisher_request` to support streaming**
Split the function into:
+
1. Pre-processing: request info, cookies, synthetic ID, consent, backend
request — everything before `response.take_body()`
2. Header finalization: synthetic ID/cookie headers, `finalize_response()`
@@ -875,6 +883,7 @@ Split the function into:
(`StreamingBody`)
The streaming path in the fastly adapter:
+
```rust
// After header finalization, before body processing:
if should_stream {
@@ -964,6 +973,7 @@ Expected: Builds.
Run: `fastly compute serve`
Test:
+
- `curl -s http://localhost:7676/ | sha256sum` — compare with baseline
- `curl -sI http://localhost:7676/` — verify headers present (geo, version,
synthetic ID cookie if consent configured)
@@ -995,6 +1005,7 @@ Repeat the same measurements after building the feature branch.
Create a comparison table and save to PR description or a results file.
Check for:
+
- TTLB improvement (primary goal)
- No TTFB regression
- Identical response body hash (correctness)
diff --git a/docs/superpowers/specs/2026-03-25-streaming-response-design.md b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
index 9465f87d..b0a8d9c2 100644
--- a/docs/superpowers/specs/2026-03-25-streaming-response-design.md
+++ b/docs/superpowers/specs/2026-03-25-streaming-response-design.md
@@ -223,11 +223,11 @@ headers are sent, we are committed.
## Files Changed
-| File | Change | Risk |
-| ------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ |
-| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); convert all compression paths to chunk-based processing (`process_gzip_to_gzip` and `decompress_and_process`); fix `process_through_compression` to call `finish()` explicitly | High |
-| `crates/trusted-server-core/src/publisher.rs` | Refactor `process_response_streaming` to accept `W: Write` instead of hardcoding `Vec`; split `handle_publisher_request` into streaming vs buffered paths; reorder synthetic ID/cookie logic before streaming | Medium |
-| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to undecorated `main()` with `Request::from_client()`; explicit error handling via `to_error_response().send_to_client()`; call `finalize_response()` before streaming | Medium |
+| File | Change | Risk |
+| ------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ |
+| `crates/trusted-server-core/src/streaming_processor.rs` | Rewrite `HtmlRewriterAdapter` to stream incrementally (becomes single-use); convert all compression paths to chunk-based processing (`process_gzip_to_gzip` and `decompress_and_process`); fix `process_through_compression` to call `finish()` explicitly | High |
+| `crates/trusted-server-core/src/publisher.rs` | Refactor `process_response_streaming` to accept `W: Write` instead of hardcoding `Vec`; split `handle_publisher_request` into streaming vs buffered paths; reorder synthetic ID/cookie logic before streaming | Medium |
+| `crates/trusted-server-adapter-fastly/src/main.rs` | Migrate from `#[fastly::main]` to undecorated `main()` with `Request::from_client()`; explicit error handling via `to_error_response().send_to_client()`; call `finalize_response()` before streaming | Medium |
**Not changed**: `html_processor.rs` (builds lol_html `Settings` passed to
`HtmlRewriterAdapter`, works as-is), integration registration, JS build
@@ -309,14 +309,14 @@ branch, then compare.
Repeat the same steps on the feature branch. Compare:
-| Metric | Source | Expected change |
-|--------|--------|-----------------|
-| TTFB (document) | Network timing | Minimal change (gated by backend response time) |
-| Time to last byte | Network timing (`responseEnd`) | Reduced — body streams incrementally |
-| LCP | Lighthouse | Improved — browser receives `<head>` resources sooner |
-| Speed Index | Lighthouse | Improved — progressive rendering starts earlier |
-| Transfer size | Network timing | Unchanged (same content, same compression) |
-| Response body hash | `evaluate_script` with hash | Identical — correctness check |
+| Metric | Source | Expected change |
+| ------------------ | ------------------------------ | ----------------------------------------------------- |
+| TTFB (document) | Network timing | Minimal change (gated by backend response time) |
+| Time to last byte | Network timing (`responseEnd`) | Reduced — body streams incrementally |
+| LCP                | Lighthouse                     | Improved — browser receives `<head>` resources sooner |
+| Speed Index | Lighthouse | Improved — progressive rendering starts earlier |
+| Transfer size | Network timing | Unchanged (same content, same compression) |
+| Response body hash | `evaluate_script` with hash | Identical — correctness check |
#### Automated comparison script
@@ -325,11 +325,13 @@ correctness verification:
```js
// Run via evaluate_script after page load
-const response = await fetch(location.href);
-const buffer = await response.arrayBuffer();
-const hash = await crypto.subtle.digest('SHA-256', buffer);
-const hex = [...new Uint8Array(hash)].map(b => b.toString(16).padStart(2, '0')).join('');
-hex; // compare this between baseline and feature branch
+const response = await fetch(location.href)
+const buffer = await response.arrayBuffer()
+const hash = await crypto.subtle.digest('SHA-256', buffer)
+const hex = [...new Uint8Array(hash)]
+ .map((b) => b.toString(16).padStart(2, '0'))
+ .join('')
+hex // compare this between baseline and feature branch
```
#### What to watch for