Skip to content

Phase 2: Stream responses to client via StreamingBody#585

Open
aram356 wants to merge 11 commits intospecs/streaming-response-optimizationfrom
feature/streaming-pipeline-phase2
Open

Phase 2: Stream responses to client via StreamingBody#585
aram356 wants to merge 11 commits intospecs/streaming-response-optimizationfrom
feature/streaming-pipeline-phase2

Conversation

@aram356
Copy link
Copy Markdown
Collaborator

@aram356 aram356 commented Mar 26, 2026

Summary

Stream HTTP responses directly to the client via Fastly's StreamingBody API when Next.js is disabled. This eliminates full-body buffering on the proxy path, reducing peak memory from ~4x response size to constant and improving TTFB.

Closes #573, closes #574, closes #575, closes #576.
Part of epic #563. Depends on Phase 1 (#583).

What changed

Entry point migration (main.rs):

  • Replaced #[fastly::main] with undecorated main() using Request::from_client()
  • route_request returns Option<Response>None when the streaming path already sent the response via stream_to_client()
  • All non-streaming routes use explicit send_to_client()

process_response_streaming now generic over W: Write (publisher.rs):

  • Changed from returning Body (internal Vec<u8>) to writing into &mut W
  • Enables passing StreamingBody directly as the output sink
  • Deduplicated PipelineConfig creation and content-encoding extraction

Streaming path via PublisherResponse enum (publisher.rs):

  • handle_publisher_request returns PublisherResponse::Stream or PublisherResponse::Buffered
  • Streaming gate: should_process && !request_host.is_empty() && (!is_html || !has_post_processors)
  • Synthetic ID / cookie headers set before body processing (body-independent)
  • Content-Length removed before streaming (chunked transfer)
  • stream_publisher_body() public API bridges core ↔ adapter
  • Error handling: pre-stream errors → send_to_client() with status; mid-stream → log and drop(streaming_body) (abort)

Files changed

File Lines What
main.rs +56 -15 Entry point migration, streaming dispatch
publisher.rs +185 -120 PublisherResponse enum, W: Write refactor, streaming gate

Task 10 (Chrome DevTools metrics) — deferred

Requires a running publisher origin to measure TTFB/TTLB. Local dev uses localhost:9090 which has no mock server. Metrics capture deferred to staging deployment. See #577.

Verification

  • cargo test --workspace — 754 passed, 0 failed
  • cargo clippy --workspace --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean
  • npx vitest run — 282 passed
  • cargo build --release --target wasm32-wasip1 — success

Test plan

aram356 added 5 commits March 26, 2026 14:29
Replace #[fastly::main] with an undecorated main() that calls
Request::from_client() and explicitly sends responses via
send_to_client(). This is required for Phase 2's stream_to_client()
support — #[fastly::main] auto-calls send_to_client() on the
returned Response, which is incompatible with streaming.

The program still compiles to wasm32-wasip1 and runs on Fastly
Compute — #[fastly::main] was just syntactic sugar.

Also simplifies route_request to return Response directly instead
of Result<Response, Error>, since it already converts all errors
to HTTP responses internally.
Change signature from returning Body (with internal Vec<u8>) to
writing into a generic &mut W: Write parameter. This enables
Task 8 to pass StreamingBody directly as the output sink.

The call site in handle_publisher_request passes &mut Vec<u8>
for now, preserving the buffered behavior until the streaming
path is wired up.
Split handle_publisher_request into streaming and buffered paths
based on the streaming gate:
- Streaming: 2xx + processable content + no HTML post-processors
- Buffered: post-processors registered (Next.js) or non-processable

Streaming path returns PublisherResponse::Stream with the origin
body and processing params. The adapter calls finalize_response()
to set all headers, then stream_to_client() to commit them, and
pipes the body through stream_publisher_body() into StreamingBody.

Synthetic ID/cookie headers are set before body processing (they
are body-independent), so they are included in the streamed headers.

Mid-stream errors log and drop the StreamingBody — client sees a
truncated response, standard proxy behavior.
- Replace streaming_body.finish().expect() with log::error on failure
  (expect panics in WASM, and headers are already committed anyway)
- Restore explanatory comments for cookie parsing, SSC capture,
  synthetic ID generation, and consent extraction ordering
Hoist the non-processable early return above the streaming gate so
content_encoding extraction happens once. The streaming gate condition
is also simplified since should_process and request_host are already
guaranteed at that point.
Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Clean Phase 2 implementation. The PublisherResponse enum correctly models the streaming/buffered decision at the type level, the entry point migration is well-motivated, and the streaming error stratification handles all three failure tiers (pre-stream, mid-stream, finish) correctly. CI is all green.

Non-blocking

🤔 thinking

  • test_content_type_detection tests shadow logic (publisher.rs:487) — the test reimplements should_process inline rather than calling production code. If the logic in handle_publisher_request changes, this test won't catch it. It also checks text/css and text/javascript as separate conditions, but the production code uses content_type.contains("text/") — currently equivalent but coupled by coincidence. Consider extracting is_processable_content_type(ct: &str) -> bool and testing that directly.

🌱 seedling / ⛏ nitpick

See inline comments.

👍 praise

See inline comments.

CI Status

  • integration tests: PASS
  • browser integration tests: PASS
  • prepare integration artifacts: PASS

Copy link
Copy Markdown
Collaborator

@ChristianPavilonis ChristianPavilonis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Clean, well-scoped Phase 2 implementation (~375 lines, 2 files). The PublisherResponse enum, W: Write generic refactor, and entry-point migration are well-designed. CI is green.

Highlights:

  • PublisherResponse enum models streaming/buffered at the type level — compile-time correctness
  • Entry point migration doc comment explains why — excellent for maintainers
  • Three-tier error stratification (pre-stream → HTTP response, mid-stream → log+abort, finish → log)
  • W: Write generic is the minimal right abstraction
  • PipelineConfig deduplication removes prior triplication

Totals: 1 P1, 4 P2, 3 P3 — No blockers.


Findings not placeable on diff lines (test code unchanged in this PR):

P3: PR description lists "2xx status" gate but code doesn't implement it
The PR description says "The backend returns a 2xx status" as a streaming condition, but this isn't implemented. Either update the description or add the check (see P1 inline comment on the streaming gate).

🤔 P2: Test reimplements predicate differently than production (publisher.rs ~L486-491)
test_content_type_detection uses individual contains("text/html") || contains("text/css") || ... but production uses broad contains("text/"). The test won't catch regressions if the production predicate changes. Consider extracting the predicate into a named function (fn should_process_content_type(ct: &str) -> bool) and testing that directly.

P3: test_content_encoding_detection tests Fastly SDK, not app logic (publisher.rs ~L558-578)
This test creates a Request, sets headers, then reads them back — it tests Fastly SDK header behavior, not the application's content-encoding handling. Consider testing Compression::from_content_encoding directly instead.

P3: UTF-8 tests verify stdlib behavior, not application code (publisher.rs ~L517-549)
test_invalid_utf8_handling and test_utf8_conversion_edge_cases verify String::from_utf8 behavior — standard library invariants. They provide no regression coverage of application code. Consider replacing with tests that exercise actual invalid UTF-8 handling through the pipeline.

@aram356 aram356 changed the base branch from feature/streaming-pipeline-phase1 to specs/streaming-response-optimization March 31, 2026 20:53
@aram356
Copy link
Copy Markdown
Collaborator Author

aram356 commented Mar 31, 2026

Addressing review feedback:

Fixed in this push:

  • OwnedProcessResponseParams fields narrowed to pub(crate)
  • Buffered path now sets Content-Length to output.len() instead of removing it
  • Added has_html_post_processors() -> bool on IntegrationRegistry to avoid Vec<Arc<…>> clone
  • Extracted is_processable_content_type() function and updated test_content_type_detection to test production code directly instead of reimplementing the predicate
  • Fixed stray merge artifact in apply_synthetic_id_headers

Addressed in later phases:

Acknowledged (P3 nitpicks):

  • test_content_encoding_detection tests SDK behavior — valid observation, low risk
  • UTF-8 tests verify stdlib — valid observation, low risk

aram356 added 2 commits March 31, 2026 14:14
- Narrow OwnedProcessResponseParams fields to pub(crate)
- Set Content-Length on buffered responses instead of removing it
- Add has_html_post_processors() to avoid Vec<Arc<...>> allocation
- Extract is_processable_content_type() and test it directly
- Fix stray merge artifact in apply_synthetic_id_headers
…ation' into feature/streaming-pipeline-phase2
Copy link
Copy Markdown
Collaborator

@ChristianPavilonis ChristianPavilonis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Overall this is a well-executed PR with strong test coverage (754 passing), clean architecture (PublisherResponse enum, generic W: Write, process_chunks consolidation), and significant security improvements (synthetic ID validation, proxy allowlist, error propagation from integration builders).

Requesting changes for three issues:

  1. Streaming gate should check HTTP status code — non-2xx responses (404/500 error pages) will be streamed with JS injection
  2. Unsupported Content-Encoding fallback — misbehaving origins returning unsupported encodings produce broken streamed responses
  3. Proxy allowlist docs inaccurate — docs say only redirects are checked, but code also checks initial target

Additional suggestions: fix synthetic ID debug logging inconsistency, update CHANGELOG, and correct the "2xx" doc comment.


Cross-cutting findings (not in diff)

🔧 Proxy allowlist docs only describe redirect enforcement, but code also enforces on initial target

Files: docs/guide/first-party-proxy.md (~line 441), docs/guide/configuration.md (~line 761)

The docs say the allowlist applies when "a proxied request receives an HTTP redirect," but proxy.rs now also checks the initial target host before the first request. Operators reading the docs would think only redirect destinations are filtered.

Suggestion: Update to say: "Both the initial proxy target and any redirect destinations are checked against allowed_domains."

🤔 Synthetic ID logged in debug output

File: crates/trusted-server-core/src/publisher.rs (~line 383-386, not in diff)

log::debug!("Proxy synthetic IDs - trusted: {}, ...") exposes the raw synthetic ID value. The rest of this PR carefully avoids logging raw synthetic IDs (e.g., synthetic.rs changed to log without the value). This log line is inconsistent with that approach.

Suggestion: Remove the synthetic ID value from the log message.

🤔 CHANGELOG missing Phase 2 entries

File: CHANGELOG.md (~line 8-13, not in diff)

The [Unreleased] section only has the synthetic ID validation entry. This PR introduces several user-facing changes that should be documented: streaming response support, proxy.allowed_domains, integration config errors failing startup, placeholder secret policy change, entry point migration.

aram356 added 3 commits April 8, 2026 11:42
Non-2xx responses now stay buffered to prevent committing error
status irreversibly via stream_to_client() and injecting JS into
error pages. Unsupported Content-Encoding values (e.g. zstd from
misbehaving origins) fall back to buffered mode so failures produce
proper error responses instead of truncated streams.

Also removes raw synthetic ID from debug logging for privacy
consistency, fixes std::io::Write import inconsistency, and
corrects misleading "200 OK" comment in streaming error path.
Resolve conflicts between streaming Phase 2 and main's EC rename
(SSC → Edge Cookie) and RuntimeServices/consent route changes:
- Adapt streaming PublisherResponse to use runtime_services_for_consent_route
- Rename extracted apply_synthetic_id_headers → apply_ec_headers
- Use services.kv_store() for KV deletion instead of store name
- Remove stray merge artifact
…ation' into feature/streaming-pipeline-phase2
Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This change wires StreamingBody into the publisher proxy and moves response finalization into explicit send/stream paths. I found two blocking correctness issues in the new streaming gate/fallback logic plus one scope question.

CI Status

  • prepare integration artifacts: PASS
  • integration tests: PASS
  • browser integration tests: PASS
  • local fmt/clippy/rust/js verification: not rerun in this review (the PR description reports clean runs)

let is_html = content_type.contains("text/html");
let has_post_processors = integration_registry.has_html_post_processors();
let encoding_supported = is_supported_content_encoding(&content_encoding);
let can_stream = encoding_supported && (!is_html || !has_post_processors);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 wrench — This gate is still too permissive for HTML. has_html_post_processors() only catches the Next.js path, but google_tag_manager registers a script rewriter without a post-processor and create_html_processor() still switches HTML into new_buffered() whenever any script rewriter is present. In that configuration we commit headers via stream_to_client() even though the body is still fully buffered, so the constant-memory and clean-error guarantees no longer hold.

Fix: Gate HTML streaming on script rewriters as well, or narrow Phase 2 to configurations with neither script rewriters nor post-processors.

integration_registry,
};
let mut output = Vec::new();
process_response_streaming(body, &mut output, &params)?;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 wrench — The unsupported-encoding fallback still routes the body through process_response_streaming(). Because Compression::from_content_encoding() maps unknown encodings to None, a Content-Encoding: zstd HTML/JS/JSON response will be treated as identity-encoded bytes instead of being passed through unchanged or rejected explicitly.

Fix: If the encoding is unsupported, bypass rewriting and return the origin response untouched, or raise an explicit error before calling the rewrite pipeline.

.map(|h| h.to_str().unwrap_or_default())
.unwrap_or_default()
.to_lowercase();
if !should_process || request_host.is_empty() || !is_success {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question — The PR summary/spec says Phase 2 covers binary pass-through, but this early return keeps every !should_process response on the plain buffered path. Is binary streaming intentionally deferred, and if so should the PR description/spec be narrowed to match the implementation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants