Skip to content

[APMSVLS-487] respect Datadog-Client-Computed-Stats header#1245

Draft
lucaspimentel wants to merge 10 commits into
mainfrom
lpimentel/respect-client-computed-stats
Draft

[APMSVLS-487] respect Datadog-Client-Computed-Stats header#1245
lucaspimentel wants to merge 10 commits into
mainfrom
lpimentel/respect-client-computed-stats

Conversation

@lucaspimentel
Copy link
Copy Markdown
Member

@lucaspimentel lucaspimentel commented Jun 3, 2026

Overview

APMSVLS-487

Makes the extension respect the tracer's Datadog-Client-Computed-Stats header and moves _dd.compute_stats from a baked-in function tag to a per-span backend directive.

Background

Span attribute _dd.compute_stats asks the backend to compute trace stats. It must be set to "1" only when nobody else computed them — neither the extension (agent) nor the tracer. Previously the extension:

  1. Baked _dd.compute_stats into the function tags unconditionally (tags_from_env), which also leaked the key into _dd.tags.function.
  2. Ignored Datadog-Client-Computed-Stats entirely, so when a tracer computed stats client-side, the backend was still asked to compute them.

Canonical semantics (validated against the Go agent)

The Go agent (pkg/serverless/tags/tags.go) only ever sets _dd.compute_stats = "1", never "0", and leaves the key absent otherwise. This PR matches that:

Set _dd.compute_stats="1" iff !compute_trace_stats_on_extension && !client_computed_stats; otherwise leave it absent.

compute_trace_stats_on_extension client_computed_stats _dd.compute_stats Extension computes stats?
false false "1" No
false true absent No
true false absent Yes
true true absent No

(Emitting absent instead of today's "0" is not a regression: "0" and absent are equivalent "off" states to the backend.)

Changes

  • tags/lambda/tags.rs — stop baking _dd.compute_stats in tags_from_env (no longer leaks into _dd.tags.function); COMPUTE_STATS_KEY is now pub so the integration test can reuse it instead of re-declaring the literal.
  • Path A: traces/trace_processor.rsChunkProcessor gains client_computed_stats and stamps _dd.compute_stats="1" per-span only when neither side computes stats; the extension-side stats-generation guard in send_processed_traces now also skips when client_computed_stats is set.
  • Path B (extension-generated aws.lambda span)client_computed_stats is propagated from the tracer's placeholder span through context.rsprocessor.rsprocessor_service.rstrace_agent.rs, so Path B reuses the same ChunkProcessor stamping (single source of truth).
  • OTLP: otlp/agent.rs — the OTLP stats-generation guard previously checked only compute_trace_stats_on_extension, so an OTLP request carrying Datadog-Client-Computed-Stats would still generate extension-side stats and double-count against the tracer. It now also skips when client_computed_stats is set, mirroring the send_processed_traces guard.

Note on the header value (cross-runtime)

Datadog-Client-Computed-Stats is not standardized ("true" .NET/Java/PHP/Python, "yes" JS/Ruby/C++, "t" Go). bottlecap consumes the already-parsed client_computed_stats bool from libdd_trace_utils, where any non-empty value → true, so the fix triggers on every runtime. A separate libdatadog PR (DataDog/libdatadog#2071) aligns the header parsing with the Go agent's isHeaderTrue/ParseBool rules; this PR only consumes the bool and does not depend on that change.

Testing

  • Tier 0 (header-parsing contract) — moved to the libdatadog bump PR (#1244), since those tests assert libdatadog parsing behavior that the db05e1f → 48da0d8 bump changes. This branch keeps Tiers 1–3 and rebases onto chore(deps): bump libdatadog to 48da0d8 for client-computed header fix #1244 after it merges.
  • Tier 1 (trace_processor.rs) — truth-table on Span.meta, stats-skip guard via the real StatsConcentratorService, and updated tags.rs unit tests asserting the key no longer appears in the tags map. Fixed the logs/metrics integration tests that asserted the old leak.
  • Tier 2 (context.rs, processor.rs) — context-level flag recording and an end-to-end Path B test driving send_ctx_spans through the trace_tx channel, asserting _dd.compute_stats on the aws.lambda span across the truth table.
  • Tier 3 (apm_integration_test.rs) — full fake-intake E2E routing a trace through SendingTraceProcessor::send_processed_traces: asserts on the captured AgentPayload span meta and on stats_payloads() (stats suppressed unless the extension computes and the tracer did not).

⚠️ Still TODO before un-drafting

  • Manual E2E gate on the golang-default/java cases (test_trace_aws_lambda_hits_metric) — those runtimes flush client-side stats async, which may not complete before the extension forwards data.
  • Rebase onto the libdatadog bump PR (#1244) after it merges.

"Computing stats twice doesn't make them twice as true, it just makes the backend twice as grumpy." — Claude 🤖

@lucaspimentel lucaspimentel changed the title [APMSVLS-487] respect Datadog-Client-Computed-Stats and stamp _dd.compute_stats per-span [APMSVLS-487] respect Datadog-Client-Computed-Stats header Jun 3, 2026
@datadog-datadog-prod-us1-2

This comment has been minimized.

Tests-first plan for respecting the Datadog-Client-Computed-Stats header:
canonical _dd.compute_stats rule, Path A/B production changes, and tiered
tests (header contract, trace_processor unit, Path B unit, full fake-intake E2E).

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
Document how libdatadog parses client_computed_stats (non-empty) and
client_computed_top_level (presence-only) differently from each other and
from the Go agent's uniform isHeaderTrue/ParseBool rule. Note both are latent
and fold both into the follow-up libdatadog PR.

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
… contract

Add a test module at the (&headers).into() boundary in trace_agent.rs that
locks the client_computed_stats parsing contract the rest of the fix relies on:
truthy values (true/yes/t/1) -> true across all runtimes, absent/empty -> false.
A documenting test asserts the current libdatadog (db05e1f) divergence where
falsey strings (false/0) parse as true, pointing at the follow-up libdatadog#2071.

Green on current libdatadog with no production change.
…tead of baking it

Move _dd.compute_stats from a baked-in function tag to a per-span backend directive
stamped by the trace processor, and respect the tracer's Datadog-Client-Computed-Stats
signal.

- tags.rs: stop inserting _dd.compute_stats in tags_from_env (it no longer leaks
  into _dd.tags.function); make COMPUTE_STATS_KEY pub(crate).
- trace_processor.rs: add client_computed_stats to ChunkProcessor; stamp
  _dd.compute_stats="1" on each span's meta only when neither the extension
  (compute_trace_stats_on_extension) nor the tracer (client_computed_stats) computes
  stats, matching the Go agent (set "1" or leave absent). Add && !client_computed_stats
  to the extension-side stats-generation guard.
- Tests: truth-table on Span.meta (#1118 guard), stats-skip guard via the real
  StatsConcentratorService, updated tags.rs unit tests, and removed the stale
  _dd.compute_stats assertions from the logs/metrics integration tests.

All 548 workspace tests pass; fmt + clippy clean.
…ath B aws.lambda span

Thread the tracer's Datadog-Client-Computed-Stats signal through the extension-generated
aws.lambda span (Path B) so _dd.compute_stats is stamped consistently with Path A
(the tracer-trace path handled in Tier 1).

- context.rs: add Context.client_computed_stats (+ Default false); ContextBuffer::add_tracer_span
  takes the flag and records it on the matching context.
- processor.rs: add_tracer_span threads the param; send_ctx_spans captures
  context.client_computed_stats before get_ctx_spans consumes the context; send_spans takes the
  param and sets it on its TracerHeaderTags so Path B reuses the Tier-1 ChunkProcessor stamping
  (single source of truth); send_cold_start_span passes false.
- processor_service.rs: ProcessorCommand::AddTracerSpan + handle + handler carry the flag.
- trace_agent.rs: annotate tracer_header_tags type and pass client_computed_stats to
  add_tracer_span at the placeholder-span branch.
- Tests: context-level flag recording, and an end-to-end Path B test that drives send_ctx_spans
  through the trace_tx channel and asserts _dd.compute_stats on the aws.lambda span across the
  full truth table.

All 550 workspace tests pass; fmt + clippy clean.
…s + stats suppression

Add end-to-end coverage that routes a trace through SendingTraceProcessor::send_processed_traces
(so process_traces/ChunkProcessor actually run) and asserts on the wire payloads captured by the
in-process fake-intake.

- run_processor_pipeline(compute_on_extension, client_computed_stats): drains the trace_tx channel
  into AggregatorService synchronously after send (avoids a flush race), flushes via TraceFlusher,
  and flushes stats via StatsConcentratorService -> StatsAggregator -> StatsFlusher. Parameterizes
  header_tags_with(client_computed_stats).
- T3.1 e2e_client_computed_stats_leaves_compute_stats_absent: tracer computed -> span meta absent.
- T3.2 e2e_compute_stats_truth_table_on_captured_span: "1" only for the neither-computes row.
- T3.3 e2e_stats_suppressed_unless_extension_computes: one stats payload only for
  (compute_on_extension, !client_computed_stats); empty otherwise.
- T3.4 e2e_client_computed_stats_absent_meta_and_no_stats: combined meta-absent + zero stats.

554/554 workspace tests pass; fmt + clippy clean.
…datadog bump PR

The Tier 0 tests assert libdatadog parsing behavior (not the _dd.compute_stats
feature), and libdatadog#2071 changes that behavior via the db05e1f -> 48da0d8
bump in PR #1244. Move them there so the bump carries its own behavioral
coverage and the assertions don't need to assert opposite things on two
branches. Feature branch keeps Tiers 1-3; rebase onto #1244 after it merges.
@lucaspimentel lucaspimentel force-pushed the lpimentel/respect-client-computed-stats branch from d5220c9 to 003415f Compare June 3, 2026 22:27
The OTLP trace pipeline's stats-generation guard only checked
compute_trace_stats_on_extension, so an OTLP request carrying
Datadog-Client-Computed-Stats would still generate extension-side
stats, double-counting against the tracer's client-side stats. Mirror
the guard already used in send_processed_traces by also skipping when
client_computed_stats is set.

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
Promote COMPUTE_STATS_KEY from pub(crate) to pub so the apm_integration test crate can import it instead of re-declaring the
  "_dd.compute_stats" string literal, keeping a single source of truth for the tag key.

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant