Add safe extraction for malformed base64 kafka headers#11472

Open
amarziali wants to merge 8 commits into master from andrea.marziali/kafkaheaders

@amarziali
Contributor

What does this do

When DD_KAFKA_CLIENT_BASE64_DECODING_ENABLED=true, the Kafka producer instrumentation calls extractContextAndGetSpanContext on the outgoing ProducerRecord headers. If any header value is not valid Base64 (e.g. a header produced by a non-DD service, a URL-safe encoded value, or a plain string), Base64.getDecoder().decode() throws IllegalArgumentException. That exception escapes @OnMethodEnter, where ByteBuddy's suppress = Throwable.class silently swallows it and returns a null AgentScope. The null scope then causes an NPE in BaseDecorator.beforeFinish at scope.context().
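
To make the failure mode concrete, here is a minimal standalone sketch (not code from this PR) of which header values the basic java.util.Base64 decoder rejects. The class name Base64HeaderFailureDemo and the method basicDecoderRejects are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64HeaderFailureDemo {

  /** Returns true when the basic (RFC 4648) decoder throws on the given header value. */
  static boolean basicDecoderRejects(byte[] headerValue) {
    try {
      Base64.getDecoder().decode(headerValue);
      return false;
    } catch (IllegalArgumentException e) {
      // Thrown for characters outside the basic alphabet or for truncated input.
      return true;
    }
  }

  public static void main(String[] args) {
    // A value encoded with the basic alphabet decodes without error.
    byte[] valid = Base64.getEncoder().encode("1234567890".getBytes(StandardCharsets.UTF_8));
    // '-' and '_' belong to the URL-safe alphabet only, so the basic decoder rejects them.
    byte[] urlSafe = "-_-_".getBytes(StandardCharsets.UTF_8);
    // A plain string containing a space and '!' is not Base64 at all.
    byte[] plain = "not base64!".getBytes(StandardCharsets.UTF_8);

    System.out.println(basicDecoderRejects(valid));   // false
    System.out.println(basicDecoderRejects(urlSafe)); // true
    System.out.println(basicDecoderRejects(plain));   // true
  }
}
```

Note that some plain strings (for example "abcd") happen to be valid Base64, so only values with illegal characters or an invalid length trigger the exception.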

The fix moves the Base64 decode into a safe Function<byte[], String> (Functions.base64Decode) that catches any Exception and returns null. TextMapExtractAdapter.forEachKey checks for null, logs the failing header key with EXCLUDE_TELEMETRY (since it is not actionable), and skips that header. Valid headers continue to be processed normally.
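
The shape of that fix can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the actual Functions.base64Decode or TextMapExtractAdapter code: the class SafeHeaderExtraction, the constant SAFE_BASE64_DECODE, and the method forEachDecodable are hypothetical names, and the sketch collects skipped keys in a list where the real change logs them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class SafeHeaderExtraction {

  /** Decodes Base64 bytes to a UTF-8 string; returns null instead of throwing on bad input. */
  static final Function<byte[], String> SAFE_BASE64_DECODE =
      value -> {
        try {
          return new String(Base64.getDecoder().decode(value), StandardCharsets.UTF_8);
        } catch (Exception e) {
          // Malformed (or null) header value: signal failure with null.
          return null;
        }
      };

  /**
   * Passes every decodable header to the consumer and returns the keys of the
   * headers that were skipped because their value could not be decoded.
   */
  static List<String> forEachDecodable(
      Map<String, byte[]> headers, BiConsumer<String, String> consumer) {
    List<String> skippedKeys = new ArrayList<>();
    for (Map.Entry<String, byte[]> header : headers.entrySet()) {
      String decoded = SAFE_BASE64_DECODE.apply(header.getValue());
      if (decoded == null) {
        // Stand-in for logging the failing key; the header is skipped.
        skippedKeys.add(header.getKey());
        continue;
      }
      consumer.accept(header.getKey(), decoded);
    }
    return skippedKeys;
  }
}
```

Returning null rather than rethrowing keeps one malformed header from aborting extraction for the remaining, valid headers.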

Depends on #11466

Motivation

Additional Notes

Contributor Checklist

  • Format the title according to the contribution guidelines
  • Assign the type: and (comp: or inst:) labels in addition to any other useful labels
  • Avoid using close, fix, or any linking keywords when referencing an issue
    Use solves instead, and assign the PR milestone to the issue
  • Update the CODEOWNERS file on source file addition, migration, or deletion
  • Update public documentation with any new configuration flags or behaviors
  • Add your completed PR to the merge queue by commenting /merge. You can also:
    • Customize the commit message associated with the merge with /merge --commit-message "..."
    • Remove your PR from the merge queue with /merge -c
    • Skip all merge queue checks with /merge -f --reason "reason"; please use this judiciously, as some checks do not run at the PR-level
    • Get more information in this doc

Jira ticket: [PROJ-IDENT]

@amarziali amarziali requested review from a team as code owners May 27, 2026 15:45
@amarziali amarziali added the type: bug (Bug report and fix) label May 27, 2026
@amarziali amarziali requested a review from a team as a code owner May 27, 2026 15:45
@amarziali amarziali added the inst: kafka (Kafka instrumentation) and tag: telemetry error reported (Reported by error telemetry) labels May 27, 2026
@amarziali amarziali requested review from mcculls and ygree and removed request for a team May 27, 2026 15:45

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1435039fe3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@amarziali amarziali force-pushed the andrea.marziali/kafkaheaders branch from 1435039 to e90e93e Compare May 27, 2026 15:54
@datadog-prod-us1-3

datadog-prod-us1-3 Bot commented May 27, 2026

Pipelines

Fix all issues with BitsAI

⚠️ Warnings

🚦 2 Pipeline jobs failed

Run system tests | main / End-to-end #8 / spring-boot-wildfly 8   View in Datadog   GitHub Actions

🔄 Retry job. This looks flaky and may succeed on retry. Connection to Docker registry timed out while fetching image due to network issues, leading to repeated fetch failures.

Run system tests | Check system tests success   View in Datadog   GitHub Actions

Useful? React with 👍 / 👎

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 46edb6e | Docs | Datadog PR Page | Give us feedback!

@dd-octo-sts
Contributor

dd-octo-sts Bot commented May 27, 2026

🟢 Java Benchmark SLOs — All performance SLOs passed

| Suite | Status |
| --- | --- |
| Startup | 🟢 pass |

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results
| Scenario | Candidate | master | Δ (95% CI of mean) |
| --- | --- | --- | --- |
| startup:insecure-bank:iast:Agent | 14.04 s | 13.95 s | [-0.2%; +1.4%] (no difference) |
| startup:insecure-bank:tracing:Agent | 12.85 s | 12.90 s | [-1.4%; +0.5%] (no difference) |
| startup:petclinic:appsec:Agent | 16.69 s | 16.45 s | [+0.2%; +2.8%] (maybe worse) |
| startup:petclinic:iast:Agent | 16.64 s | 16.56 s | [-0.4%; +1.4%] (no difference) |
| startup:petclinic:profiling:Agent | 16.46 s | 16.63 s | [-2.6%; +0.6%] (no difference) |
| startup:petclinic:tracing:Agent | 15.96 s | 15.87 s | [-0.6%; +1.7%] (no difference) |

Commit: 46edb6e1 · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

@pr-commenter

pr-commenter Bot commented May 27, 2026

Kafka / producer-benchmark

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | andrea.marziali/kafkaheaders |
| git_commit_date | 1780305900 | 1780318689 |
| git_commit_sha | 7b18b2e | 46edb6e |
See matching parameters
| Parameter | Baseline | Candidate |
| --- | --- | --- |
| ci_job_date | 1780319713 | 1780319713 |
| ci_job_id | 1729092713 | 1729092713 |
| ci_pipeline_id | 116226261 | 116226261 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| jdkVersion | 11.0.25 | 11.0.25 |
| jmhVersion | 1.36 | 1.36 |
| jvm | /usr/lib/jvm/java-11-openjdk-amd64/bin/java | /usr/lib/jvm/java-11-openjdk-amd64/bin/java |
| jvmArgs | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant |
| vmName | OpenJDK 64-Bit Server VM | OpenJDK 64-Bit Server VM |
| vmVersion | 11.0.25+9-post-Ubuntu-1ubuntu122.04 | 11.0.25+9-post-Ubuntu-1ubuntu122.04 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
| scenario | Δ mean throughput |
| --- | --- |
| scenario:not-instrumented/KafkaProduceBenchmark.benchProduce | same |
| scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce | same |
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce | same |

@pr-commenter

pr-commenter Bot commented May 27, 2026

Kafka / consumer-benchmark

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | andrea.marziali/kafkaheaders |
| git_commit_date | 1780305900 | 1780318689 |
| git_commit_sha | 7b18b2e | 46edb6e |
See matching parameters
| Parameter | Baseline | Candidate |
| --- | --- | --- |
| ci_job_date | 1780319755 | 1780319755 |
| ci_job_id | 1729092715 | 1729092715 |
| ci_pipeline_id | 116226261 | 116226261 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| jdkVersion | 11.0.25 | 11.0.25 |
| jmhVersion | 1.36 | 1.36 |
| jvm | /usr/lib/jvm/java-11-openjdk-amd64/bin/java | /usr/lib/jvm/java-11-openjdk-amd64/bin/java |
| jvmArgs | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant |
| vmName | OpenJDK 64-Bit Server VM | OpenJDK 64-Bit Server VM |
| vmVersion | 11.0.25+9-post-Ubuntu-1ubuntu122.04 | 11.0.25+9-post-Ubuntu-1ubuntu122.04 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
| scenario | Δ mean throughput |
| --- | --- |
| scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume | same |
| scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume | same |
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume | same |

Comment thread internal-api/src/main/java/datadog/trace/api/Functions.java Outdated
amarziali and others added 2 commits June 1, 2026 13:31
@amarziali amarziali force-pushed the andrea.marziali/kafkaheaders branch from 8f45fd1 to d661d93 Compare June 1, 2026 11:31
Base automatically changed from andrea.marziali/propagators-transform to master June 1, 2026 12:03
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot requested a review from a team as a code owner June 1, 2026 12:03
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot requested review from mhlidd and removed request for a team June 1, 2026 12:03
@amarziali amarziali force-pushed the andrea.marziali/kafkaheaders branch from cf6f637 to 02c89f0 Compare June 1, 2026 12:08
@amarziali amarziali enabled auto-merge June 1, 2026 12:08
Labels

  • inst: kafka (Kafka instrumentation)
  • tag: telemetry error reported (Reported by error telemetry)
  • type: bug (Bug report and fix)

Projects

None yet

5 participants