Add system-probe-lite support for discovery in Helm chart#2479
Add system-probe-lite support for discovery in Helm chart#2479
Conversation
20dc494 to
8ed1071
Compare
When only discovery is enabled and no other system-probe feature is active, use the lightweight system-probe-lite binary instead of full system-probe. Falls back to system-probe if system-probe-lite is not available in the image. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8ed1071 to
f296163
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2961631fe
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…tainer Use discovery-enabled helper in _container-system-probe.yaml instead of directly checking datadog.discovery.enabled, so the cgroups volume mount is also present when enabledByDefault=true. Add a render test covering the cgroups mount for both cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@codex review |
|
Codex Review: Didn't find any major issues. 👍 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
fanny-jiang
left a comment
There was a problem hiding this comment.
LGTM, but the changes as they are now are breaking in GKE Autopilot due to the restricted container, command value, being modified without a corresponding update to the Datadog GKE Autopilot WorkloadAllowlist.
https://docs.cloud.google.com/kubernetes-engine/docs/reference/crds/workloadallowlist#containers (see containers[].command)
Contributions to update the WorkloadAllowlist are welcome! We have internal docs on how updates can be submitted. Until then, the changes should be gated for both GKE Autopilot and GKE GDC. Thanks!
The GKE Autopilot WorkloadAllowlist does not yet support system-probe-lite, so gate should-use-system-probe-lite to fall back to regular system-probe when providers.gke.autopilot or providers.gke.gdc is enabled. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-probe command Add datadog.discovery.enabled to the "with system-probe" test case and add a new "with discovery only" test case. Assert in verifyAutopilotWorkloadAllowlistConstraints that the system-probe container always uses the regular system-probe command on autopilot (not system-probe-lite). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Conflicts: charts/datadog/CHANGELOG.md
wdhif
left a comment
There was a problem hiding this comment.
Blocking until we validate DataDog/datadog-operator#2610
What this PR does / why we need it:
Adds support for starting system-probe-lite (SPL) instead of the full system-probe in the Helm chart when only discovery is enabled.
SPL is a privileged Rust binary that implements just the discovery module. The goal of this support is to ensure that SPL is started instead of system-probe whenever only discovery is enabled.
How it works:
If any other system-probe features than discovery are also enabled (NPM, USM, etc.), the regular
system-probebinary is always used directlyWhen only discovery is enabled (no other system-probe features):
discovery.enabledwas explicitly set by the user, we fall back to the fullsystem-probebinary (since the user opted in to discovery knowingly)discovery.enabledByDefaultis set (i.e. discovery was turned on without the user's explicit choice), we fall back tosleep infinityto avoid starting system-probe unexpectedly (or crashing) when using an older agent image withoutsystem-probe-lite(or without the discovery feature altogether).Special notes for your reviewer:
The
discovery-enabledhelper useskindIs "invalid"to distinguish betweenenabled: false(explicitly disabled) andenabledbeing unset/nil. This is important because a simpleorwould treatfalseand nil the same way, causingenabledByDefault: trueto override an explicitenabled: false.Note that discovery is not turned on by default in this PR, that's a separate, future step.
Checklist
[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]
<chartName>/minor-version,<chartName>/patch-version, or<chartName>/no-version-bump)datadogordatadog-operatorchart or value changes, update the test baselines (run:make update-test-baselines)GitHub CI takes care of the below, but are still required:
.github/helm-docs.sh)CHANGELOG.mdhas been updatedREADME.md