resctrl-mon: add NRI plugin for per-pod resctrl monitoring groups#666
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new standalone NRI plugin (nri-resctrl-mon) to create per-pod resctrl monitoring groups (mon_groups) and optionally persist .begin/.end counter snapshots under a host directory for passive AET/Kepler-style consumers.
Changes:
- Introduces the
nri-resctrl-monplugin implementation (resctrl ops, lifecycle hooks, in-memory pod/container tracking, snapshot store) plus unit tests. - Adds a Helm chart, sample configuration, and documentation under a new “monitoring” docs category.
- Wires the plugin into the repository build via the top-level
Makefile.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| sample-configs/nri-resctrl-mon.yaml | Adds example YAML configuration for the new plugin. |
| Makefile | Registers nri-resctrl-mon in the build plugin list. |
| docs/monitoring/resctrl-mon.md | New end-user/developer documentation for resctrl-mon behavior and snapshots. |
| docs/monitoring/index.md | Adds a new “monitoring” docs section index. |
| docs/index.md | Links the new monitoring docs section from the main docs index. |
| docs/deployment/helm/resctrl-mon.md | Includes the Helm chart README into the docs site. |
| docs/deployment/helm/index.md | Adds resctrl-mon to the Helm deployment docs index. |
| deployment/helm/resctrl-mon/values.yaml | Default Helm values for the resctrl-mon DaemonSet/chart. |
| deployment/helm/resctrl-mon/values.schema.json | Helm values schema for chart parameter validation. |
| deployment/helm/resctrl-mon/templates/daemonset.yaml | DaemonSet template for running the plugin with required mounts/capabilities. |
| deployment/helm/resctrl-mon/templates/configmap.yaml | ConfigMap template for the plugin configuration file. |
| deployment/helm/resctrl-mon/templates/_helpers.tpl | Shared Helm template helpers (labels/selectors). |
| deployment/helm/resctrl-mon/README.md | Helm chart usage and configuration documentation. |
| deployment/helm/resctrl-mon/Chart.yaml | Helm chart metadata. |
| deployment/helm/resctrl-mon/.helmignore | Helm packaging ignore rules. |
| cmd/plugins/resctrl-mon/state.go | In-memory tracking for per-pod mon_group and container membership. |
| cmd/plugins/resctrl-mon/snapshot.go | Snapshot store implementation (.begin/.end, symlinks, pruning). |
| cmd/plugins/resctrl-mon/snapshot_test.go | Unit tests for snapshot store behavior and pruning. |
| cmd/plugins/resctrl-mon/resctrl.go | Resctrl filesystem operations (create/remove mon_groups, write tasks, read mon_data, cleanup). |
| cmd/plugins/resctrl-mon/resctrl_test.go | Unit tests for resctrl operations and safety validation. |
| cmd/plugins/resctrl-mon/plugin.go | Core NRI hook implementation and configuration parsing/validation. |
| cmd/plugins/resctrl-mon/plugin_test.go | Unit tests for lifecycle behavior, filtering, and snapshot integration. |
| cmd/plugins/resctrl-mon/main.go | Plugin entrypoint, flags, and NRI stub wiring. |
| cmd/plugins/resctrl-mon/Dockerfile | Container image build for the new plugin. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e44434b to
d1171cd
Compare
Add nri-resctrl-mon, a standalone NRI plugin that creates per-pod resctrl monitoring groups (mon_groups) to support passive monitorning of Application Energy Telemetry (AET). The plugin uses the PostCreateContainer hook to assign container PIDs to mon_groups before exec/fork, eliminating the fork race that plagues userspace daemon approaches. RMID allocation is delegated to the kernel via mkdir/rmdir on the resctrl filesystem. Includes: - Plugin source (main.go, plugin.go, resctrl.go, state.go) - Unit tests (plugin_test.go, resctrl_test.go) - Dockerfile following nri-memory-qos pattern - Helm chart (Chart.yaml, values.yaml, templates/, schema) - Documentation (monitoring category, plugin docs, Helm docs) - Sample configuration Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com> Signed-off-by: Jedrzej Wasiukiewicz <jedrzej.wasiukiewicz@intel.com>
CRI-O versions up to and including 1.35 do not populate Container.Pid in NRI events (the field was added to containerToNRI in CRI-O main, commit 19d319695, not yet in any release branch). This causes GetPid() to return 0 in PostStartContainer and Synchronize, preventing the plugin from writing PIDs to the resctrl mon_group tasks file. Add a readContainerPID() fallback that reads the init PID from CRI-O's container pidfile at: /run/containers/storage/overlay-containers/<id>/userdata/pidfile The fallback is only attempted when GetPid() returns 0, so it is a no-op on runtimes that populate the PID (containerd, future CRI-O). On non-CRI-O runtimes, the pidfile simply doesn't exist and the read returns 0 gracefully. Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com>
askervin
left a comment
There was a problem hiding this comment.
LGTM.
@cmcantalupo , in case you wish to test how this plugin will feel like in helm users' point of view once nri-plugins gets released, I pushed your feature branch in my test/build/pr666 branch. That triggers github actions to publish helm charts and container images in ghcr.io. Now it should be possible to install nri-resctrl-mon from there:
helm install \
--devel \
-n kube-system \
nri-resctrl-mon \
oci://ghcr.io/askervin/nri-plugins/helm-charts/nri-resctrl-mon \
--version v0.12-pr666-unstable \
--set image.name=ghcr.io/askervin/nri-plugins/nri-resctrl-mon \
--set image.tag=v0.12-pr666-unstable \
--set image.pullPolicy=Always \
...
marquiz
left a comment
There was a problem hiding this comment.
Thanks @cmcantalupo for the PR. And apologies for sitting on it for so long...
A few questions/notes below but I think we could merge this.
The biggest question maybe is why no use the goresctrl library for managing the resctrl fs? I think it's API should be able to cover the stuff that's now re-implemented in resctrl.go. If possible, I'd like to see us use one implementation for that, to reduce duplicate maintenance work (and have wider usage for the one impl)
- Add Kepler project link in documentation (docs/monitoring/resctrl-mon.md) - Make resctrlPath configurable via Helm values.yaml instead of hardcoding in the ConfigMap template - Log warning when removeContainer/podHasNoContainers is called for an untracked pod (defensive corner-case visibility) Signed-off-by: Christopher M. Cantalupo <christopher.m.cantalupo@intel.com>
I've actually been working on moving most of the implementation for this plugin into goresctrl to enable use cases outside of NRI. This is on my fork The modified nri-plugin that uses the new package: I hope this resolution can be made in a future PR. |
Description
Add nri-resctrl-mon, a standalone NRI plugin that creates per-pod resctrl monitoring groups (
mon_groups) to support passive monitoring of Application Energy Telemetry (AET) via consumers such as Kepler.Motivation
Userspace daemon approaches to resctrl mon_group management suffer from a fork race: a container's first threads can execute before the daemon writes their PIDs into the mon_group's
tasksfile, causing energy attribution gaps. By using the NRIStartContainerhook, this plugin assigns the container's init PID to a mon_group beforeexecruns and threads fork, eliminating the race entirely.What's included
main.go,plugin.go,resctrl.go,state.go)plugin_test.go,resctrl_test.go)nri-memory-qospatternChart.yaml,values.yaml, templates, JSON schema)Design decisions
PostCreateContainercreates the mon_group directory (assigns the RMID), andStartContainerwrites the init PID while the process is paused — before any user threads fork.PostStartContaineris a fallback in case the PID is not available atStartContainer(should not happen on containerd ≥ 2.x).mkdir/rmdiron resctrl delegates RMID lifecycle to the kernel, avoiding userspace exhaustion bugs.Synchronizere-creates mon_groups for running pods on plugin restart and removes orphaned mon_groups left by a previous instance.SYS_ADMIN+DAC_OVERRIDEonly (noprivileged: true).hostPID: trueis required so the plugin can write host-namespace PIDs to the resctrltasksfile.Testing
Stats