cmdrunner: release process handle in _pidAlive to avoid pidfd leak#378
Conversation
os.FindProcess on Linux with Go 1.23+ opens a pidfd, and pidWait polls _pidAlive roughly once per second for every plugin process. Without a matching Release the pidfd leaks on each poll, and under Nomad with a few hundred allocations it adds up fast -- one reporter saw it cycle between 20k and 130k open FDs until the process hit EMFILE. Defer proc.Release() right after FindProcess so the handle is closed on every return path. Mirrors the syscall.CloseHandle defer already used in the Windows implementation. Reported downstream in hashicorp/nomad#27847.
There was a problem hiding this comment.
Thanks for the PR @texasich! LGTM once the lint is addressed!
I've reproduced the circumstances described in hashicorp/nomad#27847 with both the current tip of Noamd and a version of Nomad using this PR for go-plugin (via a replace directive in my go.mod). I run a single minimal job and then restart the agent. (Note this requires running not-in--dev mode.)
Using this little script:
#!/usr/bin/env bash
while :
do
ls /proc/$1/fd/ | wc -l
sleep 1
doneWe see that before the patch, the number of open file handles increases after restart Nomad. After the patch, it does not.
| if err == nil { | ||
| // On Linux with Go 1.23+, FindProcess opens a pidfd which must be | ||
| // released or it leaks an FD on every call. | ||
| defer proc.Release() |
There was a problem hiding this comment.
This is triggering a very dumb lint. Can we either swallow the return value or put a //nolint:errcheck directive here?
defer proc.Release() trips the errcheck lint because Release returns an error. The handle is short-lived and there's nothing actionable to recover from a release failure, so swallow the error explicitly to make the intent clear. Per @tgross review on hashicorp#378. Signed-off-by: texasich <texasich@users.noreply.github.com>
|
Quick CI update — lint is clean on 8382575. The remaining failure is When you get a moment, could you re-run the |
ritikrajdev
left a comment
There was a problem hiding this comment.
Thanks for checking this out. Looks Good to me as well.
When Nomad clients built on Go >=1.23 are restarted, the go-plugin client starts leaking a pidfd file handle 1/sec when polling for the plugin server. Long-term we should look at this polling behavior in more detail in the library because the pidfd may let us track the plugin status without polling. But in the meantime the leaking pidfd has been fixed upstream and we should pull that in. Ref: hashicorp/go-plugin#378 Fixes: #27847
When Nomad clients built on Go >=1.23 are restarted, the go-plugin client starts leaking a pidfd file handle 1/sec when polling for the plugin server. Long-term we should look at this polling behavior in more detail in the library because the pidfd may let us track the plugin status without polling. But in the meantime the leaking pidfd has been fixed upstream and we should pull that in. Ref: hashicorp/go-plugin#378 Fixes: #27847
When Nomad clients built on Go >=1.23 are restarted, the go-plugin client starts leaking a pidfd file handle 1/sec when polling for the plugin server. Long-term we should look at this polling behavior in more detail in the library because the pidfd may let us track the plugin status without polling. But in the meantime the leaking pidfd has been fixed upstream and we should pull that in. Ref: hashicorp/go-plugin#378 Fixes: #27847
Description
_pidAliveininternal/cmdrunner/process_posix.gocallsos.FindProcess(pid)and never releases the returned*os.Process. On Linux with Go 1.23+ that call now opens a pidfd under the hood (os.pidfdFind→pidfd_open), so every invocation leaks one file descriptor.pidWaitpolls once per second per plugin, so the leak scales linearly with plugin count × uptime. In the downstream Nomad report (hashicorp/nomad#27847) a client with a few hundred allocations saw the host-side FD count cycle between 20k and 130k, eventually trippingEMFILEand breaking CNI config loads, disk-stats collection,pipe2, and docker socket dials.The fix is a one-liner:
defer proc.Release()right after theFindProcesscall, so the pidfd is closed on every return path. The Windows implementation already does the equivalent withdefer syscall.CloseHandle(h)inprocess_windows.go, so this just brings the POSIX side in line.Before:
(one new FD per poll iteration per plugin, never closed)
Related Issue
Downstream report: hashicorp/nomad#27847
How Has This Been Tested?
go build ./...andgo vet ./...clean on both host (Windows) andGOOS=linuxgo test ./internal/cmdrunner/...passes_pidAlivewhich has always done the equivalent cleanup viadefer syscall.CloseHandle(h), so behavior is symmetric across platformsCaching the
*os.Processon the runner would avoid the repeatedFindProcessentirely and is probably the better long-term change, but it's a larger refactor touchingRunnerlifecycle. Keeping this PR to the minimal, backport-friendly fix; happy to follow up with the caching variant in a separate PR if preferred.