RD-3960 add opt-in `init: true` skipper.yaml field for PID-1 reaping #192
Open
simba1997 wants to merge 2 commits into
`skipper-entrypoint.sh` runs the user command via `bash -c "$@"` with no `exec`, so bash stays PID-1 inside the build container. Bash as PID-1 does not forward SIGTERM to its children. When the docker/podman client receives SIGTERM (e.g. Jenkins pipeline cancellation, manual `docker stop`), bash swallows it and the user command (mindthegap, packer, etc.) survives until SIGKILL. Any descendant that detached from the controlling group survives even SIGKILL of bash and is left running on the host as an orphaned process holding registry connections, file handles, etc.

Reproduced locally with debian + bash entrypoint mimicking `skipper-entrypoint.sh`: without `--init`, `docker stop -t 5` took the full 5s grace period (SIGTERM ignored, SIGKILL after); with `--init` it took 0s (SIGTERM forwarded cleanly).

`docker run --init` uses Docker's built-in tini as PID-1. `podman run --init` uses catatonit. Both are tiny, purpose-built init systems that forward signals and reap zombies. No image changes required from skipper consumers.

This is a more general fix than per-consumer entrypoint override (#191): every skipper user benefits without any change on their side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address review concern that an always-on `--init` could change behaviour for consumers whose entrypoints depend on the current no-init semantics (custom init systems, daemon-spawning workflows, PID-1 asserts in tests).

Introduce a new `init` config field defaulting to false. When set to true in skipper.yaml, `--init` is added to the docker/podman run command. When unset or false, behaviour is identical to today.

Existing consumers see no change. Consumers that want PID-1 signal forwarding and zombie reaping set `init: true` in their skipper.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
`skipper-entrypoint.sh` runs the user command via `bash -c "$@"` with no `exec`, so bash stays PID-1 inside the build container. Bash as PID-1 does not forward SIGTERM and does not reap orphaned children. When the container is stopped (Jenkins pipeline cancellation, manual `docker stop`, etc.), SIGTERM is swallowed and long-running children survive as orphaned processes visible on the host.
Add a new opt-in `init` config field to `skipper.yaml`. When set to `true`, `--init` is added to `docker/podman run` so a real init (tini on docker, catatonit on podman) runs as PID-1.
Default is `false`, so existing consumers see no behaviour change. This addresses the concern that an always-on `--init` could break consumers whose workflows depend on the current no-init semantics (custom init systems, daemon-spawning patterns, tests that assert PID-1 is a specific binary).
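To make "a real init runs as PID-1" concrete, here is a minimal Python sketch of the two jobs such an init performs: forwarding termination signals to the main child and reaping every child that exits. It is illustrative only. tini and catatonit are separate programs that handle many more cases, and the name `run_as_init` is hypothetical, not part of skipper.

```python
import os
import signal


def run_as_init(argv):
    """Spawn argv, forward SIGTERM/SIGINT to it, and reap all exited children.

    Returns the main child's exit code (negative signal number if it was killed).
    """
    child_pid = os.posix_spawnp(argv[0], argv, os.environ)

    def forward(signum, _frame):
        # Pass the signal on to the child instead of swallowing it.
        try:
            os.kill(child_pid, signum)
        except ProcessLookupError:
            pass  # child already exited

    handled = (signal.SIGTERM, signal.SIGINT)
    previous = {sig: signal.signal(sig, forward) for sig in handled}
    try:
        while True:
            try:
                # Reap any child, including orphans re-parented to this process.
                pid, status = os.wait()
            except ChildProcessError:
                return 1  # no children left and the main child was never seen
            if pid == child_pid:
                return os.waitstatus_to_exitcode(status)
    finally:
        # Restore the caller's signal handlers.
        for sig, handler in previous.items():
            signal.signal(sig, handler)
```

A bash PID-1 that never installs a SIGTERM handler does neither of these things, which is the gap `--init` closes.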
Reproduction
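Debian container, bash entrypoint mimicking `skipper-entrypoint.sh` (`bash -c "$@"`, no `exec`): without `--init`, `docker stop -t 5` took the full 5s grace period (SIGTERM ignored, SIGKILL after); with `--init` it took 0s (SIGTERM forwarded cleanly).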
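A rough way to script this comparison, assuming a local Docker daemon and the `debian` image. The helper names and the exact container command are assumptions, not the commands used for this PR; the command only approximates the entrypoint by keeping an outer bash as PID-1 with the real command as its child.

```python
import subprocess
import time


def run_command(name, use_init):
    """Build a `docker run` argv whose PID-1 is a bash that does not exec its child."""
    argv = ["docker", "run", "-d", "--rm", "--name", name]
    if use_init:
        argv.append("--init")
    # The trailing `; true` keeps the outer bash alive as the parent of the
    # inner command instead of letting it exec that command directly.
    return argv + ["debian", "bash", "-c", 'bash -c "sleep 300"; true']


def time_docker_stop(name, use_init, grace_seconds=5):
    """Start the container, then return how long `docker stop` takes in seconds."""
    subprocess.run(run_command(name, use_init), check=True,
                   stdout=subprocess.DEVNULL)
    start = time.monotonic()
    subprocess.run(["docker", "stop", "-t", str(grace_seconds), name],
                   check=True, stdout=subprocess.DEVNULL)
    return time.monotonic() - start
```

Calling `time_docker_stop("repro-noinit", use_init=False)` and `time_docker_stop("repro-init", use_init=True)` back to back should show whether the stop waits out the grace period or returns promptly.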
Usage
In a consumer's `skipper.yaml`:
```yaml
registry: my.registry:5000
build-container-image: my-build-image
init: true # opt in to PID-1 reaping
```
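As a sketch of how the field could translate into the run command: the helper below is hypothetical and not skipper's actual code, and it assumes the parsed `skipper.yaml` is available as a plain dict.

```python
def build_run_args(config, image, command):
    """Hypothetical helper: turn a parsed skipper.yaml dict into `run` arguments."""
    args = ["run", "--rm"]
    # `init` defaults to False, so configs that omit the field behave as before.
    if config.get("init", False):
        args.append("--init")
    return args + [image] + list(command)
```

With `{"init": True}` the result contains `--init` ahead of the image name; with the field unset or `false` the arguments are unchanged from the no-init case.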
Changes
Relation to #191
This PR is orthogonal to #191 (entrypoint override). Both are opt-in config-file additions that fix the orphan-on-cancel bug.
For the rootfs-star RC pipeline (RD-3960), `init: true` is the simpler fix: no Dockerfile changes, no apt-installed dumb-init, no wrapper script.
Test plan
🤖 Generated with Claude Code