RD-3960 add opt-in init: true skipper.yaml field for PID-1 reaping #192

Open: simba1997 wants to merge 2 commits into `upstream` from `Mohammad/RD-3960-docker-init`

simba1997 commented on May 12, 2026

Summary

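`skipper-entrypoint.sh` runs the user command via `bash -c "$@"` with no `exec`, so bash stays PID-1 inside the build container. The kernel only delivers a signal to a PID-namespace init process if that process has installed a handler for it, and non-interactive bash installs none for SIGTERM, so bash as PID-1 neither acts on SIGTERM nor forwards it, and it does not reap orphaned children. When the container is stopped (Jenkins pipeline cancellation, manual `docker stop`, etc.), SIGTERM is swallowed and long-running children survive as orphaned processes that remain visible on the host.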
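
To make the two missing duties concrete, here is a minimal sketch of what a purpose-built init does: forward termination signals to the child and reap whatever exits. It is an illustration only, not how tini or catatonit are implemented; `run_with_forwarding` is a hypothetical name, and the sketch assumes a Unix host with Python 3.9+ (for `os.waitstatus_to_exitcode`).

```python
import os
import signal


def run_with_forwarding(argv):
    """Run argv as a child, forward SIGTERM/SIGINT to it, and reap exited children.

    Returns the child's exit code (a negative signal number if it was killed).
    """
    pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)

    def forward(signum, _frame):
        try:
            os.kill(pid, signum)  # pass the signal on instead of swallowing it
        except ProcessLookupError:
            pass  # the child has already exited

    forwarded = (signal.SIGTERM, signal.SIGINT)
    previous = {sig: signal.signal(sig, forward) for sig in forwarded}
    try:
        while True:
            # When this process is PID-1, orphans are re-parented to it,
            # so waiting on -1 reaps them here as well.
            reaped, status = os.waitpid(-1, 0)
            if reaped == pid:
                return os.waitstatus_to_exitcode(status)
    finally:
        # Restore whatever handlers were installed before the call.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

Calling `run_with_forwarding(["sh", "-c", "exit 3"])` returns the child's exit status. Bash as PID-1 in the current entrypoint does neither the forwarding nor the reaping step.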

Add a new opt-in `init` config field to `skipper.yaml`. When set to `true`, `--init` is added to `docker/podman run` so a real init (tini on docker, catatonit on podman) runs as PID-1.

Default is `false` — existing consumers see no behaviour change. This addresses the concern that an always-on `--init` could break consumers whose workflows depend on the current no-init semantics (custom init systems, daemon-spawning patterns, tests that assert PID-1 == specific binary).
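
The opt-in behaviour can be sketched roughly as follows. This is a simplified illustration rather than the code in this PR: `read_init_flag` and `build_run_argv` are hypothetical names, and the real `skipper/runner.py` assembles many more arguments than shown here.

```python
from typing import List, Mapping


def read_init_flag(config: Mapping[str, object]) -> bool:
    """Return the opt-in `init` setting; a missing key means False."""
    # Only a literal boolean True opts in, so a missing key keeps today's behaviour.
    return config.get("init", False) is True


def build_run_argv(runtime: str, image: str, command: List[str],
                   init: bool = False) -> List[str]:
    """Assemble a `docker run` / `podman run` argv, adding --init only on request."""
    argv = [runtime, "run", "--rm"]
    if init:
        argv.append("--init")  # the runtime's init becomes PID-1
    argv.append(image)
    argv.extend(command)
    return argv
```

With an empty config, `build_run_argv("docker", "my-build-image", ["make"], init=read_init_flag({}))` yields an argv without `--init`, which is the "no behaviour change" guarantee for existing consumers.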

Reproduction

Debian container, bash entrypoint mimicking `skipper-entrypoint.sh` (`bash -c "$@"`, no `exec`):

| Setup | `docker stop -t 5` | Observation |
| --- | --- | --- |
| Default (bash PID-1) | 6s | SIGTERM ignored, SIGKILL after grace |
| `--init` | 0s | SIGTERM forwarded cleanly |
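
A timing comparison along these lines could be scripted as below. This is a sketch under assumptions, not the exact procedure used for the numbers above: the container names, the `sleep 300 & wait` command (chosen so that bash stays alive as the parent of a long-running child), and the helper names are all hypothetical, and a local Docker daemon with the `debian` image is assumed.

```python
import subprocess
import time
from typing import List

# `sleep 300 & wait` keeps bash running as the parent of the long-lived child.
COMMAND = ["bash", "-c", "sleep 300 & wait"]


def container_name(init: bool) -> str:
    """Hypothetical container names, one per variant, to avoid name clashes."""
    return "skipper-repro-init" if init else "skipper-repro-noinit"


def run_argv(init: bool, image: str = "debian") -> List[str]:
    """argv that starts the detached test container, with or without --init."""
    argv = ["docker", "run", "-d", "--rm", "--name", container_name(init)]
    if init:
        argv.append("--init")
    return argv + [image] + COMMAND


def stop_argv(init: bool, grace_seconds: int = 5) -> List[str]:
    """argv that stops the container with the given SIGTERM grace period."""
    return ["docker", "stop", "-t", str(grace_seconds), container_name(init)]


def time_docker_stop(init: bool) -> float:
    """Start the container, then return how long `docker stop` takes in seconds."""
    subprocess.run(run_argv(init), check=True, capture_output=True)
    start = time.monotonic()
    subprocess.run(stop_argv(init), check=True, capture_output=True)
    return time.monotonic() - start
```

Calling `time_docker_stop(False)` and `time_docker_stop(True)` in turn gives the two durations to compare; the measured values depend on the host.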

Usage

In a consumer's `skipper.yaml`:

```yaml
registry: my.registry:5000
build-container-image: my-build-image
init: true # opt in to PID-1 reaping
```

Changes

  • `skipper/cli.py`: read `init` from config defaults, thread through three `runner.run` callsites.
  • `skipper/runner.py`: accept `init` kwarg, add `--init` to docker args when true.
  • `tests/test_cli.py`: new `SKIPPER_CONF_WITH_INIT` fixture + assertion that `init=True` flows through. Updated 28 existing `assert_called_once_with` blocks with `init=False`.
  • `tests/test_runner.py`: new positive/negative tests that `--init` is/isn't in the docker argv.
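
The shape of the test changes listed above might look roughly like this. It is a sketch only: `invoke_runner` is a hypothetical stand-in for a `cli.py` callsite, and the keyword names are assumptions, not skipper's real `runner.run` signature.

```python
import unittest
from unittest import mock


def invoke_runner(config, command, run):
    """Hypothetical callsite: read `init` from config and pass it to the runner."""
    return run(command,
               image=config["build-container-image"],
               init=config.get("init", False))


class InitFlagTests(unittest.TestCase):
    def test_default_passes_init_false(self):
        run = mock.Mock()
        invoke_runner({"build-container-image": "my-build-image"}, ["make"], run)
        # Existing assertions gain an explicit init=False.
        run.assert_called_once_with(["make"], image="my-build-image", init=False)

    def test_opt_in_passes_init_true(self):
        run = mock.Mock()
        config = {"build-container-image": "my-build-image", "init": True}
        invoke_runner(config, ["make"], run)
        run.assert_called_once_with(["make"], image="my-build-image", init=True)
```

Because `assert_called_once_with` compares the full call, every existing assertion on `runner.run` has to name the new keyword, which is why the PR touches 28 existing blocks.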

Relation to #191

This PR is orthogonal to #191 (entrypoint override). Both are opt-in config-file additions that fix the orphan-on-cancel bug.

For the rootfs-star RC pipeline (RD-3960), `init: true` is the simpler fix: no Dockerfile changes, no apt-installed dumb-init, no wrapper script.

Test plan

  • Unit tests pass (CI).
  • Manual: skipper.yaml without `init` key → `docker exec ... ps -ef` shows PID-1 = user entrypoint (unchanged behaviour).
  • Manual: skipper.yaml with `init: true` → PID-1 = `docker-init`. `docker stop` terminates cleanly.
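
The manual PID-1 check could be automated with a small parser over `ps -ef` output. This is a sketch: `pid1_command` and `uses_docker_init` are hypothetical helpers, the sample text in the usage note is made up for illustration, and the image is assumed to ship `ps`.

```python
from typing import Optional


def pid1_command(ps_output: str) -> Optional[str]:
    """Return the CMD column of the PID-1 row in `ps -ef` output, if present."""
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Columns: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces).
        fields = line.split(None, 7)
        if len(fields) == 8 and fields[1] == "1":
            return fields[7]
    return None


def uses_docker_init(ps_output: str) -> bool:
    """True when the PID-1 executable is named docker-init."""
    command = pid1_command(ps_output)
    if not command:
        return False
    executable = command.split()[0]
    return executable.rsplit("/", 1)[-1] == "docker-init"
```

For example, given an illustrative (not captured) listing whose PID-1 row ends in `/sbin/docker-init -- bash -c make`, `uses_docker_init` returns `True`; real input would come from the `docker exec ... ps -ef` step in the test plan.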

🤖 Generated with Claude Code

Mohamed Hallumi and others added 2 commits May 12, 2026 16:13
skipper-entrypoint.sh runs the user command via `bash -c "$@"` with no
`exec`, so bash stays PID-1 inside the build container. Bash as PID-1
does not forward SIGTERM to its children. When the docker/podman client
receives SIGTERM (e.g. Jenkins pipeline cancellation, manual `docker
stop`), bash swallows it and the user command (mindthegap, packer, etc.)
survives until SIGKILL. Any descendant that detached from the
controlling group survives even SIGKILL of bash and is left running on
the host as a zombie holding registry connections, file handles, etc.

Reproduced locally with debian + bash entrypoint mimicking
skipper-entrypoint.sh: `docker stop -t 5` took the full 5s grace (SIGTERM
ignored, SIGKILL after) without --init; with --init it took 0s
(SIGTERM forwarded clean).

`docker run --init` uses Docker's built-in tini as PID-1. `podman run
--init` uses catatonit. Both are tiny, purpose-built init systems that
forward signals and reap zombies. No image changes required from
skipper consumers.

This is a more general fix than per-consumer entrypoint override
(#191): every skipper user benefits without any change on their side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Address review concern that an always-on `--init` could change
behaviour for consumers whose entrypoints depend on the current
no-init semantics (custom init systems, daemon-spawning workflows,
PID-1 asserts in tests).

Introduce a new `init` config field defaulting to false. When set
to true in skipper.yaml, `--init` is added to the docker/podman run
command. When unset or false, behaviour is identical to today.

Existing consumers see no change. Consumers that want PID-1 signal
forwarding and zombie reaping set `init: true` in their skipper.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

simba1997 changed the title from "RD-3960 run docker/podman with --init so PID-1 reaps zombies" to "RD-3960 add opt-in init: true skipper.yaml field for PID-1 reaping" on May 17, 2026