Merged
27 changes: 26 additions & 1 deletion profiles/container-image.md
@@ -1,6 +1,6 @@
# Container Image Profile

-> **Profile Version:** 1.1.0
+> **Profile Version:** 1.2.0
> **Applies to:** All `ubi10-*` and standalone container image projects

This profile extends the [universal constitution](../constitution.md) with requirements specific to **UBI-based** container image projects in the crunchtools organization.
@@ -87,6 +87,31 @@ For images based on `ubi-init` that run systemd:
```
3. Set stop signal: `STOPSIGNAL SIGRTMIN+3`
4. Use init as entrypoint: `ENTRYPOINT ["/sbin/init"]`
5. **Self-heal internal services.** Every long-running service inside a systemd
container MUST set `Restart=on-failure` (with `RestartSec`) via a drop-in.
The host-level Podman unit's `Restart=always` only restarts the *container* —
it cannot recover a service that dies *inside* a still-running container.
```dockerfile
COPY config/php-fpm-restart.conf /etc/systemd/system/php-fpm.service.d/restart.conf
```
```ini
[Service]
Restart=on-failure
RestartSec=5s
```
6. **Bound resource consumption to the container limit.** Services that spawn
worker pools (php-fpm, httpd) MUST cap concurrency so the stack cannot exceed
the container's `--memory` cgroup and trigger the OOM killer. For php-fpm,
prefer `pm = ondemand` with an explicit `pm.max_children` sized to
`(memory_limit ÷ avg_worker_RSS)`, plus `pm.max_requests` to recycle workers.
Comment on lines +105 to +106
medium

In a multi-service systemd container (e.g., one running systemd, httpd, MariaDB, and php-fpm together), sizing the php-fpm worker pool against the entire container memory limit can still starve the other services and trigger the OOM killer. The formula should first subtract the memory used by the other running processes so the pool stays within a safe bound.

Suggested change
-prefer `pm = ondemand` with an explicit `pm.max_children` sized to
-`(memory_limit ÷ avg_worker_RSS)`, plus `pm.max_requests` to recycle workers.
+prefer `pm = ondemand` with an explicit `pm.max_children` sized to
+`((container_memory_limit - other_services_memory) ÷ avg_worker_RSS)`, plus `pm.max_requests` to recycle workers.

Stock defaults (`pm = dynamic`, `pm.max_children = 50`) are unsafe in a
memory-capped container.

> **Rationale:** On 2026-05-27 crunchtools.com ran the stock php-fpm pool
> (`pm.max_children = 50`) inside a 2 GB container. A traffic spike stacked
> workers past the cgroup, the OOM killer shot MariaDB, neither service had
> `Restart=`, and the blog served 503s silently for ~5 days. Both requirements
> above exist to make that failure mode self-healing and bounded.
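
As a worked illustration of requirement 6, the pool fragment below sketches one way to apply the sizing formula, with memory for the other services subtracted first as the review comment suggests. Only the 2 GB container limit comes from the rationale above. The `[www]` pool name, the 1024 MiB reserved for the other services, the 64 MiB average worker RSS, and the `pm.process_idle_timeout` and `pm.max_requests` values are assumptions chosen for illustration and should be replaced with measurements from the real workload.

```ini
; Hypothetical php-fpm pool settings; every number here is an illustrative assumption.
[www]
; Spawn workers only when requests arrive, so idle memory use stays low.
pm = ondemand

; Sizing: (container limit - memory reserved for other services) / avg worker RSS
;         (2048 MiB - 1024 MiB) / 64 MiB = 16 workers
pm.max_children = 16

; Reap workers that have been idle this long (only used with pm = ondemand).
pm.process_idle_timeout = 10s

; Recycle each worker after this many requests to limit growth from memory leaks.
pm.max_requests = 500
```

Average worker RSS is best measured on a running container under realistic load, for example with `ps -o rss= -C php-fpm` (values in KiB), and the resulting `pm.max_children` rounded down so the pool stays under the limit.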

---
