A minimal, opinionated deployment tool for Docker Compose + Traefik stacks. Replaces Kamal's useful subset (rolling deploys, health checks, automatic failure recovery) without its baggage (parallel config, local-first execution, registry coupling).
- Docker Compose is the single source of truth. No `deploy.yml`, no parallel declarations. The tool reads `docker-compose.yml` and does its work.
- Two layers, clean split. A server-side CLI (`flow-deploy`) handles the single-node deploy lifecycle — pull, scale, health check, cutover. A GitHub Action (`flow-deploy-action`) handles host discovery, fleet orchestration, and SSH fan-out. Single-host projects can skip the action entirely.
- Accessories are left alone. Databases, caches, and other stateful services are never restarted during a deploy unless explicitly requested.
- Builds happen in CI. The tool does not build images. GitHub Actions builds and pushes to GHCR. The tool pulls and swaps.
- Failure is a no-op. If a new container fails its health check, the old container continues serving traffic untouched. The deploy exits nonzero.
- Logging is the interface. All output is structured, human-readable, and designed to flow through an SSH session back into GitHub Actions logs.
Every service in docker-compose.yml is classified by a label:
| Label | Behavior |
|---|---|
| `deploy.role=app` | Rolled during deploy. Health-checked. Old container preserved on failure. |
| `deploy.role=accessory` | Never touched during deploy. Only started/stopped via explicit commands. |
| (no label) | Ignored entirely. The tool does not interact with unlabeled services. |
The deploy runs as a single transaction — flow-deploy deploy owns the full lifecycle including git operations. When --tag <sha> is provided, the value serves double duty: it is both the Docker image tag and the git SHA to checkout. When --tag is omitted, the tool deploys the current HEAD — the git checkout is skipped and DEPLOY_TAG is set to the current commit SHA.
Pre-flight and git checkout (before any service work):
0a. Dirty check: `git status --porcelain`. If non-empty → log "working tree is dirty — deploy aborted", exit 1.
0b. Fetch: `git fetch origin`
0c. Record previous SHA: `previous_sha = git rev-parse HEAD`
0d. Checkout (detached): `git checkout --detach <sha>` (skipped when `--tag` is omitted)
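The pre-flight sequence can be sketched in shell. This is illustrative, not the tool's actual source; `preflight` and `previous_sha` are names invented here:

```shell
# Illustrative sketch of the pre-flight steps; not the tool's actual source.
preflight() {
  local tag="${1:-}"
  if [ -n "$(git status --porcelain)" ]; then          # 0a. dirty check
    echo "working tree is dirty — deploy aborted" >&2
    return 1
  fi
  git fetch origin                                     # 0b. fetch
  previous_sha=$(git rev-parse HEAD)                   # 0c. record previous SHA
  if [ -n "$tag" ]; then
    git checkout --detach "$tag"                       # 0d. detached checkout
  fi
  export DEPLOY_TAG="${tag:-$(git rev-parse HEAD)}"
}
```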
For each service with deploy.role=app, in the order they appear in the compose file:
1. Pull new image: `<compose-command> pull <service>`
2. Start new container: `<compose-command> up -d --no-deps --no-recreate --scale <service>=2`
3. Wait for health check: poll the new container until healthy or timeout

4a. If healthy:
   - Graceful shutdown: `docker stop --time <drain> <old_id>`
   - Remove old container: `docker rm <old_id>`
   - Scale back to 1: `<compose-command> up -d --no-deps --scale <service>=1`
   - ✓ Continue to next service

4b. If unhealthy:
   - Stop new container: `docker stop <new_id> && docker rm <new_id>`
   - Scale back to 1: `<compose-command> up -d --no-deps --scale <service>=1`
   - Restore repo: `git checkout --detach <previous_sha>`
   - ✗ Abort deploy, exit 1
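As a sketch, one service's roll (steps 1 through 4) might look like this in shell. `compose` stands in for `<compose-command>`, and `wait_healthy` is a hypothetical helper that polls the container's health status:

```shell
# Illustrative sketch of one service's roll; not the tool's actual source.
roll_service() {
  local svc="$1" drain="${2:-30}" old_id new_id
  old_id=$(compose ps -q "$svc")                            # currently serving container
  compose pull "$svc"                                       # 1. pull new image
  compose up -d --no-deps --no-recreate --scale "$svc"=2    # 2. start new alongside old
  new_id=$(docker ps -q --filter "label=com.docker.compose.service=$svc" | head -n1)
  if wait_healthy "$new_id"; then                           # 3. health check
    docker stop --time "$drain" "$old_id" && docker rm "$old_id"   # 4a. cutover
    compose up -d --no-deps --scale "$svc"=1
  else
    docker stop "$new_id" && docker rm "$new_id"            # 4b. old keeps serving
    compose up -d --no-deps --scale "$svc"=1
    return 1
  fi
}
```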
On success: the server is in detached HEAD at <sha>. Log HEAD detached at <sha>.
On failure: the repo is restored to <previous_sha> before exiting. The invariant is preserved: git rev-parse HEAD always matches the image SHA that is actively serving traffic.
Where <compose-command> is the project's compose wrapper (see §3.1).
When stopping the old container, the tool sends SIGTERM and waits for in-flight requests to complete before removing it. This is the default behavior — not optional.
| Label | Default | Description |
|---|---|---|
| `deploy.drain` | 30 | Seconds to wait after SIGTERM before SIGKILL |
This maps directly to docker stop --time <seconds>. Traefik removes the container from its pool when it stops, so the drain period gives in-flight requests time to complete. Applications should handle SIGTERM gracefully (stop accepting new connections, finish existing ones).
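On the application side, a minimal SIGTERM-aware entrypoint might look like this sketch, where `sleep` stands in for the real server process:

```shell
# Illustrative SIGTERM handling for an app container (not part of the tool).
# A real server would stop accepting new connections here and finish
# in-flight requests before exiting.
serve() {
  sleep 300 &                     # stand-in for the app server process
  local child=$!
  trap 'kill -TERM "$child" 2>/dev/null; wait "$child" 2>/dev/null; echo "drained"; exit 0' TERM
  wait "$child" 2>/dev/null
}
```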
The tool relies entirely on Docker's native health check mechanism as declared in docker-compose.yml. The tool does not define, override, or interpret health checks — it simply polls docker inspect for the container's health status.
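The poll loop reduces to something like the following sketch; `docker inspect --format '{{.State.Health.Status}}'` is the only Docker call involved, and `wait_healthy` is a name invented here:

```shell
# Illustrative health-poll loop; timeout and interval mirror the
# deploy.healthcheck.timeout / deploy.healthcheck.poll labels.
wait_healthy() {
  local cid="$1" timeout="${2:-120}" poll="${3:-2}" waited=0 status
  while [ "$waited" -lt "$timeout" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$cid")
    [ "$status" = "healthy" ] && return 0
    sleep "$poll"
    waited=$((waited + poll))
  done
  return 1
}
```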
A service with deploy.role=app MUST have a healthcheck defined. The tool refuses to deploy a service without one.
Configuration (via labels on the service):
| Label | Default | Description |
|---|---|---|
| `deploy.healthcheck.timeout` | 120 | Seconds to wait for healthy before aborting |
| `deploy.healthcheck.poll` | 2 | Seconds between health status polls |
Services are deployed in the order they appear in docker-compose.yml. If a service fails, all subsequently listed services are skipped. Previously deployed services in the same run are NOT rolled back — they already passed health checks and are serving traffic. This matches the expand-then-contract migration discipline: each service should be independently deployable.
If explicit ordering is needed beyond file order, a label is available:
| Label | Default | Description |
|---|---|---|
| `deploy.order` | 100 | Integer. Lower deploys first. Ties broken by file order. |
Host information is declared in the compose file using Docker Compose's native x- extension mechanism. This keeps all deployment topology in the same file that defines the services.
Single-host (most projects): Declare defaults at the top level:
```yaml
x-deploy:
  host: app-1.example.com
  user: deploy
  port: 22
  dir: /srv/myapp

services:
  web:
    image: ghcr.io/myorg/myapp:${DEPLOY_TAG:-latest}
    labels:
      deploy.role: app
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]

  worker:
    image: ghcr.io/myorg/myapp:${DEPLOY_TAG:-latest}
    labels:
      deploy.role: app
    healthcheck:
      test: ["CMD", "celery", "inspect", "ping"]

  postgres:
    image: postgres:16
    labels:
      deploy.role: accessory
```

All services inherit `host`, `user`, `port`, and `dir` from `x-deploy`. One declaration, no repetition.
Multi-host: Per-service labels override the defaults:
```yaml
x-deploy:
  host: app-1.example.com
  user: deploy
  dir: /srv/myapp

services:
  web:
    labels:
      deploy.role: app

  celery-default:
    labels:
      deploy.role: app
      deploy.host: worker-1.example.com

  celery-email:
    labels:
      deploy.role: app
      deploy.host: worker-1.example.com

  centrifugo:
    labels:
      deploy.role: app
      deploy.host: realtime-1.example.com
      deploy.dir: /srv/centrifugo

  postgres:
    labels:
      deploy.role: accessory
      deploy.host: db-1.example.com
```

| Label | Default | Description |
|---|---|---|
| `deploy.host` | `x-deploy.host` | SSH hostname for this service |
| `deploy.user` | `x-deploy.user` | SSH user for this service |
| `deploy.port` | `x-deploy.port` | SSH port for this service |
| `deploy.dir` | `x-deploy.dir` | Project directory on the remote host |
Resolution order (highest priority wins):

1. GitHub Actions environment variable (`HOST_NAME`, `HOST_USER`, `SSH_PORT`)
2. Per-service label (`deploy.host`, `deploy.user`, `deploy.port`, `deploy.dir`)
3. Top-level `x-deploy` default
4. Error if none is set (for `host`)
Environment variable overrides exist because x-deploy values are committed to the repository. For public repos, operators may prefer to keep hostnames, usernames, and SSH ports out of version control. GitHub Actions environment variables provide this — set them in a GitHub environment and reference them in the deploy workflow.
| Environment Variable | Overrides | Description |
|---|---|---|
| `HOST_NAME` | `deploy.host` / `x-deploy.host` | SSH hostname for all services |
| `HOST_USER` | `deploy.user` / `x-deploy.user` | SSH user for all services |
| `SSH_PORT` | `deploy.port` / `x-deploy.port` | SSH port for all connections |
The GitHub Action (§7) reads these values by running <compose-command> config in CI, which outputs the fully merged YAML with all overrides applied. It then groups services by host, applies any environment variable overrides, and SSHes to each one.
Deferred to v2 (see §12.3). Hook support (pre-deploy commands like migrations, post-deploy commands like cache warming) is a natural extension but not required for v1. Migrations are left to the framework's own boot sequence or manual execution via flow-deploy exec.
The tool needs to know which image tag to deploy. When --tag is omitted, the tool uses the current HEAD SHA — deploying whatever is currently checked out. When --tag is provided, the tool checks out that ref and uses it as the image tag:
flow-deploy deploy --tag abc123f
The --tag value temporarily overrides the image tag for all deploy.role=app services before pulling. This is implemented via the DEPLOY_TAG environment variable, which compose files can reference:
```yaml
services:
  web:
    image: ghcr.io/myorg/myapp:${DEPLOY_TAG:-latest}
```

The deployed SHA is always available via `git rev-parse HEAD` — the server is kept in detached HEAD at the deployed commit. The tool does not write any state files to the working tree.
The tool expects a project directory containing a checked-out Git repository with a docker-compose.yml at its root. The standard layout:
```
/srv/myapp/
├── docker-compose.yml        # Base service definitions
├── docker-compose.prod.yml   # Production overrides
├── script/
│   ├── dev                   # Local dev compose wrapper
│   └── prod                  # Production compose wrapper
├── Dockerfile
├── .env                      # Environment variables (secrets)
└── (application source)
```
The tool writes no state files to the working tree. The deploy lock is stored at .git/deploy-lock (invisible to git status). The current deployed SHA is always git rev-parse HEAD.
The tool never calls docker compose directly. It delegates to a compose wrapper script that knows which override files to use for the current environment.
The wrapper is specified by the COMPOSE_COMMAND environment variable, which defaults to script/prod if the file exists.
Resolution order:

1. `COMPOSE_COMMAND` env var (explicit override)
2. `script/prod` (if present and executable)
3. `docker compose` (bare fallback — no overrides)
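The resolution order reduces to a few lines of shell (a sketch, not the tool's source; `resolve_compose_command` is a name invented here):

```shell
# Illustrative compose-command resolution, mirroring the order above.
resolve_compose_command() {
  if [ -n "${COMPOSE_COMMAND:-}" ]; then
    echo "$COMPOSE_COMMAND"          # 1. explicit override
  elif [ -x script/prod ]; then
    echo "script/prod"               # 2. conventional wrapper
  else
    echo "docker compose"            # 3. bare fallback
  fi
}
```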
A typical script/prod:
```shell
#!/usr/bin/env bash
docker compose -f docker-compose.yml -f docker-compose.prod.yml "$@"
```

A wrapper that includes image tagging:
```shell
#!/usr/bin/env bash
DEPLOY_TAG="${DEPLOY_TAG:-latest}" \
  docker compose -f docker-compose.yml -f docker-compose.prod.yml "$@"
```

This pattern means the tool is completely agnostic about which compose files exist or how they're layered. The project defines its own composition strategy, and the tool just calls the wrapper with the appropriate subcommand arguments (`pull`, `up`, `stop`, etc.).
All examples in this spec that reference <compose-command> refer to whatever this wrapper resolves to.
The GitHub Action (§7.1) accepts a command input that specifies the same wrapper. The action runs <command> config in CI to get the fully merged YAML for host discovery and service classification. This means both the CI-side action and the server-side tool use the same wrapper — one declaration, both sides agree.
Secrets are managed via .env files, which Docker Compose reads natively. The tool does not manage, rotate, or inject secrets. This is intentionally left to the operator.
The .env file should be excluded from version control and provisioned separately (via Ansible, manual setup, or a secrets manager). The tool only requires that the file exists if the compose file references it.
The tool is invoked as flow-deploy <command> [options].
Perform a rolling deploy of all deploy.role=app services.
flow-deploy deploy [--tag TAG] [--service SERVICE] [--dry-run]
| Flag | Description |
|---|---|
| `--tag TAG` | Image tag and git ref to deploy (defaults to current HEAD SHA) |
| `--service SERVICE` | Deploy only a specific service (repeatable) |
| `--dry-run` | Show what would happen without executing |
Exit codes:

- `0` — all services deployed successfully
- `1` — deploy failed (dirty tree, git error, unhealthy service, etc.)
- `2` — deploy lock held by another process
To deploy a previous version, run flow-deploy deploy --tag <previous-sha>. The SHA is always a git commit — find it in your CI history or via git log on the server.
Show the current state of all managed services.
flow-deploy status
Output includes: current HEAD SHA, service name, role, container ID, image tag, health status.
Run a command inside a running service container. Convenience wrapper around docker compose exec.
flow-deploy exec <service> <command...>
Tail logs for a service. Convenience wrapper around docker compose logs.
flow-deploy logs <service> [--follow] [--tail N]
Output resolved deploy configuration as JSON. Used by the GitHub Action to discover hosts without requiring the Python source tree.
flow-deploy config [--command COMMAND]
| Flag | Description |
|---|---|
| `--command COMMAND` | Override compose command (e.g. `docker compose`) |
Reads COMPOSE_COMMAND env var (same as all other commands), runs <compose-command> config, parses host discovery, and emits a JSON array of host groups to stdout. Errors go to stderr with exit code 1.
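The spec does not pin down the JSON schema; one plausible shape for the emitted host groups (illustrative only, reusing the example hosts from §2.6):

```json
[
  {"host": "app-1.example.com", "user": "deploy", "port": 22, "dir": "/srv/myapp", "services": ["web", "worker"]},
  {"host": "db-1.example.com", "user": "deploy", "port": 22, "dir": "/srv/myapp", "services": ["postgres"]}
]
```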
Update the tool to the latest release.
flow-deploy upgrade
This detects the system's libc (musl or glibc), downloads the latest binary from the GitHub release, and atomically replaces the current binary. Works when run manually on the server or via SSH from a CI pipeline.
Only one deploy may run at a time per project directory. The tool uses a lock file (.git/deploy-lock) containing the PID and timestamp of the running deploy. The lock is stored inside .git/ to avoid dirtying the working tree. The lock is:
- Acquired before any mutations (git checkout, container changes)
- Released on completion (success or failure)
- Retained if git restore fails after a failed deploy — prevents further deploys until manual intervention (the error message includes the fix: `git checkout --detach <sha> && rm .git/deploy-lock`)
- Automatically broken if the holding PID is no longer running (stale lock recovery)
- Reported with a clear message if held by another process (exit code 2)
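Stale-lock recovery hinges on one check: is the PID recorded in the lock file still alive? A sketch, assuming the lock file's first field is the PID (`lock_is_stale` is a name invented here):

```shell
# Illustrative stale-lock check; `kill -0` probes for process existence
# without sending a signal.
lock_is_stale() {
  local lockfile="$1" pid
  pid=$(awk '{print $1}' "$lockfile")
  ! kill -0 "$pid" 2>/dev/null
}
```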
All output goes to stdout. The format is human-readable and designed for both terminal use and GitHub Actions log rendering.
```
[12:34:56] ── deploy ──────────────────────────────
[12:34:56] tag: abc123f
[12:34:56] services: web, worker
[12:34:56]
[12:34:56] ▸ web
[12:34:56] pulling ghcr.io/myorg/myapp:abc123f...
[12:34:58] pulled (2.1s)
[12:34:58] starting new container...
[12:34:58] waiting for health check (timeout: 120s)...
[12:35:08] healthy (10.2s)
[12:35:08] draining old container (a1b2c3d, 30s timeout)...
[12:35:12] ✓ web deployed (16.1s)
[12:35:12]
[12:35:12] ▸ worker
[12:35:12] pulling ghcr.io/myorg/myapp:abc123f...
[12:35:13] pulled (1.0s)
[12:35:13] starting new container...
[12:35:13] waiting for health check (timeout: 120s)...
[12:35:23] healthy (9.8s)
[12:35:23] draining old container (e4f5g6h, 30s timeout)...
[12:35:26] ✓ worker deployed (13.5s)
[12:35:26]
[12:35:26] ── complete (29.6s) ─────────────────────
```
Failure output:
```
[12:35:12] ▸ worker
[12:35:12] pulling ghcr.io/myorg/myapp:abc123f...
[12:35:13] pulled (1.0s)
[12:35:13] starting new container...
[12:35:13] waiting for health check (timeout: 120s)...
[12:37:13] ✗ health check timeout (120.0s)
[12:37:13] aborting: stopping new container (x9y8z7w)...
[12:37:14] aborted, old container still serving
[12:37:14] ✗ worker FAILED
[12:37:14]
[12:37:14] restoring repo to a1b2c3d...
[12:37:14] ── FAILED (deploy aborted) ─────────────
```
Dirty-tree output:
```
[12:34:56] ERROR: working tree is dirty — deploy aborted
```
Since the tool runs over SSH, output naturally appears in Actions logs. For richer integration, the tool emits GitHub Actions log commands when it detects the GITHUB_ACTIONS=true environment variable (passed through SSH):
- `::group::service-name` / `::endgroup::` for collapsible sections
- `::error::` for deploy failures
- Step summary written to `$GITHUB_STEP_SUMMARY` if available
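A sketch of the conditional emission (the `::group::` / `::error::` syntax is GitHub's workflow-command format; the function names are invented here):

```shell
# Illustrative: emit Actions workflow commands only when running under CI.
log_group() {
  if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
    echo "::group::$1"
  else
    echo "▸ $1"
  fi
}
log_error() {
  if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
    echo "::error::$1"
  else
    echo "✗ $1"
  fi
}
```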
For multi-host deploys or when you want host discovery from compose labels, use the GitHub Action. The action:
1. Checks out the repo (already done by the workflow)
2. Runs `<command> config` to get the fully merged compose YAML
3. Parses `x-deploy` and `deploy.*` labels to discover hosts
4. Groups services by host
5. SSHes to each host: `flow-deploy deploy --tag <tag>`
6. Streams logs back to GitHub Actions
```yaml
name: Deploy

on:
  push:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE: ${{ github.repository }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    outputs:
      tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE }}
          tags: |
            type=sha,prefix=
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: flowcanon/deploy-action@master
        with:
          tag: ${{ needs.build.outputs.tag }}
          ssh-key: ${{ secrets.DEPLOY_SSH_KEY }}
          command: script/prod
```

The `command` input tells the action which compose wrapper to use for config parsing. This is the same wrapper `flow-deploy` uses on the server — one declaration, both sides agree.
For staging, swap the wrapper:
```yaml
- uses: flowcanon/deploy-action@master
  with:
    tag: ${{ needs.build.outputs.tag }}
    ssh-key: ${{ secrets.DEPLOY_SSH_KEY }}
    command: script/staging
```

For single-host projects, the action is optional. A raw SSH command works:
```yaml
deploy:
  needs: build
  runs-on: ubuntu-latest
  steps:
    - name: Deploy
      run: |
        ssh -o StrictHostKeyChecking=no deploy@${{ secrets.PROD_HOST }} \
          "cd /srv/myapp && \
           GITHUB_ACTIONS=true flow-deploy deploy --tag ${{ needs.build.outputs.tag }}"
```

This is the simplest possible deploy: one SSH call, no action, no host discovery. The tool handles git operations (fetch, detached checkout), calls `script/prod`, and runs the rolling deploy. No separate `git pull` or `git checkout` is needed — the tool owns the full transaction.
```shell
# Install the latest binary
curl -fsSL https://deploy.flowcanon.com/install | sh

# Verify
flow-deploy --version
```

This installs a standalone binary to `~/.local/bin`. Set `INSTALL_DIR` to install elsewhere. The installer detects the system's libc (musl or glibc) and downloads the appropriate binary from GitHub releases.
```shell
# Clone the project
git clone git@github.com:myorg/myapp.git /srv/myapp

# Authenticate with GHCR (once per server)
echo $GHCR_TOKEN | docker login ghcr.io -u $GHCR_USER --password-stdin

# Create .env with secrets
cp .env.example .env
vim .env

# Ensure script/prod exists and is executable
chmod +x script/prod

# Start accessories (these run independently of deploys)
cd /srv/myapp
script/prod up -d postgres redis

# First deploy
flow-deploy deploy
```

```shell
# From GitHub Actions or any CI — upgrade all hosts then deploy
for host in web1 web2 web3; do
  ssh deploy@$host "flow-deploy upgrade"
done
```

Or embed it in the deploy workflow:
```yaml
- name: Upgrade and Deploy
  run: |
    ssh deploy@${{ secrets.PROD_HOST }} \
      "flow-deploy upgrade && \
       cd /srv/myapp && flow-deploy deploy --tag ${{ needs.build.outputs.tag }}"
```

The tool has zero configuration files. All behavior is controlled by `x-deploy` defaults, labels on services in `docker-compose.yml`, and optional environment variable overrides:
Top-level defaults (via x-deploy):
| Key | Required | Description |
|---|---|---|
| `x-deploy.host` | No* | Default SSH hostname |
| `x-deploy.user` | No* | Default SSH user |
| `x-deploy.port` | No | Default SSH port |
| `x-deploy.dir` | No | Default project directory on remote host |
* Required unless overridden by an environment variable or per-service label.
Per-service labels (override x-deploy defaults):
| Label | Required | Default | Description |
|---|---|---|---|
| `deploy.role` | Yes | (none) | `app` or `accessory` |
| `deploy.host` | No | `x-deploy.host` | SSH hostname for this service |
| `deploy.user` | No | `x-deploy.user` | SSH user for this service |
| `deploy.port` | No | `x-deploy.port` | SSH port for this service |
| `deploy.dir` | No | `x-deploy.dir` | Project directory on remote host |
| `deploy.order` | No | 100 | Deploy order (lower first) |
| `deploy.drain` | No | 30 | Seconds to wait after SIGTERM before SIGKILL |
| `deploy.healthcheck.timeout` | No | 120 | Seconds before aborting |
| `deploy.healthcheck.poll` | No | 2 | Seconds between polls |
Environment variable overrides (highest priority, set via GitHub Actions environment):
| Variable | Overrides | Description |
|---|---|---|
| `HOST_NAME` | `deploy.host` / `x-deploy.host` | SSH hostname for all services |
| `HOST_USER` | `deploy.user` / `x-deploy.user` | SSH user for all services |
| `SSH_PORT` | `deploy.port` / `x-deploy.port` | SSH port for all connections |
These are useful for public repositories where hostnames and usernames should not be committed to version control.
Plus the standard Docker/Traefik labels you're already using.
Explicitly out of scope, by design:
- Build images. That's CI's job.
- Manage secrets. Use `.env` files, Ansible, Vault, or whatever you prefer.
- Provision servers. That's Ansible's job.
- Manage DNS or SSL. That's Traefik's job.
- Run from your laptop. SSH into the server for manual operations, or trigger from CI.
- Replace Docker Compose. It's a thin orchestration layer on top of compose, not a replacement.
All v1 design decisions are made with the v2 roadmap in mind. Specifically:
- The single-command `flow-deploy deploy` is a convenience that runs prepare + health check + cutover in one shot. v2 splits this into discrete phases without breaking the v1 interface.
- The `deploy.role` label convention is extensible — v2 adds behavior, not new classification schemes.
- The compose command wrapper (§3.1) means the tool never makes assumptions about compose file structure, which keeps it compatible with arbitrarily complex service topologies.
- The tool writes no state to the working tree. The deploy lock lives in `.git/deploy-lock`, and the current deployed SHA is always `git rev-parse HEAD`. v2 adds `.git/deploy-prepare` for two-phase state tracking.
- Host discovery via `x-deploy` and `deploy.host` labels (§2.6) gives the GitHub Action enough information to orchestrate multi-host deploys in v1 (sequential) and v2 (two-phase coordinated).
v1 supports multi-host deploys via the GitHub Action (§7.1), which discovers hosts from compose labels, SSHes to each one, and runs flow-deploy deploy sequentially. This works but has a gap: if host 3 of 4 fails, hosts 1 and 2 are already on the new version while hosts 3 and 4 are on the old version.
v2 introduces three commands that decompose the deploy lifecycle to solve this:
| Command | Behavior |
|---|---|
| `flow-deploy prepare --tag TAG` | Pull new image, start new container alongside old, health check. Both containers running. Old still serving traffic. Stop here. |
| `flow-deploy cutover` | Graceful shutdown of old containers. New containers take traffic. |
| `flow-deploy cancel` | Kill new containers. Old containers continue serving. No-op rollback. |
flow-deploy deploy remains available as a convenience that runs prepare + cutover in one shot, preserving full backward compatibility with v1 workflows.
CI orchestration for fleet deploys:
```
build ──→ prepare (all nodes in parallel, fail-fast)
              │
              ├── all green ──→ cutover (all nodes in parallel)
              │
              └── any red ───→ cancel (all nodes, best-effort)
```
The tool remains single-node — it has no awareness of other nodes. CI (GitHub Actions) is the fleet orchestrator, using job dependencies and fail-fast matrix strategies to coordinate across hosts.
Stale prepare protection: If CI crashes between prepare and cutover, two containers are left running indefinitely. The tool writes a `.git/deploy-prepare` file with a timestamp. If a prepare is older than a configurable threshold (default: 30 minutes), subsequent commands auto-cancel it and `flow-deploy status` warns about it.
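The age check itself is trivial; a sketch assuming the file stores a Unix timestamp (`prepare_is_stale` is a name invented here):

```shell
# Illustrative stale-prepare check (default threshold: 30 minutes).
prepare_is_stale() {
  local file="$1" threshold="${2:-1800}" age
  age=$(( $(date +%s) - $(cat "$file") ))
  [ "$age" -gt "$threshold" ]
}
```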
If a service is already scaled to N (e.g., 3 workers), the rolling deploy should scale to N+1, health check the new instance, then kill one old instance, repeating until all N are replaced. Managed via a deploy.scale label:
| Label | Default | Description |
|---|---|---|
| `deploy.scale` | 1 | Number of instances for this service |
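The N+1 loop could be sketched as follows; `compose`, `wait_healthy_newest`, and `stop_oldest` are hypothetical helpers standing in for the real implementation:

```shell
# Illustrative v2 rolling replace for a service scaled to N instances.
roll_scaled() {
  local svc="$1" n="$2" i
  for i in $(seq 1 "$n"); do
    compose up -d --no-deps --no-recreate --scale "$svc"=$((n + 1))  # expand to N+1
    wait_healthy_newest "$svc" || return 1                           # gate on health
    stop_oldest "$svc"                                               # drain one old instance
    compose up -d --no-deps --scale "$svc"="$n"                      # contract to N
  done
}
```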
Hook support for pre-deploy commands (migrations, asset compilation) and post-deploy commands (cache warming, notifications). Deferred from v1 — migrations are handled by the framework's own boot sequence or via manual flow-deploy exec.
Captured here for context on why v1 works the way it does:
- Compose scaling behavior: Confirmed that `--no-recreate --scale=2` leaves the old container untouched (env vars, volumes, networks unchanged). Validated as the correct approach for rolling deploys.
- Compose override detection: The tool delegates to the project's compose wrapper (`script/prod` or `COMPOSE_COMMAND`). It never detects or assembles compose file stacks itself. Projects define their own composition strategy.
- Traefik drain: Graceful shutdown is the default behavior. `docker stop --time <drain>` sends SIGTERM and waits (default 30s) before SIGKILL. Configurable via the `deploy.drain` label.
- Hook complexity: Deferred to v2. Not needed for v1 — migrations are handled by the framework or via manual `flow-deploy exec`.
- Fleet coordination: The server-side tool (`flow-deploy`) is single-node by design. Multi-node coordination is handled by the GitHub Action (`flow-deploy-action`), which discovers hosts from compose labels and orchestrates via SSH. v2's two-phase deploy (prepare/cutover/cancel) gives the action the primitives it needs for coordinated fleet deploys.