Commit 33b6db4

If deploy fails, rollback state
1 parent 521de7c commit 33b6db4

6 files changed

Lines changed: 262 additions & 13 deletions

SPEC.md

Lines changed: 27 additions & 4 deletions
@@ -29,7 +29,19 @@ Every service in `docker-compose.yml` is classified by a label:
2929

3030
### 2.2 Deploy Lifecycle
3131

32-
For each service with `deploy.role=app`, in the order they appear in the compose file:
32+
The deploy runs as a single transaction: `flow-deploy deploy --tag <sha>` owns the full lifecycle, including git operations. The `--tag` value serves double duty as both the Docker image tag and the git SHA to check out.
33+
34+
**Pre-flight and git checkout (before any service work):**
35+
36+
```
37+
0a. Dirty check git status --porcelain
38+
If non-empty → log "working tree is dirty — deploy aborted", exit 1
39+
0b. Fetch git fetch origin
40+
0c. Record previous SHA previous_sha = git rev-parse HEAD
41+
0d. Checkout (detached) git checkout --detach <sha>
42+
```
43+
44+
**For each service with `deploy.role=app`, in the order they appear in the compose file:**
3345

3446
```
3547
1. Pull new image <compose-command> pull <service>
@@ -43,9 +55,14 @@ For each service with `deploy.role=app`, in the order they appear in the compose
4355
4b. If unhealthy:
4456
Stop new container docker stop <new_id> && docker rm <new_id>
4557
Scale back to 1 <compose-command> up -d --no-deps --scale <service>=1
58+
Restore repo git checkout --detach <previous_sha>
4659
✗ Abort deploy, exit 1
4760
```
4861

62+
**On success:** the server is in detached HEAD at `<sha>`. Log `HEAD detached at <sha>`.
63+
64+
**On failure:** the repo is restored to `<previous_sha>` before exiting. The invariant is preserved: `git rev-parse HEAD` always matches the image SHA that is actively serving traffic.
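
The invariant can be checked mechanically by comparing the output of `git rev-parse HEAD` with the deployed tag. The sketch below assumes the tag may be an abbreviated SHA, so it compares by prefix; `head_matches_tag` is a hypothetical helper for illustration and is not part of `flow-deploy`.

```python
from __future__ import annotations


def head_matches_tag(head_sha: str, tag: str) -> bool:
    """Return True if `tag` is a (possibly abbreviated) form of `head_sha`."""
    head = head_sha.strip().lower()
    short = tag.strip().lower()
    # An empty tag is a prefix of every string, so reject it explicitly.
    return bool(short) and head.startswith(short)


# `git rev-parse HEAD` output ends with a newline; the helper strips it.
print(head_matches_tag("a1b2c3d4e5f6\n", "a1b2c3d"))
print(head_matches_tag("a1b2c3d4e5f6\n", "deadbee"))
```

Prefix matching is a deliberate simplification: it cannot tell whether an abbreviated SHA is ambiguous in the repository, which only git itself can resolve.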
65+
4966
Where `<compose-command>` is the project's compose wrapper (see §3.1).
5067

5168
### 2.3 Graceful Shutdown
@@ -398,9 +415,16 @@ Failure output:
398415
[12:37:14] rollback complete, old container still serving
399416
[12:37:14] ✗ worker FAILED
400417
[12:37:14]
418+
[12:37:14] restoring repo to a1b2c3d...
401419
[12:37:14] ── FAILED (deploy aborted) ─────────────
402420
```
403421
422+
Dirty-tree output:
423+
424+
```
425+
[12:34:56] ERROR: working tree is dirty — deploy aborted
426+
```
427+
404428
### 6.1 GitHub Actions Integration
405429
406430
Since the tool runs over SSH, output naturally appears in Actions logs. For richer integration, the tool emits GitHub Actions log commands when it detects the `GITHUB_ACTIONS=true` environment variable (passed through SSH):
@@ -421,7 +445,7 @@ For multi-host deploys or when you want host discovery from compose labels, use
421445
2. Runs `<command> config` to get the fully merged compose YAML
422446
3. Parses `x-deploy` and `deploy.*` labels to discover hosts
423447
4. Groups services by host
424-
5. SSHes to each host: `git pull` → `flow-deploy deploy --tag <tag>`
448+
5. SSHes to each host: `flow-deploy deploy --tag <tag>`
425449
6. Streams logs back to GitHub Actions
426450
427451
```yaml
@@ -504,11 +528,10 @@ For single-host projects, the action is optional. A raw SSH command works:
504528
run: |
505529
ssh -o StrictHostKeyChecking=no deploy@${{ secrets.PROD_HOST }} \
506530
"cd /srv/myapp && \
507-
git fetch && git checkout -B main origin/main && \
508531
GITHUB_ACTIONS=true flow-deploy deploy --tag ${{ needs.build.outputs.tag }}"
509532
```
510533
511-
This is the simplest possible deploy: one SSH call, no action, no host discovery. The tool runs locally on the server, calls `script/prod`, and handles the rolling deploy.
534+
This is the simplest possible deploy: one SSH call, no action, no host discovery. The tool handles git operations (fetch, detached checkout), calls `script/prod`, and runs the rolling deploy. No separate `git pull` or `git checkout` is needed — the tool owns the full transaction.
512535

513536
---
514537

docs/github-actions.md

Lines changed: 5 additions & 6 deletions
@@ -8,15 +8,15 @@ The deploy pipeline:
88

99
1. CI builds your Docker image and pushes to GHCR
1010
2. The deploy action discovers hosts from your `docker-compose.yml`
11-
3. For each host: authenticates with GHCR, pulls the repo, and runs `flow-deploy deploy`
11+
3. For each host: authenticates with GHCR and runs `flow-deploy deploy` (which handles git fetch and checkout internally)
1212

1313
## Prerequisites
1414

1515
On your deploy server:
1616

1717
- Docker and Docker Compose
1818
- Traefik (or your reverse proxy) running
19-
- Git (the server repo is updated via `git pull --ff-only` before each deploy)
19+
- Git (`flow-deploy` handles `git fetch` and detached checkout internally during each deploy)
2020
- `flow-deploy` installed:
2121

2222
```sh
@@ -164,8 +164,7 @@ For each host group discovered from your compose config:
164164
1. **SSH agent** — loads your deploy key
165165
2. **Discover hosts** — parses `docker-compose.yml` for `x-deploy` and `deploy.*` labels, groups services by `(host, user, port, dir)`
166166
3. **GHCR login** — authenticates Docker on the server (and logs out after)
167-
4. **Git pull** — fast-forward only, fails safely if the server has diverged
168-
5. **Deploy** — runs `flow-deploy deploy --tag <tag>` on the server
167+
4. **Deploy** — runs `flow-deploy deploy --tag <tag>` on the server (git fetch and detached checkout are handled by the tool)
169168

170169
## Host Discovery
171170

@@ -221,8 +220,8 @@ To cut releases with binaries and changelogs, see the release workflow in this r
221220

222221
## Troubleshooting
223222

224-
**`git pull --ff-only` fails:**
225-
The server repo has diverged from the remote. SSH into the server and resolve manually — check for local commits or uncommitted changes.
223+
**`working tree is dirty — deploy aborted`:**
224+
The server repo has uncommitted changes. SSH into the server and resolve manually — `git status` will show what's dirty.
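
To see which paths are blocking a deploy, the `git status --porcelain` output can be split into status codes and paths. This is a minimal sketch of the version 1 short format (two status characters, a space, then the path); `parse_porcelain` is a hypothetical name, and the sketch leaves rename entries (`old -> new`) and quoted paths as-is.

```python
from __future__ import annotations


def parse_porcelain(output: str) -> list[tuple[str, str]]:
    """Split `git status --porcelain` (v1) output into (status, path) pairs."""
    entries = []
    # Do not strip the whole string: a leading space is part of the status code.
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        entries.append((line[:2], line[3:]))
    return entries


for status, path in parse_porcelain(" M somefile.py\n?? notes.txt\n"):
    print(f"{status!r} {path}")
```

A status of `' M'` means modified but not staged, and `'??'` means untracked; either one makes the porcelain output non-empty and therefore aborts the deploy.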
226225

227226
**`unauthorized` pulling from GHCR:**
228227
Pass `registry-token: ${{ secrets.GITHUB_TOKEN }}` to the deploy action. The job needs `packages: write` (or at least `packages: read`) permission.

src/flow_deploy/deploy.py

Lines changed: 11 additions & 2 deletions
@@ -3,7 +3,7 @@
33
import signal
44
import time
55

6-
from flow_deploy import compose, config, containers, lock, log, tags
6+
from flow_deploy import compose, config, containers, git, lock, log, tags
77

88

99
def deploy(
@@ -12,7 +12,7 @@ def deploy(
1212
dry_run: bool = False,
1313
cmd: list[str] | None = None,
1414
) -> int:
15-
"""Perform a rolling deploy. Returns exit code (0=success, 1=failure, 2=locked)."""
15+
"""Perform a rolling deploy. Returns exit code (0=success, 1=failure, 2=locked, 3=skipped)."""
1616
compose_cmd = cmd or compose.resolve_command()
1717

1818
# Parse compose config
@@ -47,11 +47,18 @@ def deploy(
4747
_dry_run(tag, app_services)
4848
return 0
4949

50+
# Git pre-flight: dirty check, fetch, checkout detached
51+
git_code, previous_sha = git.preflight_and_checkout(tag)
52+
if git_code != 0:
53+
return 1
54+
5055
# Acquire lock
5156
if not lock.acquire():
5257
lock_info = lock.read_lock()
5358
pid = lock_info["pid"] if lock_info else "unknown"
5459
log.error(f"Deploy lock held by PID {pid}")
60+
# Restore repo to previous state before exiting
61+
git.restore(previous_sha)
5562
return 2
5663

5764
# Register signal handlers for cleanup
@@ -80,6 +87,7 @@ def _cleanup_handler(signum, frame):
8087
result = _deploy_service(svc, tag, compose_cmd, project=project)
8188
if result != 0:
8289
log.info("")
90+
git.restore(previous_sha)
8391
log.footer("FAILED (deploy aborted)")
8492
lock.release()
8593
return 1
@@ -88,6 +96,7 @@ def _cleanup_handler(signum, frame):
8896
tags.write_tag(tag)
8997

9098
log.info("")
99+
log.info(f"HEAD detached at {tag}")
91100
log.footer(f"complete ({elapsed:.1f}s)")
92101
finally:
93102
lock.release()

src/flow_deploy/git.py

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
1+
"""Git operations for detached-HEAD deploy strategy."""
2+
3+
from flow_deploy import log, process
4+
5+
6+
def is_dirty() -> bool:
7+
"""Return True if the working tree has uncommitted changes."""
8+
result = process.run(["git", "status", "--porcelain"])
9+
return bool(result.stdout.strip())
10+
11+
12+
def fetch() -> process.Result:
13+
"""Fetch from origin."""
14+
return process.run(["git", "fetch", "origin"])
15+
16+
17+
def current_sha() -> str:
18+
"""Return the current HEAD SHA."""
19+
result = process.run(["git", "rev-parse", "HEAD"])
20+
return result.stdout.strip()
21+
22+
23+
def checkout_detached(sha: str) -> process.Result:
24+
"""Checkout a specific SHA in detached HEAD mode."""
25+
return process.run(["git", "checkout", "--detach", sha])
26+
27+
28+
def preflight_and_checkout(tag: str) -> tuple[int, str | None]:
29+
"""Run git pre-flight checks and checkout the deploy SHA.
30+
31+
Returns (exit_code, previous_sha).
32+
- (0, previous_sha) on success — repo is now at `tag` in detached HEAD.
33+
- (1, None) on error — git operation failed or working tree is dirty.
34+
"""
35+
# 1. Dirty check
36+
if is_dirty():
37+
log.error("working tree is dirty — deploy aborted")
38+
return 1, None
39+
40+
# 2. Fetch
41+
result = fetch()
42+
if result.returncode != 0:
43+
log.error(f"git fetch failed: {result.stderr.strip()}")
44+
return 1, None
45+
46+
# 3. Record previous SHA
47+
previous_sha = current_sha()
48+
49+
# 4. Checkout new SHA (detached HEAD)
50+
result = checkout_detached(tag)
51+
if result.returncode != 0:
52+
log.error(f"git checkout failed: {result.stderr.strip()}")
53+
return 1, None
54+
55+
return 0, previous_sha
56+
57+
58+
def restore(previous_sha: str) -> bool:
59+
"""Restore repo to a previous SHA after a failed deploy."""
60+
log.step(f"restoring repo to {previous_sha[:7]}...")
61+
result = checkout_detached(previous_sha)
62+
if result.returncode != 0:
63+
log.error(f"git restore failed: {result.stderr.strip()}")
64+
return False
65+
return True
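
The ordering of the pre-flight steps can be exercised without a real repository by injecting the command runner. The sketch below is self-contained and only mirrors the flow above: `Result`, `Runner`, `preflight`, and `FakeRunner` are hypothetical stand-ins, not the actual `flow_deploy.process` API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class Result:
    """Hypothetical stand-in for a process result (returncode, stdout, stderr)."""
    returncode: int
    stdout: str = ""
    stderr: str = ""


Runner = Callable[[Sequence[str]], Result]


def preflight(run: Runner, sha: str) -> tuple[int, str | None]:
    """Dirty check, fetch, record HEAD, detached checkout, in that order."""
    if run(["git", "status", "--porcelain"]).stdout.strip():
        return 1, None  # dirty tree: stop before touching the repo
    if run(["git", "fetch", "origin"]).returncode != 0:
        return 1, None
    previous = run(["git", "rev-parse", "HEAD"]).stdout.strip()
    if run(["git", "checkout", "--detach", sha]).returncode != 0:
        return 1, None
    return 0, previous


@dataclass
class FakeRunner:
    """Replays canned results and records every command it was asked to run."""
    responses: list[Result]
    calls: list[list[str]] = field(default_factory=list)

    def __call__(self, cmd: Sequence[str]) -> Result:
        self.calls.append(list(cmd))
        return self.responses.pop(0)


clean = FakeRunner([Result(0), Result(0), Result(0, "prev123abc\n"), Result(0)])
print(preflight(clean, "abc123"))  # (0, 'prev123abc')

dirty = FakeRunner([Result(0, " M somefile.py\n")])
print(preflight(dirty, "abc123"), len(dirty.calls))  # (1, None) 1
```

Recording the calls makes it possible to assert that a dirty tree stops the flow after a single git command, before any fetch or checkout runs.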

tests/test_deploy.py

Lines changed: 46 additions & 1 deletion
@@ -6,6 +6,7 @@
66
from flow_deploy.deploy import deploy, rollback
77

88
COMPOSE_CMD = ["docker", "compose"]
9+
PREV_SHA = "prev123abc"
910

1011
COMPOSE_CONFIG_YAML = """\
1112
services:
@@ -70,13 +71,25 @@ def _err(stderr="error"):
7071
return process.Result(1, "", stderr)
7172

7273

74+
def _git_preflight():
75+
"""Return the 4 mock responses for a clean git preflight."""
76+
return [
77+
_ok(""), # git status --porcelain (clean)
78+
_ok(), # git fetch origin
79+
_ok(PREV_SHA + "\n"), # git rev-parse HEAD
80+
_ok(), # git checkout --detach <sha>
81+
]
82+
83+
7384
def _setup_happy_path(mock_process, monkeypatch, tmp_path):
7485
"""Set up mock responses for a successful 2-service deploy."""
7586
monkeypatch.chdir(tmp_path)
7687
mock_process.responses.extend(
7788
[
7889
# compose config
7990
_ok(COMPOSE_CONFIG_YAML),
91+
# git preflight
92+
*_git_preflight(),
8093
# web: pull
8194
_ok(),
8295
# web: scale to 2
@@ -124,6 +137,7 @@ def test_deploy_service_filter(mock_process, monkeypatch, tmp_path):
124137
mock_process.responses.extend(
125138
[
126139
_ok(COMPOSE_CONFIG_YAML),
140+
*_git_preflight(),
127141
# web only: pull, scale, ps, health, stop, rm, scale back
128142
_ok(),
129143
_ok(),
@@ -158,6 +172,7 @@ def test_deploy_health_check_failure(mock_process, monkeypatch, tmp_path):
158172
mock_process.responses.extend(
159173
[
160174
_ok(COMPOSE_CONFIG_YAML),
175+
*_git_preflight(),
161176
# web: pull
162177
_ok(),
163178
# web: scale to 2
@@ -170,6 +185,8 @@ def test_deploy_health_check_failure(mock_process, monkeypatch, tmp_path):
170185
_ok(),
171186
# web: scale back to 1
172187
_ok(),
188+
# git restore to previous SHA
189+
_ok(),
173190
]
174191
)
175192
result = deploy(tag="abc123", cmd=COMPOSE_CMD)
@@ -181,7 +198,10 @@ def test_deploy_pull_failure(mock_process, monkeypatch, tmp_path):
181198
mock_process.responses.extend(
182199
[
183200
_ok(COMPOSE_CONFIG_YAML),
201+
*_git_preflight(),
184202
_err("pull failed"),
203+
# git restore to previous SHA
204+
_ok(),
185205
]
186206
)
187207
result = deploy(tag="abc123", cmd=COMPOSE_CMD)
@@ -190,7 +210,14 @@ def test_deploy_pull_failure(mock_process, monkeypatch, tmp_path):
190210

191211
def test_deploy_lock_held(mock_process, monkeypatch, tmp_path):
192212
monkeypatch.chdir(tmp_path)
193-
mock_process.responses.append(_ok(COMPOSE_CONFIG_YAML))
213+
mock_process.responses.extend(
214+
[
215+
_ok(COMPOSE_CONFIG_YAML),
216+
*_git_preflight(),
217+
# git restore after lock rejection
218+
_ok(),
219+
]
220+
)
194221
# Pre-acquire lock with current PID
195222
from flow_deploy import lock
196223

@@ -245,10 +272,12 @@ def test_deploy_container_count_mismatch(mock_process, monkeypatch, tmp_path):
245272
mock_process.responses.extend(
246273
[
247274
_ok(single_svc_config),
275+
*_git_preflight(),
248276
_ok(), # pull
249277
_ok(), # scale to 2
250278
_ok(WEB_CONTAINER_OLD + "\n"), # only 1 container returned
251279
_ok(), # scale back to 1
280+
_ok(), # git restore
252281
]
253282
)
254283
result = deploy(tag="abc123", cmd=COMPOSE_CMD)
@@ -286,6 +315,7 @@ def test_rollback(mock_process, monkeypatch, tmp_path):
286315
mock_process.responses.extend(
287316
[
288317
_ok(single_svc_config),
318+
*_git_preflight(),
289319
_ok(),
290320
_ok(),
291321
_ok(WEB_CONTAINER_OLD + "\n" + WEB_CONTAINER_NEW + "\n"),
@@ -299,6 +329,21 @@ def test_rollback(mock_process, monkeypatch, tmp_path):
299329
assert result == 0
300330

301331

332+
def test_deploy_dirty_tree_fails(mock_process, monkeypatch, tmp_path, capsys):
333+
monkeypatch.chdir(tmp_path)
334+
mock_process.responses.extend(
335+
[
336+
_ok(COMPOSE_CONFIG_YAML),
337+
# git status --porcelain returns dirty
338+
_ok(" M somefile.py\n"),
339+
]
340+
)
341+
result = deploy(tag="abc123", cmd=COMPOSE_CMD)
342+
assert result == 1
343+
err = capsys.readouterr().err
344+
assert "dirty" in err
345+
346+
302347
def test_rollback_no_previous(monkeypatch, tmp_path):
303348
monkeypatch.chdir(tmp_path)
304349
result = rollback(cmd=COMPOSE_CMD)
