
Recover blocked runs with saved branch/worktree state #35

@rochecompaan

Summary

A run can produce useful unmerged work and then end with status: "blocked" because an external prerequisite is unavailable. When the prerequisite is later fixed, patchmill run-once --issue <n> currently dead-ends with a stale branch/worktree safety error instead of offering a supported recovery path.

Observed error on retry:

Non-resumable run state for issue #<n> has stale branch/worktree; clean up before starting a fresh run

This is safe in the sense that it avoids overwriting existing work, but it leaves the operator to manually inspect, preserve, delete, or resurrect the branch/worktree. That is risky when the branch contains unmerged commits.

Sanitized context from a real run

Patchmill version: 0.12.0

Flow:

  1. patchmill run-once --issue <n> selected an agent-ready issue.
  2. Patchmill created or reused the spec/plan state.
  3. Patchmill created a worktree and branch similar to:
    • worktree: .worktrees/patchmill-issue-<n>-<slug>
    • branch: agent/issue-<n>-<slug>
  4. The implementation agent made several commits on that branch.
  5. Required verification could not run because a project-managed verification environment was unavailable, and the project instructions prohibited substituting host-run tests.
  6. The implementation agent returned a blocked result similar to:
{
  "status": "blocked",
  "reason": "Required project verification environment is unavailable; project instructions prohibit substituting host tests.",
  "questions": [
    {
      "question": "Can you start or repair the project verification environment so the required readiness and test commands can run?",
      "recommendedAnswer": "Start or repair the project-managed verification stack and confirm the required readiness command succeeds before rerunning verification."
    }
  ],
  "commits": [
    "<commit-1>",
    "<commit-2>",
    "<commit-3>",
    "<commit-4>"
  ],
  "validation": [
    "formatting on changed files: passed",
    "git diff --check: passed",
    "required verification readiness check: failed because the project-managed environment was not running",
    "required test command: not run because the required verification environment was not ready"
  ]
}
  7. Patchmill wrote .patchmill/runs/issue-<n>.json with:
{
  "issueNumber": <n>,
  "status": "blocked",
  "branch": "agent/issue-<n>-<slug>",
  "worktreePath": ".worktrees/patchmill-issue-<n>-<slug>",
  "lastError": "Required project verification environment is unavailable; project instructions prohibit substituting host tests."
}
  8. A later patchmill run-once --issue <n> failed immediately with:
Non-resumable run state for issue #<n> has stale branch/worktree; clean up before starting a fresh run

The existing worktree was clean, the branch was unmerged, and the commits were still valuable.

Why this happens

The current resumability model appears to treat only these states as resumable:

state.status === "claimed" || state.status === "planning" || state.status === "implementing"

A blocked run is therefore non-resumable. The stale branch/worktree guard then refuses to reset or continue, because the non-resumable state still records a branch or worktree.
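The behavior described above can be sketched roughly as follows. This is a guess at the control flow based on the observed error, not Patchmill's actual source: the RunState shape and the function names isResumable and staleStateError are hypothetical, and only the three status values and the error text come from this report.

```typescript
// Hypothetical shape mirroring the fields seen in .patchmill/runs/issue-<n>.json.
interface RunState {
  issueNumber: number;
  status: string;
  branch?: string;
  worktreePath?: string;
  lastError?: string;
}

// The three statuses this report identifies as resumable.
const RESUMABLE_STATUSES: ReadonlySet<string> = new Set([
  "claimed",
  "planning",
  "implementing",
]);

function isResumable(state: RunState): boolean {
  return RESUMABLE_STATUSES.has(state.status);
}

// Returns the stale-state error text, or null when the guard does not fire.
function staleStateError(state: RunState): string | null {
  if (isResumable(state)) return null;
  const hasSavedRefs = Boolean(state.branch) || Boolean(state.worktreePath);
  if (!hasSavedRefs) return null;
  return `Non-resumable run state for issue #${state.issueNumber} has stale branch/worktree; clean up before starting a fresh run`;
}
```

Under this reading, a state with status "blocked" and a saved branch always reaches the error, regardless of whether the blocker has since been fixed.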

That safety guard is good, but the UX needs a guided recovery path.

Recommendation

Patchmill should treat blocked runs with preserved branch/worktree state as recoverable, or provide an explicit recovery command.

Option A: make some blocked states resumable

A blocked state should be resumable when:

  • it has a saved branch and/or worktree,
  • the branch/worktree still exists,
  • the worktree is clean or the user explicitly confirms how to handle local modifications,
  • the branch is not already merged, and
  • the blocker is external/actionable rather than a completed terminal outcome.

For environment or prerequisite blockers, retrying after the prerequisite is fixed should continue from the existing branch/worktree instead of requiring manual cleanup.
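As a minimal sketch, the Option A conditions could be combined into a single predicate. Every name here (BlockedRunFacts, isRecoverableBlockedRun, blockerKind) is hypothetical, and the sketch assumes the facts have already been gathered from git and the run state. It also reads "the branch/worktree still exists" as "every saved reference still exists", which is one possible interpretation.

```typescript
// Hypothetical inputs; nothing here is an existing Patchmill type.
interface BlockedRunFacts {
  status: string;
  savedBranch?: string;
  savedWorktreePath?: string;
  branchExists: boolean;
  worktreeExists: boolean;
  worktreeClean: boolean;
  dirtyHandlingConfirmed: boolean; // user explicitly chose how to handle local changes
  branchMerged: boolean;
  blockerKind: "external" | "terminal"; // assumed classification of the blocker
}

function isRecoverableBlockedRun(f: BlockedRunFacts): boolean {
  if (f.status !== "blocked") return false;

  // It has a saved branch and/or worktree.
  const hasSavedRefs = Boolean(f.savedBranch) || Boolean(f.savedWorktreePath);

  // Each saved reference must still exist on disk / in the repository.
  const savedRefsExist =
    (!f.savedBranch || f.branchExists) &&
    (!f.savedWorktreePath || f.worktreeExists);

  // Clean worktree, or the user confirmed how to handle modifications.
  const worktreeOk = f.worktreeClean || f.dirtyHandlingConfirmed;

  return (
    hasSavedRefs &&
    savedRefsExist &&
    worktreeOk &&
    !f.branchMerged &&
    f.blockerKind === "external"
  );
}
```

A predicate like this could sit next to the existing resumable-status check, so that blocked runs failing any condition still fall through to the current safety guard.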

Option B: add patchmill recover --issue <n>

Add a command that inspects the saved run state and prints a recovery report:

Issue #<n> is blocked because required verification could not run.

Saved branch: agent/issue-<n>-<slug>
Saved worktree: .worktrees/patchmill-issue-<n>-<slug>
Worktree status: clean
Branch status: unmerged
Commits: <commit-1> <commit-2> <commit-3> <commit-4>

Recovery options:
1. retry verification and continue the existing run
2. rebase/cherry-pick the existing work onto current main, then continue
3. archive the branch/worktree and start fresh
4. abandon recovery and leave everything unchanged

Potential flags:

patchmill recover --issue <n> --retry
patchmill recover --issue <n> --retry-verification
patchmill recover --issue <n> --rebase-current-main
patchmill recover --issue <n> --archive-and-rerun
patchmill recover --issue <n> --abandon
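To illustrate the report format, here is a rough sketch of a renderer. RecoveryReport and renderRecoveryReport are hypothetical names, and the headline uses the saved blocker reason verbatim ("is blocked: <reason>") rather than the paraphrased sentence in the example above, since the run state only stores the reason string.

```typescript
// Hypothetical report shape assembled from the saved run state plus git checks.
interface RecoveryReport {
  issueNumber: number;
  reason: string;
  branch: string;
  worktreePath: string;
  worktreeStatus: "clean" | "dirty";
  branchStatus: "unmerged" | "merged";
  commits: string[];
}

// Option texts taken from the proposed report in this issue.
const RECOVERY_OPTIONS: string[] = [
  "retry verification and continue the existing run",
  "rebase/cherry-pick the existing work onto current main, then continue",
  "archive the branch/worktree and start fresh",
  "abandon recovery and leave everything unchanged",
];

function renderRecoveryReport(r: RecoveryReport): string {
  const lines: string[] = [
    `Issue #${r.issueNumber} is blocked: ${r.reason}`,
    "",
    `Saved branch: ${r.branch}`,
    `Saved worktree: ${r.worktreePath}`,
    `Worktree status: ${r.worktreeStatus}`,
    `Branch status: ${r.branchStatus}`,
    `Commits: ${r.commits.join(" ")}`,
    "",
    "Recovery options:",
    // Number the options starting at 1.
    ...RECOVERY_OPTIONS.map((text, i) => `${i + 1}. ${text}`),
  ];
  return lines.join("\n");
}
```

Keeping the renderer separate from the code that gathers the facts would let the same report back both a recover command and an improved run-once error.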

Option C: improve the existing run-once error

If adding a new command is too much, run-once should at least replace the generic stale-state error with a recovery report and exact next commands. It should include:

  • run state path,
  • saved branch,
  • saved worktree,
  • worktree clean/dirty status,
  • whether the branch is merged,
  • blocker reason,
  • commits created by the blocked run, and
  • safe suggested actions.

Safety requirements

  • Never auto-delete a branch/worktree containing unmerged work.
  • Never overwrite dirty worktree changes without explicit confirmation.
  • Prefer archiving over deletion when the user asks to start fresh.
  • If the branch diverges from current main, warn before continuing and suggest rebase/cherry-pick recovery.
  • If the branch was already merged, offer state cleanup rather than retrying implementation.
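One way to make these requirements hard to violate is to compute the set of offered actions from the observed facts and never include a delete action at all. The sketch below is an assumption about how that could look; the action names, SafetyFacts, and decideSafeActions are all hypothetical, and checking for a dirty worktree before anything else is a design choice, not something this issue prescribes.

```typescript
// Hypothetical action names; note there is deliberately no "delete" action.
type RecoveryAction =
  | "retry"
  | "rebase"
  | "archive-and-rerun"
  | "cleanup-state"
  | "abandon";

interface SafetyFacts {
  branchMerged: boolean;
  divergedFromMain: boolean;
  worktreeDirty: boolean;
  dirtyHandlingConfirmed: boolean;
}

interface SafetyDecision {
  actions: RecoveryAction[];
  warnings: string[];
}

function decideSafeActions(f: SafetyFacts): SafetyDecision {
  const warnings: string[] = [];

  // Dirty changes are never touched without explicit confirmation.
  if (f.worktreeDirty && !f.dirtyHandlingConfirmed) {
    warnings.push(
      "Worktree has local modifications; confirm how to handle them before continuing."
    );
    return { actions: ["abandon"], warnings };
  }

  // Already merged: offer state cleanup rather than retrying implementation.
  if (f.branchMerged) {
    return { actions: ["cleanup-state", "abandon"], warnings };
  }

  // Diverged: warn and suggest rebase/cherry-pick recovery first.
  if (f.divergedFromMain) {
    warnings.push(
      "Branch has diverged from current main; consider rebase/cherry-pick recovery."
    );
    return { actions: ["rebase", "retry", "archive-and-rerun", "abandon"], warnings };
  }

  // Starting fresh is only offered as archive-and-rerun, never as deletion.
  return { actions: ["retry", "archive-and-rerun", "abandon"], warnings };
}
```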

Suggested acceptance criteria

  • A blocked run with saved clean branch/worktree can be recovered without manual deletion.
  • run-once no longer leaves the user with only a generic stale branch/worktree error.
  • Recovery output includes the blocker reason and the saved branch/worktree paths.
  • Recovery distinguishes at least these cases:
    • clean unmerged branch/worktree,
    • dirty worktree,
    • branch already merged,
    • branch diverged from current main,
    • missing worktree but existing branch,
    • missing branch/worktree but stale run state.
  • Tests cover a blocked run caused by an external verification prerequisite and verify that Patchmill suggests retry/recover rather than destructive cleanup.
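The six cases in the criteria above could be distinguished by a small classifier like the following sketch. The names RecoveryCase, RecoveryProbe, and classifyRecoveryCase are hypothetical, the precedence between overlapping cases (for example, dirty and diverged at once) is an assumption, and a worktree without its branch is not one of the listed cases, so it is reported as unclassified.

```typescript
// Hypothetical labels for the six cases listed in the acceptance criteria,
// plus a fallback for combinations the list does not cover.
type RecoveryCase =
  | "clean-unmerged"
  | "dirty-worktree"
  | "branch-merged"
  | "diverged-from-main"
  | "branch-without-worktree"
  | "stale-state-only"
  | "unclassified";

interface RecoveryProbe {
  branchExists: boolean;
  worktreeExists: boolean;
  worktreeDirty: boolean;
  branchMerged: boolean;
  divergedFromMain: boolean;
}

function classifyRecoveryCase(p: RecoveryProbe): RecoveryCase {
  // Run state points at things that no longer exist.
  if (!p.branchExists && !p.worktreeExists) return "stale-state-only";
  // Worktree present but branch gone: not covered by the listed cases.
  if (!p.branchExists) return "unclassified";
  // Local modifications take precedence so they are never overlooked.
  if (p.worktreeExists && p.worktreeDirty) return "dirty-worktree";
  if (p.branchMerged) return "branch-merged";
  if (!p.worktreeExists) return "branch-without-worktree";
  if (p.divergedFromMain) return "diverged-from-main";
  return "clean-unmerged";
}
```

A table-driven test over these cases would be one way to cover the last criterion, asserting that the externally blocked, clean, unmerged case maps to a retry/recover suggestion rather than cleanup.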

Why this matters

A "blocked" status does not necessarily mean the run is terminal or disposable. In many cases it means Patchmill successfully produced partial work but could not complete validation because a human or external system needs to fix a prerequisite. Patchmill has enough state to guide safe recovery, preserve useful commits, and resume once the blocker is resolved.
