feat(engine): Retry affordance for a stuck autopilot goal #1148
Open
psdjungpulzze wants to merge 2 commits into
When an autopilot goal's run fails, the failed task is parked on an open `run_failure` pause and the goal stalls. Previously the only recovery was drilling into each task's /attention card to resume its pause. This adds a goal-level Retry affordance, the counterpart to the existing "Resume goal" (backlog → queued).

- `retryGoalFailures(goalId, projectId, resolvedBy?)` resumes every open `run_failure` pause under the goal via the existing `resolvePause` (per-kind transition + audit + realtime + transient-budget reset), re-queuing each failed task. Scoped to `run_failure` only: clarification/risk/standoff pauses are deliberate human gates, not failures. Idempotent + best-effort; fires one coarse engine.changed.
- POST /api/v2/projects/:projectId/goals/:goalId/retry mirrors the resume route (auth, UUID-guard, no-leak 404), passing the actor as resolvedBy.
- GoalDetailData gains `stuck.failedTasks` (count of open run_failure pauses under the goal), computed in the one shared serializer so page + GET route agree. The detail surface shows an "Autopilot: stuck" badge + a Retry button (only when stuck), and suppresses the "scoping" spinner while stuck.

No agent-config update: goal resume/retry are not agent-exposed tools.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
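The resume loop this commit describes can be sketched roughly as below. This is an in-memory illustration under stated assumptions, not the PR's code: the `Pause` shape, the `RetryDeps` interface, the boolean return of the `resolvePause` callback (taken here to be `false` when another actor already resolved the pause), and the `publish` hook are all hypothetical, and the real function presumably runs asynchronously against the database.

```typescript
// Hypothetical shapes for illustration; the real schema is not shown in this PR text.
type PauseKind = "run_failure" | "clarification" | "risk" | "standoff";

interface Pause {
  id: string;
  goalId: string;
  taskId: string;
  kind: PauseKind;
  open: boolean;
}

interface RetryDeps {
  // Assumed to return false when the pause was already resolved by another actor.
  resolvePause: (pause: Pause, resolvedBy?: string) => boolean;
  // Assumed event hook; called at most once per retry.
  publish: (event: "engine.changed") => void;
}

function retryGoalFailures(
  pauses: Pause[],
  goalId: string,
  deps: RetryDeps,
  resolvedBy?: string,
): { resumed: number } {
  // Only failures are retried; human-gate pauses are left untouched.
  const candidates = pauses.filter(
    (p) => p.goalId === goalId && p.open && p.kind === "run_failure",
  );

  let resumed = 0;
  for (const pause of candidates) {
    try {
      if (deps.resolvePause(pause, resolvedBy)) resumed += 1;
    } catch {
      // Best-effort: one failing pause does not abort the rest.
    }
  }

  // One coarse notification, and none at all when nothing was resumed.
  if (resumed > 0) deps.publish("engine.changed");
  return { resumed };
}
```

Calling the sketch a second time for the same goal finds no open `run_failure` pauses that still resolve, so it resumes nothing and publishes nothing, which is the idempotent behaviour the commit message claims.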
…just run_failure
Review rework: the goal-level Retry resumed ONLY open `run_failure` pauses, so a
goal stuck on a `deliverable_incomplete` pause stayed stuck and showed no Retry
button. The acceptance criterion requires resuming every open pause of kind
run_failure OR deliverable_incomplete.
- retryGoalFailures (pause.ts): filter `kind: { in: [run_failure,
deliverable_incomplete] }`. resolveTaskTransition already re-queues
deliverable_incomplete identically, so the resume path Just Works once selected.
- buildGoalDetailData (goal-detail.ts): count both kinds for the `stuck.failedTasks`
signal so the Retry button shows/enables when the goal's only open pauses are
deliverable_incomplete.
- Sync doc comments + STUCK_BADGE/route/client copy to both failure kinds.
- Tests: assert the two-kind filter; add deliverable_incomplete resume coverage.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
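The two-kind selection from this rework can be illustrated with a small sketch. It assumes a hypothetical `PauseRow` shape and helper names (`RETRYABLE_KINDS`, `isRetryable`, `countFailedTasks`) that do not appear in the PR; only the two kind strings and the `stuck.failedTasks` field come from the commit message.

```typescript
// The two failure kinds named in this commit; every other kind remains a human gate.
const RETRYABLE_KINDS = ["run_failure", "deliverable_incomplete"] as const;
type RetryableKind = (typeof RETRYABLE_KINDS)[number];

// Hypothetical row shape for illustration only.
interface PauseRow {
  goalId: string;
  kind: string;
  open: boolean;
}

function isRetryable(kind: string): kind is RetryableKind {
  return RETRYABLE_KINDS.some((k) => k === kind);
}

// Counts open pauses of either failure kind under one goal.
function countFailedTasks(rows: PauseRow[], goalId: string): number {
  return rows.filter(
    (r) => r.goalId === goalId && r.open && isRetryable(r.kind),
  ).length;
}

// Mirrors the `stuck.failedTasks` field described for GoalDetailData.
function buildStuck(rows: PauseRow[], goalId: string): { failedTasks: number } {
  return { failedTasks: countFailedTasks(rows, goalId) };
}
```

Sharing one predicate between the retry filter and the count keeps the Retry button's visibility in step with what the retry would actually resume.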
Summary
Adds a goal-level Retry affordance to recover a stuck autopilot goal: one whose run failed and stalled on an open `run_failure` pause. Previously the only recovery was drilling into each failed task's `/attention` card and resuming its pause individually. This is the counterpart to the existing Resume goal (backlog → queued).

What "stuck" means
An autopilot goal is stuck when ≥1 of its tasks has an open `run_failure` pause (a run failed or was abandoned and is waiting on a human). The planning task that scopes the goal, or any child task, can land here, leaving the goal stalled (and previously still showing the "scoping" spinner indefinitely).

Changes
- `retryGoalFailures(goalId, projectId, resolvedBy?)` (`src/lib/engine/pause.ts`): resumes every open `run_failure` pause under the goal via the existing `resolvePause` (so it inherits per-kind transition, audit, `engine:pause-resolved` publish, and transient-retry-budget reset), re-queuing each failed task. Scoped to `run_failure` only: clarification/risk/standoff pauses are deliberate human gates, not failures, so they stay on `/attention` for an explicit decision. Idempotent + best-effort (a pause another actor resolved in between is skipped); fires one coarse `engine.changed`.
- `POST /api/v2/projects/:projectId/goals/:goalId/retry`: mirrors the resume route (auth, UUID-guard, no-leak 404), passing the resolving user as `resolvedBy`.
- `GoalDetailData.stuck.failedTasks`: count of open `run_failure` pauses under the goal, computed in the one shared serializer (`buildGoalDetailData`) so the server-rendered page and the GET goal-read route agree byte-for-byte.
- Detail surface (`engine-goal-detail-client.tsx`): an `Autopilot: stuck` badge + a `Retry N failed task(s)` button, shown only when stuck; the "scoping" spinner is suppressed while stuck (a failed run isn't scoping). New `STUCK_BADGE` in the shared `goal-badges` vocabulary.

AI agent maintenance
No agent-config update needed: goal `resume`/`retry` are operator UI actions, not agent-exposed tools (no `TOOL_NAMES` entry references them).

Tests
- `pause.test.ts` (`retryGoalFailures`): resumes all open run_failure pauses + re-queues each task + fires one engine.changed; no-op (no publish) when nothing is stuck; skips an already-resolved pause (idempotent, not counted).
- `goal-detail.test.ts`: `stuck.failedTasks` counts open run_failure pauses scoped to the goal; zero when none.
- `engine-goal-detail-client.test.tsx`: badge + button visibility/pluralization, scoping suppression while stuck, POST-to-retry-route + refetch, error path.
- `retry/route.test.ts`: actor pass-through, no-leak 404s, auth pass-through, scoped lookup.

Full `src/lib/engine` suite (685 tests), the four touched suites, `tsc --noEmit`, and eslint on changed files all pass.

🤖 Generated with Claude Code
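For reviewers, the detail-surface rules described under Changes (badge and button only when stuck, spinner suppressed while stuck, pluralized button label) can be summarized in a small pure function. This is a sketch only: `StuckInfo`, `DetailView`, `deriveDetailView`, and the exact label wording are assumptions for illustration, not the component's actual props or copy.

```typescript
// Hypothetical view-model for the goal detail surface; not the component's real props.
interface StuckInfo {
  failedTasks: number;
}

interface DetailView {
  showStuckBadge: boolean;
  showScopingSpinner: boolean;
  retryLabel: string | null; // null means the Retry button is not rendered
}

function deriveDetailView(stuck: StuckInfo, isScoping: boolean): DetailView {
  const isStuck = stuck.failedTasks > 0;
  const noun = stuck.failedTasks === 1 ? "task" : "tasks";
  return {
    showStuckBadge: isStuck,
    // A failed run is not scoping, so the spinner yields to the stuck state.
    showScopingSpinner: isScoping && !isStuck,
    retryLabel: isStuck ? `Retry ${stuck.failedTasks} failed ${noun}` : null,
  };
}
```

Deriving all three flags from the single `failedTasks` count means the badge, the button, and the spinner cannot disagree with one another.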