From ca67caf08bf71066369bccbc2eb6c6fba88ce762 Mon Sep 17 00:00:00 2001
From: brettheap
Date: Sun, 24 May 2026 18:13:56 +0000
Subject: [PATCH 01/51] FEAT-013: managed session creation and lifecycle (spec + plan)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the FEAT-013 spec, implementation plan, research, data model, contracts, and quickstart for operator-driven creation of standard multi-agent tmux layouts inside bench containers. Managed panes have a five-state lifecycle (creating / ready / degraded / failed / removed) with predecessor_id linkage for recreates, a pending-managed tmux pane-title marker that keeps the FEAT-004 scan out of in-flight panes, per-container serialization, durable daemon-restart recovery, and indefinite audit retention. Recreate chains are bounded at depth 16.

Surface: app.managed_* methods under FEAT-011's host-only contract plus a legacy managed.* CLI namespace; 8 methods + 9 new closed-set error codes.

Storage: additive SQLite migration (managed_layout, managed_pane); no existing table altered.

Process artifacts: three clarification sessions (initial 15-Q, post-plan review 6-Q, alignment cleanup 5-Q) recorded under spec.md §Clarifications; a deep-and-wide release-gate audit produced 15 per-domain checklists plus plan-review.md, alignment-check.md, and alignment-recheck.md tracking post-plan follow-ups. Two analyze passes closed 18 findings total (15 from the first, 3 minor from the second).

CLAUDE.md SPECKIT block updated to point at this feature's plan.
Co-Authored-By: Claude Opus 4.7
---
 .specify/feature.json | 2 +-
 CLAUDE.md | 2 +-
 .../checklists/accessibility.md | 27 ++
 .../checklists/alignment-check.md | 71 +++++
 .../checklists/alignment-recheck.md | 47 +++
 .../checklists/api.md | 52 ++++
 .../checklists/concurrency.md | 42 +++
 .../checklists/configuration.md | 37 +++
 .../checklists/data-model.md | 62 ++++
 .../checklists/deployment.md | 33 ++
 .../checklists/error-handling.md | 47 +++
 .../checklists/idempotency.md | 37 +++
 .../checklists/integration.md | 44 +++
 .../checklists/observability.md | 47 +++
 .../checklists/performance.md | 34 ++
 .../checklists/plan-review.md | 93 ++++++
 .../checklists/requirements.md | 95 ++++++
 .../checklists/security.md | 46 +++
 .../checklists/testing-strategy.md | 42 +++
 .../checklists/ux.md | 48 +++
 .../clarify-questions.md | 110 +++++++
 .../contracts/error-codes.md | 115 +++++++
 .../contracts/managed-methods.md | 294 ++++++++++++++++++
 .../contracts/state-machine.md | 107 +++++++
 .../data-model.md | 255 +++++++++++++++
 specs/013-managed-session-lifecycle/plan.md | 142 +++++++++
 .../quickstart.md | 266 ++++++++++++++++
 .../013-managed-session-lifecycle/research.md | 265 ++++++++++++++++
 specs/013-managed-session-lifecycle/spec.md | 166 ++++++++++
 29 files changed, 2626 insertions(+), 2 deletions(-)
 create mode 100644 specs/013-managed-session-lifecycle/checklists/accessibility.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/alignment-check.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/alignment-recheck.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/api.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/concurrency.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/configuration.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/data-model.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/deployment.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/error-handling.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/idempotency.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/integration.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/observability.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/performance.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/plan-review.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/requirements.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/security.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/testing-strategy.md
 create mode 100644 specs/013-managed-session-lifecycle/checklists/ux.md
 create mode 100644 specs/013-managed-session-lifecycle/clarify-questions.md
 create mode 100644 specs/013-managed-session-lifecycle/contracts/error-codes.md
 create mode 100644 specs/013-managed-session-lifecycle/contracts/managed-methods.md
 create mode 100644 specs/013-managed-session-lifecycle/contracts/state-machine.md
 create mode 100644 specs/013-managed-session-lifecycle/data-model.md
 create mode 100644 specs/013-managed-session-lifecycle/plan.md
 create mode 100644 specs/013-managed-session-lifecycle/quickstart.md
 create mode 100644 specs/013-managed-session-lifecycle/research.md
 create mode 100644 specs/013-managed-session-lifecycle/spec.md

diff --git a/.specify/feature.json b/.specify/feature.json
index 6861a34..2bc0fd1 100644
--- a/.specify/feature.json
+++ b/.specify/feature.json
@@ -1,3 +1,3 @@
 {
- "feature_directory": "specs/011-app-backend-contract"
+ "feature_directory": "specs/013-managed-session-lifecycle"
 }
diff --git a/CLAUDE.md b/CLAUDE.md
index f46085c..d5878ae 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,7 +1,7 @@
 For additional context about technologies to be used, project structure, shell commands, and other important information, read the current plan:
-`specs/011-app-backend-contract/plan.md`.
+`specs/013-managed-session-lifecycle/plan.md`.
 # AgentTower Agent Context
diff --git a/specs/013-managed-session-lifecycle/checklists/accessibility.md b/specs/013-managed-session-lifecycle/checklists/accessibility.md
new file mode 100644
index 0000000..37255d1
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/accessibility.md
@@ -0,0 +1,27 @@
+# Accessibility Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that accessibility requirements for the operator-facing surfaces touched by this feature are present, complete, and measurable — or explicitly scoped to a sibling feature.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Coverage
+
+- [ ] CHK001 Are accessibility requirements explicitly excluded or deferred to FEAT-012 in this spec? [Clarity, Gap]
+- [ ] CHK002 Are keyboard-navigation requirements specified for the layout-creation flow? [Gap]
+- [ ] CHK003 Are screen-reader requirements specified for the managed/adopted distinction (FR-005)? [Gap, Spec §FR-005]
+- [ ] CHK004 Are accessibility requirements specified for the lifecycle-state indicators (`creating`, `ready`, `degraded`, `failed`, `removed`) such that they are perceivable without color alone? [Gap, Spec §FR-007]
+- [ ] CHK005 Are accessibility requirements specified for the diagnostic surface (FR-013) such that "failed stage" is announced clearly to assistive tech? [Gap, Spec §FR-013]
+- [ ] CHK006 Are focus-management requirements specified for the confirmation dialogs of remove/recreate (FR-010/FR-011)? [Gap]
+- [ ] CHK007 Are accessibility requirements specified for the live progress feedback during the up-to-2-min layout creation (live region, polite vs assertive)? [Gap, Spec §SC-001]
+- [ ] CHK008 Are accessibility requirements specified for surfacing the `predecessor_id` chain or the recreate history? [Gap, Spec §FR-011]
+- [ ] CHK009 Are accessibility requirements specified for error messages (SESSION_NAME_CONFLICT, daemon unhealthy)? [Gap, Spec §FR-016]
+- [ ] CHK010 Are accessibility requirements specified for any audit/history view (FR-021 indefinite retention)? [Gap]
+
+## Clarity / Consistency
+
+- [ ] CHK011 Are color-contrast requirements specified for `degraded` vs `failed` state indicators so they are distinguishable to users with color-vision deficiency? [Gap, Spec §FR-007]
+- [ ] CHK012 Are accessibility requirements consistent across managed-pane surfaces and existing adopted-pane surfaces (FR-008)? [Consistency, Spec §FR-008]
+
+## Measurability
+
+- [ ] CHK013 Are accessibility requirements stated in objectively-testable form (specific WCAG criteria, role/name/value expectations)? [Measurability]
diff --git a/specs/013-managed-session-lifecycle/checklists/alignment-check.md b/specs/013-managed-session-lifecycle/checklists/alignment-check.md
new file mode 100644
index 0000000..afc9587
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/alignment-check.md
@@ -0,0 +1,71 @@
+# Alignment Check: Post-Clarify-2 Spec Elements vs Downstream Artifacts
+
+**Purpose**: After the post-plan-review clarification session (Spec §Clarifications "Session 2026-05-24 (post-plan review)") added **FR-022, FR-023, FR-024, SC-009** and extended **FR-013, FR-018, FR-020, §Assumptions**, verify that every downstream artifact (plan.md, research.md, data-model.md, contracts/*, quickstart.md, plan-review.md) is still aligned. Each item tests *requirements-document alignment*, not implementation.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md) — Session 2026-05-24 (post-plan review)
+**Depth**: release gate. **Audience**: feature author before `/speckit.tasks`.
+ +## FR-013 alignment (`failed_stage` closed enum promoted into FR) + +- [ ] CHK001 Does plan.md reference the closed `failed_stage` enum (or FR-013 by ID) somewhere in Technical Context or Constitution Check evidence? [Consistency, Spec §FR-013 vs Plan] +- [ ] CHK002 Do research §R7's enum values match FR-013's inline closed set verbatim (no spelling drift, no extras)? [Consistency, Spec §FR-013 vs Research §R7] +- [ ] CHK003 Does data-model.md's `failed_stage` CHECK constraint enumerate the same six values as FR-013 (in both `managed_layout` and `managed_pane`)? [Consistency, Spec §FR-013 vs Data-Model §DDL] +- [ ] CHK004 Do contracts/managed-methods.md M3 / M5 detail-response shapes include `failed_stage` with canonical values from FR-013? [Consistency, Spec §FR-013 vs Contracts §M3/M5] +- [ ] CHK005 Do contracts/state-machine.md transition triggers reference each FR-013 enum value at least once across the trigger column? [Consistency, Spec §FR-013 vs Contracts §state-machine] + +## FR-018 alignment (cancel-in-flight create explicitly out-of-scope) + +- [ ] CHK006 Is "cancellation of in-flight layout creation" called out as out-of-scope in plan.md (Summary, Technical Context, or Constitution Check)? [Coverage, Spec §FR-018 vs Plan] +- [ ] CHK007 Does contracts/managed-methods.md §M6 (or a sibling note) acknowledge cancel-in-flight is unsupported and reference FR-018? [Consistency, Spec §FR-018 vs Contracts §M6] +- [ ] CHK008 Does research §R2 align with FR-018's explicit out-of-scope (not only "reserved for a later feature")? [Consistency, Spec §FR-018 vs Research §R2] + +## FR-020 alignment (recovery outcomes readable from list/detail surface) + +- [ ] CHK009 Do contracts/managed-methods.md M3 (or M5) response shapes demonstrate how a recovery outcome surfaces (e.g., `failed_stage = "recovery_reattach"` in a sample)? 
[Consistency, Spec §FR-020 vs Contracts §M3/M5] +- [ ] CHK010 Does data-model.md describe that recovery outcome is visible via the same detail surface used for normal operation (not only via events)? [Coverage, Spec §FR-020 vs Data-Model] +- [ ] CHK011 Does quickstart.md's daemon-restart section show the operator reading recovery outcomes from list/detail (not only via the audit log)? [Coverage, Spec §FR-020 vs Quickstart] +- [ ] CHK012 Does contracts/state-machine.md's Recovery section reference the visibility of recovery outcomes from a read surface? [Coverage, Spec §FR-020 vs Contracts §state-machine] + +## FR-022 alignment (5-minute pending-marker TTL sweep) + +- [ ] CHK013 Does plan.md Technical Context describe the 5-minute sweep as a measurable system property and tie it to FR-022 (by ID or by behavior)? [Consistency, Spec §FR-022 vs Plan] +- [ ] CHK014 Does research §R5 produce the same TTL value (5 min) and sweep cadence (boot + 60 s) as FR-022 mandates? [Consistency, Spec §FR-022 vs Research §R5] +- [ ] CHK015 Does data-model.md show that a swept pending-managed pane transitions to `failed` with `failed_stage = pane_create` (no tmux pane) or `failed_stage = registration` (pane exists but never registered)? [Consistency, Spec §FR-022 vs Data-Model + §FR-013] +- [ ] CHK016 Does contracts/state-machine.md's `creating → failed` transition row name the FR-022 TTL sweep as a trigger, distinct from registration failure? [Consistency, Spec §FR-022 vs Contracts §state-machine] + +## FR-023 alignment (recreate-chain depth bound 16) + +- [ ] CHK017 Does plan.md Constraints / Scale section reference FR-023 or the depth-16 bound? [Consistency, Spec §FR-023 vs Plan] +- [ ] CHK018 Does data-model.md's `chain_depth` CHECK constraint match FR-023's "maximum depth of 16" wording exactly (off-by-one consistent with R4's `>= 15` rejection rule)? 
[Consistency, Spec §FR-023 vs Data-Model §DDL vs Research §R4] +- [ ] CHK019 Does contracts/error-codes.md `managed_pane_recreate_chain_too_deep` reference FR-023 and include the bound (16) in its details schema? [Consistency, Spec §FR-023 vs Contracts §error-codes] +- [ ] CHK020 Does contracts/state-machine.md's Recreate Semantics section reference FR-023's bound? [Consistency, Spec §FR-023 vs Contracts §state-machine] +- [ ] CHK021 Does quickstart.md's edge-cases table list the recreate-chain-too-deep scenario with FR-023 reference? [Coverage, Spec §FR-023 vs Quickstart] + +## FR-024 alignment (operator YAML override capability) + +- [ ] CHK022 Does plan.md (Summary, Technical Context, or Constitution Check evidence) reference FR-024 and the canonical YAML paths? [Consistency, Spec §FR-024 vs Plan] +- [ ] CHK023 Do research §R8/R9 enumerate the same canonical paths as spec §Assumptions (no path drift)? [Consistency, Spec §Assumptions vs Research §R8/R9] +- [ ] CHK024 Does quickstart.md's Preconditions section reference the operator-overridable YAML paths per FR-024 (not just example file contents)? [Consistency, Spec §FR-024 vs Quickstart] +- [ ] CHK025 Do contracts/error-codes.md `managed_template_not_found` / `managed_launch_command_not_found` descriptions reference FR-024's override-resolution rule (operator file with same name wins)? [Consistency, Spec §FR-024 vs Contracts §error-codes] + +## SC-009 alignment (recovery visible within 5s of socket-ready) + +- [ ] CHK026 Does plan.md Performance Goals list SC-009 alongside SC-001 / SC-003 / SC-008? [Completeness, Spec §SC-009 vs Plan] +- [ ] CHK027 Does quickstart.md's daemon-restart section state SC-009's 5-second visibility window explicitly (not just SC-008's reattach window)? [Coverage, Spec §SC-009 vs Quickstart] +- [ ] CHK028 Do contracts/managed-methods.md M3 (or §Events) describe the readability path within the SC-009 time bound? 
[Consistency, Spec §SC-009 vs Contracts §M3] +- [ ] CHK029 Does the test plan in plan.md (`tests/contract/` or `tests/integration/`) include coverage for SC-009 readability post-restart? [Coverage, Spec §SC-009 vs Plan §Project Structure] + +## §Assumptions alignment (new YAML-paths bullet) + +- [ ] CHK030 Does plan.md (Technical Context or Constitution Check) reference the new §Assumptions bullet naming the two YAML paths? [Consistency, Spec §Assumptions vs Plan] +- [ ] CHK031 Are the canonical paths in §Assumptions identical (character-for-character) to those in research §R8/R9 and quickstart preconditions? [Consistency, Spec §Assumptions vs Research §R8/R9 vs Quickstart] + +## Cross-cutting traceability + +- [ ] CHK032 Is the "Session 2026-05-24 (post-plan review)" Clarifications block cross-referenced from plan.md (e.g., "see §Clarifications post-plan review for FR-022/023/024 origin")? [Traceability, Spec §Clarifications vs Plan] +- [ ] CHK033 Are FR-022 / FR-023 / FR-024 / SC-009 each traceable to at least one user story or acceptance scenario, or are they explicitly system-level requirements only (with that rationale stated)? [Traceability, Spec §FR/SC vs §User Scenarios] +- [ ] CHK034 Are plan-review.md CHK036–CHK041 now markable as resolved by the post-clarify-2 spec amendments alone (no remaining code-level dependency)? [Coverage, Plan-Review vs Spec amendments] +- [ ] CHK035 Is the spec's FR numbering still contiguous (FR-001..FR-024 with no gaps) after the amendments? [Consistency, Spec §FR] +- [ ] CHK036 Is the spec's SC numbering still contiguous (SC-001..SC-009 with no gaps) after the amendments? [Consistency, Spec §SC] +- [ ] CHK037 Are the new closed-set error codes referenced in error-codes.md (`managed_pane_recreate_chain_too_deep`) **only** triggered by FR-023, or do their `details` schemas also need updating to reflect FR-022's TTL-driven failures? 
[Coverage, Gap, Spec §FR-022/023 vs Contracts §error-codes] +- [ ] CHK038 Is there any conflict between FR-013's inline `failed_stage` enum and the legacy text "specific failed stage" used elsewhere in spec.md (Edge Cases, SC-006)? [Conflict, Spec §FR-013 vs Spec §Edge Cases / §SC-006] diff --git a/specs/013-managed-session-lifecycle/checklists/alignment-recheck.md b/specs/013-managed-session-lifecycle/checklists/alignment-recheck.md new file mode 100644 index 0000000..26fec42 --- /dev/null +++ b/specs/013-managed-session-lifecycle/checklists/alignment-recheck.md @@ -0,0 +1,47 @@ +# Alignment Recheck: Post-Alignment-Cleanup Verification + +**Purpose**: After the alignment-cleanup clarification round (Spec §Clarifications "Session 2026-05-24 (alignment cleanup)"), verify the 5 edits landed correctly, flag any items still open from `alignment-check.md` round 1 that were NOT addressed, and surface any new gaps introduced by the cleanup edits themselves. +**Created**: 2026-05-24 +**Feature**: [spec.md](../spec.md) — Sessions "post-plan review" + "alignment cleanup" +**Depth**: release gate. **Audience**: feature author before `/speckit.tasks`. + +## Verify alignment-cleanup edits applied (sanity check) + +- [ ] CHK001 Does spec.md SC-006 reference "FR-013 closed set" rather than the abstract "specific failed stage" wording? [Consistency, Spec §SC-006 vs §FR-013] +- [ ] CHK002 Do FR-022, FR-023, FR-024, SC-009 each carry an inline `(traces to USx)` annotation matching the alignment-cleanup Q2 decision? [Traceability, Spec §FR-022/023/024 §SC-009] +- [ ] CHK003 Does spec.md contain a `### Session 2026-05-24 (alignment cleanup)` sub-session under `## Clarifications` with five Q/A bullets? [Completeness, Spec §Clarifications] +- [ ] CHK004 Does plan.md carry a Provenance blockquote citing BOTH `Session 2026-05-24 (post-plan review)` AND `Session 2026-05-24 (alignment cleanup)`? 
[Traceability, Plan §Summary] +- [ ] CHK005 Are plan-review.md CHK036–CHK041 marked `[x]` with per-item "Resolved 2026-05-24" annotations? [Completeness, Plan-Review §Newly Introduced Gaps] +- [ ] CHK006 Does plan-review.md include an amendment note flagging FR-022 / FR-020 / SC-009 implementation footprint for `/speckit.tasks`? [Completeness, Plan-Review] + +## New gaps introduced by the alignment-cleanup edits + +- [ ] CHK007 Are the new `(traces to USx)` annotations consistent with the rest of the FR/SC list — should ALL FRs and SCs carry similar annotations for parity, or were FR-022/023/024 and SC-009 explicitly the only system-level ones needing disambiguation? [Consistency, Gap, Spec §FR/SC] +- [ ] CHK008 If only the new system-level FRs/SCs carry the annotation, is the asymmetry documented (e.g., a note in §Clarifications "alignment cleanup" Q2 explaining why FR-001..FR-021 do NOT need it)? [Clarity, Gap, Spec §Clarifications] + +## Still-outstanding items from alignment-check.md round 1 + +These items were flagged "Likely failing" in alignment-check.md but were NOT in scope of the alignment-cleanup clarify round (which only handled the 5 "Worth investigating" judgment calls). They remain open as cross-doc wording edits. + +- [ ] CHK009 Does plan.md Summary explicitly name "cancel in-flight create" as out-of-scope, or rely only on the FR-018 reference? [Coverage, Spec §FR-018 vs Plan] (alignment-check.md CHK006 — still open) +- [ ] CHK010 Does research §R2 use "out of scope" wording aligned with FR-018, instead of "reserved for a later feature"? [Consistency, Spec §FR-018 vs Research §R2] (alignment-check.md CHK008 — still open) +- [ ] CHK011 Does contracts/managed-methods.md §M3 sample response include a `recovery_reattach` `failed_stage` example, or only the general `failed_stage` field? 
[Consistency, Spec §FR-020 vs Contracts §M3] (alignment-check.md CHK009 — still open) +- [ ] CHK012 Does quickstart.md US3 daemon-restart section show the recovery-failure read path (not only the all-ready outcome)? [Coverage, Spec §FR-020 vs Quickstart] (alignment-check.md CHK011 — still open) +- [ ] CHK013 Does contracts/state-machine.md Recovery section reference visibility from the M3 / M5 detail surface? [Coverage, Spec §FR-020 vs Contracts §state-machine] (alignment-check.md CHK012 — still open) +- [ ] CHK014 Does plan.md Technical Context cite FR-022 / FR-023 / FR-024 by ID anywhere (not only behaviorally)? [Consistency, Spec §FR-022/023/024 vs Plan] (alignment-check.md CHK013 / CHK017 / CHK022 — still open) +- [ ] CHK015 Does contracts/error-codes.md `managed_template_not_found` / `managed_launch_command_not_found` reference the FR-024 override-resolution rule (operator file with same `name` wins)? [Consistency, Spec §FR-024 vs Contracts §error-codes] (alignment-check.md CHK025 — still open) +- [ ] CHK016 Does plan.md Performance Goals list SC-009 ≤ 5s alongside SC-001 / SC-003 / SC-008? [Completeness, Spec §SC-009 vs Plan] (alignment-check.md CHK026 — still open) +- [ ] CHK017 Does quickstart.md restart section cite SC-009 by ID and name the 5-second visibility window? [Coverage, Spec §SC-009 vs Quickstart] (alignment-check.md CHK027 — still open) +- [ ] CHK018 Does plan.md `tests/contract/` or `tests/integration/` list include coverage for SC-009 readability post-restart? [Coverage, Spec §SC-009 vs Plan §Project Structure] (alignment-check.md CHK029 — still open) + +## Forward-pointing tasks queued for /speckit.tasks (from alignment-cleanup Q3) + +- [ ] CHK019 Will the FR-022 pending-marker sweep loop be captured as an implementation task by `/speckit.tasks` (per the plan-review.md amendment note)? 
[Coverage, Spec §FR-022] +- [ ] CHK020 Will the FR-020 detail-surface readability (recovery outcome fields in M3/M5 response shapes) be captured as an implementation task by `/speckit.tasks`? [Coverage, Spec §FR-020] +- [ ] CHK021 Will the SC-009 ≤ 5-second post-restart visibility test be captured for `/speckit.tasks`? [Coverage, Spec §SC-009] + +## Cross-doc traceability under both Clarifications sessions + +- [ ] CHK022 Does research.md cite the post-plan and alignment-cleanup Clarifications sessions as the documented origin of FR-022/023/024/SC-009 + the SC-006 rewording? [Traceability, Research vs Spec §Clarifications] +- [ ] CHK023 Does data-model.md acknowledge the FR-022 TTL behavior with a note in the recovery / pending-marker section? [Coverage, Spec §FR-022 vs Data-Model] +- [ ] CHK024 Are the SC-009 5-second budget and the FR-022 5-minute TTL consistent with each other — different time horizons, no overlap or conflict? [Consistency, Spec §FR-022 vs §SC-009] diff --git a/specs/013-managed-session-lifecycle/checklists/api.md b/specs/013-managed-session-lifecycle/checklists/api.md new file mode 100644 index 0000000..dc1e5d7 --- /dev/null +++ b/specs/013-managed-session-lifecycle/checklists/api.md @@ -0,0 +1,52 @@ +# API Requirements Quality Checklist: Managed Session Creation and Lifecycle + +**Purpose**: Validate that the daemon socket API contract requirements for managed-layout operations are complete, clear, consistent, and measurable. +**Created**: 2026-05-24 +**Feature**: [spec.md](../spec.md) + +## Requirement Completeness + +- [ ] CHK001 Are request/response schemas specified for the create-layout operation? [Gap, Spec §FR-001] +- [ ] CHK002 Are request/response schemas specified for the remove-managed-pane operation? [Gap, Spec §FR-010] +- [ ] CHK003 Are request/response schemas specified for the recreate-managed-pane operation? [Gap, Spec §FR-011] +- [ ] CHK004 Are request/response schemas specified for listing managed layouts and managed panes? 
[Gap, Spec §FR-005] +- [ ] CHK005 Is the structured error response specified for `SESSION_NAME_CONFLICT` (code, message, hint)? [Gap, Spec §FR-016] +- [ ] CHK006 Are error response codes/strings enumerated for every failure mode listed in FR-013 and FR-016? [Completeness] +- [ ] CHK007 Is the contract for the lifecycle event stream defined (event types, payload shape, ordering)? [Gap, Spec §FR-015] +- [ ] CHK008 Are API versioning requirements specified for the new managed-layout operations? [Gap] +- [ ] CHK009 Is the API contract for cancellation of an in-flight create-layout defined? [Gap, Scenario Coverage] +- [ ] CHK010 Is the contract for re-attaching to surviving panes after daemon restart specified (operator-driven, automatic, hybrid)? [Gap, Spec §FR-020] +- [ ] CHK011 Are pagination/filtering requirements specified for layout listing and event listing? [Gap] +- [ ] CHK012 Is the contract for the predecessor_id linkage queryable through the API (e.g., GET predecessor chain)? [Gap, Spec §FR-011] +- [ ] CHK013 Are the contract requirements specified for the `promoted_from_adopted` transition stub (e.g., not-implemented response in MVP)? [Gap, Spec §FR-007] + +## Requirement Clarity + +- [ ] CHK014 Is idempotency-key behavior defined for create-layout (header name, scope, lifetime)? [Clarity, Spec §FR-014] +- [ ] CHK015 Is the contract behavior under FR-019 serialization defined (block-and-wait, queue-and-poll, immediate-reject-with-retry-after)? [Clarity, Spec §FR-019] +- [ ] CHK016 Is the pending-managed-marker visibility specified for API consumers (part of the pane resource, separate field, hidden)? [Clarity, Gap, Spec §FR-014] +- [ ] CHK017 Are timing/SLA requirements specified for API responses (synchronous vs async create-layout)? [Clarity, Gap, Spec §SC-001] +- [ ] CHK018 Are the API authentication/identification requirements specified or explicitly absent for MVP? 
[Clarity, Spec §Assumptions] + +## Requirement Consistency + +- [ ] CHK019 Are the contracts consistent between thin client → daemon and app → daemon for the same operations? [Consistency, Spec §FR-017] +- [ ] CHK020 Are the contracts for distinguishing managed vs adopted agents specified consistently across endpoints (FR-005)? [Consistency] +- [ ] CHK021 Are deprecation/migration requirements specified should any FEAT-011 contract surface change? [Gap] + +## Scenario Coverage + +- [ ] CHK022 Is the contract behavior defined for the bench-container disappearance edge case (long-poll error, immediate failure, retry-after)? [Coverage, Gap, Spec §Edge Cases] +- [ ] CHK023 Are concurrent-request semantics specified for non-create operations (remove, recreate) in addition to create-layout? [Coverage, Spec §FR-019] +- [ ] CHK024 Is the contract for surfacing the `degraded` reason (which subsystem degraded: log, command, registration) specified? [Coverage, Gap, Spec §FR-013] + +## Edge Case Coverage + +- [ ] CHK025 Is the contract behavior specified when the operator retries with the same idempotency key but different inputs? [Gap, Spec §FR-014] +- [ ] CHK026 Is the contract behavior specified for remove of a pane that is currently in `creating` state? [Gap] +- [ ] CHK027 Is the contract behavior specified for recreate of a pane whose predecessor record is missing (e.g., pruned in a future version)? [Gap, Spec §FR-021] + +## Non-Functional API + +- [ ] CHK028 Are response-size or pagination requirements specified for high-volume audit/event queries (FR-021 indefinite retention)? [Gap] +- [ ] CHK029 Are observability requirements specified for the API contract (request-id propagation, log fields)? 
[Gap, Cross-ref: observability.md] diff --git a/specs/013-managed-session-lifecycle/checklists/concurrency.md b/specs/013-managed-session-lifecycle/checklists/concurrency.md new file mode 100644 index 0000000..20f009f --- /dev/null +++ b/specs/013-managed-session-lifecycle/checklists/concurrency.md @@ -0,0 +1,42 @@ +# Concurrency Requirements Quality Checklist: Managed Session Creation and Lifecycle + +**Purpose**: Validate that concurrency requirements (serialization, locking, races, ordering) are complete, clear, consistent, and measurable. +**Created**: 2026-05-24 +**Feature**: [spec.md](../spec.md) + +## Serialization Scope + +- [ ] CHK001 Are concurrency requirements specified for layout-creation against the same container (FR-019)? [Completeness, Spec §FR-019] +- [ ] CHK002 Are concurrency requirements specified for layout-creation across different containers (must they also serialize, or run in parallel)? [Gap, Spec §FR-019] +- [ ] CHK003 Are concurrency requirements specified for remove + recreate ordering on the same managed pane? [Gap] +- [ ] CHK004 Are concurrency requirements specified for two operators issuing the same operation at the same time on the same pane (e.g., two removes, two recreates)? [Gap] + +## Locking Model + +- [ ] CHK005 Is the locking model specified for the per-container serialization (mutex, semaphore, queue)? [Gap, Spec §FR-019] +- [ ] CHK006 Are deadlock-prevention requirements specified (per-container locks must release on operator disconnect / crash)? [Gap, Spec §FR-019] +- [ ] CHK007 Are starvation-prevention requirements specified for the FR-019 wait queue (FIFO ordering, max wait time, fairness)? [Gap] +- [ ] CHK008 Is lock granularity specified (per-container vs per-layout vs per-pane)? [Clarity, Spec §FR-019] + +## Race Conditions + +- [ ] CHK009 Are concurrency requirements specified for the scan + creation flow interaction (FR-014 marker is the mitigation — but what is the low-level race set)? 
[Coverage, Spec §FR-014] +- [ ] CHK010 Are concurrency requirements specified for the daemon's handling of overlapping retries on the same pending-managed layout? [Gap, Spec §FR-014] +- [ ] CHK011 Are concurrency requirements specified for the predecessor_id chain (two simultaneous recreations of the same predecessor)? [Gap, Spec §FR-011] +- [ ] CHK012 Are race conditions enumerated for the periodic scan vs creation completion (low-level race set)? [Coverage] +- [ ] CHK013 Are concurrency requirements specified for the case where tmux itself executes commands asynchronously vs the daemon's expected ordering? [Gap] + +## Recovery & Restart + +- [ ] CHK014 Are concurrency requirements specified for daemon-restart recovery vs an in-flight operator request at the moment of restart? [Gap, Spec §FR-020] +- [ ] CHK015 Are concurrency requirements specified for resumption of partially-serialized work after a daemon crash? [Gap, Spec §FR-019, FR-020] + +## Event Ordering + +- [ ] CHK016 Are concurrency requirements specified for the lifecycle event stream (consumer ordering guarantees per pane, per layout)? [Gap, Spec §FR-015] +- [ ] CHK017 Are concurrency requirements specified for the audit/history append-only semantics under concurrent writers? [Gap, Spec §FR-021] + +## Consistency + +- [ ] CHK018 Are concurrency requirements consistent with the assumption "MVP authorization is socket-access based" (single operator typical, but the requirements still cover concurrent calls)? [Consistency, Spec §Assumptions] +- [ ] CHK019 Are concurrency safety properties testable from the operator surface alone? 
[Measurability] diff --git a/specs/013-managed-session-lifecycle/checklists/configuration.md b/specs/013-managed-session-lifecycle/checklists/configuration.md new file mode 100644 index 0000000..cecaf77 --- /dev/null +++ b/specs/013-managed-session-lifecycle/checklists/configuration.md @@ -0,0 +1,37 @@ +# Configuration Requirements Quality Checklist: Managed Session Creation and Lifecycle + +**Purpose**: Validate that configuration requirements (templates, launch command profiles, paths, defaults, validation) are complete, clear, consistent, and measurable. +**Created**: 2026-05-24 +**Feature**: [spec.md](../spec.md) + +## Schema Definition + +- [ ] CHK001 Are the standard templates' configuration shapes specified (file format, location, schema)? [Gap, Spec §FR-001] +- [ ] CHK002 Are the standard templates' default contents (1 master + 2 slaves, 2 masters + 2 slaves) specified field-by-field? [Gap, Spec §FR-001] +- [ ] CHK003 Are the launch command profile configuration shapes specified (file format, location, fields)? [Gap, Spec §FR-002] +- [ ] CHK004 Are configuration requirements specified for label-pattern templates (FR-003) — is the pattern configurable per template? [Gap, Spec §FR-003] + +## Defaults & Overrides + +- [ ] CHK005 Are configuration overrides specified (per-container, per-layout-instance, per-pane)? [Gap] +- [ ] CHK006 Are defaults specified for omitted configuration fields (default capability, default label pattern, default working directory)? [Gap] +- [ ] CHK007 Are the precedence rules between operator-supplied launch commands and template-default commands specified? [Clarity, Spec §FR-002] + +## Validation + +- [ ] CHK008 Are validation requirements specified for configuration before layout creation (required fields, command syntax, label-pattern syntax)? [Gap] +- [ ] CHK009 Are validation requirements specified for the tmux session name input (length, character set)? 
[Gap, Spec §FR-016]
+
+## Lifecycle
+
+- [ ] CHK010 Are configuration reload requirements specified (does the daemon hot-reload, or restart-only)? [Gap]
+- [ ] CHK011 Are configuration migration requirements specified across versions of the template schema? [Gap, Cross-ref: deployment.md]
+- [ ] CHK012 Are configuration requirements specified for the durable storage path used by FR-020? [Gap, Spec §FR-020]
+- [ ] CHK013 Are configuration requirements specified for the canonical local-socket path (FR-017)? [Gap, Spec §FR-017]
+- [ ] CHK014 Are configuration requirements specified for the scan interval that interacts with the pending-managed marker (FR-014)? [Gap, Spec §FR-014]
+- [ ] CHK015 Are configuration requirements specified for the audit retention behavior in MVP (file location, format) even though retention is indefinite? [Gap, Spec §FR-021]
+
+## Tmux Adapter
+
+- [ ] CHK016 Are configuration requirements specified for which tmux pane-control flags AgentTower must support? [Gap]
+- [ ] CHK017 Are configuration requirements specified for tmux server selection (default socket vs custom)? [Gap]
diff --git a/specs/013-managed-session-lifecycle/checklists/data-model.md b/specs/013-managed-session-lifecycle/checklists/data-model.md
new file mode 100644
index 0000000..95adf0e
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/data-model.md
@@ -0,0 +1,62 @@
+# Data Model Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that data-model and lifecycle-state-machine requirements (entities, attributes, transitions, constraints, durability) are complete, clear, consistent, and measurable.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Entity Attribute Completeness
+
+- [ ] CHK001 Are all attributes of `Managed Layout` enumerated (id, template_id, container_id, state, created_at, updated_at, owner, …)?
[Completeness, Spec §Key Entities]
+- [ ] CHK002 Are all attributes of `Managed Pane` enumerated (id, layout_id, role, capability, label, launch_command_ref, state, predecessor_id, pending_marker, tmux_pane_ref, created_at, …)? [Completeness, Spec §Key Entities]
+- [ ] CHK003 Are all attributes of `Launch Command Profile` enumerated (id, name, command, env, working_dir, …)? [Completeness, Spec §Key Entities]
+- [ ] CHK004 Are all attributes of `Lifecycle Event` enumerated (id, layout_id, pane_id, event_type, timestamp, payload, actor)? [Completeness, Spec §Key Entities]
+- [ ] CHK005 Are required-vs-optional field markers specified for every entity attribute? [Completeness]
+- [ ] CHK006 Are `Adopted Agent` attributes within FEAT-013's scope clarified (delegated to FEAT-006, partially overridden, fully owned here)? [Clarity, Dependency, Spec §Key Entities]
+
+## State Machine Coverage
+
+- [ ] CHK007 Is the lifecycle state transition graph fully enumerated (every valid transition from every state)? [Coverage, Gap, Spec §FR-007]
+- [ ] CHK008 Are illegal lifecycle state transitions enumerated (e.g., `removed → ready` without a recreate; `failed → ready` without a recreate)? [Coverage, Gap]
+- [ ] CHK009 Is the state of the predecessor record at the moment of recreation defined (must be `removed` or `failed`; not `ready` or `creating`)? [Clarity, Spec §FR-011]
+- [ ] CHK010 Are the relationships between layout-level state and pane-level state defined (e.g., a layout is `ready` iff all panes are `ready` or `degraded`)? [Gap]
+- [ ] CHK011 Is the boundary between `creating` and `ready` defined precisely (at pane spawn, at first prompt, at registration)? [Clarity, Spec §FR-007]
+- [ ] CHK012 Is the data-model representation of the `promoted_from_adopted` reserved transition specified (extra optional field, sentinel value, separate table)?
[Gap, Spec §FR-007]
+
+## Constraints & Identity
+
+- [ ] CHK013 Is the field type for `predecessor_id` defined (UUID, opaque string, integer)? [Gap]
+- [ ] CHK014 Is the enforcement layer for the label-uniqueness constraint specified (database constraint, application-level check, both)? [Clarity, Spec §FR-003]
+- [ ] CHK015 Are unique constraints enumerated (layout_id PK, pane_id PK, label uniqueness per container, tmux session-name uniqueness)? [Completeness]
+- [ ] CHK016 Is the cardinality between Managed Layout and Managed Pane specified (1:N enforced)? [Completeness]
+- [ ] CHK017 Is the cardinality between Managed Pane and Lifecycle Event specified (1:N append-only)? [Completeness]
+- [ ] CHK018 Is the relationship between Managed Pane and the underlying tmux pane identifier specified (tmux pane_id stored, recomputed, both)? [Clarity, Spec §FR-007]
+
+## Durability & Persistence
+
+- [ ] CHK019 Is the data-at-rest requirement specified (sqlite, json file, in-memory only)? [Gap, Spec §FR-020]
+- [ ] CHK020 Is the durability boundary specified for FR-020 (which records must be durable, which may be in-memory)? [Clarity, Spec §FR-020]
+- [ ] CHK021 Is the retention model for `Lifecycle Event` storage specified (indefinite per FR-021, but is the storage shape and growth profile specified)? [Clarity, Spec §FR-021]
+- [ ] CHK022 Are timestamp requirements specified (UTC, monotonic, system-clock-only, RFC3339)? [Gap]
+- [ ] CHK023 Is the data model robust against partial writes during the failure of a layout-creation transaction (write-ahead, idempotent commit)? [Gap, Spec §FR-014]
+
+## Schema Evolution
+
+- [ ] CHK024 Are schema migration requirements specified for adding `predecessor_id`, pending-marker, etc.? [Gap]
+- [ ] CHK025 Are forward/backward compatibility requirements specified for the durable store across daemon upgrades?
[Gap, Cross-ref: deployment.md]
+
+## Consistency
+
+- [ ] CHK026 Is the data model consistent with the FEAT-011 agent registry (same id space, FK constraints)? [Consistency, Dependency]
+- [ ] CHK027 Are there any data-model conflicts with the `Adopted Agent` storage owned by FEAT-006? [Conflict, Dependency]
+- [ ] CHK028 Does the data model align with FR-008's "same registry/queue/route/event/health/direct-send surfaces" claim (no parallel managed-only tables)? [Consistency, Spec §FR-008]
+
+## Edge Cases
+
+- [ ] CHK029 Is the recreate-chain depth (predecessor → predecessor → …) bounded or explicitly unbounded? [Gap, Spec §FR-011]
+- [ ] CHK030 Is the data shape for "failed stage" (FR-013) defined as an enum or free-text? [Clarity, Spec §FR-013]
+- [ ] CHK031 Is the pending-managed marker's representation specified (field on Managed Pane, separate record, tmux pane title prefix)? [Gap, Spec §FR-014]
+
+## Non-Functional
+
+- [ ] CHK032 Are concurrency-safety requirements specified at the data model level (row-level locks, optimistic concurrency, transaction isolation)? [Gap, Spec §FR-019]
+- [ ] CHK033 Are integrity-check / fsck-style requirements specified for the durable store on daemon boot (FR-020)? [Gap, Spec §FR-020]
diff --git a/specs/013-managed-session-lifecycle/checklists/deployment.md b/specs/013-managed-session-lifecycle/checklists/deployment.md
new file mode 100644
index 0000000..b83a403
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/deployment.md
@@ -0,0 +1,33 @@
+# Deployment & Rollback Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that deployment, upgrade, rollback, and first-run requirements are complete, clear, consistent, and measurable for this feature.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Migration & Schema
+
+- [ ] CHK001 Are deployment requirements specified for the schema migration that adds `predecessor_id`, pending-marker, and any new tables/fields? [Gap, Cross-ref: data-model.md]
+- [ ] CHK002 Are rollback requirements specified for the schema migration (down-migration safety)? [Gap]
+- [ ] CHK003 Are backwards-compatibility requirements specified with existing FEAT-011 contracts during a phased rollout? [Gap]
+
+## First-Run & Install
+
+- [ ] CHK004 Are deployment requirements specified for the durable storage initialization (empty state, first-run behavior, schema seeding)? [Gap, Spec §FR-020]
+- [ ] CHK005 Are deployment requirements specified for the local-socket path / permissions during install? [Gap, Spec §FR-017]
+- [ ] CHK006 Are deployment requirements specified for configuration file installation (templates, launch profiles, defaults)? [Gap, Cross-ref: configuration.md]
+
+## Daemon Upgrade / Restart
+
+- [ ] CHK007 Are deployment requirements specified for the daemon restart sequence (graceful shutdown, in-flight create-layout handling)? [Gap, Spec §FR-020]
+- [ ] CHK008 Are deployment requirements specified for surviving daemon upgrades while in-flight layouts exist? [Gap, Recovery Flow]
+- [ ] CHK009 Are rollback requirements specified if a daemon upgrade introduces breaking changes to the managed-layout contract? [Gap]
+- [ ] CHK010 Are post-deployment audit requirements specified to verify reattach completeness (FR-020)? [Gap]
+
+## Validation
+
+- [ ] CHK011 Are deployment-time validation requirements specified (smoke test, configuration sanity check, durable-store integrity check)? [Gap]
+- [ ] CHK012 Are requirements specified for cleaning up stale tmux panes / pending-managed markers left over from a prior failed deployment?
[Gap]
+
+## Observability of Deploys
+
+- [ ] CHK013 Are observability requirements specified for the deploy/restart path itself (events emitted on reattach, FR-020)? [Gap, Cross-ref: observability.md]
diff --git a/specs/013-managed-session-lifecycle/checklists/error-handling.md b/specs/013-managed-session-lifecycle/checklists/error-handling.md
new file mode 100644
index 0000000..0463403
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/error-handling.md
@@ -0,0 +1,47 @@
+# Error Handling & Resilience Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that error-handling and resilience requirements (failure categorization, recovery, rollback) are complete, clear, consistent, and measurable across the layout-creation, registration, log-attach, remove, and recreate pipelines.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Failure Categorization
+
+- [ ] CHK001 Are error categories enumerated (transient/recoverable vs permanent/non-recoverable)? [Completeness, Spec §FR-013]
+- [ ] CHK002 Is the mapping from each error category to the resulting lifecycle state (`degraded` vs `failed`) specified for every error type? [Coverage, Spec §FR-013]
+- [ ] CHK003 Are error requirements specified for surfacing the failed stage to the operator with enough granularity for action (FR-013)? [Clarity, Spec §FR-013]
+- [ ] CHK004 Are requirements specified for distinguishing `degraded` from `failed` to the operator via a single observable signal? [Clarity, Spec §FR-007]
+
+## Pipeline Coverage
+
+- [ ] CHK005 Are error handling requirements specified for every step of the layout creation pipeline (pane create, command launch, registration, log attach)? [Completeness, Spec §FR-013]
+- [ ] CHK006 Are timeout requirements specified for each launch-command, log-attach, and registration step?
[Gap]
+- [ ] CHK007 Are retry requirements specified for transient failures (network blip during scan, tmux command failure)? [Gap]
+- [ ] CHK008 Are error requirements specified for the case where `tmux kill-pane` fails during remove (FR-010)? [Gap, Spec §FR-010]
+- [ ] CHK009 Are error requirements specified for the case where the daemon detects state divergence after restart (FR-020 recovery)? [Gap, Recovery Flow]
+
+## Edge Case Coverage
+
+- [ ] CHK010 Are error requirements specified for the "bench container disappears mid-creation" edge case? [Coverage, Exception Flow, Spec §Edge Cases]
+- [ ] CHK011 Are error requirements specified for "agent command prompts before registration completes"? [Coverage, Exception Flow, Spec §Edge Cases]
+- [ ] CHK012 Are error requirements specified for "log path is not host-readable" mapped to the `degraded` outcome (FR-006)? [Coverage, Spec §FR-006]
+- [ ] CHK013 Are error requirements specified for the case where a recreate attempt itself fails (recursive failure)? [Gap, Coverage, Spec §FR-011]
+- [ ] CHK014 Are error requirements specified for the case where the periodic scan races with creation in a way the pending-managed marker cannot resolve (e.g., marker missing or corrupted)? [Gap, Spec §FR-014]
+- [ ] CHK015 Are error requirements specified for the case where a recovered managed layout (FR-020) has lost panes (tmux pane killed externally during restart window)? [Gap, Recovery Flow]
+
+## Recovery & Rollback
+
+- [ ] CHK016 Are partial-failure rollback requirements specified (when one pane fails, do other panes in the layout remain or get cleaned up)? [Gap, Recovery Flow]
+- [ ] CHK017 Is the operator's recovery path explicit for every Edge Case bullet? [Coverage, Spec §Edge Cases]
+- [ ] CHK018 Are recovery sequences specified for cascading failures (one degraded pane causes a route to break, which causes another pane to fail)?
[Gap, Recovery Flow]
+
+## Error Format & Diagnostics
+
+- [ ] CHK019 Are error message format requirements specified (machine-readable code + human-readable message + recovery hint)? [Gap, Spec §FR-016]
+- [ ] CHK020 Is the SESSION_NAME_CONFLICT error response shape specified beyond the diagnostic string (fields, suggestion)? [Gap, Spec §FR-016]
+- [ ] CHK021 Is the audit/event content for failure events specified to be sufficient for post-mortem (which pane, which stage, which command output excerpt)? [Gap, Spec §FR-015]
+
+## Non-Functional Resilience
+
+- [ ] CHK022 Are non-functional resilience requirements specified (max time spent in `creating` before automatic transition to `failed`)? [Gap]
+- [ ] CHK023 Are requirements specified for surfacing the rejection when the daemon/container is unhealthy (FR-016) with the same diagnostic format as other failures? [Consistency, Spec §FR-016]
+- [ ] CHK024 Are circuit-breaker / back-off requirements specified for repeated immediate-exit failures of the same launch command? [Gap]
diff --git a/specs/013-managed-session-lifecycle/checklists/idempotency.md b/specs/013-managed-session-lifecycle/checklists/idempotency.md
new file mode 100644
index 0000000..8b59d9a
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/idempotency.md
@@ -0,0 +1,37 @@
+# Idempotency Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that idempotency requirements (retry safety, dedup keys, pending markers, replay semantics) are complete, clear, consistent, and measurable.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Idempotency Boundary
+
+- [ ] CHK001 Is the idempotency boundary specified for create-layout (request idempotency-key, layout pending-state, both)? [Clarity, Spec §FR-014]
+- [ ] CHK002 Are deduplication semantics specified for "the same pending layout" — what determines sameness (idempotency key, layout id, hash of inputs)?
[Clarity, Spec §FR-014]
+- [ ] CHK003 Are idempotency semantics specified for remove-managed-pane (multiple removes of the same pane)? [Gap, Spec §FR-010]
+- [ ] CHK004 Are idempotency semantics specified for recreate-managed-pane (multiple recreates from the same predecessor)? [Gap, Spec §FR-011]
+- [ ] CHK005 Are idempotency semantics specified for layout removal (cascade of pane removals)? [Gap]
+
+## Pending Marker Lifecycle
+
+- [ ] CHK006 Is the pending-managed marker's lifetime / TTL specified (how long does it remain active before considered stale)? [Gap, Spec §FR-014]
+- [ ] CHK007 Are the conditions specified under which a partial layout is "resumed" vs "restarted"? [Clarity, Spec §FR-014]
+- [ ] CHK008 Are requirements specified for cleanup of stale pending-managed markers across daemon restart (FR-020)? [Gap]
+- [ ] CHK009 Is the pending-managed-marker representation specified to be observable by the periodic scan without scan changes (or with explicit scan changes)? [Coverage, Cross-ref: integration.md]
+
+## Replay & Retry
+
+- [ ] CHK010 Are requirements specified for what happens if the operator retries with different inputs (same idempotency key, different launch command)? [Gap]
+- [ ] CHK011 Are concurrent-retry semantics specified (two retries of the same idempotency key in flight at once)? [Gap, Spec §FR-019]
+- [ ] CHK012 Is the maximum number of retries before a layout is considered permanently failed specified? [Gap]
+- [ ] CHK013 Are idempotency semantics specified for the lifecycle event stream (FR-015) — can duplicate events occur on retry, or are events themselves idempotent? [Gap]
+
+## Response Semantics
+
+- [ ] CHK014 Are requirements specified for distinguishing "no-op because already done" from "operation succeeded" responses? [Clarity]
+- [ ] CHK015 Is the response shape specified for a retry that finds a previously-failed layout (does it return the prior failure, or attempt resumption)?
[Gap, Spec §FR-013]
+
+## Crash Recovery
+
+- [ ] CHK016 Are the requirements specified for the case where the daemon crashes after creating panes but before registering them — does the next retry deduplicate via the pending-managed marker? [Coverage, Spec §FR-020]
+- [ ] CHK017 Are requirements specified for crash recovery during recreate (predecessor archived, new record half-created)? [Gap, Spec §FR-011]
diff --git a/specs/013-managed-session-lifecycle/checklists/integration.md b/specs/013-managed-session-lifecycle/checklists/integration.md
new file mode 100644
index 0000000..1b1bb04
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/integration.md
@@ -0,0 +1,44 @@
+# Integration Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that integration and external-dependency requirements (FEAT-011/012, sibling features, tmux, thin client) are complete, clear, consistent, and measurable.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Dependency Enumeration
+
+- [ ] CHK001 Are the specific FEAT-011 surfaces this feature depends on enumerated (panes, agents, events, routes, queues, health, mutations)? [Completeness, Spec §Assumptions]
+- [ ] CHK002 Are the specific FEAT-012 surfaces this feature depends on enumerated (which control-panel views, which mutations)? [Completeness, Spec §Assumptions]
+- [ ] CHK003 Are the dependencies on FEAT-003 (bench-container discovery) and FEAT-004 (tmux pane discovery) enumerated? [Gap]
+- [ ] CHK004 Are the dependencies on FEAT-006 (agent registration) enumerated (managed-created agents go through the same registration path)? [Gap, Spec §FR-004]
+- [ ] CHK005 Are the dependencies on FEAT-007 (log attachment) enumerated (FR-006 reuses this path)? [Gap, Spec §FR-006]
+- [ ] CHK006 Are the dependencies on FEAT-009 (safe-prompt-queue) and FEAT-010 (event routes / arbitration) enumerated (FR-008 reuses these)?
[Gap, Spec §FR-008]
+- [ ] CHK007 Are the tmux contract surfaces specified (which tmux commands are required: new-window, split-window, kill-pane, send-keys, list-panes)? [Gap]
+
+## Contract & Versioning
+
+- [ ] CHK008 Are version compatibility requirements specified for FEAT-011 contracts (semver, schema version)? [Gap]
+- [ ] CHK009 Are deprecation/migration requirements specified for any FEAT-011 contract surface that this feature extends? [Gap]
+- [ ] CHK010 Are integration requirements specified for the durable storage location (file path, format, owner) used by FR-020? [Gap, Spec §FR-020]
+- [ ] CHK011 Are integration boundary requirements specified for the "no remote network listener" constraint (FR-017) — what is the canonical local socket path? [Clarity, Spec §FR-017]
+
+## Failure Surfaces
+
+- [ ] CHK012 Are the failure modes of each dependency's surface enumerated (what does this spec assume the upstream feature handles)? [Coverage, Gap]
+- [ ] CHK013 Are integration requirements specified for handling tmux server crashes during layout creation? [Gap, Edge Case]
+- [ ] CHK014 Are integration requirements specified for the case where FEAT-006 registration returns success but FEAT-007 log attachment fails (cross-feature partial failure)? [Gap, Coverage]
+
+## Coexistence
+
+- [ ] CHK015 Are integration requirements specified for the "managed and adopted coexist" assertion (FR-009) — what guarantees does FEAT-013 require from FEAT-006 to keep adopted-pane identity stable? [Coverage, Spec §FR-009]
+- [ ] CHK016 Are integration requirements specified for the pending-managed marker interaction with FEAT-004 scan? [Coverage, Spec §FR-014]
+- [ ] CHK017 Are the integration boundaries with the thin client specified (which managed-layout operations are exposed to in-container clients)? [Gap, Spec §FR-017]
+
+## Consistency
+
+- [ ] CHK018 Are integration requirements consistent across the host daemon and thin client paths (FR-017)?
[Consistency]
+- [ ] CHK019 Are integration requirements specified for the audit/event store and any external sink (none in MVP, but is this stated explicitly)? [Gap, Spec §FR-017]
+
+## Testability
+
+- [ ] CHK020 Are integration test requirements specified for the FEAT-011/012/006/007 interactions in this feature's scope? [Gap, Cross-ref: testing-strategy.md]
+- [ ] CHK021 Are integration test fixtures specified for the bench-container dependency (real container, mock, hybrid)? [Gap]
diff --git a/specs/013-managed-session-lifecycle/checklists/observability.md b/specs/013-managed-session-lifecycle/checklists/observability.md
new file mode 100644
index 0000000..5e15ddd
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/observability.md
@@ -0,0 +1,47 @@
+# Observability Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that observability requirements (events, metrics, logs, traces) are complete, clear, consistent, and measurable for this feature.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Event Catalog
+
+- [ ] CHK001 Are lifecycle event types fully enumerated (FR-015 lists 8 categories — is each a distinct event type or family of types)? [Completeness, Spec §FR-015]
+- [ ] CHK002 Are event payload schemas specified for each event type? [Gap]
+- [ ] CHK003 Are required event fields enumerated (event_id, timestamp, layout_id, pane_id, type, payload, actor)? [Gap, Spec §FR-015]
+- [ ] CHK004 Are requirements specified for emitting an event on every state transition (versus only on entry to terminal states)? [Clarity, Spec §FR-015]
+- [ ] CHK005 Is the relationship between Lifecycle Event records and the FR-008 shared event surfaces specified (are these the same events or two channels)? [Clarity, Spec §FR-008]
+
+## Metrics & SLIs
+
+- [ ] CHK006 Are metrics requirements specified (gauges, counters, histograms) for layout-creation duration and pane-state transitions?
[Gap]
+- [ ] CHK007 Are SLIs specified that correspond to SC-001 (layout-create p95 under 2 minutes) and SC-003 (log-attach-failure surface latency)? [Gap, Measurability, Spec §SC-001, SC-003]
+- [ ] CHK008 Are observability requirements specified for the daemon-internal serialization queue (FR-019) so operators can see waits (queue depth, wait time)? [Gap, Spec §FR-019]
+- [ ] CHK009 Are observability requirements specified for the pending-managed marker (count of in-flight markers, age distribution)? [Gap, Spec §FR-014]
+
+## Tracing & Correlation
+
+- [ ] CHK010 Are trace/correlation-id requirements specified across the create-layout pipeline (operator request → layout → panes → events)? [Gap]
+- [ ] CHK011 Are requirements specified for the predecessor_id chain visibility in observability (query "show me the chain for pane X")? [Gap, Spec §FR-011]
+
+## Coverage
+
+- [ ] CHK012 Are requirements specified for the operator's ability to filter events by managed/adopted origin? [Gap, Spec §FR-005]
+- [ ] CHK013 Are requirements specified for distinguishing events from automated transitions vs operator-initiated transitions? [Gap]
+- [ ] CHK014 Are observability requirements specified for daemon-restart recovery (which events are emitted on reattach, FR-020)? [Gap, Spec §FR-020]
+- [ ] CHK015 Are observability requirements specified for the failed-stage diagnostic (FR-013) so log queries can find it? [Coverage, Spec §FR-013]
+- [ ] CHK016 Are observability requirements specified for the layout-level aggregate state (vs only pane-level events)? [Gap]
+
+## Volume & Cost
+
+- [ ] CHK017 Are requirements specified for the volume of events emitted per layout creation (does it scale O(panes), O(stages × panes))? [Gap]
+- [ ] CHK018 Are retention/sizing requirements specified for the durable event store given indefinite retention (FR-021)?
[Gap, Cross-ref: data-model.md, performance.md]
+
+## Confidentiality
+
+- [ ] CHK019 Are requirements specified for redacting any sensitive fields in events (launch command env vars, secrets)? [Gap, Cross-ref: security.md]
+
+## Consistency
+
+- [ ] CHK020 Are observability requirements consistent between this feature and FEAT-008 (event ingestion)? [Consistency, Dependency]
+- [ ] CHK021 Are observability requirements aligned with the existing operator surfaces used for adopted panes (FR-008)? [Consistency, Spec §FR-008]
diff --git a/specs/013-managed-session-lifecycle/checklists/performance.md b/specs/013-managed-session-lifecycle/checklists/performance.md
new file mode 100644
index 0000000..e06908a
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/performance.md
@@ -0,0 +1,34 @@
+# Performance Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that performance, scalability, and timing requirements are complete, clear, consistent, and measurable.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Latency & Timing
+
+- [ ] CHK001 Is SC-001's "under 2 minutes" decomposed by stage (pane create, command launch, registration, log attach)? [Completeness, Spec §SC-001]
+- [ ] CHK002 Is SC-003's "within 10 seconds of layout creation completion" defined precisely (10s wall-clock from completion event, or 10s from log-attach attempt)? [Clarity, Spec §SC-003]
+- [ ] CHK003 Are performance requirements specified for the FR-019 serialization wait time upper bound (max time a second request may wait)? [Gap, Spec §FR-019]
+- [ ] CHK004 Are performance requirements specified for daemon-restart recovery time (FR-020/SC-008)? [Gap, Spec §FR-020, SC-008]
+- [ ] CHK005 Are timing requirements specified for the pending-managed marker lifetime (max in-flight duration before it is considered stale)?
[Gap, Spec §FR-014]
+- [ ] CHK006 Are performance requirements specified for the operator-facing diagnostic surface latency (FR-013)? [Gap]
+- [ ] CHK007 Are first-feedback-time requirements specified inside the SC-001 budget (operator sees something within X seconds)? [Gap, Spec §SC-001]
+
+## Throughput & Scalability
+
+- [ ] CHK008 Are scalability requirements specified for max concurrent managed layouts per daemon? [Gap]
+- [ ] CHK009 Are scalability requirements specified for max managed panes per host / per bench container? [Gap]
+- [ ] CHK010 Are throughput requirements specified for the lifecycle event stream (events/sec sustainable)? [Gap, Spec §FR-015]
+- [ ] CHK011 Is the impact of event-store growth under indefinite retention on query performance bounded by an SLA? [Gap, Spec §FR-021]
+- [ ] CHK012 Is the performance impact of repeated recreations on the predecessor chain quantified (chain length × query cost)? [Gap, Spec §FR-011]
+
+## Degradation & Load
+
+- [ ] CHK013 Are degradation requirements specified for high-load scenarios (operator creating many layouts back-to-back)? [Gap, Edge Case]
+- [ ] CHK014 Are performance requirements specified for the scan + creation flow interaction (does the scan polling interval impact create-layout p95)? [Gap, Spec §FR-014]
+- [ ] CHK015 Are performance requirements specified consistently between FR-008's shared surfaces and existing FEAT-011 contracts (no new SLAs that contradict prior contracts)? [Consistency]
+
+## Measurability
+
+- [ ] CHK016 Are performance requirements measurable in CI or local-dev without a multi-host setup? [Measurability]
+- [ ] CHK017 Are the metrics required to measure SC-001/SC-003/SC-008 enumerated (which timers, where they are emitted)?
[Measurability, Cross-ref: observability.md]
diff --git a/specs/013-managed-session-lifecycle/checklists/plan-review.md b/specs/013-managed-session-lifecycle/checklists/plan-review.md
new file mode 100644
index 0000000..67bfd1a
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/plan-review.md
@@ -0,0 +1,93 @@
+# Post-Plan Review Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Re-verify the spec + plan + research + data-model + contracts + quickstart **after** `/speckit.plan` has been run. Tests requirements-and-design-doc *quality*: did the plan close the gaps surfaced by the deep-and-wide round, are spec/plan/research/contracts mutually consistent, and did any new ambiguities slip in?
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md) + [plan.md](../plan.md)
+**Depth**: Release gate. **Audience**: feature author + PR reviewer before `/speckit.tasks`.
+
+This file is a single targeted audit, not another deep-and-wide refresh. It does not delete or restate the prior 15 checklists; it tests what the plan added on top of them.
+
+## Spec ↔ Plan Traceability
+
+- [ ] CHK001 Is every functional requirement FR-001..FR-021 referenced by at least one element of plan.md (Summary / Technical Context / Project Structure)? [Traceability, Spec §FR vs Plan §Summary]
+- [ ] CHK002 Is every success criterion SC-001..SC-008 paired with a Technical Context Performance Goal or a contract-level guarantee? [Traceability, Spec §SC vs Plan §Technical Context]
+- [ ] CHK003 Is every clarification (Session 2026-05-24 Q1–Q15) reflected in research.md, data-model.md, **or** contracts/? [Traceability, Spec §Clarifications]
+- [ ] CHK004 Is every Edge Case bullet in spec.md addressed by a contract method, a state-machine transition, or a research decision? [Coverage, Spec §Edge Cases]
+- [ ] CHK005 Does plan.md's Technical Context contain zero remaining `NEEDS CLARIFICATION` markers?
[Completeness, Plan §Technical Context]
+
+## Plan Internal Completeness
+
+- [ ] CHK006 Does the Constitution Check table provide concrete evidence (specific FRs / files / decisions) for each of the five principles — not just "PASS"? [Completeness, Plan §Constitution Check]
+- [ ] CHK007 Does the Project Structure section list every new module file with a one-line purpose AND identify each existing-module touch point? [Completeness, Plan §Project Structure]
+- [ ] CHK008 Is the Summary's "additive layer" enumeration mutually consistent with the Project Structure module list (no orphan layers, no orphan modules)? [Consistency, Plan §Summary vs §Project Structure]
+- [ ] CHK009 Is the Complexity Tracking section either fully justified or explicitly empty (not silently omitted)? [Completeness, Plan §Complexity Tracking]
+- [ ] CHK010 Are FEAT dependencies enumerated with the **exact** reused surfaces (FEAT-002 dispatcher, FEAT-004 docker-exec channel, FEAT-006 register-self path, FEAT-007 log attach, FEAT-008 audit JSONL, FEAT-009 peer detection, FEAT-010 routes, FEAT-011 envelope/error registry)? [Completeness, Plan §Technical Context]
+
+## Research Quality
+
+- [ ] CHK011 Does each research item R1–R13 follow Decision / Rationale / Alternatives with at least one *real* alternative considered (not a strawman)? [Completeness, Research §R*]
+- [ ] CHK012 Is the pending-marker representation (R1) safe against the in-pane process editing its own tmux pane title before registration completes? [Edge Case, Gap, Research §R1]
+- [ ] CHK013 Is the 5-minute pending-marker TTL (R5) surfaced as a *measurable* system property (not only an internal sweep cadence)? [Measurability, Research §R5]
+- [ ] CHK014 Is the recreate-chain depth bound of 16 (R4) justified relative to a realistic operator iteration workflow, not just a round number?
[Clarity, Research §R4] +- [ ] CHK015 Is the per-container `asyncio.Lock` (R2) sufficient for the "remove + recreate" sequence, or is an additional per-pane lock needed for the predecessor → successor transition? [Coverage, Gap, Research §R2] +- [ ] CHK016 Are the launch-command argv decisions (R6) compatible with operator-supplied `working_dir` and `env` without re-opening a shell-interpolation hazard? [Consistency, Research §R6 vs Constitution §III] +- [ ] CHK017 Does research §R12's bench-container thin-client constraint refine — not contradict — spec §Assumptions' "MVP authorization is socket-access based"? [Consistency, Research §R12 vs Spec §Assumptions] + +## Data-Model Fidelity + +- [ ] CHK018 Does the SQLite DDL include CHECK constraints matching the closed-set `state` and `failed_stage` enums in both `managed_layout` and `managed_pane`? [Completeness, Data-Model §DDL] +- [ ] CHK019 Does the partial unique index on `(container_id, label)` correctly allow a recreated pane to reuse its predecessor's label after the predecessor enters `removed` or `failed`? [Edge Case, Data-Model §DDL] +- [ ] CHK020 Are required-vs-optional field markers explicit (NOT NULL / nullable) for every attribute in both entities? [Completeness, Data-Model §Entity field reference] +- [ ] CHK021 Are the layout-state derivation rules unambiguous for the zero-non-terminal-pane boundary (every pane `removed`)? [Clarity, Data-Model §ManagedLayout lifecycle] +- [ ] CHK022 Is the `chain_depth <= 16` CHECK constraint reconcilable with the service-side `>= 15` rejection rule (off-by-one boundary)? [Consistency, Data-Model §DDL vs Research §R4] +- [ ] CHK023 Is the `agent_id` FK direction (`managed_pane → agent`) consistent with FEAT-006 owning the agent table (no reverse-FK from agent to managed_pane)? [Consistency, Data-Model §DDL vs Plan §Technical Context] +- [ ] CHK024 Are the indexes (`ix_managed_layout_container_state`, `ix_managed_pane_layout_state`, etc.) 
aligned with the read access patterns described in contracts/managed-methods.md? [Completeness, Data-Model §DDL vs Contracts §M2..M5] + +## Contract Fidelity + +- [ ] CHK025 Does every method in managed-methods.md declare an explicit error-code list referencing only codes defined in error-codes.md (no undeclared codes)? [Consistency, Contracts §managed-methods vs §error-codes] +- [ ] CHK026 Is the `managed.layout.create` semantics ("response returns after row insertion, before tmux spawn completes") clearly described, including how the operator subsequently observes `ready`? [Clarity, Contracts §M1] +- [ ] CHK027 Is the lifecycle event catalog in managed-methods.md §Events 1:1 with the events listed in research §R11 (same set, same payload shape)? [Consistency, Contracts §Events vs Research §R11] +- [ ] CHK028 Is the `managed_pane_illegal_transition` error's `requested_action` field's value set enumerated (closed set of operator actions)? [Completeness, Gap, Contracts §error-codes] +- [ ] CHK029 Does the state-machine document distinguish operator-initiated transitions from daemon-initiated transitions (sweep, recovery) in the trigger column? [Clarity, Contracts §state-machine] +- [ ] CHK030 Is the `not_implemented` stub for `promote_from_adopted` reachable via both legacy `managed.*` and `app.managed_*` namespaces with identical response shapes? [Consistency, Contracts §M8] +- [ ] CHK031 Are the `idempotency_key` semantics (in-flight match vs completed match vs absent) consistent between `managed.layout.create` and `managed.pane.recreate`? [Consistency, Contracts §M1 vs §M7] + +## Quickstart Adequacy + +- [ ] CHK032 Does the quickstart cover at least one acceptance scenario from each of US1, US2, US3? [Coverage, Quickstart §US1/US2/US3 vs Spec §User Scenarios] +- [ ] CHK033 Does the quickstart exercise the daemon-restart recovery path with explicit pre- and post-restart observable state? 
[Coverage, Quickstart §US3 daemon restart] +- [ ] CHK034 Does the quickstart include negative-path edge cases (`managed_session_name_conflict`, recreate-chain-too-deep, adopted-pane protection)? [Coverage, Quickstart §Edge cases] +- [ ] CHK035 Are the quickstart's preconditions (YAML files, socket path, container availability) consistent with the constitution's `~/.config/opensoft/agenttower/` path conventions? [Consistency, Quickstart §Preconditions vs Constitution §Technical Constraints] + +## Newly Introduced Gaps (from plan choices) + +- [x] CHK036 Is the 5-minute pending-marker TTL (R5) reflected as either an FR addition or a documented assumption in spec.md, not only in research? [Gap, Research §R5 vs Spec §Assumptions] — **Resolved 2026-05-24** by spec FR-022 (post-plan review). Implementation footprint (sweep loop) deferred to `/speckit.tasks`. +- [x] CHK037 Are the operator-facing implications of the depth-16 recreate-chain bound (R4) surfaced in spec.md (e.g., as an FR or success criterion), not only in contracts/error-codes? [Gap, Research §R4 vs Spec §FR] — **Resolved 2026-05-24** by spec FR-023. +- [x] CHK038 Are the YAML configuration paths (R8/R9) referenced from spec §Assumptions, not only in research/plan? [Completeness, Research §R8/R9 vs Spec §Assumptions] — **Resolved 2026-05-24** by spec §Assumptions YAML-paths bullet + FR-024. +- [x] CHK039 Is the absence of a "cancel in-flight create-layout" operation explicitly listed as out-of-scope in spec §FR-018, not only mentioned implicitly in M6/R2? [Completeness, Gap, Spec §FR-018] — **Resolved 2026-05-24** by spec FR-018 amendment. +- [x] CHK040 Is the `failed_stage` taxonomy (R7) reflected in spec.md as part of FR-013 ("identify the failed stage"), or does the spec stay at the abstract "failed stage" wording? [Consistency, Research §R7 vs Spec §FR-013] — **Resolved 2026-05-24** by spec FR-013 inline enum (also rippled into SC-006 in alignment-cleanup session). 
+- [x] CHK041 Is the daemon-restart `recovery_reattach` failed_stage outcome reachable from any operator surface (event, list, detail), or only as an internal log entry? [Completeness, Gap, Research §R13 §Recovery vs Contracts §Events] — **Resolved 2026-05-24** by spec FR-020 amendment + SC-009. Implementation footprint (detail-surface fields, post-restart visibility ≤ 5s) deferred to `/speckit.tasks`.
+
+> **Amendment note 2026-05-24 (alignment cleanup):** CHK036–CHK041 closed by post-plan spec edits. Per spec §Clarifications "Session 2026-05-24 (alignment cleanup)" Q3, the implementation work implied by FR-022 (sweep loop), FR-020 (recovery outcomes in detail surface), and SC-009 (5-second post-restart visibility) is to be captured as tasks by `/speckit.tasks`; these requirements are not blocked, but their CHK closure here is a requirements-quality close, not an implementation-complete close.
+
+## Cross-Document Terminology Consistency
+
+- [ ] CHK042 Is "operator" used canonically across plan.md, research.md, data-model.md, contracts/*.md, and quickstart.md (per Q15)? [Consistency, Spec §Clarifications Q15]
+- [ ] CHK043 Are the state enum spellings (`creating`, `ready`, `degraded`, `failed`, `removed`) identical across spec, plan, data-model, state-machine, and contracts (no `Creating` / `READY` drift)? [Consistency]
+- [ ] CHK044 Are the new closed-set error code spellings identical across data-model.md, contracts/managed-methods.md, and contracts/error-codes.md (e.g., `managed_session_name_conflict` not `session_name_conflict`)? [Consistency]
+- [ ] CHK045 Is the `failed_stage` enum spelled identically across data-model.md, state-machine.md, and research §R7 (e.g., `pane_create` vs `pane-create` vs `pane_create_failed`)? [Consistency]
+
+## Test-Plan Alignment
+
+- [ ] CHK046 Does the `tests/contract/` list in plan.md cover every method in managed-methods.md (M1–M8)? [Coverage, Plan §Project Structure vs Contracts §managed-methods]
+- [ ] CHK047 Does the `tests/integration/` list in plan.md cover every User Story (US1/US2/US3) and the Edge Cases section? [Coverage]
+- [ ] CHK048 Does the test plan include a failure-injection harness for partial-failure and restart-recovery flows (callable from the contract-test layer)? [Coverage, Plan §Testing]
+- [ ] CHK049 Are the test fixtures (`managed_template_fixtures`, `managed_clock`, `managed_tmux_recorder`) sufficient to exercise the FR-019 serializer FIFO without race conditions in CI? [Measurability, Plan §Project Structure]
+
+## Constitution Re-Check Coverage
+
+- [ ] CHK050 Does the Principle III evidence specifically reference the argv-first launch decision (R6) and the `shlex.quote` fallback path? [Completeness, Plan §Constitution Check]
+- [ ] CHK051 Does the Principle IV evidence list both CLI (`managed.*`) and app (`app.managed_*`) parity, plus SQLite + JSONL durability? [Completeness, Plan §Constitution Check]
+- [ ] CHK052 Does the Principle II evidence rule out host-only-tmux, Antigravity, mailbox adapters, and Python-thread backends? [Completeness, Plan §Constitution Check]
+- [ ] CHK053 Is the post-design Constitution re-check called out explicitly (not merely implied by "unchanged")? [Clarity, Plan §Constitution Check]
diff --git a/specs/013-managed-session-lifecycle/checklists/requirements.md b/specs/013-managed-session-lifecycle/checklists/requirements.md
new file mode 100644
index 0000000..6046c14
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/requirements.md
@@ -0,0 +1,95 @@
+# Specification Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate specification completeness and quality before proceeding to planning
+**Created**: 2026-05-23
+**Feature**: [spec.md](../spec.md)
+
+## Content Quality
+
+- [x] No implementation details (languages, frameworks, APIs)
+- [x] Focused on user value and business needs
+- [x] Written for non-technical stakeholders
+- [x] All mandatory sections completed
+
+## Requirement Completeness
+
+- [x] No [NEEDS CLARIFICATION] markers remain
+- [x] Requirements are testable and unambiguous
+- [x] Success criteria are measurable
+- [x] Success criteria are technology-agnostic (no implementation details)
+- [x] All acceptance scenarios are defined
+- [x] Edge cases are identified
+- [x] Scope is clearly bounded
+- [x] Dependencies and assumptions identified
+
+## Feature Readiness
+
+- [x] All functional requirements have clear acceptance criteria
+- [x] User scenarios cover primary flows
+- [x] Feature meets measurable outcomes defined in Success Criteria
+- [x] No implementation details leak into specification
+
+## Notes
+
+- Initial validation passed for `/speckit.clarify` and `/speckit.plan`.
+
+---
+
+## Cross-Cutting Requirements Quality (Session 2026-05-24, Deep & Wide)
+
+**Purpose**: Cross-cutting requirements-quality unit tests across completeness, clarity, consistency, acceptance criteria, dependencies/assumptions, and ambiguities/conflicts. Each item tests the spec's wording, not the implementation.
+
+### Completeness
+
+- [ ] CHK001 Are all functional requirements (FR-001 through FR-021) traceable to at least one user story or success criterion? [Completeness, Traceability]
+- [ ] CHK002 Are all success criteria (SC-001 through SC-008) traceable to at least one functional requirement? [Traceability]
+- [ ] CHK003 Are all Key Entities cross-referenced by at least one functional requirement? [Completeness]
+- [ ] CHK004 Are the "standard templates" (FR-001) defined with full template schema (pane count, role per pane, label pattern, expected commands)? [Completeness, Gap, Spec §FR-001]
+- [ ] CHK005 Are all attributes of each Key Entity enumerated, including required-vs-optional markers? [Completeness, Spec §Key Entities]
+- [ ] CHK006 Is the lifecycle state transition graph fully enumerated (every valid transition from every state, not only the states themselves)? [Completeness, Gap, Spec §FR-007]
+- [ ] CHK007 Are dependencies on FEAT-011 enumerated with specific contract surfaces (which endpoints, which event types)? [Completeness, Spec §Assumptions]
+- [ ] CHK008 Are dependencies on FEAT-012 enumerated with specific UI affordances required? [Completeness, Spec §Assumptions]
+- [ ] CHK009 Are dependencies on FEAT-003/004/006/007/008/009/010 enumerated where this feature reuses their surfaces (FR-004, FR-006, FR-008, FR-015)? [Completeness, Gap]
+- [ ] CHK010 Are out-of-scope items in FR-018 enumerated exhaustively for FEAT-013? [Completeness]
+
+### Clarity
+
+- [ ] CHK011 Is the term "managed-created" used consistently and not interchangeably with "managed" or "AgentTower-created"? [Clarity, Consistency]
+- [ ] CHK012 Is "pending-managed marker" defined with its lifecycle (when set, when cleared, where stored)? [Clarity, Gap, Spec §FR-014]
+- [ ] CHK013 Is "fresh identity" (US3 AS-2) quantified — does it mean a new UUID, a new label, or both? [Clarity, Spec §FR-011]
+- [ ] CHK014 Is "actionable diagnostic" (FR-016) quantified with required diagnostic fields? [Clarity, Ambiguity, Spec §FR-016]
+- [ ] CHK015 Is "host-readable pane logs" (FR-006) defined with explicit conditions for what counts as host-readable? [Clarity, Spec §FR-006]
+- [ ] CHK016 Is the boundary between "layout creation" and "pane creation" lifecycle states unambiguous (when does a layout transition from `creating` to `ready`)? [Clarity, Gap]
+- [ ] CHK017 Are layout-level lifecycle states distinct from pane-level lifecycle states, or are they intentionally the same set? [Clarity, Gap, Spec §FR-007]
+- [ ] CHK018 Is the term "operator" defined (e.g., who has socket access) or assumed to be self-evident? [Clarity, Gap]
+
+### Consistency
+
+- [ ] CHK019 Does FR-007's state list (`creating, ready, degraded, failed, removed`) match exactly the Key Entities Managed Pane state list? [Consistency]
+- [ ] CHK020 Is every clarification recorded under "Session 2026-05-24" reflected in at least one downstream FR, SC, or Edge Case? [Consistency]
+- [ ] CHK021 Are all edge cases listed in the Edge Cases section mapped to specific FRs that govern their resolution? [Consistency, Traceability]
+- [ ] CHK022 Are there any conflicts between Clarifications answers and pre-existing FRs that the spec hasn't reconciled? [Conflict]
+- [ ] CHK023 Is the spec's User Story numbering (US1/US2/US3) used consistently across Edge Cases and FRs? [Consistency]
+- [ ] CHK024 Is the spec free of [NEEDS CLARIFICATION] markers or unresolved decisions? [Completeness]
+
+### Acceptance Criteria Quality
+
+- [ ] CHK025 Are SC-001's "under 2 minutes" and SC-003's "10 seconds" thresholds justified (why those values)? [Acceptance Criteria]
+- [ ] CHK026 Is each SC objectively measurable without requiring implementation inspection? [Measurability]
+- [ ] CHK027 Are the acceptance scenarios in US1/US2/US3 testable without requiring multi-host setup? [Measurability]
+- [ ] CHK028 Are SC-006's "specific failed stage and recovery action visible to the operator" criteria measurable (which fields, which surface)? [Measurability, Spec §SC-006]
+
+### Dependencies & Assumptions
+
+- [ ] CHK029 Is the assumption "MVP authorization is socket-access based" testable as a negative requirement (no UID check, no per-container ACL)? [Measurability, Spec §Assumptions]
+- [ ] CHK030 Is the assumption "each template declares its own pane count" backed by a corresponding FR or referenced template schema? [Dependency, Gap, Spec §Assumptions]
+- [ ] CHK031 Is the dependency on durable storage (FR-020) listed in the Assumptions section as well as the FR? [Consistency, Dependency, Spec §FR-020]
+- [ ] CHK032 Are the failure modes for tmux operations (kill-pane, create-pane, send-keys) enumerated and matched to lifecycle state transitions? [Coverage, Gap]
+
+### Ambiguities & Conflicts
+
+- [ ] CHK033 Is the predecessor_id field's behavior under multiple successive recreations (predecessor of predecessor) specified? [Coverage, Gap, Spec §FR-011]
+- [ ] CHK034 Does the spec specify what happens if a recreated pane itself fails immediately — bounded recreate-chain depth, or unbounded? [Coverage, Gap]
+- [ ] CHK035 Is the `promoted_from_adopted` reserved transition's eligible source-state set defined (which adopted-pane states are eligible)? [Gap, Spec §FR-007]
+- [ ] CHK036 Are the relationships between layout-level state and pane-level state defined (e.g., a layout is `ready` iff all panes are `ready` or `degraded`)? [Gap]
+
diff --git a/specs/013-managed-session-lifecycle/checklists/security.md b/specs/013-managed-session-lifecycle/checklists/security.md
new file mode 100644
index 0000000..6551d9d
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/security.md
@@ -0,0 +1,46 @@
+# Security Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that security and protection requirements (auth, authz, injection, integrity, isolation) are complete, clear, consistent, and measurable for this feature.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Threat Model & Authorization
+
+- [ ] CHK001 Is the threat model documented or referenced for this feature? [Gap]
+- [ ] CHK002 Are the authentication requirements for the daemon socket specified, or explicitly absent for MVP per the Assumptions? [Clarity, Spec §Assumptions]
+- [ ] CHK003 Are the local-socket access controls specified (file permissions, group ownership, UID match policy)? [Gap, Spec §FR-017]
+- [ ] CHK004 Are authorization requirements specified for destructive lifecycle actions (remove, recreate) beyond "any socket caller"? [Gap, Spec §FR-010, FR-011]
+- [ ] CHK005 Is the protection mechanism specified that prevents an operator from removing adopted panes via managed-pane operations (FR-012)? [Completeness, Spec §FR-012]
+- [ ] CHK006 Are authentication/authorization requirements specified for the `promoted_from_adopted` transition stub (so it cannot be accidentally invoked in MVP)? [Gap, Spec §FR-018]
+- [ ] CHK007 Are deny-by-default requirements specified for any future per-user/per-container ACL extension? [Gap, Spec §Assumptions]
+
+## Input Validation & Injection
+
+- [ ] CHK008 Are command-injection protections specified for launch commands (FR-002)? [Gap, Spec §FR-002]
+- [ ] CHK009 Are constraints specified on what launch commands a profile may contain (whitelist, sandbox, no shell metachars)? [Gap, Spec §FR-002]
+- [ ] CHK010 Are requirements specified for sanitizing the human-readable label patterns to prevent injection into tmux pane titles or terminal output? [Gap, Spec §FR-003]
+- [ ] CHK011 Are validation requirements specified for the tmux session name to reject names that could confuse other surfaces (control characters, length limits)? [Gap, Spec §FR-016]
+
+## Confidentiality
+
+- [ ] CHK012 Are requirements specified for what data the lifecycle events contain (any sensitive material such as full command lines, environment variables, working directories)? [Gap, Spec §FR-015]
+- [ ] CHK013 Are SESSION_NAME_CONFLICT and other error responses specified to not leak sensitive information (other tmux sessions, paths)? [Gap, Spec §FR-016]
+- [ ] CHK014 Are requirements specified for redacting any sensitive fields in launch command profiles before persistence/observability? [Gap, Cross-ref: configuration.md, observability.md]
+
+## Integrity
+
+- [ ] CHK015 Are protections specified against TOCTOU between scan and creation flow (the pending-managed marker is the mitigation — is its integrity guaranteed)? [Gap, Spec §FR-014]
+- [ ] CHK016 Is there a requirement that managed-layout state survival across daemon restart (FR-020) preserves integrity (no tampering between restart cycles)? [Gap, Spec §FR-020]
+- [ ] CHK017 Are protections specified against an operator removing tmux sessions they did not create through the managed-pane path? [Completeness, Spec §FR-010]
+- [ ] CHK018 Are protections specified against forging the predecessor_id linkage (an operator cannot fabricate a chain to mask history)? [Gap, Spec §FR-011]
+- [ ] CHK019 Are audit-log integrity requirements specified for the indefinite event retention (FR-021)? [Gap, Spec §FR-021]
+
+## Containment / Isolation
+
+- [ ] CHK020 Are the security implications of the bench-container thin-client model specified (untrusted in-container code calling the daemon via the mounted socket)? [Gap, Spec §FR-017]
+- [ ] CHK021 Are isolation requirements specified between managed layouts in different bench containers (cross-container leakage protections)? [Gap, Spec §FR-009]
+
+## Exception / Recovery
+
+- [ ] CHK022 Are security requirements specified for the daemon-restart recovery path (verifying that recovered tmux panes really match the durable records)? [Gap, Spec §FR-020]
+- [ ] CHK023 Are security requirements specified for the case where two callers race for the same destructive action on the same pane (lock+permission order)? [Gap, Spec §FR-019]
diff --git a/specs/013-managed-session-lifecycle/checklists/testing-strategy.md b/specs/013-managed-session-lifecycle/checklists/testing-strategy.md
new file mode 100644
index 0000000..0135fa8
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/testing-strategy.md
@@ -0,0 +1,42 @@
+# Testing Strategy Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that the requirements themselves are testable — i.e., that every FR/SC/edge case can be exercised by a test without requiring implementation-level inspection.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Traceability
+
+- [ ] CHK001 Is every FR (FR-001..FR-021) testable by at least one acceptance scenario or success criterion? [Traceability]
+- [ ] CHK002 Is every clarification (Session 2026-05-24 Q1–Q15) covered by at least one acceptance scenario, FR, or SC such that a test can verify the chosen option was applied? [Traceability]
+
+## Observability for Tests
+
+- [ ] CHK003 Are the testability requirements specified for the FR-019 per-container serialization (how does a test observe that the second request waited)? [Measurability, Spec §FR-019]
+- [ ] CHK004 Are the testability requirements specified for the pending-managed marker (how does a test observe it being set and cleared)? [Measurability, Spec §FR-014]
+- [ ] CHK005 Are the testability requirements specified for the recreate predecessor_id linkage (how does a test verify the chain)? [Measurability, Spec §FR-011]
+- [ ] CHK006 Are the testability requirements specified for the daemon-restart recovery (FR-020/SC-008) without orchestrating a full process restart in every test? [Measurability, Spec §SC-008]
+
+## SC Measurability
+
+- [ ] CHK007 Are the testability requirements specified for SC-001's <2min target in CI (with mocks or real bench containers)? [Measurability, Spec §SC-001]
+- [ ] CHK008 Are the testability requirements specified for SC-003's 10s log-attach-failure visibility? [Measurability, Spec §SC-003]
+- [ ] CHK009 Are the testability requirements specified for SC-008's reattach-without-operator-intervention? [Measurability, Spec §SC-008]
+- [ ] CHK010 Are the testability requirements specified for the "label uniqueness within bench container" (FR-003)? [Measurability]
+
+## Negative & Concurrency Tests
+
+- [ ] CHK011 Are negative-test requirements specified (operator cannot remove adopted pane, FR-012)? [Coverage, Spec §FR-012]
+- [ ] CHK012 Are concurrency-test requirements specified (two simultaneous create-layout requests against the same container, FR-019)? [Coverage]
+- [ ] CHK013 Are race-condition test requirements specified for the scan/creation interaction (FR-014)? [Coverage]
+
+## Failure Injection
+
+- [ ] CHK014 Are failure-injection test requirements specified for each Edge Case bullet (tmux kill mid-create, log-path unreadable, daemon restart mid-create, container disappearance)? [Gap, Coverage]
+- [ ] CHK015 Are test fixtures specified for the bench-container dependency (real container, mock, hybrid)? [Gap]
+
+## Scope & Boundary
+
+- [ ] CHK016 Are integration-test requirements specified for the FEAT-011/012/006/007 interaction touch points? [Coverage, Cross-ref: integration.md]
+- [ ] CHK017 Are non-regression test requirements specified for the "managed and adopted coexist" guarantee (FR-009)? [Coverage, Spec §FR-009]
+- [ ] CHK018 Are the test ownership boundaries specified for what FEAT-013 owns vs what FEAT-011/012 own? [Clarity]
+- [ ] CHK019 Is indefinite audit retention (FR-021) testable without long-running tests (e.g., simulated time, or a test-only sub-policy)? [Measurability, Spec §FR-021]
diff --git a/specs/013-managed-session-lifecycle/checklists/ux.md b/specs/013-managed-session-lifecycle/checklists/ux.md
new file mode 100644
index 0000000..5ae8559
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/checklists/ux.md
@@ -0,0 +1,48 @@
+# UX Requirements Quality Checklist: Managed Session Creation and Lifecycle
+
+**Purpose**: Validate that operator-facing UX requirements are complete, clear, consistent, and measurable for the surfaces this feature touches in the control panel.
+**Created**: 2026-05-24
+**Feature**: [spec.md](../spec.md)
+
+## Requirement Completeness
+
+- [ ] CHK001 Are control-panel UI requirements specified for the layout-creation entry point (modal, wizard, inline action)? [Gap]
+- [ ] CHK002 Are visual requirements specified for distinguishing managed vs adopted agents in agent lists? [Completeness, Spec §FR-005]
+- [ ] CHK003 Are progress-feedback requirements specified for the up-to-2-minute layout creation duration? [Gap, Spec §SC-001]
+- [ ] CHK004 Are visual representations defined for each managed-pane lifecycle state (`creating`, `ready`, `degraded`, `failed`, `removed`)? [Completeness, Spec §FR-007]
+- [ ] CHK005 Is the visual treatment for "managed/adopted origin" (SC-002) specified (badge, icon, label, color)? [Clarity, Spec §SC-002]
+- [ ] CHK006 Are operator-facing diagnostic UI requirements specified for FR-013's "failed pane, failed stage, suggested recovery action"? [Completeness, Spec §FR-013]
+- [ ] CHK007 Is the UI for the predecessor → recreated linkage defined (how the operator sees the chain)? [Gap, Spec §FR-011]
+- [ ] CHK008 Are confirmation/affirmation UI requirements specified for destructive lifecycle actions (remove, recreate)? [Gap, Spec §FR-010]
+- [ ] CHK009 Are visual cues defined for `SESSION_NAME_CONFLICT` and other error conditions surfaced to the operator? [Gap, Spec §FR-016]
+- [ ] CHK010 Is the surface for the audit/history view (FR-021 indefinite retention) defined or scoped out? [Gap, Spec §FR-021]
+- [ ] CHK011 Is the input shape for "provide or select configured launch commands" (FR-002) defined (free-text, dropdown, hybrid)? [Clarity, Spec §FR-002]
+
+## Requirement Clarity
+
+- [ ] CHK012 Are the visual treatments for `degraded` and `failed` distinct enough to be unambiguous at a glance? [Clarity, Spec §FR-007]
+- [ ] CHK013 Are visual hierarchy requirements specified for the relative importance of layouts vs panes vs agents in the same view? [Gap]
+- [ ] CHK014 Are operator-facing copy/wording requirements specified to keep the canonical term "operator" across all UI strings? [Consistency, Spec §Clarifications]
+- [ ] CHK015 Is the UI behavior defined during the "second request waits" path of FR-019 serialization (spinner, queue position, estimated wait)? [Gap, Spec §FR-019]
+
+## Requirement Consistency
+
+- [ ] CHK016 Are UI requirements for managed-vs-adopted distinction consistent across agent lists, routes, queues, and events views (FR-008)? [Consistency, Spec §FR-008]
+- [ ] CHK017 Are confirmation-prompt UI requirements consistent between remove and recreate flows (FR-010, FR-011)? [Consistency]
+
+## Scenario Coverage
+
+- [ ] CHK018 Are loading/empty-state UI requirements specified for the layout list when no managed layouts exist? [Coverage, Gap]
+- [ ] CHK019 Are UI requirements specified for the Recovery Flow when an operator returns to a partially-failed layout? [Coverage, Gap, Spec §FR-013]
+- [ ] CHK020 Are UI requirements specified for the daemon-restart recovery scenario (operator notification, transparent reattach, or both)? [Coverage, Gap, Spec §SC-008]
+- [ ] CHK021 Are UI requirements specified for the Exception Flow when the bench container disappears mid-creation? [Coverage, Gap, Spec §Edge Cases]
+
+## Edge Case Coverage
+
+- [ ] CHK022 Are UI requirements specified for surfacing a pending-managed pane to the operator before registration completes? [Gap, Spec §FR-014]
+- [ ] CHK023 Are UI requirements specified for the case where an operator attempts a destructive action on an adopted pane (FR-012)? [Gap, Spec §FR-012]
+
+## Non-Functional UX
+
+- [ ] CHK024 Are responsive/breakpoint requirements defined for the control panel surfaces this feature affects? [Gap]
+- [ ] CHK025 Are perceived-performance requirements specified for stages within the SC-001 2-minute budget (e.g., first feedback within X seconds)? [Gap, Spec §SC-001]
diff --git a/specs/013-managed-session-lifecycle/clarify-questions.md b/specs/013-managed-session-lifecycle/clarify-questions.md
new file mode 100644
index 0000000..375d2a0
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/clarify-questions.md
@@ -0,0 +1,110 @@
+# Clarify Questions — FEAT-013 Alignment Cleanup (Round 3)
+
+**Session:** 2026-05-24 (alignment cleanup)
+**Spec:** [spec.md](./spec.md)
+**Trigger:** `alignment-check.md` "Worth investigating" items (CHK032, CHK033, CHK034, CHK037, CHK038)
+**Reply format:** Answer with the option letter (e.g., `1: A`), `recommended` to take the recommended option, or a short free-form answer (≤5 words). Multi-answer form OK: `1: A, 2: recommended, ...`.
+
+---
+
+## Q1. Plan back-reference to post-plan Clarifications sub-session (CHK032)
+
+Should `plan.md` cite the spec.md "Session 2026-05-24 (post-plan review)" sub-session as the documented origin of FR-022 / FR-023 / FR-024 / SC-009?
+
+**Recommended:** Option A — explicit back-references give a future reader a one-hop audit trail from plan to spec without searching FR IDs.
+
+| Option | Description |
+|--------|-------------|
+| A | Add a one-line back-reference in plan.md (Summary or Technical Context) pointing to spec §Clarifications "post-plan review" as the FR-022/023/024 + SC-009 origin. |
+| B | No — the FR IDs already provide traceability; readers can find the sub-session in spec.md without help. |
+| C | Cross-reference only from research.md (where R5/R4/R8/R9 already exist), not plan.md. |
+
+---
+
+## Q2. User-story traceability for FR-022 / FR-023 / FR-024 / SC-009 (CHK033)
+
+These four new requirements are arguably system-level (TTL sweep, depth bound, override capability, restart visibility). How should they be traced?
+
+**Recommended:** Option B — each maps cleanly to an existing User Story; that preserves the "every FR/SC traces to a US" property without inventing a new US.
+
+| Option | Description |
+|--------|-------------|
+| A | Mark all four as "Cross-cutting / System-level" in their FR/SC text and document that they intentionally have no User Story home. |
+| B | Map each to an existing US in the FR/SC text: FR-022 / SC-009 → US3 (lifecycle / recovery); FR-023 → US3 (recreate); FR-024 → US1 (layout creation). |
+| C | Add a new "User Story 4 — Operational Recovery and Operator Overrides" covering these four explicitly. |
+
+---
+
+## Q3. plan-review.md CHK036–041 resolution disposition (CHK034)
+
+Are CHK036–041 closed by the spec edits alone, or do FR-022 (TTL sweep) / FR-020 + SC-009 (detail-surface) imply specific implementation footprints that need separate task capture?
+
+**Recommended:** Option B — the spec edits do close the requirements gaps, but FR-022 and FR-020/SC-009 imply real code (sweep loop, detail-surface fields). Acknowledging that now keeps the audit trail honest.
+ +| Option | Description | +|--------|-------------| +| A | Resolved by spec edits alone — tick CHK036–041 in plan-review.md and move on. | +| B | Spec is updated AND FR-022 / FR-020 / SC-009 imply specific implementation work to capture as tasks; tick CHK036–041 and queue the implementation tasks when `/speckit.tasks` runs. | +| C | Defer the resolution decision to `/speckit.analyze`. | + +--- + +## Q4. Error code for FR-022 TTL-driven failures (CHK037) + +FR-022's TTL sweep transitions a pending pane to `failed` with `failed_stage = pane_create` or `registration`. Should this surface a dedicated error code, or stay observable via the pane state + `failed_stage` alone? + +**Recommended:** Option A — `failed_stage` is the canonical operator signal; the sweep is daemon-internal and should not invent a new closed-set code. + +| Option | Description | +|--------|-------------| +| A | No new error code. TTL sweep is internal; the operator sees the resulting `failed` state + `failed_stage` (`pane_create` or `registration`) — exactly the FR-013 closed set, no new vocabulary. | +| B | Add a new `managed_pane_pending_marker_expired` error code, surfaced when an operator queries during the sweep race. | +| C | Extend `managed_pane_recreate_chain_too_deep` details schema to also cover TTL failures (as CHK037's literal phrasing suggests). | + +--- + +## Q5. SC-006 wording vs FR-013 enum (CHK038) + +FR-013 now declares the closed `failed_stage` enum. SC-006 still says "with a specific failed stage and recovery action visible to the operator" abstractly. + +**Recommended:** Option A — point SC-006 at FR-013; single-source the enum and keep the SC short. + +| Option | Description | +|--------|-------------| +| A | Update SC-006 to reference FR-013 by ID: "...with `failed_stage` from the FR-013 closed set and a recovery action visible to the operator." | +| B | Inline the six enum values in SC-006. 
|
+| C | Leave SC-006 abstract; FR-013 carries the canonical enum and SC-006 stays at the success-criterion level. |
+
+---
+
+## How to reply
+
+- `1: A, 2: recommended, 3: B, ...`
+- `all recommended` to accept every recommendation
+- `recommended except 3: A` to accept recommendations with overrides
+- For any question, supply a short free-form answer (≤5 words) instead of an option letter.
+
+## Answers
+
+1: A
+
+2: B
+
+3: B
+
+4: A
+
+5: A
+
+Notes:
+
+- Add the plan back-reference to the post-plan clarifications session so FR-022, FR-023, FR-024, and SC-009 have a one-hop audit trail.
+- Trace the new system-level items to existing stories: FR-022 and SC-009 to US3, FR-023 to US3, and FR-024 to US1.
+- Treat CHK036-CHK041 as requirement gaps closed by spec edits, while preserving the implementation work for task generation.
+- Do not add a new TTL-specific error code; the operator-facing signal is the failed pane state plus FR-013 `failed_stage`.
+- Update SC-006 to reference the FR-013 closed `failed_stage` set rather than duplicating the enum.
+
+After your replies I will:
+1. Apply the answers to spec.md (FR/SC wording adjustments), plan.md (back-reference, if Q1=A), error-codes.md (if Q4≠A), and plan-review.md (if Q3 → tick boxes + amendment note).
+2. Re-run a quick consistency pass on FR/SC numbering and traceability.
+3. Identify any items that need to be captured as forward-pointing tasks for `/speckit.tasks` (if Q3=B).
diff --git a/specs/013-managed-session-lifecycle/contracts/error-codes.md b/specs/013-managed-session-lifecycle/contracts/error-codes.md
new file mode 100644
index 0000000..12eaaab
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/contracts/error-codes.md
@@ -0,0 +1,115 @@
+# Contract: Closed-Set Error Codes (FEAT-013 additions)
+
+**Feature**: 013-managed-session-lifecycle
+**Authority**: spec.md §FR-013/016/018; research.md.
+
+This file lists the **new** closed-set error codes added by FEAT-013, extending the FEAT-011 27-entry registry. Each entry follows the FEAT-011 `(code, message-shape, details schema)` convention.
+
+The full closed set for an `app.managed_*` or legacy `managed.*` response continues to include the prior FEAT-011 codes (`validation_failed`, `host_only`, `not_implemented`, `internal_error`, `malformed_request`, etc.) — those are reused unchanged.
+
+---
+
+## New codes
+
+### `managed_template_not_found`
+
+- **When**: `managed.layout.create` is called with a `template_name` that does not resolve via the built-in registry or the operator YAML override directory.
+- **Details schema**:
+  ```json
+  {"template_name": "string", "known_templates": ["string", "..."]}
+  ```
+- **Operator action**: Verify the template name or define it in `~/.config/opensoft/agenttower/managed_templates/`.
+- **Resolution order** (per FR-024): operator override file with the same `name` wins over the built-in default; if neither resolves, this error fires.
+
+### `managed_launch_command_not_found`
+
+- **When**: A `launch_command_overrides` entry or a template's `default_launch_command_ref` references a profile that does not exist in `~/.config/opensoft/agenttower/launch_commands/`.
+- **Details schema**:
+  ```json
+  {"profile_name": "string", "known_profiles": ["string", "..."]}
+  ```
+- **Resolution order** (per FR-024): operator-supplied profile with the same `name` overrides any built-in default before this error is raised.
+
+### `managed_session_name_conflict` (FR-016, Q6)
+
+- **When**: `managed.layout.create` requests a `tmux_session_name` that already exists in the target container.
+- **Details schema**:
+  ```json
+  {"container_id": "string", "tmux_session_name": "string"}
+  ```
+- **Operator action**: Choose a different `tmux_session_name` or kill the existing tmux session first.
+- **Note**: This is a hard rejection — no silent suffixing or session reuse (per Q6 decision).
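The resolution order stated for `managed_template_not_found` (operator override first, then built-in default, then the error) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the daemon's implementation: `ManagedError`, `resolve_template`, and the two dictionaries standing in for the operator override directory and the built-in registry are hypothetical names. Filling `known_templates` with the sorted union of both sources is also an assumption; the contract only gives the field's shape.

```python
from typing import Any, Dict


class ManagedError(Exception):
    """Hypothetical carrier for a closed-set error code and its details."""

    def __init__(self, code: str, details: Dict[str, Any]) -> None:
        super().__init__(code)
        self.code = code
        self.details = details


def resolve_template(
    template_name: str,
    operator_overrides: Dict[str, dict],
    builtin_templates: Dict[str, dict],
) -> dict:
    """Resolve a template: operator override first, then built-in default."""
    if template_name in operator_overrides:
        return operator_overrides[template_name]
    if template_name in builtin_templates:
        return builtin_templates[template_name]
    # Assumption: known_templates is the sorted union of both sources.
    known = sorted(set(operator_overrides) | set(builtin_templates))
    raise ManagedError(
        "managed_template_not_found",
        {"template_name": template_name, "known_templates": known},
    )
```

The same lookup shape would plausibly serve `managed_launch_command_not_found`, with `profile_name` and `known_profiles` in place of the template fields.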
+
+### `managed_layout_not_found`
+
+- **When**: A layout-scoped method (`managed.layout.detail`, `managed.pane.list?layout_id=`, etc.) references an unknown `layout_id`.
+- **Details schema**:
+  ```json
+  {"layout_id": "string"}
+  ```
+
+### `managed_pane_not_found`
+
+- **When**: A pane-scoped method references an unknown `pane_id` or `predecessor_pane_id`.
+- **Details schema**:
+  ```json
+  {"pane_id": "string"}
+  ```
+
+### `managed_pane_protected_adopted` (FR-012)
+
+- **When**: A destructive `managed.pane.*` action targets a pane id that exists in the FEAT-006 agent registry but **not** in `managed_pane` — i.e., it was adopted, not created by AgentTower.
+- **Details schema**:
+  ```json
+  {"agent_id": "string", "is_adopted": true}
+  ```
+- **Operator action**: Use the FEAT-006 adopt/unadopt path; or wait for the later promote-from-adopted feature.
+
+### `managed_pane_illegal_transition`
+
+- **When**: A request would trigger a transition not in the state-machine graph (e.g., `remove` while `creating`).
+- **Details schema**:
+  ```json
+  {"pane_id": "string", "current_state": "string", "requested_action": "string"}
+  ```
+
+### `managed_pane_illegal_recreate_source`
+
+- **When**: `managed.pane.recreate` references a `predecessor_pane_id` whose state is not `removed` or `failed`.
+- **Details schema**:
+  ```json
+  {"predecessor_pane_id": "string", "current_state": "string"}
+  ```
+
+### `managed_pane_recreate_chain_too_deep` (R4)
+
+- **When**: Predecessor's `chain_depth >= 15` (a new record would be at depth 16, which is the configured bound).
+- **Details schema**:
+  ```json
+  {"predecessor_pane_id": "string", "predecessor_chain_depth": 15, "limit": 16}
+  ```
+- **Operator action**: Start a fresh layout rather than continuing the recreate chain.
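The depth check behind `managed_pane_recreate_chain_too_deep` can be sketched as follows. This is a minimal Python sketch, assuming a hypothetical `check_recreate_depth` helper and `RECREATE_CHAIN_LIMIT` constant; neither name comes from the spec, and the `message` text is invented for illustration. Only the `chain_depth >= 15` threshold, the limit of 16, and the details fields are taken from the entry above.

```python
from typing import Optional

# Hypothetical constant name; the contract only states the limit value of 16.
RECREATE_CHAIN_LIMIT = 16


def check_recreate_depth(
    predecessor_pane_id: str, predecessor_chain_depth: int
) -> Optional[dict]:
    """Return an error object if the recreate would reach the limit, else None."""
    # A recreated pane sits one level deeper than its predecessor.
    new_depth = predecessor_chain_depth + 1
    if new_depth >= RECREATE_CHAIN_LIMIT:
        return {
            "code": "managed_pane_recreate_chain_too_deep",
            # Assumption: the message wording is illustrative only.
            "message": "recreate chain limit reached",
            "details": {
                "predecessor_pane_id": predecessor_pane_id,
                "predecessor_chain_depth": predecessor_chain_depth,
                "limit": RECREATE_CHAIN_LIMIT,
            },
        }
    return None
```

Under this reading, a predecessor at depth 14 can still be recreated (the new record lands at depth 15), while a predecessor at depth 15 is rejected.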
+
+---
+
+## Reused codes (no change)
+
+These FEAT-011 codes are also returned by FEAT-013 paths and retain their existing shapes:
+
+- `validation_failed` — field-shape violations; details include `field`, `reason`.
+- `host_only` — bench-container peer targeted a host-only method or a foreign container.
+- `not_implemented` — used by the `promote_from_adopted` stub; details include `reserved_since: "FEAT-013"`.
+- `internal_error` — unhandled exception; details are the redacted exception class name.
+- `malformed_request` — NDJSON framing or UTF-8 violation before dispatch.
+- `container_not_found` — FEAT-003 code; returned when `container_id` is unknown.
+- `payload_too_large` — FEAT-011 code; bounds inherit from FEAT-011 FR-003a.
+
+---
+
+## Code count
+
+FEAT-011 baseline: 27 codes.
+FEAT-013 additions: **9** new codes (listed above).
+FEAT-013 total in registry: **36** codes.
+
+This is an additive evolution within `app_contract_version = "1.0"`; clients that don't recognize the new codes still see the generic `code`/`message`/`details` envelope and can surface them to the operator without protocol changes.
diff --git a/specs/013-managed-session-lifecycle/contracts/managed-methods.md b/specs/013-managed-session-lifecycle/contracts/managed-methods.md
new file mode 100644
index 0000000..d629897
--- /dev/null
+++ b/specs/013-managed-session-lifecycle/contracts/managed-methods.md
@@ -0,0 +1,294 @@
+# Contract: Managed-Session API Methods
+
+**Feature**: 013-managed-session-lifecycle
+**Authority**: spec.md §FR-001/002/004/005/008/010/011/012/015/016/017/018/019/020/021; research.md.
+
+This contract defines the wire shapes for the FEAT-013 method set in **two parallel namespaces**:
+
+- **Legacy CLI namespace** — `managed.*` methods on the existing FEAT-002 socket dispatcher; reachable from host CLI and bench-container thin clients. Thin-client callers may only target their own container (peer-detected; cross-container returns `host_only`).
+- **App contract namespace** — `app.managed_*` methods on the FEAT-011 host-only dispatcher; same JSON envelope as the rest of `app.*`.
+
+Both namespaces dispatch into the same `managed_sessions.service` entry points. The shapes below are identical between namespaces; method **names** differ as noted at the top of each method block.
+
+All examples use NDJSON over the local Unix socket. Field types follow FEAT-011 conventions: `state_priority`, `role_priority`, pagination defaults, and the standard envelope.
+
+---
+
+## Envelope
+
+Inherits FEAT-011 verbatim:
+
+- Success: `{"ok": true, "app_contract_version": "1.0", "result": {...}}`
+- Failure: `{"ok": false, "app_contract_version": "1.0", "error": {"code": "", "message": "...", "details": {...}}}`
+
+(Note: legacy `managed.*` methods use FEAT-002's existing envelope, which is the same shape minus `app_contract_version`.)
+
+---
+
+## Methods
+
+### M1. `managed.layout.create` / `app.managed_layout_create`
+
+Create a managed layout in a bench container.
+
+**Request**:
+```json
+{
+  "method": "managed.layout.create",
+  "container_id": "bench-abc",
+  "template_name": "1m+2s",
+  "tmux_session_name": "session-alpha",
+  "launch_command_overrides": {
+    "master:m1": "claude-master",
+    "slave:s1": "claude-worker",
+    "slave:s2": "claude-worker"
+  },
+  "idempotency_key": "operator-clicked-create-12345"
+}
+```
+
+- `container_id` (string, required) — FEAT-003 container id.
+- `template_name` (string, required) — must resolve via the template registry (built-in or YAML override).
+- `tmux_session_name` (string, required) — must not exist in the target container; otherwise `managed_session_name_conflict` (FR-016).
+- `launch_command_overrides` (object, optional) — keyed by `":