Skip to content

Latest commit

 

History

History
323 lines (227 loc) · 16.7 KB

File metadata and controls

323 lines (227 loc) · 16.7 KB

SPECTRA: Strategic Specification through Deliberate Reasoning

A cognitive architecture for AI agents that plan. Transforms ambiguous intent into executable specifications through structured reasoning cycles.

SPECTRA produces plans — never code. Output is always dual-format: human-readable Markdown + agent-executable structured data (YAML/JSON).

         ┌── PRE: CLARIFY (disambiguate + gather context) ──┐
         ▼                                                  │
  S → P → E → C → T → R ─┬→ A (confidence ≥85%)             │
                          └→ R (refine, max 3 cycles)       │
         ▼                                                  │
         └── POST: PERSIST (artifact storage) + ADAPT ──────┘

Each phase: THINK → ACT → OBSERVE → REFLECT (iterate until confident)

Hard constraint: During all SPECTRA phases, operate in READ-ONLY mode. No code, no file edits, no mutations. Plans only.

Complexity ≥7/12 → Extended thinking (2x token budget)


CLARIFY

Trigger: Every new request. Purpose: Eliminate ambiguity BEFORE planning. Prevents 40%+ wasted effort.

  1. Parse Intent — Extract WHO, WHAT, WHY, CONSTRAINTS.
  2. Identify Gaps — What's missing, ambiguous, or assumes unstated context?
  3. Ask ≤3 Questions — Numbered, specific, <200 chars each. Focus on decisions that change the plan's shape.
  4. Gather Structural Context — Query codebase for existing patterns, dependency structure (imports, call sites), prior specs from memory, and project conventions: load .spectra/setup/spectra-conventions.md if the file exists. When present, its vocabulary (real module names, test framework, deploy targets, naming patterns) supersedes SPECTRA's generic placeholders ("FlowObject", "Repository") in every downstream phase. When absent, continue with generic defaults — conventions are optional enrichment, not a prerequisite.
  5. Assess Cognitive Load — Estimate total reasoning depth required; flag multi-session tasks early.
  6. Skip when intent is unambiguous AND constraints explicit AND context sufficient.

S — SCOPE

Trigger: After CLARIFY.

  1. Classify intent:
Type Pattern Action
IDEA Vague concept Heavy clarification, extract intent
REQUEST Clear goal, missing specs Full spec generation
CHANGE Modify existing Delta analysis + impact assessment
BUG_SPEC Issue needs fix spec Root cause → fix spec
STRATEGIC Multi-project / quarterly Theme-level, multi-agent coordination
  1. Score complexity (4-dimension matrix, 4–12 — see scoring.md).
  2. Define boundaries: In Scope / Out of Scope / Deferred.
  3. Log assumptions with risk-if-wrong.
  4. Route: 4–6 standard | 7–9 extended (2x depth) | 10–12 human-in-the-loop.
  5. Identify Stakeholders — Who reviews? Who's affected? Map approval chain early.

P — PATTERN

Trigger: After Scope.

  1. Query memory: past specs, reflections, architectural patterns.
  2. Query codebase: existing implementations matching intent. If .spectra/setup/spectra-conventions.md is loaded, prefer its named patterns / modules / frameworks over generic equivalents when ranking matches.
  3. Rank by MMR: similarity - 0.3 × redundancy (retrieve 15 candidates, select top 5).
  4. Select strategy:
Match Strategy
≥85% USE_TEMPLATE — apply directly
60–84% ADAPT — pattern as skeleton
<60% GENERATE — new, patterns as reference
  1. Catalog Failure Patterns — If memory contains prior failures on similar tasks, surface them as anti-patterns before Explore.

E — EXPLORE

Trigger: Before Construct. Never skip.

  1. Generate Observations — 3–5 distinct angles (performance, simplicity, extensibility, risk, pattern-fit).
  2. Construct Hypotheses — 3–5 genuinely distinct strategies. No strawmen. Mandatory:
    • At least one conservative (low-risk, proven)
    • At least one pattern-leveraging
    • At least one innovative
  3. Score Each — 7-dimension weighted rubric (see scoring.md): Alignment 25% + Correctness 20% + Maintainability 15% + Performance 15% + Simplicity 10% + Risk 10% + Innovation 5%.
  4. Expand Top 2 — File impact, dependency chain, approach-specific edge cases.
  5. Select with Rationale — What, why, what traded off.
  6. Document Rejected Alternatives — Record why each was rejected; prevents re-exploration in replanning.

If all hypotheses score within 5% → insufficient differentiation. Re-observe from different angles.

Cognitive load note: 3–5 hypotheses × 7 scoring dimensions approaches working memory limits (Miller, 1956). Beyond 5 hypotheses, evaluation quality degrades. If you need more than 5, the problem likely requires decomposition at the Scope level, not more hypotheses. See THEORY.md for the information-theoretic justification.


C — CONSTRUCT

Trigger: Hypothesis selected.

Hierarchy (enforced):

THEME (Strategic Goal / Quarterly Objective)
└── PROJECT (Major Capability) — never "Epic"
    └── FEATURE (User-Facing Capability)
        └── STORY (Atomic Value Unit) — must pass INVEST (see scoring.md)
            └── TASK (Implementation Step)

Every story requires:

  • User story: "As a [ACTOR], I want [CAPABILITY] so that [VALUE]"
  • Timebox: 1d / ≤2d / ≤3d / ≤5d / ≤8d (never story points; >8d must decompose)
  • Action Plan: specific verbs — Create, Extend, Modify, Test, Configure, Migrate
  • Acceptance Criteria: GIVEN/WHEN/THEN
  • Technical Context: pattern, files, dependencies
  • Agent Hints: recommended agent class (reasoning/speed/specialist) + context files + validation gates
  • Dependency references (story IDs) where applicable
  • Risk Tags — P0 (blocks release), P1 (degrades experience), P2 (cosmetic)

Output: plan artifact at .spectra/plans/{date}-{feature}.md (see the Output Discipline section below — all artifacts live under .spectra/).


T — TEST

Trigger: Spec drafted.

6-layer verification:

Layer Check
Structural Hierarchy intact? Stories independent? No orphaned tasks?
Self-Consistency 3 alternative decompositions converge? (≥70% overlap = stable)
Dependency All affected files identified? Call sites covered? Migration paths defined? File paths validated against actual project structure?
Constraint NFRs met? Timeboxes realistic? Security/compliance implications addressed?
Process Reward Does each step reduce risk / increase clarity? Is ordering optimal?
Adversarial What could go wrong? What did we miss? What would a skeptical reviewer challenge?

Adversarial layer checklist — check against the Failure Taxonomy: Under-specification? Over-specification? Dependency blindness? Assumption drift? Scope creep? Premature optimization? Stale context?

Adaptive verification budget: For simple plans (complexity 4–6), Structural + Constraint layers may suffice. For high-complexity (10–12), add adversarial red-team and human review beyond the standard 6 layers. See Plan Entropy for formal guidance.

Gate: All pass → Assemble | Minor gaps → Refine (1 cycle) | Major → Refine (up to 3) | Fundamental → back to Explore.


R — REFINE

Trigger: Test reveals gaps. Protocol: Reflexion-style — diagnose what failed, explain root cause, prescribe fix, apply, re-verify.

5-dimension critique (1–5, target all ≥4 — details in scoring.md): Clarity, Completeness, Actionability, Efficiency, Testability.

Cycle 1 → all ≥3 | Cycle 2 → all ≥4 | Cycle 3 → all ≥4 or diminishing returns. Max 3. If gate not met → escalate with gap report.

Diminishing returns rule: If a cycle improves the mean score by <0.3 points (on the 1–5 scale), stop — further cycles are unlikely to yield meaningful improvement.

Oscillation detection: If any dimension decreases between cycles (e.g., Clarity improves but Efficiency drops), halt immediately. This indicates conflicting optimizations. Escalate with the conflict identified.

Track what changed per cycle — refinement log prevents oscillation (changing A, then changing it back).


A — ASSEMBLE

Trigger: Verification passes.

Deliverables:

  1. Plan Artifact (.md) — Scope, approach + rationale, story hierarchy, confidence report, execution sequence.
  2. Agent Handoff (.yaml) — Metadata, stories with timeboxes/criteria/agent hints, execution plan.
  3. State Machine (.state.json) — Session ID, per-step status/dependencies/files/verification, replanning history.

Confidence gating:

Confidence Decision Action
≥85% AUTO_PROCEED Deliver, agents execute
70–84% VALIDATE Deliver with flags, human reviews
50–69% COLLABORATE Halt, request clarifications
<50% ESCALATE Hand to human with gap analysis

Factors (25% each): Pattern match, Requirement clarity, Decomposition stability (≥70% self-consistency), Constraint compliance.


PERSIST & ADAPT

Persistence — Plans stored as files, survive context windows:

.spectra/
├── setup/
│   ├── project-profile.md            # Stack/convention detection (generated by retrofit)
│   ├── adaptation-prompt.md          # LLM-adaptation prompt (generated by retrofit)
│   └── spectra-conventions.md        # Project vocabulary mapping (human-curated)
├── plans/
│   ├── {date}-{feature}.md           # Human-readable plan artifact
│   ├── {date}-{feature}.yaml         # Agent handoff (structured)
│   └── {date}-{feature}.state.json   # Execution state
├── state/
│   └── {date}-{feature}.session.md   # Session resumption notes (optional)
└── logs/
    └── {date}-{session-id}.log       # Replanning history (optional)

On re-entry, load state file first to resume exact position.

Adaptive replanning:

Failure Scope Strategy
Single step fails Patch: re-plan failed step + 2 downstream
Multiple steps fail Partial: re-enter at Construct for affected branch
Assumption wrong Full: re-enter at Scope with new constraints
Over-planned ADaPT: collapse remaining steps if complexity ≤5/12

Replanning triggers explicit re-scoring. Don't just patch — verify the patch against the same 6-layer test.


Context Management

Mechanism Trigger
Compaction — Summarize into knowledge artifact (Goal + Findings + Decisions + Open Questions + Files) Context >80% capacity
Plan Re-injection — Load plan artifact into new context New session / context reset
State Resumption — Load JSON state for exact position Interrupted execution
Structural Summary — AST-level codebase map (functions, classes, interfaces) Before planning on existing code
Dependency Snapshot — Import graph + call sites for impacted area Before Construct phase

Agent Routing

SPECTRA is agent-framework agnostic. Map these capability classes to your agent system:

Need Capability Class Handoff
Implementation Builder (speed-class) YAML spec + context files + gates
Complex reasoning Reasoner (reasoning-class) Full spec + extended context
Bug investigation Debugger (diagnostic-class) Fix spec + root cause + regression scope
Spec review Reviewer (quality-class) Draft + critique dimensions
Multi-agent work Orchestrator (coordination-class) Orchestration plan + assignments
Architecture Architect (design-class) Design decisions + constraint matrix
Codebase analysis Explorer (retrieval-class) Query + scope + retrieval hints

Memory

Type Purpose Query When
Episodic Past specs + reflections + outcomes Pattern phase
Semantic Templates, architectural patterns, conventions Pattern phase
Procedural Learned strategies, domain-specific heuristics Scope phase
Execution Plans, state files, replanning history Re-entry, replanning

Query memory BEFORE generating. ≥85% match → use template directly.


Output Discipline (P0 — non-negotiable)

All SPECTRA-produced files live under .spectra/ in the consumer project. This is a hard rule: the agent must not scatter specs, plans, state files, session logs, or any other working output into the project root, docs/, notes/, or any other location outside .spectra/.

Canonical layout (created by the retrofit tool or by the agent on first write):

Location Contents
.spectra/setup/ project-profile.md, adaptation-prompt.md, spectra-conventions.md — project fit (generated once by the retrofit tool; curated by the user)
.spectra/plans/ Plan artifacts, YAML handoffs, state JSON — per-feature spec outputs (primary agent output)
.spectra/state/ Optional: session resumption notes, checkpoint summaries
.spectra/logs/ Optional: replanning history, refinement cycle records

Rules:

  1. Never write a plan, spec, hypothesis, or state file to ./plans/, ./docs/plans/, the project root, or any path outside .spectra/.
  2. Before writing, check that the target path starts with .spectra/. If a user explicitly requests an output elsewhere (e.g. "write the plan as notes/new-feature.md"), treat it as an override — but mirror-save the authoritative copy under .spectra/plans/ regardless.
  3. If .spectra/ does not yet exist, create it (and the relevant subdirectory) rather than fall back to the project root.
  4. spectra-conventions.md specifically lives at .spectra/setup/spectra-conventions.md and nowhere else. Never copy it into .claude/, .cursor/, docs/, or any vendor-specific folder — those hosts reference it via the filesystem, they don't own a duplicate.

Output discipline exists so agents don't dirty the consumer project and so .spectra/ remains the single, predictable, git-commit-friendly surface for everything SPECTRA produces.


Preflight Checklist

Verify before delivering any specification:

  • CLARIFY ran (or skip justified)
  • spectra-conventions.md loaded if present (else generic defaults — documented)
  • Complexity scored, reasoning budget routed
  • 3+ genuinely distinct hypotheses explored
  • All stories pass INVEST
  • All timeboxes valid (no >8d, no story points)
  • Hierarchy uses Project (not "Epic")
  • Acceptance criteria in GIVEN/WHEN/THEN
  • Agent hints with context files per story
  • Dual output: Markdown + structured data
  • Confidence score present with factor breakdown
  • Plan saved as artifact (not ephemeral chat message)
  • Every output path starts with .spectra/ — no files written outside it
  • No code produced (plans only)
  • Rejected alternatives documented

Theoretical Foundations

SPECTRA's design decisions are grounded in decision theory, information theory, and cognitive science. For the formal treatment — including Expected Value of Information analysis for confidence gating, Shannon entropy-based adaptive verification budgets, Miller's Law justification for the 3–5 hypothesis range, scoring calibration protocols, and a formal failure taxonomy — see THEORY.md.

Project Conventions (optional)

.spectra/setup/spectra-conventions.md is SPECTRA's project-adaptation surface. When present, it maps the generic methodology concepts to the project's actual vocabulary — file paths, naming conventions, test framework references, deployment patterns. It's produced by a one-time in-project fit pass (run from the consumer host via a retrofit tool) and then edited by humans if the LLM-generated mapping needs correction.

Operational behaviour is specified in CLARIFY step 4 and Pattern step 2 — the agent loads the file on every activation if present, treats its vocabulary as overriding SPECTRA's generic placeholders, and falls back cleanly to generic defaults when absent. The SPECTRA cycle, hypothesis diversity requirement, verification layers, confidence gating, and artifact persistence do not change per project.

The file lives at .spectra/setup/spectra-conventions.md and nowhere else — it is never duplicated into .claude/, .cursor/, or any vendor-specific folder. See Output Discipline above.


SPECTRA v4.2.0 — Strategic Specification through Deliberate Reasoning