A self-evolving multi-agent orchestration engine built on the Model Context Protocol.
- Executive Summary
- The Great Unification
- Self-Evolving Agent Lifecycle (T3→T2→T1)
- The Spartan Swarm Protocol
- Council Pattern (Map-Reduce)
- Skills System
- Plan Mode & Separation of Concerns
- Issue-First SDLC
- Memory & Reflection
- Autonomous Operations
- Security Architecture
Optimus Code is a multi-agent orchestration engine that transforms any MCP-compatible AI coding tool into a coordinated development team. It works with VS Code (GitHub Copilot), Cursor, Windsurf, Claude Code, Goose, Roo Cline, and any other client that speaks the Model Context Protocol.
Rather than relying on a single AI assistant to handle every task — planning, coding, reviewing, testing — Optimus decomposes work across specialized agent roles: Product Manager, Architect, Developer, QA Engineer, and more. These agents are not preconfigured. They emerge dynamically as the system encounters new task types, evolve their role definitions through use, and accumulate project memory across sessions.
The result is a system where:
- One natural-language prompt triggers a complete software development lifecycle (Issue → Branch → PR → Merge).
- Agents self-organize via a three-tier lifecycle: ephemeral workers precipitate into role templates, then freeze as reusable instances.
- Parallel expert councils debate architectural decisions using a map-reduce pattern before any code is written.
- Project memory ensures past mistakes and decisions persist, so the team improves with every task.
Optimus is 100% editor-agnostic — a pure Node.js MCP daemon with no VS Code extension dependency.
Traditional AI coding assistants are tightly coupled to a specific editor. Their orchestration logic lives inside VS Code extensions, Cursor plugins, or proprietary backends. This creates fragmentation: if you switch editors, you lose your agent infrastructure.
Optimus Code follows a "Great Unification" architecture. The MCP Server (optimus-plugin/dist/mcp-server.js) is a standalone Node.js daemon that communicates via stdio transport. It has zero dependency on any editor's extension API.
┌──────────────────────────────────────────────┐
│ Any MCP Client (VS Code, Cursor, Claude, ..) │
└──────────────────────┬───────────────────────┘
│ stdio (JSON-RPC)
┌──────────────────────▼───────────────────────┐
│ Optimus MCP Server │
│ ┌─────────┬──────────┬───────────────────┐ │
│ │ Managers │ Adapters │ MCP Tool Handlers │ │
│ └─────────┴──────────┴───────────────────┘ │
│ Pure Node.js — No vscode namespace │
└──────────────────────────────────────────────┘
Key constraints enforced in the codebase:
- The `src/adapters/`, `src/mcp/`, and `src/managers/` directories must remain 100% environment-agnostic. No `vscode` namespace imports are permitted.
- All agent artifacts (reports, tasks, memory, reviews) are stored in the `.optimus/` directory, never as loose files in the repository root.
- The server is started with `npx -y github:cloga/optimus-code serve` and configured once; every MCP client connects to the same daemon.
The repository itself contains two intertwined codebases:
| Layer | Path | Purpose |
|---|---|---|
| Host project | Root (`src/`, `docs/`, `.optimus/`) | Optimus's own development workspace |
| Plugin package | `optimus-plugin/` | The npm-publishable MCP server that ships to end-users |
Changes to system instructions, skills, or config must be evaluated for propagation to the plugin scaffold. T1 agent instances, state files, and reports never ship in the plugin.
Optimus communicates with external AI coding agents through adapters — pluggable implementations of the AgentAdapter interface in src/adapters/. Each adapter translates Optimus orchestration commands into the wire protocol understood by a specific agent engine.
| Adapter | Class | Protocol | Agents |
|---|---|---|---|
| `github-copilot` | `GitHubCopilotAdapter` | Copilot CLI text parsing | GitHub Copilot |
| `claude-code` | `ClaudeCodeAdapter` | Claude Code CLI text parsing | Claude Code |
| `acp` | `AcpAdapter` | ACP (Agent Client Protocol): JSON-RPC over stdio | `claude-agent-acp`, Claude Code, GitHub Copilot (`copilot --acp`), Kimi CLI, Qwen Code, Gemini CLI, and any ACP-compliant agent |
ACP Adapter (Epic #319)
The AcpAdapter (src/adapters/AcpAdapter.ts) implements the Agent Client Protocol (ACP), a universal JSON-RPC protocol over stdio. Messages are exchanged as newline-delimited JSON (NDJSON) rather than LSP-style Content-Length frames. ACP replaces legacy CLI text parsing with structured message exchange.
Session lifecycle:
initialize → session/new → session/prompt → session/update (streaming) → response
- `initialize`: JSON-RPC handshake to negotiate capabilities.
- `session/new` (or `session/load` for resumption): Creates or resumes a session.
- `session/prompt`: Sends the user prompt to the agent.
- `session/update`: Streaming notifications for incremental output (maps to `onUpdate` callbacks).
- Final response: The agent's completed output.
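The request side of this lifecycle can be sketched as follows. This is a minimal illustration, not the AcpAdapter's code: it assumes newline-delimited framing (a Content-Length framed transport would need a different encoder), and the `params` payloads are placeholders rather than the real ACP schema. The helper names `encodeFrame`, `decodeFrames`, and `buildSessionRequests` are hypothetical.

```typescript
// Hypothetical sketch: method names come from the lifecycle above; the
// params are placeholders, not the real ACP schema.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Encode one message per line (newline-delimited JSON).
function encodeFrame(msg: JsonRpcRequest): string {
  return JSON.stringify(msg) + "\n";
}

// Split a buffered chunk into complete messages plus the unfinished remainder.
function decodeFrames(buffer: string): { messages: unknown[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? "";
  const messages = lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}

// Build the three requests a client sends, in lifecycle order.
function buildSessionRequests(prompt: string, resume = false): JsonRpcRequest[] {
  const methods = ["initialize", resume ? "session/load" : "session/new", "session/prompt"];
  return methods.map((method, index) => ({
    jsonrpc: "2.0" as const,
    id: index + 1,
    method,
    params: method === "session/prompt" ? { prompt } : {},
  }));
}
```

Streaming `session/update` notifications would arrive through `decodeFrames` as the agent writes them; keeping the unfinished remainder in `rest` lets the caller carry a partial line over to the next chunk.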
ACP coexists with the existing ClaudeCodeAdapter and GitHubCopilotAdapter through the factory pattern in src/adapters/index.ts. The AdapterKind union type ('github-copilot' | 'claude-code' | 'acp') drives adapter selection via available-agents.json configuration.
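A minimal sketch of that factory pattern, assuming a simplified `AgentAdapter` shape. Only the `AdapterKind` union is taken from the text; the `describe()` method, the `factories` table, and `createAdapter` are hypothetical stand-ins for whatever src/adapters/index.ts actually exports.

```typescript
// AdapterKind mirrors the union named in the text; everything else is hypothetical.
type AdapterKind = "github-copilot" | "claude-code" | "acp";

interface AgentAdapter {
  readonly kind: AdapterKind;
  // Hypothetical method: a one-line description of the wire protocol.
  describe(): string;
}

// One constructor per kind. Record<AdapterKind, ...> makes the compiler
// flag any kind that is added to the union but not registered here.
const factories: Record<AdapterKind, () => AgentAdapter> = {
  "github-copilot": () => ({ kind: "github-copilot", describe: () => "Copilot CLI text parsing" }),
  "claude-code": () => ({ kind: "claude-code", describe: () => "Claude Code CLI text parsing" }),
  acp: () => ({ kind: "acp", describe: () => "JSON-RPC over stdio" }),
};

function isAdapterKind(value: string): value is AdapterKind {
  return Object.prototype.hasOwnProperty.call(factories, value);
}

// Resolve a kind string (for example, one read from configuration) to an adapter.
function createAdapter(kind: string): AgentAdapter {
  if (!isAdapterKind(kind)) {
    const valid = Object.keys(factories).join(", ");
    throw new Error(`Unknown adapter kind "${kind}". Valid kinds: ${valid}`);
  }
  return factories[kind]();
}
```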
For GitHub Copilot specifically, do not infer the complete ACP launch contract from the top-level copilot --help summary alone. GitHub's ACP public-preview docs describe additional server modes such as --stdio and --port even when the summary help only surfaces --acp. Optimus therefore treats explicit transport config plus those documented preview capabilities as the source of truth, and defaults headless Copilot ACP launches to copilot --acp --stdio when no explicit ACP args are configured.
ACP transport, Copilot autopilot, and approval policy are orthogonal concepts and should stay separate in engine configuration:
- ACP is a transport choice. It belongs to engine `protocol`, `preferred_protocol`, and the `acp` transport block.
- Copilot autopilot is a continuation policy for the CLI agent. It belongs to `automation.continuation`.
- Approval policy belongs to `automation.mode`.
For GitHub Copilot, official documentation now exposes all three separately:
- `copilot --acp` starts the ACP server in public preview.
- The ACP reference also documents `--stdio` and `--port` server modes even when the top-level CLI help only surfaces `--acp`.
- Autopilot is documented independently as the multi-turn continuation mode for Copilot CLI, typically paired with `--allow-all` and optionally `--max-autopilot-continues`.
Implication for Optimus: do not treat autopilot as evidence of ACP support, and do not treat ACP availability as evidence that Copilot CLI continuation semantics are available on that transport.
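One way to keep these settings separate is to model them as independent fields. The sketch below is an assumption about shape only: the key names (`protocol`, `preferred_protocol`, `acp`, `automation.continuation`, `automation.mode`) come from the text, while the value types, the `args` field, and both helper functions are hypothetical.

```typescript
// Key names follow the text; value types and the args field are assumptions.
interface EngineConfig {
  protocol?: "cli" | "acp";
  preferred_protocol?: "cli" | "acp";
  acp?: { args?: string[] };
  automation?: { continuation?: string; mode?: string };
}

// Default described in the text for headless Copilot ACP launches.
const DEFAULT_COPILOT_ACP_ARGS: readonly string[] = ["--acp", "--stdio"];

// Transport decision: looks only at protocol fields, never at automation.
function usesAcp(config: EngineConfig): boolean {
  return (config.protocol ?? config.preferred_protocol) === "acp";
}

// Explicit ACP args win; otherwise fall back to the documented default.
function resolveCopilotAcpArgs(config: EngineConfig): string[] {
  const explicit = config.acp?.args;
  return explicit !== undefined && explicit.length > 0
    ? [...explicit]
    : [...DEFAULT_COPILOT_ACP_ARGS];
}
```

Because `usesAcp` never reads `automation`, enabling a continuation policy cannot accidentally switch the transport, which is the separation the paragraph above asks for.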
Status: The AcpAdapter is fully implemented with NDJSON transport and JSON-RPC message handling. Verified with Qwen Code v0.12.3 and claude-agent-acp v0.21.0.
Optimus uses a three-tier agent hierarchy that evolves automatically. No roles are pre-installed — the system starts empty and grows organically through use.
| Tier | Storage | Description | Created By |
|---|---|---|---|
| T3 (Ephemeral) | In-memory only | Zero-shot dynamic worker with no persistent file. The Master Agent invents a descriptive role name (e.g., `security-auditor`) and the engine generates a worker on the fly. | Master Agent names it at delegation time |
| T2 (Template) | `.optimus/roles/<name>.md` | Role template with persona instructions, engine/model binding, and behavioral constraints. Created automatically on first T3 use ("precipitation"). | Auto-precipitated from T3; Master Agent evolves it |
| T1 (Instance) | `.optimus/agents/<name>_<hash>.md` | Frozen snapshot of a T2 role after a completed task, including the session ID for context continuity. | Auto-created when a task completes with a `session_id` |
First delegation (T3):
Master invents role name → worker-spawner creates ephemeral agent
↓
Task completes → T2 role template auto-created in .optimus/roles/
↓
Session ID captured → T1 instance created in .optimus/agents/
↓
Next delegation (T1 reuse):
Master provides agent_id → system resumes the T1 session
- T2 ≥ T1: Every T1 agent instance must have a corresponding T2 role template. Orphaned T1s are invalid.
- T1 is frozen: Once created, the body content of a T1 file is never modified. Only the `session_id` field updates when the agent is reused.
- T2 is alive: The Master Agent can update T2 templates with new descriptions, engine bindings, and model settings to evolve the team over time.
- Precipitation is immediate: Unlike threshold-based approaches (which required 3 invocations + 80% success rate), T3→T2 precipitation happens on the very first delegation. This was a deliberate simplification after the earlier threshold model proved fragile.
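The "T2 ≥ T1" invariant can be checked from file names alone. The sketch below assumes the naming patterns given in the tier table (`<name>.md` for roles, `<name>_<hash>.md` for instances) and that the hash contains no underscore; both function names are hypothetical.

```typescript
// Extract the role name from a T1 file name of the form <name>_<hash>.md.
// Assumption: the hash never contains an underscore, so the last "_" is the separator.
function roleOfInstance(fileName: string): string | null {
  const match = /^(.+)_([^_]+)\.md$/.exec(fileName);
  return match?.[1] ?? null;
}

// Return every T1 file that has no matching T2 role template.
function findOrphanedInstances(roleFiles: string[], agentFiles: string[]): string[] {
  const roles = new Set(
    roleFiles.filter((file) => file.endsWith(".md")).map((file) => file.slice(0, -3)),
  );
  return agentFiles.filter((file) => {
    const role = roleOfInstance(file);
    return role === null || !roles.has(role);
  });
}
```

In practice the two arrays would come from listing `.optimus/roles/` and `.optimus/agents/`; any file returned here violates the invariant and is a candidate for cleanup.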
Agents that consistently fail are not deleted — they are quarantined. The quarantine_role MCP tool marks a role as unavailable for dispatch. This prevents cascading failures while preserving the agent's history for debugging. Quarantined agents can be unquarantined after fixes.
T1 garbage collection removes stale instance files that haven't been referenced in configurable time windows, preventing unbounded disk growth.
The Spartan Swarm Protocol defines how the Master Agent discovers, selects, and dispatches work to specialized agents.
Before selecting a worker, the Master first chooses the Optimus entry point:
- `optimus_orchestrate`: preferred for broad or multi-step requests; it chooses delegate/council/plan inside Optimus
- `dispatch_plan_async`: for already-decomposed work with explicit dependency edges
- `delegate_task_async`: for a single already-scoped worker task
This is intentional: the master agent should use the same Optimus-native orchestration surface that end users are given, rather than reaching for some separate built-in sub-agent model first.
Once a direct delegation path is appropriate, every task delegation follows a strict 3-step pipeline:
Step 1 — Camp Inspection (roster_check)
The Master Agent calls roster_check to retrieve the current workforce:
- T1 local instances (stateful, session-resumable)
- T2 project role templates (shared, evolvable)
- Available engines and models from `available-agents.json`
- Registered skills
This step is never skipped — it prevents the Master from hallucinating roles that don't exist.
Step 2 — Manpower Assessment (Role Selection)
The Master matches the task to the roster:
- Prefer T1 if a matching instance exists with relevant session context.
- Fall back to T2 if a role template exists but no instance.
- Invent T3 for niche tasks: just name a role (e.g., `webgl-shader-guru`) and the engine auto-generates a zero-shot worker.
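The preference order in Step 2 can be sketched as a small selection function. The `Roster` shape is a simplification of what `roster_check` might return, and it matches T1 instances by role name only; judging whether a session's context is actually relevant is left to the Master. All type and function names here are hypothetical.

```typescript
// Hypothetical, simplified view of the roster returned by roster_check.
interface Roster {
  instances: { role: string; agentId: string }[]; // T1
  templates: string[]; // T2 role names
}

type Selection =
  | { tier: "T1"; role: string; agentId: string }
  | { tier: "T2"; role: string }
  | { tier: "T3"; role: string };

// Prefer a resumable T1 instance, then a T2 template, else invent a T3 worker.
function selectWorker(role: string, roster: Roster): Selection {
  const instance = roster.instances.find((candidate) => candidate.role === role);
  if (instance !== undefined) {
    return { tier: "T1", role, agentId: instance.agentId };
  }
  if (roster.templates.includes(role)) {
    return { tier: "T2", role };
  }
  return { tier: "T3", role };
}
```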
Step 3 — Deployment (delegate_task / delegate_task_async)
The Master dispatches with structured parameters:
| Parameter | Purpose |
|---|---|
| `role` | Which agent to invoke |
| `role_description` | What this role does (used for T2 template generation) |
| `role_engine` | Which engine (e.g., `claude-code`, `copilot-cli`) |
| `role_model` | Which model (e.g., `claude-opus-4.6-1m`) |
| `task_description` | Detailed instructions |
| `context_files` | Files the agent must read before starting |
| `required_skills` | Skills the agent needs (pre-flight checked) |
| `parent_issue_number` | For issue lineage tracking |
| `output_path` | Where to write results |
When the Master doesn't specify an engine or model, the system resolves them in priority order:
- Master-provided `role_engine` / `role_model` (highest priority)
- T2 role frontmatter `engine` / `model`
- `available-agents.json` (first non-demo engine + first model)
- Hardcoded fallback: `claude-code`
Invalid engine or model names are rejected at the gateway with an actionable error listing valid options from available-agents.json.
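The priority chain and the gateway validation might look like the sketch below. The `EngineEntry` shape is an assumed, simplified reading of available-agents.json (the real schema may differ), and picking the first model of the resolved engine is one interpretation of "first model". The function name `resolveBinding` is hypothetical.

```typescript
// Assumed, simplified shape of one engine entry in available-agents.json.
interface EngineEntry {
  name: string;
  demo?: boolean;
  models: string[];
}

interface Binding {
  engine?: string;
  model?: string;
}

const FALLBACK_ENGINE = "claude-code";

// Resolve engine and model in priority order, rejecting unknown values.
function resolveBinding(
  master: Binding,
  t2: Binding,
  available: EngineEntry[],
): { engine: string; model: string | undefined } {
  const firstReal = available.find((entry) => !entry.demo);
  const engine = master.engine ?? t2.engine ?? firstReal?.name ?? FALLBACK_ENGINE;

  const entry = available.find((candidate) => candidate.name === engine);
  if (available.length > 0 && entry === undefined) {
    const valid = available.map((candidate) => candidate.name).join(", ");
    throw new Error(`Invalid engine "${engine}". Valid engines: ${valid}`);
  }

  const model = master.model ?? t2.model ?? entry?.models[0];
  if (entry !== undefined && model !== undefined && !entry.models.includes(model)) {
    throw new Error(`Invalid model "${model}" for ${engine}. Valid models: ${entry.models.join(", ")}`);
  }
  return { engine, model };
}
```

Listing the valid options in the error message is what makes the rejection actionable: the caller can retry with a corrected value without another round trip to the roster.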
The Master Agent must physically invoke the Optimus MCP tools when orchestrating work (optimus_orchestrate, dispatch_plan_async, delegate_task_async, dispatch_council_async). It is strictly prohibited from simulating a worker's response in plain text or writing ad-hoc scripts to play the role of a subordinate. This is the Strict Delegation Protocol.
When a decision requires multiple expert perspectives — architectural reviews, security audits, design evaluations — Optimus uses the Council Pattern.
- Proposal: The orchestrator writes a proposal document to `.optimus/proposals/PROPOSAL_<topic>.md`.
- Dispatch: `dispatch_council` (or `dispatch_council_async`) spawns multiple expert agents in parallel, each reviewing the same proposal from their specialized perspective.
- Map phase: Each council member writes an independent review to `.optimus/reviews/<council_id>/<role>.md`.
- Reduce phase: The system generates a `COUNCIL_SYNTHESIS.md` that aggregates findings, identifies consensus, and surfaces conflicts.
- Arbitration: The orchestrator reads the synthesis. If no blockers exist, implementation proceeds. If fatal conflicts exist, a `.optimus/CONFLICTS.md` is created for resolution.
dispatch_council({
proposal_path: ".optimus/proposals/PROPOSAL_auth_refactor.md",
roles: ["security-expert", "performance-expert", "code-architect"]
})
This spawns three agents simultaneously. Each reads the proposal through their domain lens. The security expert focuses on authentication vulnerabilities, the performance expert evaluates query patterns, and the architect assesses structural impact.
Councils are inherently async. dispatch_council_async returns immediately with a task ID. The orchestrator polls status via check_task_status and reads results when all members have completed.
Optimus decouples identity from capability:
- Role = WHO does the work (identity, constraints, permissions), stored in `.optimus/roles/`
- Skill = HOW to do the work (operational SOP, workflow steps, tool usage), stored in `.optimus/skills/`
Roles and Skills have a many-to-many relationship, bound at runtime via the required_skills parameter in delegate_task. A single role (e.g., senior-full-stack-builder) can be equipped with different skill combinations for different tasks.
Naming convention: Roles use identity names (e.g., product-manager). Skills use capability names (e.g., feature-dev, git-workflow, council-review). A skill is never named after a role.
When required_skills is specified in a delegation, the system verifies that every skill file exists at .optimus/skills/<name>/SKILL.md before the agent process is spawned. Missing skills cause an immediate rejection with an actionable error — the Master must create them first.
This pre-flight prevents agents from receiving tasks they aren't equipped to handle.
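A minimal sketch of such a pre-flight check, assuming the `.optimus/skills/<name>/SKILL.md` layout described above. The existence check is injected so the logic stays testable; `skillPath` and `preflightSkills` are hypothetical names, not the engine's actual functions.

```typescript
// Build the expected location of a skill file under the .optimus root.
function skillPath(optimusRoot: string, skill: string): string {
  return `${optimusRoot}/skills/${skill}/SKILL.md`;
}

// Throw before any process is spawned if a required skill file is missing.
// The exists callback is injected; real code could pass fs.existsSync.
function preflightSkills(
  optimusRoot: string,
  requiredSkills: string[],
  exists: (filePath: string) => boolean,
): void {
  const missing = requiredSkills.filter((skill) => !exists(skillPath(optimusRoot, skill)));
  if (missing.length > 0) {
    throw new Error(
      `Missing skills: ${missing.join(", ")}. Create them (for example via skill-creator) before delegating.`,
    );
  }
}
```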
The system ships with two meta-skills that enable self-evolution:
| Skill | Purpose |
|---|---|
| `role-creator` | Teaches the Master Agent how to build and evolve the team (T3→T2→T1 lifecycle, engine selection, role definition best practices) |
| `skill-creator` | Teaches agents how to write new SKILL.md files following the correct format |
Three core skills handle operational workflows:
| Skill | Purpose |
|---|---|
| `delegate-task` | Async-first task delegation protocol |
| `council-review` | Parallel expert review (Map-Reduce) |
| `git-workflow` | Issue-First SDLC with branch, PR, and merge |
When a skill doesn't exist, the Master delegates to any agent with required_skills: ["skill-creator"], describing what the new skill should teach. The agent reads the skill-creator SKILL.md, learns the format, and writes the new skill. The original delegation can then be retried.
Without guardrails, orchestrator agents (PM, Architect) tend to write code themselves instead of delegating. This violates separation of concerns — the same agent that defines requirements shouldn't implement them.
Orchestrator roles run with mode: plan in their role definition. In plan mode:
- The agent cannot write to source code files. File write operations are restricted to the `.optimus/` directory via the `write_blackboard_artifact` MCP tool.
- The agent must delegate implementation work to developer roles (e.g., `senior-full-stack-builder`).
- The agent can create proposals, requirements documents, task breakdowns, and review reports, but not code.
The `write_blackboard_artifact` MCP tool allows plan-mode agents to write files exclusively to `.optimus/`. It enforces two layers of path validation:
- Lexical check: `startsWith(optimusRoot + path.sep)` prevents `..` traversal and sibling directory escapes.
- Symlink check: `fs.realpathSync()` on the resolved path prefix prevents symlink-based escapes to directories outside `.optimus/`.
Content validation uses === undefined || === null (not !content) to allow legitimate empty-string writes.
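The two validation layers and the content check can be sketched as follows. This illustrates the technique rather than reproducing the tool's source: `resolveArtifactPath` and `validateContent` are hypothetical names, and walking up to the deepest existing ancestor is one assumed way to apply `fs.realpathSync()` to a path whose final segments do not exist yet.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Resolve a requested path and reject anything that lands outside optimusRoot.
function resolveArtifactPath(optimusRoot: string, relativePath: string): string {
  const root = fs.realpathSync(optimusRoot);
  const target = path.resolve(root, relativePath);

  // Layer 1 (lexical): blocks ".." traversal and sibling names such as ".optimus-evil".
  if (!target.startsWith(root + path.sep)) {
    throw new Error("Path escapes .optimus/ (lexical check)");
  }

  // Layer 2 (symlink): resolve the deepest existing ancestor and re-check it.
  let existing = target;
  while (!fs.existsSync(existing)) {
    existing = path.dirname(existing);
  }
  const real = fs.realpathSync(existing);
  if (real !== root && !real.startsWith(root + path.sep)) {
    throw new Error("Path escapes .optimus/ (symlink check)");
  }
  return target;
}

// Reject only undefined and null so that an empty string stays a valid write.
function validateContent(content: unknown): string {
  if (content === undefined || content === null) {
    throw new Error("content is required");
  }
  return String(content);
}
```

Appending `path.sep` before the prefix comparison matters: without it, a sibling directory whose name merely starts with `.optimus` would pass the lexical check.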
Plan mode is a behavioral constraint enforced through the role template and skill instructions. The orchestrator's prompt explicitly states it cannot write code and must use delegation tools. This is reinforced by the skill system — orchestrators are equipped with planning skills (council-review, feature-dev) that guide them through the delegation workflow.
All code changes in Optimus follow the "Issue First" protocol. No code is written without a tracked work item.
1. Create Issue → vcs_create_work_item (GitHub Issue or ADO Work Item)
2. Branch → git checkout -b feature/issue-<ID>-<desc>
3. Implement → Agent writes code, runs build, runs tests
4. PR → vcs_create_pr with "Fixes #<ID>" in body
5. Merge → vcs_merge_pr (squash merge for clean history)
6. Cleanup → Auto-delete source branch, sync local master
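The naming conventions in steps 2 and 4 are easy to derive mechanically. The helpers below are a hypothetical sketch: the `feature/issue-<ID>-<desc>` pattern and the "Fixes #<ID>" line come from the workflow above, while the slug rules are an assumption.

```typescript
// Build a branch name of the form feature/issue-<ID>-<desc>.
// Assumption: the description is lowercased and non-alphanumerics become hyphens.
function branchName(issueId: number, description: string): string {
  const slug = description
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `feature/issue-${issueId}-${slug}`;
}

// Build a PR body that ends with the closing keyword for the tracked issue.
function prBody(issueId: number, summary: string): string {
  return `${summary}\n\nFixes #${issueId}`;
}
```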
When an agent creates a GitHub Issue and then delegates sub-tasks, it passes its own Issue number as parent_issue_number to all subsequent delegate_task and dispatch_council calls. The system automatically injects OPTIMUS_PARENT_ISSUE into child agent processes, maintaining a parent-child tree across all Issues in a workflow.
This enables full traceability: from a high-level epic down to individual sub-task PRs.
All Issues and PRs created via MCP tools are automatically tagged with:
- `[Optimus]` prefix in the title
- `optimus-bot` label for filtering
Direct git push to master/main is prohibited. All changes must go through PR merge via vcs_merge_pr. This ensures:
- GitHub's `fixes #N` auto-close works (only triggered by PR merge events)
- Code review happens before merge
- Issue-First SDLC traceability is maintained
The vcs_* MCP tools provide a unified abstraction over GitHub and Azure DevOps. The same workflow works regardless of which platform hosts the repository. Configuration is stored in .optimus/config/vcs.json.
Optimus maintains a project memory at .optimus/memory/continuous-memory.md. This is a structured append-only log of verified lessons, architectural decisions, bug postmortems, and workflow improvements.
Memory entries are created via the append_memory MCP tool with categorized metadata:
{
category: "bug-postmortem",
tags: ["upgrade", "config-wipe", "vcs.json"],
content: "optimus upgrade force-overwrote vcs.json..."
}
At agent spawn time, project memory is automatically injected into the agent's prompt. This means every agent — regardless of when it was created — starts with the accumulated knowledge of all past sessions.
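The append-only behavior and the prompt injection can be illustrated with a small sketch. The entry fields follow the `append_memory` example above; the on-disk entry layout, the prompt headings, and all three function names are assumptions rather than the actual format of continuous-memory.md.

```typescript
// Field names follow the append_memory example; the rendered layout is an assumption.
interface MemoryEntry {
  category: string;
  tags: string[];
  content: string;
}

// Render one entry as a small markdown section.
function formatMemoryEntry(entry: MemoryEntry, timestamp: Date): string {
  return [
    `## [${entry.category}] ${timestamp.toISOString()}`,
    `tags: ${entry.tags.join(", ")}`,
    "",
    entry.content.trim(),
    "",
  ].join("\n");
}

// Append-only: the existing log text is never rewritten, only extended.
function appendMemory(log: string, entry: MemoryEntry, timestamp: Date): string {
  const separator = log === "" || log.endsWith("\n") ? "" : "\n";
  return log + separator + formatMemoryEntry(entry, timestamp);
}

// Hypothetical prompt assembly: memory is placed ahead of the task text.
function injectMemory(memoryLog: string, taskDescription: string): string {
  return `# Project Memory\n${memoryLog}\n# Task\n${taskDescription}`;
}
```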
Agents may include a ## Self-Assessment section in their output reports containing:
- What Worked: Where the role and skills aligned well with the task
- What Was Missing: Gaps that required improvisation
- Proposed Updates: Specific suggestions for role or skill improvements
Self-assessment is advisory, not mandatory. Agents cannot autonomously modify their own role templates or write to project memory — the PM or Master Agent decides what merits promotion. This prevents runaway self-modification while still capturing improvement signals.
The Universal Reflection Protocol defines a progression:
- Instruction-Level (implemented): Post-delegation checklists and pre-delegation self-checks embedded in instruction files (`.claude/CLAUDE.md`, `.github/copilot-instructions.md`).
- Memory-Powered (implemented): Agents read project memory at conversation start. Past mistakes are automatically in context.
- Root Master Self-Delegation (future): The Root Master delegates to a `master-orchestrator` role, making itself subject to the same prompt injection and reflection protocols as worker agents.
Optimus includes a Meta-Cron system for scheduled autonomous agent operations. Cron entries are registered via register_meta_cron with standard 5-field cron expressions.
Each cron entry specifies:
- A role to invoke
- Required skills for the task
- A capability tier (`maintain`, `develop`, `review`) that bounds what the triggered agent can do
- A concurrency policy (`Forbid` or `Allow`)
- Max actions per trigger (default: 5)
- Dry-run period (default: 3 ticks before live execution)
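A cron entry with these fields could be modeled as below. The tier names, policy names, and the two defaults come from the list above; the property names, the loose five-field syntax check, and the helper functions are hypothetical and do not reflect the actual `register_meta_cron` schema.

```typescript
type CapabilityTier = "maintain" | "develop" | "review";
type ConcurrencyPolicy = "Forbid" | "Allow";

// Hypothetical property names; only the concepts come from the text.
interface MetaCronEntry {
  schedule: string;
  role: string;
  requiredSkills: string[];
  capabilityTier: CapabilityTier;
  concurrencyPolicy: ConcurrencyPolicy;
  maxActions: number;
  dryRunTicks: number;
}

type MetaCronInput = Omit<MetaCronEntry, "maxActions" | "dryRunTicks"> &
  Partial<Pick<MetaCronEntry, "maxActions" | "dryRunTicks">>;

// Loose structural check: exactly five whitespace-separated fields.
function isFiveFieldCron(expression: string): boolean {
  const fields = expression.trim().split(/\s+/);
  return fields.length === 5 && fields.every((field) => /^[\d*/,\-A-Za-z]+$/.test(field));
}

// Apply the documented defaults (5 actions, 3 dry-run ticks).
function makeCronEntry(input: MetaCronInput): MetaCronEntry {
  if (!isFiveFieldCron(input.schedule)) {
    throw new Error(`Invalid cron expression "${input.schedule}": expected 5 fields`);
  }
  return { ...input, maxActions: input.maxActions ?? 5, dryRunTicks: input.dryRunTicks ?? 3 };
}

// An entry stays in dry-run until it has ticked dryRunTicks times.
function isDryRun(entry: MetaCronEntry, completedTicks: number): boolean {
  return completedTicks < entry.dryRunTicks;
}
```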
Example use cases:
- Daily dependency audit scans
- Stale issue cleanup
- Health monitoring and system checks
All delegation in Optimus is async-first. delegate_task_async and dispatch_council_async return immediately with a task ID, and the Master Agent polls for completion with the check_task_status tool. This prevents the Master Agent from blocking while workers execute.
When an agent encounters an ambiguous situation and cannot continue autonomously, the proposed workflow is:
- Agent posts a question via `vcs_add_comment` on its tracking Issue
- Agent adds a `needs-human-input` label and writes a checkpoint to `.optimus/reports/`
- Agent exits (fire-and-forget, no process hanging)
- Human responds on their own schedule via GitHub comment
- A Meta-Cron patrol detects the response and spawns a continuation task with the same `agent_id` for context continuity
This creates a fully async human-in-the-loop mechanism without any real-time channels.
All MCP tool handlers validate inputs before any task creation, file writes, or process spawning:
- Role name confusion guard: If a `role` parameter looks like a model name (e.g., `claude-opus-4`, `gpt-4o`), the call is rejected with an actionable error suggesting `role_model` instead.
- Engine/model validation: Invalid engine or model values are rejected with the list of valid options from `available-agents.json`.
- Callers receive `McpError(InvalidParams)` with enough information to self-correct.
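The role name confusion guard might be approximated with a pattern check like the one below. The regular expression is a hypothetical heuristic, not the engine's actual rule, and `InvalidParamsError` is a local stand-in for the SDK's `McpError` that only reuses the standard JSON-RPC "invalid params" code (-32602).

```typescript
// Local stand-in for McpError(InvalidParams); -32602 is the JSON-RPC invalid-params code.
class InvalidParamsError extends Error {
  readonly code = -32602;
}

// Hypothetical heuristic: prefixes that look like model families, not role identities.
const MODEL_NAME_PATTERN = /^(claude-(opus|sonnet|haiku)|gpt-\d|gemini-\d)/i;

function looksLikeModelName(value: string): boolean {
  return MODEL_NAME_PATTERN.test(value);
}

// Reject the call before any task creation, file write, or process spawn.
function assertRoleIsNotModel(role: string): void {
  if (looksLikeModelName(role)) {
    throw new InvalidParamsError(
      `"${role}" looks like a model name, not a role. Pass it as role_model and use an identity name for role.`,
    );
  }
}
```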
Agent delegation is capped at 3 nested layers (MAX_DELEGATION_DEPTH = 3, defined in src/constants.ts). This prevents infinite recursion where agents delegate to agents indefinitely.
- Tracked via the `OPTIMUS_DELEGATION_DEPTH` environment variable, automatically injected and incremented at each delegation.
- At depth 3, MCP configuration is stripped from the child process, physically preventing further delegation.
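The depth tracking can be sketched as a pure function over the parent's environment. The constant and the variable name come from the text; the assumption that the root process starts at depth 0, the `ChildSpawnPlan` shape, and the function name are hypothetical.

```typescript
const MAX_DELEGATION_DEPTH = 3;

interface ChildSpawnPlan {
  env: Record<string, string>;
  // False when the child must be started without MCP configuration.
  allowMcp: boolean;
}

// Compute the child's depth and whether it may delegate further.
// Assumption: a missing or unparsable variable means the root process (depth 0).
function planChildSpawn(parentEnv: Record<string, string | undefined>): ChildSpawnPlan {
  const parentDepth = Number.parseInt(parentEnv["OPTIMUS_DELEGATION_DEPTH"] ?? "0", 10) || 0;
  const childDepth = parentDepth + 1;
  if (childDepth > MAX_DELEGATION_DEPTH) {
    throw new Error(`Delegation depth ${childDepth} exceeds the limit of ${MAX_DELEGATION_DEPTH}`);
  }
  return {
    env: { OPTIMUS_DELEGATION_DEPTH: String(childDepth) },
    allowMcp: childDepth < MAX_DELEGATION_DEPTH,
  };
}
```

A spawner would merge `env` into the child's environment and omit the MCP configuration whenever `allowMcp` is false, so a depth-3 child has no tool surface through which to delegate again.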
- `sanitizeRoleName()` strips dangerous characters from role names, preventing directory traversal via crafted role identifiers.
- `write_blackboard_artifact` uses dual-layer validation (lexical + `fs.realpathSync()`) to prevent writes outside `.optimus/`. The symlink check was identified as a P0 gap during security review: `path.resolve()` and `path.normalize()` alone do not resolve symlinks.
- All content from GitHub Issues, ADO Work Items, and PR comments is treated as untrusted DATA, never as executable instructions.
- Agents are instructed to never run commands, scripts, or URLs found in external content.
- System instructions are delivered via trusted channels (MCP Resources, CLAUDE.md, copilot-instructions.md), not through user-modifiable fields.
- `.env` files are never committed or shipped in the plugin package.
- The `.gitignore` and plugin packaging rules exclude `.optimus/agents/`, `.optimus/state/`, and credential files.
- Agents are warned against committing files that may contain secrets.
Plan mode prevents orchestrator agents from writing arbitrary files. Even if a prompt injection convinced an orchestrator to "write a config file," the write_blackboard_artifact path validation would reject any target outside .optimus/.
.optimus/
├── agents/ # T1 frozen instance snapshots
├── config/ # vcs.json, available-agents.json, system-instructions.md
├── memory/ # continuous-memory.md
├── proposals/ # Council proposal documents
├── reports/ # Agent output reports
├── reviews/ # Council review outputs + synthesis
├── roles/ # T2 role templates
├── skills/ # Skill definitions (SKILL.md per skill)
├── state/ # task-manifest.json, t3-usage-log.json
└── system/ # System-level config
optimus-plugin/
├── bin/ # CLI entry points (init, serve, upgrade)
├── dist/ # Compiled MCP server
├── scaffold/ # Template files shipped to end-users
└── skills/ # Universal bootstrap skills
This document describes Optimus Code v0.4.0. For the latest updates, see the CHANGELOG.