Advisor Harness

Local-first scaffold for a Claude-style advisor strategy across CLI sessions.

The executor model drives the task. The harness injects a starting prompt that tells the executor when to consult the advisor. When the executor reaches a hard decision, it emits an ADVISOR_CONSULT block. The harness records the event, calls the advisor model with reconstructed shared context, records the returned ADVISOR_GUIDANCE, and resumes the same executor session with that guidance.

Default local pairing:

Executor: Kimi CLI
Advisor: Codex CLI
Test backend: deterministic fake adapter

See ROADMAP.md for the staged plan from the generic advisor harness to persistence, UI, vertical applications, and feedback-loop-driven prompt/schema improvement.

Install

python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install wheel
python3 -m pip install -e . --no-build-isolation

Commands

maa doctor
maa init
maa run "fake smoke task" --executor fake --advisor fake --max-turns 3 --max-advisor-calls 3
maa eval
maa ui
maa review --run <run_id> --advisor fake
maa release-readiness --sample --executor fake --advisor fake

Live local smoke:

maa run "For this smoke test, consult the advisor once before finalizing, then produce a short final answer." \
  --executor kimi --advisor codex --timeout 240 --max-turns 3 --max-advisor-calls 2
maa review --run <run_id> --advisor codex --timeout 240

Post-run feedback loop:

maa review --run <run_id> --advisor fake

maa review writes the freeform post-run review, a routing policy patch proposal, and structured improvement_proposals.json / improvement_proposals.md artifacts. Improvement proposals cover memory schema, executor prompt, and advisor prompt targets, validate that every proposal is advisory and requires human approval, and are never applied automatically.

Evaluation harness:

maa eval
maa eval --include-live --live-timeout 240

maa eval runs deterministic fake regression scenarios for autonomous consultation, no-consult completion, malformed blocks, advisor stop signals, max-turn exhaustion, and missing EXECUTOR_DONE. It writes evaluation_summary.json, evaluation_summary.md, and scenario_results.jsonl under runs/eval_*/.

Run timeline UI:

maa ui
maa ui --serve --port 8765

maa ui renders a local HTML dashboard to runs/ui/index.html from the SQLite run database. The dashboard filters persisted runs by status, backend, advisor call count, task text, and error mode, then shows the selected run timeline, prompts, raw CLI outputs, consult reasons, guidance, memory proposals, and outcome.

Release readiness vertical:

maa release-readiness --sample --executor fake --advisor fake
maa release-readiness --evidence path/to/release-evidence.md \
  --executor kimi --advisor codex --timeout 240

maa release-readiness runs a focused vertical workflow on top of the same advisor loop, persistence database, and UI timeline. The workflow assesses supplied release evidence, requires an advisor consultation before the final verdict, emits a structured RELEASE_READINESS_REPORT, and writes release_readiness_evaluation.json plus release_readiness_evaluation.md into the run directory.

Advisor Protocol

The executor decides when advice is needed. The advisor does not run tools, mutate files, or write memory. The harness performs the actual advisor call and owns all durable state.

Executor consultation request:

<ADVISOR_CONSULT>
{"question":"...","context":"...","options":["..."],"preferred_option":"...","urgency":"normal"}
</ADVISOR_CONSULT>

Advisor guidance:

<ADVISOR_GUIDANCE>
{"guidance":"...","rationale":"...","stop_signal":false}
</ADVISOR_GUIDANCE>

Executor completion:

<EXECUTOR_DONE>
{"status":"completed","summary":"..."}
</EXECUTOR_DONE>

A run is marked completed only when the executor emits EXECUTOR_DONE. If the executor exits without a consultation request and without EXECUTOR_DONE, the harness records executor_stopped_without_done.

Artifacts

Runtime state is local and gitignored:

runs/<run_id>/
mailbox/*.jsonl
memory/*.jsonl
memory/advisor_runs.db

Important run files:

memory/advisor_runs.db: queryable source of truth for persisted runs, events, turns, consults, guidance, memory proposals, malformed blocks, and outcomes
version_manifest.json: prompt, memory schema, and policy file version/hash metadata captured at run start
session_events.jsonl: durable cross-session event log
advisor_consults.jsonl: executor consultation requests
advisor_guidance.jsonl: advisor guidance returned to executor
executor_turn_<n>.*: executor CLI outputs
advisor_turn_<n>.*: advisor CLI outputs
outcome.json: run status and counters
evaluation_summary.json: evaluation metrics and scenario results for maa eval
release_readiness_evaluation.json: vertical acceptance result and report metrics for maa release-readiness
improvement_proposals.json: validated post-run proposals for memory schema and prompt improvements
runs/ui/index.html: generated local run timeline dashboard

The outcome.json file includes the executor session id, executor turn count, advisor consultation count, guidance count, and completion status.

Tests

python3 -m unittest discover -s tests

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
memory		memory
packages		packages
policy		policy
tests		tests
.gitignore		.gitignore
README.md		README.md
ROADMAP.md		ROADMAP.md
architecture.md		architecture.md
prd.md		prd.md
pyproject.toml		pyproject.toml
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Advisor Harness

Install

Commands

Advisor Protocol

Artifacts

Tests

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Advisor Harness

Install

Commands

Advisor Protocol

Artifacts

Tests

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages