samrusani
diff --git a/.ai/handoff/CURRENT_STATE.md b/.ai/handoff/CURRENT_STATE.md
Lines changed: 9 additions & 8 deletions
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
Lines changed: 6 additions & 5 deletions
diff --git a/BUILD_REPORT.md b/BUILD_REPORT.md
Lines changed: 39 additions & 57 deletions
diff --git a/CURRENT_STATE.md b/CURRENT_STATE.md
Lines changed: 9 additions & 8 deletions
diff --git a/PRODUCT_BRIEF.md b/PRODUCT_BRIEF.md
Lines changed: 2 additions & 1 deletion
@@ -9,26 +9,27 @@
 - Phase 12 Sprint 1 (`P12-S1`) is shipped.
 - Phase 12 Sprint 2 (`P12-S2`) is shipped.
 - Phase 12 Sprint 3 (`P12-S3`) is shipped.
-- Phase 12 Sprint 4 (`P12-S4`) is the active execution sprint.
+- Phase 12 Sprint 4 (`P12-S4`) is shipped.
+- Phase 12 Sprint 5 (`P12-S5`) is the active execution sprint.
 
 ## Current Baseline Truth
 - Alice has typed memory, provenance, trust classes, correction/supersession behavior, open loops, recall, resumption, and explainability.
 - Alice exposes CLI, MCP, hosted/product, provider-runtime, and Hermes bridge surfaces.
 - The codebase already includes semantic retrieval, embeddings, entities/entity edges, trusted-fact promotion, retrieval evaluation fixtures, deterministic resumption briefs, daily briefs, chief-of-staff briefing flows, and the shipped `P12-S1` hybrid retrieval/reranking foundation with retrieval traces.
-- The codebase also includes the shipped `P12-S2` memory mutation candidate and operation foundation.
+- The codebase also includes the shipped `P12-S2` memory mutation candidate and operation foundation, the shipped `P12-S3` contradiction/trust foundation, and the shipped `P12-S4` public eval harness.
 
 ## Not Yet First-Class In Repo
-- task-adaptive brief compiler separated from current briefing surfaces
 
 ## Phase Transition Note
 - Phase 12 is active.
 - `P12-S1` is complete and establishes the retrieval baseline.
 - `P12-S2` is complete and establishes the mutation baseline.
 - `P12-S3` is complete and establishes the contradiction/trust baseline.
-- `P12-S4` is the active sprint and should benchmark shipped retrieval, mutation, and contradiction behavior without reopening those systems.
-- The current `P12-S4` branch implements the public eval harness, fixture catalog, and checked-in baseline artifact, pending Control Tower merge approval.
+- `P12-S4` is complete and establishes the public-eval baseline.
+- `P12-S5` is the active sprint and should build briefing behavior on top of shipped retrieval, mutation, contradiction, and eval baselines without reopening those systems.
+- The current `P12-S5` branch implements task-adaptive brief generation, comparison, and model-pack briefing defaults, pending Control Tower merge approval.
 
 ## Immediate Control Tower Decisions Needed
-- Decide public eval suite taxonomy and baseline artifact format.
-- Decide what eval artifacts are committed versus generated locally.
-- Decide whether `P12-S4` stays CLI-first or keeps the current branch `/v1/evals/*` API surface.
+- Decide briefing modes and payload schema for user recall, resume, worker subtask, and agent handoff.
+- Decide provider/model-pack fields for briefing strategy and max brief tokens.
+- Decide whether `P12-S5` ships CLI, API, and MCP surfaces simultaneously or stages them.
@@ -2,7 +2,7 @@
 
 ## Scope Boundary
 - **Shipped baseline:** Phases 9-11 and Bridge `B1` through `B4`.
-- **Current repo execution posture:** `v0.2.0` is released; `P12-S1`, `P12-S2`, and `P12-S3` are shipped; `P12-S4` is the active sprint.
+- **Current repo execution posture:** `v0.2.0` is released; `P12-S1`, `P12-S2`, `P12-S3`, and `P12-S4` are shipped; `P12-S5` is the active sprint.
 - **Phase 12 delta:** retrieval quality, mutation explicitness, contradiction handling, public evals, and adaptive briefing.
 
 ## Current System Overview
@@ -60,6 +60,7 @@ Alice is a modular continuity platform with shared continuity semantics across l
 ### Product/Runtime
 - `workspaces`, `workspace_members`, `auth_sessions`, `devices`
 - `model_providers`, `provider_capabilities`, `model_packs`, `workspace_model_pack_bindings`
+- `task_briefs`
 - channel, task, trace, approval, and execution tables
 
 ## Current Key Flows
@@ -122,15 +123,15 @@ Delivered additions:
 Important baseline note: `P12-S3` is now the contradiction/trust baseline for the rest of Phase 12 and should not be reopened except where later sprint integration requires it.
 
 ### P12-S4: Public Eval Harness
-Expand the current retrieval evaluation foundation into public multi-suite benchmark runs and checked-in baseline reports.
+Shipped in `P12-S4`:
 
-Planned additions:
+Delivered additions:
 - `eval_suites`
 - `eval_cases`
 - `eval_runs`
 - `eval_results`
 
-Important baseline note: `P12-S4` should measure shipped retrieval, mutation, and contradiction behavior rather than redesign those systems.
+Important baseline note: `P12-S4` is now the evaluation baseline for the rest of Phase 12 and should not be reopened except where later sprint integration requires it.
 Source-of-truth note: the checked-in fixture catalog defines the authoritative suite/case set and ordering; persisted eval suite/case rows are synchronized snapshots for execution and audit, not an independent planning surface.
 
 ### P12-S5: Task-Adaptive Briefing
@@ -140,7 +141,7 @@ Planned additions:
 - `task_briefs`
 - provider/model-pack briefing strategy fields
 
-Important baseline note: Alice already has resumption, daily-brief, and chief-of-staff briefing surfaces. Phase 12 should treat those as starting points, not as greenfield briefing.
+Important baseline note: `P12-S5` should build on shipped retrieval, mutation, contradiction, and eval baselines. Existing resumption, daily-brief, and chief-of-staff briefing surfaces are starting points, not greenfield replacements.
 
 ## Security And Reliability Rules
 - Keep user/workspace isolation intact for continuity, provider, and channel data.
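
The source-of-truth note for `P12-S4` in this hunk says the checked-in fixture catalog is authoritative and that persisted suite/case rows are only synchronized snapshots. A minimal sketch of that relationship follows. `SuiteRow`, `sync_suites`, and `select_suites` are hypothetical names for illustration, not the actual contents of `alicebot_api.public_evals`, and the in-memory dict stands in for the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteRow:
    suite_key: str
    position: int  # ordering comes from the catalog, never from the database


def sync_suites(catalog_keys, persisted):
    """Rebuild the persisted snapshot from the catalog (hypothetical helper).

    Returns the synced rows plus the keys that were pruned.
    """
    # Every row is regenerated from the catalog, so order follows the checked-in file.
    synced = {key: SuiteRow(key, i) for i, key in enumerate(catalog_keys)}
    # Persisted rows with no catalog entry are stale and are not carried over.
    pruned = sorted(key for key in persisted if key not in synced)
    return synced, pruned


def select_suites(catalog_keys, requested):
    """Resolve a suite filter, rejecting unknown keys instead of returning a partial run."""
    unknown = [key for key in requested if key not in catalog_keys]
    if unknown:
        raise ValueError(f"unknown suite_key: {', '.join(unknown)}")
    # Preserve catalog ordering regardless of the order the caller asked in.
    return [key for key in catalog_keys if key in requested]
```

Rebuilding rows from the catalog on every sync, rather than merging, is one simple way to keep the database from becoming an independent planning surface.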
 
@@ -1,82 +1,64 @@
 # BUILD_REPORT
 
 ## Sprint Objective
-
-Implement `P12-S4` public eval harness so Alice can run reproducible local eval suites, persist suite/case/run/result records, emit stable baseline report artifacts, and document what the measured quality surface means.
+Implement `P12-S5` task-adaptive briefing so the system can generate deterministic, explainable, role-specific context packs for `user_recall`, `resume`, `worker_subtask`, and `agent_handoff`, while preserving shipped retrieval, mutation, contradiction, trust, and eval behavior.
 
 ## Completed Work
-
-- Added public eval persistence tables for `eval_suites`, `eval_cases`, `eval_runs`, and `eval_results`.
-- Added `alicebot_api.public_evals` with:
-  - fixture-catalog loading
-  - suite/case syncing into the database
-  - fixture-backed recall, resumption, correction, contradiction, and open-loop evaluators
-  - canonical report generation with stable digests
-  - report writing helper for checked-in baseline artifacts
-- Added current-branch public eval API surfaces:
-  - `GET /v1/evals/suites`
-  - `POST /v1/evals/runs`
-  - `GET /v1/evals/runs`
-  - `GET /v1/evals/runs/{eval_run_id}`
-- Made the checked-in fixture catalog authoritative for suite listing and run selection.
-- Added pruning for persisted suite/case rows so removed catalog entries do not survive as stale runtime state.
-- Added explicit validation for unknown `suite_key` filters instead of silently returning partial or empty runs.
-- Added CLI surfaces:
-  - `alicebot evals suites`
-  - `alicebot evals run`
-  - `alicebot evals runs`
-  - `alicebot evals show`
-- Added public fixture definitions in `eval/fixtures/public_eval_suites.json`.
-- Added checked-in current-branch baseline report artifact in `eval/baselines/public_eval_harness_v1.json`, with final committed artifact format still pending Control Tower confirmation.
-- Added sprint-owned docs in `docs/evals/public_eval_harness.md`, explicitly framed as current branch behavior where API and artifact decisions are still pending.
-- Added focused unit and integration coverage for the runner, migration, API, CLI, and baseline reproduction path.
+- Added a dedicated task briefing compiler with four briefing modes.
+- Added deterministic briefing summaries, selection rules, truncation metadata, token budgeting, and comparison output.
+- Added task brief persistence through a new `task_briefs` table.
+- Added current-branch API surfaces for task-brief compile, inspect, and compare.
+- Added CLI surfaces for task-brief compile, inspect, and compare.
+- Added MCP tools for task-brief compile, inspect, and compare.
+- Added model-pack briefing defaults through `briefing_strategy` and `briefing_max_tokens`, and task-brief compilation now resolves those defaults when a workspace-selected model pack is available.
+- Added focused docs under `docs/briefing/`, explicitly framed as current branch behavior where briefing payload and surface-shape decisions are still pending.
+- Added unit and integration coverage for determinism, size reduction, persistence, CLI smoke, MCP smoke, API behavior, migration shape, and model-pack strategy fields.
 
 ## Incomplete Work
-
-- None inside the sprint packet scope.
+- None within the sprint packet scope.
 
 ## Files Changed
-
-- `BUILD_REPORT.md`
-- `RULES.md`
+- `.ai/handoff/CURRENT_STATE.md`
 - `ARCHITECTURE.md`
+- `BUILD_REPORT.md`
 - `CURRENT_STATE.md`
-- `.ai/handoff/CURRENT_STATE.md`
 - `PRODUCT_BRIEF.md`
-- `ROADMAP.md`
 - `REVIEW_REPORT.md`
-- `apps/api/alembic/versions/20260414_0060_phase12_public_eval_harness.py`
-- `apps/api/src/alicebot_api/cli.py`
+- `ROADMAP.md`
+- `RULES.md`
+- `apps/api/src/alicebot_api/task_briefing.py`
 - `apps/api/src/alicebot_api/contracts.py`
-- `apps/api/src/alicebot_api/main.py`
-- `apps/api/src/alicebot_api/public_evals.py`
 - `apps/api/src/alicebot_api/store.py`
-- `scripts/check_control_doc_truth.py`
-- `docs/evals/public_eval_harness.md`
-- `eval/baselines/public_eval_harness_v1.json`
-- `eval/fixtures/public_eval_suites.json`
-- `tests/integration/test_cli_integration.py`
-- `tests/integration/test_public_evals_api.py`
-- `tests/unit/test_20260414_0060_phase12_public_eval_harness.py`
+- `apps/api/src/alicebot_api/model_packs.py`
+- `apps/api/src/alicebot_api/main.py`
+- `apps/api/src/alicebot_api/cli.py`
+- `apps/api/src/alicebot_api/cli_formatting.py`
+- `apps/api/src/alicebot_api/mcp_tools.py`
+- `apps/api/alembic/versions/20260414_0061_phase12_task_adaptive_briefing.py`
+- `docs/briefing/task-adaptive-briefing.md`
+- `tests/unit/test_task_briefing.py`
+- `tests/unit/test_model_packs.py`
 - `tests/unit/test_cli.py`
-- `tests/unit/test_main.py`
-- `tests/unit/test_public_evals.py`
+- `tests/unit/test_mcp.py`
+- `tests/unit/test_20260414_0061_phase12_task_adaptive_briefing.py`
+- `tests/integration/test_task_briefing_api.py`
+- `tests/integration/test_cli_integration.py`
+- `tests/integration/test_mcp_cli_parity.py`
+- `tests/integration/test_mcp_server.py`
+- `tests/integration/test_phase11_model_packs_api.py`
+- `scripts/check_control_doc_truth.py`
 
 ## Tests Run
-
-- `./.venv/bin/pytest tests/unit/test_public_evals.py tests/unit/test_20260414_0060_phase12_public_eval_harness.py tests/unit/test_cli.py tests/unit/test_main.py tests/integration/test_public_evals_api.py tests/integration/test_cli_integration.py tests/integration/test_retrieval_evaluation_api.py -q`
-  - Result: PASS (`83 passed`)
+- `./.venv/bin/pytest tests/unit/test_task_briefing.py tests/unit/test_model_packs.py tests/unit/test_cli.py tests/unit/test_mcp.py tests/unit/test_20260414_0061_phase12_task_adaptive_briefing.py tests/unit/test_continuity_resumption.py tests/unit/test_continuity_recall.py tests/unit/test_public_evals.py tests/integration/test_task_briefing_api.py tests/integration/test_cli_integration.py tests/integration/test_mcp_cli_parity.py tests/integration/test_phase11_model_packs_api.py tests/integration/test_mcp_server.py tests/integration/test_public_evals_api.py tests/integration/test_continuity_resumption_api.py tests/integration/test_retrieval_evaluation_api.py -q`
+  - Result: PASS (`73 passed`)
 - `./.venv/bin/python scripts/check_control_doc_truth.py`
   - Result: PASS
-- `rg -n "/Users|samirusani|Desktop/Codex" RULES.md ARCHITECTURE.md CURRENT_STATE.md .ai/handoff/CURRENT_STATE.md PRODUCT_BRIEF.md ROADMAP.md docs/evals eval/fixtures eval/baselines`
+- `rg -n "/Users|samirusani|Desktop/Codex" RULES.md ARCHITECTURE.md CURRENT_STATE.md .ai/handoff/CURRENT_STATE.md PRODUCT_BRIEF.md ROADMAP.md docs/briefing`
   - Result: PASS (no matches)
 
 ## Blockers/Issues
-
-- No sprint blocker remains.
-- The recall suite keeps one non-gating coverage snapshot for entity-edge expansion. It records the current shipped output with `score=0.0` while the suite still passes because the catalog marks that case as observational rather than a strict gate.
-- Final product policy is still pending for the Control Tower decisions called out in the sprint packet, including the committed artifact format and whether `/v1/evals/*` remains part of the accepted Phase 12 surface.
+- No remaining blockers.
+- Final product policy is still pending for the Control Tower decisions called out in the sprint packet, including the canonical persisted briefing payload shape, required model-pack briefing fields, and whether generation and comparison APIs should both ship in `P12-S5`.
 
 ## Recommended Next Step
-
-Request Control Tower merge review against the current `P12-S4` branch head.
+Request Control Tower merge review against the current `P12-S5` branch head.
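
The `P12-S5` completed-work list in the BUILD_REPORT hunk above mentions deterministic selection rules, token budgeting, and truncation metadata for four briefing modes. The sketch below shows one way those properties could fit together. Only the four mode names come from the report; `Item`, `estimate_tokens`, `compile_brief`, the payload keys, and the digest are assumptions for illustration, not the code in `task_briefing.py`.

```python
import hashlib
import json
from dataclasses import dataclass

# Mode names as listed in the sprint objective.
MODES = ("user_recall", "resume", "worker_subtask", "agent_handoff")


@dataclass(frozen=True)
class Item:
    item_id: str
    text: str
    score: float


def estimate_tokens(text):
    # Crude whitespace count; a real tokenizer would give different numbers.
    return len(text.split())


def compile_brief(mode, items, max_tokens):
    """Greedy, order-independent selection under a token budget (hypothetical)."""
    if mode not in MODES:
        raise ValueError(f"unknown briefing mode: {mode}")
    # Sort by score descending, then by id, so ties never depend on input order.
    ranked = sorted(items, key=lambda it: (-it.score, it.item_id))
    selected, dropped, used = [], [], 0
    for it in ranked:
        cost = estimate_tokens(it.text)
        if used + cost <= max_tokens:
            selected.append(it)
            used += cost
        else:
            # Skipped items are recorded; later, smaller items may still fit.
            dropped.append(it.item_id)
    payload = {
        "mode": mode,
        "items": [{"id": it.item_id, "text": it.text} for it in selected],
        "truncation": {
            "max_tokens": max_tokens,
            "used_tokens": used,
            "dropped_ids": dropped,
        },
    }
    # Assumption: a digest over canonical JSON makes two briefs cheap to compare.
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return {**payload, "digest": hashlib.sha256(canonical).hexdigest()}
```

Because ranking uses a total order and the payload is serialized with sorted keys, the same inputs in any order yield the same digest, which is the property a determinism test would check.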
@@ -12,26 +12,27 @@ Canonical handoff state lives at [.ai/handoff/CURRENT_STATE.md](.ai/handoff/CURR
 - Phase 12 Sprint 1 (`P12-S1`) is shipped.
 - Phase 12 Sprint 2 (`P12-S2`) is shipped.
 - Phase 12 Sprint 3 (`P12-S3`) is shipped.
-- Phase 12 Sprint 4 (`P12-S4`) is the active execution sprint.
+- Phase 12 Sprint 4 (`P12-S4`) is shipped.
+- Phase 12 Sprint 5 (`P12-S5`) is the active execution sprint.
 
 ## Current Baseline Truth
 - Alice has typed memory, provenance, trust classes, correction/supersession behavior, open loops, recall, resumption, and explainability.
 - Alice exposes CLI, MCP, hosted/product, provider-runtime, and Hermes bridge surfaces.
 - The codebase already includes semantic retrieval, embeddings, entities/entity edges, trusted-fact promotion, retrieval evaluation fixtures, deterministic resumption briefs, daily briefs, chief-of-staff briefing flows, and the shipped `P12-S1` hybrid retrieval/reranking foundation with retrieval traces.
-- The codebase also includes the shipped `P12-S2` memory mutation candidate and operation foundation.
+- The codebase also includes the shipped `P12-S2` memory mutation candidate and operation foundation, the shipped `P12-S3` contradiction/trust foundation, and the shipped `P12-S4` public eval harness.
 
 ## Not Yet First-Class In Repo
-- task-adaptive brief compiler separated from current briefing surfaces
 
 ## Phase Transition Note
 - Phase 12 is active.
 - `P12-S1` is complete and establishes the retrieval baseline.
 - `P12-S2` is complete and establishes the mutation baseline.
 - `P12-S3` is complete and establishes the contradiction/trust baseline.
-- `P12-S4` is the active sprint and should benchmark shipped retrieval, mutation, and contradiction behavior without reopening those systems.
-- The current `P12-S4` branch implements the public eval harness, fixture catalog, and checked-in baseline artifact, pending Control Tower merge approval.
+- `P12-S4` is complete and establishes the public-eval baseline.
+- `P12-S5` is the active sprint and should build briefing behavior on top of shipped retrieval, mutation, contradiction, and eval baselines without reopening those systems.
+- The current `P12-S5` branch implements task-adaptive brief generation, comparison, and model-pack briefing defaults, pending Control Tower merge approval.
 
 ## Immediate Control Tower Decisions Needed
-- Decide public eval suite taxonomy and baseline artifact format.
-- Decide what eval artifacts are committed versus generated locally.
-- Decide whether `P12-S4` stays CLI-first or keeps the current branch `/v1/evals/*` API surface.
+- Decide briefing modes and payload schema for user recall, resume, worker subtask, and agent handoff.
+- Decide provider/model-pack fields for briefing strategy and max brief tokens.
+- Decide whether `P12-S5` ships CLI, API, and MCP surfaces simultaneously or stages them.
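
The pending decision on model-pack fields for briefing strategy and max brief tokens relates to the `briefing_strategy` and `briefing_max_tokens` defaults named in the BUILD_REPORT. As a rough sketch of how such defaults might be resolved: the precedence order (explicit request, then model pack, then a fallback), the fallback values, and the names `ModelPack` and `resolve_briefing_settings` are all assumptions, since the diff does not show the actual rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelPack:
    # Field names follow the BUILD_REPORT; both are optional in this sketch.
    briefing_strategy: Optional[str] = None
    briefing_max_tokens: Optional[int] = None


FALLBACK_STRATEGY = "balanced"  # hypothetical default
FALLBACK_MAX_TOKENS = 1200  # hypothetical default


def resolve_briefing_settings(pack, strategy=None, max_tokens=None):
    """Pick the first configured value: explicit request, model pack, fallback."""
    pack_strategy = pack.briefing_strategy if pack is not None else None
    pack_max = pack.briefing_max_tokens if pack is not None else None
    resolved_strategy = strategy or pack_strategy or FALLBACK_STRATEGY
    # Compare against None so an explicit value is never silently skipped.
    candidates = (max_tokens, pack_max, FALLBACK_MAX_TOKENS)
    resolved_max = next(value for value in candidates if value is not None)
    if resolved_max <= 0:
        raise ValueError("max brief tokens must be positive")
    return resolved_strategy, resolved_max
```

Passing `pack=None` models a workspace with no selected model pack, in which case only explicit arguments and fallbacks apply.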
@@ -16,7 +16,8 @@ Alice is a pre-1.0 continuity platform for AI agents and agent-assisted workflow
 - `P12-S1` Hybrid Retrieval + Reranking is shipped.
 - `P12-S2` Automated Memory Operations is shipped.
 - `P12-S3` Contradiction Detection + Trust Calibration is shipped.
-- `P12-S4` Public Eval Harness is the active sprint.
+- `P12-S4` Public Eval Harness is shipped.
+- `P12-S5` Task-Adaptive Briefing is the active sprint.
 
 ## Next Phase
 ### Phase 12: Retrieval Quality + Adaptive Continuity