8 changes: 5 additions & 3 deletions ARCHITECTURE.md
@@ -36,11 +36,13 @@ Implemented in B2:
 - `alice_capture_candidates`
 - `alice_commit_captures`

-Planned for B3+:
-- `alice_session_flush`
+Implemented in B3:
 - `alice_review_queue`
 - `alice_review_apply`

+Planned for B4+:
+- `alice_session_flush`
+
 ## Runtime Flows
 ### Flow 1: Pre-turn Prefetch
 1. Hermes receives user input.
@@ -58,7 +60,7 @@ Planned for B3+:
 2. Policy gates auto-save vs review queue using type allowlist + confidence threshold.
 3. Writes are idempotent across repeated sync attempts.

-### Flow 4: Session-End Flush (planned B3+)
+### Flow 4: Session-End Flush (planned B4+)
 1. On session end, provider calls `alice_session_flush`.
 2. Alice performs dedupe merge, contradiction checks, open-loop normalization, summary refresh, and review queue updates.
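
Reviewer sketch (not part of the diff): the policy gate and idempotent-write steps in the flow above could look roughly like the following. This is a minimal in-memory illustration under stated assumptions, not the repository's implementation. The allowlist categories and the `(sync_fingerprint, candidate_id)` duplicate guard are taken from the reports in this PR; the `Candidate` and `CaptureStore` names, the `0.8` threshold, and the returned status strings are hypothetical, and operating modes (`manual`, `assist`, `auto`) are not modelled.

```python
from dataclasses import dataclass, field

# Categories named in the build report as eligible for auto-save.
AUTO_SAVE_ALLOWLIST = {
    "correction", "preference", "decision",
    "commitment", "waiting_for", "blocker",
}
# Hypothetical threshold: the PR does not state the real value.
CONFIDENCE_THRESHOLD = 0.8


@dataclass(frozen=True)
class Candidate:
    candidate_id: str
    candidate_type: str
    confidence: float


@dataclass
class CaptureStore:
    """Hypothetical in-memory stand-in for capture persistence."""

    saved: dict = field(default_factory=dict)
    review_queue: dict = field(default_factory=dict)

    def commit(self, sync_fingerprint: str, candidate: Candidate) -> str:
        key = (sync_fingerprint, candidate.candidate_id)
        # Idempotency: a repeated sync attempt for the same key writes nothing.
        if key in self.saved or key in self.review_queue:
            return "duplicate"
        # No-op turns produce no memory writes.
        if candidate.candidate_type == "no_op":
            return "skipped"
        # Policy gate: type allowlist plus confidence threshold.
        allowed = candidate.candidate_type in AUTO_SAVE_ALLOWLIST
        if allowed and candidate.confidence >= CONFIDENCE_THRESHOLD:
            self.saved[key] = candidate
            return "auto_saved"
        self.review_queue[key] = candidate
        return "review"


store = CaptureStore()
decision = Candidate("c1", "decision", 0.93)
first = store.commit("sync-1", decision)
second = store.commit("sync-1", decision)  # repeated sync attempt
note = store.commit("sync-1", Candidate("c2", "note", 0.99))
```

Keying the guard on the pair rather than on the candidate alone means the same candidate can still be committed under a different sync fingerprint, which may or may not match the real policy.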

103 changes: 44 additions & 59 deletions BUILD_REPORT.md
@@ -1,83 +1,68 @@
# BUILD_REPORT

## sprint objective
Implement Bridge Sprint 2 (`B2`) auto-capture pipeline on top of the shipped Hermes provider and B1 contract foundation: candidate extraction, commit policy, mode support (`manual`, `assist`, `auto`), review-queue persistence for non-auto-saved items, and deterministic idempotent/no-op behavior.
Implement Bridge Sprint 3 (`B3`) review queue + explainability scope:
- ship `alice_review_queue`
- ship `alice_review_apply`
- support review actions (`approve`, `reject`, `edit-and-approve`, `supersede-existing`)
- expose explanation/provenance rationale in review surfaces
- verify deterministic recall/resume effects after approved review actions

## completed work
- Added B2 capture pipeline core in Alice continuity:
- implemented `alice_capture_candidates` extraction from user/assistant turn pairs
- implemented `alice_commit_captures` commit policy over extracted candidates
- implemented candidate classes: `decision`, `commitment`, `waiting_for`, `blocker`, `preference`, `correction`, `note`, `no_op`
- candidate payloads now include confidence, trust class, evidence snippet, and proposed action
- Implemented commit policy operating modes:
- `manual`: routes non-`no_op` candidates to review persistence
- `assist`: auto-saves only explicit high-confidence allowlist candidates
- `auto`: auto-saves allowlist candidates at the auto-mode confidence gate
- Implemented policy allowlist and review routing evidence:
- auto-save allowlist categories: `correction`, `preference`, `decision`, `commitment`, `waiting_for`, `blocker`
- review-routed categories by type policy: `note`
- additionally, low-confidence or policy-disallowed candidates route to review under mode gates
- Added idempotent commit behavior using commit fingerprint + candidate fingerprint lookup to prevent duplicate writes on repeated sync attempts.
- Added no-op protection so no-op turns (`no_op`) produce no memory writes.
- Wired new HTTP surfaces:
- `POST /v0/continuity/captures/candidates`
- `POST /v0/continuity/captures/commit`
- Wired new MCP surfaces:
- `alice_capture_candidates`
- `alice_commit_captures`
- preserved existing `alice_capture` and other shipped tools for fallback/manual workflows
- Wired Hermes provider B2 flow in `sync_turn`:
- `assist`/`auto` modes now run candidate extraction then commit
- `manual` mode suppresses automatic `sync_turn` capture
- fallback to legacy `/v0/continuity/captures` path when B2 endpoints are unavailable
- preserved dedupe queue and session-end flush behavior
- Updated sprint-scoped integration docs and smoke script for B2 mode/pipeline truth.
- Updated control-doc truth checker markers from B1-active to B2-active so required verification reflects active sprint state.
- Added MCP tool surface `alice_review_queue` with deterministic queue/detail behavior.
- Added MCP tool surface `alice_review_apply` with B3 action vocabulary mapped to continuity correction semantics:
- `approve` -> `confirm`
- `edit-and-approve` -> `edit`
- `reject` -> `delete`
- `supersede-existing` -> `supersede`
- Kept `alice_memory_review` and `alice_memory_correct` as compatibility aliases.
- Extended continuity review serialization to include shared explanation records on review objects.
- Added deterministic `proposal_rationale` to continuity explanation output.
- Ensured explanation chain remains shared across review, recall, and resume paths.
- Updated B3-scoped integration docs for MCP and Hermes memory-provider guidance.
- Updated architecture status markers so B3 review surfaces are marked implemented and only B4 follow-up remains planned.
- Updated control-doc truth checker markers to B3 active-sprint truth.
- Updated B3 review evidence report (`REVIEW_REPORT.md`).
- Added/updated sprint-owned tests for:
- MCP tool surface and B3 names
- action alias mapping and deterministic correction semantics
- review queue explainability presence
- recall exclusion after reject and recall/resume updates after supersede
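
Reviewer sketch (not part of the diff): the B3 action vocabulary listed above maps one-to-one onto continuity correction semantics, so the alias layer can be pictured as a plain lookup table. The four pairs come from this report; the `REVIEW_ACTION_TO_CORRECTION` and `resolve_review_action` names and the error behaviour for unknown actions are assumptions made for illustration.

```python
# Action pairs as listed in the build report.
REVIEW_ACTION_TO_CORRECTION = {
    "approve": "confirm",
    "edit-and-approve": "edit",
    "reject": "delete",
    "supersede-existing": "supersede",
}


def resolve_review_action(action: str) -> str:
    """Translate a B3 review action into its correction action.

    Hypothetical helper: raising ValueError on unknown input is an
    assumption, not documented behaviour of `alice_review_apply`.
    """
    try:
        return REVIEW_ACTION_TO_CORRECTION[action]
    except KeyError:
        raise ValueError(f"unsupported review action: {action!r}") from None


resolved = [resolve_review_action(name) for name in REVIEW_ACTION_TO_CORRECTION]
```

A static table keeps the mapping deterministic and easy to assert on in tests, which fits the "deterministic correction semantics" item in the test list above.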

## incomplete work
- None in B2 packet scope.
- None in B3 sprint scope.

## files changed
- `ARCHITECTURE.md`
- `BUILD_REPORT.md`
- `PRODUCT_BRIEF.md`
- `README.md`
- `REVIEW_REPORT.md`
- `ROADMAP.md`
- `apps/api/src/alicebot_api/continuity_capture.py`
- `apps/api/src/alicebot_api/continuity_explainability.py`
- `apps/api/src/alicebot_api/continuity_review.py`
- `apps/api/src/alicebot_api/contracts.py`
- `apps/api/src/alicebot_api/main.py`
- `apps/api/src/alicebot_api/mcp_tools.py`
- `apps/api/src/alicebot_api/store.py`
- `docs/integrations/hermes-memory-provider/plugins/memory/alice/__init__.py`
- `docs/integrations/hermes-memory-provider.md`
- `docs/integrations/mcp.md`
- `scripts/run_hermes_memory_provider_smoke.py`
- `docs/integrations/hermes-memory-provider.md`
- `scripts/check_control_doc_truth.py`
- `tests/unit/test_continuity_capture.py`
- `tests/integration/test_continuity_capture_api.py`
- `tests/unit/test_continuity_review.py`
- `tests/unit/test_mcp.py`
- `tests/unit/test_hermes_memory_provider.py`
- `tests/integration/test_mcp_server.py`
- `REVIEW_REPORT.md`
- `BUILD_REPORT.md`

## tests run
1. `python3 scripts/check_control_doc_truth.py`
- Result: PASS
- Output summary: verified README, ROADMAP, sprint packet, RULES, current state, archive planning marker
2. `./.venv/bin/python -m pytest tests/unit tests/integration -q`
- Result: PASS
- Output: `1188 passed in 191.85s (0:03:11)`
3. `./.venv/bin/python scripts/run_hermes_memory_provider_smoke.py`
- Result: PASS
- Output summary:
- single external provider enforcement validated
- provider registered with expected tool schemas
- `bridge_contract_version` reported as `bridge_b2`
- bridge status `ready=true`, `errors=[]`
- config includes `bridge_mode=assist`
- lifecycle hooks report `prefetch`, `queue_prefetch`, `sync_turn`, `on_session_end`, and `bridge_mode`
- `python3 scripts/check_control_doc_truth.py`
- Result: PASS
- `./.venv/bin/python -m pytest tests/unit tests/integration -q`
- Result: `1189 passed in 196.98s (0:03:16)` (latest re-run)
- `./.venv/bin/python scripts/run_hermes_memory_provider_smoke.py`
- Result: PASS
- Evidence summary: single-external-provider enforcement message emitted; structural payload reports `single_external_enforced=true` and `bridge_status.ready=true`.
- Local filesystem-specific path fields from script output were intentionally omitted for identifier hygiene.

## blockers/issues
- None.
- No functional blockers.
- No outstanding evidence or documentation blockers after alignment updates.

## recommended next step
Run B2 review against acceptance criteria with focus on policy calibration (confidence thresholds) and confirm desired `auto`-mode aggressiveness before promoting B3 review actions.
Proceed to Bridge Sprint 4 (`B4`) packaging/docs/smoke closeout using the now-shipped B3 review queue/apply surfaces as baseline.
2 changes: 1 addition & 1 deletion PRODUCT_BRIEF.md
@@ -70,7 +70,7 @@ Review-required:
 - low-confidence extractions

 ## Active Sprint Status
-Bridge Sprint 2 (`B2`) is now the active execution sprint. It is limited to the auto-capture pipeline on top of the shipped Hermes provider surface and the `B1` contract foundation.
+Bridge Sprint 3 (`B3`) is now the active execution sprint. It is limited to review queue and explainability work on top of the shipped Hermes provider surface plus the `B1` and `B2` bridge foundations.

 ## Known Gaps To Resolve Before Build
 - Candidate scoring rubric and confidence calibration method are not specified.
3 changes: 2 additions & 1 deletion README.md
@@ -40,7 +40,8 @@ Phase 11 is complete and shipped:
 - `P11-R1` Provider Runtime Hardening is shipped
 - A bridge phase is now active: Hermes Auto-Capture
 - `B1` Hermes Provider Contract Foundation is shipped
-- `B2` Auto-Capture Pipeline is the active sprint
+- `B2` Auto-Capture Pipeline is shipped
+- `B3` Review Queue + Explainability is the active sprint
 - Historical planning and control docs: [docs/archive/planning/2026-04-08-context-compaction/README.md](docs/archive/planning/2026-04-08-context-compaction/README.md)

 ## Why Alice exists
57 changes: 25 additions & 32 deletions REVIEW_REPORT.md
@@ -4,50 +4,43 @@
 PASS

 ## criteria met
-- `alice_capture_candidates` is implemented and wired through API and MCP.
-- `alice_commit_captures` is implemented and wired through API and MCP.
-- Commit operating modes are implemented and exercised: `manual`, `assist`, `auto`.
-- Candidate classes required by B2 are present: `decision`, `commitment`, `waiting_for`, `blocker`, `preference`, `correction`, `note`, `no_op`.
-- Candidate outputs include confidence, trust class, evidence snippet, and proposed action.
-- Auto-save allowlist categories are explicit and implemented: `correction`, `preference`, `decision`, `commitment`, `waiting_for`, `blocker`.
-- Review-routed category by type policy is explicit and implemented: `note`.
-- Low-confidence/policy-gated candidates route to review persistence rather than auto-save.
-- No-op turns produce no memory writes.
-- Repeated sync attempts are idempotent via `(sync_fingerprint, candidate_id)` duplicate guard.
-- Hermes provider uses the B2 pipeline in `assist`/`auto`, preserves `manual` behavior, and keeps legacy fallback.
-- B2 docs/code do not claim `alice_review_queue` or `alice_review_apply` as shipped.
-- Previously flagged gaps are fixed:
-- explicit `auto` mode tests added
-- strict boolean validation for candidate `explicit` added
-- `ARCHITECTURE.md` bridge status updated to reflect B2 implementation
-- Local identifier hygiene check on changed files passed (no local machine paths/usernames leaked in changed code/docs/reports).
+- `alice_review_queue` is implemented and exposed on MCP.
+- `alice_review_apply` is implemented and exposed on MCP.
+- Required B3 review actions are supported through shipped surface semantics:
+- `approve` -> `confirm`
+- `edit-and-approve` -> `edit`
+- `reject` -> `delete`
+- `supersede-existing` -> `supersede`
+- Review payloads now include explainability/provenance chain data (`source_facts`, `evidence_segments`, `trust`, `supersession_notes`, `proposal_rationale`).
+- Approved review actions deterministically affect later recall/resume behavior (validated in integration flow using supersede).
+- Rejected review items are not treated as accepted continuity state (validated by recall exclusion after reject).
+- No local identifiers (local usernames/absolute machine paths) were found in changed code/docs/reports.

 ## criteria missed
-- None.
+- None functionally against B3 acceptance criteria.

 ## quality issues
-- No blocking quality issues found in B2 scope after fixes.
+- No blocking quality issues found in B3 scope.

 ## regression risks
-- Moderate-low risk in policy calibration (confidence thresholds), but implementation behavior is deterministic and covered by unit/integration tests.
-- Idempotency/no-op regressions are specifically covered.
+- Low: MCP surface additions are additive, and targeted + full unit/integration suites pass.
+- Moderate-low: review queue objects now include full explanation payloads, increasing response size; monitor MCP client assumptions on payload size/shape.

 ## docs issues
-- None blocking.
-- Bridge status documentation is now aligned with B2 implementation state.
+- None blocking. Architecture and build evidence alignment issues are fixed.

 ## should anything be added to RULES.md?
 - No required change.

 ## should anything update ARCHITECTURE.md?
-- Completed in this fix pass: B2 surfaces and runtime flow status markers were updated from planned to implemented where applicable.
+- No additional architecture changes required for B3.

 ## recommended next action
-1. Approve B2 for merge.
-2. Start B3 review-action scope (`alice_review_queue`, `alice_review_apply`) using B2 persisted review items as baseline fixtures.
-
-## evidence summary
-- Required verification commands (re-run):
-- `python3 scripts/check_control_doc_truth.py` -> PASS
-- `./.venv/bin/python -m pytest tests/unit tests/integration -q` -> `1188 passed in 191.85s (0:03:11)`
-- `./.venv/bin/python scripts/run_hermes_memory_provider_smoke.py` -> PASS
+1. Approve B3 for merge.
+2. Start B4 packaging/docs/demo closeout.
+
+## verification evidence checked
+- `python3 scripts/check_control_doc_truth.py` -> PASS
+- `./.venv/bin/python -m pytest tests/unit/test_continuity_review.py tests/unit/test_mcp.py tests/integration/test_mcp_server.py -q` -> `13 passed`
+- `./.venv/bin/python -m pytest tests/unit tests/integration -q` -> `1189 passed in 196.98s (0:03:16)`
+- `./.venv/bin/python scripts/run_hermes_memory_provider_smoke.py` -> PASS
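
Reviewer sketch (not part of the diff): the two recall criteria in this report (rejected items are excluded from recall, and a supersede changes what later recall returns) can be pictured with a toy in-memory model. Everything here is hypothetical: the `ContinuityItem` and `ReviewStore` names, the status strings, and the way supersession is modelled (an existing item is replaced by a newly supplied one) are assumptions, not the behaviour of `alice_review_apply`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContinuityItem:
    object_id: str
    body: str
    status: str = "pending_review"  # hypothetical status names
    superseded_by: Optional[str] = None


class ReviewStore:
    """Toy store: recall returns only items whose status is 'active'."""

    def __init__(self) -> None:
        self.items: Dict[str, ContinuityItem] = {}

    def add(self, item: ContinuityItem) -> None:
        self.items[item.object_id] = item

    def apply(
        self,
        object_id: str,
        action: str,
        replacement: Optional[ContinuityItem] = None,
    ) -> None:
        item = self.items[object_id]
        if action == "approve":
            item.status = "active"
        elif action == "reject":
            item.status = "deleted"
        elif action == "supersede-existing":
            if replacement is None:
                raise ValueError("supersede-existing needs a replacement item")
            replacement.status = "active"
            self.items[replacement.object_id] = replacement
            item.status = "superseded"
            item.superseded_by = replacement.object_id
        else:
            raise ValueError(f"unsupported review action: {action!r}")

    def recall(self) -> List[str]:
        # Dict insertion order keeps the result deterministic.
        return [i.body for i in self.items.values() if i.status == "active"]


store = ReviewStore()
store.add(ContinuityItem("a", "Ship on Friday"))
store.add(ContinuityItem("b", "Ship on Monday"))
store.apply("a", "approve")
store.apply("b", "reject")
after_reject = store.recall()
store.apply("a", "supersede-existing", ContinuityItem("c", "Ship next Wednesday"))
after_supersede = store.recall()
```

In this model the rejected item never reaches recall, and the superseded item drops out as soon as its replacement becomes active.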
2 changes: 1 addition & 1 deletion ROADMAP.md
@@ -8,7 +8,7 @@
 Phase 11 remains baseline truth and is not future scope.

 ## Active Planning Status
-- Bridge Sprint 2 (`B2`) is the active execution sprint.
+- Bridge Sprint 3 (`B3`) is the active execution sprint.
 - The remaining bridge-phase milestones are planned but not yet promoted.

 ## Bridge Phase: Hermes Auto-Capture (Planned)
82 changes: 62 additions & 20 deletions apps/api/src/alicebot_api/continuity_explainability.py
@@ -353,6 +353,36 @@ def _trust_record(
     }


+def _proposal_rationale(
+    *,
+    status: str,
+    source_fact_count: int,
+    evidence_segment_count: int,
+    trust_class: MemoryTrustClass,
+    confirmation_status: MemoryConfirmationStatus,
+    provenance_posture: ContinuityRecallProvenancePosture,
+    correction_count: int,
+) -> str:
+    lifecycle_clause = f"Candidate entered review with lifecycle status '{status}'."
+    source_clause = (
+        f"It is backed by {source_fact_count} source fact(s) and {evidence_segment_count} evidence segment(s)."
+    )
+    trust_clause = (
+        "Trust posture resolves to "
+        f"'{trust_class}' with confirmation status '{confirmation_status}' and provenance posture "
+        f"'{provenance_posture}'."
+    )
+    correction_clause = f"Correction history includes {correction_count} event(s)."
+    return " ".join(
+        [
+            lifecycle_clause,
+            source_clause,
+            trust_clause,
+            correction_clause,
+        ]
+    )
+
+
 def _supersession_notes(
     *,
     status: str,
@@ -475,34 +505,46 @@ def build_continuity_item_explanation(
         evidence_rows=evidence_rows,
         title=title,
     )
+    source_facts = _source_facts(
+        title=title,
+        body=body,
+        provenance=provenance,
+        capture_event=capture_event,
+    )
+    trust = _trust_record(
+        confidence=confidence,
+        confirmation_status=resolved_confirmation_status,
+        provenance_posture=resolved_provenance_posture,
+        evidence_segment_count=len(evidence_segments),
+        correction_count=len(correction_events),
+        source_event_count=source_event_count,
+    )
+    supersession_notes = _supersession_notes(
+        status=status,
+        supersedes_object_id=supersedes_object_id,
+        superseded_by_object_id=superseded_by_object_id,
+        correction_events=correction_events,
+    )
     return {
-        "source_facts": _source_facts(
-            title=title,
-            body=body,
-            provenance=provenance,
-            capture_event=capture_event,
-        ),
-        "trust": _trust_record(
-            confidence=confidence,
-            confirmation_status=resolved_confirmation_status,
-            provenance_posture=resolved_provenance_posture,
-            evidence_segment_count=len(evidence_segments),
-            correction_count=len(correction_events),
-            source_event_count=source_event_count,
-        ),
+        "source_facts": source_facts,
+        "trust": trust,
         "evidence_segments": evidence_segments,
-        "supersession_notes": _supersession_notes(
-            status=status,
-            supersedes_object_id=supersedes_object_id,
-            superseded_by_object_id=superseded_by_object_id,
-            correction_events=correction_events,
-        ),
+        "supersession_notes": supersession_notes,
         "timestamps": _timestamps_record(
             created_at=created_at,
             updated_at=updated_at,
             capture_event=capture_event,
             last_confirmed_at=last_confirmed_at,
         ),
+        "proposal_rationale": _proposal_rationale(
+            status=status,
+            source_fact_count=len(source_facts),
+            evidence_segment_count=len(evidence_segments),
+            trust_class=trust["trust_class"],
+            confirmation_status=resolved_confirmation_status,
+            provenance_posture=resolved_provenance_posture,
+            correction_count=len(correction_events),
+        ),
     }
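
Reviewer sketch (not part of the diff): to see what the new `proposal_rationale` field would contain, the helper added in this file can be exercised on its own. The copy below keeps the same clause structure but, as an assumption, types the trust, confirmation, and provenance arguments as plain `str`, since the project's `MemoryTrustClass`, `MemoryConfirmationStatus`, and `ContinuityRecallProvenancePosture` definitions are not shown in this PR. The sample argument values are hypothetical.

```python
def proposal_rationale(
    *,
    status: str,
    source_fact_count: int,
    evidence_segment_count: int,
    trust_class: str,
    confirmation_status: str,
    provenance_posture: str,
    correction_count: int,
) -> str:
    """Standalone copy of the diff's helper, with str stand-ins for project types."""
    lifecycle_clause = f"Candidate entered review with lifecycle status '{status}'."
    source_clause = (
        f"It is backed by {source_fact_count} source fact(s) "
        f"and {evidence_segment_count} evidence segment(s)."
    )
    trust_clause = (
        "Trust posture resolves to "
        f"'{trust_class}' with confirmation status '{confirmation_status}' "
        f"and provenance posture '{provenance_posture}'."
    )
    correction_clause = f"Correction history includes {correction_count} event(s)."
    return " ".join(
        [lifecycle_clause, source_clause, trust_clause, correction_clause]
    )


# Hypothetical sample values; the real enum members are not shown in the PR.
sample = dict(
    status="active",
    source_fact_count=2,
    evidence_segment_count=1,
    trust_class="deterministic",
    confirmation_status="unconfirmed",
    provenance_posture="partial",
    correction_count=0,
)
rationale = proposal_rationale(**sample)
print(rationale)
```

Because the string is built only from its arguments, identical inputs always yield an identical rationale, which is what lets the review, recall, and resume paths share one explanation record.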

