Phase E: cross-session memory + recall (WAL → lazy digest → recall) by azalio · Pull Request #157 · azalio/map-framework

azalio · 2026-06-03T07:05:29Z

Phase E — Cross-Session Memory + Recall

Gives the MAP Framework durable cross-session memory: cheap per-turn capture into a scratch WAL, lazy LLM "digest" finalization on the next session start, and recall injection of relevant past digests — so a later session knows why an approach was chosen.

Architecture: write-ahead-log → lazy checkpoint (NOT flush-on-SessionEnd). Stop is the only reliable durable-capture point; SessionStart carries finalize + recall. It works with zero SessionEnd dependency (HC-2) and survives compaction, for which PreCompact does not fire reliably, especially in 1M-token contexts.

Built via /map-efficient (8 subtasks, sequential RESEARCH → ACTOR → MONITOR, per-subtask commits). Final-verifier: PASS.

What's in it

- src/mapify_cli/memory/ (new pure-runtime package):
  - digest_schema.py: single-source field constants + redaction (sk-/gh_/base64/AKIA) + secret-path globs + control-char sanitizer (INV-7, Contract-First).
  - capture.py: LLM-free per-turn scratch WAL append (subprocess-free branch resolve via .git/HEAD), on_session_end best-effort marker.
  - finalize.py: finalize-if-dirty: claude -p (argv form, MAP_INVOKED_BY=memory-finalize recursion guard, hard timeout), atomic .md.tmp → os.replace → .finalized, per-branch flock, truncation-tolerant, empty→no-digest, memory-cost.log.
  - recall.py: current-branch keyword+recency ranking, MAP_MEMORY_RECALL_CAP (default 4000 chars) with whole-digest drops → recall-drop.log, sanitized additionalContext.
- 4 REQUIRE_GUARD hook shims (map-memory-{capture,finalize,recall,endmark}.py) authored as .jinja, registered in settings.json (Stop / SessionStart finalize→recall / SessionEnd / UserPromptSubmit), lint-hooks.py, and both hook doc tables.
- map-memory-now skill: on-demand finalize / --finalize-all, requires-cmd: [claude, git], host-gate pruned when claude absent (EC-4).
- .gitignore: ignores the scratch WAL; documents the MAP_MEMORY_COMMIT_DIGESTS=0 opt-out.
- Hook executable-bit fix: hooks now ship +x (the harness execs them via shebang); renderer force-sets +x for hook .py/.sh, plus CI guards that exec hooks the way the harness does (shebang, not python3 <path>).

Tests / gates

~150 new unit tests across the memory modules + an end-to-end integration smoke (capture×2 → finalize → recall with a fake claude on PATH; asserts the digest surfaces in additionalContext and memory-cost.log is written).
make check (ruff + mypy + pyright + full pytest + check-render) green; full suite 2037 passed.

Notes

Token accounting into token_accounting.json is intentionally deferred (Decision 9); finalize writes memory-cost.log instead.
Single-source render invariant respected throughout (templates_src/**/*.jinja → make render-templates; check-render enforced).

🤖 Generated with Claude Code

…ion + sanitizer)

Adds src/mapify_cli/memory/{__init__,digest_schema}.py as the ONE authority (INV-7 / Phase-A Contract-First) for the memory subsystem:
- SCRATCH_TURN_FIELDS / SCRATCH_ENDED_FIELDS / DIGEST_FRONTMATTER_FIELDS (decisions/findings intentionally absent from scratch shapes, spec:118)
- REDACTION_PATTERNS + redact_text() (sk-/sk-ant-, gh[pousr]_, base64 blob, AKIA)
- SECRET_PATH_GLOBS + redact_secret_path() (.env*/*.pem/*.key/credentials*/secrets*)
- sanitize_value() matching the proven _sanitize_for_json control-char rule

57 unit tests (VC1-VC4). ruff/mypy/pyright clean.

Follow-up (LOW, out of ST-001 scope): github_pat_ fine-grained PAT prefix and AWS STS ASIA key-id formats are not yet covered by redact_text.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
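As a rough illustration of the redaction step this commit describes, the sketch below applies a small list of compiled patterns in order. The regexes, the name `redact_text_sketch`, and the `PLACEHOLDER` constant are assumptions made for the example; the real REDACTION_PATTERNS and redact_text() in digest_schema.py may differ, and the base64-blob rule is omitted here.

```python
import re

# Hypothetical placeholder; the PR text mentions a «redacted» marker.
PLACEHOLDER = "«redacted»"

# Illustrative approximations of the token families named in the commit.
# These are NOT the project's actual REDACTION_PATTERNS.
ILLUSTRATIVE_PATTERNS = [
    re.compile(r"\bsk-(?:ant-)?[A-Za-z0-9_-]{16,}"),  # sk- / sk-ant- style keys
    re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}"),      # GitHub token prefixes
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),              # AWS access key id
]


def redact_text_sketch(text: str) -> str:
    """Replace every match of every pattern with the placeholder."""
    for pattern in ILLUSTRATIVE_PATTERNS:
        text = pattern.sub(PLACEHOLDER, text)
    return text
```

The leading `\b` keeps ordinary hyphenated words such as `task-...` from being treated as `sk-` keys, which is one way to limit over-redaction.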

src/mapify_cli/memory/capture.py: LLM-free Stop-hook hot path.
- append_turn(stdin, project_dir): one redacted+sanitized JSONL turn record to .map/<branch>/sessions/scratch/<sid>.jsonl + maintains current-session pointer
- append_end_marker(stdin, project_dir): {event:ended,ts,session_id} (reused by ST-005)
- resolve_session_id: stdin session_id -> current-session pointer -> None (HC-1)
- branch resolved by reading .git/HEAD directly (dir + worktree-file + detached HEAD); NO subprocess on the hot path (INV-1 proven by zero-subprocess test)
- turn counter = non-empty scratch line count (+1), resilient to truncated tail (INV-6)
- best-effort: never raises on malformed/empty stdin
- field names imported from digest_schema (INV-7)

30 unit tests. ruff/mypy/pyright clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
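The subprocess-free branch lookup mentioned above can be sketched as follows. This is a minimal approximation, not the project's `_resolve_branch`: the function name, the `detached-<hash prefix>` label for a detached HEAD, and the `None` fallback are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def resolve_branch_sketch(project_dir: str) -> Optional[str]:
    """Resolve the current branch by reading .git/HEAD, with no subprocess."""
    git_path = Path(project_dir) / ".git"
    try:
        if git_path.is_file():
            # Linked worktree: .git is a file containing "gitdir: <path>".
            content = git_path.read_text(encoding="utf-8").strip()
            if not content.startswith("gitdir:"):
                return None
            git_dir = Path(content[len("gitdir:"):].strip())
            if not git_dir.is_absolute():
                git_dir = git_path.parent / git_dir
        else:
            git_dir = git_path
        head = (git_dir / "HEAD").read_text(encoding="utf-8").strip()
    except OSError:
        return None  # not a repo, or unreadable: stay best-effort
    prefix = "ref: refs/heads/"
    if head.startswith(prefix):
        return head[len(prefix):]
    # Detached HEAD holds a raw commit hash; this label format is an assumption.
    return f"detached-{head[:12]}" if head else None
```

Reading the file directly avoids spawning `git` on every Stop event, which is the point of the zero-subprocess invariant.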

…lock, timeout)

src/mapify_cli/memory/finalize.py: finalize_dirty(incoming_sid, project_dir, timeout=60). Transactional unit (INV-4, load-bearing order): write <sid>.md.tmp -> os.replace -> .finalized marker -> cost log -> delete scratch.
- candidate = scratch/*.jsonl with sid != incoming_sid AND no .finalized (no SessionEnd dep, HC-2)
- per-branch flock (name sanitized to ^[a-zA-Z0-9_-]{1,64}$) + in-lock re-check => idempotent + concurrent-safe => exactly one digest (VC3); LockTimeoutError -> skip
- claude -p in argv list form, env MAP_INVOKED_BY=memory-finalize (recursion guard), hard subprocess timeout; timeout/returncode!=0 -> scratch left unfinalized, tmp cleaned
- tolerates truncated trailing JSONL line (INV-6); empty scratch -> no digest but finalized+deleted
- digest redact_text + sanitize_value (defense-in-depth); cost -> sessions/memory-cost.log (token_accounting.json deferred)

20 unit tests incl. timeout, returncode!=0, idempotent, concurrent-in-lock, lock-timeout, truncation, empty, redaction, incoming-skip, and the post-replace marker-touch-failure retry-convergence path. ruff/mypy/pyright clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
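The load-bearing order of the transactional unit can be sketched roughly as below. This is a simplified, POSIX-only illustration: the function name, the file layout, and the lock file name are hypothetical, the LLM call and the cost log are left out, and the real finalize_dirty uses a lock with a timeout rather than a blocking `flock`.

```python
import fcntl
import os
from pathlib import Path


def finalize_one_sketch(sessions_dir: Path, sid: str, digest_text: str) -> bool:
    """Write one digest atomically; return False if it was already finalized."""
    scratch = sessions_dir / "scratch" / f"{sid}.jsonl"
    marker = sessions_dir / f"{sid}.finalized"  # hypothetical marker name
    final = sessions_dir / f"{sid}.md"
    tmp = sessions_dir / f"{sid}.md.tmp"
    lock_path = sessions_dir / ".finalize.lock"  # hypothetical lock file
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocking here, for brevity
        try:
            if marker.exists():
                return False  # in-lock re-check keeps the operation idempotent
            tmp.write_text(digest_text, encoding="utf-8")
            os.replace(tmp, final)           # atomic rename: no partial digest
            marker.touch()                   # only after the digest is durable
            scratch.unlink(missing_ok=True)  # the scratch WAL is deleted last
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Because the scratch file is removed only after the marker exists, a crash at any earlier step leaves the session eligible for a retry on the next start instead of losing it.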

src/mapify_cli/memory/recall.py: build_recall(prompt, branch, project_dir) -> str.
- reads current-branch digests (.map/<branch>/sessions/*.md; cross-branch deferred, OQ-3 v1)
- parses YAML frontmatter (yaml.safe_load), fields via DIGEST_FRONTMATTER_FIELDS (INV-7)
- ranks by prompt keyword/ticket overlap + recency tiebreak; empty prompt -> recency
- caps assembled payload at MAP_MEMORY_RECALL_CAP (default 4000 chars); whole-digest drops only (never mid-digest, SC-1) logged to sessions/recall-drop.log
- sanitize_value + redact_text (defense-in-depth); returns "" when nothing to recall

Monitor follow-ups addressed: count the inter-block "\n" separator in the cap check so the payload never exceeds the cap (was off by N-1); document the 500-char per-block body bound; add a multi-block strict-cap regression test.

18 unit tests. ruff/mypy/pyright clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
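A minimal sketch of the strict cap, assuming the digest blocks arrive already ranked best-first. The function name and return shape are hypothetical. It counts the separator in the budget, as the monitor follow-up describes, and stops at the first overflow, which matches the rank-monotonic behaviour introduced by the later review fix in this PR.

```python
from typing import List, Tuple


def apply_recall_cap(
    ranked_blocks: List[str], cap: int = 4000, sep: str = "\n"
) -> Tuple[str, List[str]]:
    """Keep whole blocks in rank order; return (payload, dropped blocks)."""
    kept: List[str] = []
    used = 0
    for index, block in enumerate(ranked_blocks):
        # Every block after the first also pays for the joining separator.
        cost = len(block) + (len(sep) if kept else 0)
        if used + cost > cap:
            # Stop at the first overflow so a smaller, lower-ranked block
            # never jumps ahead of a dropped higher-ranked one.
            return sep.join(kept), ranked_blocks[index:]
        kept.append(block)
        used += cost
    return sep.join(kept), []
```

With blocks of 10, 10, and 2 characters and a cap of 20, only the first block is kept, even though the third would fit on its own.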

Adds capture.on_session_end(stdin, project_dir): thin SessionEnd entrypoint over append_end_marker (ST-002). Appends ONLY {event:'ended',ts,session_id}: no finalize, no LLM. Wraps the call in its own broad guard (swallow+log) so SessionEnd stays fire-and-forget and NEVER raises (AC-4). Reason-agnostic: SessionEnd reason (clear/resume/logout) is read for logging only and never enters the record (EC-6).

3 new tests (record-only, swallows-exception, reason-agnostic). Full suite 2021 passed. ruff/mypy/pyright clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ation

Authors 4 thin hook shims as templates_src/hooks/*.py.jinja (rendered to .claude/ and templates/; claude-only, codex ships only workflow-gate.py):
- map-memory-capture.py (Stop) -> capture.append_turn
- map-memory-endmark.py (SessionEnd) -> capture.on_session_end
- map-memory-finalize.py (SessionStart) -> finalize_dirty (MAP_MEMORY_FINALIZE_TIMEOUT, default 60)
- map-memory-recall.py (SessionStart+UserPromptSubmit) -> build_recall, emits additionalContext

Each: recursion guard `if os.environ.get("MAP_INVOKED_BY"): sys.exit(0)` as the FIRST main() statement (stops the finalizer's own claude -p from re-triggering memory hooks), stdin parse, lazy import (src/ first, falls back to installed mapify_cli, ImportError->no-op), single best-effort module call.

Registered in scripts/lint-hooks.py REQUIRE_GUARD and both doc tables (hooks/README.md + references/hook-patterns.md, all trees). check-render green; lint-hooks 16 conform; test_hook_patterns 49 passed; full suite 2029 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
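The shim shape described above might look roughly like this for the Stop hook. It is a sketch under assumptions: it passes the parsed payload to `capture.append_turn` (the commit does not say whether the function takes raw text or a dict), it only tries the installed `mapify_cli` rather than `src/` first, and it falls back to the current directory when `CLAUDE_PROJECT_DIR` is unset.

```python
import json
import os
import sys


def main() -> None:
    # Recursion guard first: the finalizer's own `claude -p` run sets
    # MAP_INVOKED_BY, so memory hooks fired inside it must do nothing.
    if os.environ.get("MAP_INVOKED_BY"):
        sys.exit(0)
    try:
        payload = json.loads(sys.stdin.read() or "{}")
    except json.JSONDecodeError:
        payload = {}  # malformed stdin is tolerated, never fatal
    try:
        # Lazy import: a missing package turns the hook into a no-op.
        from mapify_cli.memory import capture
    except ImportError:
        sys.exit(0)
    project_dir = os.environ.get("CLAUDE_PROJECT_DIR", os.getcwd())
    try:
        capture.append_turn(payload, project_dir)  # assumed argument shape
    except Exception:
        pass  # best-effort: a hook failure must not break the session
```

A real shim would end with the usual `if __name__ == "__main__": main()` entry point, since the harness executes the file directly.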

…gitignore

Wires the memory runtime surface:
- settings.json (.jinja + hand-maintained .claude copy, shipped-only): Stop->capture; SessionStart->finalize THEN recall (order load-bearing, INV-3); UserPromptSubmit->recall; new SessionEnd->endmark.
- New map-memory-now skill (SKILL.md.jinja): on-demand finalize / --finalize-all sweep via finalize_dirty(None). skill-rules entry: skillClass=task, requires-cmd=[claude,git], direct-invocation triggers. Host gate prunes it when claude absent (EC-4); new TestMapMemoryNowHostGate covers it.
- .gitignore: Phase-E block ignoring .map/*/sessions/scratch/ + documented MAP_MEMORY_COMMIT_DIGESTS=0 opt-out; new templates_src/.gitignore.jinja rendered to templates/.gitignore (".gitignore" added to renderer _CLAUDE_SHIPPED_ONLY).
- Tests: skill-count 14->15; VC1/VC3 file_copier updated (map-state + map-memory-now both require git -> 2 skips); 4 memory-hook smoke cases.

render + check-render green; full suite 2036 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tests/test_memory_integration.py: drives the real hook BINARIES as subprocesses (capture×2 → finalize[new sid, NO SessionEnd] → recall) with a fake `claude` executable injected on PATH (mocks claude -p; unconditional, no skipif).

Asserts: 2 turn records → exactly one digest .md containing the mocked body → .finalized marker + scratch deleted → memory-cost.log with input_tokens (VC4) → recall stdout additionalContext contains the digest body. Strips MAP_INVOKED_BY from the subprocess env so the recursion guard cannot silently no-op the hooks (proves the pipeline really ran). No token_accounting.json assertion (Decision 9 descope).

make check (lint + test + check-render) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
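The fake-executable technique used by this test can be sketched in a few lines. The helper name and its arguments are hypothetical, and the sketch assumes a POSIX system where a `#!/bin/sh` script is directly executable; it is not the project's actual test fixture.

```python
import os
import subprocess
import tempfile
from pathlib import Path
from typing import List


def run_with_fake_cli(
    name: str, script_body: str, argv: List[str]
) -> subprocess.CompletedProcess:
    """Run argv with a throwaway shell script named `name` first on PATH."""
    with tempfile.TemporaryDirectory() as bin_dir:
        fake = Path(bin_dir) / name
        fake.write_text("#!/bin/sh\n" + script_body, encoding="utf-8")
        fake.chmod(0o755)  # must be executable, like a real CLI
        env = dict(os.environ)
        # Prepend so the fake shadows any real binary of the same name.
        env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
        # Strip the guard variable so hooks under test cannot silently no-op.
        env.pop("MAP_INVOKED_BY", None)
        return subprocess.run(
            argv, env=env, capture_output=True, text=True, timeout=10
        )
```

For example, `run_with_fake_cli("claude", "echo fake-digest\n", ["claude", "-p", "x"])` runs the stub instead of a real `claude`, so the test needs neither network access nor an installed CLI.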

Final-verifier flagged a wording inaccuracy in the map-memory-now SKILL.md Notes: finalize_dirty never stages or commits in any mode (there is no git add/commit in the memory modules). Reword the MAP_MEMORY_COMMIT_DIGESTS=0 note: digests are committed only because they are not git-ignored; the opt-out is uncommenting the .gitignore line.

Re-rendered; check-render green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… shebang

The 4 map-memory-* hooks shipped without the executable bit, so Claude Code's direct shebang invocation ("$CLAUDE_PROJECT_DIR"/.claude/hooks/<name>.py) failed at runtime with "/bin/sh: ... Permission denied" (Stop/SessionStart/UserPromptSubmit). The python-based smoke/integration tests invoked hooks as `python3 <path>`, which masked the missing bit.

Root fix (defense in depth):
- chmod +x the 4 hook .jinja sources (matches the existing convention, e.g. context-meter.py.jinja; git tracks the bit and the renderer propagates it).
- Harden template_renderer._atomic_write_file: FORCE +x for .py/.sh under a managed hooks/ dir regardless of source bit, so a hook .jinja that forgets the executable bit still ships an executable hook (mirrors create_hook_files' unconditional chmod on the install path). Implements the learned "preserve executable bits after atomic temp-file writer" rule.

CI guards (test hooks the way the harness does):
- tests/test_hook_patterns.py::test_hook_is_executable: every .py/.sh hook in all four trees (.claude, .codex, templates, templates/codex) must be X_OK.
- tests/hooks/test_hook_inventory_smoke.py::test_every_configured_hook_execs_via_shebang: exec each settings.json-wired hook via its bare shebang path (no interpreter prefix) and assert no PermissionError / 126 / 127.
- tests/test_template_render.py: a hook .jinja without +x still renders executable; non-hook files are not force-marked.

Negative-proofed: removing +x makes both guard tests fail; re-render restores it. make check (lint + test + check-render) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
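The renderer hardening can be illustrated with a small atomic writer that sets the mode before the rename. This is only a sketch: the function name, the "parent directory is called hooks" check, and the fixed 0o755 / 0o644 modes are assumptions, not the behaviour of template_renderer._atomic_write_file.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_sketch(dest: Path, content: str) -> None:
    """Write dest atomically; hook scripts always end up executable."""
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        # mkstemp creates the file as 0600, so the mode must be set
        # explicitly or the rename would ship a non-executable hook.
        is_hook = dest.suffix in {".py", ".sh"} and dest.parent.name == "hooks"
        os.chmod(tmp_name, 0o755 if is_hook else 0o644)
        os.replace(tmp_name, dest)  # atomic swap into place
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # never leave a stray temp file behind
        raise
```

Setting the mode on the temp file, before `os.replace`, means the destination is never visible without its executable bit.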

…de absent)

TestVC1MissingDepSkip and TestMapMemoryNowHostGate delegated non-target commands to the REAL requires-cmd checker. map-memory-now requires-cmd:[claude, git]; on CI runners `claude` is absent, so map-memory-now skipped on `claude` (not `git`), flipping the skip message and failing the assertion, while passing locally where `claude` is installed. Force the patched checker deterministically (target command absent, all others present) instead of delegating to PATH. Verified with the whole test file run under a claude-stripped PATH.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…subsystem

Correctness:
- finalize: slug disambiguation reserved suffix room before 32-char truncation so os.replace can no longer clobber another session's digest (#1)
- finalize: derive slug from the `title` key and strip ```json fences in _parse_claude_output so decisions/findings survive fenced output (#3)
- finalize: per-field redaction before YAML escaping; identifier fields (session_id/branch/date/slug) excluded so a long session_id is no longer rewritten to «redacted», keeping owner-line dedup working (#4)
- recall: cap is rank-monotonic: break on first overflow so a lower-ranked smaller digest never jumps a dropped higher-ranked one (#5)
- digest_schema: redact fine-grained github_pat_ tokens (#6); stop over-redacting pure-hex git SHAs while still catching mixed-case secrets (#7)
- capture: advance the transcript <sid>.offset only AFTER the record write so a crash never skips a transcript range (#8)
- capture: derive a fallback session id from the transcript stem instead of a shared unknown.jsonl bucket (#9)

Behavior/perf:
- settings: restore the UserPromptSubmit map-memory-recall registration so prompt-relevance ranking actually runs (#2)
- capture: memoise _resolve_branch and tail-read the turn count (O(1)) on the hot path; finalize hook subprocess default timeout 60->50 so it stays below the 60s harness timeout and its cleanup runs (#10)

Adds tests/test_memory_review_fixes.py (11 regression tests, one per finding). Full gate green: mypy, pyright src/ 0/0/0, lint-hooks, check-render, 2082 tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
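Finding #1 is easiest to see in code. The sketch below shows one way to reserve room for a disambiguating suffix before truncating, so two long titles that share a 32-character prefix cannot collapse to the same file name. The function name, the `-N` suffix format, and the `taken` set are assumptions for illustration, not the project's implementation.

```python
from typing import Set


def disambiguated_slug(base: str, taken: Set[str], max_len: int = 32) -> str:
    """Return a slug of at most max_len chars that is not in `taken`."""
    slug = base[:max_len]
    if slug not in taken:
        return slug
    counter = 2
    while True:
        suffix = f"-{counter}"
        # Truncate FIRST to leave room for the suffix. Appending and then
        # truncating would cut the suffix off and reproduce the collision.
        candidate = base[: max_len - len(suffix)] + suffix
        if candidate not in taken:
            return candidate
        counter += 1
```

With a naive "append then truncate to 32" approach, a 40-character base plus `-2` would be cut back to the same 32 characters as the original, and the following `os.replace` would overwrite the other session's digest.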

azalio · 2026-06-03T15:01:00Z

Code-review fixes pushed (commit `901d991`)

Resolved all 10 findings from a high-effort review of the memory subsystem:

Correctness

1. finalize: slug disambiguation now reserves suffix room before the 32-char truncation → os.replace can no longer clobber another session's digest.
3. finalize: slug derived from the title key; _parse_claude_output strips ```json fences so decisions/findings survive.
4. finalize: per-field redaction before YAML escaping; identifier fields (session_id/branch/date/slug) excluded → long session_id no longer becomes «redacted», owner-line dedup preserved.
5. recall: cap is rank-monotonic: break on first overflow so a lower-ranked smaller digest never jumps a dropped higher-ranked one.
6/7. digest_schema: redact fine-grained github_pat_ tokens; stop over-redacting pure-hex git SHAs (mixed-case secrets still caught).
8. capture: advance the transcript <sid>.offset only AFTER the record write (crash-safety).
9. capture: fallback session id from the transcript stem instead of a shared unknown.jsonl.

Behavior/perf
2. settings: restored the UserPromptSubmit recall registration so prompt-relevance ranking actually runs.
10. capture: memoised _resolve_branch + O(1) tail-read turn count; finalize hook subprocess timeout 60→50 (below the 60s harness timeout so cleanup runs).

Adds tests/test_memory_review_fixes.py (11 regression tests). Full gate green: mypy, pyright src/ 0/0/0, lint-hooks, check-render, 2082 tests passed.

🤖 Generated with Claude Code

azalio and others added 12 commits June 2, 2026 21:40

azalio merged commit b0c3133 into main Jun 3, 2026
6 checks passed

azalio deleted the arroyo-switchback branch June 3, 2026 18:53
