Phase E: cross-session memory + recall (WAL → lazy digest → recall)#157
Merged
…ion + sanitizer)
Adds src/mapify_cli/memory/{__init__,digest_schema}.py as the ONE authority
(INV-7 / Phase-A Contract-First) for the memory subsystem:
- SCRATCH_TURN_FIELDS / SCRATCH_ENDED_FIELDS / DIGEST_FRONTMATTER_FIELDS
(decisions/findings intentionally absent from scratch shapes, spec:118)
- REDACTION_PATTERNS + redact_text() (sk-/sk-ant-, gh[pousr]_, base64 blob, AKIA)
- SECRET_PATH_GLOBS + redact_secret_path() (.env*/*.pem/*.key/credentials*/secrets*)
- sanitize_value() matching the proven _sanitize_for_json control-char rule
57 unit tests (VC1-VC4). ruff/mypy/pyright clean.
Follow-up (LOW, out of ST-001 scope): github_pat_ fine-grained PAT prefix and
AWS STS ASIA key-id formats are not yet covered by redact_text.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
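A minimal sketch of how a pattern-list redactor like `redact_text()` could work. The regexes below are illustrative assumptions modelled on the prefixes named in the commit message (sk-/sk-ant-, gh[pousr]_, AKIA); they are not the actual `REDACTION_PATTERNS` from `digest_schema.py`, and the base64-blob rule is left out.

```python
import re

# Hypothetical patterns; the real REDACTION_PATTERNS may differ in length
# bounds and character classes.
REDACTION_PATTERNS = [
    re.compile(r"\bsk-(?:ant-)?[A-Za-z0-9_-]{16,}"),  # sk- / sk-ant- style keys
    re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}"),        # GitHub token prefixes
    re.compile(r"AKIA[0-9A-Z]{16}"),                  # AWS access key id
]
REDACTED = "«redacted»"


def redact_text(text: str) -> str:
    """Replace every match of every pattern with a fixed placeholder."""
    for pattern in REDACTION_PATTERNS:
        text = pattern.sub(REDACTED, text)
    return text
```

Keeping the patterns in one list makes follow-ups such as `github_pat_` or ASIA key ids a one-line addition.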
src/mapify_cli/memory/capture.py: LLM-free Stop-hook hot path.
- append_turn(stdin, project_dir): one redacted+sanitized JSONL turn record
to .map/<branch>/sessions/scratch/<sid>.jsonl + maintains current-session pointer
- append_end_marker(stdin, project_dir): {event:ended,ts,session_id} (reused by ST-005)
- resolve_session_id: stdin session_id -> current-session pointer -> None (HC-1)
- branch resolved by reading .git/HEAD directly (dir + worktree-file + detached
HEAD) — NO subprocess on the hot path (INV-1 proven by zero-subprocess test)
- turn counter = non-empty scratch line count (+1), resilient to truncated tail (INV-6)
- best-effort: never raises on malformed/empty stdin
- field names imported from digest_schema (INV-7)
30 unit tests. ruff/mypy/pyright clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
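The subprocess-free branch lookup described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `capture.py`: the function name `resolve_branch` and the `detached-<hash>` label for a detached HEAD are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_branch(project_dir: Path) -> Optional[str]:
    """Read the current branch from .git/HEAD without spawning git."""
    git_path = project_dir / ".git"
    try:
        if git_path.is_file():
            # Worktree case: ".git" is a file containing "gitdir: <path>".
            content = git_path.read_text(encoding="utf-8").strip()
            if not content.startswith("gitdir:"):
                return None
            git_dir = Path(content[len("gitdir:"):].strip())
            if not git_dir.is_absolute():
                git_dir = (project_dir / git_dir).resolve()
        else:
            git_dir = git_path
        head = (git_dir / "HEAD").read_text(encoding="utf-8").strip()
    except OSError:
        return None  # best-effort: no repo, unreadable HEAD, etc.
    prefix = "ref: refs/heads/"
    if head.startswith(prefix):
        return head[len(prefix):]
    # Detached HEAD: the file holds a raw commit hash (label is an assumption).
    return f"detached-{head[:12]}" if head else None
```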
…lock, timeout)
src/mapify_cli/memory/finalize.py: finalize_dirty(incoming_sid, project_dir, timeout=60).
Transactional unit (INV-4, load-bearing order):
write <sid>.md.tmp -> os.replace -> .finalized marker -> cost log -> delete scratch.
- candidate = scratch/*.jsonl with sid != incoming_sid AND no .finalized (no SessionEnd dep, HC-2)
- per-branch flock (name sanitized to ^[a-zA-Z0-9_-]{1,64}$) + in-lock re-check =>
idempotent + concurrent-safe => exactly one digest (VC3); LockTimeoutError -> skip
- claude -p in argv list form, env MAP_INVOKED_BY=memory-finalize (recursion guard),
hard subprocess timeout; timeout/returncode!=0 -> scratch left unfinalized, tmp cleaned
- tolerates truncated trailing JSONL line (INV-6); empty scratch -> no digest but finalized+deleted
- digest redact_text + sanitize_value (defense-in-depth); cost -> sessions/memory-cost.log
(token_accounting.json deferred)
20 unit tests incl. timeout, returncode!=0, idempotent, concurrent-in-lock, lock-timeout,
truncation, empty, redaction, incoming-skip, and the post-replace marker-touch-failure
retry-convergence path. ruff/mypy/pyright clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
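The load-bearing order of the transactional unit can be illustrated with a small sketch. The helper name `commit_digest`, the location of the `.finalized` marker next to the scratch file, and the cost-log line format are assumptions for illustration; the flock and the `claude -p` call are omitted.

```python
import os
from pathlib import Path


def commit_digest(sessions_dir: Path, sid: str, digest_text: str) -> Path:
    """Publish a digest in the order: tmp -> replace -> marker -> log -> delete."""
    scratch_dir = sessions_dir / "scratch"
    tmp = sessions_dir / f"{sid}.md.tmp"
    final = sessions_dir / f"{sid}.md"
    try:
        tmp.write_text(digest_text, encoding="utf-8")  # 1. write temp file
        os.replace(tmp, final)                         # 2. atomic publish
    except OSError:
        tmp.unlink(missing_ok=True)  # clean tmp; scratch stays unfinalized
        raise
    (scratch_dir / f"{sid}.finalized").touch()         # 3. marker
    with (sessions_dir / "memory-cost.log").open("a", encoding="utf-8") as log:
        log.write(f"{sid}\n")                          # 4. cost log (placeholder line)
    (scratch_dir / f"{sid}.jsonl").unlink(missing_ok=True)  # 5. delete scratch
    return final
```

Because the scratch file is deleted last, a crash at any earlier step leaves enough state on disk for a later run to retry and converge.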
src/mapify_cli/memory/recall.py: build_recall(prompt, branch, project_dir) -> str.
- reads current-branch digests (.map/<branch>/sessions/*.md; cross-branch deferred, OQ-3 v1)
- parses YAML frontmatter (yaml.safe_load), fields via DIGEST_FRONTMATTER_FIELDS (INV-7)
- ranks by prompt keyword/ticket overlap + recency tiebreak; empty prompt -> recency
- caps assembled payload at MAP_MEMORY_RECALL_CAP (default 4000 chars); whole-digest drops
only (never mid-digest, SC-1) logged to sessions/recall-drop.log
- sanitize_value + redact_text (defense-in-depth); returns "" when nothing to recall
Monitor follow-ups addressed: count the inter-block "\n" separator in the cap check so the
payload never exceeds the cap (was off by N-1); document the 500-char per-block body bound;
add a multi-block strict-cap regression test.
18 unit tests. ruff/mypy/pyright clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
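A sketch of a strict cap check that counts the inter-block "\n" separator, so the joined payload can never exceed the cap. The function name `assemble_payload` is hypothetical, and the stop-at-first-overflow drop policy is an assumption borrowed from the later review fix (#5) rather than from this commit.

```python
from typing import List, Tuple


def assemble_payload(blocks: List[str], cap: int = 4000) -> Tuple[str, List[str]]:
    """Join ranked blocks with "\n" under a hard cap; drop whole blocks only."""
    kept: List[str] = []
    used = 0
    for i, block in enumerate(blocks):
        # Every block after the first also pays for one "\n" separator.
        cost = len(block) + (1 if kept else 0)
        if used + cost > cap:
            # Stop at the first overflow so rank order is preserved.
            return "\n".join(kept), blocks[i:]
        kept.append(block)
        used += cost
    return "\n".join(kept), []
```

For example, `assemble_payload(["aaaa", "bbbb", "cc"], cap=9)` keeps the first two blocks (4 + 1 + 4 = 9 characters) and reports `["cc"]` as dropped.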
Adds capture.on_session_end(stdin, project_dir): thin SessionEnd entrypoint over
append_end_marker (ST-002). Appends ONLY {event:'ended',ts,session_id} — no finalize,
no LLM. Wraps the call in its own broad guard (swallow+log) so SessionEnd stays
fire-and-forget and NEVER raises (AC-4). Reason-agnostic: SessionEnd reason
(clear/resume/logout) is read for logging only and never enters the record (EC-6).
3 new tests (record-only, swallows-exception, reason-agnostic). Full suite 2021 passed.
ruff/mypy/pyright clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
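The fire-and-forget behaviour can be sketched as a thin wrapper with its own broad guard. The signatures here are simplified assumptions (the real entrypoints take `stdin` and `project_dir`), and the `ts` format is a guess.

```python
import json
import logging
import time
from pathlib import Path

log = logging.getLogger("memory.capture")


def append_end_marker(scratch_file: Path, session_id: str) -> None:
    """Append the end marker; the SessionEnd reason is deliberately not stored."""
    record = {"event": "ended", "ts": time.time(), "session_id": session_id}
    with scratch_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def on_session_end(scratch_file: Path, session_id: str, reason: str = "") -> None:
    """Never raises: any failure is logged and swallowed."""
    try:
        log.debug("SessionEnd reason=%s", reason)  # logging only
        append_end_marker(scratch_file, session_id)
    except Exception:
        log.exception("end marker failed; ignoring")
```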
…ation
Authors 4 thin hook shims as templates_src/hooks/*.py.jinja (rendered to .claude/
and templates/ — claude-only; codex ships only workflow-gate.py):
- map-memory-capture.py (Stop) -> capture.append_turn
- map-memory-endmark.py (SessionEnd) -> capture.on_session_end
- map-memory-finalize.py (SessionStart) -> finalize_dirty (MAP_MEMORY_FINALIZE_TIMEOUT, default 60)
- map-memory-recall.py (SessionStart+UserPromptSubmit) -> build_recall, emits additionalContext
Each: recursion guard `if os.environ.get("MAP_INVOKED_BY"): sys.exit(0)` as the FIRST
main() statement (stops the finalizer's own claude -p from re-triggering memory hooks),
stdin parse, lazy import (src/ first, falls back to installed mapify_cli, ImportError->no-op),
single best-effort module call. Registered in scripts/lint-hooks.py REQUIRE_GUARD and both
doc tables (hooks/README.md + references/hook-patterns.md, all trees).
check-render green; lint-hooks 16 conform; test_hook_patterns 49 passed; full suite 2029 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
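The shim structure described above (guard first, stdin parse, lazy import, single best-effort call) might look roughly like this. It is a sketch, not the rendered template: `main` returns an exit code instead of calling `sys.exit(0)` so it is easy to test, and passing the parsed payload and the working directory to `append_turn` is an assumption about its arguments.

```python
import json
import os
import sys
from typing import Mapping, TextIO


def main(environ: Mapping[str, str] = os.environ, stdin: TextIO = sys.stdin) -> int:
    # Recursion guard comes first: the finalizer's own `claude -p` must not
    # re-trigger the memory hooks.
    if environ.get("MAP_INVOKED_BY"):
        return 0
    try:
        payload = json.loads(stdin.read() or "{}")
    except ValueError:
        payload = {}  # malformed stdin is tolerated
    try:
        from mapify_cli.memory import capture  # lazy import
    except ImportError:
        return 0  # package unavailable: hook becomes a no-op
    try:
        capture.append_turn(payload, os.getcwd())  # assumed argument shapes
    except Exception:
        pass  # best-effort: a hook failure must not break the session
    return 0
```

A real shim would finish with `sys.exit(main())` under the usual `__main__` check.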
…gitignore
Wires the memory runtime surface:
- settings.json (.jinja + hand-maintained .claude copy, shipped-only): Stop->capture;
SessionStart->finalize THEN recall (order load-bearing, INV-3); UserPromptSubmit->recall;
new SessionEnd->endmark.
- New map-memory-now skill (SKILL.md.jinja): on-demand finalize / --finalize-all sweep
via finalize_dirty(None). skill-rules entry: skillClass=task, requires-cmd=[claude,git],
direct-invocation triggers. Host gate prunes it when claude absent (EC-4) — new
TestMapMemoryNowHostGate covers it.
- .gitignore: Phase-E block ignoring .map/*/sessions/scratch/ + documented
MAP_MEMORY_COMMIT_DIGESTS=0 opt-out; new templates_src/.gitignore.jinja rendered to
templates/.gitignore (".gitignore" added to renderer _CLAUDE_SHIPPED_ONLY).
- Tests: skill-count 14->15; VC1/VC3 file_copier updated (map-state + map-memory-now both
require git -> 2 skips); 4 memory-hook smoke cases.
render + check-render green; full suite 2036 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tests/test_memory_integration.py: drives the real hook BINARIES as subprocesses
(capture×2 → finalize[new sid, NO SessionEnd] → recall) with a fake `claude` executable
injected on PATH (mocks claude -p; unconditional, no skipif).
Asserts:
- 2 turn records → exactly one digest .md containing the mocked body
- .finalized marker + scratch deleted
- memory-cost.log with input_tokens (VC4)
- recall stdout additionalContext contains the digest body
Strips MAP_INVOKED_BY from the subprocess env so the recursion guard cannot silently
no-op the hooks (proves the pipeline really ran).
No token_accounting.json assertion (Decision 9 descope).
make check (lint + test + check-render) green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final-verifier flagged a wording inaccuracy in the map-memory-now SKILL.md Notes:
finalize_dirty never stages or commits in any mode (there is no git add/commit in the
memory modules). Reword the MAP_MEMORY_COMMIT_DIGESTS=0 note: digests are committed only
because they are not git-ignored; the opt-out is uncommenting the .gitignore line.
Re-rendered; check-render green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… shebang
The 4 map-memory-* hooks shipped without the executable bit, so Claude Code's
direct shebang invocation ("$CLAUDE_PROJECT_DIR"/.claude/hooks/<name>.py) failed
at runtime with "/bin/sh: ... Permission denied" (Stop/SessionStart/UserPromptSubmit).
The python-based smoke/integration tests invoked hooks as `python3 <path>`, which
masked the missing bit.
Root fix (defense in depth):
- chmod +x the 4 hook .jinja sources (matches the existing convention, e.g.
context-meter.py.jinja; git tracks the bit and the renderer propagates it).
- Harden template_renderer._atomic_write_file: FORCE +x for .py/.sh under a
managed hooks/ dir regardless of source bit, so a hook .jinja that forgets the
executable bit still ships an executable hook (mirrors create_hook_files'
unconditional chmod on the install path). Implements the learned "preserve
executable bits after atomic temp-file writer" rule.
CI guards (test hooks the way the harness does):
- tests/test_hook_patterns.py::test_hook_is_executable — every .py/.sh hook in
all four trees (.claude, .codex, templates, templates/codex) must be X_OK.
- tests/hooks/test_hook_inventory_smoke.py::test_every_configured_hook_execs_via_shebang
— exec each settings.json-wired hook via its bare shebang path (no interpreter
prefix) and assert no PermissionError / 126 / 127.
- tests/test_template_render.py: a hook .jinja without +x still renders executable;
non-hook files are not force-marked.
Negative-proofed: removing +x makes both guard tests fail; re-render restores it.
make check (lint + test + check-render) green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
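The executable-bit guard and the force-`+x` hardening can be sketched with two small helpers. Both names are hypothetical; the real checks live in `tests/test_hook_patterns.py` and `template_renderer._atomic_write_file`, whose details are not shown here.

```python
import os
import stat
from pathlib import Path
from typing import List


def non_executable_hooks(hooks_dir: Path) -> List[Path]:
    """Return .py/.sh hooks that the harness could not exec via shebang."""
    return sorted(
        p for p in hooks_dir.iterdir()
        if p.suffix in {".py", ".sh"} and not os.access(p, os.X_OK)
    )


def force_executable(path: Path) -> None:
    """Add the execute bits regardless of the source file's mode."""
    mode = path.stat().st_mode
    path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

A guard test would then assert that `non_executable_hooks(...)` is empty for each managed hooks directory.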
…de absent)
TestVC1MissingDepSkip and TestMapMemoryNowHostGate delegated non-target commands to the
REAL requires-cmd checker. map-memory-now requires-cmd:[claude, git]; on CI runners
`claude` is absent, so map-memory-now skipped on `claude` (not `git`), flipping the skip
message and failing the assertion, while passing locally where `claude` is installed.
Force the patched checker deterministically (target command absent, all others present)
instead of delegating to PATH. Verified with the whole test file run under a
claude-stripped PATH.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…subsystem
Correctness:
- finalize: slug disambiguation reserved suffix room before 32-char truncation so
os.replace can no longer clobber another session's digest (#1)
- finalize: derive slug from the `title` key and strip ```json fences in
_parse_claude_output so decisions/findings survive fenced output (#3)
- finalize: per-field redaction before YAML escaping; identifier fields
(session_id/branch/date/slug) excluded so a long session_id is no longer rewritten to
«redacted», keeping owner-line dedup working (#4)
- recall: cap is rank-monotonic: break on first overflow so a lower-ranked smaller digest
never jumps a dropped higher-ranked one (#5)
- digest_schema: redact fine-grained github_pat_ tokens (#6); stop over-redacting pure-hex
git SHAs while still catching mixed-case secrets (#7)
- capture: advance the transcript <sid>.offset only AFTER the record write so a crash
never skips a transcript range (#8)
- capture: derive a fallback session id from the transcript stem instead of a shared
unknown.jsonl bucket (#9)
Behavior/perf:
- settings: restore the UserPromptSubmit map-memory-recall registration so
prompt-relevance ranking actually runs (#2)
- capture: memoise _resolve_branch and tail-read the turn count (O(1)) on the hot path;
finalize hook subprocess default timeout 60->50 so it stays below the 60s harness timeout
and its cleanup runs (#10)
Adds tests/test_memory_review_fixes.py (11 regression tests, one per finding).
Full gate green: mypy, pyright src/ 0/0/0, lint-hooks, check-render, 2082 tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
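Finding #1 (reserve suffix room before truncating) can be illustrated with a short sketch. The helper name `unique_slug` and the `-2`, `-3`, ... suffix scheme are assumptions; only the 32-character limit comes from the commit message.

```python
from typing import Set


def unique_slug(base: str, taken: Set[str], max_len: int = 32) -> str:
    """Return a slug of at most max_len chars that is not already taken."""
    candidate = base[:max_len]
    n = 2
    while candidate in taken:
        suffix = f"-{n}"
        # Truncate the base first so the suffix always survives the length cap.
        candidate = base[: max_len - len(suffix)] + suffix
        n += 1
    return candidate
```

Appending the suffix and truncating afterwards would cut the suffix off again for long titles, which is how two sessions could end up with the same digest filename.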
Code-review fixes pushed (commit 901d991)
Resolved all 10 findings from a high-effort review of the memory subsystem, grouped under Correctness and Behavior/perf.
🤖 Generated with Claude Code
Phase E — Cross-Session Memory + Recall
Gives the MAP Framework durable cross-session memory: cheap per-turn capture into a scratch WAL, lazy LLM "digest" finalization on the next session start, and recall injection of relevant past digests — so a later session knows why an approach was chosen.
Architecture: write-ahead-log → lazy checkpoint (NOT flush-on-SessionEnd).
`Stop` is the only reliable durable-capture point; `SessionStart` carries finalize + recall. Works with zero `SessionEnd` dependency (HC-2) and survives compaction (which `PreCompact` does not reliably fire on, esp. 1M-token contexts).
Built via `/map-efficient` (8 subtasks, sequential RESEARCH → ACTOR → MONITOR, per-subtask commits). Final-verifier: PASS.
What's in it
- `src/mapify_cli/memory/` (new pure-runtime package):
  - `digest_schema.py`: single-source field constants + redaction (sk-/gh_/base64/AKIA) + secret-path globs + control-char sanitizer (INV-7, Contract-First).
  - `capture.py`: LLM-free per-turn scratch WAL append (subprocess-free branch resolve via `.git/HEAD`), `on_session_end` best-effort marker.
  - `finalize.py`: finalize-if-dirty: `claude -p` (argv form, `MAP_INVOKED_BY=memory-finalize` recursion guard, hard timeout), atomic `.md.tmp → os.replace → .finalized`, per-branch flock, truncation-tolerant, empty→no-digest, `memory-cost.log`.
  - `recall.py`: current-branch keyword+recency ranking, `MAP_MEMORY_RECALL_CAP` (default 4000 chars) with whole-digest drops → `recall-drop.log`, sanitized `additionalContext`.
- … `map-memory-{capture,finalize,recall,endmark}.py`) authored as `.jinja`, registered in `settings.json` (Stop / SessionStart finalize→recall / SessionEnd / UserPromptSubmit), `lint-hooks.py`, and both hook doc tables.
- `map-memory-now` skill: on-demand finalize / `--finalize-all`, `requires-cmd: [claude, git]`, host-gate pruned when `claude` absent (EC-4).
- `.gitignore`: ignores the scratch WAL; documents the `MAP_MEMORY_COMMIT_DIGESTS=0` opt-out.
- … `+x` (the harness execs them via shebang); renderer force-sets `+x` for hook `.py`/`.sh`, plus CI guards that exec hooks the way the harness does (shebang, not `python3 <path>`).
Tests / gates
- … `claude` on PATH; asserts the digest surfaces in `additionalContext` and `memory-cost.log` is written).
- `make check` (ruff + mypy + pyright + full pytest + `check-render`) green; full suite 2037 passed.
Notes
- `token_accounting.json` is intentionally deferred (Decision 9); finalize writes `memory-cost.log` instead.
- … `templates_src/**/*.jinja` → `make render-templates`; `check-render` enforced).
🤖 Generated with Claude Code