
feat: context_intelligence base_path relocation (multiplexed-safe) #48

Open
colombod wants to merge 4 commits into main from feat/context-intelligence-base-path-relocation


@colombod colombod commented Jun 30, 2026

Collaborator

Make a relocated context_intelligence base_path tell the truth end-to-end (what the hook writes is what every reader finds) and kill the confident false positive where a reader latched onto Amplifier core's sessions/<id>/metadata.json and reported it as a context_intelligence capture.

Mechanism (multiplexed-safe by construction):

  • AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH is a read-only host/process INPUT. The bundle NEVER writes os.environ (zero assignments) — so concurrent sessions in one process cannot clobber each other.
  • A one-line in-bundle binding base_path: "${AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH:}" in behaviors/context-intelligence-logging.yaml carries relocation into the writer's config (config_resolver stays pure; fold-discipline gate green).
  • ONE shared canonicalizer (strip -> empty/relative -> default -> expanduser -> absolute-or-default, never cwd) applied identically by writer and readers.
  • ONE shared capture helper keyed on */sessions/*/context-intelligence/events.jsonl (fixed-shape; subsessions counted). The events.jsonl marker is the false-positive guard. discover.py and the workflow recipe agree on the same set.
  • discover.py returns DiskScanResult{root, root_exists, disk_only_ids, candidate_ids}; absent root is authoritative in the return (no silent zero).
  • Mandatory read-only consistency check in on_session_ready warns loudly if the env value and the resolved base_path disagree.
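
The canonicalization order in the list above can be sketched as follows. This is a minimal illustration, not the bundle's code: the function name `canonicalize_base_path` and the constant `DEFAULT_BASE_PATH` are hypothetical, and the sketch assumes `~` is expanded before the absolute-path check. The default root mirrors the `$HOME/.amplifier/projects` fallback used by the readers.

```python
import os
from pathlib import Path

# Assumed default, taken from the reader fallback "$HOME/.amplifier/projects".
DEFAULT_BASE_PATH = "~/.amplifier/projects"


def canonicalize_base_path(raw, default=DEFAULT_BASE_PATH):
    """Hypothetical canonicalizer: returns an absolute path, never a cwd-relative one."""
    default_abs = Path(os.path.expanduser(default))
    # Step 1: strip surrounding whitespace; None is treated like an empty value.
    value = (raw or "").strip()
    # Step 2: an empty value means "not relocated", so use the default root.
    if not value:
        return default_abs
    # Step 3: expand a leading "~" (assumption: this happens before the absolute check).
    expanded = Path(os.path.expanduser(value))
    # Step 4: a relative path would depend on the cwd, so fall back to the default.
    if not expanded.is_absolute():
        return default_abs
    return expanded
```

Calling the same function from both the writer and every reader is what keeps the two sides on one root; `canonicalize_base_path("relative/dir")` and `canonicalize_base_path("")` both resolve to the default rather than to anything under the current directory.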

Behaviorally proven in a Digital Twin (no skill pytest): relocation via the in-bundle binding, false-positive dead, reader convergence, multiplexed-safe invariant (zero env writes), and the consistency check firing — all PASS.
Fold-gate 3/3; discover tests pass.

Known residual: relocation granularity is per-process, not per-session (a host needing different roots per session in one process must use separate processes).
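
One way a host could work within the per-process limit is to give each child process its own environment. The sketch below is an assumption-laden illustration: `run_with_root` is a hypothetical helper, and the child is a stand-in that only echoes the variable rather than a real `amplifier run`.

```python
import os
import subprocess
import sys

ENV_VAR = "AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH"


def run_with_root(root):
    """Run a child process whose environment carries its own capture root."""
    # Copy the parent's environment; the parent's os.environ is never mutated.
    child_env = dict(os.environ)
    child_env[ENV_VAR] = root
    # Stand-in for launching a real host session: the child just prints the value.
    code = f"import os; print(os.environ['{ENV_VAR}'])"
    done = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()
```

Each call sees only the root it was given, and the parent process keeps whatever value (or absence of one) it started with.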

Follow-ups (deferred, intentionally not in this PR): the base-path drift-guard lint, the broad doc-narrative sweep, and the uploader workspace-derivation edge.

Behavioral proof (Digital Twin)

All rows PASSED in the behavioral validation matrix.

Supersedes #37

This PR delivers the base_path-relocation intent of #37 with all needed pieces (multiplexed-safe, behaviorally proven in a Digital Twin). Closing #37 in favor of it.

🤖 Generated with Amplifier

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>


⚠ Breaking change (public API)

context_intelligence.discover_sessions() return type changed:

  • Before: tuple[list[dict], list[str]], unpacked as (graph_sessions, disk_only_ids)
  • After: tuple[list[dict], DiskScanResult], unpacked as (graph_sessions, scan)

DiskScanResult is intentionally not list-like (it forces callers to branch on
scan.root_exists instead of mistaking an empty list for success). External callers must
migrate:

# before
sessions, disk_only = discover_sessions(client, ws, sessions_dir)
# after
sessions, scan = discover_sessions(client, ws, sessions_dir)
if not scan.root_exists:
    ...  # root absent — distinct from "found zero"
disk_only = scan.disk_only_ids

⚠ Truthiness/equality, not just unpacking. DiskScanResult defines no __bool__, so it is
always truthy and never == []. Legacy code that tested the old list directly changes
meaning silently — audit those sites, not only the unpacking line:

# before (old list)        ->  after (operate on .disk_only_ids)
if disk_only:              ->  if scan.disk_only_ids:
if disk_only == []:        ->  if not scan.disk_only_ids:
for sid in disk_only:      ->  for sid in scan.disk_only_ids:
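
The truthiness trap can be reproduced with a small stand-in. The class below is illustrative only: the field names come from this PR, but the field types, the use of a frozen dataclass, and the `summarize` helper are assumptions.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class DiskScanResult:
    """Illustrative stand-in; field names come from the PR, types are assumed."""

    root: Path
    root_exists: bool
    disk_only_ids: list = field(default_factory=list)
    candidate_ids: list = field(default_factory=list)


def summarize(scan):
    """Branch on root_exists first so an absent root is never read as 'found zero'."""
    if not scan.root_exists:
        return f"root absent: {scan.root}"
    if not scan.disk_only_ids:
        return "root present, no disk-only sessions"
    return f"{len(scan.disk_only_ids)} disk-only session(s)"
```

Because the class defines neither `__bool__` nor `__len__`, every instance is truthy, and comparing an instance with `[]` is always unequal. That is why a legacy `if disk_only:` check keeps running but no longer means what it did.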

Review fixes (addressing review of this PR)

  • Bidirectional consistency check — the §C.3 startup check now also fires when the env
    var is unset but the writer resolved a non-default base_path (i.e. relocation via
    config.base_path, which the env-only readers cannot see). Previously gated if _env_raw:,
    so that silent split went unwarned.
  • Unexpanded-placeholder guard — the hook resolver now treats a literal ${...}
    base_path (host app did not expand the binding) as default silently instead of warning
    every session as a bogus relative path; a genuinely-intended relocation still warns LOUD via
    the consistency check. The behavior YAML documents the app-${VAR:}-expansion requirement.
  • Recipe workspace filter: workflow-pattern-analysis.yaml now matches the workspace
    field (falling back from the fast slug path), so explicit-workspace setups are no longer
    silently missed.
  • Marker alignment — navigation skill + navigator agent clarify events.jsonl is the
    canonical capture marker (matching the Python readers); metadata.json is for field reads.
  • Relocation docs — skills/agent now state relocation is reader-visible only via
    AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH; config.base_path is writer-side.
  • Dead code — removed unused capture_paths_under_base().
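
The bidirectional check and the placeholder guard can be sketched together. This is a hedged reconstruction, not the bundle's code: `resolve_root`, `roots_disagree`, and the regular expression are hypothetical, and the sketch assumes readers see only the environment variable while the writer sees its resolved config value.

```python
import os
import re
from pathlib import Path

DEFAULT_ROOT = Path(os.path.expanduser("~/.amplifier/projects"))
# Matches a value that is nothing but an unexpanded ${...} placeholder.
_PLACEHOLDER = re.compile(r"^\$\{[^}]*\}$")


def resolve_root(raw):
    """Assumed resolution: empty, unexpanded, or relative values mean the default root."""
    value = (raw or "").strip()
    if not value or _PLACEHOLDER.match(value):
        return DEFAULT_ROOT
    path = Path(os.path.expanduser(value))
    return path if path.is_absolute() else DEFAULT_ROOT


def roots_disagree(env_raw, writer_base_path):
    """True when env-only readers and the writer would use different roots."""
    reader_root = resolve_root(env_raw)  # readers see only the env var
    writer_root = resolve_root(writer_base_path)  # writer sees its resolved config
    return reader_root != writer_root
```

Comparing resolved roots rather than raw strings is what makes the check fire in both directions: an unset env var paired with a relocated `config.base_path` disagrees, while an unexpanded `${...}` on the writer side resolves to the default and stays quiet.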

Council remediations (six-lens review)

A six-lens council review (intent / simplicity / cost / reality / user / breaker) was run on
the review-fix changeset. It returned no blockers; the items below close its recorded asks.

  • Consistency check is test-pinned and proven firing. The divergence comparison was
    extracted into a pure, unit-testable helper reader_writer_roots_disagree, and the writer ≡
    reader canonicalizer (two copies, kept separate by the fold gate) is pinned by
    tests/test_base_path_parity.py. A new end-to-end suite drives the real on_session_ready
    against the real amplifier_core runtime and asserts the LOUD divergence warning actually
    fires (env unset + config.base_path relocated), stays silent when env matches the writer,
    and silent-defaults on an unexpanded ${...}.
  • Operator-visible confirmation (positive feedback at the moment of action). When relocation
    is in effect, on_session_ready now logs (INFO — the bundle's default level) the active
    capture root: capturing to <root> (readers resolve the same root from AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH). Relocation is per-process, not per-session. It stays
    silent in the default (non-relocated) case. This closes the "success and silent
    misconfiguration look identical" gap; the per-process limitation is surfaced at the
    behavior-YAML touchpoint.
  • Reader-marker split-brain killed in code. The bash readers (navigation skill, navigator
    agent, workflow-pattern-analysis skill) now enumerate captures by the canonical
    events.jsonl marker (matching the Python readers) and read fields from the sibling
    metadata.json. Previously they enumerated by metadata.json, which disagreed with the code
    on partial-write edges.
  • Recipe robustness. The workspace-field fallback bounds its readline (1 MB) and guards
    isinstance(dict) so a malformed/over-long first line can neither blow up memory nor silently
    drop a session.
  • Cross-reference comments link the two by-design-duplicated canonicalizer copies and the
    parity test, so a future edit to one cannot silently drift from the other.
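
The bounded read and the dict guard from the recipe-robustness item might look like the sketch below. It is an assumption-based illustration: `read_workspace` is a hypothetical name, and the sketch assumes the workspace value sits in a `workspace` key on the first JSON line of the file being read.

```python
import json

MAX_FIRST_LINE_BYTES = 1024 * 1024  # 1 MB cap on the first line


def read_workspace(path):
    """Return the 'workspace' value from the first JSON line, or None if unusable."""
    try:
        with open(path, "rb") as handle:
            # readline(n) reads at most n bytes, so a huge first line cannot exhaust memory.
            first = handle.readline(MAX_FIRST_LINE_BYTES)
    except OSError:
        return None
    try:
        record = json.loads(first.decode("utf-8", errors="replace"))
    except ValueError:
        # Covers malformed JSON, an empty file, and a line truncated by the cap.
        return None
    # A valid JSON line may still be a list or a string; only dicts carry fields.
    if not isinstance(record, dict):
        return None
    workspace = record.get("workspace")
    return workspace if isinstance(workspace, str) else None
```

Returning `None` instead of raising leaves the decision with the caller, which can then fall back to another matching strategy rather than dropping the session outright.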

Parked by the council's own recommendation: a full container DTU (the real-runtime harness met
the reality gate) and the CI-lane importorskip signal for the end-to-end suite (the core logic
is already pinned in the always-run root suite).

Verification: python_check clean; 747 root tests pass; 5 end-to-end consistency +
confirmation tests pass under the real amplifier_core runtime.

Make a relocated context_intelligence base_path tell the truth end-to-end —
what the hook writes is what every reader finds — and kill the confident
false-positive where a reader latched onto Amplifier core's
sessions/<id>/metadata.json (reported as a context_intelligence capture).

Mechanism (multiplexed-safe by construction):
- AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH is a read-only host/process INPUT.
  The bundle NEVER writes os.environ (zero assignments) — so concurrent
  sessions in one process cannot clobber each other.
- A one-line in-bundle binding base_path: "${AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH:}"
  in behaviors/context-intelligence-logging.yaml carries relocation into the
  writer's config (config_resolver stays pure; fold-discipline gate green).
- ONE shared canonicalizer (strip -> empty/relative -> default -> expanduser ->
  absolute-or-default, never cwd) applied identically by writer and readers.
- ONE shared capture helper keyed on */sessions/*/context-intelligence/events.jsonl
  (fixed-shape; subsessions counted). The events.jsonl marker is the
  false-positive guard. discover.py and the workflow recipe agree on the same set.
- discover.py returns DiskScanResult{root, root_exists, disk_only_ids,
  candidate_ids}; absent root is authoritative in the return (no silent zero).
- Mandatory read-only consistency check in on_session_ready warns loudly if the
  env value and the resolved base_path disagree.

Behaviorally proven in a Digital Twin (no skill pytest): relocation via the
in-bundle binding, false-positive dead, reader convergence, multiplexed-safe
invariant (zero env writes), and the consistency check firing — all PASS.
Fold-gate 3/3; discover tests pass.

Known residual: relocation granularity is per-process, not per-session (a host
needing different roots per session in one process must use separate processes).

Follow-ups (deferred, intentionally not in this PR): the base-path drift-guard
lint, the broad doc-narrative sweep, and the uploader workspace-derivation edge.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@colombod

Collaborator Author

Writer-side relocation — live E.3 proof (ROB gate)

The reader half is unit-proven (33 discover tests pass). The half that was not pinned by any test is the keystone this PR promises: the writer actually relocating through the in-bundle binding, which depends on the app-cli expanding ${AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH:} before the hook mounts — a cross-repo seam a bundle-internal test can't honestly touch. So I proved it live in a Digital Twin against this PR's head.

================================================================================
PR #48 — WRITER-SIDE base_path RELOCATION — LIVE E.3 EVIDENCE
================================================================================
Code under test : amplifier-bundle-context-intelligence @ PR#48 head
                  sha 13533482d52401c6e247efff053adeff311845fa
                  branch feat/context-intelligence-base-path-relocation
DTU instance    : ci-base-path-pr48-e3   (fresh; env-binding ONLY, no settings override)
Proof session   : 4dc1f9bb-91f0-4a39-ab45-47bc2c5d7369  (a real `amplifier run`)

PRE-CHECK — binding physically present in the composed bundle inside the DTU:
  behaviors/context-intelligence-logging.yaml:38
      base_path: "${AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH:}"

[1] ENV SET, NO settings.yaml override  -> PASS
  E3_SHELL_ENV=[/data/relocated-context-intelligence]
  INSIDE_ENV=[/data/relocated-context-intelligence]   (agent's own bash tool)
  Relocation is attributable to the binding alone — no settings override to hide behind.

[2] REAL `amplifier run` session in the demo-project  -> PASS
  Session 4dc1f9bb-...  | provider Anthropic claude-sonnet-4-6
  Final: e3-session-done.   AGENT_EXIT=0

[3] KEYSTONE — writer landed under RELOCATED root; default got nothing  -> PASS
  $ find /data/relocated-context-intelligence -path '*context-intelligence/events.jsonl'
  /data/relocated-context-intelligence/-root-work-demo-project/sessions/
      4dc1f9bb-91f0-4a39-ab45-47bc2c5d7369/context-intelligence/events.jsonl
  proof-id CI capture under RELOCATED : 1   (expect 1) OK
  proof-id CI capture under DEFAULT   : 0   (expect 0) OK
  (Note: Amplifier CORE's own session log still lands at the default root — that's core,
   not under test. The BUNDLE writer's context-intelligence/ segment is what relocated.)

[4] READER finds it at the RELOCATED root — two independent paths  -> PASS
  (A) discover.py (env-aware, DiskScanResult contract), env SET:
      RESOLVED_BASE_PATH=/data/relocated-context-intelligence
      ROOT_EXISTS=True  CANDIDATE_COUNT=1  DISK_ONLY_COUNT=1
      CANDIDATE=4dc1f9bb-...   <- matches the writer
      discover == recipe == helper  (all three agree)
      Negative control, env UNSET: RESOLVED=/root/.amplifier/projects, proof-id ABSENT
  (B) session-navigator skill glob (the exact agent fallback expression):
      CONTEXT_INTELLIGENCE_ROOT="${AMPLIFIER_CONTEXT_INTELLIGENCE_BASE_PATH:-$HOME/.amplifier/projects}"
       = /data/relocated-context-intelligence
      glob hit: .../4dc1f9bb-.../context-intelligence/events.jsonl   count=1 OK

[5] Consistency check (on_session_ready, section C.3) — env & writer agree => silent  -> PASS
      disagree_warning_matches = 0  OK

VERDICT: PASS (5/5). The cross-repo seam — app-cli expands the binding -> config_resolver
-> WRITER — is proven REAL against PR #48 head by a live `amplifier run`, env var as the
sole relocation driver. Reader follows the same env to the same root via two code paths.
================================================================================

Reproduce: amplifier-digital-twin exec ci-base-path-pr48-e3, then bash /root/run_e3.sh and the find commands above.
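
For readers without access to the Digital Twin, the marker-based enumeration used in steps [3] and [4] can be approximated in a few lines. This is a sketch under stated assumptions: `find_capture_ids` is a hypothetical name, and only the glob shape `*/sessions/*/context-intelligence/events.jsonl` is taken from this PR.

```python
from pathlib import Path

CAPTURE_GLOB = "*/sessions/*/context-intelligence/events.jsonl"


def find_capture_ids(root):
    """Return (root_exists, sorted session ids) for captures under root."""
    root = Path(root)
    if not root.is_dir():
        # An absent root is reported explicitly rather than as an empty result.
        return False, []
    # Walking up from the marker: events.jsonl -> context-intelligence -> <session id>.
    ids = {hit.parent.parent.name for hit in root.glob(CAPTURE_GLOB) if hit.is_file()}
    return True, sorted(ids)
```

A session directory that holds only Amplifier core's metadata.json does not match the glob, so it is never counted as a capture. An absent root comes back as `(False, [])`, which keeps it distinguishable from a present root with zero captures.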

Deferred follow-up (intentionally not in this PR): the durable drift-guard / seam regression that makes writer != reader root impossible rather than merely loud. A bundle-internal unit test can't honestly exercise this cross-repo seam, so the live DTU run is the gate; the drift-guard is the compounding next step.

colombod and others added 3 commits June 30, 2026 19:57
discover_sessions now returns (rows, DiskScanResult); scripts/context-intelligence.py
consumes scan.root_exists / scan.disk_only_ids. The cmd_reconstruct test mocks still
returned the old (rows, list) shape, raising AttributeError: 'list' object has no
attribute 'root_exists' — the real cause of the red root-test CI on PR #48.
Updated all 14 mock sites to return a root_exists=True DiskScanResult, preserving
disk-only ids in disk_only_ids/candidate_ids. tests/ now 729 passed, 0 failed.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…ouncil

Addresses a code review and a six-lens council review of the base_path
relocation work:

- Consistency check now fires BIDIRECTIONALLY (also when env unset but the
  writer relocated via config.base_path, which env-only readers can't see);
  extracted as a pure, unit-tested helper `reader_writer_roots_disagree`.
- Writer treats an unexpanded `${...}` base_path as silent default (no per-
  session noise); behavior YAML documents the host `${VAR:}`-expansion contract.
- Operator-visible INFO confirmation of the active capture root when relocation
  is in effect (closes the "silent misconfiguration looks like success" gap);
  per-process limitation surfaced at the behavior-YAML touchpoint.
- Unified bash readers (navigation skill, navigator agent, workflow skill) onto
  the canonical `events.jsonl` capture marker; fields read from sibling
  metadata.json — kills the events.jsonl/metadata.json reader split-brain.
- Recipe workspace filter matches the `workspace` field (not just dir slug),
  with a bounded readline (1MB) and dict guard.
- Removed dead `capture_paths_under_base`; cross-reference comments link the two
  by-design-duplicated canonicalizer copies.
- New tests: tests/test_base_path_parity.py (writer≡reader parity + divergence
  cases) and an end-to-end consistency/confirmation suite run against the real
  amplifier_core runtime.

Verification: python_check clean; 747 root tests pass; 5 end-to-end consistency
tests pass under real amplifier_core.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…urce_hash)

bundle.dot was stale after the base_path-relocation review + council changes.
Regenerated with the structural bundle_repo_dot() generator (matching the
committed convention, not LLM-enhanced) and re-rendered bundle.png; diff is the
updated source_hash plus token-count deltas from the added doc/comment content.
Clears the BUNDLE_DOT_STALE warning from the full validate-bundle-repo run.

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>