Skip to content

feat(exposure): route student_public through snapshot-safe export; bundle schema v5 (PR 2.2)#64

Merged
shaypal5 merged 4 commits into
mainfrom
feat/route-snapshot-safe-bundle-v5
May 5, 2026
Merged

feat(exposure): route student_public through snapshot-safe export; bundle schema v5 (PR 2.2)#64
shaypal5 merged 4 commits into
mainfrom
feat/route-snapshot-safe-bundle-v5

Conversation

@shaypal5

@shaypal5 shaypal5 commented May 5, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Wire the structural fix produced by PR 2.1 into the writer + validator so alpha public bundles stop leaking converted_within_90_days via joins. Bundle schema bumps 4 → 5; manifest gains the self-describing relational_snapshot_safe: bool flag.
  • student_public writes route through to_dataframes_snapshot_safe(dfs, snapshot_day=…), dropping BANNED_LEAD_COLUMNS / BANNED_OPP_COLUMNS, omitting customers / subscriptions, and filtering event tables per-lead to lead_created_at + snapshot_day. research_instructor keeps the full-horizon export.
  • validate_bundle() now runs run_all_probes on public bundles (bonus probe stays off — PR 3.3 calibrates) and surfaces findings through the existing list-of-strings contract; CLI output shows Relational leakage [channel] detail: message per finding.

Acceptance evidence

  • All 1037 tests pass; ruff + mypy clean.
  • python scripts/probe_relational_leakage.py release/{intro,intermediate,advanced} --max-accuracy 0.65 exits 0 on every public tier (was exiting 2 on alpha bundles); all path prediction rates A-E = 0.000.
  • python scripts/verify_hash_determinism.py passes — 67/67 files identical across pinned-timestamp runs (down from 73/73 because public bundles drop customers.parquet + subscriptions.parquet × 3 tiers).
  • release/intermediate_instructor/ retains the full-horizon shape: 9 tables including customers / subscriptions, relational_snapshot_safe = false, close_outcome / closed_at / converted_within_90_days retained.
  • Public manifests: bundle_schema_version = "5", relational_snapshot_safe = true, snapshot_day = 30, 7 tables (no customers / subscriptions), redacted_columns = [\"current_stage\", \"is_sql\"].

What changed

Writer (leadforge/api/bundle.py) — calls to_dataframes_snapshot_safe(dfs, snapshot_day=...) for student_public after to_dataframes(...). Feature-level redaction (current_stage / is_sql) and BANNED_* constants operate on disjoint columns — no double-emit, no overlap. Manifest row counts come from the post-redaction dict. Raises ValueError if snapshot_day=None for a snapshot-safe mode.

leadforge/exposure/filters.pyBundleFilter gains relational_snapshot_safe (student_public True; research_instructor False). The writer reads it from get_filter(mode).

leadforge/render/manifests.pyBUNDLE_SCHEMA_VERSION \"4\" → \"5\"; build_manifest accepts and emits the new top-level relational_snapshot_safe field.

leadforge/validation/bundle_checks.py — new _check_relational_leakage calls run_all_probes(bundle_dir, snapshot_day=…) on student_public bundles only; research_instructor retains hidden truth by design and is skipped. _check_fk_integrity silently skips FK constraints involving BANNED_TABLES for snapshot-safe bundles (those tables are intentionally absent).

leadforge/validation/invariants.pycheck_exposure_monotonicity now allows: instructor-only BANNED_TABLES, instructor-only BANNED_LEAD_COLUMNS / BANNED_OPP_COLUMNS, and student row-subsets on SNAPSHOT_FILTERED_TABLES (touches / sessions / sales_activities / opportunities). Unmatched student rows are rejected via an indicator merge.

Tests — renamed v4 contract test → test_bundle_schema_v5_contract.py and refreshed pinned column / table sets (15 tests). New tests/test_pr_2_2_integration.py (17 tests): round-trip both modes validate clean, manifest contract, tampered-public negative case (instructor copy moved into public slot must trip banned-column / banned-table / join-reconstruction channels), and pinned-timestamp hash determinism for both modes.

tests/render/test_render.py_make_config() pins snapshot_day=30 to match the writer contract.

Out of scope (later PRs)

  • Calibrated bonus-model band (PR 3.3).
  • Unified leadforge.validation.leakage_probes module (PR 3.1).
  • Release-grade validation report + figures (PR 3.2 / 3.3).

Test plan

  • pytest (1037 passed)
  • ruff check / ruff format --check / mypy clean
  • python scripts/build_public_release.py → 4 bundles, all validate
  • python scripts/verify_hash_determinism.py — PASS (67/67)
  • python scripts/probe_relational_leakage.py release/{intro,intermediate,advanced} --max-accuracy 0.65 — exit 0 on all three

🤖 Generated with Claude Code

shaypal5 added 2 commits May 5, 2026 22:25
…ot-safe export; bundle schema v5 (PR 2.2)

Wire the structural fix produced by PR 2.1 into the writer and validator
so alpha public bundles stop leaking ``converted_within_90_days`` via
joins.  Bumps ``BUNDLE_SCHEMA_VERSION`` to ``"5"`` and adds the new
self-describing ``relational_snapshot_safe`` flag on the manifest.

Writer (``leadforge/api/bundle.py``)
- Calls ``to_dataframes_snapshot_safe(dfs, snapshot_day=...)`` for
  ``student_public`` after ``to_dataframes(...)``; ``research_instructor``
  keeps the full-horizon export.  ``BANNED_LEAD_COLUMNS`` /
  ``BANNED_OPP_COLUMNS`` and feature-level redaction operate on disjoint
  columns, so they neither double-emit nor overlap.  Manifest row counts
  are computed on the post-redaction dict.  Raises ``ValueError`` if
  ``snapshot_day=None`` for a snapshot-safe mode.

Filter / manifest
- ``BundleFilter`` gains ``relational_snapshot_safe`` (student_public True;
  research_instructor False); the writer reads this from
  ``get_filter(mode)``.
- ``manifest.relational_snapshot_safe`` recorded so a tool reading a v5
  bundle can tell from the manifest alone whether ``tables/`` is the
  snapshot-safe (public) or full-horizon (instructor) shape.

Validator
- ``validate_bundle`` now runs ``run_all_probes`` on student_public
  bundles only (instructor retains hidden truth by design); bonus
  probe stays off (PR 3.3 will calibrate per-tier bands).  Findings are
  rendered as one error string per finding through the existing list-of-
  strings contract.
- ``_check_fk_integrity`` silently skips FK constraints involving the
  expected-absent ``BANNED_TABLES`` for snapshot-safe bundles.
- ``check_exposure_monotonicity`` now allows: customers/subscriptions
  in instructor only; ``BANNED_LEAD_COLUMNS`` / ``BANNED_OPP_COLUMNS``
  on instructor's leads/opportunities; event-table row-subsets in the
  snapshot-filtered tables (``touches`` / ``sessions`` /
  ``sales_activities`` / ``opportunities``).

Tests
- Renamed v4 contract test → ``test_bundle_schema_v5_contract.py`` and
  refreshed the pinned column / table sets for the new shape.
- New ``tests/test_pr_2_2_integration.py``: round-trip both modes
  validate clean, manifest contract (v5 + flag + post-redaction row
  counts), tampered-public negative case (instructor copy moved into
  the public slot must trip banned-column / banned-table /
  join-reconstruction channels), and pinned-timestamp hash determinism
  for both modes.
- ``tests/render/test_render.py``: ``_make_config`` pins
  ``snapshot_day=30`` to match the writer contract.

All 1037 tests pass; ruff + mypy clean.
Records v5 schema bump, snapshot-safe routing through the writer +
validator, regenerated public bundles (3 tiers), instructor bundle
keeps full-horizon, regression gate
(``probe_relational_leakage --max-accuracy 0.65``) flips to exit 0 on
all public tiers, and 67/67 hash-determinism preserved.
Copilot AI review requested due to automatic review settings May 5, 2026 19:44
@shaypal5 shaypal5 added type: feature New capability layer: exposure exposure/ truth-mode filtering labels May 5, 2026
@shaypal5 shaypal5 added layer: render render/ bundle and artifact output layer: validation validation/ invariants and checks labels May 5, 2026
@github-actions

This comment has been minimized.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Routes student_public bundle exports through the new snapshot-safe relational projection, updates bundle manifests to schema v5, and expands validation/tests so public bundles are checked for relational leakage while instructor bundles keep the full-horizon shape.

Changes:

  • Wire student_public writes through snapshot-safe relational export and record the new relational_snapshot_safe manifest flag.
  • Extend bundle validation and exposure monotonicity checks for banned tables/columns and snapshot-filtered relational tables.
  • Replace the v4 schema contract test with a v5 contract suite and add end-to-end integration coverage for both exposure modes.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/test_pr_2_2_integration.py Adds end-to-end tests for public/instructor bundle generation, validation, tampering detection, and hash determinism.
tests/render/test_render.py Updates render test helper config to pin snapshot_day=30.
tests/render/test_bundle_schema_v5_contract.py Adds the new v5 schema contract assertions for manifests, tables, and columns.
tests/render/test_bundle_schema_v4_contract.py Removes the obsolete v4 contract test.
leadforge/validation/invariants.py Teaches exposure monotonicity checks about banned tables/columns and snapshot-filtered row subsets.
leadforge/validation/bundle_checks.py Adds relational leakage validation and snapshot-safe FK-skip handling.
leadforge/render/manifests.py Bumps manifest schema to v5 and emits relational_snapshot_safe.
leadforge/exposure/filters.py Adds per-mode relational_snapshot_safe policy to bundle filters.
leadforge/api/bundle.py Routes public bundle table writes through to_dataframes_snapshot_safe and threads the manifest flag.
.agent-plan.md Marks PR 2.2 plan items complete and records shipped behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread leadforge/validation/bundle_checks.py Outdated
Comment thread leadforge/validation/invariants.py
…tonicity, test cleanup

Self-review pass on PR 2.2 surfaced several rough edges; this commit
addresses them.

Manifest self-description (Fix B)
- ``manifest.structural_redactions`` enumerates the table-level drops
  applied by the snapshot-safe export (``columns: {leads, opportunities}``
  + ``omitted_tables``).  Together with ``relational_snapshot_safe`` and
  ``redacted_columns`` (snapshot-feature redactions), the manifest now
  fully describes what the writer dropped without consumers needing to
  read package internals.  v5 contract test extended.

Validator coverage when snapshot_day missing (Fix C)
- ``_check_relational_leakage`` no longer skips every probe when
  ``manifest.snapshot_day`` is missing or malformed.  The structural
  probes (banned columns / banned tables / deterministic-reconstruction)
  do not depend on snapshot_day and now run in that case; only the
  snapshot-window probe is skipped, with an explicit ``snapshot_window``-
  channel finding so the gap is visible.

Robust monotonicity row-subset (Fix D)
- ``check_exposure_monotonicity`` now verifies the snapshot-filtered
  row-subset relationship via primary keys (``touch_id`` / ``session_id``
  / ``activity_id`` / ``opportunity_id``) instead of an all-column
  left-merge.  PK-set inclusion is cheaper, correct under NaN, and
  reads better.

Test cleanup (Fix E + F + G)
- ``tests/test_pr_2_2_integration.py`` →
  ``tests/integration/test_snapshot_safe_bundle.py`` (no PR-named
  files; future-readable location).
- Negative-case test now asserts the *details* of the findings (every
  banned column / table is named in the report; path B fires) rather
  than just channel-set membership.
- Dropped the parametrized hash-determinism test from the integration
  file; instead extended ``test_invariants.py::TestDeterminism`` with
  a ``research_instructor`` case so both modes are covered in the
  canonical home for determinism tests.

CI gate test
- ``test_cli_max_accuracy_gate_fails_on_alpha`` →
  ``test_cli_max_accuracy_gate_passes_on_v5_bundle``: PR 2.2 fixed the
  leak the test was originally pointed at, so the regenerated bundle
  must now exit 0 instead of 2.

Fix A (post-init check) considered and reverted
- A ``GenerationConfig.__post_init__`` check for
  ``student_public + snapshot_day=None`` would surface the
  misconfiguration earlier, but its blast radius is too wide:
  ``ExposureMode`` defaults to ``student_public``, so any test that
  builds a bare ``GenerationConfig()`` for purposes unrelated to
  bundle writing (population / snapshot / threading tests) would
  break.  The writer-level ``ValueError`` already raises with a
  clear message before any disk write — kept that, dropped the
  post-init duplicate.

Verification
- 1033 tests pass; ruff + mypy clean.
- Bundles regenerated; hash determinism re-verified (67/67 files).
- Regression gate ``probe_relational_leakage.py --max-accuracy 0.65``
  exits 0 on intro / intermediate / advanced.
@github-actions

This comment has been minimized.

…otonicity tests

Addresses both Copilot review comments on PR #64.

Strict-bool guard (bundle_checks.py)
- ``_check_fk_integrity`` no longer coerces ``manifest.relational_snapshot_safe``
  via ``bool(...)``: a JSON string ``"false"`` was truthy and would
  silently suppress FK errors on a corrupted manifest.  Require an
  actual JSON boolean and surface a typed error otherwise; default to
  False (i.e. don't trust a malformed flag to expand the expected-
  absent set).
- New test in tests/validation/test_bundle_checks.py asserting a
  ``"false"`` string trips the strict check.

Focused monotonicity tests (test_invariants.py)
- New TestExposureMonotonicitySnapshotSafe class with 9 unit tests
  using synthetic parquet bundles in tmp_path:
  * omitted BANNED_TABLES (customers / subscriptions) is allowed
  * non-banned extra instructor table is rejected
  * dropped BANNED_LEAD_COLUMNS / BANNED_OPP_COLUMNS are allowed
  * non-banned extra leads column is rejected
  * PK-based row-subset on touches is allowed (student PK ⊂ instructor PK)
  * student PK absent from instructor is rejected
  * student row count > instructor is rejected
  * dropped PK column is caught by the column-diff check (with
    a clear test name documenting which path is reached)

The new tests exercise the snapshot-safe branches directly without
running the full generator, so a regression in the table-comparison
logic now fails immediately with a localised error.

All 1043 tests pass; ruff + mypy clean.
Copilot AI review requested due to automatic review settings May 5, 2026 20:05
@github-actions

github-actions Bot commented May 5, 2026

Copy link
Copy Markdown

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #64 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25399429293 attempt 1
Comment timestamp: 2026-05-05T20:05:59.432048+00:00
PR head commit: e692b19d2de2182fa802c97da30df6ce84d480c5

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +287 to +308
if is_snapshot_filtered and len(s_df) != len(i_df):
# Row-subset by primary key. Each student row was derived
# from instructor by row-filtering, so the PK relationship
# is the strongest invariant we can assert without
# depending on column-by-column equality (which is fragile
# under NaN).
pk = snapshot_filtered_pks.get(table_name)
if pk is None or pk not in s_df.columns or pk not in i_df.columns:
errors.append(
f"Table {table}: snapshot-filtered table missing expected "
f"primary key {pk!r}; cannot verify row-subset"
)
continue
student_pks = set(s_df[pk].tolist())
instructor_pks = set(i_df[pk].tolist())
orphans = student_pks - instructor_pks
if orphans:
sample = sorted(orphans)[:3]
errors.append(
f"Table {table}: {len(orphans)} student {pk}(s) absent from instructor, "
f"e.g. {sample} (snapshot-safe export must be a row-subset)"
)
Comment on lines +145 to +146
"relational_snapshot_safe": bool(relational_snapshot_safe),
"structural_redactions": _build_structural_redactions(bool(relational_snapshot_safe)),
@shaypal5 shaypal5 merged commit bf9a105 into main May 5, 2026
12 checks passed
@shaypal5 shaypal5 deleted the feat/route-snapshot-safe-bundle-v5 branch May 5, 2026 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

layer: exposure exposure/ truth-mode filtering layer: render render/ bundle and artifact output layer: validation validation/ invariants and checks type: feature New capability

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants