feat(exposure): route student_public through snapshot-safe export; bundle schema v5 (PR 2.2)#64
Conversation
…ot-safe export; bundle schema v5 (PR 2.2) Wire the structural fix produced by PR 2.1 into the writer and validator so alpha public bundles stop leaking ``converted_within_90_days`` via joins. Bumps ``BUNDLE_SCHEMA_VERSION`` to ``"5"`` and adds the new self-describing ``relational_snapshot_safe`` flag on the manifest. Writer (``leadforge/api/bundle.py``) - Calls ``to_dataframes_snapshot_safe(dfs, snapshot_day=...)`` for ``student_public`` after ``to_dataframes(...)``; ``research_instructor`` keeps the full-horizon export. ``BANNED_LEAD_COLUMNS`` / ``BANNED_OPP_COLUMNS`` and feature-level redaction operate on disjoint columns, so they neither double-emit nor overlap. Manifest row counts are computed on the post-redaction dict. Raises ``ValueError`` if ``snapshot_day=None`` for a snapshot-safe mode. Filter / manifest - ``BundleFilter`` gains ``relational_snapshot_safe`` (student_public True; research_instructor False); the writer reads this from ``get_filter(mode)``. - ``manifest.relational_snapshot_safe`` recorded so a tool reading a v5 bundle can tell from the manifest alone whether ``tables/`` is the snapshot-safe (public) or full-horizon (instructor) shape. Validator - ``validate_bundle`` now runs ``run_all_probes`` on student_public bundles only (instructor retains hidden truth by design); bonus probe stays off (PR 3.3 will calibrate per-tier bands). Findings are rendered as one error string per finding through the existing list-of- strings contract. - ``_check_fk_integrity`` silently skips FK constraints involving the expected-absent ``BANNED_TABLES`` for snapshot-safe bundles. - ``check_exposure_monotonicity`` now allows: customers/subscriptions in instructor only; ``BANNED_LEAD_COLUMNS`` / ``BANNED_OPP_COLUMNS`` on instructor's leads/opportunities; event-table row-subsets in the snapshot-filtered tables (``touches`` / ``sessions`` / ``sales_activities`` / ``opportunities``). Tests - Renamed v4 contract test → ``test_bundle_schema_v5_contract.py`` and refreshed the pinned column / table sets for the new shape. - New ``tests/test_pr_2_2_integration.py``: round-trip both modes validate clean, manifest contract (v5 + flag + post-redaction row counts), tampered-public negative case (instructor copy moved into the public slot must trip banned-column / banned-table / join-reconstruction channels), and pinned-timestamp hash determinism for both modes. - ``tests/render/test_render.py``: ``_make_config`` pins ``snapshot_day=30`` to match the writer contract. All 1037 tests pass; ruff + mypy clean.
Records v5 schema bump, snapshot-safe routing through the writer + validator, regenerated public bundles (3 tiers), instructor bundle keeps full-horizon, regression gate (``probe_relational_leakage --max-accuracy 0.65``) flips to exit 0 on all public tiers, and 67/67 hash-determinism preserved.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Routes student_public bundle exports through the new snapshot-safe relational projection, updates bundle manifests to schema v5, and expands validation/tests so public bundles are checked for relational leakage while instructor bundles keep the full-horizon shape.
Changes:
- Wire
student_publicwrites through snapshot-safe relational export and record the newrelational_snapshot_safemanifest flag. - Extend bundle validation and exposure monotonicity checks for banned tables/columns and snapshot-filtered relational tables.
- Replace the v4 schema contract test with a v5 contract suite and add end-to-end integration coverage for both exposure modes.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
tests/test_pr_2_2_integration.py |
Adds end-to-end tests for public/instructor bundle generation, validation, tampering detection, and hash determinism. |
tests/render/test_render.py |
Updates render test helper config to pin snapshot_day=30. |
tests/render/test_bundle_schema_v5_contract.py |
Adds the new v5 schema contract assertions for manifests, tables, and columns. |
tests/render/test_bundle_schema_v4_contract.py |
Removes the obsolete v4 contract test. |
leadforge/validation/invariants.py |
Teaches exposure monotonicity checks about banned tables/columns and snapshot-filtered row subsets. |
leadforge/validation/bundle_checks.py |
Adds relational leakage validation and snapshot-safe FK-skip handling. |
leadforge/render/manifests.py |
Bumps manifest schema to v5 and emits relational_snapshot_safe. |
leadforge/exposure/filters.py |
Adds per-mode relational_snapshot_safe policy to bundle filters. |
leadforge/api/bundle.py |
Routes public bundle table writes through to_dataframes_snapshot_safe and threads the manifest flag. |
.agent-plan.md |
Marks PR 2.2 plan items complete and records shipped behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…tonicity, test cleanup
Self-review pass on PR 2.2 surfaced several rough edges; this commit
addresses them.
Manifest self-description (Fix B)
- ``manifest.structural_redactions`` enumerates the table-level drops
applied by the snapshot-safe export (``columns: {leads, opportunities}``
+ ``omitted_tables``). Together with ``relational_snapshot_safe`` and
``redacted_columns`` (snapshot-feature redactions), the manifest now
fully describes what the writer dropped without consumers needing to
read package internals. v5 contract test extended.
Validator coverage when snapshot_day missing (Fix C)
- ``_check_relational_leakage`` no longer skips every probe when
``manifest.snapshot_day`` is missing or malformed. The structural
probes (banned columns / banned tables / deterministic-reconstruction)
do not depend on snapshot_day and now run in that case; only the
snapshot-window probe is skipped, with an explicit ``snapshot_window``-
channel finding so the gap is visible.
Robust monotonicity row-subset (Fix D)
- ``check_exposure_monotonicity`` now verifies the snapshot-filtered
row-subset relationship via primary keys (``touch_id`` / ``session_id``
/ ``activity_id`` / ``opportunity_id``) instead of an all-column
left-merge. PK-set inclusion is cheaper, correct under NaN, and
reads better.
Test cleanup (Fix E + F + G)
- ``tests/test_pr_2_2_integration.py`` →
``tests/integration/test_snapshot_safe_bundle.py`` (no PR-named
files; future-readable location).
- Negative-case test now asserts the *details* of the findings (every
banned column / table is named in the report; path B fires) rather
than just channel-set membership.
- Dropped the parametrized hash-determinism test from the integration
file; instead extended ``test_invariants.py::TestDeterminism`` with
a ``research_instructor`` case so both modes are covered in the
canonical home for determinism tests.
CI gate test
- ``test_cli_max_accuracy_gate_fails_on_alpha`` →
``test_cli_max_accuracy_gate_passes_on_v5_bundle``: PR 2.2 fixed the
leak the test was originally pointed at, so the regenerated bundle
must now exit 0 instead of 2.
Fix A (post-init check) considered and reverted
- A ``GenerationConfig.__post_init__`` check for
``student_public + snapshot_day=None`` would surface the
misconfiguration earlier, but its blast radius is too wide:
``ExposureMode`` defaults to ``student_public``, so any test that
builds a bare ``GenerationConfig()`` for purposes unrelated to
bundle writing (population / snapshot / threading tests) would
break. The writer-level ``ValueError`` already raises with a
clear message before any disk write — kept that, dropped the
post-init duplicate.
Verification
- 1033 tests pass; ruff + mypy clean.
- Bundles regenerated; hash determinism re-verified (67/67 files).
- Regression gate ``probe_relational_leakage.py --max-accuracy 0.65``
exits 0 on intro / intermediate / advanced.
This comment has been minimized.
This comment has been minimized.
…otonicity tests Addresses both Copilot review comments on PR #64. Strict-bool guard (bundle_checks.py) - ``_check_fk_integrity`` no longer coerces ``manifest.relational_snapshot_safe`` via ``bool(...)``: a JSON string ``"false"`` was truthy and would silently suppress FK errors on a corrupted manifest. Require an actual JSON boolean and surface a typed error otherwise; default to False (i.e. don't trust a malformed flag to expand the expected- absent set). - New test in tests/validation/test_bundle_checks.py asserting a ``"false"`` string trips the strict check. Focused monotonicity tests (test_invariants.py) - New TestExposureMonotonicitySnapshotSafe class with 9 unit tests using synthetic parquet bundles in tmp_path: * omitted BANNED_TABLES (customers / subscriptions) is allowed * non-banned extra instructor table is rejected * dropped BANNED_LEAD_COLUMNS / BANNED_OPP_COLUMNS are allowed * non-banned extra leads column is rejected * PK-based row-subset on touches is allowed (student PK ⊂ instructor PK) * student PK absent from instructor is rejected * student row count > instructor is rejected * dropped PK column is caught by the column-diff check (with a clear test name documenting which path is reached) The new tests exercise the snapshot-safe branches directly without running the full generator, so a regression in the table-comparison logic now fails immediately with a localised error. All 1043 tests pass; ruff + mypy clean.
|
pr-agent-context report: No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #64 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.Run metadata: |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if is_snapshot_filtered and len(s_df) != len(i_df): | ||
| # Row-subset by primary key. Each student row was derived | ||
| # from instructor by row-filtering, so the PK relationship | ||
| # is the strongest invariant we can assert without | ||
| # depending on column-by-column equality (which is fragile | ||
| # under NaN). | ||
| pk = snapshot_filtered_pks.get(table_name) | ||
| if pk is None or pk not in s_df.columns or pk not in i_df.columns: | ||
| errors.append( | ||
| f"Table {table}: snapshot-filtered table missing expected " | ||
| f"primary key {pk!r}; cannot verify row-subset" | ||
| ) | ||
| continue | ||
| student_pks = set(s_df[pk].tolist()) | ||
| instructor_pks = set(i_df[pk].tolist()) | ||
| orphans = student_pks - instructor_pks | ||
| if orphans: | ||
| sample = sorted(orphans)[:3] | ||
| errors.append( | ||
| f"Table {table}: {len(orphans)} student {pk}(s) absent from instructor, " | ||
| f"e.g. {sample} (snapshot-safe export must be a row-subset)" | ||
| ) |
| "relational_snapshot_safe": bool(relational_snapshot_safe), | ||
| "structural_redactions": _build_structural_redactions(bool(relational_snapshot_safe)), |
Summary
converted_within_90_daysvia joins. Bundle schema bumps4 → 5; manifest gains the self-describingrelational_snapshot_safe: boolflag.student_publicwrites route throughto_dataframes_snapshot_safe(dfs, snapshot_day=…), droppingBANNED_LEAD_COLUMNS/BANNED_OPP_COLUMNS, omittingcustomers/subscriptions, and filtering event tables per-lead tolead_created_at + snapshot_day.research_instructorkeeps the full-horizon export.validate_bundle()now runsrun_all_probeson public bundles (bonus probe stays off — PR 3.3 calibrates) and surfaces findings through the existing list-of-strings contract; CLI output showsRelational leakage [channel] detail: messageper finding.Acceptance evidence
python scripts/probe_relational_leakage.py release/{intro,intermediate,advanced} --max-accuracy 0.65exits 0 on every public tier (was exiting 2 on alpha bundles); all path prediction rates A-E = 0.000.python scripts/verify_hash_determinism.pypasses — 67/67 files identical across pinned-timestamp runs (down from 73/73 because public bundles dropcustomers.parquet+subscriptions.parquet× 3 tiers).release/intermediate_instructor/retains the full-horizon shape: 9 tables includingcustomers/subscriptions,relational_snapshot_safe = false,close_outcome/closed_at/converted_within_90_daysretained.bundle_schema_version = "5",relational_snapshot_safe = true,snapshot_day = 30, 7 tables (nocustomers/subscriptions),redacted_columns = [\"current_stage\", \"is_sql\"].What changed
Writer (
leadforge/api/bundle.py) — callsto_dataframes_snapshot_safe(dfs, snapshot_day=...)forstudent_publicafterto_dataframes(...). Feature-level redaction (current_stage/is_sql) andBANNED_*constants operate on disjoint columns — no double-emit, no overlap. Manifest row counts come from the post-redaction dict. RaisesValueErrorifsnapshot_day=Nonefor a snapshot-safe mode.leadforge/exposure/filters.py—BundleFiltergainsrelational_snapshot_safe(student_public True; research_instructor False). The writer reads it fromget_filter(mode).leadforge/render/manifests.py—BUNDLE_SCHEMA_VERSION\"4\" → \"5\";build_manifestaccepts and emits the new top-levelrelational_snapshot_safefield.leadforge/validation/bundle_checks.py— new_check_relational_leakagecallsrun_all_probes(bundle_dir, snapshot_day=…)onstudent_publicbundles only;research_instructorretains hidden truth by design and is skipped._check_fk_integritysilently skips FK constraints involvingBANNED_TABLESfor snapshot-safe bundles (those tables are intentionally absent).leadforge/validation/invariants.py—check_exposure_monotonicitynow allows: instructor-onlyBANNED_TABLES, instructor-onlyBANNED_LEAD_COLUMNS/BANNED_OPP_COLUMNS, and student row-subsets onSNAPSHOT_FILTERED_TABLES(touches / sessions / sales_activities / opportunities). Unmatched student rows are rejected via an indicator merge.Tests — renamed v4 contract test →
test_bundle_schema_v5_contract.pyand refreshed pinned column / table sets (15 tests). Newtests/test_pr_2_2_integration.py(17 tests): round-trip both modes validate clean, manifest contract, tampered-public negative case (instructor copy moved into public slot must trip banned-column / banned-table / join-reconstruction channels), and pinned-timestamp hash determinism for both modes.tests/render/test_render.py—_make_config()pinssnapshot_day=30to match the writer contract.Out of scope (later PRs)
leadforge.validation.leakage_probesmodule (PR 3.1).Test plan
pytest(1037 passed)ruff check/ruff format --check/mypycleanpython scripts/build_public_release.py→ 4 bundles, all validatepython scripts/verify_hash_determinism.py— PASS (67/67)python scripts/probe_relational_leakage.py release/{intro,intermediate,advanced} --max-accuracy 0.65— exit 0 on all three🤖 Generated with Claude Code