chore: add v7 CI validation job and update agent plan

shaypal5 · claude · shaypal5 · commit 0d2fa7ba522c · 2026-05-03T16:47:12.000+03:00
Co-Authored-By: Claude Opus 4.6 &lt;noreply@anthropic.com&gt;
diff --git a/.agent-plan.md b/.agent-plan.md
@@ -162,6 +162,38 @@ Documentation + CI:
 - [x] `CHANGELOG.md` — `Unreleased` renamed to `v1.0.0 — (2026-05-02)`; milestone headings folded into collapsible development history
 - [x] `.agent-plan.md` — updated to reflect v1.0.0 release
 
+### v7: Purely causal leakage trap with canonical validation (PR #50)
+
+Engine changes:
+- [x] `leadforge/mechanisms/counts.py` — `LatentDecayIntensity` follow-up ramp: `followup_boost_after_day`, `followup_boost_factor`, `followup_ramp_days`, `followup_latent_weights` parameters; `_effective_boost(t)` and `_latent_multiplier(t, latents)` methods
+- [x] `leadforge/mechanisms/policies.py` — `_FOLLOWUP_LATENT_WEIGHTS` per motif family (budget_readiness, process_maturity, contact_authority); wired into `assign_mechanisms()` with `followup_boost_after_day=20, followup_boost_factor=10.0, followup_ramp_days=10`
+
+Build pipeline:
+- [x] `leadforge/pipelines/build_v7.py` — all pipeline functions (identical to v6 minus `boost_leakage_trap`); purely causal trap via `compute_post_snapshot_touches`
+- [x] `scripts/build_v7_snapshot.py` — CLI: generates both student + instructor CSVs
+- [x] `scripts/validate_v7_dataset.py` — validates both exports: basic checks, determinism, baseline AUC, tree improvement, value-aware ranking, trap delta (10 seeds), cohort split; honest thresholds for purely causal trap
+- [x] `scripts/quick_baseline_eval_v7.py` — LR + RF + GBM baselines, value-aware ranking, feature importance, trap detection
+
+Datasets:
+- [x] `lead_scoring_intro/lead_scoring_intro_v7.csv` — 1000 rows × 20 cols (student-safe, no leakage)
+- [x] `lead_scoring_intro/lead_scoring_intro_v7_instructor.csv` — 1000 rows × 21 cols (+ `__leakage__touches_post_snapshot_21_90`)
+
+Validation results:
+- [x] Baseline AUC: 0.625 (within [0.58, 0.90]; snapshot day 20)
+- [x] GBM improvement: +0.059 over LR (5-seed average)
+- [x] Trap delta: mean 0.0123, min 0.0048 (purely causal — no label injection, honest thresholds mean≥0.008, min≥0.002)
+- [x] Value-aware uplift: +38.3% at K=25
+- [x] Cohort split AUC drop: 0.066 (random 0.639 → cohort 0.573)
+- [x] All mandatory checks pass
+
+Documentation + CI:
+- [x] `lead_scoring_intro/RELEASE_v7.md` — column dictionary, missingness patterns, metrics, teaching guidance (4 lectures), trap evaluation
+- [x] `lead_scoring_intro/BACKGROUND_v7.md` — ProcureFlow business context for students (snapshot day 20, regions US/UK)
+- [x] `.github/workflows/ci.yml` — `validate-dataset-v7` job added
+- [x] `tests/scripts/test_build_v7_snapshot.py` — 32+ tests for pipeline functions
+- [x] `tests/mechanisms/test_mechanisms.py` — 9 new tests for follow-up ramp mechanism
+- [x] All 839 tests pass; lint + format clean
+
 ### Fix: direct conversion bypass for pre-SQL leads (PR #45, closes #44)
 
 - [x] `leadforge/simulation/engine.py` — added `_DIRECT_CONVERSION_STAGES` and `_DIRECT_CONVERSION_DISCOUNT` (0.01) constants; pre-SQL leads (`mql`, `sal`) now have a small daily probability of converting directly, bypassing the full funnel
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -110,3 +110,31 @@ jobs:
       - name: Skip v6 (no dataset)
         if: steps.check-v6.outputs.found != 'true'
         run: echo "No v6 datasets found — skipping v6 validation"
+
+  validate-dataset-v7:
+    name: Validate v7 lead scoring dataset
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install -e ".[dev,scripts]"
+      - name: Check for v7 datasets
+        id: check-v7
+        run: |
+          STUDENT="lead_scoring_intro/lead_scoring_intro_v7.csv"
+          INSTRUCTOR="lead_scoring_intro/lead_scoring_intro_v7_instructor.csv"
+          if [ -f "$STUDENT" ] && [ -f "$INSTRUCTOR" ]; then
+            echo "found=true" >> "$GITHUB_OUTPUT"
+            echo "student=$STUDENT" >> "$GITHUB_OUTPUT"
+            echo "instructor=$INSTRUCTOR" >> "$GITHUB_OUTPUT"
+          else
+            echo "found=false" >> "$GITHUB_OUTPUT"
+          fi
+      - name: Run v7 validator
+        if: steps.check-v7.outputs.found == 'true'
+        run: python scripts/validate_v7_dataset.py "${{ steps.check-v7.outputs.student }}" "${{ steps.check-v7.outputs.instructor }}"
+      - name: Skip v7 (no dataset)
+        if: steps.check-v7.outputs.found != 'true'
+        run: echo "No v7 datasets found — skipping v7 validation"