Commit 0d2fa7b

shaypal5 and claude committed
chore: add v7 CI validation job and update agent plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent e0cea4e commit 0d2fa7b

2 files changed

Lines changed: 60 additions & 0 deletions

.agent-plan.md

Lines changed: 32 additions & 0 deletions
@@ -162,6 +162,38 @@ Documentation + CI:
 - [x] `CHANGELOG.md` — `Unreleased` renamed to `v1.0.0 — (2026-05-02)`; milestone headings folded into collapsible development history
 - [x] `.agent-plan.md` — updated to reflect v1.0.0 release

+### v7: Purely causal leakage trap with canonical validation (PR #50)
+
+Engine changes:
+- [x] `leadforge/mechanisms/counts.py` — `LatentDecayIntensity` follow-up ramp: `followup_boost_after_day`, `followup_boost_factor`, `followup_ramp_days`, `followup_latent_weights` parameters; `_effective_boost(t)` and `_latent_multiplier(t, latents)` methods
+- [x] `leadforge/mechanisms/policies.py` — `_FOLLOWUP_LATENT_WEIGHTS` per motif family (budget_readiness, process_maturity, contact_authority); wired into `assign_mechanisms()` with `followup_boost_after_day=20, followup_boost_factor=10.0, followup_ramp_days=10`
+
+Build pipeline:
+- [x] `leadforge/pipelines/build_v7.py` — all pipeline functions (identical to v6 minus `boost_leakage_trap`); purely causal trap via `compute_post_snapshot_touches`
+- [x] `scripts/build_v7_snapshot.py` — CLI: generates both student + instructor CSVs
+- [x] `scripts/validate_v7_dataset.py` — validates both exports: basic checks, determinism, baseline AUC, tree improvement, value-aware ranking, trap delta (10 seeds), cohort split; honest thresholds for purely causal trap
+- [x] `scripts/quick_baseline_eval_v7.py` — LR + RF + GBM baselines, value-aware ranking, feature importance, trap detection
+
+Datasets:
+- [x] `lead_scoring_intro/lead_scoring_intro_v7.csv` — 1000 rows × 20 cols (student-safe, no leakage)
+- [x] `lead_scoring_intro/lead_scoring_intro_v7_instructor.csv` — 1000 rows × 21 cols (+ `__leakage__touches_post_snapshot_21_90`)
+
+Validation results:
+- [x] Baseline AUC: 0.625 (within [0.58, 0.90]; snapshot day 20)
+- [x] GBM improvement: +0.059 over LR (5-seed average)
+- [x] Trap delta: mean 0.0123, min 0.0048 (purely causal — no label injection, honest thresholds mean≥0.008, min≥0.002)
+- [x] Value-aware uplift: +38.3% at K=25
+- [x] Cohort split AUC drop: 0.066 (random 0.639 → cohort 0.573)
+- [x] All mandatory checks pass
+
+Documentation + CI:
+- [x] `lead_scoring_intro/RELEASE_v7.md` — column dictionary, missingness patterns, metrics, teaching guidance (4 lectures), trap evaluation
+- [x] `lead_scoring_intro/BACKGROUND_v7.md` — ProcureFlow business context for students (snapshot day 20, regions US/UK)
+- [x] `.github/workflows/ci.yml` — `validate-dataset-v7` job added
+- [x] `tests/scripts/test_build_v7_snapshot.py` — 32+ tests for pipeline functions
+- [x] `tests/mechanisms/test_mechanisms.py` — 9 new tests for follow-up ramp mechanism
+- [x] All 839 tests pass; lint + format clean
+
 ### Fix: direct conversion bypass for pre-SQL leads (PR #45, closes #44)

 - [x] `leadforge/simulation/engine.py` — added `_DIRECT_CONVERSION_STAGES` and `_DIRECT_CONVERSION_DISCOUNT` (0.01) constants; pre-SQL leads (`mql`, `sal`) now have a small daily probability of converting directly, bypassing the full funnel
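
Note on the follow-up ramp: the plan entry for `LatentDecayIntensity` names the parameters `followup_boost_after_day`, `followup_boost_factor`, and `followup_ramp_days` and a method `_effective_boost(t)`, but this commit does not show the implementation. The sketch below is one plausible reading only, assuming the boost rises linearly from 1.0 to the boost factor over the ramp window. The function name `effective_boost` and the linear shape are assumptions for illustration, not the code in `leadforge/mechanisms/counts.py`; the default values are the ones listed in the plan.

```python
def effective_boost(
    t: float,
    after_day: float = 20,      # followup_boost_after_day from the plan
    boost_factor: float = 10.0,  # followup_boost_factor from the plan
    ramp_days: float = 10,      # followup_ramp_days from the plan
) -> float:
    """Hypothetical linear ramp; the real _effective_boost may differ."""
    if t <= after_day:
        # No boost up to and including the snapshot day.
        return 1.0
    if ramp_days <= 0:
        # Degenerate ramp: jump straight to the full factor.
        return boost_factor
    # Fraction of the ramp completed, capped at 1.0 once the ramp ends.
    progress = min((t - after_day) / ramp_days, 1.0)
    return 1.0 + (boost_factor - 1.0) * progress


if __name__ == "__main__":
    for day in (10, 20, 25, 30, 90):
        print(day, effective_boost(day))
```

Under this assumed shape the multiplier stays at 1.0 through day 20, reaches the full factor at day 30, and is flat afterwards, which would concentrate the extra touches in the post-snapshot window that the instructor-only column `__leakage__touches_post_snapshot_21_90` covers.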

.github/workflows/ci.yml

Lines changed: 28 additions & 0 deletions
@@ -110,3 +110,31 @@ jobs:
       - name: Skip v6 (no dataset)
         if: steps.check-v6.outputs.found != 'true'
         run: echo "No v6 datasets found — skipping v6 validation"
+
+  validate-dataset-v7:
+    name: Validate v7 lead scoring dataset
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install -e ".[dev,scripts]"
+      - name: Check for v7 datasets
+        id: check-v7
+        run: |
+          STUDENT="lead_scoring_intro/lead_scoring_intro_v7.csv"
+          INSTRUCTOR="lead_scoring_intro/lead_scoring_intro_v7_instructor.csv"
+          if [ -f "$STUDENT" ] && [ -f "$INSTRUCTOR" ]; then
+            echo "found=true" >> "$GITHUB_OUTPUT"
+            echo "student=$STUDENT" >> "$GITHUB_OUTPUT"
+            echo "instructor=$INSTRUCTOR" >> "$GITHUB_OUTPUT"
+          else
+            echo "found=false" >> "$GITHUB_OUTPUT"
+          fi
+      - name: Run v7 validator
+        if: steps.check-v7.outputs.found == 'true'
+        run: python scripts/validate_v7_dataset.py "${{ steps.check-v7.outputs.student }}" "${{ steps.check-v7.outputs.instructor }}"
+      - name: Skip v7 (no dataset)
+        if: steps.check-v7.outputs.found != 'true'
+        run: echo "No v7 datasets found — skipping v7 validation"
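
Note on the `Check for v7 datasets` step: the job runs the validator only when both CSVs exist, and otherwise falls through to the skip step. The same gate can be reproduced locally before pushing. This is a minimal sketch, assuming it is run from the repository root; the helper name `check_v7_outputs` is hypothetical and not part of the repository, and it returns the `key=value` lines that the shell step appends to `$GITHUB_OUTPUT` rather than writing them anywhere.

```python
from pathlib import Path

# Paths copied from the workflow step.
STUDENT = "lead_scoring_intro/lead_scoring_intro_v7.csv"
INSTRUCTOR = "lead_scoring_intro/lead_scoring_intro_v7_instructor.csv"


def check_v7_outputs(root: str = ".") -> list[str]:
    """Return the key=value lines the CI step would emit (hypothetical helper)."""
    base = Path(root)
    # Both exports must be present, mirroring `[ -f ... ] && [ -f ... ]`.
    if (base / STUDENT).is_file() and (base / INSTRUCTOR).is_file():
        return ["found=true", f"student={STUDENT}", f"instructor={INSTRUCTOR}"]
    return ["found=false"]


if __name__ == "__main__":
    print("\n".join(check_v7_outputs()))
```

If the output is `found=false`, the CI job will report success without validating anything, so a missing or renamed export will not fail the build.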
