feat: v6 lead scoring dataset — causal trap, student/instructor split by shaypal5 · Pull Request #33 · leadforge-dev/leadforge

shaypal5 · 2026-04-30T20:01:04Z

Summary

v6 lead scoring intro dataset designed for 3–4 lectures on applied ML lead scoring
Causally-grounded leakage trap: post-snapshot touches (days 15–90) computed from simulator event timeline — no label-noise injection
Student/instructor split: student-safe CSV (20 cols, no leakage) + instructor CSV (21 cols, one __leakage__ column)
Engine enhancement: LatentDecayIntensity mechanism makes touch intensity depend on latent intent/fit traits (same traits that drive conversion)
Tree improvement: GBM reliably outperforms LR by +0.022 AUC due to nonlinear interactions
Value-aware ranking: EV ranking captures 17–41% more ACV than probability ranking
Cohort feature: acquisition_wave (A/B/C) for distribution-shift lecture
Momentum feature: touches_last_7_days added to snapshot builder
Soft ACV winsorization: avoids hard-clipping ties at $120k cap
Structured missingness: MAR (web_sessions by lead_source, seniority by partner), MCAR (expected_acv 2%), structural (no-touch NaNs)
737 tests pass (705 existing + 26 v6 pipeline + 6 LatentDecayIntensity)
CI: validate-dataset-v6 job added
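
The soft ACV winsorization item above can be illustrated with a minimal sketch. This is not the code in `leadforge/pipelines/build_v6.py`; the `soft_cap` helper, the `knee` value, and the exponential form are all assumptions chosen to show why a smooth cap avoids the ties that hard clipping creates at $120k.

```python
import math


def soft_cap(value, knee=100_000.0, cap=120_000.0):
    """Compress values above `knee` smoothly so they approach `cap` but never reach it.

    Hypothetical helper: `knee` and the exponential form are illustrative assumptions.
    """
    if value <= knee:
        return float(value)
    span = cap - knee
    # Slope is 1 at the knee and decays toward 0, so ordering is preserved.
    return knee + span * (1.0 - math.exp(-(value - knee) / span))


acv = [50_000, 110_000, 150_000, 300_000]
hard = [min(v, 120_000) for v in acv]  # hard clip: the two largest deals tie at the cap
soft = [soft_cap(v) for v in acv]      # soft cap: all four values stay distinct
```

With a hard clip, the two largest deals become indistinguishable; under the soft cap they remain ordered, which matters for any ranking that uses ACV.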

Deliverables

| # | Deliverable | Status |
|---|---|---|
| 1 | `lead_scoring_intro/lead_scoring_intro_v6.csv` | 1000 × 20, no leakage |
| 2 | `lead_scoring_intro/lead_scoring_intro_v6_instructor.csv` | 1000 × 21, one trap |
| 3 | `lead_scoring_intro/RELEASE_v6.md` | Column dictionary, metrics, teaching guide |
| 4 | `lead_scoring_intro/BACKGROUND_v6.md` | ProcureFlow business context |
| 5 | `scripts/build_v6_snapshot.py` | Build both CSVs |
| 6 | `scripts/validate_v6_dataset.py` | Full validation suite |
| 7 | `scripts/quick_baseline_eval_v6.py` | LR + RF + GBM baselines |
| 8 | CI v6 validation job | `.github/workflows/ci.yml` updated |
| 9 | Engine: `LatentDecayIntensity` | `leadforge/mechanisms/counts.py` |

Validation results

| Check | Result |
|---|---|
| Baseline LR AUC | 0.627 |
| GBM AUC (5-seed avg) | 0.680 (+0.022 vs LR) |
| Trap mean delta (10 seeds) | 0.046 (threshold: ≥0.03) |
| Trap min delta | 0.034 (threshold: ≥0.015) |
| EV uplift @25 | +17.6% |
| EV uplift @50 | +41.3% |
| All mandatory checks | PASS |
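
As a rough illustration of what an "EV uplift @K" figure measures, the sketch below compares ACV captured in the top K leads under two rankings: by predicted probability alone, and by expected value (probability × `expected_acv`). The toy leads, the `converted` field, the `captured_acv` helper, and the exact uplift definition are assumptions for illustration; the validator's own definition may differ.

```python
def captured_acv(leads, score, k):
    """Sum the ACV of converted leads among the top-k leads ranked by `score`."""
    top = sorted(leads, key=score, reverse=True)[:k]
    return sum(lead["expected_acv"] for lead in top if lead["converted"])


# Toy data (hypothetical): p is a model's predicted conversion probability.
leads = [
    {"p": 0.9, "expected_acv": 10_000, "converted": True},
    {"p": 0.8, "expected_acv": 12_000, "converted": True},
    {"p": 0.6, "expected_acv": 90_000, "converted": True},
    {"p": 0.5, "expected_acv": 80_000, "converted": False},
    {"p": 0.2, "expected_acv": 30_000, "converted": False},
]

k = 2
by_prob = captured_acv(leads, lambda lead: lead["p"], k)
by_ev = captured_acv(leads, lambda lead: lead["p"] * lead["expected_acv"], k)

# Relative uplift of value-aware ranking over probability ranking at this k.
uplift = (by_ev - by_prob) / by_prob
```

In this toy case the probability ranking picks two small, likely deals, while the EV ranking picks one large deal that converts and one that does not, and still captures more ACV.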

Test plan

pytest tests/ — 737 tests pass
ruff check . && ruff format --check . — clean
mypy leadforge/ — no issues
python scripts/validate_v6_dataset.py lead_scoring_intro/lead_scoring_intro_v6.csv lead_scoring_intro/lead_scoring_intro_v6_instructor.csv — all checks pass
CI validate-dataset-v6 job passes on PR

🤖 Generated with Claude Code

…, tree improvement

Introduce v6 of the lead scoring intro dataset with causally-grounded leakage detection, student/instructor export split, and validated tree model improvement for 3–4 lecture curriculum.

Engine changes:
- LatentDecayIntensity mechanism: Poisson intensity with recency decay + latent-trait modulation, creating causal link latent→touches→conversion
- assign_mechanisms(latent_touch_intensity=True) flag + passthrough via simulate_world() and Generator.generate()
- touches_last_7_days added to snapshot builder + feature spec

Dataset (lead_scoring_intro/):
- Student: 1000 rows × 20 cols (no leakage columns)
- Instructor: 1000 rows × 21 cols (+ __leakage__touches_post_snapshot_11_90)
- Day-14 snapshot, 30% conversion rate, soft ACV winsorization
- acquisition_wave cohort feature (A/B/C) for shift lecture
- Structured missingness: MAR (web_sessions, seniority), MCAR (expected_acv), structural (days_since_last_touch, days_since_first_touch)

Validation results:
- Baseline LR AUC: 0.627, GBM: 0.680 (+0.022 improvement)
- Trap delta: mean 0.046, min 0.034 (10 seeds, all above thresholds)
- Value-aware uplift: +17.6% at K=25, +41.3% at K=50
- All mandatory checks pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
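
The commit message describes `LatentDecayIntensity` as a Poisson intensity with recency decay and latent-trait modulation. The sketch below shows one way such a rate could be composed; it is not the implementation in `leadforge/mechanisms/counts.py`. The function names, the multiplicative form, and every parameter value (`base_rate`, `decay`, `w_intent`, `w_fit`) are assumptions.

```python
import math
import random


def touch_rate(day, intent, fit, base_rate=0.6, decay=0.05, w_intent=0.8, w_fit=0.4):
    """Expected touches on `day`: base rate x recency decay x latent-trait modulation."""
    recency = math.exp(-decay * day)                      # older leads get fewer touches
    latent = math.exp(w_intent * intent + w_fit * fit)    # higher intent/fit, more touches
    return base_rate * recency * latent


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)
# Daily touch counts over days 1..90 for a high-intent and a low-intent lead.
high_intent = [sample_poisson(touch_rate(d, intent=1.0, fit=0.5), rng) for d in range(1, 91)]
low_intent = [sample_poisson(touch_rate(d, intent=-1.0, fit=-0.5), rng) for d in range(1, 91)]
```

Because the same latent traits would also drive conversion, touch counts generated this way carry signal about the label, which is what makes post-snapshot touches a causally grounded leakage trap rather than injected noise.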

Copilot

Pull request overview

Adds the v6 “lead_scoring_intro” dataset pipeline and artifacts, including a latent-aware touch-intensity mechanism to support a causally grounded leakage trap and a student/instructor export split, plus validation scripts and CI coverage.

Changes:

Introduces LatentDecayIntensity and a latent_touch_intensity toggle threaded through generator → simulation → mechanism assignment.
Extends snapshot rendering and schema to include touches_last_7_days, and adds the v6 build pipeline + CLI scripts to generate/validate student + instructor CSVs.
Ships v6 dataset/docs and adds a dedicated CI job (validate-dataset-v6) to run the v6 validator.
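
To make the `touches_last_7_days` snapshot feature concrete, here is a minimal sketch of a trailing-window count. The helper below is hypothetical and is not the code in `leadforge/render/snapshots.py`; in particular, whether the window includes the snapshot day itself is an assumption.

```python
def touches_last_7_days(touch_days, snapshot_day=14, window=7):
    """Count touches in the half-open window (snapshot_day - window, snapshot_day]."""
    start = snapshot_day - window
    return sum(1 for day in touch_days if start < day <= snapshot_day)


# Days (relative to lead creation) on which one lead was touched.
touch_days = [1, 7, 8, 13, 14, 15, 40]
recent = touches_last_7_days(touch_days)  # counts days 8, 13 and 14 only
```

Touches after the snapshot day (15 and 40 here) are excluded, so the feature stays student-safe.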

Reviewed changes

Copilot reviewed 16 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file

| File | Description |
|---|---|
| `leadforge/mechanisms/counts.py` | Adds `LatentDecayIntensity` mechanism implementation. |
| `leadforge/mechanisms/policies.py` | Adds per-motif latent weights and toggles touch intensity mechanism selection. |
| `leadforge/simulation/engine.py` | Threads `latent_touch_intensity` into `assign_mechanisms` (keyword-only). |
| `leadforge/api/generator.py` | Exposes `latent_touch_intensity` via `Generator.generate()` kwargs passthrough. |
| `leadforge/render/snapshots.py` | Computes and merges new `touches_last_7_days` snapshot feature. |
| `leadforge/schema/features.py` | Registers `touches_last_7_days` in the feature spec. |
| `leadforge/pipelines/build_v6.py` | New v6 pipeline steps (feature derivation, soft ACV cap, cohort assignment, trap computation, subsampling, missingness). |
| `scripts/build_v6_snapshot.py` | CLI orchestration to generate student + instructor v6 CSVs. |
| `scripts/validate_v6_dataset.py` | Canonical v6 validation suite (structure, leakage, AUC, trap delta, etc.). |
| `scripts/quick_baseline_eval_v6.py` | Convenience baseline evaluation script (LR/RF/GBM + optional trap detection). |
| `tests/mechanisms/test_mechanisms.py` | Adds unit tests for `LatentDecayIntensity` and assignment toggle behavior. |
| `tests/scripts/test_build_v6_snapshot.py` | Adds tests for v6 pipeline functions (features, softcap, cohorts, subsample, missingness). |
| `.github/workflows/ci.yml` | Adds `validate-dataset-v6` CI job and clarifies v5 step IDs/naming. |
| `lead_scoring_intro/lead_scoring_intro_v6.csv` | Adds the shipped student-safe v6 dataset artifact. |
| `lead_scoring_intro/RELEASE_v6.md` | Adds v6 release notes, column dictionary, metrics, teaching guide. |
| `lead_scoring_intro/BACKGROUND_v6.md` | Adds v6 narrative/business context for students. |
| `.agent-plan.md` | Updates internal plan/status notes to reflect v6 shipment. |

…st threshold

- Rename trap column from _11_90 to _15_90 (SNAPSHOT_DAY=14 → window is 15–90)
- Fix column count comment: 19 features + 1 target = 20 (not 21)
- Add assign_acquisition_wave to __all__
- Tighten missingness test threshold from 15% to 10% to match validator
- Update all references (RELEASE_v6.md, instructor CSV header, .agent-plan.md)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-04-30T21:22:46Z

pr-agent-context report:

This run includes unresolved review comments on PR #33 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/pipelines/build_v6.py:60
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568346
Root author: copilot-pull-request-reviewer

Comment:
    INSTRUCTOR_TRAP_COL name (`__leakage__touches_post_snapshot_11_90`) doesn’t match the actual window enforced by compute_post_snapshot_touches (days > snapshot_day, i.e., 15–90 when SNAPSHOT_DAY=14) and the module/docs text. Consider renaming the column to reflect the real range (e.g., `..._15_90` / `..._post_snapshot_15_90`) or adjusting the filter/doc so the name and semantics agree.
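
One way to keep the column name and the enforced window from drifting apart, as this comment describes, is to derive the name from the same constants that drive the filter. The sketch below is an assumption about how that could look, not the code in `build_v6.py`: `HORIZON_DAY`, `trap_column_name`, and `count_post_snapshot_touches` are hypothetical names, and treating day 90 as inclusive is also assumed.

```python
SNAPSHOT_DAY = 14
HORIZON_DAY = 90  # hypothetical constant: last day of the label window


def trap_column_name(snapshot_day=SNAPSHOT_DAY, horizon_day=HORIZON_DAY):
    """Build the trap column name from the window it actually covers."""
    return f"__leakage__touches_post_snapshot_{snapshot_day + 1}_{horizon_day}"


def count_post_snapshot_touches(touch_days, snapshot_day=SNAPSHOT_DAY, horizon_day=HORIZON_DAY):
    """Count touches strictly after the snapshot day, up to and including the horizon."""
    return sum(1 for day in touch_days if snapshot_day < day <= horizon_day)


INSTRUCTOR_TRAP_COL = trap_column_name()
post_touches = count_post_snapshot_touches([1, 14, 15, 90, 91])  # days 15 and 90 qualify
```

With this arrangement, changing `SNAPSHOT_DAY` updates both the filter and the column name together.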

## COPILOT-2
Location: leadforge/pipelines/build_v6.py
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568391
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    Comment says “20 features + 1 target = 21 columns”, but FINAL_COLUMNS_STUDENT contains 20 columns total (19 features + 1 target). Please update the comment (or adjust the column list) so counts are consistent.

## COPILOT-3
Location: leadforge/pipelines/build_v6.py:35
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568432
Root author: copilot-pull-request-reviewer

Comment:
    `assign_acquisition_wave` is a public pipeline step (and is imported/used by tests and scripts) but it’s missing from this module’s `__all__` export list. Consider adding it to `__all__` for consistency with build_v5.py’s “export all pipeline steps” pattern.
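
A small check along the following lines could catch this kind of omission automatically. It is a sketch only: `missing_from_all` is a hypothetical helper, and the stand-in module `demo_pipeline` with its `apply_soft_cap` function is invented here so the example runs without the real package.

```python
import inspect
import types


def missing_from_all(module):
    """Return public functions defined in `module` that are absent from its `__all__`."""
    exported = set(getattr(module, "__all__", ()))
    public = {
        name
        for name, obj in vars(module).items()
        if inspect.isfunction(obj)
        and not name.startswith("_")
        and obj.__module__ == module.__name__  # skip functions imported from elsewhere
    }
    return sorted(public - exported)


# Stand-in module (hypothetical) mimicking a pipeline file with an incomplete __all__.
demo = types.ModuleType("demo_pipeline")
exec(
    "def assign_acquisition_wave(df):\n    return df\n"
    "def apply_soft_cap(df):\n    return df\n"
    "__all__ = ['apply_soft_cap']\n",
    demo.__dict__,
)

omitted = missing_from_all(demo)
```

In a real test suite, the same helper would be pointed at the imported pipeline module and asserted to return an empty list.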

## COPILOT-4
Location: lead_scoring_intro/RELEASE_v6.md
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568471
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    The leakage trap column name (`__leakage__touches_post_snapshot_11_90`) and its described window (“days 15–90”) are inconsistent. It’s likely clearer to align the column name with the actual post-snapshot range (e.g., 15–90 for snapshot day 14) or update the described window to match the name.

## COPILOT-5
Location: tests/scripts/test_build_v6_snapshot.py:250
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568505
Root author: copilot-pull-request-reviewer

Comment:
    This test allows up to 15% missingness per column, but the v6 validator script enforces MAX_COL_MISSING_RATE=10% per column. Consider aligning the test bound with the validator threshold (or documenting why they intentionally differ) to avoid false confidence from passing tests while CI validation fails.
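
The mismatch described here can be avoided by having the test and the validator share one threshold constant. The sketch below assumes a simplified row format (a list of dicts with `None` for missing values) and hypothetical helper names; only `MAX_COL_MISSING_RATE` and its 10% value come from the comment above, and treating the limit as a strict upper bound is an assumption.

```python
MAX_COL_MISSING_RATE = 0.10  # single source of truth, imported by validator and tests


def missing_rates(rows):
    """Per-column share of missing (None) values across a list of row dicts."""
    columns = rows[0].keys()
    return {c: sum(r[c] is None for r in rows) / len(rows) for c in columns}


def columns_over_limit(rows, limit=MAX_COL_MISSING_RATE):
    """Columns whose missing rate exceeds `limit` (strictly greater is an assumption)."""
    return sorted(c for c, rate in missing_rates(rows).items() if rate > limit)


# 25 toy rows: web_sessions is 8% missing, seniority is 12% missing.
rows = [{"web_sessions": 3, "seniority": "vp"} for _ in range(25)]
for i in (0, 1):
    rows[i]["web_sessions"] = None
for i in (2, 3, 4):
    rows[i]["seniority"] = None

flagged_at_validator_limit = columns_over_limit(rows)            # uses the shared 10%
flagged_at_old_test_limit = columns_over_limit(rows, limit=0.15)  # the looser 15% bound
```

A column at 12% missingness passes the looser 15% bound but fails the 10% validator limit, which is exactly the false-confidence case the comment warns about.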

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25189952635 attempt 1
Comment timestamp: 2026-04-30T21:21:59.205380+00:00
PR head commit: 943caff022559ad6c424609fe7b30e3a27683057

Copilot AI review requested due to automatic review settings April 30, 2026 20:01

shaypal5 added the labels type: feature (New capability), layer: mechanisms (mechanisms/ generators and transitions), layer: validation (validation/ invariants and checks), and layer: recipes (recipes/ recipe assets and registry) on Apr 30, 2026

Copilot started reviewing on behalf of shaypal5 April 30, 2026 20:01 View session

Copilot AI reviewed Apr 30, 2026

shaypal5 merged commit 7ac7a61 into main Apr 30, 2026
7 checks passed

shaypal5 deleted the feat/v6-lead-scoring-dataset branch April 30, 2026 21:28

shaypal5 merged 2 commits into main from feat/v6-lead-scoring-dataset
