
feat: v6 lead scoring dataset — causal trap, student/instructor split#33

Merged
shaypal5 merged 2 commits into main from feat/v6-lead-scoring-dataset
Apr 30, 2026

Conversation

@shaypal5

Contributor

Summary

  • v6 lead scoring intro dataset designed for 3–4 lectures on applied ML lead scoring
  • Causally-grounded leakage trap: post-snapshot touches (days 15–90) computed from simulator event timeline — no label-noise injection
  • Student/instructor split: student-safe CSV (20 cols, no leakage) + instructor CSV (21 cols, one __leakage__ column)
  • Engine enhancement: LatentDecayIntensity mechanism makes touch intensity depend on latent intent/fit traits (same traits that drive conversion)
  • Tree improvement: GBM reliably outperforms LR by +0.022 AUC due to nonlinear interactions
  • Value-aware ranking: EV ranking captures 17–41% more ACV than probability ranking
  • Cohort feature: acquisition_wave (A/B/C) for distribution-shift lecture
  • Momentum feature: touches_last_7_days added to snapshot builder
  • Soft ACV winsorization: avoids hard-clipping ties at $120k cap
  • Structured missingness: MAR (web_sessions by lead_source, seniority by partner), MCAR (expected_acv 2%), structural (no-touch NaNs)
  • 737 tests pass (705 existing + 26 v6 pipeline + 6 LatentDecayIntensity)
  • CI: validate-dataset-v6 job added
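
The soft ACV winsorization bullet can be illustrated with a minimal sketch. The PR does not state the formula, so everything below is an assumption: values under a knee pass through unchanged, and values above it are compressed smoothly toward the $120k cap, so distinct inputs stay distinct instead of tying at the cap. The name `soft_cap`, the `knee` parameter, and the exponential form are hypothetical and not taken from leadforge.

```python
import math


def soft_cap(value: float, cap: float = 120_000.0, knee: float = 100_000.0) -> float:
    """Compress values above `knee` smoothly toward `cap` (hypothetical form)."""
    if value <= knee:
        return value
    headroom = cap - knee
    # Exponential saturation: slope is 1 at the knee and decays toward 0,
    # so the output keeps increasing while approaching `cap` from below.
    return cap - headroom * math.exp(-(value - knee) / headroom)


raw = [50_000.0, 110_000.0, 150_000.0, 400_000.0]
soft = [soft_cap(v) for v in raw]
hard = [min(v, 120_000.0) for v in raw]  # hard clipping, for comparison
```

With hard clipping the last two values both become exactly 120000, while the soft cap keeps their order, which is the tie-avoidance property the bullet describes.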

Deliverables

| # | Deliverable | Status |
|---|-------------|--------|
| 1 | lead_scoring_intro/lead_scoring_intro_v6.csv | 1000 × 20, no leakage |
| 2 | lead_scoring_intro/lead_scoring_intro_v6_instructor.csv | 1000 × 21, one trap |
| 3 | lead_scoring_intro/RELEASE_v6.md | Column dictionary, metrics, teaching guide |
| 4 | lead_scoring_intro/BACKGROUND_v6.md | ProcureFlow business context |
| 5 | scripts/build_v6_snapshot.py | Build both CSVs |
| 6 | scripts/validate_v6_dataset.py | Full validation suite |
| 7 | scripts/quick_baseline_eval_v6.py | LR + RF + GBM baselines |
| 8 | CI v6 validation job | .github/workflows/ci.yml updated |
| 9 | Engine: LatentDecayIntensity | leadforge/mechanisms/counts.py |

Validation results

| Check | Result |
|-------|--------|
| Baseline LR AUC | 0.627 |
| GBM AUC (5-seed avg) | 0.680 (+0.022 vs LR) |
| Trap mean delta (10 seeds) | 0.046 (threshold: ≥0.03) |
| Trap min delta | 0.034 (threshold: ≥0.015) |
| EV uplift @25 | +17.6% |
| EV uplift @50 | +41.3% |
| All mandatory checks | PASS |
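
As a rough illustration of what an "EV uplift @K" figure measures, the sketch below ranks leads two ways, by predicted conversion probability and by expected value (probability × ACV), and compares the ACV captured from converted leads in the top K. The toy leads, the field names, and the helper `captured_acv` are assumptions made for illustration; the validator script may compute the metric differently.

```python
def captured_acv(leads, key, k):
    """Sum the ACV of converted leads among the top-k leads ranked by `key`."""
    top = sorted(leads, key=key, reverse=True)[:k]
    return sum(lead["acv"] for lead in top if lead["converted"])


# Toy data: predicted probability, ACV, and the realized outcome.
leads = [
    {"p": 0.9, "acv": 10_000, "converted": True},
    {"p": 0.8, "acv": 12_000, "converted": True},
    {"p": 0.5, "acv": 90_000, "converted": True},
    {"p": 0.4, "acv": 80_000, "converted": False},
    {"p": 0.2, "acv": 20_000, "converted": False},
]

k = 2
by_prob = captured_acv(leads, key=lambda lead: lead["p"], k=k)
by_ev = captured_acv(leads, key=lambda lead: lead["p"] * lead["acv"], k=k)
uplift = (by_ev - by_prob) / by_prob  # relative gain of EV ranking
```

In this toy case probability ranking picks the two small, likely deals, while EV ranking picks the two largest expected values and captures more ACV even though one of them does not convert.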

Test plan

  • pytest tests/ — 737 tests pass
  • ruff check . && ruff format --check . — clean
  • mypy leadforge/ — no issues
  • python scripts/validate_v6_dataset.py lead_scoring_intro/lead_scoring_intro_v6.csv lead_scoring_intro/lead_scoring_intro_v6_instructor.csv — all checks pass
  • CI validate-dataset-v6 job passes on PR

🤖 Generated with Claude Code

…, tree improvement

Introduce v6 of the lead scoring intro dataset with causally-grounded
leakage detection, student/instructor export split, and validated
tree model improvement for 3–4 lecture curriculum.

Engine changes:
- LatentDecayIntensity mechanism: Poisson intensity with recency decay
  + latent-trait modulation, creating causal link latent→touches→conversion
- assign_mechanisms(latent_touch_intensity=True) flag + passthrough via
  simulate_world() and Generator.generate()
- touches_last_7_days added to snapshot builder + feature spec
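
A minimal sketch of the idea behind a latent-modulated, decaying Poisson intensity follows. It is not the `LatentDecayIntensity` implementation: the functional form, the parameter names (`base`, `decay`, `w_intent`, `w_fit`), and the choice to decay by lead age are all assumptions for illustration.

```python
import math
import random


def touch_rate(day, intent, fit, base=0.6, decay=0.05, w_intent=0.8, w_fit=0.4):
    """Expected touches on `day`: base rate, decayed over time, scaled by latent traits."""
    return base * math.exp(-decay * day) * math.exp(w_intent * intent + w_fit * fit)


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, count, product = math.exp(-rate), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_touches(intent, fit, days=90, seed=0):
    """Return the days on which touches occur, one entry per touch."""
    rng = random.Random(seed)
    timeline = []
    for day in range(1, days + 1):
        timeline.extend([day] * sample_poisson(touch_rate(day, intent, fit), rng))
    return timeline
```

Because the same latent traits would also drive conversion in the simulator, touch counts after the snapshot day carry information about the label, which is what makes the leakage trap causal rather than injected noise.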

Dataset (lead_scoring_intro/):
- Student: 1000 rows × 20 cols (no leakage columns)
- Instructor: 1000 rows × 21 cols (+ __leakage__touches_post_snapshot_11_90)
- Day-14 snapshot, 30% conversion rate, soft ACV winsorization
- acquisition_wave cohort feature (A/B/C) for shift lecture
- Structured missingness: MAR (web_sessions, seniority), MCAR (expected_acv),
  structural (days_since_last_touch, days_since_first_touch)
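
The three missingness types can be sketched as follows. Only the 2% MCAR rate for `expected_acv` and the snapshot day come from this PR; the per-source MAR rates and the helper names `mask_row` and `days_since_last_touch` are hypothetical, chosen to show the distinction rather than to mirror the pipeline.

```python
import random

MCAR_RATE = 0.02  # flat rate for expected_acv, per the PR summary
# Hypothetical per-source rates; the PR does not list the real ones.
MAR_RATE_BY_SOURCE = {"partner": 0.15, "paid": 0.05, "organic": 0.02}


def days_since_last_touch(touch_days, snapshot_day=14):
    """Structural: undefined (None) when the lead has no touches by the snapshot."""
    seen = [d for d in touch_days if d <= snapshot_day]
    return snapshot_day - max(seen) if seen else None


def mask_row(row, rng):
    """Return a copy of `row` with MAR and MCAR gaps applied."""
    out = dict(row)
    # MAR: the probability depends on an observed column (lead_source).
    if rng.random() < MAR_RATE_BY_SOURCE.get(row["lead_source"], 0.0):
        out["web_sessions"] = None
    # MCAR: the same probability for every row.
    if rng.random() < MCAR_RATE:
        out["expected_acv"] = None
    return out


rng = random.Random(0)
row = {"lead_source": "partner", "web_sessions": 4, "expected_acv": 30_000}
masked = mask_row(row, rng)
```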

Validation results:
- Baseline LR AUC: 0.627, GBM: 0.680 (+0.022 improvement)
- Trap delta: mean 0.046, min 0.034 (10 seeds, all above thresholds)
- Value-aware uplift: +17.6% at K=25, +41.3% at K=50
- All mandatory checks pass
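
The trap-delta check can be read as a two-part gate over per-seed results. The sketch below assumes that each delta is the AUC gain from adding the leakage column to the feature set, which this PR implies but does not define; the thresholds are the ones listed under Validation results, and the example deltas are made up.

```python
from statistics import mean

MEAN_DELTA_THRESHOLD = 0.03   # mean over seeds must reach this
MIN_DELTA_THRESHOLD = 0.015   # every seed must reach this


def trap_check(deltas):
    """`deltas`: per-seed AUC gain attributed to the leakage column (assumed definition)."""
    return mean(deltas) >= MEAN_DELTA_THRESHOLD and min(deltas) >= MIN_DELTA_THRESHOLD


# Illustrative per-seed deltas, not the PR's actual values.
example_deltas = [0.034, 0.041, 0.052, 0.047, 0.050]
passes = trap_check(example_deltas)
```

Gating on both the mean and the minimum guards against a trap that is detectable on average but invisible for an unlucky seed.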

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 30, 2026 20:01
@shaypal5 added labels on Apr 30, 2026: type: feature (New capability); layer: mechanisms (mechanisms/ generators and transitions); layer: validation (validation/ invariants and checks); layer: recipes (recipes/ recipe assets and registry)

Copilot AI left a comment


Pull request overview

Adds the v6 “lead_scoring_intro” dataset pipeline and artifacts, including a latent-aware touch-intensity mechanism to support a causally grounded leakage trap and a student/instructor export split, plus validation scripts and CI coverage.

Changes:

  • Introduces LatentDecayIntensity and a latent_touch_intensity toggle threaded through generator → simulation → mechanism assignment.
  • Extends snapshot rendering and schema to include touches_last_7_days, and adds the v6 build pipeline + CLI scripts to generate/validate student + instructor CSVs.
  • Ships v6 dataset/docs and adds a dedicated CI job (validate-dataset-v6) to run the v6 validator.
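
For orientation, a momentum feature like `touches_last_7_days` can be sketched as a windowed count over a lead's touch timeline. The window convention here (the seven days ending on the snapshot day, inclusive) is an assumption; the actual code in leadforge/render/snapshots.py may define the window differently.

```python
def touches_last_7_days(touch_days, snapshot_day=14):
    """Count touches in the 7-day window ending on the snapshot day (inclusive)."""
    window_start = snapshot_day - 6
    return sum(window_start <= day <= snapshot_day for day in touch_days)


# Touches on days 1, 8, 14, and 15; only days 8 and 14 fall in the window 8..14.
count = touches_last_7_days([1, 8, 14, 15])
```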

Reviewed changes

Copilot reviewed 16 out of 18 changed files in this pull request and generated 5 comments.

| File | Description |
|------|-------------|
| leadforge/mechanisms/counts.py | Adds LatentDecayIntensity mechanism implementation. |
| leadforge/mechanisms/policies.py | Adds per-motif latent weights and toggles touch intensity mechanism selection. |
| leadforge/simulation/engine.py | Threads latent_touch_intensity into assign_mechanisms (keyword-only). |
| leadforge/api/generator.py | Exposes latent_touch_intensity via Generator.generate() kwargs passthrough. |
| leadforge/render/snapshots.py | Computes and merges new touches_last_7_days snapshot feature. |
| leadforge/schema/features.py | Registers touches_last_7_days in the feature spec. |
| leadforge/pipelines/build_v6.py | New v6 pipeline steps (feature derivation, soft ACV cap, cohort assignment, trap computation, subsampling, missingness). |
| scripts/build_v6_snapshot.py | CLI orchestration to generate student + instructor v6 CSVs. |
| scripts/validate_v6_dataset.py | Canonical v6 validation suite (structure, leakage, AUC, trap delta, etc.). |
| scripts/quick_baseline_eval_v6.py | Convenience baseline evaluation script (LR/RF/GBM + optional trap detection). |
| tests/mechanisms/test_mechanisms.py | Adds unit tests for LatentDecayIntensity and assignment toggle behavior. |
| tests/scripts/test_build_v6_snapshot.py | Adds tests for v6 pipeline functions (features, softcap, cohorts, subsample, missingness). |
| .github/workflows/ci.yml | Adds validate-dataset-v6 CI job and clarifies v5 step IDs/naming. |
| lead_scoring_intro/lead_scoring_intro_v6.csv | Adds the shipped student-safe v6 dataset artifact. |
| lead_scoring_intro/RELEASE_v6.md | Adds v6 release notes, column dictionary, metrics, teaching guide. |
| lead_scoring_intro/BACKGROUND_v6.md | Adds v6 narrative/business context for students. |
| .agent-plan.md | Updates internal plan/status notes to reflect v6 shipment. |


Comment thread leadforge/pipelines/build_v6.py
Comment thread leadforge/pipelines/build_v6.py Outdated
Comment thread leadforge/pipelines/build_v6.py
Comment thread lead_scoring_intro/RELEASE_v6.md Outdated
Comment thread tests/scripts/test_build_v6_snapshot.py

…st threshold

- Rename trap column from _11_90 to _15_90 (SNAPSHOT_DAY=14 → window is 15–90)
- Fix column count comment: 19 features + 1 target = 20 (not 21)
- Add assign_acquisition_wave to __all__
- Tighten missingness test threshold from 15% to 10% to match validator
- Update all references (RELEASE_v6.md, instructor CSV header, .agent-plan.md)
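
The rename above fixes a name/semantics mismatch. One way to keep the two from drifting apart again is to derive the column suffix from the same constants that define the filter. The sketch below assumes a strict `day > snapshot_day` filter with an inclusive day-90 horizon, as described in the review comment; the helper names are hypothetical and do not reproduce `compute_post_snapshot_touches`.

```python
SNAPSHOT_DAY = 14
HORIZON_DAY = 90


def count_post_snapshot_touches(touch_days, snapshot_day=SNAPSHOT_DAY, horizon_day=HORIZON_DAY):
    """Count touches strictly after the snapshot day, up to the horizon (inclusive)."""
    return sum(snapshot_day < day <= horizon_day for day in touch_days)


def trap_column_name(snapshot_day=SNAPSHOT_DAY, horizon_day=HORIZON_DAY):
    """Build the suffix from the window bounds so the name always matches the filter."""
    return f"__leakage__touches_post_snapshot_{snapshot_day + 1}_{horizon_day}"


# Day 14 is the snapshot itself and day 91 is past the horizon, so neither counts.
n_post = count_post_snapshot_touches([3, 14, 15, 90, 91])
name = trap_column_name()
```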

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions


pr-agent-context report:

This run includes unresolved review comments on PR #33 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/pipelines/build_v6.py:60
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568346
Root author: copilot-pull-request-reviewer

Comment:
    INSTRUCTOR_TRAP_COL name (`__leakage__touches_post_snapshot_11_90`) doesn’t match the actual window enforced by compute_post_snapshot_touches (days > snapshot_day, i.e., 15–90 when SNAPSHOT_DAY=14) and the module/docs text. Consider renaming the column to reflect the real range (e.g., `..._15_90` / `..._post_snapshot_15_90`) or adjusting the filter/doc so the name and semantics agree.

## COPILOT-2
Location: leadforge/pipelines/build_v6.py
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568391
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    Comment says “20 features + 1 target = 21 columns”, but FINAL_COLUMNS_STUDENT contains 20 columns total (19 features + 1 target). Please update the comment (or adjust the column list) so counts are consistent.

## COPILOT-3
Location: leadforge/pipelines/build_v6.py:35
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568432
Root author: copilot-pull-request-reviewer

Comment:
    `assign_acquisition_wave` is a public pipeline step (and is imported/used by tests and scripts) but it’s missing from this module’s `__all__` export list. Consider adding it to `__all__` for consistency with build_v5.py’s “export all pipeline steps” pattern.

## COPILOT-4
Location: lead_scoring_intro/RELEASE_v6.md
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568471
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    The leakage trap column name (`__leakage__touches_post_snapshot_11_90`) and its described window (“days 15–90”) are inconsistent. It’s likely clearer to align the column name with the actual post-snapshot range (e.g., 15–90 for snapshot day 14) or update the described window to match the name.

## COPILOT-5
Location: tests/scripts/test_build_v6_snapshot.py:250
URL: https://github.com/leadforge-dev/leadforge/pull/33#discussion_r3170568505
Root author: copilot-pull-request-reviewer

Comment:
    This test allows up to 15% missingness per column, but the v6 validator script enforces MAX_COL_MISSING_RATE=10% per column. Consider aligning the test bound with the validator threshold (or documenting why they intentionally differ) to avoid false confidence from passing tests while CI validation fails.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25189952635 attempt 1
Comment timestamp: 2026-04-30T21:21:59.205380+00:00
PR head commit: 943caff022559ad6c424609fe7b30e3a27683057

@shaypal5 shaypal5 merged commit 7ac7a61 into main Apr 30, 2026
7 checks passed
@shaypal5 shaypal5 deleted the feat/v6-lead-scoring-dataset branch April 30, 2026 21:28

Labels

layer: mechanisms (mechanisms/ generators and transitions); layer: recipes (recipes/ recipe assets and registry); layer: validation (validation/ invariants and checks); type: feature (New capability)


2 participants