feat(lifecycle): early-pLTV (tenure-anchored) snapshot [LTV-Pm] by shaypal5 · Pull Request #120 · leadforge-dev/leadforge

shaypal5 · 2026-06-12T15:01:37Z

Summary

LTV-Pm (completes roadmap LTV-M5): the second observation regime — a tenure-anchored early-pLTV snapshot (design.md §3.1, decision D8). Every customer is observed at a fixed short tenure (customer_start + early_tenure_weeks, default 4w) — the genuine cold-start case for acquisition-time value prediction (the Voyantis framing in the design doc).

from leadforge.schemes.lifecycle.snapshots import build_early_pltv_snapshot
early = build_early_pltv_snapshot(population, sim, early_tenure_weeks=4)

Both regimes now share one core

The calendar (build_customer_snapshot) and early (build_early_pltv_snapshot) builders feed a shared _assemble_snapshot() driven by a customer_id -> cutoff map; the three aggregation helpers (_event_aggregates, _health_aggregates, _forward_revenue) take that map instead of a single cutoff. So feature derivations, the mrr_change_full_period trap, target attribution, and difficulty distortions live in exactly one place.

Calendar output unchanged — all LTV-Pl tests pass as-is, and the lead-scoring distorted-snapshot hash is still byte-identical (196bc45f…).

Semantics specific to the early regime

Eligibility = survival to the anchor. Drops onboarding churners (churned at/before start + anchor); keeps late starters and customers who churn after the anchor. The cohort genuinely differs from the calendar regime's — the early-only set is exactly the customers who churned in (anchor, observation_date].
Per-customer forward windows. Each customer's targets are fully simulated relative to its own start (engine D6 runs through max(obs, start+et)+fwd), so the anchor may legitimately fall after observation_date — the builder does not require cutoff <= obs (unlike the calendar regime).
Coverage guards: early_tenure_weeks must be >= 1 and <= sim.early_tenure_weeks (else per-customer forward windows would be censored), on top of the shared forward-window / mismatch / observation-date checks.

Known property (validation follow-up)

tenure_weeks is constant (= early_tenure_weeks) across the early table — that's the defining property of the regime, not a feature. The published-bundle no-zero-variance check must exempt it for this task family; noted for the validation harness (LTV-Pp).

Tests (19 new; full suite 1790 passed / 51 skipped)

Tenure-constant-at-anchor; eligibility = survival to anchor; onboarding-churner exclusion; cohort difference vs calendar (post-anchor/pre-obs churners); per-customer censoring leakage probe (delete each customer's post-anchor events → features identical); targets recomputed off the per-customer cutoff vs the invoice table; cold-start sparsity (NPS all-null at 4w; health aggregates over pre-anchor signals only); anchor + horizon + mismatch + missing-obs validation; distortions leave targets and trap intact.

Scope / roadmap

The actual early-pLTV task directory + train/valid/test split export (render/tasks.py, design.md §536) folds into LTV-Pn with the bundle/task writer — matching how LTV-Pl deferred the calendar task-split writer. This PR delivers the snapshot builder + recomputed targets.
docs/ltv/roadmap.md: LTV-Pm ✓ → LTV-M5 complete (both regimes). Next is LTV-Pn (register LifecycleScheme, recipe, manifest + schema v6, both task-split writers, carried layering cleanups).

🤖 Generated with Claude Code

Add the second observation regime (design.md §3.1 / D8): a tenure-anchored snapshot that observes every customer at a fixed short tenure (customer_start + early_tenure_weeks) — the genuine cold-start case for acquisition-time value prediction (Voyantis framing). - build_early_pltv_snapshot(population, sim, *, early_tenure_weeks=4, …) in schemes/lifecycle/snapshots.py. - Unify both regimes on one per-customer-cutoff core. The calendar and early builders now feed a shared _assemble_snapshot() driven by a customer_id -> cutoff map; the three aggregation helpers take that map instead of a single date. Feature derivations, the mrr_change_full_period trap, target attribution, and difficulty distortions are defined exactly once. The calendar regime's output is unchanged — all LTV-Pl tests pass as-is, and the lead-scoring distorted-snapshot hash is still byte-identical (196bc45f…). Semantics: - Eligibility = survival to the anchor: drops onboarding churners (churned at or before start+anchor), keeps late starters and customers who churn after the anchor. The cohort therefore differs from the calendar regime's. - Forward windows are fully simulated relative to each customer's OWN start (engine D6 runs through max(obs, start+et)+fwd), so the anchor may legitimately fall after observation_date — the builder does not require cutoff <= obs (unlike the calendar regime). - Coverage guards: early_tenure_weeks must be >= 1 and <= the sim's recorded early_tenure_weeks (else per-customer forward windows would be censored), on top of the shared forward-window / population-mismatch / observation-date checks. Known property: tenure_weeks is constant (= early_tenure_weeks) across the early table — the defining property of the regime, not a feature. The published-bundle no-zero-variance check must exempt it for this task family (noted for the validation harness, LTV-Pp). Tests (19): tenure constant at anchor; eligibility = survival to anchor; onboarding churners excluded; cohort difference vs calendar (post-anchor, pre-obs churners); per-customer censoring leakage probe (delete each customer's post-anchor events, features unchanged); targets recomputed off the per-customer cutoff vs the invoice table; cold-start sparsity (NPS all-null at 4w; health aggregates over pre-anchor signals only); anchor + horizon + mismatch + missing-obs validation; distortions leave targets and trap intact. Scope note: the actual early-pLTV *task directory* + split export (render/tasks.py) folds into LTV-Pn with the bundle/task writer, matching how LTV-Pl deferred the calendar task-split writer. This PR delivers the snapshot builder + recomputed targets. Full suite 1790 passed / 51 skipped; ruff + mypy clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Copilot

Pull request overview

Adds a tenure-anchored “early-pLTV” snapshot regime to the lifecycle scheme and refactors snapshot construction so both the calendar-anchored and early regimes share a single per-customer-cutoff assembly core.

Changes:

Introduces build_early_pltv_snapshot(...) with per-customer cutoffs at customer_start + early_tenure_weeks, plus validation around anchor coverage.
Refactors snapshot assembly + aggregation helpers to operate on a customer_id -> cutoff map (shared by both regimes).
Adds a dedicated early-regime test suite covering cohort eligibility, leakage/censoring guards, and target recomputation.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
`tests/schemes/lifecycle/test_early_pltv.py`	New tests for early (tenure-anchored) snapshot behavior, eligibility, leakage probes, and target correctness.
`leadforge/schemes/lifecycle/snapshots.py`	Implements early snapshot builder and unifies both regimes on a shared per-customer-cutoff snapshot assembly path.
`docs/ltv/roadmap.md`	Marks `LTV-Pm` as completed and documents the delivered scope/deferrals.
`.agent-plan.md`	Updates the agent plan status/progress notes for the new milestone completion.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+# One eligible customer plus the cutoff its row is anchored at.
+_Eligible = tuple["CustomerLifecycleRow", "SubscriptionLifecycleRow", date]


Findings from hostile self-review of the early-pLTV snapshot PR. 1. CALENDAR BYTE-IDENTITY: PROVEN, not just claimed. The "calendar output unchanged by the unification refactor" claim rested on derivation tests, which a subtle reordering could pass. Verified the refactored builder produces byte-identical calendar snapshots to main across all 5 motifs x 2 seeds, with and without difficulty distortions. No code change; the 30 LTV-Pl derivation tests remain the permanent guard. 2. INCOMPLETE DISCLOSURE OF DEGENERATE COLUMNS (the real finding). The PR documented only tenure_weeks as constant in the early regime, but at a short anchor MULTIPLE feature columns are dead by construction — confirmed structural (every seed), not seed accidents: - renewal_count: constant 0 for any anchor < 52w (first anniversary wk 52) - last_nps_score: all-null for any anchor < 13w (first survey wk 13) - weeks_since_last_payment_failure: near-degenerate (<=1 distinct value) Shipping a builder while under-documenting that ~3 columns are dead in its primary (4-week) configuration would mislead consumers and the validation harness. Expanded the build_early_pltv_snapshot docstring and the roadmap note to enumerate all of them with the cadence reason, flag the shared-catalog design tension, and hand LTV-Pp the full exemption list / LTV-Pn the drop-or-keep decision. New parametrized test pins the structural set across seeds so reviving any column forces a conscious update. 3. Added an early-regime trap-divergence test: the mrr_change_full_period trap is *more* leaky here than in the calendar regime (at 4 weeks mrr_change_at_snapshot is ~0 for >80% of rows while the trap captures the whole future expansion path) — pinned so the pedagogically central column can't silently stop diverging. Full suite 1794 passed / 51 skipped; ruff + mypy clean; lead-scoring distorted-snapshot hash still byte-identical (196bc45f…). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-06-12T15:19:08Z

pr-agent-context report:

This run includes an unresolved review comment on PR #120 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/schemes/lifecycle/snapshots.py:101
URL: https://github.com/leadforge-dev/leadforge/pull/120#discussion_r3404292569
Root author: copilot-pull-request-reviewer

Comment:
    The `_Eligible` type alias is documented as including the row’s cutoff, but the third tuple element is actually the customer start date (used later to compute tenure). The current comment is misleading for future maintenance.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 27424938063 attempt 1
Comment timestamp: 2026-06-12T15:18:19.797575+00:00
PR head commit: 2f7f08aad28df1e57f9538ee4655365c4caa607f

Copilot AI review requested due to automatic review settings June 12, 2026 15:01

shaypal5 added this to the dataset: leadforge-ltv-v1 milestone Jun 12, 2026

shaypal5 added type: feature New capability layer: render render/ bundle and artifact output status: needs review Ready for review dataset: leadforge-ltv-v1 Issue/PR scoped to the b2b_saas_ltv_v1 LTV dataset workstream labels Jun 12, 2026

Copilot started reviewing on behalf of shaypal5 June 12, 2026 15:02 View session

This comment has been minimized.

Sign in to view

Copilot AI reviewed Jun 12, 2026

View reviewed changes

Comment thread leadforge/schemes/lifecycle/snapshots.py

Comment on lines +100 to +101

# One eligible customer plus the cutoff its row is anchored at.

_Eligible = tuple["CustomerLifecycleRow", "SubscriptionLifecycleRow", date]

shaypal5 merged commit df4fad8 into main Jun 12, 2026
10 checks passed

shaypal5 deleted the feat/lifecycle-early-pltv branch June 12, 2026 18:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(lifecycle): early-pLTV (tenure-anchored) snapshot [LTV-Pm]#120

feat(lifecycle): early-pLTV (tenure-anchored) snapshot [LTV-Pm]#120
shaypal5 merged 2 commits into
mainfrom
feat/lifecycle-early-pltv

shaypal5 commented Jun 12, 2026

Uh oh!

This comment has been minimized.

Copilot AI left a comment

Uh oh!

github-actions Bot commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# One eligible customer plus the cutoff its row is anchored at.
		_Eligible = tuple["CustomerLifecycleRow", "SubscriptionLifecycleRow", date]

Conversation

shaypal5 commented Jun 12, 2026

Summary

Both regimes now share one core

Semantics specific to the early regime

Known property (validation follow-up)

Tests (19 new; full suite 1790 passed / 51 skipped)

Scope / roadmap

Uh oh!

This comment has been minimized.

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

github-actions Bot commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants