
docs(ltv): reframe target to pLTV regression (ZILN) [LTV-Pa]#103

Merged
shaypal5 merged 1 commit into main from docs/ltv-reframe-pltv-regression
Jun 10, 2026
Conversation

@shaypal5


Summary

Corrects the LTV planning docs (merged in #102) from a churn-classification
framing to the correct predictive-lifetime-value (pLTV) regression framing.
Docs only — no package code.

The first pass pattern-matched LTV onto the lead-scoring binary task and asked
"binary vs multiclass." That's the wrong axis. LTV is a regression problem:
predict continuous future monetary value. This follows Google lifetime_value /
the ZILN paper (arXiv:1912.07753) and the Voyantis pLTV framing.
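
As a rough illustration of what a ZILN-shaped target implies for modeling, the sketch below writes out the per-example negative log-likelihood of a zero-inflated lognormal: a Bernoulli term for "any revenue at all" plus a lognormal density on the positive part. The function names `ziln_nll` and `ziln_mean` are hypothetical and not part of leadforge; this is a stdlib-only sketch of the distribution, not the package's or the paper's reference implementation.

```python
import math


def ziln_nll(y: float, p: float, mu: float, sigma: float) -> float:
    """Negative log-likelihood of one value under a zero-inflated lognormal.

    p     -- probability that the value is positive (1 - zero mass)
    mu    -- mean of log(y) for positive values
    sigma -- standard deviation of log(y) for positive values
    """
    if y < 0:
        raise ValueError("revenue must be non-negative")
    if y == 0:
        # Zero mass: only the Bernoulli term contributes.
        return -math.log(1.0 - p)
    # Positive part: Bernoulli term plus the lognormal log-density.
    log_pdf = (
        -math.log(y * sigma * math.sqrt(2.0 * math.pi))
        - (math.log(y) - mu) ** 2 / (2.0 * sigma**2)
    )
    return -(math.log(p) + log_pdf)


def ziln_mean(p: float, mu: float, sigma: float) -> float:
    """Expected value: P(positive) times the lognormal mean."""
    return p * math.exp(mu + 0.5 * sigma**2)
```

The expected value is the quantity a pLTV model would report per customer. The Bernoulli term is the piece that the secondary churn task exposes on its own.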

Corrected / added decisions (design.md §2.2)

| # | Decision |
|---|----------|
| D1 (corrected) | Primary task = continuous pLTV regression; ZILN-shaped target (zero mass + lognormal tail); LTV-bucket multiclass dropped |
| D6 | Multiple forward windows: 90 / 365 / 730 days; zero-mass + tail grow with window → built-in difficulty gradient |
| D7 | Value basis = gross revenue (sum of paid invoices in window) |
| D8 | First-class early-pLTV variant: tenure-anchored cold-start cutoff alongside the calendar-anchored standard regime (the Voyantis acquisition use case) |
| D9 | Churn (`churned_within_180d`) kept as a secondary task = the ZILN zero-inflation indicator |
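
To make D6 and D7 concrete, here is a minimal sketch of how one forward-window target could be derived from invoice rows. The `{paid, recovered}` status filter and the `cutoff < invoice_date <= cutoff + W days` window follow the design.md excerpts quoted in the review thread on this page; the `Invoice` dataclass and the `ltv_revenue` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

PAID_STATUSES = {"paid", "recovered"}  # statuses counted as gross revenue
WINDOWS_DAYS = (90, 365, 730)          # forward windows from D6


@dataclass(frozen=True)
class Invoice:
    customer_id: str
    invoice_date: date
    amount_usd: float
    payment_status: str


def ltv_revenue(invoices, customer_id, cutoff, window_days):
    """Sum paid invoice amounts with cutoff < invoice_date <= cutoff + window."""
    end = cutoff + timedelta(days=window_days)
    return sum(
        (
            inv.amount_usd
            for inv in invoices
            if inv.customer_id == customer_id
            and inv.payment_status in PAID_STATUSES
            and cutoff < inv.invoice_date <= end
        ),
        0.0,
    )


cutoff = date(2025, 1, 1)
invoices = [
    Invoice("c1", date(2025, 1, 1), 100.0, "paid"),       # on the cutoff: excluded
    Invoice("c1", date(2025, 2, 1), 100.0, "paid"),
    Invoice("c1", date(2025, 3, 1), 100.0, "failed"),     # unpaid: excluded
    Invoice("c1", date(2025, 6, 1), 100.0, "recovered"),
    Invoice("c1", date(2026, 6, 1), 100.0, "paid"),
]
targets = {w: ltv_revenue(invoices, "c1", cutoff, w) for w in WINDOWS_DAYS}
# targets == {90: 100.0, 365: 200.0, 730: 300.0}
```

A customer with no paid invoices after the cutoff gets `0.0` in every window, which is where the zero mass of the target comes from.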

Why this is a better fit

  • The simulation already produces a ZILN-shaped target for free: churned customers → ~0 forward revenue (zero mass), expansion customers → heavy tail.
  • The staggered-start design (D4) maps onto the cold-start angle: short-tenure customers are the hard early-prediction case.
  • Metrics shift from AUC to Spearman / normalized Gini / decile calibration / value capture, much of which the release_quality harness already computes for lead scoring.
  • The mrr_change_full_period leakage trap is more natural against a value target.
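
For readers unfamiliar with the ranking metrics named above, the following stdlib-only sketch shows one common way to compute normalized Gini and a top-fraction value capture. These are illustrative definitions written for this note, not the release_quality harness code; in particular, defining "value capture" as the share of total actual revenue held by the top-scored customers is an assumption.

```python
def _gini(actual, scores):
    """Area between the cumulative-gain curve (sorted by score) and random."""
    n = len(actual)
    total = sum(actual)
    order = sorted(range(n), key=lambda i: -scores[i])  # highest score first
    cumulative = 0.0
    area = 0.0
    for i in order:
        cumulative += actual[i]
        area += cumulative / total
    return area / n - (n + 1) / (2 * n)


def normalized_gini(actual, predicted):
    """Model Gini divided by the Gini of a perfect ranking (1.0 = perfect)."""
    return _gini(actual, predicted) / _gini(actual, actual)


def value_capture(actual, predicted, top_fraction=0.1):
    """Share of total actual value held by the top-scored fraction of customers."""
    n_top = max(1, round(len(actual) * top_fraction))
    order = sorted(range(len(actual)), key=lambda i: -predicted[i])
    return sum(actual[i] for i in order[:n_top]) / sum(actual)
```

Both metrics depend only on how predictions rank customers against realized revenue, so a model can score well here even when its absolute dollar predictions are off. Decile calibration is the complementary check on magnitude.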

Roadmap changes

LTV-Pc now ships regression task specs; LTV-Ph is the calendar-anchored
snapshot; new LTV-Pi adds the early-pLTV (tenure-anchored) task family;
LTV-Pj teaches the task-split writer a continuous-target path; LTV-Pl
calibrates regression metric bands; LTV-Pn notebooks teach ZILN + cold-start.
PR sequence is now LTV-Pa..Po.

Scope

Docs only. Schema foundation (LTV-Pb) is unaffected by the reframe — the
entity rows are the same; only tasks/features/snapshot/validation/notebooks
change. Framework code begins at LTV-M1.

🤖 Generated with Claude Code

…n [LTV-Pa]

The first planning pass framed the primary task as churn classification
(binary/multiclass), which pattern-matched onto the lead-scoring binary task
instead of the actual predictive-LTV literature. Corrected to continuous pLTV
regression, following Google lifetime_value / ZILN (arXiv:1912.07753) and
Voyantis pLTV framing.

Decisions added (design.md §2.2):
- D1 (corrected): primary task = continuous pLTV regression; ZILN-shaped
  target (zero mass + lognormal tail); LTV-bucket multiclass dropped.
- D6: multiple forward windows 90/365/730d (zero-mass + tail grow with window
  → built-in difficulty gradient).
- D7: value basis = gross revenue (sum of paid invoices in window).
- D8: first-class early-pLTV variant (tenure-anchored cold-start cutoff)
  alongside the calendar-anchored standard regime.
- D9: churn kept as a secondary task = the ZILN zero-inflation indicator.
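
The two observation regimes in D8 differ only in how each customer's cutoff date is chosen. The sketch below is one way to express that difference; the helper names and the 30-day tenure are hypothetical and not taken from the leadforge codebase.

```python
from datetime import date, timedelta
from typing import Optional


def calendar_cutoff(start_date: date, snapshot_date: date) -> Optional[date]:
    """Calendar-anchored: every active customer shares one snapshot date.

    Customers who start after the snapshot are not observable yet.
    """
    return snapshot_date if start_date <= snapshot_date else None


def tenure_cutoff(start_date: date, tenure_days: int) -> date:
    """Tenure-anchored: each customer is observed at the same short tenure."""
    return start_date + timedelta(days=tenure_days)


starts = {"early": date(2024, 1, 15), "late": date(2024, 11, 1)}
snapshot = date(2024, 12, 31)

calendar = {cid: calendar_cutoff(s, snapshot) for cid, s in starts.items()}
tenure = {cid: tenure_cutoff(s, tenure_days=30) for cid, s in starts.items()}
# calendar: both customers share the 2024-12-31 cutoff, with very different tenures
# tenure:   cutoffs are 2024-02-14 and 2024-12-01, each with exactly 30 days of history
```

Under the calendar regime the feature history ranges from weeks to months depending on start date. Under the tenure regime every customer has the same short history, which is the cold-start case.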

Knock-on doc changes:
- New §3 pLTV target (ZILN) + §3.1 two observation regimes.
- §8 targets: three ltv_revenue_{90,365,730}d regression columns + secondary
  churned_within_180d.
- §9 evaluation: Spearman / normalized Gini / decile calibration / value
  capture (not AUC); MSE shown as anti-pattern.
- roadmap.md: LTV-Pc now regression task specs; LTV-Ph calendar-anchored
  snapshot; new LTV-Pi early-pLTV task family; LTV-Pj regression task-split
  writer; LTV-Pl regression metric bands; LTV-Pn ZILN/cold-start notebooks.
  PR sequence now LTV-Pa..Po.
- .agent-plan.md LTV section synced.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 10, 2026 07:26
@shaypal5 added this to the dataset: leadforge-ltv-v1 milestone Jun 10, 2026
@shaypal5 added the type: docs and dataset: leadforge-ltv-v1 labels Jun 10, 2026
@shaypal5 merged commit ec93a2b into main Jun 10, 2026
9 of 10 checks passed
@shaypal5 deleted the docs/ltv-reframe-pltv-regression branch June 10, 2026 07:26
@github-actions

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #103 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: pull request opened
Workflow run: 27260326471 attempt 1
Comment timestamp: 2026-06-10T07:26:23.804947+00:00
PR head commit: 146280c65cce8b9f2a4a8575744543a8e7f21a47

Copilot AI left a comment

Pull request overview

This PR updates the LTV planning documentation to reframe the workstream from churn classification to predictive lifetime value (pLTV) regression with a ZILN-shaped target, aligning the docs with the intended modeling/pedagogy direction (multiple horizons, gross-revenue basis, early/cold-start variant, churn as auxiliary).

Changes:

  • Reframes the primary objective to continuous pLTV regression (ZILN), with 90/365/730-day forward windows and an early/tenure-anchored variant.
  • Updates the roadmap milestones/PR breakdown to reflect the revised task framing and downstream implementation plan.
  • Updates .agent-plan.md to reflect the revised locked decisions and roadmap identifiers.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

| File | Description |
|------|-------------|
| docs/ltv/roadmap.md | Updates milestone/PR plan and narrative to reflect pLTV regression; adds reframe notes and new/renamed work items. |
| docs/ltv/design.md | Rewrites goal/decisions/targets/metrics sections to define pLTV regression (ZILN), multiple horizons, and early-pLTV regime. |
| .agent-plan.md | Syncs the top-level agent tracker to the new pLTV regression framing and updated roadmap range. |

Comment thread docs/ltv/roadmap.md
| Milestone | Capability | PRs | GitHub PRs |
|-----------|------------|-----|------------|
| `LTV-M0` | Planning + design lock | `LTV-Pa` | _this PR_ |
| `LTV-M0` | Planning + design lock | `LTV-Pa` | #102 (+ pLTV reframe) |
Comment thread docs/ltv/roadmap.md
Comment on lines +148 to +149
- [ ] **`LTV-Pj`** — `feat(api,core,render): recipe_type dispatch + regression
task splits`. Add `n_customers` + lifecycle config (windows, early-tenure,
Comment thread docs/ltv/roadmap.md
Comment on lines +172 to +173
- [ ] **`LTV-Pl`** — `feat(validation): lifecycle leakage probes + pLTV metric
bands`. Lifecycle leakage probes (cutoff window check; banned terminal
Comment thread docs/ltv/roadmap.md
Comment on lines +72 to +74
(`LTV_REVENUE_{90,365,730}D`) + the secondary `CHURN_WITHIN_180D` to
`tasks.py`; extend the task-spec model to carry `task_type`
(`regression` | `classification`).
Comment thread docs/ltv/roadmap.md
Comment on lines +68 to +72
Add `CUSTOMER_SNAPSHOT_FEATURES` to `features.py` — including the three
continuous targets (`ltv_revenue_{90,365,730}d`), the secondary
`churned_within_180d`, and the `mrr_change_full_period` trap
(`leakage_risk=True`). Add **regression** task specs
(`LTV_REVENUE_{90,365,730}D`) + the secondary `CHURN_WITHIN_180D` to
Comment thread docs/ltv/design.md
split than making censoring a label-derivation hazard.
| D1 | Primary task type | **Continuous pLTV regression.** Target = future gross revenue over a forward window. ZILN-shaped (zero mass + lognormal tail). The LTV-bucket multiclass idea is dropped. |
| D6 | Target horizon(s) | **Multiple windows: 90 / 365 / 730 days.** Three regression targets per customer. Zero-inflation and tail-heaviness grow with the window, giving a built-in difficulty gradient. |
| D7 | Value basis | **Gross revenue** = sum of paid invoice amounts (`payment_status ∈ {paid, recovered}`) inside the window. Matches the MRR mechanics directly. |
Comment thread docs/ltv/design.md
Comment on lines +116 to +118
ltv_revenue_{W}d = Σ amount_usd for invoices with
payment_status ∈ {paid, recovered}
AND cutoff < invoice_date <= cutoff + W days
Comment thread docs/ltv/design.md
Comment on lines +75 to +79
| D1 | Primary task type | **Continuous pLTV regression.** Target = future gross revenue over a forward window. ZILN-shaped (zero mass + lognormal tail). The LTV-bucket multiclass idea is dropped. |
| D6 | Target horizon(s) | **Multiple windows: 90 / 365 / 730 days.** Three regression targets per customer. Zero-inflation and tail-heaviness grow with the window, giving a built-in difficulty gradient. |
| D7 | Value basis | **Gross revenue** = sum of paid invoice amounts (`payment_status ∈ {paid, recovered}`) inside the window. Matches the MRR mechanics directly. |
| D8 | Early/cold-start emphasis | **First-class early-pLTV task variant.** A tenure-anchored observation regime (observe each customer at a fixed short tenure, predict long-horizon value) ships alongside the calendar-anchored standard regime. |
| D9 | Churn task | **Kept as a secondary/auxiliary task** (`churn_within_180_days`), exposing the ZILN zero-inflation indicator. Not the headline. |
Comment thread docs/ltv/design.md
Comment on lines +346 to +349
| `ltv_revenue_90d` | Float64 | primary regression (warm-up horizon) |
| `ltv_revenue_365d` | Float64 | primary regression (standard horizon) |
| `ltv_revenue_730d` | Float64 | primary regression (hard horizon) |
| `churned_within_180d` | boolean | secondary / ZILN zero-inflation indicator |