docs(ltv): reframe target to pLTV regression (ZILN) [LTV-Pa] #103
Merged
The first planning pass framed the primary task as churn classification
(binary/multiclass), which pattern-matched onto the lead-scoring binary task
instead of the actual predictive-LTV literature. Corrected to continuous pLTV
regression, following Google lifetime_value / ZILN (arXiv:1912.07753) and
Voyantis pLTV framing.
Decisions added (design.md §2.2):
- D1 (corrected): primary task = continuous pLTV regression; ZILN-shaped
target (zero mass + lognormal tail); LTV-bucket multiclass dropped.
- D6: multiple forward windows 90/365/730d (zero-mass + tail grow with window
→ built-in difficulty gradient).
- D7: value basis = gross revenue (sum of paid invoices in window).
- D8: first-class early-pLTV variant (tenure-anchored cold-start cutoff)
alongside the calendar-anchored standard regime.
- D9: churn kept as a secondary task = the ZILN zero-inflation indicator.
Knock-on doc changes:
- New §3 pLTV target (ZILN) + §3.1 two observation regimes.
- §8 targets: three ltv_revenue_{90,365,730}d regression columns + secondary
churned_within_180d.
- §9 evaluation: Spearman / normalized Gini / decile calibration / value
capture (not AUC); MSE shown as anti-pattern.
- roadmap.md: LTV-Pc now regression task specs; LTV-Ph calendar-anchored
snapshot; new LTV-Pi early-pLTV task family; LTV-Pj regression task-split
writer; LTV-Pl regression metric bands; LTV-Pn ZILN/cold-start notebooks.
PR sequence now LTV-Pa..Po.
- .agent-plan.md LTV section synced.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pr-agent-context report: No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #103 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.
Pull request overview
This PR updates the LTV planning documentation to reframe the workstream from churn classification to predictive lifetime value (pLTV) regression with a ZILN-shaped target, aligning the docs with the intended modeling/pedagogy direction (multiple horizons, gross-revenue basis, early/cold-start variant, churn as auxiliary).
Changes:
- Reframes the primary objective to continuous pLTV regression (ZILN), with 90/365/730-day forward windows and an early/tenure-anchored variant.
- Updates the roadmap milestones/PR breakdown to reflect the revised task framing and downstream implementation plan.
- Updates `.agent-plan.md` to reflect the revised locked decisions and roadmap identifiers.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| docs/ltv/roadmap.md | Updates milestone/PR plan and narrative to reflect pLTV regression; adds reframe notes and new/renamed work items. |
| docs/ltv/design.md | Rewrites goal/decisions/targets/metrics sections to define pLTV regression (ZILN), multiple horizons, and early-pLTV regime. |
| .agent-plan.md | Syncs the top-level agent tracker to the new pLTV regression framing and updated roadmap range. |
| Milestone | Capability | PRs | GitHub PRs |
|-----------|------------|-----|------------|
| `LTV-M0` | Planning + design lock | `LTV-Pa` | _this PR_ |
| `LTV-M0` | Planning + design lock | `LTV-Pa` | #102 (+ pLTV reframe) |
Comment on lines +148 to +149
- [ ] **`LTV-Pj`** — `feat(api,core,render): recipe_type dispatch + regression
  task splits`. Add `n_customers` + lifecycle config (windows, early-tenure,
Comment on lines +172 to +173
- [ ] **`LTV-Pl`** — `feat(validation): lifecycle leakage probes + pLTV metric
  bands`. Lifecycle leakage probes (cutoff window check; banned terminal
Comment on lines +72 to +74
(`LTV_REVENUE_{90,365,730}D`) + the secondary `CHURN_WITHIN_180D` to
`tasks.py`; extend the task-spec model to carry `task_type`
(`regression` | `classification`).
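As a rough illustration of the `task_type` extension this hunk describes, the sketch below shows one way a task spec could carry the field. The `TaskSpec` class, its field names, and the `LTV_TASKS` registry are hypothetical and are not taken from leadforge's `tasks.py`; only the task names, target columns, and the two `task_type` values come from the diff.

```python
from dataclasses import dataclass
from typing import Literal

TaskType = Literal["regression", "classification"]


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task-spec record; the field names are assumptions."""

    name: str
    target_column: str
    task_type: TaskType

    def __post_init__(self) -> None:
        # Literal is not enforced at runtime, so validate explicitly.
        if self.task_type not in ("regression", "classification"):
            raise ValueError(f"unknown task_type: {self.task_type!r}")


# Three continuous pLTV targets, one per forward window.
LTV_TASKS = {
    f"LTV_REVENUE_{w}D": TaskSpec(f"LTV_REVENUE_{w}D", f"ltv_revenue_{w}d", "regression")
    for w in (90, 365, 730)
}
# The secondary churn task stays a classification task.
LTV_TASKS["CHURN_WITHIN_180D"] = TaskSpec(
    "CHURN_WITHIN_180D", "churned_within_180d", "classification"
)
```

Carrying `task_type` on the spec lets downstream code (split writers, metric selection) branch on one field instead of inferring the task kind from the target column's dtype.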
Comment on lines +68 to +72
Add `CUSTOMER_SNAPSHOT_FEATURES` to `features.py` — including the three
continuous targets (`ltv_revenue_{90,365,730}d`), the secondary
`churned_within_180d`, and the `mrr_change_full_period` trap
(`leakage_risk=True`). Add **regression** task specs
(`LTV_REVENUE_{90,365,730}D`) + the secondary `CHURN_WITHIN_180D` to
split than making censoring a label-derivation hazard.
| D1 | Primary task type | **Continuous pLTV regression.** Target = future gross revenue over a forward window. ZILN-shaped (zero mass + lognormal tail). The LTV-bucket multiclass idea is dropped. |
| D6 | Target horizon(s) | **Multiple windows: 90 / 365 / 730 days.** Three regression targets per customer. Zero-inflation and tail-heaviness grow with the window, giving a built-in difficulty gradient. |
| D7 | Value basis | **Gross revenue** = sum of paid invoice amounts (`payment_status ∈ {paid, recovered}`) inside the window. Matches the MRR mechanics directly. |
Comment on lines +116 to +118
ltv_revenue_{W}d = Σ amount_usd for invoices with
    payment_status ∈ {paid, recovered}
    AND cutoff < invoice_date <= cutoff + W days
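The target definition in this hunk can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the invoice rows are plain dicts, the helper name `ltv_revenue` is hypothetical, and the `"failed"` status is an invented example of a non-qualifying value. The field names `amount_usd`, `payment_status`, and `invoice_date`, and the half-open window `cutoff < invoice_date <= cutoff + W`, come from the diff.

```python
from datetime import date, timedelta

PAID_STATUSES = {"paid", "recovered"}


def ltv_revenue(invoices, cutoff, window_days):
    """Sum amount_usd over paid/recovered invoices dated in (cutoff, cutoff + window_days]."""
    end = cutoff + timedelta(days=window_days)
    return sum(
        inv["amount_usd"]
        for inv in invoices
        if inv["payment_status"] in PAID_STATUSES
        and cutoff < inv["invoice_date"] <= end
    )


cutoff = date(2024, 1, 1)
invoices = [
    # Dated exactly on the cutoff: excluded by the strict lower bound.
    {"invoice_date": date(2024, 1, 1), "amount_usd": 100.0, "payment_status": "paid"},
    {"invoice_date": date(2024, 2, 1), "amount_usd": 100.0, "payment_status": "paid"},
    # Hypothetical unpaid status: excluded from gross revenue.
    {"invoice_date": date(2024, 3, 1), "amount_usd": 100.0, "payment_status": "failed"},
    # Exactly cutoff + 90 days: included by the inclusive upper bound.
    {"invoice_date": date(2024, 3, 31), "amount_usd": 50.0, "payment_status": "recovered"},
    {"invoice_date": date(2024, 6, 1), "amount_usd": 100.0, "payment_status": "paid"},
]

targets = {w: ltv_revenue(invoices, cutoff, w) for w in (90, 365, 730)}
# targets == {90: 150.0, 365: 250.0, 730: 250.0}
```

The strict lower bound keeps any invoice dated on the cutoff out of the target, so revenue observable at the cutoff is never counted as future value.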
Comment on lines +75 to +79
| D1 | Primary task type | **Continuous pLTV regression.** Target = future gross revenue over a forward window. ZILN-shaped (zero mass + lognormal tail). The LTV-bucket multiclass idea is dropped. |
| D6 | Target horizon(s) | **Multiple windows: 90 / 365 / 730 days.** Three regression targets per customer. Zero-inflation and tail-heaviness grow with the window, giving a built-in difficulty gradient. |
| D7 | Value basis | **Gross revenue** = sum of paid invoice amounts (`payment_status ∈ {paid, recovered}`) inside the window. Matches the MRR mechanics directly. |
| D8 | Early/cold-start emphasis | **First-class early-pLTV task variant.** A tenure-anchored observation regime (observe each customer at a fixed short tenure, predict long-horizon value) ships alongside the calendar-anchored standard regime. |
| D9 | Churn task | **Kept as a secondary/auxiliary task** (`churn_within_180_days`), exposing the ZILN zero-inflation indicator. Not the headline. |
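The difference between the two observation regimes in D8 comes down to how each customer's cutoff is chosen. The sketch below is a minimal illustration, not the planned implementation: both function names are hypothetical, and the 30-day tenure is an assumed value, since the snippet only says "a fixed short tenure".

```python
from datetime import date, timedelta


def calendar_anchored_cutoff(snapshot_date, signup_date):
    """Standard regime: every customer shares one calendar cutoff.

    Returns None for customers who had not signed up by the snapshot.
    """
    return snapshot_date if signup_date <= snapshot_date else None


def tenure_anchored_cutoff(signup_date, tenure_days):
    """Early-pLTV regime: each customer is observed at the same tenure."""
    return signup_date + timedelta(days=tenure_days)


signups = {"c1": date(2023, 1, 10), "c2": date(2023, 11, 20)}
snapshot = date(2024, 1, 1)

calendar = {c: calendar_anchored_cutoff(snapshot, s) for c, s in signups.items()}
early = {c: tenure_anchored_cutoff(s, 30) for c, s in signups.items()}  # 30 days is an assumption

# Observed tenure varies under the calendar regime and is fixed under the early regime.
calendar_tenure = {c: (calendar[c] - s).days for c, s in signups.items()}
early_tenure = {c: (early[c] - s).days for c, s in signups.items()}
```

Under the calendar regime the model sees 356 days of history for `c1` and 42 for `c2`; under the tenure regime it sees 30 days for both, which is what makes that variant a cold-start task.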
Comment on lines +346 to +349
| `ltv_revenue_90d` | Float64 | primary regression (warm-up horizon) |
| `ltv_revenue_365d` | Float64 | primary regression (standard horizon) |
| `ltv_revenue_730d` | Float64 | primary regression (hard horizon) |
| `churned_within_180d` | boolean | secondary / ZILN zero-inflation indicator |
Summary
Corrects the LTV planning docs (merged in #102) from a churn-classification
framing to the correct predictive-lifetime-value (pLTV) regression framing.
Docs only — no package code.
The first pass pattern-matched LTV onto the lead-scoring binary task and asked
"binary vs multiclass." That's the wrong axis. LTV is a regression problem:
predict continuous future monetary value. This follows Google
`lifetime_value` / the ZILN paper (arXiv:1912.07753) and the
Voyantis pLTV framing.
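For readers unfamiliar with the ZILN target shape, the sketch below writes out a per-example zero-inflated lognormal negative log-likelihood and the implied mean prediction. It is a plain-Python illustration of the general technique, not the `lifetime_value` reference implementation: the function names are hypothetical, and `sigma` is passed in directly as a positive number rather than derived from a raw network output.

```python
import math


def ziln_nll(y, logit_p, mu, sigma):
    """Per-example ZILN negative log-likelihood.

    p = sigmoid(logit_p) models P(y > 0); given y > 0, log(y) ~ Normal(mu, sigma^2).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    p = 1.0 / (1.0 + math.exp(-logit_p))
    if y <= 0:
        # Zero-mass branch: only the classification term contributes.
        return -math.log(1.0 - p)
    log_y = math.log(y)
    lognormal_nll = (
        log_y
        + math.log(sigma)
        + 0.5 * math.log(2.0 * math.pi)
        + (log_y - mu) ** 2 / (2.0 * sigma**2)
    )
    return -math.log(p) + lognormal_nll


def ziln_mean(logit_p, mu, sigma):
    """Expected value: P(y > 0) times the lognormal mean exp(mu + sigma^2 / 2)."""
    p = 1.0 / (1.0 + math.exp(-logit_p))
    return p * math.exp(mu + 0.5 * sigma**2)
```

The classification term is where a churn-style signal enters the loss, while the lognormal term only applies to customers with positive forward revenue.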
Corrected / added decisions (design.md §2.2)
- … (`churned_within_180d`) kept as a secondary task = the ZILN zero-inflation indicator
Why this is a better fit
- … customers → ~0 forward revenue (zero mass), expansion customers → heavy tail.
- … tenure customers are the hard early-prediction case.
- … value capture, much of which the `release_quality` harness already computes for lead scoring.
- … `mrr_change_full_period` leakage trap is more natural against a value target.
Roadmap changes
`LTV-Pc` now ships regression task specs; `LTV-Ph` is the calendar-anchored
snapshot; new `LTV-Pi` adds the early-pLTV (tenure-anchored) task family;
`LTV-Pj` teaches the task-split writer a continuous-target path; `LTV-Pl`
calibrates regression metric bands; `LTV-Pn` notebooks teach ZILN + cold-start.
PR sequence is now `LTV-Pa..Po`.
Scope
Docs only. Schema foundation (`LTV-Pb`) is unaffected by the reframe: the
entity rows are the same; only tasks/features/snapshot/validation/notebooks
change. Framework code begins at `LTV-M1`.
🤖 Generated with Claude Code