feat(lifecycle): churn / expansion / payment hazard functions [LTV-Pj] #117

Merged
shaypal5 merged 3 commits into main from feat/lifecycle-hazards
Jun 12, 2026

Conversation

@shaypal5

Contributor

Summary

First half of the lifecycle simulation engine milestone (LTV-Pj, milestone LTV-M4). Adds the pure hazard functions that turn latent state + mechanism params (from LTV-Pi, #116) into per-step event probabilities. The weekly engine (LTV-Pk) will own every Bernoulli draw against these.

What's added — leadforge/schemes/lifecycle/hazards.py

| function | composition |
| --- | --- |
| churn_probability(params, latents, week, term) | base weekly rate × latent multiplier × onboarding elevation; on an anniversary week additionally × renewal_hazard_multiplier × renewal latent multiplier |
| expansion_probability(params, latents, depth=None) | base weekly rate × latent multiplier × optional (0.5 + feature_depth_score) health modulation |
| payment_failure_probability(params, latents) | base monthly rate × latent multiplier (budget-stability dominated) |
| is_renewal_week(week, term_months) | public anniversary predicate (round(k · term · 52/12)); the engine emits renewal events on exactly the boundary the churn spike uses |
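
The anniversary predicate in the last row can be sketched as follows. This is a minimal illustration that assumes only the round(k · term · 52/12) formula and the validation quoted in the review thread; the function name `is_renewal_week_sketch` and the loop structure are hypothetical, not the code in hazards.py.

```python
def is_renewal_week_sketch(week_of_tenure: int, contract_term_months: int) -> bool:
    """Return True when the week lands on round(k * term * 52 / 12) for some k >= 1."""
    if week_of_tenure < 0:
        raise ValueError(f"week_of_tenure must be >= 0, got {week_of_tenure}")
    if contract_term_months < 1:
        raise ValueError(f"contract_term_months must be >= 1, got {contract_term_months}")
    if week_of_tenure == 0:
        return False  # the contract start is not a renewal

    k = 1
    while True:
        # k-th anniversary expressed in whole weeks (52 weeks per 12 months).
        boundary = round(k * contract_term_months * 52 / 12)
        if boundary == week_of_tenure:
            return True
        if boundary > week_of_tenure:
            return False  # passed the candidate week without hitting it
        k += 1
```

Under this sketch a 12-month term renews at week 52, a 24-month term does not renew at week 52, and a 13-month term (56.33 weeks) renews at week 56.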

Design decisions

  • Pure / deterministic — no RNG inside the hazards; the engine owns all draws. Exact-value and shape tests need no seeding.
  • Cox-style latent modulation exp(Σ w·(latent − 0.5)): neutral latents → ×1.0 (base rate unchanged); missing traits treated as neutral rather than raising or silently zeroing; matches the sign convention fixed in mechanisms.py.
  • Onboarding elevation — exponential decay from 2.5× at week 0 (time-constant 4 weeks; <8% residual by week 12) — the decreasing-hazard early-Weibull behaviour from design.md §6.
  • Probability cap 0.95 — extreme tails never make an event certain; mis-calibration degrades visibly instead of saturating silently.
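
A rough sketch of how these four decisions could compose into a churn hazard. The elevation form `1 + 1.5 * exp(-week / 4)` is an assumption chosen to match the stated numbers (2.5× at week 0, time constant 4 weeks, under 8% excess by week 12); the function names, argument layout, and the `latent_fit` key are hypothetical and do not reflect the real signatures in hazards.py.

```python
import math

MAX_PROBABILITY = 0.95            # cap from the design notes
NEUTRAL_LATENT = 0.5              # latents are centred here
ONBOARDING_PEAK = 2.5             # elevation at week 0
ONBOARDING_TIME_CONSTANT = 4.0    # weeks


def latent_multiplier(weights: dict[str, float], latents: dict[str, float]) -> float:
    """exp(sum of w * (latent - 0.5)); a missing trait counts as neutral."""
    exponent = sum(
        weight * (latents.get(name, NEUTRAL_LATENT) - NEUTRAL_LATENT)
        for name, weight in weights.items()
    )
    return math.exp(exponent)


def onboarding_elevation(week: int) -> float:
    """Assumed shape: decays from 2.5x toward 1.0x with a 4-week time constant."""
    return 1.0 + (ONBOARDING_PEAK - 1.0) * math.exp(-week / ONBOARDING_TIME_CONSTANT)


def churn_probability_sketch(
    base_weekly_rate: float,
    weights: dict[str, float],
    latents: dict[str, float],
    week: int,
) -> float:
    """Pure function: no RNG, so the caller owns every Bernoulli draw."""
    raw = base_weekly_rate * latent_multiplier(weights, latents) * onboarding_elevation(week)
    return min(raw, MAX_PROBABILITY)
```

With a negative weight on a good trait, a latent above 0.5 gives a multiplier below 1.0, which is the sign convention the bullet describes.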

Tests (40)

tests/schemes/lifecycle/test_hazards.py: renewal-week arithmetic (12/24/13-month terms incl. non-integer term-weeks, adjacent weeks, week 0, input validation); bounds at extreme latents across all 5 motifs; neutral-latents ≈ base rate; fit/velocity/budget monotonicity; onboarding elevation + monotone decay; renewal spike (>5× adjacent weeks) + champion-dampens-spike; missing-latents neutrality; cap behaviour; determinism; depth modulation + range validation; cross-motif sanity.

  • Full suite 1686 passed / 51 skipped (+40); ruff + mypy clean.

Next

LTV-Pk — the weekly simulation engine (simulate_lifecycle()): the per-customer weekly loop that draws against these hazards and emits subscription_events, health_signals, and invoices.

🤖 Generated with Claude Code

First half of the lifecycle simulation engine milestone (LTV-M4). Adds the
pure hazard functions that convert latent state + mechanism params into
per-step event probabilities; the weekly engine (LTV-Pk) will own every
Bernoulli draw against them.

leadforge/schemes/lifecycle/hazards.py:
- churn_probability(params, latents, week_of_tenure, contract_term_months):
  base weekly rate × Cox-style latent multiplier × onboarding elevation
  (exponential decay from 2.5× at week 0, time-constant 4 weeks — the
  decreasing-hazard early-Weibull behaviour from design.md §6) and, on a
  contract-anniversary week, × renewal_hazard_multiplier × renewal latent
  multiplier (champion-fights-for-renewal).
- expansion_probability(params, latents, feature_depth_score=None): base
  weekly rate × latent multiplier, optionally × (0.5 + depth) health
  modulation (depth 0.5 neutral); validates depth in [0, 1].
- payment_failure_probability(params, latents): base monthly rate × latent
  multiplier (budget-stability dominated).
- is_renewal_week(week, contract_term_months): public anniversary predicate
  (round(k · term · 52/12)) so the engine emits renewal events on exactly the
  same boundary the churn spike uses. Validates inputs.

Design notes:
- Latent modulation is proportional-hazards style: exp(Σ w·(latent − 0.5));
  neutral latents → multiplier 1.0; missing traits treated as neutral.
  Matches the sign convention fixed in mechanisms.py (negative weight on a
  good trait reduces the hazard).
- All probabilities capped at 0.95 so extreme tails never make events certain.
- Functions are deterministic (no RNG) — exact-value and shape tests need no
  seeding.

tests/schemes/lifecycle/test_hazards.py (40 tests): renewal-week arithmetic
(12/24/13-month terms, adjacents, week 0, input validation), bounds at extreme
latents across all 5 motifs, neutral-latents ≈ base rate, fit/velocity/budget
monotonicity, onboarding elevation + monotone decay, renewal spike (>5×
adjacent weeks) + champion dampening, missing-latents neutrality, cap
behaviour, determinism, depth modulation + range validation, cross-motif
sanity (fragile fails payments more; churner churns more than growth).

Full suite 1686 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 12, 2026 08:32
@shaypal5 added this to the dataset: leadforge-ltv-v1 milestone Jun 12, 2026
@shaypal5 added the type: feature, layer: mechanisms, status: needs review, and dataset: leadforge-ltv-v1 labels Jun 12, 2026
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Adds the first set of deterministic lifecycle “hazard” functions (churn, expansion, payment failure) that map motif mechanism parameters + latent traits into per-step event probabilities, along with a dedicated test suite and roadmap bookkeeping updates. This supports the upcoming weekly lifecycle simulation engine by centralizing the probability math in pure, easily testable helpers.

Changes:

  • Introduces leadforge/schemes/lifecycle/hazards.py with pure probability functions and a shared renewal-week predicate.
  • Adds tests/schemes/lifecycle/test_hazards.py covering renewal arithmetic, monotonicity, neutrality of missing latents, capping, and cross-motif sanity checks.
  • Updates LTV roadmap / agent plan status text to reference the new PR.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| leadforge/schemes/lifecycle/hazards.py | New hazard/probability functions and is_renewal_week() helper used by the upcoming engine. |
| tests/schemes/lifecycle/test_hazards.py | New unit tests validating probability bounds, shapes, renewal spike behavior, and determinism. |
| docs/ltv/roadmap.md | Roadmap bookkeeping updates for LTV-M3/M4 items and PR references. |
| .agent-plan.md | Status/progress note updated for the LTV workstream. |

Comment on lines +112 to +117
if week_of_tenure < 0:
    raise ValueError(f"week_of_tenure must be >= 0, got {week_of_tenure}")
if contract_term_months < 1:
    raise ValueError(f"contract_term_months must be >= 1, got {contract_term_months}")
if week_of_tenure == 0:
    return False
Comment thread docs/ltv/roadmap.md
| `LTV-M2` | Generation-scheme architecture + physical reorg | `LTV-Pd`, `LTV-Pe`, `LTV-Pf`, `LTV-Pg` | #107 (Pd), #108 (Pe), #109 (Pf.1), #110 (Pf.2), #111 (Pg.1), #112 (Pg.2) |
| `LTV-M3` | Customer population + lifecycle world | `LTV-Ph`, `LTV-Pi` | #113 (Ph) |
| `LTV-M4` | Lifecycle simulation engine | `LTV-Pj`, `LTV-Pk` | |
| `LTV-M4` | Lifecycle simulation engine | `LTV-Pj`, `LTV-Pk` | #117 (Pj) |
Comment thread docs/ltv/roadmap.md
## `LTV-M4` — Lifecycle simulation engine

- [ ] **`LTV-Pj`** — `feat(lifecycle): churn / expansion / payment hazards`.
- [ ] **`LTV-Pj`** — `feat(lifecycle): churn / expansion / payment hazards` (**PR #117**).
…-Pj]

Five findings from hostile self-review of the initial commit:

1. CALIBRATION HONESTY (the real one): the tenure shape added in this PR
   silently invalidates the annual-churn annotations corrected in LTV-Pi.
   Onboarding elevation contributes ~6.8 x base_rate of extra first-year churn
   mass and each renewal spike adds (multiplier - 1) x base_rate, so true
   first-year churn runs ~5-14 points above the base-rate-only figures
   (churner_dominated ~52%, outside the advanced band [0.30, 0.50] the comment
   claimed to target). mechanisms.py now states explicitly that the "% annual"
   figures are BASE-RATE-ONLY and that final calibration happens in the LTV-Pk
   engine tests, where base rates are expected to be tuned DOWN.

2. _MAX_PROBABILITY was private but imported by tests and documented in three
   docstrings as part of the contract — same smell class as the _empty_df
   finding in Pg.1. Promoted to public MAX_PROBABILITY, added to __all__.

3. _NEUTRAL_LATENT was doing double duty as the feature-depth multiplier floor
   (semantically unrelated; recentring latents would silently change the
   health modulation). Split out _DEPTH_MULTIPLIER_FLOOR.

4. Stated the uniform-across-motifs rationale for the hardcoded onboarding
   shape (customer-success process constant — mirrors the lead-scoring
   follow-up-ramp precedent), plus a note that latents are deliberately not
   range-validated in the hot path, and a proof comment that banker's rounding
   in is_renewal_week is unreachable (frac(k*13m/3) ∈ {0, 1/3, 2/3}).

5. Added the missing negative renewal test: is_renewal_week(52, 24) is False —
   a 24-month contract must not spike mid-contract.
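
The ~6.8 × base_rate figure in finding 1 can be checked with a few lines of arithmetic. The sketch below assumes the onboarding elevation has the form `1 + 1.5 * exp(-week / 4)` (consistent with 2.5× at week 0 and a 4-week time constant) and uses a purely hypothetical base rate of 0.005 per week; it ignores latents and renewal spikes, so it illustrates the direction of the effect rather than reproducing the PR's calibration numbers.

```python
import math


def onboarding_elevation(week: int) -> float:
    """Assumed shape: 2.5x at week 0, decaying to 1.0x with a 4-week time constant."""
    return 1.0 + 1.5 * math.exp(-week / 4.0)


def first_year_churn(base_weekly_rate: float, with_onboarding: bool) -> float:
    """1 - product of weekly survival over weeks 0..51 (neutral latents, no renewal)."""
    survival = 1.0
    for week in range(52):
        p = base_weekly_rate
        if with_onboarding:
            p *= onboarding_elevation(week)
        survival *= 1.0 - min(p, 0.95)
    return 1.0 - survival


# Extra hazard mass contributed by onboarding, in units of the base weekly rate.
extra_mass = sum(onboarding_elevation(week) - 1.0 for week in range(52))

HYPOTHETICAL_BASE_RATE = 0.005  # illustrative only, not a value from mechanisms.py
base_only = first_year_churn(HYPOTHETICAL_BASE_RATE, with_onboarding=False)
with_shape = first_year_churn(HYPOTHETICAL_BASE_RATE, with_onboarding=True)
print(f"extra onboarding mass: {extra_mass:.2f} x base_rate")
print(f"first-year churn: base-only {base_only:.3f}, with onboarding {with_shape:.3f}")
```

Because the tenure shape only ever adds hazard, the base-rate-only annual figure is a lower bound on simulated first-year churn, which is why the finding expects base rates to be tuned down.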

Full suite 1687 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@github-actions

pr-agent-context report:

This run includes unresolved review comments on PR #117 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/schemes/lifecycle/hazards.py:134
URL: https://github.com/leadforge-dev/leadforge/pull/117#discussion_r3401862299
Root author: copilot-pull-request-reviewer

Comment:
    `is_renewal_week()` docstring says `week_of_tenure` / `contract_term_months` must be (positive) integers, but the implementation only checks numeric bounds. Passing floats/bools (e.g. `True`) will silently behave like `1`, which can hide caller bugs and make the public API contract inaccurate. Consider validating both inputs as `int` (and explicitly rejecting `bool`, which is a subclass of `int`) before doing the arithmetic.
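
    One possible shape for the validation this comment suggests, as a sketch only. The helper name `require_int` is hypothetical, and the thread shown here does not say whether the PR adopted this or another approach.

```python
def require_int(name: str, value: object, *, minimum: int) -> int:
    """Reject bools and non-ints before any range check or arithmetic."""
    # bool is a subclass of int, so it has to be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} must be an int, got {type(value).__name__}")
    if value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value
```

    A caller would run `require_int("week_of_tenure", week_of_tenure, minimum=0)` and `require_int("contract_term_months", contract_term_months, minimum=1)` ahead of the existing bounds checks.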

## COPILOT-2
Location: docs/ltv/roadmap.md:47
URL: https://github.com/leadforge-dev/leadforge/pull/117#discussion_r3401862337
Root author: copilot-pull-request-reviewer

Comment:
    The roadmap's Planning notation says GitHub PR numbers are recorded back here on merge, but this row now lists `#117 (Pj)` while `LTV-Pj` is still unchecked below. Either remove the PR number until merge, or update the roadmap convention text to match the new practice.

## COPILOT-3
Location: docs/ltv/roadmap.md:176
URL: https://github.com/leadforge-dev/leadforge/pull/117#discussion_r3401862358
Root author: copilot-pull-request-reviewer

Comment:
    This milestone bullet lists `(**PR #117**)` but remains unchecked (`[ ]`). If the convention is to record PR numbers only on merge (as stated earlier in the doc), drop the PR reference until it lands; otherwise the checklist state and PR tracking are inconsistent.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 27406548799 attempt 1
Comment timestamp: 2026-06-12T09:16:10.743709+00:00
PR head commit: 8d2001cc21751780c77531d3d7a651cc342b8c29

@shaypal5 merged commit 40ce4d4 into main Jun 12, 2026
10 checks passed
@shaypal5 deleted the feat/lifecycle-hazards branch June 12, 2026 09:23
shaypal5 added a commit that referenced this pull request Jun 12, 2026
* feat(lifecycle): weekly simulation engine [LTV-Pk]

Second half of LTV-M4 — the weekly lifecycle simulator that turns a customer
population into the three event tables plus terminal subscription state.

leadforge/schemes/lifecycle/engine.py:
- simulate_lifecycle(population, seed, *, forward_window_days=730,
  early_tenure_weeks=4) → LifecycleSimulationResult{subscriptions,
  subscription_events, health_signals, invoices}.
- Fully simulated target windows (D6): each customer runs through
  max(observation_date, start + early_tenure_weeks) + forward_window_days, so
  all 90/365/730d pLTV targets are complete for BOTH observation regimes.
- Per-customer RNG substreams (lifecycle_sim::<customer_id>): one customer's
  trajectory is invariant to every other customer — stronger stability than
  the lead-scoring shared streams, and provable in a test (solo-resimulation
  equals the full-population trajectory).
- Fixed weekly step order: health signal → invoice/dunning → churn draw →
  renewal event → expansion draw. Causal churn reasons: payment_failure
  (write-off), non_renewal (spiked draw on anniversary week), voluntary.
- Health signals: weekly active_users (plan-seat base x adoption x onboarding
  usage ramp), feature_depth_score (latent plateau x ramp — feeds the same
  week's expansion hazard, creating latents → health → expansion causality),
  Knuth-Poisson support tickets (fit-driven), quarterly NPS (null off-cycle).
- Invoices on month boundaries (12 per 52 weeks, i.e. every 52/12 ≈ 4.33
  weeks) at current MRR; failures enter dunning and resolve to recovered or
  written_off → forced churn.
- subscription_id derived from the customer index up front and threaded into
  every event (no back-fill pass).
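
A minimal sketch of the per-customer substream idea together with a Knuth-style Poisson draw. The `lifecycle_sim::<customer_id>` key comes from the bullet above; hashing it with the seed via SHA-256, the use of `random.Random`, and every function name and default below are assumptions made for illustration, not the engine's actual mechanism.

```python
import hashlib
import math
import random


def customer_rng(seed: int, customer_id: str) -> random.Random:
    """Derive an independent stream from (seed, 'lifecycle_sim::<customer_id>')."""
    key = f"{seed}::lifecycle_sim::{customer_id}".encode()
    digest = hashlib.sha256(key).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def knuth_poisson(rng: random.Random, lam: float) -> int:
    """Knuth's multiplication method: count uniforms until the product drops below e^-lam."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_tickets(
    seed: int, customer_ids: list[str], weeks: int = 8, lam: float = 0.7
) -> dict[str, list[int]]:
    """Weekly support-ticket counts, one private stream per customer."""
    result = {}
    for customer_id in customer_ids:
        rng = customer_rng(seed, customer_id)
        result[customer_id] = [knuth_poisson(rng, lam) for _ in range(weeks)]
    return result
```

Because each stream depends only on the seed and the customer's own id, `simulate_tickets(7, ["c2"])` yields the same `c2` trajectory as a run over the whole population, which is the solo-resimulation property the commit message describes.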

leadforge/schemes/lifecycle/population.py:
- CustomerPopulationResult records motif_family so the engine fetches the same
  family's mechanism params (passing it separately would invite silent drift).

leadforge/schemes/lifecycle/mechanisms.py — ENGINE CALIBRATION (discharges the
obligation recorded in the #117 review):
- Measured simulated first-year churn on motif-biased populations (n=600,
  3 seed pairs) and retuned churn base rates, payment-failure rates, and
  recovery rates over three rounds. Before: payment_fragile 80.5%,
  churner_dominated 66.8%. After: product_led ~19-21%, relationship ~23-27%,
  expansion_led ~13-18%, payment_fragile ~35-39%, churner_dominated ~41-44% —
  matching the per-motif intent while honouring the Pi directional-test
  constraints (fragile failure > 2x others, fragile recovery strictly lowest).
- Calibration comment rewritten from "expected to be tuned DOWN" to
  ENGINE-CALIBRATED with the measured targets and a pointer to the guard test.

tests/schemes/lifecycle/test_engine.py (25 tests): shape/validation,
byte-equal determinism across all four tables, per-customer independence
(solo-resimulation), weekly health cadence, monthly invoice cadence (±1),
quarterly-only NPS, full-window coverage for active customers, event FK
integrity, churn-state consistency + no-events-after-churn, expansion MRR
chain reconciliation (events chain to subscription.current_mrr and
expansion_count), renewal-events-only-on-anniversaries + renewal_count
reconciliation, dunning resolution / write-off → payment_failure churn,
per-motif year-1 churn bands, expansion-world dominance, majority-active-at-
observation sanity.

Engine docstring records the two tracked exclusions: downgrade events (no
mechanism params yet — downgrade_count would be zero-variance; revisit before
LTV-M5 ships the feature) and difficulty-tier scaling (LTV-M6).

Full suite 1712 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(ltv): record LTV-Pk (#118) in roadmap + agent-plan [LTV-Pk]

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(lifecycle): address self-review findings on the engine [LTV-Pk]

Three findings from hostile self-review of the initial engine commit:

1. ACCOUNT LATENTS WERE COMPLETELY DEAD (the real one). The customer latent
   dict contains every account latent key (latent_budget_stability,
   latent_organizational_stability) plus three more, and the merge let
   customer values shadow account values — so every account-level draw was
   discarded for every customer, and the comment called the collision
   "(deliberate)". This also destroyed the within-account correlation that
   account-level draws exist to provide (~3 customers share an account).
   Fix: explicit _merge_latents helper that blends shared traits 50/50 —
   the account component is a shared random effect, giving correlated churn
   and payment behaviour within an account (mixed-effects structure).
   Calibration re-verified after the change: all five motifs stay inside
   their year-1 churn bands across three seed pairs.
   Regression test: swinging account latent_budget_stability 0.0 vs 1.0
   must change the population's failed-invoice count.

2. Intra-week ordering: invoices were issued BEFORE pending dunning resolved.
   Consequences: (a) a customer's write-off churn week could include a fresh
   same-week invoice (whose paid amount would count toward pLTV revenue);
   (b) for dunning_weeks=4 motifs with a 4-week month gap, a second invoice
   could fail while one was pending and be silently dropped — terminal
   "failed" status forever, no event, no dunning. Fix: dunning resolution now
   runs before invoice issuance, so the pending slot is always free by
   issuance time. The only remaining "failed" terminal states are genuine
   censoring (churn for another reason mid-dunning, or window end) — now
   documented in the module docstring and pinned by a test that checks every
   dangling "failed" invoice against exactly those two conditions.

3. The recorded feature_depth_score was rounded to 4dp but the UNROUNDED
   value fed the expansion hazard — the published observable differed (by ε)
   from the value that drove behaviour, breaking the data↔causality
   equivalence this dataset exists to teach. The hazard now consumes the
   exact rounded value the row records; round-trip test added.
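
The 50/50 blend from finding 1 might look roughly like this. Only the helper's purpose and the `latent_budget_stability` key are taken from the text; the function name `merge_latents_sketch`, its signature, and the `latent_fit` key are hypothetical.

```python
def merge_latents_sketch(
    account_latents: dict[str, float],
    customer_latents: dict[str, float],
) -> dict[str, float]:
    """Blend traits present at both levels 50/50; pass through the rest unchanged."""
    merged = dict(account_latents)
    for name, customer_value in customer_latents.items():
        if name in account_latents:
            # Shared trait: the account component acts as a shared random effect.
            merged[name] = 0.5 * account_latents[name] + 0.5 * customer_value
        else:
            merged[name] = customer_value
    return merged
```

With a plain `{**account, **customer}` merge the account value would be overwritten whenever the customer dict carries the same key, which is the shadowing bug the finding describes; under the blend, changing the account's latent_budget_stability from 0.0 to 1.0 moves the merged value by 0.5.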

Full suite 1715 passed / 51 skipped; ruff + mypy clean; calibration bands
re-verified post-blend.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Labels

dataset: leadforge-ltv-v1 (Issue/PR scoped to the b2b_saas_ltv_v1 LTV dataset workstream); layer: mechanisms (mechanisms/ generators and transitions); status: needs review (Ready for review); type: feature (New capability)
