feat(lifecycle): customer population builder [LTV-Ph] by shaypal5 · Pull Request #113 · leadforge-dev/leadforge

shaypal5 · 2026-06-11T17:19:52Z

Summary

First implementation milestone of the pLTV workstream (LTV-Ph, milestone
LTV-M3). Adds the lifecycle customer population builder — the starting
point for the post-conversion subscription simulation.

What's added

leadforge/schemes/lifecycle/population.py:

build_customer_population(n_customers, seed, motif_family, *, n_accounts, observation_date, acquisition_window_weeks) → CustomerPopulationResult — single public entry point; fully deterministic via two named RNG substreams.
CustomerPopulationResult / CustomerLatentState — output dataclasses.
5 retention motif families (LIFECYCLE_MOTIF_FAMILIES), each with distinct latent-mean biases:

| family | primary driver |
| --- | --- |
| `product_led_retention` | `latent_product_fit` ↑ |
| `relationship_led_retention` | `latent_champion_strength` ↑ |
| `expansion_led_growth` | `latent_adoption_velocity` ↑ |
| `payment_fragile` | `latent_budget_stability` ↓ |
| `churner_dominated` | `latent_product_fit` ↓, `latent_champion_strength` ↓ |
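
To make the bias directions in the table concrete, here is a minimal sketch of how a motif family could shift latent means. The family and trait names come from this PR; the offset size (±0.2), the neutral mean of 0.5, the Gaussian noise, and the helper names `sample_latents` and `mean_trait` are assumptions for illustration and may not match `population.py`.

```python
import random

# Trait names are taken from the PR description.
TRAITS = (
    "latent_product_fit",
    "latent_adoption_velocity",
    "latent_budget_stability",
    "latent_champion_strength",
    "latent_organizational_stability",
)

# Directions follow the table above; the magnitudes are assumed.
MOTIF_BIAS = {
    "product_led_retention": {"latent_product_fit": 0.2},
    "relationship_led_retention": {"latent_champion_strength": 0.2},
    "expansion_led_growth": {"latent_adoption_velocity": 0.2},
    "payment_fragile": {"latent_budget_stability": -0.2},
    "churner_dominated": {
        "latent_product_fit": -0.2,
        "latent_champion_strength": -0.2,
    },
}


def sample_latents(rng, motif_family, base_mean=0.5, sd=0.1):
    """Draw one value per trait around a motif-shifted mean, clipped to [0, 1]."""
    bias = MOTIF_BIAS[motif_family]
    latents = {}
    for trait in TRAITS:
        mean = base_mean + bias.get(trait, 0.0)
        latents[trait] = min(1.0, max(0.0, rng.gauss(mean, sd)))
    return latents


def mean_trait(motif_family, trait, n=2000, seed=7):
    """Average one trait over n simulated customers (hypothetical helper)."""
    rng = random.Random(seed)
    return sum(sample_latents(rng, motif_family)[trait] for _ in range(n)) / n
```

Under these assumptions, `mean_trait("product_led_retention", "latent_product_fit")` exceeds the same mean for `churner_dominated`, which is the direction the motif-bias test checks.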

5 customer latent traits: latent_product_fit, latent_adoption_velocity, latent_budget_stability, latent_champion_strength, latent_organizational_stability.
D3 seam (independent generation): opportunity_id=None on every customer; reserved for future chaining from a lead-scoring bundle.
D4 staggered starts: customer_start_at sampled uniformly in [obs_date − acquisition_window_weeks, obs_date), so tenure at the observation date naturally varies from near-zero (cold-start) to the full window.
Plan + MRR: employee-band-conditional plan tier selection; MRR ranges $1k–$25k/month.
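
The D4 staggered-start rule can be sketched as follows. This is an illustration only: it assumes day-level granularity and a hypothetical helper name `sample_start_dates`, and the observation date is a placeholder. The real builder may sample timestamps at a finer resolution.

```python
import random
from datetime import date, timedelta


def sample_start_dates(rng, observation_date, acquisition_window_weeks, n):
    """Sample n start dates uniformly in [obs_date - window, obs_date)."""
    window_days = acquisition_window_weeks * 7
    acq_start = observation_date - timedelta(days=window_days)
    # randrange excludes its upper bound, so every start falls strictly
    # before observation_date (the half-open interval in the description).
    return [
        acq_start + timedelta(days=rng.randrange(window_days))
        for _ in range(n)
    ]


obs = date(2025, 1, 1)  # placeholder observation date
starts = sample_start_dates(random.Random(0), obs, 52, 500)
tenures = [(obs - s).days for s in starts]  # tenure in days at the snapshot
```

Because the interval is half-open, the shortest possible tenure here is one day (cold-start) and the longest is the full window.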

Tests (28)

tests/schemes/lifecycle/test_population.py: shape, determinism, FK integrity (customer→account; latent-state coverage), staggered-start boundary and spread, D3 seam assertion, latent [0,1] bounds + 5-trait completeness, all 5 motif families, motif-bias direction test, field-value assertions.

Full suite 1565 passed / 51 skipped (+28); ruff + mypy clean.

First implementation milestone of the pLTV workstream (LTV-M3). Adds the lifecycle customer population builder, the starting point for the post-conversion subscription simulation.

leadforge/schemes/lifecycle/population.py:
- build_customer_population(n_customers, seed, motif_family, *, n_accounts, observation_date, acquisition_window_weeks) → CustomerPopulationResult. Two named RNG substreams (lifecycle_population_{accounts,customers}) keep each generation aspect independently stable.
- CustomerPopulationResult: accounts, customers, latent_state, observation_date.
- CustomerLatentState: account_latents + customer_latents dicts.
- LIFECYCLE_MOTIF_FAMILIES tuple: 5 retention motif families (product_led_retention, relationship_led_retention, expansion_led_growth, payment_fragile, churner_dominated), each with distinct latent-mean biases.
- 5 customer latent traits: latent_product_fit, latent_adoption_velocity, latent_budget_stability, latent_champion_strength, latent_organizational_stability.
- D3 seam: opportunity_id=None (independent generation); reserved for future chaining from a lead-scoring bundle's converted leads.
- D4 staggered starts: customer_start_at sampled uniformly in [obs_date - acquisition_window_weeks, obs_date), varying tenure at snapshot.
- Plan + MRR: employee-band-conditional plan tier (starter/growth/enterprise) with MRR ranges $1k-$3.5k / $3.5k-$9k / $9k-$25k per month.
- Contract terms: 12mo (65%) / 24mo (35%).

tests/schemes/lifecycle/test_population.py (28 tests):
- Shape: counts, type assertions, observation_date format.
- Determinism: same seed → identical output; different seeds → different.
- FK integrity: every customer.account_id in the account set; latent state covers exactly the customer and account populations.
- Staggered starts: all starts < observation_date; all within acquisition window; distribution spans both halves of the window.
- D3 seam: opportunity_id is None for all customers.
- Latent distributions: all values in [0,1]; exactly 5 traits per customer.
- Motif families: all 5 registered; each produces valid output; product_led has higher mean latent_product_fit than churner_dominated.
- Entity fields: plan, MRR, contract_term, CSM rep, ID prefixes all valid.

Full suite 1565 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
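
As an illustration of the plan, MRR, and contract-term rules in this commit message, the sketch below uses the tier ranges and the 65/35 contract split stated above. The employee bands, the per-band tier weights, uniform sampling within each MRR range, and the function name `sample_plan_and_mrr` are assumptions, not values taken from the code.

```python
import random

PLANS = ("starter", "growth", "enterprise")

# Monthly MRR ranges in USD, as listed in the commit message.
MRR_RANGES = {
    "starter": (1_000, 3_500),
    "growth": (3_500, 9_000),
    "enterprise": (9_000, 25_000),
}

# ASSUMED employee bands and tier weights (starter, growth, enterprise).
PLAN_WEIGHTS_BY_BAND = {
    "1-50": (0.70, 0.25, 0.05),
    "51-500": (0.30, 0.50, 0.20),
    "501+": (0.10, 0.30, 0.60),
}


def sample_plan_and_mrr(rng, employee_band):
    """Pick a plan tier conditional on employee band, then MRR and term."""
    weights = PLAN_WEIGHTS_BY_BAND[employee_band]
    plan = rng.choices(PLANS, weights=weights, k=1)[0]
    low, high = MRR_RANGES[plan]
    mrr = round(rng.uniform(low, high), 2)
    # 12-month contracts 65% of the time, 24-month 35% (from the commit).
    term_months = rng.choices((12, 24), weights=(0.65, 0.35), k=1)[0]
    return plan, mrr, term_months
```

Larger bands put more weight on the enterprise tier in this sketch, which is one plausible reading of "employee-band-conditional".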

Copilot

Pull request overview

Implements the first lifecycle (pLTV) “customer population builder” for the lifecycle scheme, generating accounts + lifecycle customers + latent traits with deterministic RNG substreams, and adds a comprehensive test suite plus roadmap/plan references.

Changes:

Added leadforge/schemes/lifecycle/population.py with build_customer_population(...), output dataclasses, motif families, and generation logic.
Added tests/schemes/lifecycle/test_population.py covering determinism, FK integrity, staggered start dates, latent bounds/shape, motif families, and key field assertions.
Updated LTV roadmap and agent plan docs to reference the milestone/PR.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `leadforge/schemes/lifecycle/population.py` | New lifecycle population generator (accounts/customers/latents) with motif-family latent biases and staggered acquisition window. |
| `tests/schemes/lifecycle/test_population.py` | New tests validating shape, determinism, constraints, motif behavior, and field population. |
| `docs/ltv/roadmap.md` | Roadmap updated to link milestone item to PR #113. |
| `.agent-plan.md` | Agent plan updated to reflect M3 start / PR tracking. |

Four issues found by hostile self-review of the initial commit:

1. Account latents used lead-scoring trait names (latent_account_fit, latent_budget_readiness, latent_process_maturity) that the lifecycle simulation engine will never query. The engine queries latent_budget_stability and latent_organizational_stability at the account level. All three lead-scoring keys replaced; motif-family bias now wires correctly through the account latents. Previously, bias.get calls on account latents always returned 0.0 because no lifecycle motif key matched any account latent key, so the motif had zero effect on account generation. Regression test added.
2. CSM rep IDs used make_id("rep", ...), bypassing ID_PREFIXES["rep"]. The rule in core/ids.py is to always go through the registry. Fixed.
3. motif_family was a positional parameter: silently passing a wrong third argument (e.g. n_accounts as an int) produces a confusing ValueError at runtime rather than a TypeError at the call site. Made keyword-only. Regression test added (inspect.signature check).
4. The default observation_date formula used a bare `+ 4` weeks buffer with no explanation. Extracted to the _OBS_DATE_BUFFER_WEEKS constant with a comment explaining its purpose (gives earliest-acquired customers a small subscription history before the snapshot).

Comment accuracy fix: "shared with the lead-scoring account generator" → "mirrors the distribution of" (the code is parallel, not shared; a cross-scheme import would create an awkward dependency).

Full suite 1567 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
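
The keyword-only change in item 3 and its `inspect.signature` regression check might look roughly like this. The stand-in function below only mirrors the parameter shape described in the commit. Its body, the `None` default for `n_accounts`, and the test name are placeholders, not the project's code.

```python
import inspect


def build_customer_population(n_customers, seed, *, motif_family, n_accounts=None):
    """Stand-in with motif_family after the bare *, making it keyword-only."""
    return {
        "n_customers": n_customers,
        "seed": seed,
        "motif_family": motif_family,
        "n_accounts": n_accounts,
    }


def test_motif_family_is_keyword_only():
    """Regression check: motif_family must never be accepted positionally."""
    params = inspect.signature(build_customer_population).parameters
    assert params["motif_family"].kind is inspect.Parameter.KEYWORD_ONLY
```

With this shape, `build_customer_population(100, 42, "payment_fragile")` fails immediately with a `TypeError` at the call site instead of reaching any value validation.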

… [LTV-Ph]

Implements three accepted recommendations from the copilot PR review:

- COPILOT-1 (module docstring misleading about RNG independence): Reworded to clarify that each substream handles both entity creation AND latent draws for its entity type (accounts-substream = account rows + account latents; customers-substream = customer rows + customer latents). The independence is between account generation and customer generation, not between "population" and "latent" draws.
- COPILOT-2 (docstring says "1 year" but default is 56 weeks): Updated observation_date docstring to show the actual formula (_WORLD_BASE_DATE + (acquisition_window_weeks + _OBS_DATE_BUFFER_WEEKS) weeks) and note the concrete value with built-in defaults (56 weeks ≈ 13 months).
- COPILOT-3 (missing input validation): Added ValueError guards for n_customers < 1, n_accounts < 1 (when explicit), and acquisition_window_weeks < 1 (0 would make every start == obs_date, violating the < obs_date boundary invariant). Pattern mirrors core/models.py::_require_positive_int. Four new validation tests added.
- COPILOT-4 (ID_PREFIXES["rep"] vs hardcoded string): Already fixed in the self-review commit (67f22df). Resolved as already treated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-06-11T18:31:16Z

pr-agent-context report:

This run includes unresolved review comments on PR #113 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/schemes/lifecycle/population.py
URL: https://github.com/leadforge-dev/leadforge/pull/113#discussion_r3397854263
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    The module docstring says the two RNG substreams make “population and latent draws … independently stable”, but the code only separates account vs customer generation (entities + latents share the same stream). This wording is misleading; either split entity vs latent substreams or update the docstring to reflect what is actually independent.
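
    The distinction this comment draws (independence between account and customer generation, with entities and latents sharing a stream) can be sketched as below. The two substream names appear in the PR's commit message; deriving each stream by hashing the seed together with the name, and the helper names `substream` and `generate`, are assumptions about one possible approach rather than the project's actual RNG utilities.

```python
import hashlib
import random


def substream(seed, name):
    """Derive an independent, reproducible RNG from (seed, name). Assumed scheme."""
    digest = hashlib.sha256(f"{seed}:{name}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def generate(seed, n_accounts, n_customers):
    acct_rng = substream(seed, "lifecycle_population_accounts")
    cust_rng = substream(seed, "lifecycle_population_customers")
    # Each stream produces both the entity row and its latents, so rows and
    # latents of one entity type are NOT independent of each other.
    accounts = [
        {
            "employees": acct_rng.randint(10, 5000),
            "latent_budget_stability": acct_rng.random(),
        }
        for _ in range(n_accounts)
    ]
    customers = [
        {
            "account_index": cust_rng.randrange(n_accounts),
            "latent_product_fit": cust_rng.random(),
        }
        for _ in range(n_customers)
    ]
    return accounts, customers
```

    In this sketch, changing `n_customers` leaves the generated accounts untouched, because the customer draws never consume from the account stream.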

## COPILOT-2
Location: leadforge/schemes/lifecycle/population.py
URL: https://github.com/leadforge-dev/leadforge/pull/113#discussion_r3397854322
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    The docstring claims the default observation_date is `_WORLD_BASE_DATE + 1 year`, but the implementation derives it from `acquisition_window_weeks` (and currently adds an extra 4 weeks). Please align the docstring with the implemented default (or vice versa) to avoid surprising callers.

## COPILOT-3
Location: leadforge/schemes/lifecycle/population.py:241
URL: https://github.com/leadforge-dev/leadforge/pull/113#discussion_r3397854353
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    `build_customer_population` doesn’t validate `n_customers`, `n_accounts`, or `acquisition_window_weeks`. Some invalid-but-plausible inputs currently lead to confusing runtime errors (e.g., `n_accounts=0` makes `rng.choice(accounts)` crash), and `acquisition_window_weeks=0` violates the documented `[acq_start, obs_date)` invariant by producing `customer_start_at == obs_date`. Add basic positive-int validation and make the default `observation_date` consistent with the acquisition window without the extra 4-week offset.
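
    The positive-int validation requested here could take a shape like the following. The real helper `core/models.py::_require_positive_int` is referenced in the PR but its signature is not shown, so the version below is a guess at the pattern. The wrapper `validate_population_args` and the 52-week default are hypothetical.

```python
def _require_positive_int(name, value):
    """Raise ValueError unless value is an int >= 1 (bools are rejected)."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return value


def validate_population_args(n_customers, n_accounts=None, acquisition_window_weeks=52):
    """Hypothetical wrapper applying the guards at the top of the builder."""
    _require_positive_int("n_customers", n_customers)
    if n_accounts is not None:  # only checked when passed explicitly
        _require_positive_int("n_accounts", n_accounts)
    # A zero-week window would make every start equal obs_date, breaking the
    # half-open [acq_start, obs_date) invariant.
    _require_positive_int("acquisition_window_weeks", acquisition_window_weeks)
```

    Failing fast this way turns a late crash such as `rng.choice` on an empty account list into an error that names the offending argument.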

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 27368792404 attempt 1
Comment timestamp: 2026-06-11T18:30:24.054928+00:00
PR head commit: 86548f1f4bb3330155d1f2f72177837791819a94

Copilot AI review requested due to automatic review settings June 11, 2026 17:19

shaypal5 added this to the dataset: leadforge-ltv-v1 milestone Jun 11, 2026

shaypal5 added the labels type: feature (New capability), layer: simulation (simulation/ discrete-time engine), status: needs review (Ready for review), and dataset: leadforge-ltv-v1 (Issue/PR scoped to the b2b_saas_ltv_v1 LTV dataset workstream) on Jun 11, 2026

Copilot started reviewing on behalf of shaypal5 June 11, 2026 17:20

docs(ltv): record LTV-Ph (#113) in roadmap + agent-plan [LTV-Ph]

0ab0e29

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI reviewed Jun 11, 2026

Comment thread leadforge/schemes/lifecycle/population.py Outdated

Comment thread leadforge/schemes/lifecycle/population.py Outdated

Comment thread leadforge/schemes/lifecycle/population.py

Comment thread leadforge/schemes/lifecycle/population.py Outdated

style: reformat test_module_layout.py (CI ruff format)

262eb47

shaypal5 merged commit 1321607 into main Jun 11, 2026
10 of 16 checks passed

shaypal5 deleted the feat/lifecycle-customer-population branch June 11, 2026 19:03

shaypal5 merged 5 commits into main from feat/lifecycle-customer-population
