feat(lifecycle): customer population builder [LTV-Ph] #113
Merged
First implementation milestone of the pLTV workstream (LTV-M3). Adds the
lifecycle customer population builder — the starting point for the post-
conversion subscription simulation.
leadforge/schemes/lifecycle/population.py:
- build_customer_population(n_customers, seed, motif_family, *, n_accounts,
observation_date, acquisition_window_weeks) → CustomerPopulationResult.
Two named RNG substreams (lifecycle_population_{accounts,customers}) keep
each generation aspect independently stable.
- CustomerPopulationResult: accounts, customers, latent_state, observation_date.
- CustomerLatentState: account_latents + customer_latents dicts.
- LIFECYCLE_MOTIF_FAMILIES tuple: 5 retention motif families
(product_led_retention, relationship_led_retention, expansion_led_growth,
payment_fragile, churner_dominated) each with distinct latent-mean biases.
- 5 customer latent traits: latent_product_fit, latent_adoption_velocity,
latent_budget_stability, latent_champion_strength,
latent_organizational_stability.
- D3 seam: opportunity_id=None (independent generation); reserved for future
chaining from a lead-scoring bundle's converted leads.
- D4 staggered starts: customer_start_at sampled uniformly in
[obs_date - acquisition_window_weeks, obs_date), varying tenure at snapshot.
- Plan + MRR: employee-band-conditional plan tier (starter/growth/enterprise)
with MRR ranges $1k-$3.5k / $3.5k-$9k / $9k-$25k per month.
- Contract terms: 12mo (65%) / 24mo (35%).
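A minimal sketch of how the two named substreams and the D4 staggered starts could fit together. The `substream` and `sample_start_dates` helpers, the SHA-256 seed derivation, and the day-level granularity are assumptions for illustration; the actual implementation in `population.py` may differ.

```python
import hashlib
import random
from datetime import date, timedelta


def substream(seed: int, name: str) -> random.Random:
    """Derive a named RNG from a root seed (hypothetical helper)."""
    digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def sample_start_dates(rng, n, observation_date, acquisition_window_weeks):
    """Sample n start dates uniformly in [obs_date - window, obs_date)."""
    window_days = acquisition_window_weeks * 7
    acq_start = observation_date - timedelta(weeks=acquisition_window_weeks)
    # randrange excludes its upper bound, so every start is strictly
    # before observation_date.
    return [acq_start + timedelta(days=rng.randrange(window_days)) for _ in range(n)]


obs = date(2025, 1, 27)  # illustrative observation date
accounts_rng = substream(42, "lifecycle_population_accounts")
customers_rng = substream(42, "lifecycle_population_customers")

# Drawing from the accounts stream does not perturb the customers stream.
_ = [accounts_rng.random() for _ in range(10)]
starts = sample_start_dates(customers_rng, 200, obs, 52)
```

Because each stream is keyed by name, changing how many values the account generator consumes leaves the customer draws untouched, which is the stability property the commit message describes.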
tests/schemes/lifecycle/test_population.py (28 tests):
- Shape: counts, type assertions, observation_date format.
- Determinism: same seed → identical output; different seeds → different.
- FK integrity: every customer.account_id in the account set; latent state
covers exactly the customer and account populations.
- Staggered starts: all starts < observation_date; all within acquisition
window; distribution spans both halves of the window.
- D3 seam: opportunity_id is None for all customers.
- Latent distributions: all values in [0,1]; exactly 5 traits per customer.
- Motif families: all 5 registered; each produces valid output; product_led
has higher mean latent_product_fit than churner_dominated.
- Entity fields: plan, MRR, contract_term, CSM rep, ID prefixes all valid.
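The test categories above could look roughly like the following. This sketch runs against a toy stand-in builder, not the real `build_customer_population`; the `toy_population` function, its dict layout, and the `acct_`/`cust_` ID prefixes are hypothetical.

```python
import random


def toy_population(n_customers: int, seed: int, n_accounts: int = 5):
    """Toy stand-in for the real builder: returns (accounts, customers)."""
    rng = random.Random(seed)
    accounts = [{"account_id": f"acct_{i:04d}"} for i in range(n_accounts)]
    customers = [
        {
            "customer_id": f"cust_{i:04d}",
            "account_id": rng.choice(accounts)["account_id"],
            "opportunity_id": None,  # D3 seam: independent generation
            "latent_product_fit": rng.random(),
        }
        for i in range(n_customers)
    ]
    return accounts, customers


def test_same_seed_is_identical():
    assert toy_population(50, seed=7) == toy_population(50, seed=7)


def test_different_seed_differs():
    assert toy_population(50, seed=7) != toy_population(50, seed=8)


def test_fk_integrity():
    accounts, customers = toy_population(50, seed=7)
    account_ids = {a["account_id"] for a in accounts}
    assert all(c["account_id"] in account_ids for c in customers)


def test_d3_seam_and_latent_bounds():
    _, customers = toy_population(50, seed=7)
    assert all(c["opportunity_id"] is None for c in customers)
    assert all(0.0 <= c["latent_product_fit"] <= 1.0 for c in customers)
```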
Full suite 1565 passed / 51 skipped; ruff + mypy clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Implements the first lifecycle (pLTV) “customer population builder” for the lifecycle scheme, generating accounts + lifecycle customers + latent traits with deterministic RNG substreams, and adds a comprehensive test suite plus roadmap/plan references.
Changes:
- Added `leadforge/schemes/lifecycle/population.py` with `build_customer_population(...)`, output dataclasses, motif families, and generation logic.
- Added `tests/schemes/lifecycle/test_population.py` covering determinism, FK integrity, staggered start dates, latent bounds/shape, motif families, and key field assertions.
- Updated LTV roadmap and agent plan docs to reference the milestone/PR.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `leadforge/schemes/lifecycle/population.py` | New lifecycle population generator (accounts/customers/latents) with motif-family latent biases and staggered acquisition window. |
| `tests/schemes/lifecycle/test_population.py` | New tests validating shape, determinism, constraints, motif behavior, and field population. |
| `docs/ltv/roadmap.md` | Roadmap updated to link milestone item to PR #113. |
| `.agent-plan.md` | Agent plan updated to reflect M3 start / PR tracking. |
Four issues found by hostile self-review of the initial commit:
1. Account latents used lead-scoring trait names (latent_account_fit,
latent_budget_readiness, latent_process_maturity) that the lifecycle
simulation engine will never query. The engine queries
latent_budget_stability and latent_organizational_stability at the account
level. All three lead-scoring keys replaced; motif-family bias now wires
correctly through the account latents (previously bias.get calls on
account latents always returned 0.0 because no lifecycle motif key matched
any account latent key — the motif had zero effect on account generation).
Regression test added.
2. CSM rep IDs used make_id("rep", ...) bypassing ID_PREFIXES["rep"]. The
rule in core/ids.py is always go through the registry. Fixed.
3. motif_family was a positional parameter — silently passing a wrong third
argument (e.g. n_accounts as an int) produces a confusing ValueError at
runtime rather than a TypeError at the call site. Made keyword-only.
Regression test added (inspect.signature check).
4. The default observation_date formula used a bare `+ 4` weeks buffer with
no explanation. Extracted to _OBS_DATE_BUFFER_WEEKS constant with a
comment explaining its purpose (gives earliest-acquired customers a small
subscription history before the snapshot).
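Item 1 is easy to reproduce in isolation. The sketch below uses a hypothetical bias table with an illustrative value; it shows only why `bias.get` on mismatched keys silently yields a zero shift, not the actual generation code.

```python
# Hypothetical bias table for one motif family; the value is illustrative.
bias = {"latent_budget_stability": -0.2}

# Lead-scoring trait names the initial commit used for account latents.
old_account_traits = (
    "latent_account_fit",
    "latent_budget_readiness",
    "latent_process_maturity",
)
# Lifecycle trait names the simulation engine queries at the account level.
new_account_traits = (
    "latent_budget_stability",
    "latent_organizational_stability",
)


def applied_shifts(traits, bias):
    """Return the mean shift each trait would receive from the motif bias."""
    return {trait: bias.get(trait, 0.0) for trait in traits}


old = applied_shifts(old_account_traits, bias)  # every shift is 0.0
new = applied_shifts(new_account_traits, bias)  # budget stability is shifted
```

With the old keys no lookup ever matches, so the motif family has no effect on account generation and no error is raised, which is why a regression test is the right guard.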
Comment accuracy fix: "shared with the lead-scoring account generator" →
"mirrors the distribution of" (the code is parallel, not shared; cross-scheme
import would create an awkward dependency).
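The keyword-only change from item 3 and its `inspect.signature` regression check can be sketched as follows. `build_stub` is a hypothetical stand-in that mirrors only the parameter layout described above; whether the real function gives `motif_family` a default is not stated here, so the stub makes it required.

```python
import inspect


def build_stub(n_customers, seed, *, motif_family, n_accounts=None):
    """Stand-in with the post-fix layout: motif_family is keyword-only."""
    return (n_customers, seed, motif_family, n_accounts)


param = inspect.signature(build_stub).parameters["motif_family"]
is_keyword_only = param.kind is inspect.Parameter.KEYWORD_ONLY

# A stray third positional argument now fails at the call site.
try:
    build_stub(100, 42, 10)
    raised_type_error = False
except TypeError:
    raised_type_error = True
```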
Full suite 1567 passed / 51 skipped; ruff + mypy clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… [LTV-Ph]

Implements three accepted recommendations from the Copilot PR review:

COPILOT-1 (module docstring misleading about RNG independence): Reworded to
clarify that each substream handles both entity creation AND latent draws for
its entity type (accounts-substream = account rows + account latents;
customers-substream = customer rows + customer latents). The independence is
between account generation and customer generation, not between "population"
and "latent" draws.

COPILOT-2 (docstring says "1 year" but default is 56 weeks): Updated
observation_date docstring to show the actual formula (_WORLD_BASE_DATE +
(acquisition_window_weeks + _OBS_DATE_BUFFER_WEEKS) weeks) and note the
concrete value with built-in defaults (56 weeks ≈ 13 months).

COPILOT-3 (missing input validation): Added ValueError guards for
n_customers < 1, n_accounts < 1 (when explicit), and
acquisition_window_weeks < 1 (0 would make every start == obs_date, violating
the < obs_date boundary invariant). Pattern mirrors
core/models.py::_require_positive_int. Four new validation tests added.

COPILOT-4 (ID_PREFIXES["rep"] vs hardcoded string): Already fixed in the
self-review commit (67f22df). Resolved as already treated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pr-agent-context report: This run includes unresolved review comments on PR #113 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/schemes/lifecycle/population.py
URL: https://github.com/leadforge-dev/leadforge/pull/113#discussion_r3397854263
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The module docstring says the two RNG substreams make “population and latent draws … independently stable”, but the code only separates account vs customer generation (entities + latents share the same stream). This wording is misleading; either split entity vs latent substreams or update the docstring to reflect what is actually independent.
## COPILOT-2
Location: leadforge/schemes/lifecycle/population.py
URL: https://github.com/leadforge-dev/leadforge/pull/113#discussion_r3397854322
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
The docstring claims the default observation_date is `_WORLD_BASE_DATE + 1 year`, but the implementation derives it from `acquisition_window_weeks` (and currently adds an extra 4 weeks). Please align the docstring with the implemented default (or vice versa) to avoid surprising callers.
## COPILOT-3
Location: leadforge/schemes/lifecycle/population.py:241
URL: https://github.com/leadforge-dev/leadforge/pull/113#discussion_r3397854353
Status: outdated
Root author: copilot-pull-request-reviewer
Comment:
`build_customer_population` doesn’t validate `n_customers`, `n_accounts`, or `acquisition_window_weeks`. Some invalid-but-plausible inputs currently lead to confusing runtime errors (e.g., `n_accounts=0` makes `rng.choice(accounts)` crash), and `acquisition_window_weeks=0` violates the documented `[acq_start, obs_date)` invariant by producing `customer_start_at == obs_date`. Add basic positive-int validation and make the default `observation_date` consistent with the acquisition window without the extra 4-week offset.
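A sketch of the kind of positive-int guard COPILOT-3 asks for. The helper names, the error message, and the 52-week default are assumptions for illustration; the body of the project's own `_require_positive_int` in `core/models.py` is not shown in this thread and may differ.

```python
from typing import Optional


def require_positive_int(name: str, value: int) -> int:
    """Reject non-integers, bools, and values below 1 (hypothetical helper)."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return value


def validate_population_args(
    n_customers: int,
    n_accounts: Optional[int] = None,
    acquisition_window_weeks: int = 52,  # assumed default
) -> None:
    require_positive_int("n_customers", n_customers)
    if n_accounts is not None:  # only checked when passed explicitly
        require_positive_int("n_accounts", n_accounts)
    # A zero-week window would collapse [acq_start, obs_date) to an empty range.
    require_positive_int("acquisition_window_weeks", acquisition_window_weeks)


def rejected(**kwargs) -> bool:
    """Return True when the arguments fail validation."""
    try:
        validate_population_args(**kwargs)
    except ValueError:
        return True
    return False
```

Failing fast with a named parameter in the message replaces the confusing downstream errors the comment mentions, such as choosing from an empty account list.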
Summary
First implementation milestone of the pLTV workstream (`LTV-Ph`, milestone `LTV-M3`). Adds the lifecycle customer population builder, the starting point for the post-conversion subscription simulation.
What's added
`leadforge/schemes/lifecycle/population.py`:
- `build_customer_population(n_customers, seed, motif_family, *, n_accounts, observation_date, acquisition_window_weeks) → CustomerPopulationResult`: single public entry point; fully deterministic via two named RNG substreams.
- `CustomerPopulationResult` / `CustomerLatentState`: output dataclasses.
- 5 retention motif families (`LIFECYCLE_MOTIF_FAMILIES`), each with distinct latent-mean biases:
  - `product_led_retention`: `latent_product_fit` ↑
  - `relationship_led_retention`: `latent_champion_strength` ↑
  - `expansion_led_growth`: `latent_adoption_velocity` ↑
  - `payment_fragile`: `latent_budget_stability` ↓
  - `churner_dominated`: `latent_product_fit` ↓, `latent_champion_strength` ↓
- 5 customer latent traits: `latent_product_fit`, `latent_adoption_velocity`, `latent_budget_stability`, `latent_champion_strength`, `latent_organizational_stability`.
- D3 seam: `opportunity_id=None` on every customer; reserved for future chaining from a lead-scoring bundle.
- D4 staggered starts: `customer_start_at` sampled uniformly in `[obs_date − acquisition_window_weeks, obs_date)`, so tenure at the observation date naturally varies from near-zero (cold-start) to the full window.
Tests (28)
`tests/schemes/lifecycle/test_population.py`: shape, determinism, FK integrity (customer→account; latent-state coverage), staggered-start boundary and spread, D3 seam assertion, latent [0,1] bounds + 5-trait completeness, all 5 motif families, motif-bias direction test, field-value assertions.
Next
`LTV-Pi`: lifecycle motif families + mechanism policies (`assign_lifecycle_mechanisms()`): churn hazard, expansion propensity, payment-failure params, keyed to the 5 retention motif families built here.
🤖 Generated with Claude Code