From bd2c88afc4a62e6fa5cb731b1bd74ccec1fc37ef Mon Sep 17 00:00:00 2001
From: Shay Palachy
Date: Wed, 17 Jun 2026 23:58:43 +0300
Subject: [PATCH 1/2] feat(lifecycle): consume narrative firmographics [LTV-Po.1]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First half of the split LTV-Po (narrative-wiring; the recipe + e2e is Po.2).
Per the locked decision, the recipe narrative DRIVES the lifecycle population's
firmographics rather than them staying scheme-internal.

- build_customer_population gains a narrative parameter. When provided, the
  account firmographic vocabularies come from
  narrative.market.icp_industries / narrative.market.geographies; when None,
  they fall back to the built-in procurement-ICP defaults
  (_ICP_INDUSTRIES / _GEOGRAPHIES).
- _generate_accounts takes the resolved vocabularies (keyword-only, defaulting
  to the built-ins) and draws from them.
- build_world threads its narrative through to the population builder.
- Empty narrative vocabularies are rejected with a clear ValueError.

Byte-identity: the no-narrative path passes the SAME built-in tuples, so the
RNG draws, and every resulting bundle file, are unchanged. Verified vs main
(both exposure modes, full SHA-256 of every file) using a fresh worktree.

Band-weight distributions (employee/revenue/process-maturity) stay as module
constants; they are not part of the narrative MarketSpec schema.

Tests: narrative-driven industries/regions; no-narrative built-in fallback;
empty-vocab rejection; narrative-population determinism. Full suite 1881
passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8
---
 .agent-plan.md                                | 10 ++-
 docs/ltv/roadmap.md                           | 35 ++++++----
 leadforge/schemes/lifecycle/__init__.py       | 24 +++----
 leadforge/schemes/lifecycle/population.py     | 28 +++++++-
 .../lifecycle/test_population_narrative.py    | 64 +++++++++++++++++++
 5 files changed, 129 insertions(+), 32 deletions(-)
 create mode 100644 tests/schemes/lifecycle/test_population_narrative.py

diff --git a/.agent-plan.md b/.agent-plan.md
index cdd4489..52cd1d0 100644
--- a/.agent-plan.md
+++ b/.agent-plan.md
@@ -92,9 +92,13 @@ stateful/terminal columns dropped; manifest flags; CLAUDE.md clause;
 lead-scoring byte-identical) opened as **#127** (merged). `LTV-Pn.4d` (shared
 bundle orchestrator — `render/bundle.py` `write_bundle_envelope`; both schemes
 delegate bundle I/O; carried cleanup #1 discharged; all four bundles
-byte-identical) opened as **#128** — **completes LTV-Pn.4**. Next: `LTV-Po`
-(the `b2b_saas_ltv_v1` recipe assets + end-to-end `Generator.from_recipe(...)`;
-also recipe-driven difficulty resolution + the narrative-consumption decision).
+byte-identical) opened as **#128** — **completes LTV-Pn.4**. `LTV-Po` split into Po.1 (narrative-wiring) + Po.2
+(recipe + e2e). `LTV-Po.1` (build_customer_population consumes narrative
+firmographics; build_world threads narrative; no-narrative path byte-identical
+vs main) opened as **#130**. Decisions locked: narrative DRIVES firmographics;
+public early-pLTV stays calendar-only (Option A); difficulty = distortion tiers
+now + simulation-level scaling deferred (issue #129). Next: `LTV-Po.2`
+(b2b_saas_ltv_v1 recipe YAMLs + difficulty_params resolution + e2e round-trip).
 
 Note: `validate_bundle` is lead-scoring-coupled — scheme-aware validation is
 `LTV-Pp`.
diff --git a/docs/ltv/roadmap.md b/docs/ltv/roadmap.md
index 5b9c2e5..2e11489 100644
--- a/docs/ltv/roadmap.md
+++ b/docs/ltv/roadmap.md
@@ -378,20 +378,29 @@ methods, then public-safety, then the carried orchestrator cleanup:
   bundles (lead_scoring + lifecycle × instructor + public) verified
   byte-identical** via the full-bundle SHA-256 harness.
   - Labels: `type: refactor`, `layer: render`, `layer: api`
-- [ ] **`LTV-Po`** — `feat(recipes): b2b_saas_ltv_v1 recipe assets`. The three
-  recipe YAMLs (`scheme: lifecycle`); register in the recipe registry;
-  end-to-end `Generator.from_recipe("b2b_saas_ltv_v1").generate()` smoke test.
-  **Decide narrative consumption:** the lifecycle population hardcodes its
-  firmographics and `build_world` ignores `narrative` (Pn.4a) — either wire the
-  recipe's `narrative.yaml` into the population builder or document the
-  firmographics as scheme-internal.
-  - Tests: recipe loads, full round-trip, determinism, all task splits (3
-    windows × 2 regimes + secondary churn), public/instructor split.
-  - Labels: `type: feature`, `layer: recipes`
-- **Deferred (flagged in Pn.4a):** simulation-level difficulty scaling for the
+`LTV-Po` is split: narrative-wiring (prerequisite) then the recipe + e2e.
+
+- [x] **`LTV-Po.1`** — `feat(lifecycle): consume narrative firmographics`
+  (**PR #130**). `build_customer_population(narrative=…)` reads the firmographic
+  vocabularies (`market.icp_industries` / `market.geographies`) from the recipe
+  narrative when given; a `None` narrative falls back to the built-in
+  procurement-ICP defaults, so the no-narrative path is byte-identical (verified
+  vs `main`, both modes). `build_world` threads `narrative` through. Decision
+  (locked): the recipe narrative **drives** firmographics (not scheme-internal).
+  - Labels: `type: feature`, `layer: narrative`
+- [ ] **`LTV-Po.2`** — `feat(recipes): b2b_saas_ltv_v1 recipe assets + e2e`. The
+  three recipe YAMLs (`scheme: lifecycle`; `narrative.yaml` with the lifecycle
+  vertical's firmographics; `difficulty_profiles.yaml`); register in the recipe
+  registry; resolve `difficulty_params` from the active profile in `build_world`
+  (mirroring lead-scoring `_resolve_difficulty`) so snapshot distortions fire
+  per tier; end-to-end `Generator.from_recipe("b2b_saas_ltv_v1").generate()`
+  round-trip. Public mode stays calendar-only (Option A, locked).
+  - Tests: recipe loads, full round-trip, determinism, all task splits,
+    public/instructor split, per-tier distortion.
+  - Labels: `type: feature`, `layer: recipes`, `layer: api`
+- **Deferred (issue #129):** simulation-level difficulty scaling for the
   lifecycle engine — making `advanced` a genuinely harder world (not just
-  noisier snapshots). Currently the motif-calibrated rates are difficulty-
-  agnostic; revisit alongside `LTV-Pp` difficulty-band validation.
+  noisier snapshots). Revisit alongside `LTV-Pp` difficulty-band validation.
 
 ---
 
diff --git a/leadforge/schemes/lifecycle/__init__.py b/leadforge/schemes/lifecycle/__init__.py
index 1e091b0..f4b9072 100644
--- a/leadforge/schemes/lifecycle/__init__.py
+++ b/leadforge/schemes/lifecycle/__init__.py
@@ -54,20 +54,15 @@ def build_world(
     ``forward_windows_days`` (the engine simulates through the longest
     window so every pLTV target is fully covered).
 
-    Not yet applied (tracked, not silent):
-
-    - **Difficulty.** ``config.difficulty`` / ``difficulty_params`` are
-      NOT consumed here, so every difficulty tier currently yields the same
-      world. Two distinct pieces remain: resolving ``difficulty_params``
-      from the active profile and threading it into the snapshot
-      distortions (``LTV-Pn.4b``, where snapshots are built), and
-      simulation-level difficulty scaling that actually makes harder tiers
-      harder worlds (deferred — see ``mechanisms.py`` and the roadmap).
-    - **Narrative.** ``narrative`` is accepted for protocol parity but
-      unused: the lifecycle population builder generates its own
-      firmographics from internal distributions, so the recipe's
-      ``narrative.yaml`` will not drive them until ``LTV-Po`` decides
-      whether the lifecycle scheme should consume the narrative spec.
+    ``narrative``, when provided, drives the population's firmographic
+    vocabularies (``market.icp_industries`` / ``market.geographies``); a
+    ``None`` narrative falls back to the built-in procurement-ICP defaults.
+
+    Difficulty (tracked, not silent): ``config.difficulty`` does not yet
+    scale the *simulation* — every tier yields the same world — so harder
+    tiers differ only in snapshot distortions (resolved from the recipe
+    profile in ``LTV-Po`` and threaded into the snapshot builders).
+    Simulation-level difficulty scaling is deferred (issue #129).
     """
     from leadforge.core.exceptions import InvalidConfigError
     from leadforge.core.models import WorldBundle, WorldSpec
@@ -98,6 +93,7 @@ def build_world(
         config.seed,
         motif_family=motif_family,
         observation_date=config.observation_date,
+        narrative=narrative,
     )
     simulation_result = simulate_lifecycle(
         population,
diff --git a/leadforge/schemes/lifecycle/population.py b/leadforge/schemes/lifecycle/population.py
index ba0fd38..24d94dc 100644
--- a/leadforge/schemes/lifecycle/population.py
+++ b/leadforge/schemes/lifecycle/population.py
@@ -33,12 +33,16 @@
 import random
 from dataclasses import dataclass, field
 from datetime import date, timedelta
+from typing import TYPE_CHECKING
 
 from leadforge.core.ids import ID_PREFIXES, make_id
 from leadforge.core.rng import RNGRoot
 from leadforge.schema.entities import AccountRow
 from leadforge.schemes.lifecycle.entities import CustomerLifecycleRow
 
+if TYPE_CHECKING:
+    from leadforge.narrative.spec import NarrativeSpec
+
 # ---------------------------------------------------------------------------
 # Output types
 # ---------------------------------------------------------------------------
@@ -174,6 +178,7 @@ def build_customer_population(
     n_accounts: int | None = None,
     observation_date: str | None = None,
     acquisition_window_weeks: int = _DEFAULT_ACQUISITION_WINDOW_WEEKS,
+    narrative: NarrativeSpec | None = None,
 ) -> CustomerPopulationResult:
     """Generate accounts and lifecycle customers with their latent states.
 
@@ -246,10 +251,26 @@ def build_customer_population(
     root = RNGRoot(seed)
     bias = _MOTIF_LATENT_BIAS.get(motif_family, {})
 
+    # Firmographic vocabularies come from the recipe narrative's market spec
+    # when provided; otherwise fall back to the built-in procurement-ICP
+    # defaults (so the no-narrative path is unchanged / byte-identical).
+    if narrative is not None:
+        industries = tuple(narrative.market.icp_industries)
+        geographies = tuple(narrative.market.geographies)
+        if not industries or not geographies:
+            raise ValueError(
+                "narrative.market must define non-empty icp_industries and geographies"
+            )
+    else:
+        industries = _ICP_INDUSTRIES
+        geographies = _GEOGRAPHIES
+
     accounts, acct_latents = _generate_accounts(
         n=n_accounts,
         bias=bias,
         rng=root.child("lifecycle_population_accounts"),
+        industries=industries,
+        geographies=geographies,
     )
 
     customers, cust_latents = _generate_customers(
@@ -282,6 +303,9 @@ def _generate_accounts(
     n: int,
     bias: dict[str, float],
     rng: random.Random,
+    *,
+    industries: tuple[str, ...] = _ICP_INDUSTRIES,
+    geographies: tuple[str, ...] = _GEOGRAPHIES,
 ) -> tuple[list[AccountRow], dict[str, dict[str, float]]]:
     """Generate *n* account entities with lifecycle-relevant latent traits.
 
@@ -297,8 +321,8 @@ def _generate_accounts(
 
     for i in range(1, n + 1):
         acct_id = make_id(ID_PREFIXES["account"], i)
-        industry = rng.choice(_ICP_INDUSTRIES)
-        region = rng.choice(_GEOGRAPHIES)
+        industry = rng.choice(industries)
+        region = rng.choice(geographies)
         employee_band = rng.choices(_EMPLOYEE_BANDS, weights=_EMPLOYEE_BAND_WEIGHTS, k=1)[0]
         revenue_band = rng.choices(_REVENUE_BANDS, weights=_REVENUE_BAND_WEIGHTS, k=1)[0]
         maturity_band = rng.choices(
diff --git a/tests/schemes/lifecycle/test_population_narrative.py b/tests/schemes/lifecycle/test_population_narrative.py
new file mode 100644
index 0000000..26f82a0
--- /dev/null
+++ b/tests/schemes/lifecycle/test_population_narrative.py
@@ -0,0 +1,64 @@
+"""Narrative-driven firmographics for the lifecycle population (LTV-Po.1)."""
+
+from __future__ import annotations
+
+import pytest
+
+from leadforge.narrative.spec import MarketSpec, NarrativeSpec
+from leadforge.schemes.lifecycle.population import (
+    _GEOGRAPHIES,
+    _ICP_INDUSTRIES,
+    build_customer_population,
+)
+
+
+def _narrative(industries: tuple[str, ...], geographies: tuple[str, ...]) -> NarrativeSpec:
+    # The population builder reads only ``narrative.market``; the other sub-specs
+    # are irrelevant here, so they are left as None/empty (never dereferenced).
+    market = MarketSpec(
+        icp_employee_range=(200, 2000),
+        icp_industries=industries,
+        geographies=geographies,
+        avg_deal_size_usd=25000,
+        avg_sales_cycle_days=60,
+    )
+    return NarrativeSpec(
+        company=None,  # type: ignore[arg-type]
+        product=None,  # type: ignore[arg-type]
+        market=market,
+        gtm_motion=None,  # type: ignore[arg-type]
+        personas=(),
+        funnel_stages=(),
+    )
+
+
+def test_narrative_drives_industries_and_regions() -> None:
+    industries = ("Aerospace", "Maritime Logistics")
+    geographies = ("Antarctica",)
+    pop = build_customer_population(
+        80, 7, motif_family="product_led_retention", narrative=_narrative(industries, geographies)
+    )
+    seen_ind = {a.industry for a in pop.accounts}
+    seen_geo = {a.region for a in pop.accounts}
+    assert seen_ind <= set(industries)
+    assert seen_geo == set(geographies)
+    # And they are NOT the built-in defaults.
+    assert seen_ind.isdisjoint(set(_ICP_INDUSTRIES))
+
+
+def test_no_narrative_uses_builtin_defaults() -> None:
+    pop = build_customer_population(80, 7, motif_family="product_led_retention", narrative=None)
+    assert {a.industry for a in pop.accounts} <= set(_ICP_INDUSTRIES)
+    assert {a.region for a in pop.accounts} <= set(_GEOGRAPHIES)
+
+
+def test_empty_narrative_vocab_rejected() -> None:
+    with pytest.raises(ValueError, match="icp_industries and geographies"):
+        build_customer_population(10, 1, narrative=_narrative((), ("US",)))
+
+
+def test_narrative_population_deterministic() -> None:
+    nar = _narrative(("A", "B", "C"), ("X", "Y"))
+    a = build_customer_population(60, 3, narrative=nar)
+    b = build_customer_population(60, 3, narrative=nar)
+    assert [r.to_dict() for r in a.accounts] == [r.to_dict() for r in b.accounts]
From 9367caf758a5e2e2017b27a5c1e8550d4e3e5e51 Mon Sep 17 00:00:00 2001
From: Shay Palachy
Date: Thu, 18 Jun 2026 00:16:41 +0300
Subject: [PATCH 2/2] =?UTF-8?q?docs(roadmap):=20flag=20the=20=E2=89=A52-vo?=
 =?UTF-8?q?cab=20constraint=20for=20the=20Po.2=20recipe=20[LTV-Po.1]?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Self-review of the narrative-firmographics wiring: Po.1 makes
narrative.market.icp_industries / geographies drive the public industry /
region columns, so a single-value vocabulary would yield a zero-variance
firmographic feature in student_public (invariant #6). Not a Po.1 code defect
(it correctly draws from whatever vocab it's given, and a single-value vocab
is a legitimate request the validator should catch), but an easy-to-miss
constraint on the Po.2 recipe narrative.yaml, flagged in the roadmap with a
test to add (both columns ≥2 distinct values in the public bundle).

Docs only.

Co-Authored-By: Claude Opus 4.8
---
 docs/ltv/roadmap.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/ltv/roadmap.md b/docs/ltv/roadmap.md
index 2e11489..1d74e7d 100644
--- a/docs/ltv/roadmap.md
+++ b/docs/ltv/roadmap.md
@@ -395,6 +395,12 @@ methods, then public-safety, then the carried orchestrator cleanup:
   (mirroring lead-scoring `_resolve_difficulty`) so snapshot distortions fire
   per tier; end-to-end `Generator.from_recipe("b2b_saas_ltv_v1").generate()`
   round-trip. Public mode stays calendar-only (Option A, locked).
+  **Constraint (flagged in Po.1 review):** the recipe `narrative.yaml` MUST
+  declare ≥2 `icp_industries` and ≥2 `geographies` — Po.1 makes these drive the
+  public `industry`/`region` columns, so a single-value vocab yields a
+  zero-variance firmographic feature (student_public invariant #6 violation).
+  Add a test asserting both columns have ≥2 distinct values in the public
+  bundle.
   - Tests: recipe loads, full round-trip, determinism, all task splits,
     public/instructor split, per-tier distortion.
   - Labels: `type: feature`, `layer: recipes`, `layer: api`
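
Follow-up sketch for the Po.2 constraint above: one possible shape for the "both columns have ≥2 distinct values" check, assuming the public bundle's account table can be read as a list of row dicts. `distinct_values`, `assert_min_distinct`, and the row shape are hypothetical names for illustration; none of this is leadforge API, and the real test may load the bundle differently.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping


def distinct_values(rows: Iterable[Mapping[str, object]], column: str) -> set[object]:
    """Return the distinct non-null values found in one column."""
    return {row[column] for row in rows if row.get(column) is not None}


def assert_min_distinct(
    rows: Iterable[Mapping[str, object]],
    columns: tuple[str, ...] = ("industry", "region"),  # assumed public column names
    minimum: int = 2,
) -> None:
    """Raise AssertionError if any column has fewer than `minimum` distinct values."""
    materialized = list(rows)  # accept generators; each column is scanned once
    for column in columns:
        seen = distinct_values(materialized, column)
        if len(seen) < minimum:
            raise AssertionError(
                f"column {column!r} has {len(seen)} distinct value(s); need >= {minimum}"
            )
```

In Po.2 this could run against the public bundle's account rows after `Generator.from_recipe("b2b_saas_ltv_v1").generate()`. A single-value `geographies` vocabulary, like the `("Antarctica",)` case in the Po.1 tests, would fail on `region`.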