Merged
9 changes: 6 additions & 3 deletions .agent-plan.md
@@ -73,9 +73,12 @@ builders unified on one per-customer-cutoff core; 19 tests) opened as
 drops the lead-scoring `world_graph` param for `generation_scheme` /
 `motif_family` / `extra_fields`; every manifest records `generation_scheme`;
 `BUNDLE_SCHEMA_VERSION` 5 → 6; lead-scoring data files byte-identical) opened
-as **#121**. Next: `LTV-Pn.2` (scheme-agnostic `WorldBundle` + exposure hook +
-shared bundle orchestrator), then `Pn.3` (lifecycle config + regression task
-model), `Pn.4` (complete `LifecycleScheme` + e2e bundle), `LTV-Po` (recipe).
+as **#121** (merged). `LTV-Pn.2` (scheme-agnostic `WorldBundle` — `artifacts:
+Any`; `apply_exposure` dispatches hidden truth to a
+`GenerationScheme.write_metadata` hook; cleanups #2 + #3 discharged;
+lead-scoring byte-identical both modes) opened as **#122**. Next: `Pn.3`
+(lifecycle config + regression task model), `Pn.4` (complete `LifecycleScheme`
++ shared bundle orchestrator + e2e bundle), `LTV-Po` (recipe).

 ---

51 changes: 29 additions & 22 deletions docs/ltv/roadmap.md
@@ -46,7 +46,7 @@ protocol + registry, with the package physically reorganized into
 | `LTV-M3` | Customer population + lifecycle world | `LTV-Ph`, `LTV-Pi` | #113 (Ph) |
 | `LTV-M4` | Lifecycle simulation engine | `LTV-Pj`, `LTV-Pk` | #117 (Pj), #118 (Pk) |
 | `LTV-M5` | Customer snapshots + pLTV targets (both regimes) | `LTV-Pl`, `LTV-Pm` | #119 (Pl), #120 (Pm) |
-| `LTV-M6` | Register LifecycleScheme + recipe + manifest/version | `LTV-Pn.1…4`, `LTV-Po` | #121 (Pn.1) |
+| `LTV-M6` | Register LifecycleScheme + recipe + manifest/version | `LTV-Pn.1…4`, `LTV-Po` | #121 (Pn.1), #122 (Pn.2) |
 | `LTV-M7` | Validation + regression-metric calibration | `LTV-Pp` | |
 | `LTV-M8` | CLI, notebooks, publish | `LTV-Pq`, `LTV-Pr`, `LTV-Ps` | |

@@ -282,13 +282,20 @@ pipeline + schema bump). Split into four sub-PRs in dependency order:
 tasks/); only `manifest.json` changes (new field + version). Schema
 contract test renamed v5 → v6.
 - Labels: `type: refactor`, `layer: render`
-- [ ] **`LTV-Pn.2`** — `refactor: scheme-agnostic WorldBundle + exposure hook +
-shared bundle orchestrator`. Generalise `WorldBundle` to hold scheme-owned
-artifacts (finishing cleanup #3: drop the `core.models` / `render` →
-`lead_scoring.*` back-refs); make `apply_exposure` / `write_metadata_dir`
-scheme-agnostic via a hidden-truth hook (cleanup #2); lift a shared bundle
-orchestrator with scheme render hooks out of `write_bundle` (cleanup #1).
-Lead-scoring bundle byte-identical (full SHA-256 harness).
+- [x] **`LTV-Pn.2`** — `refactor: scheme-agnostic WorldBundle + exposure hook`
+(**PR #122**). `WorldBundle` now holds only `spec` + an opaque
+`artifacts: Any` (scheme-owned; lead-scoring stores `LeadScoringArtifacts`),
+finishing cleanup #3 — the `core.models` lead-scoring type imports are gone.
+`apply_exposure` is scheme-agnostic: it writes the generic `world_spec.json`
+and dispatches hidden-truth files to the producing scheme's new
+`GenerationScheme.write_metadata` hook (cleanup #2); the lead-scoring graph /
+latent registry / mechanism summary writers moved out of `exposure/` into the
+lead-scoring scheme. Lead-scoring bundle **byte-identical** across both
+exposure modes (full SHA-256 harness).
+- **Re-scoped:** the shared bundle orchestrator (cleanup #1) moves to
+`LTV-Pn.4` — per this file's own note it is best designed *with the second
+scheme's `write_bundle` in hand*; building it now against one scheme would
+guess the hook shape.
 - Labels: `type: refactor`, `layer: api`, `layer: core`, `layer: render`
 - [ ] **`LTV-Pn.3`** — `feat: lifecycle config + regression task model`. Add
 `n_customers` + lifecycle config (forward windows, early-tenure, observation
@@ -300,7 +307,10 @@ pipeline + schema bump). Split into four sub-PRs in dependency order:
 Implement `LifecycleScheme.build_world` (population → sim) and `write_bundle`
 (lifecycle relational tables; both regime snapshots → two task families ×
 3 windows + secondary churn; dataset card; manifest `observation_date` +
-windows via `extra_fields`). First end-to-end lifecycle bundle (programmatic;
+windows via `extra_fields`; lifecycle `write_metadata` hidden-truth hook).
+With both schemes' `write_bundle` in hand, **lift the shared bundle
+orchestrator with scheme render hooks** out of the two implementations
+(carried cleanup #1). First end-to-end lifecycle bundle (programmatic;
 recipe wiring is `LTV-Po`). Extend `CLAUDE.md` hard constraints with the
 lifecycle snapshot-safety clause + the `schemes/` layout. Carries the
 LTV-Pp validation flags: early-regime degenerate-column exemptions; the
@@ -353,19 +363,16 @@ byte-identical and reviewable. They are tracked here and discharged in

 1. **Shared render orchestration** — `LTV-Pe` left each scheme owning its full
 `write_bundle`; only `write_relational_tables` is shared. A shared bundle
-orchestrator with scheme render hooks lands once there are two schemes.
-2. **`build_manifest` / `apply_exposure` are lead-scoring-coupled** —
-`build_manifest` takes a `world_graph`; `apply_exposure` writes the
-lead-scoring hidden graph + latent registry. Generalize both to be
-scheme-agnostic.
-3. **core→scheme layering inversion** — `LTV-Pf.1` introduced
-`TYPE_CHECKING`-only imports of `leadforge.schemes.lead_scoring.*` in
-`core.models` (`WorldBundle.world_graph: WorldGraph | None`) and
-`render.*`. Harmless at runtime (no eager import), but `core`/shared
-`render` should not reference a scheme. **Partly discharged in `LTV-Pn.1`**
-(removed the `render.manifests` → `lead_scoring.structure.graph` back-ref);
-the `core.models.WorldBundle` back-refs follow in `LTV-Pn.2` once
-`WorldBundle` holds scheme-agnostic artifacts.
+orchestrator with scheme render hooks lands in **`LTV-Pn.4`**, once the
+lifecycle `write_bundle` exists to reveal the real shared shape.
+2. ~~**`build_manifest` / `apply_exposure` are lead-scoring-coupled**~~ —
+**Done** (`build_manifest` in `LTV-Pn.1`; `apply_exposure` in `LTV-Pn.2` via
+the `write_metadata` scheme hook).
+3. ~~**core→scheme layering inversion**~~ — **Done.** `LTV-Pn.1` removed the
+`render.manifests` back-ref; `LTV-Pn.2` removed the `core.models.WorldBundle`
+lead-scoring type imports (it now holds an opaque `artifacts: Any`). Only a
+`DEFAULT_SCHEME = "lead_scoring"` string default and doc-comment
+cross-references remain — neither is an import/type inversion.

 ---

19 changes: 8 additions & 11 deletions leadforge/core/models.py
@@ -11,9 +11,6 @@

 if TYPE_CHECKING:
     from leadforge.narrative.spec import NarrativeSpec
-    from leadforge.schemes.lead_scoring.simulation.engine import SimulationResult
-    from leadforge.schemes.lead_scoring.simulation.population import PopulationResult
-    from leadforge.schemes.lead_scoring.structure.graph import WorldGraph


 # Default generation scheme when a recipe/world does not declare one. Kept here
@@ -165,15 +162,16 @@ class WorldBundle:

     Attributes:
         spec: Fully resolved world specification (config + narrative).
-        population: Generated accounts, contacts, leads, and latent state.
-        simulation_result: Simulated event tables and final lead outcomes.
-        world_graph: Sampled hidden world graph used during simulation.
+        artifacts: The producing scheme's in-memory result (e.g.
+            :class:`~leadforge.schemes.lead_scoring.artifacts.LeadScoringArtifacts`).
+            Opaque to the shared core layer — typed ``Any`` so ``core`` never
+            references a scheme. Each scheme stores and unwraps its own
+            container; ``None`` until :meth:`~leadforge.api.generator.Generator.generate`
+            populates it.
     """

     spec: WorldSpec = field(default_factory=WorldSpec)
-    population: PopulationResult | None = None
-    simulation_result: SimulationResult | None = None
-    world_graph: WorldGraph | None = None
+    artifacts: Any = None

     def save(self, path: str, generation_timestamp: str | None = None) -> None:
         """Write the full output bundle to *path*.
@@ -195,8 +193,7 @@ def save(self, path: str, generation_timestamp: str | None = None) -> None:
                 Pass a fixed value to produce byte-identical manifests.

         Raises:
-            RuntimeError: if :attr:`simulation_result`, :attr:`population`,
-                or :attr:`world_graph` have not been populated (i.e. if
+            RuntimeError: if :attr:`artifacts` has not been populated (i.e. if
                 :meth:`~leadforge.api.generator.Generator.generate` was not
                 called).
         """
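Reviewer sketch (not part of the PR): the hunks above replace three typed fields with a single opaque `artifacts: Any`, so each scheme now has to unwrap and validate its own container. The snippet below is a minimal illustration of that pattern under stated assumptions. `ToySpec`, `ToyArtifacts`, `ToyBundle`, and `unwrap_artifacts` are hypothetical names invented for this example and do not appear in leadforge.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToySpec:
    """Hypothetical stand-in for the shared world specification."""

    scheme: str = "toy"


@dataclass
class ToyArtifacts:
    """Hypothetical scheme-owned container (the role LeadScoringArtifacts plays)."""

    rows: list[int] = field(default_factory=list)


@dataclass
class ToyBundle:
    """Hypothetical core-layer bundle: it knows the spec but not the artifact type."""

    spec: ToySpec = field(default_factory=ToySpec)
    artifacts: Any = None  # opaque to the core layer; None until generation runs


def unwrap_artifacts(bundle: ToyBundle) -> ToyArtifacts:
    """Scheme-side helper: check the opaque payload before using it."""
    artifacts = bundle.artifacts
    if artifacts is None:
        raise RuntimeError("bundle.artifacts has not been populated")
    if not isinstance(artifacts, ToyArtifacts):
        raise TypeError(f"expected ToyArtifacts, got {type(artifacts).__name__}")
    return artifacts
```

The trade-off is that static typing no longer protects the core layer's callers; the `isinstance` check on the scheme side is one way to get an early, readable failure when a bundle from a different scheme is passed in.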
74 changes: 19 additions & 55 deletions leadforge/exposure/metadata.py
@@ -1,71 +1,35 @@
-"""Write hidden-truth metadata files for ``research_instructor`` mode.
-
-:func:`write_metadata_dir` creates ``bundle_root/metadata/`` and populates
-it with five files that expose the full hidden world:
-
-- ``graph.json`` — world graph as JSON (nodes, edges, motif family)
-- ``graph.graphml`` — world graph as GraphML for graph tools
-- ``world_spec.json`` — generation config + narrative spec
-- ``latent_registry.json`` — per-entity latent trait values
-- ``mechanism_summary.json`` — mechanism assignment summary
+"""Scheme-agnostic hidden-truth metadata for ``research_instructor`` mode.
+
+The bundle's ``metadata/`` directory mixes scheme-agnostic provenance
+(``world_spec.json`` — config + narrative) with scheme-specific hidden truth
+(the lead-scoring world graph, latent registry, and mechanism summary; the
+lifecycle scheme will emit its own). Only the generic part lives here;
+:func:`write_world_spec_json` writes it. Each scheme owns the rest via its
+:meth:`~leadforge.schemes.base.GenerationScheme.write_metadata` hook, called by
+:func:`leadforge.exposure.modes.apply_exposure`.
 """

 from __future__ import annotations

 import dataclasses
 import json
-from pathlib import Path
 from typing import TYPE_CHECKING

 if TYPE_CHECKING:
-    from leadforge.core.models import WorldBundle
-
-
-def write_metadata_dir(bundle: WorldBundle, bundle_root: Path) -> None:
-    """Populate ``bundle_root/metadata/`` with all hidden-truth files.
+    from pathlib import Path

-    Args:
-        bundle: Fully populated :class:`~leadforge.core.models.WorldBundle`.
-        bundle_root: Root directory of the written bundle.
-    """
-    from leadforge.core.rng import RNGRoot
-    from leadforge.schemes.lead_scoring.mechanisms.policies import assign_mechanisms
-
-    # Callers must only invoke this after full bundle assembly; world_graph
-    # and population are guaranteed non-None at that point.
-    assert bundle.world_graph is not None  # noqa: S101
-    assert bundle.population is not None  # noqa: S101
+    from leadforge.core.models import WorldSpec

-    meta_dir = bundle_root / "metadata"
-    meta_dir.mkdir(exist_ok=True)
+__all__ = ["write_world_spec_json"]

-    # graph.json + graph.graphml
-    (meta_dir / "graph.json").write_text(bundle.world_graph.to_json())
-    (meta_dir / "graph.graphml").write_text(bundle.world_graph.to_graphml())

-    # latent_registry.json
-    ls = bundle.population.latent_state
-    latent_registry: dict[str, object] = {
-        "account_latents": ls.account_latents,
-        "contact_latents": ls.contact_latents,
-        "lead_latents": ls.lead_latents,
-    }
-    (meta_dir / "latent_registry.json").write_text(json.dumps(latent_registry, indent=2))
+def write_world_spec_json(spec: WorldSpec, meta_dir: Path) -> None:
+    """Write ``meta_dir/world_spec.json`` — the resolved config + narrative.

-    # world_spec.json — config + narrative (if present)
-    config_dict = dataclasses.asdict(bundle.spec.config)
-    narrative_dict = (
-        dataclasses.asdict(bundle.spec.narrative) if bundle.spec.narrative is not None else None
-    )
+    Scheme-agnostic: depends only on the shared :class:`WorldSpec`, so it is
+    identical across generation schemes.
+    """
+    config_dict = dataclasses.asdict(spec.config)
+    narrative_dict = dataclasses.asdict(spec.narrative) if spec.narrative is not None else None
     world_spec_dict = {"config": config_dict, "narrative": narrative_dict}
     (meta_dir / "world_spec.json").write_text(json.dumps(world_spec_dict, indent=2))
-
-    # mechanism_summary.json
-    # Reconstruct the mechanism assignment with the same RNG substream that
-    # was used during simulation — produces the identical parameter values.
-    motif_family = bundle.world_graph.motif_family
-    mech_rng = RNGRoot(bundle.spec.config.seed).child("mechanisms")
-    assignment = assign_mechanisms(motif_family, mech_rng)
-    (meta_dir / "mechanism_summary.json").write_text(
-        json.dumps(assignment.summary().to_dict(), indent=2)
-    )
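Reviewer sketch (not part of the PR): to make the shape of `world_spec.json` concrete, the snippet below mirrors the `dataclasses.asdict` plus `json.dumps` approach of `write_world_spec_json` on toy types. `ToyConfig`, `ToyNarrative`, `ToySpec`, `write_spec_json`, and `demo` are hypothetical names, and the field names are assumptions chosen only for illustration; the real `GenerationConfig` and `NarrativeSpec` fields are not shown in this diff.

```python
from __future__ import annotations

import dataclasses
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ToyConfig:
    """Hypothetical stand-in for the generation config."""

    seed: int = 7
    n_rows: int = 100


@dataclass
class ToyNarrative:
    """Hypothetical stand-in for the narrative spec."""

    title: str = "demo"


@dataclass
class ToySpec:
    """Hypothetical stand-in for the shared world spec."""

    config: ToyConfig
    narrative: ToyNarrative | None = None


def write_spec_json(spec: ToySpec, meta_dir: Path) -> Path:
    """Serialise config + optional narrative, as the function in the diff does."""
    payload = {
        "config": dataclasses.asdict(spec.config),
        "narrative": (
            dataclasses.asdict(spec.narrative) if spec.narrative is not None else None
        ),
    }
    out = meta_dir / "world_spec.json"
    out.write_text(json.dumps(payload, indent=2))
    return out


def demo() -> dict:
    """Round-trip a spec without a narrative through a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        out = write_spec_json(ToySpec(config=ToyConfig()), Path(tmp))
        return json.loads(out.read_text())
```

Because the payload depends only on the shared spec, the same file would come out regardless of which scheme produced the bundle, which is the property the new docstring claims.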
42 changes: 28 additions & 14 deletions leadforge/exposure/modes.py
@@ -1,41 +1,55 @@
 """Exposure-mode dispatch for bundle publication.

-:func:`apply_exposure` is the single entry point called by
-:func:`~leadforge.api.bundle.write_bundle`. It reads the resolved
-:class:`~leadforge.exposure.filters.BundleFilter` for the requested mode
-and performs the corresponding writes (or skips them).
+:func:`apply_exposure` is the single entry point called by each scheme's
+``write_bundle``. It reads the resolved
+:class:`~leadforge.exposure.filters.BundleFilter` for the requested mode and,
+when hidden truth should be published, writes the scheme-agnostic
+``world_spec.json`` and delegates the scheme-specific hidden-truth files to the
+producing scheme's :meth:`~leadforge.schemes.base.GenerationScheme.write_metadata`
+hook. This keeps the exposure layer free of any single scheme's types.
 """

 from __future__ import annotations

 import shutil
-from pathlib import Path
 from typing import TYPE_CHECKING

-from leadforge.core.enums import ExposureMode
 from leadforge.exposure.filters import get_filter
-from leadforge.exposure.metadata import write_metadata_dir
+from leadforge.exposure.metadata import write_world_spec_json

 if TYPE_CHECKING:
+    from pathlib import Path
+
+    from leadforge.core.enums import ExposureMode
     from leadforge.core.models import WorldBundle


 def apply_exposure(bundle: WorldBundle, bundle_root: Path, mode: ExposureMode) -> None:
     """Apply exposure filtering for *mode* to the bundle at *bundle_root*.

-    For ``research_instructor`` mode this writes the ``metadata/``
-    directory with all hidden-truth files. For ``student_public`` mode any
-    pre-existing ``metadata/`` directory is removed so that hidden truth
-    is never accidentally published when reusing an output path.
+    For modes whose filter sets ``write_metadata`` (e.g. ``research_instructor``)
+    this creates ``metadata/``, writes the scheme-agnostic ``world_spec.json``,
+    and calls the producing scheme's ``write_metadata`` hook for its
+    hidden-truth files. For modes that must not publish hidden truth (e.g.
+    ``student_public``) any pre-existing ``metadata/`` directory is removed so
+    truth is never accidentally republished when reusing an output path.

     Args:
         bundle: Fully populated :class:`~leadforge.core.models.WorldBundle`.
         bundle_root: Root directory of the written bundle (must already exist).
         mode: Exposure mode that controls which artefacts are published.
     """
+    from leadforge.schemes import get_scheme
+
     filt = get_filter(mode)
     meta_dir = bundle_root / "metadata"
-    if filt.write_metadata:
-        write_metadata_dir(bundle, bundle_root)
-    elif meta_dir.exists():
+    # Always start from a clean metadata/ so its contents exactly match the
+    # current bundle. Reusing an output path across runs — or across schemes,
+    # which emit different hidden-truth file sets — must not leave stale files
+    # behind (the non-writing branch below clears it for the same reason).
+    if meta_dir.exists():
         shutil.rmtree(meta_dir)
+    if filt.write_metadata:
+        meta_dir.mkdir(parents=True)
+        write_world_spec_json(bundle.spec, meta_dir)
+        get_scheme(bundle.spec.scheme).write_metadata(bundle, meta_dir)
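Reviewer sketch (not part of the PR): the new `apply_exposure` always clears `metadata/` before deciding whether to write, so reusing an output path across schemes or modes cannot leave stale hidden-truth files behind. The snippet below reproduces that clean-then-dispatch flow with a toy registry. `ToyGraphScheme`, `ToyCohortScheme`, `SCHEMES`, `apply_exposure_sketch`, and `demo` are hypothetical names, and the boolean flag stands in for the real `BundleFilter.write_metadata` lookup.

```python
from __future__ import annotations

import json
import shutil
import tempfile
from pathlib import Path


class ToyGraphScheme:
    """Hypothetical scheme whose hidden truth is a graph file."""

    def write_metadata(self, bundle: object, meta_dir: Path) -> None:
        (meta_dir / "graph.json").write_text("{}")


class ToyCohortScheme:
    """Hypothetical second scheme that emits a different file set."""

    def write_metadata(self, bundle: object, meta_dir: Path) -> None:
        (meta_dir / "cohorts.json").write_text("{}")


# Hypothetical registry; the real code resolves schemes through get_scheme().
SCHEMES = {"graph": ToyGraphScheme(), "cohort": ToyCohortScheme()}


def apply_exposure_sketch(scheme: str, bundle_root: Path, publish_hidden_truth: bool) -> None:
    """Clear metadata/ first, then write generic + scheme-specific files if allowed."""
    meta_dir = bundle_root / "metadata"
    if meta_dir.exists():
        shutil.rmtree(meta_dir)  # never keep files from an earlier run
    if publish_hidden_truth:
        meta_dir.mkdir(parents=True)
        (meta_dir / "world_spec.json").write_text(json.dumps({"scheme": scheme}))
        SCHEMES[scheme].write_metadata(None, meta_dir)


def demo() -> list[list[str]]:
    """Reuse one output path for three runs and list metadata/ after each."""
    listings: list[list[str]] = []
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for scheme, publish in [("graph", True), ("cohort", True), ("cohort", False)]:
            apply_exposure_sketch(scheme, root, publish)
            meta_dir = root / "metadata"
            names = sorted(p.name for p in meta_dir.iterdir()) if meta_dir.exists() else []
            listings.append(names)
    return listings
```

In this sketch the second run's listing contains no `graph.json`, which is the stale-file case the comment in the diff describes; with the old `if`/`elif` shape the directory would only have been cleared on the non-writing branch.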
13 changes: 13 additions & 0 deletions leadforge/schemes/base.py
@@ -40,6 +40,8 @@
 from leadforge.core.exceptions import LeadforgeError

 if TYPE_CHECKING:
+    from pathlib import Path
+
     from leadforge.core.models import GenerationConfig, WorldBundle
     from leadforge.narrative.spec import NarrativeSpec

@@ -92,6 +94,17 @@ def write_bundle(
         """
         ...

+    def write_metadata(self, bundle: WorldBundle, meta_dir: Path) -> None:
+        """Write the scheme's hidden-truth files into an existing *meta_dir*.
+
+        Called by :func:`leadforge.exposure.modes.apply_exposure` for modes
+        that publish hidden truth (e.g. ``research_instructor``), after the
+        shared, scheme-agnostic ``world_spec.json`` is written. A scheme emits
+        whatever latent truth it has — for lead scoring the world graph, latent
+        registry, and mechanism summary; other schemes emit their own.
+        """
+        ...
+

 # Name → scheme instance. Populated by importing the built-in scheme modules
 # (each self-registers on import). ``_ensure_builtins`` triggers this lazily so