feat: lifecycle config + regression task model [LTV-Pn.3] by shaypal5 · Pull Request #124 · leadforge-dev/leadforge

shaypal5 · 2026-06-13T07:17:25Z

Summary

Third sub-PR of the split LTV-Pn (M6): the config + task-model plumbing the lifecycle pipeline needs, with no end-to-end wiring yet (that's Pn.4). Also discharges the long-deferred LTV-Pc regression-task-spec leftover.

`GenerationConfig` — lifecycle fields (`layer: api`)

New validated fields, consumed only by the lifecycle scheme (lead-scoring ignores them, just as the lifecycle side will ignore n_leads): n_customers, forward_windows_days (strictly-increasing tuple), early_tenure_weeks, observation_date (ISO string or None). Validation is isolated in _validate_lifecycle_fields. Kept flat on the shared config to match existing precedent (n_leads/snapshot_day); a nested per-scheme config is noted as a possible future refactor.
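The validation described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the class name `LifecycleFieldsSketch`, the default values, the bounds, and the error messages are all assumptions; only the four field names and the `_validate_lifecycle_fields` helper name come from the description.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifecycleFieldsSketch:
    """Hypothetical stand-in for the lifecycle part of GenerationConfig."""

    # Defaults are illustrative assumptions, not the project's real values.
    n_customers: int = 1000
    forward_windows_days: tuple[int, ...] = (90, 365, 730)
    early_tenure_weeks: int = 4
    observation_date: str | None = None

    def __post_init__(self) -> None:
        self._validate_lifecycle_fields()

    def _validate_lifecycle_fields(self) -> None:
        if self.n_customers < 1:
            raise ValueError("n_customers must be >= 1")
        windows = self.forward_windows_days
        if not windows or any(w <= 0 for w in windows):
            raise ValueError("forward_windows_days must be non-empty and positive")
        # Strictly increasing: every window exceeds the one before it.
        if any(b <= a for a, b in zip(windows, windows[1:])):
            raise ValueError("forward_windows_days must be strictly increasing")
        if self.early_tenure_weeks < 1:
            raise ValueError("early_tenure_weeks must be >= 1")
        if self.observation_date is not None:
            # date.fromisoformat raises ValueError on a malformed ISO date.
            date.fromisoformat(self.observation_date)
```

Keeping the checks in one helper means the lead-scoring fields and the lifecycle fields can be validated independently, which matches the "isolated" wording above.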

Lead-scoring data (tables/, tasks/) is byte-identical in both modes; only metadata/world_spec.json changes — by design, since it dumps the full config.

`TaskManifest` — regression support (`layer: schema`)

VALID_TASK_TYPES = {binary_classification, regression} with __post_init__ validation; docstrings broadened so label_column / label_window_days read target-agnostically.
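A minimal sketch of this kind of `__post_init__` check, assuming a trimmed-down dataclass. `TaskManifestSketch` and its `task_id` field are hypothetical, and the use of a `frozenset` is an assumption; the real `TaskManifest` likely carries more fields.

```python
from dataclasses import dataclass

# The two accepted values come from the PR description; frozenset is assumed.
VALID_TASK_TYPES = frozenset({"binary_classification", "regression"})


@dataclass(frozen=True)
class TaskManifestSketch:
    """Hypothetical, trimmed-down stand-in for TaskManifest."""

    task_id: str
    task_type: str
    label_column: str
    label_window_days: int

    def __post_init__(self) -> None:
        # Reject unknown task types at construction time.
        if self.task_type not in VALID_TASK_TYPES:
            raise ValueError(
                f"task_type must be one of {sorted(VALID_TASK_TYPES)}, "
                f"got {self.task_type!r}"
            )
```

Under this sketch, a bare `"classification"` value would be rejected, since only `binary_classification` and `regression` are in the accepted set.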

Shared split writer (`layer: render`)

The deterministic shuffle/split/write logic is lifted to scheme-agnostic leadforge/render/tasks.py — target-agnostic (it never inspects the label), so it serves continuous pLTV targets. schemes.lead_scoring.render.tasks is now a thin wrapper defaulting the task — byte-identical tasks/ output. This repopulates the render.tasks path that LTV-Pf.2 vacated, now as shared envelope code (peer to render.manifests / render.relational_io); the module-layout lock test is updated to that intended end state.
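To illustrate why a target-agnostic writer handles continuous labels without changes, here is a hedged sketch of a seeded shuffle and split. The function name `split_rows`, the split names, and the fractions are assumptions for illustration; the actual `leadforge/render/tasks.py` implementation and its file-writing step are not shown in this PR page.

```python
from __future__ import annotations

import random
from typing import Any, Sequence


def split_rows(
    rows: Sequence[dict[str, Any]],
    seed: int,
    fractions: tuple[float, float] = (0.7, 0.15),  # train, valid; rest is test
) -> dict[str, list[dict[str, Any]]]:
    """Shuffle row indices with a seeded RNG and cut them into three splits.

    The rows are never inspected, so the label column may hold a binary
    flag or a continuous value; it is carried through unchanged.
    """
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)  # same seed gives the same order
    n_train = int(len(rows) * fractions[0])
    n_valid = int(len(rows) * fractions[1])
    cuts = {
        "train": order[:n_train],
        "valid": order[n_train:n_train + n_valid],
        "test": order[n_train + n_valid:],
    }
    return {name: [rows[i] for i in idx] for name, idx in cuts.items()}
```

Because only indices are shuffled, every input row lands in exactly one split and its values are preserved as-is.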

Lifecycle task families (`schemes/lifecycle/tasks.py`)

lifecycle_task_manifests(regime) builds, per regime: three pltv_revenue_{90,365,730}d regression tasks + a churned_within_180d classification task. Calendar regime unprefixed; early-pLTV regime early_-prefixed so both families get distinct task directories. Targets/windows mirror the snapshot catalog so specs and columns can't drift. Completes LTV-Pc.
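The per-regime task family described above might be assembled along these lines. This is a sketch under stated assumptions: the regime keys (`"calendar"`, `"early_pltv"`), the function name `lifecycle_task_specs`, and the plain-dict output are hypothetical; only the task names, the `early_` prefix, and the window lengths come from the description.

```python
from __future__ import annotations

FORWARD_WINDOWS_DAYS = (90, 365, 730)
CHURN_WINDOW_DAYS = 180

# Hypothetical regime keys; the real identifiers are not shown in this PR page.
REGIME_PREFIX = {"calendar": "", "early_pltv": "early_"}


def lifecycle_task_specs(regime: str) -> list[dict[str, object]]:
    """Build three pLTV regression specs and one churn classification spec."""
    prefix = REGIME_PREFIX[regime]  # KeyError on an unknown regime
    specs: list[dict[str, object]] = [
        {
            "task_id": f"{prefix}pltv_revenue_{window}d",
            "task_type": "regression",
            "label_window_days": window,
        }
        for window in FORWARD_WINDOWS_DAYS
    ]
    specs.append(
        {
            "task_id": f"{prefix}churned_within_{CHURN_WINDOW_DAYS}d",
            "task_type": "binary_classification",
            "label_window_days": CHURN_WINDOW_DAYS,
        }
    )
    return specs
```

Deriving both the task id and the window from the same constant is one way to keep names and windows from drifting apart, in the spirit of the "mirror the snapshot catalog" note.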

Tests (53 new; full suite 1835 passed / 51 skipped)

Config validation (bounds, sorted windows, ISO date); TaskManifest regression-type accept + invalid-type reject; shared writer on a continuous target (values preserved, deterministic, manifest emits regression); lifecycle task-family shape / cross-regime id uniqueness / prefixing / target-column match.

Next: Pn.4 — complete LifecycleScheme.build_world/write_bundle, lift the shared bundle orchestrator, first e2e lifecycle bundle.

🤖 Generated with Claude Code

Third sub-PR of the split LTV-Pn. Adds the config + task-model plumbing the lifecycle pipeline needs, with no end-to-end wiring yet (that is Pn.4). Also discharges the long-deferred LTV-Pc regression-task-spec leftover.

GenerationConfig (layer: api):
- New validated lifecycle fields, consumed only by the lifecycle scheme (the lead-scoring path ignores them, just as the lifecycle side ignores n_leads): n_customers, forward_windows_days (strictly-increasing tuple), early_tenure_weeks, observation_date (ISO string or None). Validation lives in a focused _validate_lifecycle_fields helper. Kept flat on the shared config to match existing precedent (n_leads / snapshot_day); a nested per-scheme config is noted as a possible future refactor.
- Lead-scoring DATA (tables/, tasks/) is byte-identical in both modes; only metadata/world_spec.json changes, by design (it dumps the full config).

TaskManifest (layer: schema):
- Added VALID_TASK_TYPES = {binary_classification, regression} with __post_init__ validation; broadened docstrings so label_column / label_window_days read target-agnostically (continuous targets included).

Shared split writer (layer: render):
- Lifted the deterministic shuffle/split/write logic to scheme-agnostic leadforge/render/tasks.py (target-agnostic: never inspects the label, so it serves continuous pLTV targets). leadforge.schemes.lead_scoring.render.tasks is now a thin wrapper that defaults the task, so tasks/ output stays byte-identical.
- This repopulates the leadforge.render.tasks path that LTV-Pf.2 vacated, now as shared envelope code (peer to render.manifests / render.relational_io); the module-layout lock test is updated to reflect that intended end state.

Lifecycle task families (schemes/lifecycle/tasks.py):
- lifecycle_task_manifests(regime) builds, per regime, three pltv_revenue_{90,365,730}d regression tasks + a churned_within_180d classification task. Calendar regime unprefixed; early-pLTV regime early_-prefixed so both families occupy distinct task directories. Targets/windows mirror the snapshot catalog so specs and columns cannot drift. Completes LTV-Pc.

Tests (53 new): config field validation; TaskManifest regression type + rejection; shared writer on a continuous target (values preserved, deterministic); lifecycle task-family shape/uniqueness/prefix/targets.

Full suite 1835 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Copilot

Pull request overview

Adds lifecycle-scheme configuration fields and regression-capable task manifests, plus a shared deterministic task-split writer that both lead-scoring and lifecycle can use (with lifecycle wiring deferred to the next sub-PR).

Changes:

Extend GenerationConfig with validated lifecycle fields (n_customers, forward_windows_days, early_tenure_weeks, observation_date).
Add task_type validation to TaskManifest and introduce VALID_TASK_TYPES including regression.
Lift the deterministic shuffle/split/write logic into leadforge/render/tasks.py and make lead-scoring delegate; add lifecycle task-family definitions and targeted tests/docs updates.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.


| File | Description |
| --- | --- |
| tests/schemes/test_module_layout.py | Updates module-layout lock tests to reflect shared `leadforge.render.tasks` repopulation. |
| tests/schemes/lifecycle/test_tasks.py | Adds tests for regression task types and lifecycle task-family invariants. |
| tests/render/test_shared_task_writer.py | Adds tests for scheme-agnostic split writer, including continuous target preservation and determinism. |
| tests/core/test_config_lifecycle_fields.py | Adds coverage for new lifecycle config defaults and validation. |
| leadforge/schemes/lifecycle/tasks.py | Defines per-regime lifecycle task manifests (pLTV regression + churn classification). |
| leadforge/schemes/lead_scoring/render/tasks.py | Replaces lead-scoring split writer with a thin wrapper over the shared writer. |
| leadforge/schema/tasks.py | Introduces `VALID_TASK_TYPES` and validates `TaskManifest.task_type`; updates docs to be target-agnostic. |
| leadforge/render/tasks.py | New shared deterministic shuffle/split/write implementation for task exports. |
| leadforge/core/models.py | Adds lifecycle config fields and isolated validation in `GenerationConfig`. |
| docs/ltv/roadmap.md | Marks roadmap items as completed and documents the new lifecycle/task plumbing. |
| .agent-plan.md | Updates the running plan/status text for the LTV workstream. |


@@ -81,7 +81,8 @@ Total: ~19 PRs across 9 milestones.
  scope (folds into `LTV-Pn`):** regression task specs + a `task_type`
  (`regression` | `classification`) on the task model — they belong with the


…Pn.3] Self-review finding: the new lifecycle GenerationConfig fields (forward_windows_days, n_customers, early_tenure_weeks, observation_date) are not consumed by any pipeline yet (wiring is Pn.4), and the window/tenure defaults *duplicate* the scheme's canonical constants (schemes.lifecycle.snapshots.FORWARD_WINDOWS_DAYS / DEFAULT_EARLY_TENURE_WEEKS) plus the engine's early_tenure default.

The duplication is forced: core.models must not import a scheme (the Pn.2 layering cleanup), so the config copy can't reference the scheme constant. The risk is silent drift: a default-config bundle would carry windows/tenure that disagree with the columns the snapshot builder actually produces.

Fix (no behavior change):
- tests/schemes/lifecycle/test_config_consistency.py pins the three copies equal (config forward windows == snapshot constant; config early tenure == snapshot constant; engine early-tenure default == snapshot constant). This test layer may import both core and the scheme, so it can enforce what the layering forbids core from importing.
- Documented on GenerationConfig: the fields are not yet consumed, Pn.4 makes the config authoritative, and the duplication is guarded by the test above.

Full suite 1838 passed / 51 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
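The drift guard described in this commit can be pictured with a small sketch. The three `SimpleNamespace` objects are placeholder stand-ins for the config defaults, the snapshot module, and the engine defaults, and their values are illustrative assumptions. In the repository the test would import the real objects instead, and its actual contents are not shown on this page.

```python
from types import SimpleNamespace

# Placeholder stand-ins; values are assumptions, not the repository's constants.
config_defaults = SimpleNamespace(
    forward_windows_days=(90, 365, 730), early_tenure_weeks=4
)
snapshots = SimpleNamespace(
    FORWARD_WINDOWS_DAYS=(90, 365, 730), DEFAULT_EARLY_TENURE_WEEKS=4
)
engine_defaults = SimpleNamespace(early_tenure_weeks=4)


def test_config_forward_windows_match_snapshot_constant() -> None:
    assert tuple(config_defaults.forward_windows_days) == tuple(
        snapshots.FORWARD_WINDOWS_DAYS
    )


def test_config_early_tenure_matches_snapshot_constant() -> None:
    assert config_defaults.early_tenure_weeks == snapshots.DEFAULT_EARLY_TENURE_WEEKS


def test_engine_early_tenure_matches_snapshot_constant() -> None:
    assert engine_defaults.early_tenure_weeks == snapshots.DEFAULT_EARLY_TENURE_WEEKS
```

Pinning each copy against the snapshot constant, rather than against each other, keeps one value as the reference point, so a failing test names the copy that drifted.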

github-actions · 2026-06-13T09:58:49Z

pr-agent-context report:

This run includes an unresolved review comment on PR #124 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: docs/ltv/roadmap.md:82
URL: https://github.com/leadforge-dev/leadforge/pull/124#discussion_r3407595317
Root author: copilot-pull-request-reviewer

Comment:
    The roadmap still documents the `task_type` enum as `regression | classification`, but the implemented/validated values are `binary_classification` and `regression` (see `VALID_TASK_TYPES`). Using `classification` here is misleading for contributors and readers.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 27463528933 attempt 1
Comment timestamp: 2026-06-13T09:58:01.938335+00:00
PR head commit: 628ced51c20e9f204a7227d9c0f4b063c184c3e1

Copilot AI review requested due to automatic review settings June 13, 2026 07:17

shaypal5 added this to the dataset: leadforge-ltv-v1 milestone Jun 13, 2026

shaypal5 added the labels `type: feature`, `layer: schema`, `layer: render`, `layer: api`, and `dataset: leadforge-ltv-v1` on Jun 13, 2026

Copilot started reviewing on behalf of shaypal5 June 13, 2026 07:17


Copilot AI reviewed Jun 13, 2026


shaypal5 merged commit 8296f4e into main Jun 14, 2026
10 checks passed

shaypal5 deleted the feat/lifecycle-config-regression-tasks branch June 14, 2026 08:21

shaypal5 merged 2 commits into main from feat/lifecycle-config-regression-tasks
