refactor: pipeline deduplication + FollowupRampConfig cleanup#54
Merged
Conversation
- Extract shared pipeline functions (subsample, derive_features, softcap_expected_acv, assign_acquisition_wave, rename_and_select, inject_missingness_v6) into leadforge/pipelines/common.py - Extract shared ML pipeline (build_baseline_pipeline, build_preprocessor, fit_evaluate, get_feature_cols, sanitize_categoricals) into leadforge/pipelines/ml.py - Define CAT_FEATURES, NUM_FEATURES, BINARY_FEATURES once in common.py; validators and eval scripts import from there - Introduce FollowupRampConfig frozen dataclass in mechanisms/counts.py; LatentDecayIntensity accepts followup param (legacy params still work) - Fix subsample() to raise ValueError when insufficient negatives instead of silently returning a short DataFrame All 866 tests pass, lint clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
This PR refactors the lead-scoring pipeline code by centralizing shared dataset-transformation and ML-evaluation helpers, and it modernizes the latent follow-up configuration API in the mechanisms layer. The changes mostly reduce duplication across the v5/v6/v7 pipeline scripts and validators while tightening subsample() behavior when the requested class balance is impossible.
Changes:
- Added shared pipeline utilities in
leadforge/pipelines/common.pyand shared sklearn helpers inleadforge/pipelines/ml.py, then rewired v5/v6/v7 builders and validation/eval scripts to use them. - Introduced
FollowupRampConfigand updatedLatentDecayIntensity/policy wiring to support the grouped follow-up configuration while keeping legacy kwargs. - Updated the v5 subsample tests to expect
ValueErrorwhen there are insufficient negatives.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
tests/scripts/test_build_v5_snapshot.py |
Updates v5 subsample expectations for the new insufficient-negatives error behavior. |
scripts/validate_v7_dataset.py |
Replaces local feature lists / ML helpers with shared imports for v7 validation. |
scripts/validate_v6_dataset.py |
Replaces local feature lists / ML helpers with shared imports for v6 validation. |
scripts/quick_baseline_eval_v7.py |
Reuses shared preprocessing and feature-list utilities in the v7 baseline-eval script. |
scripts/quick_baseline_eval_v6.py |
Reuses shared preprocessing and feature-selection utilities in the v6 baseline-eval script. |
leadforge/pipelines/ml.py |
New shared sklearn preprocessing, feature selection, and evaluation helpers for validators/eval scripts. |
leadforge/pipelines/common.py |
New shared pipeline constants and reusable transformation steps across pipeline versions. |
leadforge/pipelines/build_v7.py |
Refactors v7 builder to import common/shared pipeline steps instead of duplicating them. |
leadforge/pipelines/build_v6.py |
Refactors v6 builder to import common/shared pipeline steps instead of duplicating them. |
leadforge/pipelines/build_v5.py |
Refactors v5 builder to reuse shared subsample and feature-derivation helpers. |
leadforge/mechanisms/policies.py |
Switches latent follow-up policy wiring to the new FollowupRampConfig. |
leadforge/mechanisms/counts.py |
Adds FollowupRampConfig and updates LatentDecayIntensity to accept grouped follow-up config. |
.agent-plan.md |
Marks the follow-up deduplication/cleanup plan items as completed. |
Comments suppressed due to low confidence (1)
leadforge/mechanisms/counts.py:203
- The new
followup=FollowupRampConfig(...)path is untested.tests/mechanisms/test_mechanisms.pystill exercises only the legacy keyword arguments, so regressions in this branch or in the precedence rules between old/new APIs won't be caught.
followup: FollowupRampConfig | None = None,
# Legacy params — kept for backward compatibility during transition
followup_boost_after_day: int | None = None,
followup_boost_factor: float = 1.0,
followup_ramp_days: int = 10,
followup_latent_weights: dict[str, float] | None = None,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pre = ColumnTransformer( | ||
| [("num", num_tr, num_cols), ("cat", cat_tr, cat_cols)], remainder="drop" | ||
| ) | ||
| pre = build_preprocessor(num_cols, cat_cols) |
Comment on lines
+212
to
+213
| # Resolve followup config: prefer the dataclass, fall back to legacy params | ||
| if followup is not None: |
Comment on lines
+1
to
+5
| """Shared ML pipeline utilities for validation and evaluation scripts. | ||
|
|
||
| Provides the canonical sklearn pipeline (ColumnTransformer + imputation + | ||
| encoding + LogisticRegression) used across dataset validators and baseline | ||
| evaluation scripts. |
Comment on lines
+83
to
+86
| # Add any trap columns to numeric if not excluded | ||
| for c in df.columns: | ||
| if c.startswith(LEAKAGE_PREFIX) and c not in exclude: | ||
| num_cols.append(c) |
1. Extract compute_post_snapshot_touches to common.py (was duplicated between v6 and v7 — byte-for-byte identical) 2. Move FINAL_COLUMNS_STUDENT, FINAL_COLUMNS_INSTRUCTOR, RENAME_MAP, INSTRUCTOR_TRAP_COL to common.py (identical in v6/v7) 3. Make v5.rename_and_select delegate to common._rename_and_select_generic (was the only version not using the shared implementation) 4. Fix fit_evaluate() hardcoding seed=42 for the LR model — now passes the caller's seed through 5. Make insufficient positives raise ValueError (was asymmetric: negatives raised but positives only warned) 6. Move loop-internal imports to module level in validator scripts 7. Add __post_init__ validation to FollowupRampConfig so invalid configs fail at construction, not when passed to LatentDecayIntensity 8. Add tests: FollowupRampConfig dataclass path equivalence test, FollowupRampConfig validation test 868 tests pass, lint clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
pr-agent-context report: This run includes unresolved review comments on PR #54 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: scripts/quick_baseline_eval_v6.py:34
URL: https://github.com/leadforge-dev/leadforge/pull/54#discussion_r3179552763
Root author: copilot-pull-request-reviewer
Comment:
Switching `_build_pipeline()` over to `build_preprocessor()` changes the categorical encoder step name from `enc` to `encoder`, but the feature-importance block below still reads `pre.named_transformers_["cat"].named_steps["enc"]`. That path will now raise `KeyError` before the script can print importances.
## COPILOT-2
Location: leadforge/mechanisms/counts.py:221
URL: https://github.com/leadforge-dev/leadforge/pull/54#discussion_r3179552786
Root author: copilot-pull-request-reviewer
Comment:
When `followup` is provided, any legacy follow-up kwargs are silently ignored. During a migration it's very easy for callers to pass both old and new arguments and think they are being merged, so this should reject mixed usage instead of choosing one branch without warning.
## COPILOT-3
Location: leadforge/pipelines/ml.py:5
URL: https://github.com/leadforge-dev/leadforge/pull/54#discussion_r3179552797
Root author: copilot-pull-request-reviewer
Comment:
This adds another copy of the "canonical" preprocessing/evaluation helpers even though `leadforge/validation/lead_scoring.py:364-445` is already documented as the single source of truth for the baseline pipeline and is the copy that has direct test coverage. Leaving both implementations in place means future preprocessing changes can drift again between validation paths.
## COPILOT-4
Location: leadforge/pipelines/ml.py:86
URL: https://github.com/leadforge-dev/leadforge/pull/54#discussion_r3179552805
Root author: copilot-pull-request-reviewer
Comment:
Despite the docstring, this no longer falls back to dtype-based discovery for non-canonical columns; it only returns names from the hard-coded feature lists plus `__leakage__*` columns. That regresses `validate_v6_dataset.py` from evaluating *all* dataset columns to silently ignoring any unexpected extra feature, which makes the validator less likely to catch leaked or unintended columns that don't happen to be on the canonical list.Run metadata: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
subsample,derive_features,softcap_expected_acv,assign_acquisition_wave,rename_and_select,inject_missingness_v6) intoleadforge/pipelines/common.pybuild_baseline_pipeline,build_preprocessor,fit_evaluate,get_feature_cols,sanitize_categoricals) intoleadforge/pipelines/ml.pyCAT_FEATURES,NUM_FEATURES,BINARY_FEATURESonce incommon.py; validators and eval scripts import from thereFollowupRampConfigfrozen dataclass inmechanisms/counts.py;LatentDecayIntensityacceptsfollowupparam (legacy params still work for backward compat)subsample()to raiseValueErrorwhen insufficient negatives instead of silently returning a short DataFrameCloses the "v7 follow-up: pipeline deduplication + LatentDecayIntensity cleanup" deferred item from
.agent-plan.md.Test plan
test_build_v5_snapshot.pyto expectValueErrorfor insufficient negatives (wasUserWarning)ruff check .cleanruff formatclean🤖 Generated with Claude Code