fix(ci): pin scikit-learn<1.9 to restore G13.1 notebook gate#115

Merged
shaypal5 merged 1 commit into main from fix/pin-sklearn-lt-1.9 on Jun 11, 2026

@shaypal5

Summary

Restores the failing Execute release notebooks (G13.1) CI gate by pinning
scikit-learn<1.9 in all three relevant extras ([dev], [scripts],
[notebooks]). The gate has been failing on every PR since sklearn 1.9.0
shipped, because the uncapped >=1.3 was resolving to 1.9.0 in CI.

Root cause

scikit-learn 1.9.0 changed HistGradientBoostingClassifier defaults in a
way that improves the flat GBM model substantially while the
engineered-feature GBM does not keep pace. In notebook 02
(02_relational_feature_engineering.ipynb) this causes the headline
GBM(eng)−GBM(flat) lift to go non-positive, triggering:

AssertionError: notebook 02 metric panel (seed 42, intermediate) drifted outside tolerance:
  gbm_flat_auc:      observed=0.6339 target=0.6023 |diff|=0.0316 > tol=0.0200
  headline_lift_auc: observed=-0.0253 target=0.0110 |diff|=0.0363 > tol=0.0150

Note: random_state=SEED is already set in the notebook's GBM constructor, so
the drift is deterministic for a given scikit-learn version and is not caused
by random seeding.
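
For readers unfamiliar with the gate, the tolerance check behind the error above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: `NB02_TOLERANCES` and `find_drift` are hypothetical names, and only the two metrics that appear in the CI log are included.

```python
# Values copied from the CI log above. NB02_TARGETS is named in this PR;
# NB02_TOLERANCES and find_drift are hypothetical names used for illustration.
NB02_TARGETS = {"gbm_flat_auc": 0.6023, "headline_lift_auc": 0.0110}
NB02_TOLERANCES = {"gbm_flat_auc": 0.0200, "headline_lift_auc": 0.0150}


def find_drift(observed, targets, tolerances):
    """Return one message per metric whose |observed - target| exceeds its tolerance."""
    failures = []
    for name, target in targets.items():
        diff = abs(observed[name] - target)
        if diff > tolerances[name]:
            failures.append(
                f"{name}: observed={observed[name]:.4f} target={target:.4f} "
                f"|diff|={diff:.4f} > tol={tolerances[name]:.4f}"
            )
    return failures


# Metrics observed in CI on scikit-learn 1.9.0 (seed 42, intermediate bundle).
observed_on_1_9 = {"gbm_flat_auc": 0.6339, "headline_lift_auc": -0.0253}
failures = find_drift(observed_on_1_9, NB02_TARGETS, NB02_TOLERANCES)
for line in failures:
    print(line)
```

Both metrics exceed their tolerance here, which matches the two lines reported in the AssertionError.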

Fix

Pin scikit-learn>=1.3,<1.9 in pyproject.toml (dev + scripts + notebooks
extras). All three kept in sync with the same upper bound.

This is a hold, not a resolution

Tracking issue #114 documents the root cause and the steps to remove the
pin:

  1. Identify which 1.9 default changed and whether recalibrating NB02_TARGETS
    is appropriate, or whether the relational engineered features need updating to
    restore positive lift on 1.9.
  2. Once resolved, widen the bound and recheck all three metrics.
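
One possible way to approach step 1 is to dump the estimator's default parameters under each scikit-learn version and diff them. The sketch below shows only the diffing half. `diff_params` is a hypothetical helper, and the two dictionaries are placeholders rather than real scikit-learn defaults; in practice each would come from `HistGradientBoostingClassifier().get_params()` run in an environment with the respective version installed.

```python
ABSENT = "<absent>"  # marker for a parameter that exists in only one version


def diff_params(old, new):
    """Return {name: (old_value, new_value)} for every parameter that differs."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, ABSENT)
        after = new.get(name, ABSENT)
        if before != after:
            changed[name] = (before, after)
    return changed


# Placeholder dictionaries, NOT actual scikit-learn defaults.
params_old = {"param_a": 1, "param_b": "x"}
params_new = {"param_a": 2, "param_b": "x", "param_c": True}

changed = diff_params(params_old, params_new)
for name, (before, after) in changed.items():
    print(f"{name}: {before!r} -> {after!r}")
```

Any parameter this reports is only a candidate explanation; the metric drift would still need to be reproduced by setting that parameter explicitly under 1.9.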

Testing

  • Main test suite: 1544 passed / 46 skipped (all pass at sklearn 1.7.x, the
    pinned local version).
  • The G13.1 gate itself can only be verified in CI (it requires the pre-built
    release bundles on disk). The pin is expected to restore it because the
    failure is deterministically tied to scikit-learn 1.9.0, the version
    observed in the CI logs.

🤖 Generated with Claude Code

scikit-learn 1.9.0 changed HistGradientBoostingClassifier defaults in a
way that improves the flat GBM more than the engineered one in NB02
(02_relational_feature_engineering.ipynb), causing the headline
GBM(eng)−GBM(flat) AUC lift to go from +0.0110 to -0.0253 and breaking
the `assert observed["headline_lift_auc"] > 0.0` guard.

Observed (sklearn 1.9.0, seed 42, intermediate bundle):
  gbm_flat_auc:      0.6339  (target 0.6023, |diff| 0.0316 > tol 0.0200)
  headline_lift_auc: -0.0253 (target 0.0110, |diff| 0.0363 > tol 0.0150)

Pin `scikit-learn>=1.3,<1.9` in [dev], [scripts], and [notebooks] extras
to restore CI while the root cause is investigated. Tracking issue: #114.
The pin should be removed once NB02 targets are recalibrated or the
engineered features are revised to provide positive lift on sklearn 1.9.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 11, 2026 19:12
@shaypal5 added the labels "type: bugfix" (Fixes a bug) and "type: ci" (CI/CD pipeline changes) on Jun 11, 2026
@github-actions

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #115 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: pull request opened
Workflow run: 27371198627 attempt 1
Comment timestamp: 2026-06-11T19:12:41.337464+00:00
PR head commit: 8a8f3946b6f830244fb19ef3e501855c0f1b4531

Copilot AI left a comment

Pull request overview

Pins scikit-learn to <1.9 in the relevant optional dependency extras to restore the Execute release notebooks (G13.1) CI gate that began failing after scikit-learn 1.9.0 changed HistGradientBoostingClassifier defaults, causing deterministic notebook-metric drift.

Changes:

  • Pin scikit-learn>=1.3,<1.9 in [project.optional-dependencies].dev.
  • Keep the same pin in the [scripts] and [notebooks] extras (explicitly noted as kept in sync).
  • Add in-file context and a link to issue #114 documenting the temporary nature of the pin.

@shaypal5 merged commit 14d9a9e into main on Jun 11, 2026
11 checks passed
@shaypal5 deleted the fix/pin-sklearn-lt-1.9 branch on June 11, 2026 at 19:43