refactor(benchmarks): decouple CI baseline from analytics + unify measurements() (split prep) by FBumann · Pull Request #34 · fluxopt/linopy

FBumann · 2026-06-08T17:18:33Z

TODO (human): one line on why.

Note

Generated by AI (Claude).

Foundation for splitting the local analytics (plotting / sweep / memray engine / CLI) out of linopy into a standalone package — does the two structural changes that make that split clean, with no behaviour change.

Changes

1. Decouple the CI baseline from the analytics layer.
The CodSpeed baseline (registry + models/patterns + phases + test_*.py + conftest) now imports none of {memory, sweep, plotting, snapshot, bench, cli}. Two things were in the way:

- `spec_param_id` (the `<name>-<axis>=<value>` convention) moved from `snapshot.py` into `registry.py`; it is a core concept used by the drivers.
- `benchmarks/__init__` no longer eagerly imports `bench` (`from benchmarks import bench` still works on demand).
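
For illustration, the `<name>-<axis>=<value>` convention could be sketched roughly as follows. The signature is an assumption (the PR description does not show it, and the real helper in `registry.py` may take a spec object instead); only the id format itself comes from the text above.

```python
def spec_param_id(name: str, axis: str, value: object) -> str:
    """Build a pytest param id in the ``<name>-<axis>=<value>`` form.

    Hypothetical signature, chosen only to illustrate the id format.
    """
    return f"{name}-{axis}={value}"


# Ids of this shape show up in the CodSpeed report, e.g. test_build[milp-n=50].
print(spec_param_id("milp", "n", 50))
print(spec_param_id("kvl_cycles", "severity", 100))
```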

Result: importing the package + the three CodSpeed test modules pulls in zero analytics; the dependency arrow only points analytics → core.
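
One way such an import-cleanliness claim could be checked is sketched below. This is not taken from the PR: `leaked_modules` is a hypothetical helper that imports a package in a fresh interpreter and reports which forbidden submodules ended up in `sys.modules`.

```python
import subprocess
import sys


def leaked_modules(package: str, forbidden: set[str]) -> set[str]:
    """Return the forbidden submodules that importing `package` pulls in.

    Runs in a fresh interpreter so modules already imported in this
    process cannot skew the result.
    """
    code = (
        "import sys, importlib\n"
        f"importlib.import_module({package!r})\n"
        "print('\\n'.join(sys.modules))\n"
    )
    out = subprocess.run(
        [sys.executable, "-c", code], check=True, capture_output=True, text=True
    ).stdout
    loaded = set(out.split())
    return {f"{package}.{name}" for name in forbidden} & loaded


# The analytics set named in the description above.
ANALYTICS = {"memory", "sweep", "plotting", "snapshot", "bench", "cli"}

# Demonstrated on the stdlib, since the benchmarks package is not available here:
# `import json` loads json.decoder eagerly but leaves json.tool unimported.
print(leaked_modules("json", {"decoder", "tool"}))
```

Inside the repository, the equivalent check would presumably be that `leaked_modules("benchmarks", ANALYTICS)` is empty.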

2. One measurements() contract for timing + memory.
The pytest drivers and the memray engine each re-derived "what runs + its node id" per phase — a duplication test_memory_id_alignment.py only band-aided. Now phases own phase_cases(phase) -> Iterable[PhaseCase] (each case a setup→action→teardown context manager); the drivers parametrize over it and the memray engine loops over it.

- Every benchmark node id is byte-identical (verified by an empty `pytest --co` diff), so CodSpeed baselines are untouched.
- `_measurements` / `_phase_tag` are gone; the memray engine now measures every available solver (was highs-only); the alignment test is deleted (the ids are the same object now).
- `pipeline` becomes a first-class opt-in phase (`--phase pipeline`, both metrics; out of the default run + CI).
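
The shape of this contract can be sketched with toy workloads. Only the names `phase_cases` and `PhaseCase` come from the PR; the fields of `PhaseCase`, the toy cases, and `time_case` are assumptions made for illustration.

```python
from __future__ import annotations

import time
from collections.abc import Callable, Iterable, Iterator
from contextlib import AbstractContextManager, contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseCase:
    """One case: its id plus a factory for the setup/action/teardown context manager."""

    case_id: str
    context: Callable[[], AbstractContextManager[Callable[[], object]]]


@contextmanager
def _build_case(n: int) -> Iterator[Callable[[], object]]:
    # "build" phase: the build itself is the measured action, so there is no setup.
    yield lambda: list(range(n))


@contextmanager
def _export_case(n: int) -> Iterator[Callable[[], object]]:
    model = list(range(n))  # setup: runs before the action, outside the measured region
    try:
        yield lambda: ",".join(map(str, model))  # measured action
    finally:
        model.clear()  # teardown


def phase_cases(phase: str) -> Iterable[PhaseCase]:
    """Single source of truth for what runs in a phase and under which id."""
    make = {"build": _build_case, "to_lp": _export_case}[phase]
    for n in (10, 1000):
        yield PhaseCase(f"toy-n={n}", lambda n=n: make(n))


def time_case(case: PhaseCase) -> float:
    """Stand-in for a driver or engine: measure only the yielded action."""
    with case.context() as action:
        start = time.perf_counter()
        action()
        return time.perf_counter() - start


# Both consumers iterate the same generator, so their ids cannot drift apart.
pytest_ids = [case.case_id for case in phase_cases("to_lp")]
memray_ids = [case.case_id for case in phase_cases("to_lp")]
```

Because the ids are produced in one place, a separate alignment test has nothing left to compare.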

Verify

ruff + mypy clean; harness tests pass; collected-node-id diff empty for every existing benchmark.
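
A node-id comparison of this kind might look like the sketch below, assuming the output of `pytest --collect-only -q` was captured once on the base branch and once on this branch. The helper names and the sample file path are hypothetical; the PR only states that the diff was empty.

```python
def node_ids(collect_output: str) -> set[str]:
    """Extract node ids from `pytest --collect-only -q` output (lines containing '::')."""
    return {line.strip() for line in collect_output.splitlines() if "::" in line}


def node_id_diff(before: str, after: str) -> tuple[set[str], set[str]]:
    """Return (removed, added) node ids between two collection outputs."""
    old, new = node_ids(before), node_ids(after)
    return old - new, new - old


# Illustrative captured outputs; the file path is a placeholder.
before = "benchmarks/test_build.py::test_build[milp-n=50]\n\n1 test collected in 0.01s\n"
after = "benchmarks/test_build.py::test_build[milp-n=50]\n\n1 test collected in 0.02s\n"
removed, added = node_id_diff(before, after)
```

Summary lines such as the collection count and timing are ignored, so only genuine id changes show up in `removed` or `added`.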

…cs layer

Prep for splitting the local benchmarking (memory, sweep, plotting, …) into a standalone package: makes the CodSpeed CI baseline import-clean of the analytics layer:

- Move `spec_param_id` (the `<name>-<axis>=<value>` param-id convention) from `snapshot.py` into `registry.py`. It's used by the core (`registry.param_ids`, `test_to_solver`) and was the only core → analytics import.
- Stop eagerly importing `bench` in `benchmarks/__init__`; `models`/`patterns` stay (they register specs). `from benchmarks import bench` still works on demand (submodule import).

Result: importing the package + the three CodSpeed test modules pulls zero of {memory, sweep, plotting, snapshot, bench, cli}; the dependency arrow now only points analytics → core.

No behaviour change: collection, harness tests (incl. the memory↔pytest id alignment), and the memory CLI are all unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The pytest drivers and the memray engine each re-derived "what runs + its node id" for every phase, a duplication that `test_memory_id_alignment.py` existed to band-aid. Collapse it into a single source of truth: phases own a `phase_cases(phase) -> Iterable[PhaseCase]`, where each case is a context manager (setup → measured action → teardown).

- `phases.py`: `phase_cases` + the per-phase case CMs (build/matrices/to_lp/netcdf/to_solver/pipeline) + `PHASE_NODE`. Setup (build, scratch files) runs untimed/untracked before the yielded action; the build is excluded for export phases and included for build/pipeline, uniformly, by construction.
- drivers: each `test_<phase>` is now a thin parametrize-over-`phase_cases` + the shared `conftest.run_case`. Node ids are byte-identical (verified by an empty `pytest --co` id diff), so CodSpeed baselines are untouched.
- `memory.run_phase` consumes the same `phase_cases`; `_measurements` and `_phase_tag` are gone, and the memray engine now measures every available solver (was highs-only) with ids matching pytest's.
- `test_memory_id_alignment.py` deleted: the ids are now the same object, not two strings hoped to match.

Also makes `pipeline` a first-class phase (build → matrices → lp in one region, build included): selectable via `--phase pipeline` for both metrics, with a new `test_pipeline.py` for timing. It stays opt-in (deselected unless `--pipeline`) and out of CI (not in `CODSPEED_MODULES`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codspeed-hq · 2026-06-08T17:22:05Z

Merging this PR will not alter performance

✅ 79 untouched benchmarks
🆕 59 new benchmarks
⏩ 593 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
🆕	Memory	`test_to_lp[kvl_cycles-severity=100]`	N/A	38.5 MB	N/A
🆕	Memory	`test_to_solver[highs-knapsack-n=10000]`	N/A	2.5 MB	N/A
🆕	Memory	`test_to_solver[gurobi-knapsack-n=10000]`	N/A	2.8 MB	N/A
🆕	Memory	`test_to_solver[gurobi-sos-n=1000]`	N/A	3.1 MB	N/A
🆕	Memory	`test_to_solver[highs-expression_arithmetic-n=250]`	N/A	34.9 MB	N/A
🆕	Memory	`test_to_solver[gurobi-piecewise-n=1000]`	N/A	5.1 MB	N/A
🆕	Memory	`test_to_solver[gurobi-qp-n=1000]`	N/A	831.1 KB	N/A
🆕	Memory	`test_to_solver[gurobi-kvl_cycles-severity=100]`	N/A	198.8 MB	N/A
🆕	Memory	`test_to_lp[nodal_balance-severity=100]`	N/A	17.9 MB	N/A
🆕	Memory	`test_to_lp[merge_balance-severity=100]`	N/A	17.6 MB	N/A
🆕	Memory	`test_build[milp-n=50]`	N/A	283.7 KB	N/A
🆕	Memory	`test_to_solver[highs-kvl_cycles-severity=100]`	N/A	217 MB	N/A
🆕	Memory	`test_build[masked-n=100]`	N/A	735.4 KB	N/A
🆕	Memory	`test_to_solver[highs-sparse_network-n=250]`	N/A	64.8 MB	N/A
🆕	Memory	`test_to_solver[highs-cumsum-severity=100]`	N/A	70.7 MB	N/A
🆕	Memory	`test_to_lp[sparse_network-n=250]`	N/A	34.5 MB	N/A
🆕	Memory	`test_to_solver[highs-masked-n=100]`	N/A	659 KB	N/A
🆕	Memory	`test_build[sos-n=1000]`	N/A	444.1 KB	N/A
🆕	Memory	`test_to_solver[highs-piecewise-n=1000]`	N/A	4.2 MB	N/A
🆕	Memory	`test_to_lp[milp-n=50]`	N/A	63.3 KB	N/A
...	...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Comparing `refactor/benchmarks-decouple-core` (d2a5d54) with `master` (85c103c)

¹ 593 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

FBumann and others added 2 commits June 8, 2026 18:17

FBumann mentioned this pull request on Jun 8, 2026 in:

- bench: Add internal performance benchmark suite + CodSpeed CI (PyPSA/linopy#771, open, 5 tasks)
- Split local benchmark analytics into a standalone generic package (benchkit) (#35, open)

FBumann wants to merge 2 commits into `master` from `refactor/benchmarks-decouple-core`
