Fix: sql server param limit by andres-sole · Pull Request #127 · Meaningful-Data/dpmcore

andres-sole · 2026-06-11T10:17:35Z

Closes #126

Summary

SQL Server caps a single statement at 2,100 bound parameters, so the layout exporter crashed (pyodbc 07002) whenever a module referenced more than ~2,100 variable versions — every ID was bound into one unbounded IN (...). This PR introduces a shared chunking helper and migrates the high-volume, data-sized IN call sites across the codebase to use it.

What was done

Added dpmcore.orm.query_utils.chunked_in (with IN_CHUNK_SIZE = 900) — splits a column.in_(values) filter into fixed-size batches that stay well under the cap on every backend, and concatenates the results. Values are de-duplicated (order-preserving) so the chunked result matches single-statement IN semantics even when callers pass duplicates that would straddle batches.
Migrated the unbounded IN lookups in the layout exporter, structure/semantic/scope-calculator services, Meili JSON, the DPM-XL model queries, and the structure router to chunked_in.
Reworked _load_member_codes so the domain filter is applied in Python instead of a second unbounded IN, keeping the chunked statement under the cap regardless of how many domains an export spans. Its member-code result is now deterministic (highest (category_id, code) wins) rather than dependent on backend row order.
Replaced the SQLite-only chunker in the Meili JSON service with the shared helper and updated tests accordingly.
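
To illustrate the batching described above, here is a minimal sketch of the technique. It is not the actual `dpmcore.orm.query_utils.chunked_in`, whose signature is not shown in this description. It assumes a plain `sqlite3` connection rather than the ORM, and `chunked_in_sketch`, `dedupe_preserving_order`, `iter_chunks`, and the `variable_version` table are hypothetical names chosen for the example. Only `IN_CHUNK_SIZE = 900` is taken from the PR text.

```python
import sqlite3
from typing import Iterable, Iterator, List, Sequence

IN_CHUNK_SIZE = 900  # value quoted in the PR description


def dedupe_preserving_order(values: Iterable) -> list:
    # dict keys keep insertion order, so the first occurrence of each value wins.
    return list(dict.fromkeys(values))


def iter_chunks(values: Sequence, size: int) -> Iterator[Sequence]:
    # Yield consecutive slices of at most `size` items.
    for start in range(0, len(values), size):
        yield values[start:start + size]


def chunked_in_sketch(
    conn: sqlite3.Connection,
    table: str,
    column: str,
    values: Iterable,
    chunk_size: int = IN_CHUNK_SIZE,
) -> List[tuple]:
    # Hypothetical helper: run one `IN (...)` statement per batch and
    # concatenate the rows, so no statement binds more than `chunk_size` values.
    rows: List[tuple] = []
    unique_values = dedupe_preserving_order(values)
    for chunk in iter_chunks(unique_values, chunk_size):
        placeholders = ", ".join("?" for _ in chunk)
        # `table` and `column` are interpolated, so they must be trusted identifiers.
        sql = f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"
        rows.extend(conn.execute(sql, list(chunk)).fetchall())
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE variable_version (id INTEGER PRIMARY KEY, code TEXT)")
    conn.executemany(
        "INSERT INTO variable_version VALUES (?, ?)",
        [(i, f"v{i}") for i in range(2500)],
    )
    # Duplicates that would land in different batches are removed first,
    # so each matching row is returned once.
    ids = list(range(2500)) + [0, 1, 2499]
    print(len(chunked_in_sketch(conn, "variable_version", "id", ids)))
```

De-duplicating before slicing is what keeps the concatenated result equivalent to a single `IN`: without it, a value repeated in two batches would return its row twice.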

Notes

Two remaining IN sites flagged against #126 (Layout exporter fails on SQL Server when a module references more than 2,100 variable versions) are intentionally left: dpm_xl/utils/filters.py binds only release IDs (bounded by the number of releases, never near the cap) and is a query-builder that can't use chunked_in; dpm_xl/ast/operands.py is a pandas-compiled .distinct() path that needs a separate chunk-and-concat approach. Both are tracked as a follow-up: this PR is part of #126, not a full close.

Checklist

Code quality checks pass (ruff format, ruff check, mypy)
Tests pass (pytest) with 100% branch coverage (coverage report --fail-under=100)
Documentation updated (if applicable)

Copilot

Pull request overview

This PR addresses SQL Server’s 2,100 bound-parameter limit by introducing a shared ORM helper that batches large IN (...) filters, then migrating several high-volume query call sites (layout exporter, structure/semantic/scope services, Meili JSON, and DPM-XL queries) to use it.

Changes:

Added dpmcore.orm.query_utils.chunked_in (with unit tests) to safely batch IN (...) predicates across supported backends.
Replaced multiple unbounded .in_(...) usages across services/utilities with chunked_in to prevent SQL Server crashes on large modules.
Updated Meili JSON tests to patch the shared helper (removing the service-local chunking implementation and its tests).

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/unit/orm/test_query_utils.py	Adds unit coverage for the new `chunked_in` batching helper.
tests/unit/meili/test_meili_json_service.py	Removes tests for the deleted local chunking helper; updates mocking to target `chunked_in`.
src/dpmcore/orm/query_utils.py	Introduces `chunked_in` and `IN_CHUNK_SIZE` as shared query utilities.
src/dpmcore/services/layout_exporter/queries.py	Uses `chunked_in` to batch large ID lookups during layout export queries.
src/dpmcore/services/structure.py	Replaces multiple unbounded `IN` filters with `chunked_in` for bulk-loading structure data.
src/dpmcore/services/semantic.py	Uses `chunked_in` for module-scope lookups and re-deduplicates results across chunked DISTINCT queries.
src/dpmcore/services/scope_calculator.py	Applies `chunked_in` to bulk module/table/key lookups to avoid SQL Server parameter limits.
src/dpmcore/services/meili_json.py	Switches bulk loaders from a local chunking helper to the shared `chunked_in`.
src/dpmcore/server/routers/structure.py	Uses `chunked_in` when resolving organisation acronyms to IDs.
src/dpmcore/dpm_xl/model_queries.py	Uses `chunked_in` in DataFrame-producing model query helpers to avoid oversized `IN` predicates.
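
The re-deduplication noted for `semantic.py` is needed because `DISTINCT` only removes duplicates within one statement: the same value can come back from several batches. A minimal sketch of that merge step follows, assuming the per-batch query is supplied as a callback. `distinct_across_chunks` and `fetch_distinct` are hypothetical names, not the service's actual code.

```python
from typing import Callable, Hashable, Iterable, List, Sequence


def distinct_across_chunks(
    values: Iterable[Hashable],
    fetch_distinct: Callable[[Sequence[Hashable]], Iterable[Hashable]],
    chunk_size: int = 900,
) -> List[Hashable]:
    # Hypothetical helper: `fetch_distinct` stands in for one
    # `SELECT DISTINCT ... WHERE col IN (...)` call on a single batch.
    unique_values = list(dict.fromkeys(values))
    seen: dict = {}
    for start in range(0, len(unique_values), chunk_size):
        chunk = unique_values[start:start + chunk_size]
        for row in fetch_distinct(chunk):
            # Keep only the first time a row appears across all batches.
            seen.setdefault(row, None)
    return list(seen)


if __name__ == "__main__":
    # Stand-in for a query that maps table IDs to a module ID.
    def fake_fetch(chunk: Sequence[int]) -> List[int]:
        return sorted({table_id % 3 for table_id in chunk})

    print(distinct_across_chunks(range(2000), fake_fetch))
```

Each batch here returns the same three module IDs, so simply concatenating the batches would repeat them; the merge keeps one copy of each.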

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

andres-sole added 6 commits June 11, 2026 12:15

feat(orm): add chunked_in helper for the SQL Server 2,100-parameter limit

bd9eed9

fix(layout-exporter): chunk IN clauses to fix SQL Server export crash (#126)

6ad16eb

refactor(meili-json): replace the SQLite-only chunker with shared chunked_in

0a29bd7

fix(semantic,scope-calculator): chunk module/table IN lookups (DISTINCT preserved)

5c2b20f

fix(dpm-xl,server): chunk IN clauses in model queries and the structure router

db6768e

fix(structure): chunk IN clauses in batch loaders

a18d194

andres-sole requested a review from a team June 11, 2026 10:17

Merge branch 'master' into fix/sql-server-param-limit

5b16612

andres-sole requested a review from Copilot June 11, 2026 10:31

Copilot started reviewing on behalf of andres-sole June 11, 2026 10:31

Copilot AI reviewed Jun 11, 2026

Comment thread src/dpmcore/orm/query_utils.py Outdated

Comment thread src/dpmcore/services/layout_exporter/queries.py Outdated

andres-sole added 3 commits June 11, 2026 13:16

fix(orm): dedup chunked_in values so cross-chunk duplicates don't double rows

e3dd1ef

fix(layout-exporter): move _load_member_codes domain filter to Python to stay under the param cap, with a deterministic member-code winner

a83a566

Merge branch 'fix/sql-server-param-limit' of github.com:Meaningful-Data/dpmcore into fix/sql-server-param-limit

9d7d9d5

andres-sole requested a review from Copilot June 11, 2026 11:17

Copilot started reviewing on behalf of andres-sole June 11, 2026 11:18

Copilot AI reviewed Jun 11, 2026

Comment thread src/dpmcore/services/layout_exporter/queries.py Outdated

Comment thread src/dpmcore/orm/query_utils.py

docs(layout-exporter): correct the _load_member_codes chunk-ordering rationale

b359e70

andres-sole requested a review from ruizmaa June 11, 2026 13:20

andres-sole wants to merge 11 commits into master from fix/sql-server-param-limit

2 participants
