DEV-1571: fix T-SQL CTE-in-derived-table, T-SQL dotted ORDER BY, MySQL outer-wrap quoting#188
Conversation
…L outer-wrap quoting Three pre-existing dialect bugs surfaced by DEV-1564's xfail integration suites: Bug 1 (T-SQL): SLayer's outer-projection trim wrapped CTE'd inner SELECTs in a derived table, producing `SELECT ... FROM (WITH ctes ...) AS _outer` which T-SQL rejects. Adds an `emit_outer_wrap` strategy hook on `SqlDialect`; the base impl preserves today's shape, and `TsqlDialect` overrides to lift the top-level CTEs to the statement prefix. Bug 2 (T-SQL): bracketed dotted aliases like `[a.b]` don't resolve as SELECT aliases in T-SQL's ORDER BY scope. Adds bracket-anchored mangling via `rewrite_emitted_sql` / `decode_result_keys`, mirroring BigQuery. Extracts the bijective `encode_alias` / `decode_alias` into a shared `slayer/sql/dialects/_alias_mangle.py` and refactors `BigqueryDialect` to consume it. Bug 3 (MySQL): outer-wrap public projection used hardcoded ANSI double quotes; MySQL parses those as string literals. The base `emit_outer_wrap` now emits each public alias via `exp.Identifier(this=a, quoted=True). sql(dialect=self.sqlglot_name)` so quoting is sqlglot-driven (backticks on MySQL, brackets on T-SQL). `SQLGenerator._build_outer_wrap` becomes a thin delegate that strips trailing pagination and threads the generator's `_parse` to the hook so LOG10/SQLite-JSON rewrites survive T-SQL's re-parse. Test changes: - 38 new TDD tests across the dialect suite (shared bijection, T-SQL hoist + LIMIT/OFFSET/hidden-alias, MySQL backtick projection, base passthrough, BigQuery base quoting, generator delegation contract). - 8 xfail markers removed from `test_integration_sqlserver.py` (5) and `test_integration_mysql.py` (3). - 4 pre-existing T-SQL byte-equivalence/log10 tests updated to reflect the legitimate Bug 2 mangling and the sqlglot canonical form on the Bug 1 hoist path. 4668 unit tests pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 12 minutes and 58 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughFixes two T-SQL dialect bugs (DEV-1571): a new ChangesDEV-1571: T-SQL outer-wrap CTE hoisting and dotted-alias mangling
Sequence Diagram(s)sequenceDiagram
participant SQLGenerator
participant TsqlDialect
rect rgba(135, 206, 235, 0.5)
note over SQLGenerator,TsqlDialect: Outer-wrap construction (T-SQL path)
end
SQLGenerator->>SQLGenerator: _build_outer_wrap(inner_sql, order, limit, offset)
SQLGenerator->>SQLGenerator: strip trailing ORDER BY / LIMIT text from inner_sql
SQLGenerator->>TsqlDialect: emit_outer_wrap(inner_sql, public, order, limit, offset_arg, parse)
TsqlDialect->>TsqlDialect: parse inner_sql → exp.Select
alt top-level WITH present
TsqlDialect->>TsqlDialect: detach with_ from inner Select
TsqlDialect->>TsqlDialect: strip inner table qualifiers from ORDER BY columns
TsqlDialect->>TsqlDialect: wrap inner Select AS _outer derived table
TsqlDialect->>TsqlDialect: re-attach WITH at outer Select
TsqlDialect-->>SQLGenerator: final T-SQL string
else no WITH or parse error
TsqlDialect->>TsqlDialect: base emit_outer_wrap fallback
TsqlDialect-->>SQLGenerator: base T-SQL string
end
SQLGenerator->>TsqlDialect: rewrite_emitted_sql(sql)
TsqlDialect->>TsqlDialect: regex-replace [a.b] → [a___b] via encode_alias
TsqlDialect-->>SQLGenerator: mangled SQL
sequenceDiagram
participant Engine
participant FakeTsqlClient
participant TsqlDialect
participant SlayerResponse
Engine->>FakeTsqlClient: execute SQL
FakeTsqlClient-->>Engine: rows with mangled keys e.g. {orders___status: ...}
Engine->>TsqlDialect: decode_result_keys(rows)
TsqlDialect->>TsqlDialect: decode_alias each key orders___status → orders.status
TsqlDialect-->>Engine: rows with dotted keys
Engine->>SlayerResponse: build response with decoded data
SlayerResponse-->>Engine: resp.data with dotted alias keys
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…ionally Codex flagged that `TsqlDialect.emit_outer_wrap` fell back to the base impl whenever the inner SELECT had no top-level CTE, and the base impl emits a detached `Limit.sql(dialect="tsql")` fragment as literal `LIMIT N`. T-SQL queries that hit the DEV-1444 outer-projection trim without happening to have a CTE chain would therefore still emit invalid pagination — a latent regression in the same hot path Bug 3 lives on. Refactor: always take the AST-based path. The CTE detach is now optional (skipped when `with_node is None`); everything else — the derived-table wrap, the bracketed public projection, and the dialect-aware `TOP`/`FETCH NEXT` transposition — runs uniformly. Fallback to the base impl is retained for parse failures only. Adds `test_tsql_emit_outer_wrap_no_ctes_with_limit_transposes_pagination` pinning the regression. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…assembly
The original Bug 3 description in the ticket only mentioned the outer wrap.
The DEV-1564 integration suites (xfails removed in this PR) exposed that
several inner CTE-assembly sites in `SQLGenerator` also hardcoded
`f'"{...}"'`. MySQL parses those as string literals and fails outright;
T-SQL parses them as identifiers but the literal-dot form bypasses Bug 2's
bracket-anchored mangling so ORDER BY can't match the SELECT alias.
Refactor:
- Add `SQLGenerator._q(name)` — `exp.Identifier(this=name, quoted=True).
sql(dialect=self.dialect)` — the single dialect-aware quoter.
- Replace every hardcoded `f'"{...}"'` site in `generator.py`:
- `_build_combined`: final_parts, join ON parts.
- `_assemble_combined_sql`: ORDER BY parts.
- `_apply_pagination_to_sql` (converted from staticmethod to instance).
- `_generate_with_computed`: layer_parts, expression aliases, transform
aliases, time_col, self-join JOIN ON, join_cols, sjoin AS.
- Post-filter qualified-name wrapper.
- `_build_transform_sql` (converted from staticmethod): window measure,
time_col, PARTITION BY.
- `_build_self_join_column` (converted from staticmethod): table.measure.
- Re-emit `expr.sql` (carries ANSI-quoted aliases from enrichment's
`_resolve_sql`) by parsing in postgres dialect and re-emitting in the
active dialect. Applies to both the layered and fallback expression
paths in `_generate_with_computed`.
Tests:
- `tests/dialects/test_mysql.py::test_mysql_time_shift_inner_cte_uses_backticks_not_ansi_quotes`
pins that change_pct desugaring emits backticked identifiers throughout
the inner CTE chain on MySQL.
- `tests/dialects/test_tsql.py::test_tsql_time_shift_inner_cte_uses_mangled_brackets`
symmetric pin for T-SQL — bracket form throughout AND Bug 2 mangling
fires uniformly.
4671 unit tests pass; ruff clean. Expected to clear the 5 CI failures on
the time-shift / cross-model integration paths.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nd-mysql-sql-generator-bugs-surfaced-by-dev-1564
…+ un-xfail post-merge Two follow-ups after merging main: (1) Codex flagged `_query_as_model` at slayer/engine/query_engine.py:1932 emitted the rename-wrapper SQL with hardcoded ANSI double-quoted aliases (`SELECT "orders.id" AS id FROM (inner) AS _inner`). The inner subquery goes through SQLGenerator with dialect-aware quoting (and Bug 2's bracket mangling on T-SQL), so the wrapper's references no longer matched what the inner subquery exposes — query-backed model lookups on MySQL failed with a parse error and on T-SQL failed with `Invalid column name 'a_tbl___id'`. Now emits each alias via sqlglot's dialect-aware identifier quoting and runs the dialect's `rewrite_emitted_sql` post- pass so T-SQL bracket-mangling fires uniformly across inner and outer references. (2) The merge brought in DEV-1564 round 4's three new MySQL xfails plus one new T-SQL xfail that all pin on Bug 3. The Round 2 inner-CTE quoting fix already passes them, so they surface as XPASS(strict) failures. Remove the xfail markers: - tests/integration/test_integration_mysql.py: test_time_shift_with_date_range, test_consecutive_periods_with_boolean_predicate, test_change_with_date_range. - tests/integration/test_integration_sqlserver.py: test_consecutive_periods_with_boolean_predicate. 4690 unit tests pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|



Summary
Three pre-existing dialect bugs surfaced by DEV-1564's xfail integration suites — all in
SQLGenerator's outer-projection-trim path:WITHinside a derived-table subquery is rejected. NewSqlDialect.emit_outer_wrapstrategy hook;TsqlDialectoverrides to hoist top-level CTEs to the statement prefix.[a.b]) don't resolve as SELECT aliases in T-SQL'sORDER BYscope. Mangling mirrors BigQuery (.→___) viarewrite_emitted_sql/decode_result_keys. The bijection moves intoslayer/sql/dialects/_alias_mangle.pyandBigqueryDialectrefactors to share it.emit_outer_wrapnow emits each public alias viaexp.Identifier(this=a, quoted=True).sql(dialect=self.sqlglot_name)so quoting is sqlglot-driven (backticks on MySQL, brackets on T-SQL, ANSI elsewhere).SQLGenerator._build_outer_wrapbecomes a thin delegate that strips trailing pagination and threads the generator's_parseto the hook so LOG10/SQLite-JSON rewrites survive the T-SQL re-parse.Test plan
tests/dialects/covering: shared bijection round-trip + characterisation, T-SQL CTE hoist + LIMIT/OFFSET/hidden-alias + multi-CTE preservation + regex ASCII pin + false-positive characterisation, MySQL backtick projection + no-hoist regression guard, base passthrough + qualifier strip, BigQuery base-quoting via inherited hook, generator delegation contract + pagination strip.@pytest.mark.xfailmarkers removed:tests/integration/test_integration_sqlserver.py:test_time_shift_with_date_range,test_change_with_date_range,test_change_pct_with_date_range,test_multiple_date_range_shifts,test_integration_sqlserver_cross_model_derived_columnsql.tests/integration/test_integration_mysql.py:test_change_pct_with_date_range,test_multiple_date_range_shifts,test_cross_model_measure.DATETRUNC(month, …)→DATETRUNC(MONTH, …); ANSI-quoted aliases → bracket-mangled).poetry run pytest -m "not integration"— 4668 passed, 6 skipped, 432 deselected.poetry run ruff check slayer/ tests/— clean.Stacked on top of #184 (DEV-1564); retarget to
mainafter that merges.🤖 Generated with Claude Code
Summary by CodeRabbit
WITHclauses, and correctedORDER BYhandling for dotted aliases.