Skip to content

DEV-1571: fix T-SQL CTE-in-derived-table, T-SQL dotted ORDER BY, MySQL outer-wrap quoting#188

Open
ZmeiGorynych wants to merge 5 commits into
mainfrom
egor/dev-1571-t-sql-and-mysql-sql-generator-bugs-surfaced-by-dev-1564
Open

DEV-1571: fix T-SQL CTE-in-derived-table, T-SQL dotted ORDER BY, MySQL outer-wrap quoting#188
ZmeiGorynych wants to merge 5 commits into
mainfrom
egor/dev-1571-t-sql-and-mysql-sql-generator-bugs-surfaced-by-dev-1564

Conversation

@ZmeiGorynych

@ZmeiGorynych ZmeiGorynych commented Jun 18, 2026

Copy link
Copy Markdown
Member

Summary

Three pre-existing dialect bugs surfaced by DEV-1564's xfail integration suites — all in SQLGenerator's outer-projection-trim path:

  • Bug 1 (T-SQL): WITH inside a derived-table subquery is rejected. New SqlDialect.emit_outer_wrap strategy hook; TsqlDialect overrides to hoist top-level CTEs to the statement prefix.
  • Bug 2 (T-SQL): bracketed dotted aliases ([a.b]) don't resolve as SELECT aliases in T-SQL's ORDER BY scope. Mangling mirrors BigQuery (.___) via rewrite_emitted_sql / decode_result_keys. The bijection moves into slayer/sql/dialects/_alias_mangle.py and BigqueryDialect refactors to share it.
  • Bug 3 (MySQL): outer-wrap public projection used hardcoded ANSI double quotes — MySQL parses those as string literals. The base emit_outer_wrap now emits each public alias via exp.Identifier(this=a, quoted=True).sql(dialect=self.sqlglot_name) so quoting is sqlglot-driven (backticks on MySQL, brackets on T-SQL, ANSI elsewhere).

SQLGenerator._build_outer_wrap becomes a thin delegate that strips trailing pagination and threads the generator's _parse to the hook so LOG10/SQLite-JSON rewrites survive the T-SQL re-parse.

Test plan

  • 38 new TDD tests across tests/dialects/ covering: shared bijection round-trip + characterisation, T-SQL CTE hoist + LIMIT/OFFSET/hidden-alias + multi-CTE preservation + regex ASCII pin + false-positive characterisation, MySQL backtick projection + no-hoist regression guard, base passthrough + qualifier strip, BigQuery base-quoting via inherited hook, generator delegation contract + pagination strip.
  • 8 @pytest.mark.xfail markers removed:
    • tests/integration/test_integration_sqlserver.py: test_time_shift_with_date_range, test_change_with_date_range, test_change_pct_with_date_range, test_multiple_date_range_shifts, test_integration_sqlserver_cross_model_derived_columnsql.
    • tests/integration/test_integration_mysql.py: test_change_pct_with_date_range, test_multiple_date_range_shifts, test_cross_model_measure.
  • 4 pre-existing T-SQL byte-equivalence/log10-preservation tests updated to reflect the legitimate Bug 2 mangling and sqlglot's canonical form on the Bug 1 hoist path (DATETRUNC(month, …)DATETRUNC(MONTH, …); ANSI-quoted aliases → bracket-mangled).
  • poetry run pytest -m "not integration" — 4668 passed, 6 skipped, 432 deselected.
  • poetry run ruff check slayer/ tests/ — clean.

Stacked on top of #184 (DEV-1564); retarget to main after that merges.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Fixed T-SQL generation for derived tables that include nested WITH clauses, and corrected ORDER BY handling for dotted aliases.
    • Improved alias encoding/decoding so BigQuery and T-SQL keep projected and returned column names consistent.
  • Tests
    • Added broader unit and integration coverage for dialect SQL output and result-key decoding, and removed several previously expected failures now resolved.

…L outer-wrap quoting

Three pre-existing dialect bugs surfaced by DEV-1564's xfail integration suites:

Bug 1 (T-SQL): SLayer's outer-projection trim wrapped CTE'd inner SELECTs in
a derived table, producing `SELECT ... FROM (WITH ctes ...) AS _outer` which
T-SQL rejects. Adds an `emit_outer_wrap` strategy hook on `SqlDialect`; the
base impl preserves today's shape, and `TsqlDialect` overrides to lift the
top-level CTEs to the statement prefix.

Bug 2 (T-SQL): bracketed dotted aliases like `[a.b]` don't resolve as
SELECT aliases in T-SQL's ORDER BY scope. Adds bracket-anchored mangling
via `rewrite_emitted_sql` / `decode_result_keys`, mirroring BigQuery.
Extracts the bijective `encode_alias` / `decode_alias` into a shared
`slayer/sql/dialects/_alias_mangle.py` and refactors `BigqueryDialect`
to consume it.

Bug 3 (MySQL): outer-wrap public projection used hardcoded ANSI double
quotes; MySQL parses those as string literals. The base `emit_outer_wrap`
now emits each public alias via `exp.Identifier(this=a, quoted=True).
sql(dialect=self.sqlglot_name)` so quoting is sqlglot-driven (backticks
on MySQL, brackets on T-SQL).

`SQLGenerator._build_outer_wrap` becomes a thin delegate that strips
trailing pagination and threads the generator's `_parse` to the hook
so LOG10/SQLite-JSON rewrites survive T-SQL's re-parse.

Test changes:
- 38 new TDD tests across the dialect suite (shared bijection, T-SQL hoist
  + LIMIT/OFFSET/hidden-alias, MySQL backtick projection, base
  passthrough, BigQuery base quoting, generator delegation contract).
- 8 xfail markers removed from `test_integration_sqlserver.py` (5) and
  `test_integration_mysql.py` (3).
- 4 pre-existing T-SQL byte-equivalence/log10 tests updated to reflect
  the legitimate Bug 2 mangling and the sqlglot canonical form on the
  Bug 1 hoist path.

4668 unit tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@linear

linear Bot commented Jun 18, 2026

Copy link
Copy Markdown

DEV-1571

DEV-1564

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@ZmeiGorynych, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 12 minutes and 58 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c2278508-7d9d-4abe-a1e8-993fd3a88c37

📥 Commits

Reviewing files that changed from the base of the PR and between 71596d9 and 6862441.

📒 Files selected for processing (3)
  • slayer/sql/generator.py
  • tests/dialects/test_mysql.py
  • tests/dialects/test_tsql.py
📝 Walkthrough

Walkthrough

Fixes two T-SQL dialect bugs (DEV-1571): a new _alias_mangle module provides shared bijective encode/decode for dotted aliases; a SqlDialect.emit_outer_wrap hook (with T-SQL and BigQuery overrides) replaces hardcoded outer-wrap logic in SQLGenerator; T-SQL hoists inner CTEs and mangles bracket-quoted dotted aliases; MySQL/SQL Server xfail markers removed.

Changes

DEV-1571: T-SQL outer-wrap CTE hoisting and dotted-alias mangling

Layer / File(s) Summary
Shared bijective alias-mangle module
slayer/sql/dialects/_alias_mangle.py, tests/dialects/test__alias_mangle.py
_ALIAS_SEP, encode_alias, and decode_alias implement a lossless .___ encoding with escape-doubling for pre-existing ___ sequences. Tests pin separator constant, all encode/decode cases, round-trip property, and documented out-of-domain behaviour.
Base emit_outer_wrap hook and generator delegation
slayer/sql/dialects/base.py, slayer/sql/generator.py, tests/dialects/test_base.py, tests/dialects/test_generator_dispatch.py
SqlDialect.emit_outer_wrap is the new Postgres-shaped default that wraps inner_sql as derived table _outer, projects quoted public aliases, and re-attaches detached pagination AST nodes. SQLGenerator._build_outer_wrap strips pagination text and fully delegates to the hook. Tests cover shape, CTE preservation, ORDER BY handling, qualifier stripping, and dispatch.
BigQuery refactored to shared alias helpers
slayer/sql/dialects/bigquery.py, tests/dialects/test_bigquery.py
Removes local _encode_alias/_decode_alias; both call sites in rewrite_emitted_sql and decode_result_keys now use the shared helpers. New test pins backtick quoting on BigqueryDialect.emit_outer_wrap.
T-SQL dialect: CTE hoisting and dotted-alias mangling
slayer/sql/dialects/tsql.py, tests/dialects/test_tsql.py
TsqlDialect.emit_outer_wrap hoists inner WITH clauses to the outermost SELECT to comply with T-SQL derived-table restrictions. _TSQL_DOTTED_ALIAS_RE and rewrite_emitted_sql mangle [a.b][a___b]; decode_result_keys reverses it. Comprehensive tests cover CTE hoisting, alias mangling, round-trip, and end-to-end engine integration.
Snapshot updates and xfail removals
tests/dialects/test_byte_equivalence.py, tests/dialects/test_mysql.py, tests/integration/test_integration_mysql.py, tests/integration/test_integration_sqlserver.py
Byte-equivalence snapshots updated to mangled bracket aliases and uppercased DATETRUNC unit. Cumsum window test parameterised per dialect. MySQL outer-wrap quoting/CTE tests added. All pytest.mark.xfail(strict=True) decorators for DEV-1571 removed from MySQL and SQL Server integration suites.

Sequence Diagram(s)

sequenceDiagram
    participant SQLGenerator
    participant TsqlDialect
    rect rgba(135, 206, 235, 0.5)
        note over SQLGenerator,TsqlDialect: Outer-wrap construction (T-SQL path)
    end
    SQLGenerator->>SQLGenerator: _build_outer_wrap(inner_sql, order, limit, offset)
    SQLGenerator->>SQLGenerator: strip trailing ORDER BY / LIMIT text from inner_sql
    SQLGenerator->>TsqlDialect: emit_outer_wrap(inner_sql, public, order, limit, offset_arg, parse)
    TsqlDialect->>TsqlDialect: parse inner_sql → exp.Select
    alt top-level WITH present
        TsqlDialect->>TsqlDialect: detach with_ from inner Select
        TsqlDialect->>TsqlDialect: strip inner table qualifiers from ORDER BY columns
        TsqlDialect->>TsqlDialect: wrap inner Select AS _outer derived table
        TsqlDialect->>TsqlDialect: re-attach WITH at outer Select
        TsqlDialect-->>SQLGenerator: final T-SQL string
    else no WITH or parse error
        TsqlDialect->>TsqlDialect: base emit_outer_wrap fallback
        TsqlDialect-->>SQLGenerator: base T-SQL string
    end
    SQLGenerator->>TsqlDialect: rewrite_emitted_sql(sql)
    TsqlDialect->>TsqlDialect: regex-replace [a.b] → [a___b] via encode_alias
    TsqlDialect-->>SQLGenerator: mangled SQL
Loading
sequenceDiagram
    participant Engine
    participant FakeTsqlClient
    participant TsqlDialect
    participant SlayerResponse
    Engine->>FakeTsqlClient: execute SQL
    FakeTsqlClient-->>Engine: rows with mangled keys e.g. {orders___status: ...}
    Engine->>TsqlDialect: decode_result_keys(rows)
    TsqlDialect->>TsqlDialect: decode_alias each key orders___status → orders.status
    TsqlDialect-->>Engine: rows with dotted keys
    Engine->>SlayerResponse: build response with decoded data
    SlayerResponse-->>Engine: resp.data with dotted alias keys
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • MotleyAI/slayer#134: Directly related — the earlier generator outer-wrap construction that this PR refactors into the new SqlDialect.emit_outer_wrap dispatch hook.
  • MotleyAI/slayer#170: Directly related — introduced BigQuery dotted-alias ___-separator mangling that this PR extracts into the shared _alias_mangle module consumed by both BigQuery and T-SQL dialects.

Poem

🐰 A dot inside a bracket made T-SQL frown,
The WITH inside subqueries brought the query down.
So I mangle the dots to three underscores neat,
And hoist the CTEs so the SQL's complete.
encode, decode, bijection so true —
Round-trip the alias, the rabbit hops through! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the three main bug fixes (T-SQL CTE-in-derived-table, T-SQL dotted ORDER BY, MySQL outer-wrap quoting) that form the core of this PR's changes.
Docstring Coverage ✅ Passed Docstring coverage is 82.86% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch egor/dev-1571-t-sql-and-mysql-sql-generator-bugs-surfaced-by-dev-1564

Comment @coderabbitai help to get the list of available commands and usage tips.

@ZmeiGorynych ZmeiGorynych changed the base branch from egor/dev-1564-proper-integration-test-ci-for-mysql-clickhouse-and-sql to main June 18, 2026 08:42
ZmeiGorynych and others added 4 commits June 18, 2026 11:08
…ionally

Codex flagged that `TsqlDialect.emit_outer_wrap` fell back to the base impl
whenever the inner SELECT had no top-level CTE, and the base impl emits a
detached `Limit.sql(dialect="tsql")` fragment as literal `LIMIT N`. T-SQL
queries that hit the DEV-1444 outer-projection trim without happening to
have a CTE chain would therefore still emit invalid pagination — a latent
regression in the same hot path Bug 3 lives on.

Refactor: always take the AST-based path. The CTE detach is now optional
(skipped when `with_node is None`); everything else — the derived-table
wrap, the bracketed public projection, and the dialect-aware
`TOP`/`FETCH NEXT` transposition — runs uniformly. Fallback to the base
impl is retained for parse failures only.

Adds `test_tsql_emit_outer_wrap_no_ctes_with_limit_transposes_pagination`
pinning the regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…assembly

The original Bug 3 description in the ticket only mentioned the outer wrap.
The DEV-1564 integration suites (xfails removed in this PR) exposed that
several inner CTE-assembly sites in `SQLGenerator` also hardcoded
`f'"{...}"'`. MySQL parses those as string literals and fails outright;
T-SQL parses them as identifiers but the literal-dot form bypasses Bug 2's
bracket-anchored mangling so ORDER BY can't match the SELECT alias.

Refactor:
- Add `SQLGenerator._q(name)` — `exp.Identifier(this=name, quoted=True).
  sql(dialect=self.dialect)` — the single dialect-aware quoter.
- Replace every hardcoded `f'"{...}"'` site in `generator.py`:
  - `_build_combined`: final_parts, join ON parts.
  - `_assemble_combined_sql`: ORDER BY parts.
  - `_apply_pagination_to_sql` (converted from staticmethod to instance).
  - `_generate_with_computed`: layer_parts, expression aliases, transform
    aliases, time_col, self-join JOIN ON, join_cols, sjoin AS.
  - Post-filter qualified-name wrapper.
  - `_build_transform_sql` (converted from staticmethod): window measure,
    time_col, PARTITION BY.
  - `_build_self_join_column` (converted from staticmethod): table.measure.
- Re-emit `expr.sql` (carries ANSI-quoted aliases from enrichment's
  `_resolve_sql`) by parsing in postgres dialect and re-emitting in the
  active dialect. Applies to both the layered and fallback expression
  paths in `_generate_with_computed`.

Tests:
- `tests/dialects/test_mysql.py::test_mysql_time_shift_inner_cte_uses_backticks_not_ansi_quotes`
  pins that change_pct desugaring emits backticked identifiers throughout
  the inner CTE chain on MySQL.
- `tests/dialects/test_tsql.py::test_tsql_time_shift_inner_cte_uses_mangled_brackets`
  symmetric pin for T-SQL — bracket form throughout AND Bug 2 mangling
  fires uniformly.

4671 unit tests pass; ruff clean. Expected to clear the 5 CI failures on
the time-shift / cross-model integration paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nd-mysql-sql-generator-bugs-surfaced-by-dev-1564
…+ un-xfail post-merge

Two follow-ups after merging main:

(1) Codex flagged `_query_as_model` at slayer/engine/query_engine.py:1932
emitted the rename-wrapper SQL with hardcoded ANSI double-quoted aliases
(`SELECT "orders.id" AS id FROM (inner) AS _inner`). The inner subquery
goes through SQLGenerator with dialect-aware quoting (and Bug 2's bracket
mangling on T-SQL), so the wrapper's references no longer matched what
the inner subquery exposes — query-backed model lookups on MySQL failed
with a parse error and on T-SQL failed with `Invalid column name
'a_tbl___id'`. Now emits each alias via sqlglot's dialect-aware
identifier quoting and runs the dialect's `rewrite_emitted_sql` post-
pass so T-SQL bracket-mangling fires uniformly across inner and outer
references.

(2) The merge brought in DEV-1564 round 4's three new MySQL xfails plus
one new T-SQL xfail that all pin on Bug 3. The Round 2 inner-CTE
quoting fix already passes them, so they surface as XPASS(strict)
failures. Remove the xfail markers:
- tests/integration/test_integration_mysql.py: test_time_shift_with_date_range,
  test_consecutive_periods_with_boolean_predicate, test_change_with_date_range.
- tests/integration/test_integration_sqlserver.py: test_consecutive_periods_with_boolean_predicate.

4690 unit tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant