Support parentheses for calculated measures by AlisonNeedsCopilot · Pull Request #52 · dodopizza/sqlalchemy-kusto

AlisonNeedsCopilot · 2026-02-11T01:02:26Z

Merge after this PR: #51

Overview

Previously, calculated measures with parentheses were not parsed correctly resulting in mangled queries that threw syntax errors on run. In the image below, both calculated measures have incorrect bracketing and parentheses placement. The calculated measure with parentheses and an ad hoc measure has incorrect escaping. The calculated measure with parentheses and a predefined measure has the original sqlalchemy function instead of the Kusto function.

This development fixes this by stripping outer parentheses from expressions/sub-expressions and adding them back

Technical details

_ParseState: Refactor tracking for string position being within quotes or brackets into new class to remove some code duplication. Class also tracks parentheses depth. Refactored helper functions to use _ParseState
_count_outer_parens(): Strip and count number of outer parentheses. Return number of outer parentheses and stripped expression
_escape_and_quote_columns(): Strip expressions of outer parentheses before quoting and escaping them. Then, re-add the parentheses if there is an operation within the expression/subexpression within the parentheses. Otherwise, omit them.

UI Changes

This ensures that extend operations (which may reference summarized columns) are executed after the summarize clause in the generated KQL query.

…d hoc measures - Add _find_operator_outside_quotes() helper to find operators not inside quoted strings - Update _escape_and_quote_columns() to recursively escape both sides of operators - Update _is_number_literal() to match integers and decimals with digits on both sides - Add tests for: - Two quoted measures: "Measure 1" + "Measure 2" -> ["Measure 1"] + ["Measure 2"] - Measure with constant: "Measure 1" * 2 -> ["Measure 1"] * 2 - Measure with operator in name: "Measure 1-2" -> ["Measure 1-2"] - _is_number_literal function validation

- Add comprehensive unit tests for new helper functions: - TestIsInsideQuotesOrBrackets (10 tests) - TestFindMatchingParen (8 tests) - TestHasOperatorsOutsideQuotes (8 tests) - TestExtractAndReplaceAggregates (10 tests) - TestContainsAggregateFunction (10 tests) - Add comment explaining existing_aggs registration logic - Apply black formatting

Prevents double-escaping of expressions like (count(a) + count(b)) by detecting matching outer parentheses, processing the inner content recursively, and re-adding the parentheses.

…cate checks

- Add _ParseState class for unified parsing state tracking - Add _find_top_level_operator to find operators at depth 0 (outside quotes, brackets, and parens) - Add _count_outer_parens to handle nested parentheses in expressions - Add _wrap_column_refs_in_parens for arithmetic precedence - Update _escape_and_quote_columns to handle parenthesized expressions - Update _has_operators_outside_quotes to use new top-level operator detection - Refactor _is_inside_quotes_or_brackets and _find_matching_paren to use _ParseState - Remove old _find_operator_outside_quotes (replaced by _find_top_level_operator) - Update tests to reflect correct behavior for count(*) - no longer seen as expression with operators

- Add position constants to TestFindTopLevelOperator class - Add paren count constants to TestCountOuterParens class - Add closing position constants to TestFindMatchingParen class - Add aggregate count constants to TestExtractAndReplaceAggregates class - Add integration tests for parentheses preservation, quoted identifiers, uppercase functions, and extend after summarize

Copilot

Pull request overview

This pull request adds support for correctly parsing and handling calculated measures with parentheses in the Kusto KQL dialect. Previously, expressions like ("Measure 1") + ("Measure 2") or (("Measure")) were not parsed correctly, resulting in malformed KQL queries with incorrect bracketing and parentheses placement.

Changes:

Introduced _ParseState class to track parsing state (quotes, brackets, parentheses depth) and refactored helper functions to use it
Added _count_outer_parens() to strip and count outer parentheses from expressions
Enhanced _escape_and_quote_columns() to handle parenthesized expressions by stripping outer parens, processing operands recursively, and re-adding parens when needed
Modified query generation logic to place summarize before extend in the query structure
Changed extend_columns and summarize_columns from sets to lists to preserve column order
Added comprehensive unit and integration tests for parentheses handling

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File	Description
sqlalchemy_kusto/dialect_kql.py	Core implementation: Added `_ParseState` class, new helper functions for parentheses handling, updated `_escape_and_quote_columns()` and `_get_projection_or_summarize()` to support calculated measures with parentheses
tests/unit/test_dialect_kql.py	Added extensive unit tests for parentheses handling, operator detection, aggregate extraction, and helper functions. Updated existing tests to reflect new query structure (summarize before extend, preserved column order)
tests/integration/test_dialect_kql.py	Added integration tests to verify parentheses preservation and calculated measures work end-to-end with actual Kusto queries

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-11T01:06:27Z

+            summarize_columns = []
+            extend_columns = []


The change from sets to lists for summarize_columns and extend_columns (lines 416-417) removes automatic deduplication. This means if the same column definition is added multiple times through the logic, it will appear multiple times in the final query, which could cause KQL syntax errors or unexpected behavior.

Consider adding explicit deduplication logic if needed, or document why duplicate entries are not possible in this context.

Copilot · 2026-02-11T01:06:27Z

+            if ch == "[":
+                self.in_bracket = True
+            elif ch == "]":
+                self.in_bracket = False


The _ParseState class tracks bracket state as a simple boolean (in_bracket), which assumes brackets are never nested. However, KQL bracket notation ["column"] could theoretically contain nested structures or be part of more complex expressions. While this is unlikely in practice for KQL column references, the implementation is inconsistent with how parentheses are tracked (using a depth counter).

Consider whether nested brackets are possible in your use case. If they are, implement a depth counter similar to paren_depth for brackets as well.

Copilot · 2026-02-11T01:06:27Z

            for operator in ["/", "+", "-", "*"]:
-                if operator in name:
-                    # Split the name at the first operator and wrap the left part
-                    parts = name.split(operator, 1)
-                    # Remove quotes if they exist at the edges
-                    col_part = parts[0].strip()
-                    if col_part.startswith('"') and col_part.endswith('"'):
-                        col_part = col_part[1:-1].strip()
-                    col_part = col_part.replace('"', '\\"')
-                    return f'["{col_part}"] {operator} {parts[1].strip()}'  # Wrap the column part
-        # No operators found, just wrap the entire name
+                pos = KustoKqlCompiler._find_top_level_operator(inner, operator)
+                if pos != -1:
+                    left = KustoKqlCompiler._escape_and_quote_columns(
+                        inner[:pos].strip()
+                    )
+                    right = KustoKqlCompiler._escape_and_quote_columns(
+                        inner[pos + 1 :].strip()
+                    )
+                    return (
+                        "(" * outer_paren_count
+                        + left
+                        + " "
+                        + operator
+                        + " "
+                        + right
+                        + ")" * outer_paren_count
+                    )


The operator parsing order ["/", "+", "-", "*"] (line 599) doesn't respect standard mathematical operator precedence. In mathematics, multiplication and division have higher precedence than addition and subtraction, so a + b * c should be parsed as a + (b * c), not (a + b) * c.

The current implementation would split "col_a" + "col_b" * 2 at the / first (not found), then at +, producing ["col_a"] + ["col_b"] * 2, which happens to be correct by accident. However, for an expression like "a" / "b" + "c", it would split at / first, producing ["a"] / ["b"] + "c", which would then not split the + "c" part correctly on the right side.

To fix this, search for lower-precedence operators first. Change the order to ["+", "-", "*", "/"] or better yet ["+", "-"] then ["*", "/"] in separate passes. Alternatively, note that the current behavior might be intentional if all operators are meant to be treated with equal precedence in KQL contexts - in that case, please add a comment explaining this design decision.

Copilot · 2026-02-11T01:06:28Z

+        """Count and strip outer parentheses from text. Returns (count, stripped_text)."""
+        text = text.strip()
+        count = 0
+        while len(text) > 1 and text[0] == "(" and text[-1] == ")":
+            depth = 0
+            for ch in text[:-1]:  # Scan all but last char
+                depth += (ch == "(") - (ch == ")")
+                if depth == 0:
+                    return count, text  # First '(' closed before end


The _count_outer_parens function doesn't account for quoted strings or brackets when counting parentheses. This could lead to incorrect behavior when parentheses appear inside quotes or brackets. For example, with input '("text(with)parens")', the function may incorrectly determine that the outer parenthesis doesn't wrap the whole expression because it doesn't track quote state while scanning.

Consider using the _ParseState class that was introduced for this purpose to properly handle quotes and brackets when counting parentheses.

Suggested change

"""Count and strip outer parentheses from text. Returns (count, stripped_text)."""

text = text.strip()

count = 0

while len(text) > 1 and text[0] == "(" and text[-1] == ")":

depth = 0

for ch in text[:-1]: # Scan all but last char

depth += (ch == "(") - (ch == ")")

if depth == 0:

return count, text # First '(' closed before end

"""Count and strip outer parentheses from text. Returns (count, stripped_text).

This respects quoted strings and brackets by using the parser-aware

`_find_matching_paren` helper, which relies on `_ParseState`.

"""

text = text.strip()

count = 0

# Repeatedly strip a single layer of outer parentheses as long as the

# opening parenthesis at position 0 matches the final character.

while len(text) > 1 and text[0] == "(":

match_idx = KustoKqlCompiler._find_matching_paren(text, 0)

# If we didn't find a matching paren, or it doesn't close at the end,

# then the outer '(' does not wrap the whole expression.

if match_idx != len(text) - 1:

break

Copilot · 2026-02-11T01:06:28Z

+    assert "((" in query_compiled
+    assert "))" in query_compiled


The test test_parentheses_preserved_in_expression at line 205-228 expects double parentheses to be preserved in the output (("Text")), asserting that "((" and "))" appear in the compiled query. However, according to the unit test test_calculated_measure_double_paren at lines 592-605 in tests/unit/test_dialect_kql.py, the comment states that "Double parens around a single measure reference are stripped since they're unnecessary for precedence."

This creates an inconsistency: the integration test expects parens to be preserved, while the unit test comments suggest they should be stripped for single values with no operators. Please verify which behavior is correct and ensure the tests are consistent with the intended functionality.

Suggested change

assert "((" in query_compiled

assert "))" in query_compiled

# Double parens around a single measure reference should be stripped as

# they're unnecessary for precedence; only the identifier itself should remain.

assert "((" not in query_compiled

assert "))" not in query_compiled

Copilot · 2026-02-11T01:06:28Z

                    )
            if extend_columns:
-                extend_statement = f"| extend {', '.join(sorted(extend_columns))}"
+                extend_statement = f"| extend {', '.join(extend_columns)}"


The change from sorted(extend_columns) to just extend_columns removes the deterministic ordering of extend columns. While the new behavior preserves the order in which columns appear in the SELECT clause (which is generally more intuitive), this is a breaking change in query output format that could affect users who depend on alphabetical ordering.

Consider documenting this behavior change explicitly in the PR description or release notes, as it may impact query caching, testing, or tools that parse the generated KQL queries.

Suggested change

extend_statement = f"| extend {', '.join(extend_columns)}"

extend_statement = f"| extend {', '.join(sorted(extend_columns))}"

…ntermediary measures, and parentheses support for calculated measures

…ted measures PR dodopizza#52 already implements extend-after-summarize ordering through its intermediary measures approach, so conflicts are resolved by keeping PR dodopizza#52's more comprehensive implementation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Alison Gim added 11 commits February 5, 2026 20:49

Move extend clause after summarize for calculated measures

5548c47

This ensures that extend operations (which may reference summarized columns) are executed after the summarize clause in the generated KQL query.

Clean up comments in _escape_and_quote_columns

fdd3851

Fix parenthesized expression handling in _escape_and_quote_columns

7b0ac2a

Prevents double-escaping of expressions like (count(a) + count(b)) by detecting matching outer parentheses, processing the inner content recursively, and re-adding the parentheses.

Simplify intermediary measures: remove paren depth tracking and dupli…

b90c808

…cate checks

Remove comment about existing_aggs tracking

323c257

Update tests for intermediary measures without paren depth tracking

b64cf11

Refactor: use > 1 instead of >= 2 to avoid PLR2004 noqa comment

98da2eb

Copilot AI review requested due to automatic review settings February 11, 2026 01:02

Copilot started reviewing on behalf of AlisonNeedsCopilot February 11, 2026 01:02 View session

Copilot AI reviewed Feb 11, 2026

View reviewed changes

ag-ramachandran pushed a commit to ag-ramachandran/sqlalchemy-kusto that referenced this pull request Mar 25, 2026

Merge PRs dodopizza#50, dodopizza#51, dodopizza#52: Quote escaping, i…

ecf7a1f

…ntermediary measures, and parentheses support for calculated measures

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support parentheses for calculated measures#52

Support parentheses for calculated measures#52
AlisonNeedsCopilot wants to merge 11 commits into
dodopizza:mainfrom
AlisonNeedsCopilot:Users/AlisonNeedsCopilot/support-parentheses-for-calculated-measures

AlisonNeedsCopilot commented Feb 11, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 11, 2026

Uh oh!

Copilot AI Feb 11, 2026

Uh oh!

Copilot AI Feb 11, 2026

Uh oh!

Uh oh!

Copilot AI Feb 11, 2026

Uh oh!

Copilot AI Feb 11, 2026

Uh oh!

Copilot AI Feb 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-    assert "((" in query_compiled
-    assert "))" in query_compiled
+    # Double parens around a single measure reference should be stripped as
+    # they're unnecessary for precedence; only the identifier itself should remain.
+    assert "((" not in query_compiled
+    assert "))" not in query_compiled

	extend_statement = f"\| extend {', '.join(extend_columns)}"
	extend_statement = f"\| extend {', '.join(sorted(extend_columns))}"

Conversation

AlisonNeedsCopilot commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Technical details

UI Changes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

AlisonNeedsCopilot commented Feb 11, 2026 •

edited

Loading