Skip to content

Support parentheses for calculated measures#52

Open
AlisonNeedsCopilot wants to merge 11 commits into
dodopizza:mainfrom
AlisonNeedsCopilot:Users/AlisonNeedsCopilot/support-parentheses-for-calculated-measures
Open

Support parentheses for calculated measures#52
AlisonNeedsCopilot wants to merge 11 commits into
dodopizza:mainfrom
AlisonNeedsCopilot:Users/AlisonNeedsCopilot/support-parentheses-for-calculated-measures

Conversation

@AlisonNeedsCopilot

@AlisonNeedsCopilot AlisonNeedsCopilot commented Feb 11, 2026

Copy link
Copy Markdown

Merge after this PR: #51

Overview

Previously, calculated measures with parentheses were not parsed correctly resulting in mangled queries that threw syntax errors on run. In the image below, both calculated measures have incorrect bracketing and parentheses placement. The calculated measure with parentheses and an ad hoc measure has incorrect escaping. The calculated measure with parentheses and a predefined measure has the original sqlalchemy function instead of the Kusto function.
image

This development fixes this by stripping outer parentheses from expressions/sub-expressions and adding them back

Technical details

  • _ParseState: Refactor tracking for string position being within quotes or brackets into new class to remove some code duplication. Class also tracks parentheses depth. Refactored helper functions to use _ParseState
  • _count_outer_parens(): Strip and count number of outer parentheses. Return number of outer parentheses and stripped expression
  • _escape_and_quote_columns(): Strip expressions of outer parentheses before quoting and escaping them. Then, re-add the parentheses if there is an operation within the expression/subexpression within the parentheses. Otherwise, omit them.

UI Changes

image

Alison Gim added 11 commits February 5, 2026 20:49
This ensures that extend operations (which may reference summarized columns)
are executed after the summarize clause in the generated KQL query.
…d hoc measures

- Add _find_operator_outside_quotes() helper to find operators not inside quoted strings
- Update _escape_and_quote_columns() to recursively escape both sides of operators
- Update _is_number_literal() to match integers and decimals with digits on both sides
- Add tests for:
  - Two quoted measures: "Measure 1" + "Measure 2" -> ["Measure 1"] + ["Measure 2"]
  - Measure with constant: "Measure 1" * 2 -> ["Measure 1"] * 2
  - Measure with operator in name: "Measure 1-2" -> ["Measure 1-2"]
  - _is_number_literal function validation
- Add comprehensive unit tests for new helper functions:
  - TestIsInsideQuotesOrBrackets (10 tests)
  - TestFindMatchingParen (8 tests)
  - TestHasOperatorsOutsideQuotes (8 tests)
  - TestExtractAndReplaceAggregates (10 tests)
  - TestContainsAggregateFunction (10 tests)
- Add comment explaining existing_aggs registration logic
- Apply black formatting
Prevents double-escaping of expressions like (count(a) + count(b)) by
detecting matching outer parentheses, processing the inner content
recursively, and re-adding the parentheses.
- Add _ParseState class for unified parsing state tracking
- Add _find_top_level_operator to find operators at depth 0 (outside quotes, brackets, and parens)
- Add _count_outer_parens to handle nested parentheses in expressions
- Add _wrap_column_refs_in_parens for arithmetic precedence
- Update _escape_and_quote_columns to handle parenthesized expressions
- Update _has_operators_outside_quotes to use new top-level operator detection
- Refactor _is_inside_quotes_or_brackets and _find_matching_paren to use _ParseState
- Remove old _find_operator_outside_quotes (replaced by _find_top_level_operator)
- Update tests to reflect correct behavior for count(*) - no longer seen as expression with operators
- Add position constants to TestFindTopLevelOperator class
- Add paren count constants to TestCountOuterParens class
- Add closing position constants to TestFindMatchingParen class
- Add aggregate count constants to TestExtractAndReplaceAggregates class
- Add integration tests for parentheses preservation, quoted identifiers,
  uppercase functions, and extend after summarize
Copilot AI review requested due to automatic review settings February 11, 2026 01:02

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds support for correctly parsing and handling calculated measures with parentheses in the Kusto KQL dialect. Previously, expressions like ("Measure 1") + ("Measure 2") or (("Measure")) were not parsed correctly, resulting in malformed KQL queries with incorrect bracketing and parentheses placement.

Changes:

  • Introduced _ParseState class to track parsing state (quotes, brackets, parentheses depth) and refactored helper functions to use it
  • Added _count_outer_parens() to strip and count outer parentheses from expressions
  • Enhanced _escape_and_quote_columns() to handle parenthesized expressions by stripping outer parens, processing operands recursively, and re-adding parens when needed
  • Modified query generation logic to place summarize before extend in the query structure
  • Changed extend_columns and summarize_columns from sets to lists to preserve column order
  • Added comprehensive unit and integration tests for parentheses handling

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
sqlalchemy_kusto/dialect_kql.py Core implementation: Added _ParseState class, new helper functions for parentheses handling, updated _escape_and_quote_columns() and _get_projection_or_summarize() to support calculated measures with parentheses
tests/unit/test_dialect_kql.py Added extensive unit tests for parentheses handling, operator detection, aggregate extraction, and helper functions. Updated existing tests to reflect new query structure (summarize before extend, preserved column order)
tests/integration/test_dialect_kql.py Added integration tests to verify parentheses preservation and calculated measures work end-to-end with actual Kusto queries

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +416 to +417
summarize_columns = []
extend_columns = []

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change from sets to lists for summarize_columns and extend_columns (lines 416-417) removes automatic deduplication. This means if the same column definition is added multiple times through the logic, it will appear multiple times in the final query, which could cause KQL syntax errors or unexpected behavior.

Consider adding explicit deduplication logic if needed, or document why duplicate entries are not possible in this context.

Copilot uses AI. Check for mistakes.
Comment on lines +107 to +110
if ch == "[":
self.in_bracket = True
elif ch == "]":
self.in_bracket = False

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _ParseState class tracks bracket state as a simple boolean (in_bracket), which assumes brackets are never nested. However, KQL bracket notation ["column"] could theoretically contain nested structures or be part of more complex expressions. While this is unlikely in practice for KQL column references, the implementation is inconsistent with how parentheses are tracked (using a depth counter).

Consider whether nested brackets are possible in your use case. If they are, implement a depth counter similar to paren_depth for brackets as well.

Copilot uses AI. Check for mistakes.
Comment on lines 599 to +616
for operator in ["/", "+", "-", "*"]:
if operator in name:
# Split the name at the first operator and wrap the left part
parts = name.split(operator, 1)
# Remove quotes if they exist at the edges
col_part = parts[0].strip()
if col_part.startswith('"') and col_part.endswith('"'):
col_part = col_part[1:-1].strip()
col_part = col_part.replace('"', '\\"')
return f'["{col_part}"] {operator} {parts[1].strip()}' # Wrap the column part
# No operators found, just wrap the entire name
pos = KustoKqlCompiler._find_top_level_operator(inner, operator)
if pos != -1:
left = KustoKqlCompiler._escape_and_quote_columns(
inner[:pos].strip()
)
right = KustoKqlCompiler._escape_and_quote_columns(
inner[pos + 1 :].strip()
)
return (
"(" * outer_paren_count
+ left
+ " "
+ operator
+ " "
+ right
+ ")" * outer_paren_count
)

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator parsing order ["/", "+", "-", "*"] (line 599) doesn't respect standard mathematical operator precedence. In mathematics, multiplication and division have higher precedence than addition and subtraction, so a + b * c should be parsed as a + (b * c), not (a + b) * c.

The current implementation would split "col_a" + "col_b" * 2 at the / first (not found), then at +, producing ["col_a"] + ["col_b"] * 2, which happens to be correct by accident. However, for an expression like "a" / "b" + "c", it would split at / first, producing ["a"] / ["b"] + "c", which would then not split the + "c" part correctly on the right side.

To fix this, search for lower-precedence operators first. Change the order to ["+", "-", "*", "/"] or better yet ["+", "-"] then ["*", "/"] in separate passes. Alternatively, note that the current behavior might be intentional if all operators are meant to be treated with equal precedence in KQL contexts - in that case, please add a comment explaining this design decision.

Copilot uses AI. Check for mistakes.
Comment thread sqlalchemy_kusto/dialect_kql.py
Comment on lines +203 to +211
"""Count and strip outer parentheses from text. Returns (count, stripped_text)."""
text = text.strip()
count = 0
while len(text) > 1 and text[0] == "(" and text[-1] == ")":
depth = 0
for ch in text[:-1]: # Scan all but last char
depth += (ch == "(") - (ch == ")")
if depth == 0:
return count, text # First '(' closed before end

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _count_outer_parens function doesn't account for quoted strings or brackets when counting parentheses. This could lead to incorrect behavior when parentheses appear inside quotes or brackets. For example, with input '("text(with)parens")', the function may incorrectly determine that the outer parenthesis doesn't wrap the whole expression because it doesn't track quote state while scanning.

Consider using the _ParseState class that was introduced for this purpose to properly handle quotes and brackets when counting parentheses.

Suggested change
"""Count and strip outer parentheses from text. Returns (count, stripped_text)."""
text = text.strip()
count = 0
while len(text) > 1 and text[0] == "(" and text[-1] == ")":
depth = 0
for ch in text[:-1]: # Scan all but last char
depth += (ch == "(") - (ch == ")")
if depth == 0:
return count, text # First '(' closed before end
"""Count and strip outer parentheses from text. Returns (count, stripped_text).
This respects quoted strings and brackets by using the parser-aware
`_find_matching_paren` helper, which relies on `_ParseState`.
"""
text = text.strip()
count = 0
# Repeatedly strip a single layer of outer parentheses as long as the
# opening parenthesis at position 0 matches the final character.
while len(text) > 1 and text[0] == "(":
match_idx = KustoKqlCompiler._find_matching_paren(text, 0)
# If we didn't find a matching paren, or it doesn't close at the end,
# then the outer '(' does not wrap the whole expression.
if match_idx != len(text) - 1:
break

Copilot uses AI. Check for mistakes.
Comment on lines +219 to +220
assert "((" in query_compiled
assert "))" in query_compiled

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test test_parentheses_preserved_in_expression at line 205-228 expects double parentheses to be preserved in the output (("Text")), asserting that "((" and "))" appear in the compiled query. However, according to the unit test test_calculated_measure_double_paren at lines 592-605 in tests/unit/test_dialect_kql.py, the comment states that "Double parens around a single measure reference are stripped since they're unnecessary for precedence."

This creates an inconsistency: the integration test expects parens to be preserved, while the unit test comments suggest they should be stripped for single values with no operators. Please verify which behavior is correct and ensure the tests are consistent with the intended functionality.

Suggested change
assert "((" in query_compiled
assert "))" in query_compiled
# Double parens around a single measure reference should be stripped as
# they're unnecessary for precedence; only the identifier itself should remain.
assert "((" not in query_compiled
assert "))" not in query_compiled

Copilot uses AI. Check for mistakes.
)
if extend_columns:
extend_statement = f"| extend {', '.join(sorted(extend_columns))}"
extend_statement = f"| extend {', '.join(extend_columns)}"

Copilot AI Feb 11, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change from sorted(extend_columns) to just extend_columns removes the deterministic ordering of extend columns. While the new behavior preserves the order in which columns appear in the SELECT clause (which is generally more intuitive), this is a breaking change in query output format that could affect users who depend on alphabetical ordering.

Consider documenting this behavior change explicitly in the PR description or release notes, as it may impact query caching, testing, or tools that parse the generated KQL queries.

Suggested change
extend_statement = f"| extend {', '.join(extend_columns)}"
extend_statement = f"| extend {', '.join(sorted(extend_columns))}"

Copilot uses AI. Check for mistakes.
ag-ramachandran pushed a commit to ag-ramachandran/sqlalchemy-kusto that referenced this pull request Mar 25, 2026
…ntermediary measures, and parentheses support for calculated measures
ag-ramachandran pushed a commit to ag-ramachandran/sqlalchemy-kusto that referenced this pull request Mar 25, 2026
…ted measures

PR dodopizza#52 already implements extend-after-summarize ordering through its
intermediary measures approach, so conflicts are resolved by keeping
PR dodopizza#52's more comprehensive implementation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants