fix: ensure logical operators do not modify string literal casing#49
fix: ensure logical operators do not modify string literal casing#49Egorrko wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the KQL dialect’s handling of logical operators so that compiling queries no longer (a) mutates SQLAlchemy’s global operator map and (b) lowercases AND/OR inside string literals during post-processing.
Changes:
- Removed the global override of SQLAlchemy
OPERATORS[and_]to avoid affecting the entire SQLAlchemy runtime. - Implemented
KustoKqlCompiler.visit_clauselist()to emit KQL-styleand/orseparators during compilation instead of regex-rewriting the rendered WHERE clause. - Added unit tests covering operator precedence (parentheses) and ensuring literal casing is preserved.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
sqlalchemy_kusto/dialect_kql.py |
Stops mutating SQLAlchemy globals; moves logical operator formatting into the compiler and removes regex-based AND/OR rewriting in _sql_to_kql_where(). |
tests/unit/test_dialect_kql.py |
Adds regression tests for precedence and ensuring AND/OR inside string literals are not modified. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| if clauselist.operator in kql_operators: | ||
| sep = kql_operators[clauselist.operator] | ||
| return self._generate_delimited_list(clauselist.clauses, sep, **kw) | ||
|
|
There was a problem hiding this comment.
visit_clauselist() currently ignores clauselist.group / grouping semantics and always returns the raw _generate_delimited_list(...) output for AND/OR. SQLAlchemy uses grouping to emit parentheses for nested boolean expressions (e.g. and_(or_(...), ...)), so this override should preserve that behavior (e.g. wrap the rendered list when grouping is requested) to avoid changing operator precedence in the emitted KQL.
| if clauselist.operator in kql_operators: | |
| sep = kql_operators[clauselist.operator] | |
| return self._generate_delimited_list(clauselist.clauses, sep, **kw) | |
| if clauselist.operator in kql_operators: | |
| sep = kql_operators[clauselist.operator] | |
| text = self._generate_delimited_list(clauselist.clauses, sep, **kw) | |
| # Preserve SQLAlchemy's grouping semantics so nested boolean expressions | |
| # get parentheses when required (e.g. and_(or_(...), ...)). | |
| if getattr(clauselist, "group", False): | |
| return f"({text})" | |
| return text |
|
|
||
| if clauselist.operator in kql_operators: | ||
| sep = kql_operators[clauselist.operator] | ||
| return self._generate_delimited_list(clauselist.clauses, sep, **kw) | ||
|
|
There was a problem hiding this comment.
There’s trailing whitespace on the blank lines inside visit_clauselist (will be flagged by linters like Ruff/pycodestyle). Please remove the extra spaces.
| if clauselist.operator in kql_operators: | |
| sep = kql_operators[clauselist.operator] | |
| return self._generate_delimited_list(clauselist.clauses, sep, **kw) | |
| if clauselist.operator in kql_operators: | |
| sep = kql_operators[clauselist.operator] | |
| return self._generate_delimited_list(clauselist.clauses, sep, **kw) |
| column("C") == 3 | ||
| ) | ||
| query1 = select([column("Field")]).select_from(text("logs")).where(cond_nested_or) | ||
| compiled1 = str(query1.compile(engine, compile_kwargs={"literal_binds": True})).replace("\n", " ") | ||
|
|
||
| assert '(["A"] == 1 or ["B"] == 2) and ["C"] == 3' in compiled1 | ||
|
|
||
| cond_nested_and = sa.or_( | ||
| column("A") == 1, | ||
| sa.and_(column("B") == 2, column("C") == 3) | ||
| ) | ||
| query2 = select([column("Field")]).select_from(text("logs")).where(cond_nested_and) | ||
| compiled2 = str(query2.compile(engine, compile_kwargs={"literal_binds": True})).replace("\n", " ") |
There was a problem hiding this comment.
These new test lines are very long and aren’t formatted consistently with the rest of the file (Black formatting / line-length). Consider wrapping the select_from(...).where(...) and compiledX = ...compile(...).replace(...) statements across multiple lines and adding trailing commas in multi-line calls so formatters don’t churn the diff.
| column("C") == 3 | |
| ) | |
| query1 = select([column("Field")]).select_from(text("logs")).where(cond_nested_or) | |
| compiled1 = str(query1.compile(engine, compile_kwargs={"literal_binds": True})).replace("\n", " ") | |
| assert '(["A"] == 1 or ["B"] == 2) and ["C"] == 3' in compiled1 | |
| cond_nested_and = sa.or_( | |
| column("A") == 1, | |
| sa.and_(column("B") == 2, column("C") == 3) | |
| ) | |
| query2 = select([column("Field")]).select_from(text("logs")).where(cond_nested_and) | |
| compiled2 = str(query2.compile(engine, compile_kwargs={"literal_binds": True})).replace("\n", " ") | |
| column("C") == 3, | |
| ) | |
| query1 = ( | |
| select([column("Field")]) | |
| .select_from(text("logs")) | |
| .where(cond_nested_or) | |
| ) | |
| compiled1 = ( | |
| str(query1.compile(engine, compile_kwargs={"literal_binds": True})) | |
| .replace("\n", " ") | |
| ) | |
| assert '(["A"] == 1 or ["B"] == 2) and ["C"] == 3' in compiled1 | |
| cond_nested_and = sa.or_( | |
| column("A") == 1, | |
| sa.and_(column("B") == 2, column("C") == 3), | |
| ) | |
| query2 = ( | |
| select([column("Field")]) | |
| .select_from(text("logs")) | |
| .where(cond_nested_and) | |
| ) | |
| compiled2 = ( | |
| str(query2.compile(engine, compile_kwargs={"literal_binds": True})) | |
| .replace("\n", " ") | |
| ) |
This PR fixes two things:
OPERATORS[operators.and_] = " and "- this line overrides theandoperator for the entire sqlalchemy library as soon as theKustoKqlCompileris imported.where_clause = re.sub(r"\s+AND\s+", r" and ", where_clause, flags=re.IGNORECASE)- this regular expression replacesANDwithandnot only in operators, but also in values. Look at the tests, for an example