Skip to content

feat(msdax,powerquery): expand DAX functions and improve Power Query tokenizer#5309

Open
back1ply wants to merge 6 commits into
microsoft:mainfrom
back1ply:main
Open

feat(msdax,powerquery): expand DAX functions and improve Power Query tokenizer#5309
back1ply wants to merge 6 commits into
microsoft:mainfrom
back1ply:main

Conversation

@back1ply
Copy link
Copy Markdown

@back1ply back1ply commented Apr 28, 2026

Summary

This PR closes remaining spec coverage gaps for both DAX and Power Query M, verified against the official Microsoft Learn documentation. All tokenizer tests pass (86/86).

DAX (msdax) — full spec coverage achieved

Missing functions added (7 categories)

All functions verified against learn.microsoft.com/dax.

LogicalCOALESCE, BITAND, BITOR, BITXOR, BITLSHIFT, BITRSHIFT, IF.EAGER

InformationISAFTER, ISBOOLEAN, ISCURRENCY, ISDATETIME, ISDECIMAL, ISDOUBLE, ISSTRING, CONTAINSSTRING, CONTAINSSTRINGEXACT, NAMEOF, SELECTEDMEASURE, SELECTEDMEASUREFORMATSTRING, SELECTEDMEASURENAME, ISSELECTEDMEASURE, USERCULTURE, USEROBJECTID

TextCOMBINEVALUES

StatisticalNORM.DIST, NORM.INV, NORM.S.DIST, NORM.S.INV, T.DIST, T.DIST.2T, T.DIST.RT, T.INV, T.INV.2T

Date & timeNETWORKDAYS, QUARTER, UTCNOW, UTCTODAY

Time-intelligenceDATESWTD (week-to-date; siblings DATESMTD/DATESQTD/DATESYTD were already present)

FinancialRRI (the last missing from the full financial set)

INFO DMV family — all ~50+ INFO metadata functions including INFO.CATALOGS, INFO.COLUMNS, INFO.MEASURES, INFO.TABLES, INFO.FUNCTIONS, INFO.VIEW.COLUMNS, INFO.VIEW.MEASURES, INFO.VIEW.RELATIONSHIPS, INFO.VIEW.TABLES, plus DMV-backed catalog/relationship/role/storage variants

OtherEVALUATEANDLOG, EXTERNALMEASURE, TOCSV, TOJSON

Operators & literals

  • Multi-char operators now tokenize as single tokens: => (lambda/UDF arrow), == (strict equal), <>, <=, >=, !=, &&, || — previously split into two single-char tokens
  • DAX datetime literal dt"..." (e.g., dt"2015-12-31") now recognized (dax-syntax-reference)

Dotted function names

Identifier regex extended so dotted names (e.g. PERCENTILE.EXC, STDEV.P, VAR.P, CHISQ.DIST.RT, NORM.S.INV, INFO.VIEW.MEASURES) tokenize as a single keyword.function token instead of identifier+delimiter+identifier.

Query keywords

Added FUNCTION, WITH, VISUAL, SHAPE, AXIS, GROUP, TOTAL, DENSIFY for DEFINE TABLE ... WITH VISUAL SHAPE (…) syntax (virtual-table-statement-dax) and UDF parameter modifiers ANYVAL, ANYREF, SCALAR, VAL, EXPR (function-statement-dax).

Cleanup

  • Removed duplicate DATATABLE keyword (it's a function per datatable-function-dax). Moved TABLEOF to Information category.
  • Removed NOT from functions[] (dead code — keywords[] is checked first in the Monarch cases map, so the functions[] entry was unreachable).

Power Query M (powerquery) — spec-complete

Keywords

Added catch for try … catch (e) => … error handling (m-spec-error-handling, May-2022 addition).

Operators

  • ?? coalesce operator now a single token (previously two ? tokens) — per m-spec-operators
  • Operator regex rewritten: multi-char alternatives (??, =>, ..., .., <=, >=, <>) come before the single-char fallback. Escaped / inside the character class for strict regex correctness.

Literals

Verbatim text literal #!"..." now recognized (m-spec-consolidated-grammar §Lexical).

Tokenizer cleanup

Removed redundant explicit rules for section/shared/each/as/is — all correctly handled by the existing lowercase-keyword rule.

Built-in functions

Added recently-added functions: Table.AddFuzzyClusterColumn, Table.AddRankColumn, Table.CombineColumnsToRecord, Table.ConformToPageReader, Table.FuzzyGroup, Table.FuzzyJoin, Table.FuzzyNestedJoin, Table.StopFolding, Text.Reverse, Html.Table, Pdf.Tables, Identity.From, Identity.IsMemberOf, IdentityProvider.Default, Odbc.InferOptions, Character.ToLower, Character.ToUpper, Record.CombineLists, Record.FromFields, Value.Alternates, Value.Expression, Value.ReplaceFields, Value.Traits, Value.VersionIdentity, Value.Versions, Diagnostics.BeginTrace, Diagnostics.LogFailure, Diagnostics.LogValue, Diagnostics.LogValue2.

Tests

  • msdax: 10+ new cases covering ==/<>/<=/>=, => (single token), dt"...", dotted function tokenization (PERCENTILE.EXC, STDEV.P, INFO.VIEW.MEASURES), newly added functions (COALESCE, BITAND, COMBINEVALUES, SELECTEDMEASURE, QUARTER, UTCTODAY, TOCSV, DATESWTD), plus DEFINE FUNCTION and DEFINE TABLE … WITH VISUAL SHAPE keyword sets. Updated DATATABLE(...) test expectation from keywordkeyword.function.
  • powerquery: 7 new cases for try/catch (parameterized and empty-param), ??, #!"...".
  • All 86 language tests pass.

DAX (msdax):
- Add ~80 missing functions: REMOVEFILTERS, ALLCROSSFILTERED, SELECTEDVALUE,
  WINDOW, INDEX, OFFSET, FIRST, LAST, ORDERBY, PARTITIONBY, KEEPFILTERS,
  LOOKUPVALUE, DISTINCTCOUNTNOBLANK, ISINSCOPE, NONVISUAL, and financial/info functions
- Distinguish functions from keywords using keyword.function token type
- Add COLUMN, TABLE, EXPRESSION, DEFAULT as query keywords
- Fix &&/|| as explicit two-char operator tokens (distinct from & and |)

Power Query (powerquery):
- Explicitly handle section/shared/each in root tokenizer rule
- Add as/is type annotation keyword matching in root
- Add hash literal/constructor (#date, #time, #nan etc.) in root tokenizer
- Restore clean string tokenizer (remove erroneous stringRecord state)

Tests:
- Add test cases for &&/||, REMOVEFILTERS, SELECTEDVALUE, ALLCROSSFILTERED,
  KEEPFILTERS, WINDOW/PARTITIONBY/ORDERBY, DEFINE VAR/RETURN, COLUMN/TABLE/EXPRESSION
- Add Power Query test cases for section, shared, each, try/otherwise,
  as/is type annotations, hash constants and constructors
- All 86 language tokenizer tests pass
@back1ply
Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

back1ply and others added 5 commits April 28, 2026 15:14
…d categories

- Remove 15 fabricated/non-existent DAX functions: NATIVEPETERJOIN, LOOKUP,
  LOOKUPWITHTOTALS, ROWPCTRANSLATE, ROWS, COLUMNS, RUNNINGTOTAL, RUNNINGAVERAGE,
  GROUPPROPERLY, DMIRR, DVARP, DVARS, QMDURATION, QPRICE, RLN
- Remove non-DAX Excel-only functions: CEILING.MATH, CEILING.PRECISE,
  FLOOR.MATH, FLOOR.PRECISE
- Remove entire fabricated INFO/Metadata block including 'ISCLOUD\u5c0f\u91ceBUILTIN',
  TABLEAU, non-existent metadata functions, and Excel CUBE functions
- Remove VAR and VARP from functions (VAR is a keyword; VARP is not a DAX function;
  VAR.P already exists)
- Remove duplicates: MIRR/NPER/PMT/PPMT/PV/RATE appeared twice (math + financial);
  XIRR/XNPV appeared twice (aggregation + financial); PATH/PATHCONTAINS/etc.
  and TREATAS/CROSSFILTER/USERELATIONSHIP/ISONORAFTER appeared twice
- Fix keyword categories: move CALCULATE and CALCULATETABLE back to functions
  (they are filter functions, not query keywords); remove INT, SHARED from keywords
  (INT is a function; SHARED is a Power Query concept)
- Preserve COLUMNSTATISTICS (real DAX info function) by relocating it to scalar section
- Update test: INT tokenizes as keyword.function, not keyword, consistent with
  the PR's goal of distinguishing functions from keywords
- powerquery.ts: revert unnecessary escaping of double-quote inside single-quoted string literal
- msdax.ts: restore original bracket character class order (unrelated cosmetic reorder)

Both changes were flagged in code review as diff noise unrelated to the PR's goals.
Cross-referenced all added identifiers against official Microsoft DAX
docs (learn.microsoft.com/dax) and fixed two issues:

Removed:
- DEFAULT: not a reserved DAX keyword. It appears only as a value in
  the `BLANKS DEFAULT` enumeration within ORDERBY() and RANK()/ROWNUMBER()
  `blanks` parameter, alongside FIRST/LAST. It is not a standalone query
  keyword per the DAX grammar spec.

Added to functions (verified against learn.microsoft.com/dax):
- MATCHBY: window-function column matcher, pair to ORDERBY/PARTITIONBY
- RANK, ROWNUMBER: ranking window functions (added April 2023)
- MOVINGAVERAGE, RUNNINGSUM: visual calc aggregation shortcuts
- RANGE: visual calc shortcut for WINDOW
- LOOKUP, LOOKUPWITHTOTALS: visual calc lookup (June 2025)
- NEXT, PREVIOUS: visual calc row navigation (Jan 2024)
- LINEST, LINESTX: least-squares fit (Feb 2023)
- TABLEOF: table reference helper (Feb 2026)
- EXPAND, EXPANDALL, COLLAPSE, COLLAPSEALL: visual calc hierarchy navigation
- ISATLEVEL: visual calc level predicate

All 86 language tokenizer tests pass (npm run test:grammars).
DAX (msdax.ts):
- Add missing functions across categories verified against learn.microsoft.com/dax:
  Logical (COALESCE, BITAND/OR/XOR/LSHIFT/RSHIFT, IF.EAGER), Information (ISAFTER,
  ISBOOLEAN, ISCURRENCY, ISDATETIME, ISDECIMAL, ISDOUBLE, ISSTRING, CONTAINSSTRING,
  CONTAINSSTRINGEXACT, NAMEOF, SELECTEDMEASURE family, ISSELECTEDMEASURE, USERCULTURE,
  USEROBJECTID), Text (COMBINEVALUES), Statistical (NORM.*, T.*), Date & time
  (NETWORKDAYS, QUARTER, UTCNOW, UTCTODAY), Time intelligence (DATESWTD), Other
  (EVALUATEANDLOG, EXTERNALMEASURE, TOCSV, TOJSON), Financial (RRI), and the full
  INFO DMV family (INFO.CATALOGS, INFO.COLUMNS, INFO.MEASURES, INFO.TABLES,
  INFO.VIEW.*, INFO.RELATIONSHIPS, and ~40 more).
- Add multi-char operators ==, <>, <=, >=, !=, &&, || so they tokenize as single
  tokens (dax-operator-reference).
- Add dt"..." datetime literal (dax-syntax-reference#date-and-time-literal).
- Support dotted function names in identifier regex so PERCENTILE.EXC, STDEV.P,
  INFO.VIEW.MEASURES tokenize as one keyword.function token.
- Add query keywords FUNCTION, WITH, VISUAL, SHAPE, AXIS, GROUP, TOTAL, DENSIFY
  (virtual-table-statement-dax, function-statement-dax) and UDF parameter keywords
  ANYVAL, ANYREF, SCALAR, VAL, EXPR.
- Remove duplicate DATATABLE keyword entry (it is a function per datatable-function-dax).

Power Query M (powerquery.ts):
- Add catch keyword (m-spec-error-handling, May-2022 addition).
- Add ?? coalesce operator (m-spec-operators).
- Add #!"..." verbatim text literal (m-spec-consolidated-grammar).
- Rewrite operators regex so multi-char tokens (??, =>, ..., .., <=, >=, <>) match
  as single tokens instead of being split.
- Drop redundant explicit rules for section/shared/each/as/is; all fall through
  the existing lowercase-keyword rule correctly.
- Add missing recent M built-ins: Table.AddFuzzyClusterColumn, Table.AddRankColumn,
  Table.CombineColumnsToRecord, Table.ConformToPageReader, Table.Fuzzy*,
  Table.StopFolding, Text.Reverse, Html.Table, Pdf.Tables, Identity.*,
  Odbc.InferOptions, Character.ToLower/ToUpper, Record.CombineLists/FromFields,
  Value.{Alternates,Expression,ReplaceFields,Traits,VersionIdentity,Versions},
  Diagnostics.{BeginTrace,LogFailure,LogValue,LogValue2}.

Tests:
- msdax: new coverage for multi-char operators, dt"..." literal, dotted function
  names, each newly-added function category, DEFINE FUNCTION, DEFINE TABLE ... WITH
  VISUAL SHAPE. Existing DATATABLE test updated from keyword to keyword.function.
- powerquery: new coverage for try/catch (with parameter and with empty param list),
  ?? coalesce, and #!"..." verbatim literal.

npm run test:grammars -> 86/86 passing.
- msdax: remove NOT from functions[] (dead code — keywords[] matched first)
- msdax: add => to multi-char operators so lambda arrow is a single token
- powerquery: escape / inside operator char class regex

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant