feat(msdax,powerquery): expand DAX functions and improve Power Query tokenizer#5309
Open
back1ply wants to merge 6 commits into
Open
feat(msdax,powerquery): expand DAX functions and improve Power Query tokenizer#5309back1ply wants to merge 6 commits into
back1ply wants to merge 6 commits into
Conversation
DAX (msdax): - Add ~80 missing functions: REMOVEFILTERS, ALLCROSSFILTERED, SELECTEDVALUE, WINDOW, INDEX, OFFSET, FIRST, LAST, ORDERBY, PARTITIONBY, KEEPFILTERS, LOOKUPVALUE, DISTINCTCOUNTNOBLANK, ISINSCOPE, NONVISUAL, and financial/info functions - Distinguish functions from keywords using keyword.function token type - Add COLUMN, TABLE, EXPRESSION, DEFAULT as query keywords - Fix &&/|| as explicit two-char operator tokens (distinct from & and |) Power Query (powerquery): - Explicitly handle section/shared/each in root tokenizer rule - Add as/is type annotation keyword matching in root - Add hash literal/constructor (#date, #time, #nan etc.) in root tokenizer - Restore clean string tokenizer (remove erroneous stringRecord state) Tests: - Add test cases for &&/||, REMOVEFILTERS, SELECTEDVALUE, ALLCROSSFILTERED, KEEPFILTERS, WINDOW/PARTITIONBY/ORDERBY, DEFINE VAR/RETURN, COLUMN/TABLE/EXPRESSION - Add Power Query test cases for section, shared, each, try/otherwise, as/is type annotations, hash constants and constructors - All 86 language tokenizer tests pass
Author
|
@microsoft-github-policy-service agree |
…d categories - Remove 15 fabricated/non-existent DAX functions: NATIVEPETERJOIN, LOOKUP, LOOKUPWITHTOTALS, ROWPCTRANSLATE, ROWS, COLUMNS, RUNNINGTOTAL, RUNNINGAVERAGE, GROUPPROPERLY, DMIRR, DVARP, DVARS, QMDURATION, QPRICE, RLN - Remove non-DAX Excel-only functions: CEILING.MATH, CEILING.PRECISE, FLOOR.MATH, FLOOR.PRECISE - Remove entire fabricated INFO/Metadata block including 'ISCLOUD\u5c0f\u91ceBUILTIN', TABLEAU, non-existent metadata functions, and Excel CUBE functions - Remove VAR and VARP from functions (VAR is a keyword; VARP is not a DAX function; VAR.P already exists) - Remove duplicates: MIRR/NPER/PMT/PPMT/PV/RATE appeared twice (math + financial); XIRR/XNPV appeared twice (aggregation + financial); PATH/PATHCONTAINS/etc. and TREATAS/CROSSFILTER/USERELATIONSHIP/ISONORAFTER appeared twice - Fix keyword categories: move CALCULATE and CALCULATETABLE back to functions (they are filter functions, not query keywords); remove INT, SHARED from keywords (INT is a function; SHARED is a Power Query concept) - Preserve COLUMNSTATISTICS (real DAX info function) by relocating it to scalar section - Update test: INT tokenizes as keyword.function, not keyword, consistent with the PR's goal of distinguishing functions from keywords
- powerquery.ts: revert unnecessary escaping of double-quote inside single-quoted string literal - msdax.ts: restore original bracket character class order (unrelated cosmetic reorder) Both changes were flagged in code review as diff noise unrelated to the PR's goals.
Cross-referenced all added identifiers against official Microsoft DAX docs (learn.microsoft.com/dax) and fixed two issues: Removed: - DEFAULT: not a reserved DAX keyword. It appears only as a value in the `BLANKS DEFAULT` enumeration within ORDERBY() and RANK()/ROWNUMBER() `blanks` parameter, alongside FIRST/LAST. It is not a standalone query keyword per the DAX grammar spec. Added to functions (verified against learn.microsoft.com/dax): - MATCHBY: window-function column matcher, pair to ORDERBY/PARTITIONBY - RANK, ROWNUMBER: ranking window functions (added April 2023) - MOVINGAVERAGE, RUNNINGSUM: visual calc aggregation shortcuts - RANGE: visual calc shortcut for WINDOW - LOOKUP, LOOKUPWITHTOTALS: visual calc lookup (June 2025) - NEXT, PREVIOUS: visual calc row navigation (Jan 2024) - LINEST, LINESTX: least-squares fit (Feb 2023) - TABLEOF: table reference helper (Feb 2026) - EXPAND, EXPANDALL, COLLAPSE, COLLAPSEALL: visual calc hierarchy navigation - ISATLEVEL: visual calc level predicate All 86 language tokenizer tests pass (npm run test:grammars).
DAX (msdax.ts):
- Add missing functions across categories verified against learn.microsoft.com/dax:
Logical (COALESCE, BITAND/OR/XOR/LSHIFT/RSHIFT, IF.EAGER), Information (ISAFTER,
ISBOOLEAN, ISCURRENCY, ISDATETIME, ISDECIMAL, ISDOUBLE, ISSTRING, CONTAINSSTRING,
CONTAINSSTRINGEXACT, NAMEOF, SELECTEDMEASURE family, ISSELECTEDMEASURE, USERCULTURE,
USEROBJECTID), Text (COMBINEVALUES), Statistical (NORM.*, T.*), Date & time
(NETWORKDAYS, QUARTER, UTCNOW, UTCTODAY), Time intelligence (DATESWTD), Other
(EVALUATEANDLOG, EXTERNALMEASURE, TOCSV, TOJSON), Financial (RRI), and the full
INFO DMV family (INFO.CATALOGS, INFO.COLUMNS, INFO.MEASURES, INFO.TABLES,
INFO.VIEW.*, INFO.RELATIONSHIPS, and ~40 more).
- Add multi-char operators ==, <>, <=, >=, !=, &&, || so they tokenize as single
tokens (dax-operator-reference).
- Add dt"..." datetime literal (dax-syntax-reference#date-and-time-literal).
- Support dotted function names in identifier regex so PERCENTILE.EXC, STDEV.P,
INFO.VIEW.MEASURES tokenize as one keyword.function token.
- Add query keywords FUNCTION, WITH, VISUAL, SHAPE, AXIS, GROUP, TOTAL, DENSIFY
(virtual-table-statement-dax, function-statement-dax) and UDF parameter keywords
ANYVAL, ANYREF, SCALAR, VAL, EXPR.
- Remove duplicate DATATABLE keyword entry (it is a function per datatable-function-dax).
Power Query M (powerquery.ts):
- Add catch keyword (m-spec-error-handling, May-2022 addition).
- Add ?? coalesce operator (m-spec-operators).
- Add #!"..." verbatim text literal (m-spec-consolidated-grammar).
- Rewrite operators regex so multi-char tokens (??, =>, ..., .., <=, >=, <>) match
as single tokens instead of being split.
- Drop redundant explicit rules for section/shared/each/as/is; all fall through
the existing lowercase-keyword rule correctly.
- Add missing recent M built-ins: Table.AddFuzzyClusterColumn, Table.AddRankColumn,
Table.CombineColumnsToRecord, Table.ConformToPageReader, Table.Fuzzy*,
Table.StopFolding, Text.Reverse, Html.Table, Pdf.Tables, Identity.*,
Odbc.InferOptions, Character.ToLower/ToUpper, Record.CombineLists/FromFields,
Value.{Alternates,Expression,ReplaceFields,Traits,VersionIdentity,Versions},
Diagnostics.{BeginTrace,LogFailure,LogValue,LogValue2}.
Tests:
- msdax: new coverage for multi-char operators, dt"..." literal, dotted function
names, each newly-added function category, DEFINE FUNCTION, DEFINE TABLE ... WITH
VISUAL SHAPE. Existing DATATABLE test updated from keyword to keyword.function.
- powerquery: new coverage for try/catch (with parameter and with empty param list),
?? coalesce, and #!"..." verbatim literal.
npm run test:grammars -> 86/86 passing.
- msdax: remove NOT from functions[] (dead code — keywords[] matched first) - msdax: add => to multi-char operators so lambda arrow is a single token - powerquery: escape / inside operator char class regex Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR closes remaining spec coverage gaps for both DAX and Power Query M, verified against the official Microsoft Learn documentation. All tokenizer tests pass (86/86).
DAX (
msdax) — full spec coverage achievedMissing functions added (7 categories)
All functions verified against learn.microsoft.com/dax.
Logical —
COALESCE,BITAND,BITOR,BITXOR,BITLSHIFT,BITRSHIFT,IF.EAGERInformation —
ISAFTER,ISBOOLEAN,ISCURRENCY,ISDATETIME,ISDECIMAL,ISDOUBLE,ISSTRING,CONTAINSSTRING,CONTAINSSTRINGEXACT,NAMEOF,SELECTEDMEASURE,SELECTEDMEASUREFORMATSTRING,SELECTEDMEASURENAME,ISSELECTEDMEASURE,USERCULTURE,USEROBJECTIDText —
COMBINEVALUESStatistical —
NORM.DIST,NORM.INV,NORM.S.DIST,NORM.S.INV,T.DIST,T.DIST.2T,T.DIST.RT,T.INV,T.INV.2TDate & time —
NETWORKDAYS,QUARTER,UTCNOW,UTCTODAYTime-intelligence —
DATESWTD(week-to-date; siblingsDATESMTD/DATESQTD/DATESYTDwere already present)Financial —
RRI(the last missing from the full financial set)INFO DMV family — all ~50+ INFO metadata functions including
INFO.CATALOGS,INFO.COLUMNS,INFO.MEASURES,INFO.TABLES,INFO.FUNCTIONS,INFO.VIEW.COLUMNS,INFO.VIEW.MEASURES,INFO.VIEW.RELATIONSHIPS,INFO.VIEW.TABLES, plus DMV-backed catalog/relationship/role/storage variantsOther —
EVALUATEANDLOG,EXTERNALMEASURE,TOCSV,TOJSONOperators & literals
=>(lambda/UDF arrow),==(strict equal),<>,<=,>=,!=,&&,||— previously split into two single-char tokensdt"..."(e.g.,dt"2015-12-31") now recognized (dax-syntax-reference)Dotted function names
Identifier regex extended so dotted names (e.g.
PERCENTILE.EXC,STDEV.P,VAR.P,CHISQ.DIST.RT,NORM.S.INV,INFO.VIEW.MEASURES) tokenize as a singlekeyword.functiontoken instead ofidentifier+delimiter+identifier.Query keywords
Added
FUNCTION,WITH,VISUAL,SHAPE,AXIS,GROUP,TOTAL,DENSIFYforDEFINE TABLE ... WITH VISUAL SHAPE (…)syntax (virtual-table-statement-dax) and UDF parameter modifiersANYVAL,ANYREF,SCALAR,VAL,EXPR(function-statement-dax).Cleanup
DATATABLEkeyword (it's a function per datatable-function-dax). MovedTABLEOFto Information category.NOTfromfunctions[](dead code —keywords[]is checked first in the Monarchcasesmap, so thefunctions[]entry was unreachable).Power Query M (
powerquery) — spec-completeKeywords
Added
catchfortry … catch (e) => …error handling (m-spec-error-handling, May-2022 addition).Operators
??coalesce operator now a single token (previously two?tokens) — per m-spec-operators??,=>,...,..,<=,>=,<>) come before the single-char fallback. Escaped/inside the character class for strict regex correctness.Literals
Verbatim text literal
#!"..."now recognized (m-spec-consolidated-grammar §Lexical).Tokenizer cleanup
Removed redundant explicit rules for
section/shared/each/as/is— all correctly handled by the existing lowercase-keyword rule.Built-in functions
Added recently-added functions:
Table.AddFuzzyClusterColumn,Table.AddRankColumn,Table.CombineColumnsToRecord,Table.ConformToPageReader,Table.FuzzyGroup,Table.FuzzyJoin,Table.FuzzyNestedJoin,Table.StopFolding,Text.Reverse,Html.Table,Pdf.Tables,Identity.From,Identity.IsMemberOf,IdentityProvider.Default,Odbc.InferOptions,Character.ToLower,Character.ToUpper,Record.CombineLists,Record.FromFields,Value.Alternates,Value.Expression,Value.ReplaceFields,Value.Traits,Value.VersionIdentity,Value.Versions,Diagnostics.BeginTrace,Diagnostics.LogFailure,Diagnostics.LogValue,Diagnostics.LogValue2.Tests
==/<>/<=/>=,=>(single token),dt"...", dotted function tokenization (PERCENTILE.EXC,STDEV.P,INFO.VIEW.MEASURES), newly added functions (COALESCE,BITAND,COMBINEVALUES,SELECTEDMEASURE,QUARTER,UTCTODAY,TOCSV,DATESWTD), plusDEFINE FUNCTIONandDEFINE TABLE … WITH VISUAL SHAPEkeyword sets. UpdatedDATATABLE(...)test expectation fromkeyword→keyword.function.try/catch(parameterized and empty-param),??,#!"...".