fix: SplitDouble producing transposed output in DXIL#8512
Open
arnavnagzirkar wants to merge 1 commit into
Open
fix: SplitDouble producing transposed output in DXIL#8512arnavnagzirkar wants to merge 1 commit into
arnavnagzirkar wants to merge 1 commit into
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Root Cause
asuint(double_matrix, out uint_mat lo, out uint_mat hi)produced transposed output in DXIL.HLSL matrices are stored in registers in row-major order regardless of their declared orientation. Column-major matrices (the default) have their allocas laid out in column-major order so that the column-major subscript/element index formulas work correctly.
TranslateDoubleAsUintinlib/HLSL/HLOperationLower.cpp(line 1876) callsSplitDoubleon each element of the input in sequential flat order — i.e.lo_alloca[i] = lo(x[i])for i=0..N-1 — which matches the row-major in-register order ofx. For column-major output allocas the alloca expects elements in column-major order, so the sequential write lands at the wrong logical positions.The
SafeToSkipoptimisation inEmitHLSLOutParamConversionInit(CGHLSLMS.cpp line 6347) passes the actuallo/hiallocas directly to theasuintHL intrinsic (no temporary alloca, no copy-back with orientation conversion), so the mismatch is never corrected.Concrete example (2×2, default column-major):
xrow-major flat order:[m00, m01, m10, m11]loalloca expects:alloca[0]=lo(m00), alloca[1]=lo(m10), alloca[2]=lo(m01), alloca[3]=lo(m11)TranslateDoubleAsUintwrites:alloca[0]=lo(m00), alloca[1]=lo(m01), alloca[2]=lo(m10), alloca[3]=lo(m11)← row-majorlo[0](first row) is read it accessesalloca[0]andalloca[2], gettinglo(m00)andlo(m10)— the first column instead of the first row.Change Made
File:
lib/HLSL/HLMatrixLowerPass.cppAdded a special case for
IntrinsicOp::IOP_asuintinHLMatrixLowerPass::lowerHLIntrinsicthat handles the 4-argumentasuint(opcode, x, lo, hi)form whenxis a matrix type and the output pointers are column-major.Before building the lowered call, the in-register row-major vector for
xis transposed to column-major order usingHLMatrixType::emitLoweredVectorRowToCol. This ensures thatTranslateDoubleAsUintwriteslo_alloca[i] = lo(x_colmaj[i])wherex_colmajis already in column-major order, so eachSplitDoubleresult lands at the correct position in the column-major alloca.A helper method
isColMajorMatrixPtrArgwas added to detect whether a matrix pointer argument is backed by a column-major alloca. It does so by inspecting the HL operations that consume the shared underlying lowered pointer throughvecToMatStubcalls, looking forColMatSubscript,ColMatElement,ColMatLoad, orColMatStoreopcodes. The fix is a no-op for row-major matrices (the fallback generic lowering path is taken).Issue
Fixes #8477
Issue URL: #8477
Changes
Testing
Agent ran relevant tests during development
Linting checks passed
Changes are minimal and focused on the issue
AI Assistance Disclosure
This pull request was prepared with the assistance of AI coding tools (GitHub Copilot). The change has been read, understood, and is owned by the human contributor submitting it, who will respond to review feedback.