[inductor][rocm] make AMD MM matrix_instr_nonkdim configurable by reger-men · Pull Request #3234 · ROCm/pytorch

reger-men · 2026-05-19T14:11:10Z

Adds torch._inductor.config.rocm.mfma_nonkdim, which reads the env var TORCHINDUCTOR_MFMA_NONKDIM. The AMD MM Triton template autotune sweep is now driven by this knob rather than being hard-coded to [0, 16]. Default behaviour is unchanged on ROCm; the knob is ignored on other backends.

Recognised values:

value	autotune sweep	default `matrix_instr_nonkdim`
unset	`[0, 16]`	16 (upstream)
`0` / `16` / `32`	`[value]`	value
`auto`	`[0, 16, 32]`	16

mfma_32x32x*_bf16 is only emitted when 32 is in the sweep, so `auto` is the safe opt-in for shapes where the mfma_32 path might win, while `32` forces that path on. This is a per-workload tuning knob; do not set it system-wide.
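The value table above can be read as a small mapping from the raw knob value to an autotune sweep and a default. The following is a minimal sketch of that mapping, not the code in this PR: the function name `resolve_mfma_nonkdim` and the constants are hypothetical, and the fallback to the upstream default for empty or unrecognised input, as well as the case-insensitive match on `auto`, are assumptions inferred from the test plan rather than confirmed behaviour.

```python
from typing import List, Optional, Tuple

# Values taken from the table in the PR description.
UPSTREAM_SWEEP = [0, 16]
UPSTREAM_DEFAULT = 16
AUTO_SWEEP = [0, 16, 32]
FORCED_VALUES = {"0": 0, "16": 16, "32": 32}


def resolve_mfma_nonkdim(raw: Optional[str]) -> Tuple[List[int], int]:
    """Return (autotune sweep, default matrix_instr_nonkdim) for a raw knob value.

    Hypothetical helper: it mirrors the documented table only.
    """
    text = (raw or "").strip().lower()

    if text == "auto":
        # Widen the sweep so the mfma_32 path can compete; default stays 16.
        return list(AUTO_SWEEP), UPSTREAM_DEFAULT

    if text in FORCED_VALUES:
        # A forced value collapses the sweep to that single entry.
        value = FORCED_VALUES[text]
        return [value], value

    # Assumption: unset, empty, or unrecognised input keeps upstream behaviour.
    return list(UPSTREAM_SWEEP), UPSTREAM_DEFAULT


if __name__ == "__main__":
    for candidate in (None, "0", "16", "32", "auto", "AUTO", "", "garbage"):
        print(repr(candidate), resolve_mfma_nonkdim(candidate))
```

Keeping the forced values in an explicit lookup table, rather than parsing with `int()`, avoids accepting spellings such as `1_6` that Python would otherwise parse as 16.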

Test plan

test_amd_mfma_nonkdim_config.py covers unset / forced 0 / forced 16 / forced 32 / auto / garbage via torch._inductor.config.patch
subprocess probe asserts the import-time env parser handles 0, 16, 32, auto, AUTO, an empty string, and a non-integer
existing AMD MM autotune tests continue to pass with env unset
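Because the env var is read at import time, the probe in the test plan has to run in a fresh interpreter. A rough sketch of that pattern is shown below, using only the standard library. The helper name `probe` and the child snippet are hypothetical stand-ins: the child here only echoes the env var it sees, whereas the real probe would presumably import torch and inspect the config attribute added by this PR.

```python
import os
import subprocess
import sys
from typing import Optional

ENV_VAR = "TORCHINDUCTOR_MFMA_NONKDIM"

# Stand-in child program. It prints the repr of the env var as seen by a
# fresh interpreter; a real probe would import torch and print the parsed
# config value instead.
CHILD_SNIPPET = f"import os; print(repr(os.environ.get({ENV_VAR!r})))"


def probe(value: Optional[str]) -> str:
    """Run a fresh Python with ENV_VAR set to `value` (or unset when None)."""
    env = dict(os.environ)
    env.pop(ENV_VAR, None)  # start from a clean state so the parent env cannot leak in
    if value is not None:
        env[ENV_VAR] = value
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits non-zero
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for candidate in (None, "0", "16", "32", "auto", "AUTO", "not-a-number"):
        print(candidate, "->", probe(candidate))
```

Passing a copied environment to the child keeps the setting scoped to that one process, which also matches the advice to treat the knob as per-workload rather than system-wide.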

rocm-repo-management-api · 2026-05-19T14:22:03Z

Jenkins build for 78e99ac5b00742298bb83fe8d49b1a3d5991856c commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

rocm-repo-management-api · 2026-05-20T18:06:58Z

Jenkins build for e2926f2d3ada9da02d55fc19a97bd0028fe6f1f5 commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

Adds torch._inductor.config.rocm.mfma_nonkdim, reading the env var TORCHINDUCTOR_MFMA_NONKDIM. The AMD MM Triton template autotune sweep is now driven from this knob rather than being hard-coded to [0, 16]. Default behaviour is unchanged on ROCm; ignored on other backends.

Recognised values:

unset: upstream default ([0, 16] sweep, ROCmGemmConfig default 16)
"0" / "16" / "32": force a single value; sweep collapses to [value]
"auto": extend the autotune sweep to [0, 16, 32]; default stays 16

mfma_32x32x*_bf16 is only emitted when 32 is in the sweep, so "auto" is the safe opt-in for shapes where the mfma_32 path might win.

Test under test/inductor/test_amd_mfma_nonkdim_config.py covers all modes (unset / forced int / "auto" / garbage) by patching the config attribute in-process via torch._inductor.config.patch, plus a subprocess probe that spawns a fresh Python with the env var set to exercise the import-time env parser.

rocm-repo-management-api · 2026-05-21T09:37:10Z

Jenkins build for f26feec7ab0bf69d6f3b70be5a08fd354f690cc8 commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

reger-men force-pushed the pr3-mfma-nonkdim branch from 78e99ac to e2926f2 · May 20, 2026 18:05

reger-men force-pushed the pr3-mfma-nonkdim branch from e2926f2 to f26feec · May 21, 2026 09:23

reger-men wants to merge 1 commit into ROCm:develop from reger-men:pr3-mfma-nonkdim
