[release/2.11] Align Radeon targets for supported and preferred hipBLASLt backend with release/2.10 by mstankov-amd · Pull Request #3135 · ROCm/pytorch

mstankov-amd · 2026-04-07T12:33:27Z

Motivation

Align list of supported and preferred hipBLASLt targets (for Radeon GPU) with release/2.10 branch. Also, update the ROCm versions where target should be added to supported or preferred hipBLASLt list.

…th release/2.10

rocm-repo-management-api · 2026-04-07T13:21:04Z

Jenkins build for c1f312dfd0059c24e3e63b64216b2dadab17b47e commit finished as FAILURE
Links: Pipeline Overview / Build artifacts / Test Results

Detected error during Pytorch building:

      |     ^
/var/lib/jenkins/pytorch/third_party/mslk/external//composable_kernel/include/ck/tensor_operation/gpu/device/impl/device_grouped_gemm_multiple_d_xdl_cshuffle_tile_loop.hpp:65:5: warning: failed to meet occupancy target given by 'amdgpu-waves-per-eu' in '_ZN2ck16tensor_operation6device34kernel_grouped_gemm_multiple_d_xdlINS_34GridwiseGemmMultiD_xdl_cshuffle_v3INS_13tensor_layout4gemm8RowMajorENS5_11ColumnMajorENS_5TupleIJS6_S7_EEES6_DB8_SA_ffNS8_IJffEEEtNS0_12element_wise11PassThroughESD_NSC_16MultiplyMultiplyELNS1_18GemmSpecializationE0ELi256ELi224ELi256ELi128ELi16ELi16ELi16ELi16ELi7ELi8ENS_8SequenceIJLi8ELi32ELi1EEEENSG_IJLi1ELi0ELi2EEEESI_Li2ELi16ELi16ELb0ELi0ESH_SI_SI_Li2ELi16ELi16ELb0ELi0ELi1ELi2ENSG_IJLi1ELi32ELi1ELi8EEEENSG_IJLi8ELi8ELi1EEEELNS_26BlockGemmPipelineSchedulerE0ELNS_24BlockGemmPipelineVersionE2ESA_SA_SA_SA_Lb0EEENS1_25GroupedGemmKernelArgumentILi2EEELSF_0ESA_SA_SB_tS6_S7_S9_S6_Li128ENS_25OffsettedBlockToCTileMap2INS_39BlockToCTileMap_Grouped_M00_N0_M01AdaptILi8ELi224ELi256EEEEESS_SD_SD_SE_LSL_0ELSM_2EEEvPU3AS4KviT13_T14_T15_': desired occupancy was 2, final occupancy is 1 [-Wpass-failed]
2 warnings generated when compiling for gfx942.
[7268/8165] Linking CXX static library lib/libmslk.a
[7269/8165] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ck_gemm_float.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ck_gemm_float.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ck_gemm_float.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip && /opt/conda/envs/py_3.12/lib/python3.12/site-packages/cmake/data/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/. && /opt/conda/envs/py_3.12/lib/python3.12/site-packages/cmake/data/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_ck_gemm_float.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ck_gemm_float.hip.o.cmake
sccache: encountered fatal error
sccache: error: Failed to parse included file path
sccache: caused by: Failed to parse included file path
failed to execute:/opt/rocm/llvm/bin/clang++  --offload-arch=gfx90a --offload-arch=gfx908 --offload-arch=gfx942 -O3  -c -x hip /var/lib/jenkins/pytorch/aten/src/ATen/native/hip/ck_gemm_float.hip -o "/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_ck_gemm_float.hip.o" --offload-compress -std=c++17 --rocm-device-lib-path=/opt/rocm/amdgcn/bitcode -fclang-abi-compat=17 -DUSE_NCCL -DUSE_ROCM -D__HIP_PLATFORM_AMD__ -DUSE_FLASH_ATTENTION -DFLASHATTENTION_DISABLE_ALIBI -DFLASHATTENTION_DISABLE_SOFTCAP -DFLASH_NAMESPACE=pytorch_flash -DUNFUSE_FMA -DUSE_MEM_EFF_ATTENTION -DUSE_C10D_NCCL -DTORCH_CUDA_BUILD_MAIN_LIB -DROCM_VERSION=70201 -DTORCH_HIP_VERSION=702 -DUSE_LAYERNORM_FAST_RECIPROCAL -DONNX_ML=1 -DONNXIFI_ENABLE_EXT=1 -DONNX_NAMESPACE=onnx_torch -DIDEEP_USE_MKL -DHAVE_MMAP=1 -D_FILE_OFFSET_BITS=64 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_POSIX_FALLOCATE=1 -DUSE_EXTERNAL_MZCRC -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -D__HIP_PLATFORM_AMD__=1 -DUSE_PROF_API=1 -DAT_PER_OPERATOR_HEADERS -DUSE_DISTRIBUTED -DUSE_C10D_GLOO -DUSE_RPC -DUSE_TENSORPIPE -D__HIP_PLATFORM_AMD__ -DHIPBLASLT_USE_ROCROLLER -DFMT_HEADER_ONLY=1 -fPIC -D__HIP_PLATFORM_AMD__=1 -DCUDA_HAS_FP16=1 -DUSE_ROCM -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -DTORCH_HIP_VERSION=702 -Wno-shift-count-negative -Wno-shift-count-overflow -DCAFFE2_USE_MIOPEN -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_HIP -DHIPBLAS_V2 -DHIP_ENABLE_WARP_SYNC_BUILTINS -DHIPBLASLT_OUTER_VEC -DUSE_ROCM_CK_GEMM -fno-gpu-rdc -I/var/lib/jenkins/pytorch/build/aten/src -I/var/lib/jenkins/pytorch/aten/src -I/var/lib/jenkins/pytorch/build -I/var/lib/jenkins/pytorch -I/opt/rocm-7.2.1/include -I/var/lib/jenkins/pytorch/build/third_party/gloo -I/var/lib/jenkins/pytorch/cmake/../third_party/gloo -I/var/lib/jenkins/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -I/var/lib/jenkins/pytorch/cmake/../third_party/googletest/googlemock/include -I/var/lib/jenkins/pytorch/cmake/../third_party/googletest/googletest/include -I/var/lib/jenkins/pytorch/third_party/protobuf/src -I/opt/conda/envs/py_3.12/include -I/var/lib/jenkins/pytorch/third_party/XNNPACK/include -I/var/lib/jenkins/pytorch/third_party/ittapi/include -I/var/lib/jenkins/pytorch/cmake/../third_party/eigen -I/opt/rocm/include -I/opt/rocm-7.2.1/include -I/var/lib/jenkins/pytorch/third_party/ideep/mkl-dnn/include/oneapi/dnnl -I/var/lib/jenkins/pytorch/third_party/ideep/include -I/var/lib/jenkins/pytorch/third_party/ideep/mkl-dnn/include/oneapi/dnnl -I/opt/conda/envs/py_3.12/include -I/var/lib/jenkins/pytorch/nlohmann -I/var/lib/jenkins/pytorch/INTERFACE -I/var/lib/jenkins/pytorch/third_party/nlohmann/include -I/var/lib/jenkins/pytorch/moodycamel -I/var/lib/jenkins/pytorch/INTERFACE -I/var/lib/jenkins/pytorch/third_party/concurrentqueue -I/var/lib/jenkins/pytorch/aten/src/THH -I/var/lib/jenkins/pytorch/third_party/mslk/include/ -I/var/lib/jenkins/pytorch/aten/src/ATen/hip -I/var/lib/jenkins/pytorch/aten/src/ATen/../../../third_party/composable_kernel/include -I/var/lib/jenkins/pytorch/aten/src/ATen/../../../third_party/composable_kernel/library/include -I/var/lib/jenkins/pytorch/aten/src/ATen/../../../third_party/composable_kernel/example/ck_tile/01_fmha -I/var/lib/jenkins/pytorch/build/caffe2/aten/src/ATen/composable_kernel -I/var/lib/jenkins/pytorch/aten/src/ATen/../../../third_party/aiter/csrc/include -I/var/lib/jenkins/pytorch/third_party/fmt/include -I/var/lib/jenkins/pytorch/aten/src -I/var/lib/jenkins/pytorch/build/caffe2/aten/src -I/var/lib/jenkins/pytorch/build/aten/src -I/var/lib/jenkins/pytorch/aten/src -I/var/lib/jenkins/pytorch/aten/src/ATen/.. -I/var/lib/jenkins/pytorch/torch/include -I/opt/rocm-7.2.1/include -I/opt/rocm/include -I/var/lib/jenkins/pytorch/c10/hip/../.. -I/var/lib/jenkins/pytorch/build -I/var/lib/jenkins/pytorch/c10/../ -I/var/lib/jenkins/pytorch/build -I/var/lib/jenkins/pytorch/torch/csrc/api -I/var/lib/jenkins/pytorch/torch/csrc/api/include -I/var/lib/jenkins/pytorch/third_party/protobuf/src -I/opt/conda/envs/py_3.12/include -I/opt/rocm-7.2.1/include -I/opt/rocm/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include/hiprand -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include/rocrand -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm/include -I/opt/rocm/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm-7.2.1/include -I/opt/rocm/include -I/var/lib/jenkins/pytorch/build/third_party/gloo/hip -I/opt/rocm/magma/include -I/var/lib/jenkins/pytorch/build/aten/src -I/var/lib/jenkins/pytorch/aten/src -I/var/lib/jenkins/pytorch/build -I/var/lib/jenkins/pytorch -I/opt/rocm-7.2.1/include -I/var/lib/jenkins/pytorch/build/third_party/gloo -I/var/lib/jenkins/pytorch/cmake/../third_party/gloo -I/var/lib/jenkins/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -I/var/lib/jenkins/pytorch/cmake/../third_party/googletest/googlemock/include -I/var/lib/jenkins/pytorch/cmake/../third_party/googletest/googletest/include -I/var/lib/jenkins/pytorch/third_party/protobuf/src -I/opt/conda/envs/py_3.12/include -I/var/lib/jenkins/pytorch/third_party/XNNPACK/include -I/var/lib/jenkins/pytorch/third_party/ittapi/include -I/var/lib/jenkins/pytorch/cmake/../third_party/eigen -I/opt/rocm/include -I/var/lib/jenkins/pytorch/third_party/ideep/mkl-dnn/include/oneapi/dnnl -I/var/lib/jenkins/pytorch/third_party/ideep/include -I/var/lib/jenkins/pytorch/nlohmann -I/var/lib/jenkins/pytorch/INTERFACE -I/var/lib/jenkins/pytorch/third_party/nlohmann/include -I/var/lib/jenkins/pytorch/moodycamel -I/var/lib/jenkins/pytorch/third_party/concurrentqueue

mgehre-amd · 2026-05-20T11:40:31Z

@mstankov-amd @jeffdaily, will this get merged?

mgehre-amd · 2026-05-20T11:59:39Z

And are you working on adding this to release/2.12 and the upstream pytorch main branch?

fjankovi · 2026-05-21T13:42:31Z

And are you working on adding this to release/2.12 and the upstream pytorch main branch?

Yes, we should now add gfx110x and gfx1151 to the prefer hipblaslt list on pytorch main

mgehre-amd · 2026-05-27T09:57:28Z

And are you working on adding this to release/2.12 and the upstream pytorch main branch?

Yes, we should now add gfx110x and gfx1151 to the prefer hipblaslt list on pytorch main

Great! @fjankovi, is there a ticket or PR to track that work?

mstankov-amd · 2026-05-27T15:01:09Z

Created an upstream PR for adding gfx1100, gfx1101 and gfx1151 to hiupBLASLt preferred backend list.

Align Radeon targets for supported and preferred hipBLASLt backend wi…

c1f312d

…th release/2.10

mstankov-amd requested review from jeffdaily and jithunnair-amd April 7, 2026 12:33

mstankov-amd changed the title ~~Align Radeon targets for supported and preferred hipBLASLt backend with release/2.10~~ [release/2.11] Align Radeon targets for supported and preferred hipBLASLt backend with release/2.10 Apr 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[release/2.11] Align Radeon targets for supported and preferred hipBLASLt backend with release/2.10#3135

[release/2.11] Align Radeon targets for supported and preferred hipBLASLt backend with release/2.10#3135
mstankov-amd wants to merge 1 commit into
release/2.11from
align_radeon_targets_preffered_blas_backend

mstankov-amd commented Apr 7, 2026

Uh oh!

rocm-repo-management-api Bot commented Apr 7, 2026 •

edited

Loading

Uh oh!

mgehre-amd commented May 20, 2026

Uh oh!

mgehre-amd commented May 20, 2026

Uh oh!

fjankovi commented May 21, 2026

Uh oh!

mgehre-amd commented May 27, 2026

Uh oh!

mstankov-amd commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

mstankov-amd commented Apr 7, 2026

Motivation

Uh oh!

rocm-repo-management-api Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mgehre-amd commented May 20, 2026

Uh oh!

mgehre-amd commented May 20, 2026

Uh oh!

fjankovi commented May 21, 2026

Uh oh!

mgehre-amd commented May 27, 2026

Uh oh!

mstankov-amd commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rocm-repo-management-api Bot commented Apr 7, 2026 •

edited

Loading