Open
Changes from 38 commits
72 commits
595423d
Add benchmark capabilities for ops.
neoblizz Feb 3, 2026
8c965a1
Merge branch 'main' into neoblizz/iris-xops-perf
neoblizz Feb 7, 2026
ef227b0
Merge conflicts.
neoblizz Feb 7, 2026
f132ceb
Up the tritonBLAS commit.
neoblizz Feb 7, 2026
1628a61
...
neoblizz Feb 10, 2026
c26e872
Apply Ruff auto-fixes
github-actions[bot] Feb 10, 2026
3d4c7d7
Fix load vectorization and transpose config
ryanswann-amd Feb 11, 2026
5b02211
Apply Ruff auto-fixes
github-actions[bot] Feb 11, 2026
4c3b3f4
Add HBM buffered version
ryanswann-amd Feb 11, 2026
a301392
Merge branch 'ryaswann/iris_xops_perf' of github.com:ROCm/iris into r…
ryanswann-amd Feb 11, 2026
1f3b9ef
Apply Ruff auto-fixes
github-actions[bot] Feb 11, 2026
45288ff
Use workgroup specialized variant
ryanswann-amd Feb 13, 2026
b2aadcd
Apply Ruff auto-fixes
github-actions[bot] Feb 13, 2026
7b2321e
Update hbm buffered all gather matmul
ryanswann-amd Feb 16, 2026
a4d845f
Merge branch 'ryaswann/iris_xops_perf' of github.com:ROCm/iris into r…
ryanswann-amd Feb 16, 2026
9692222
Apply Ruff auto-fixes
github-actions[bot] Feb 16, 2026
44ebc97
Add tracing
ryanswann-amd Feb 16, 2026
0c2842e
Merge branch 'ryaswann/iris_xops_perf' of github.com:ROCm/iris into r…
ryanswann-amd Feb 17, 2026
11d017a
Apply Ruff auto-fixes
github-actions[bot] Feb 17, 2026
ace40d0
Add stages to all_gather_matmul_hbm_buffer
ryanswann-amd Feb 17, 2026
950c3a0
Merge branch 'ryaswann/iris_xops_perf' of github.com:ROCm/iris into r…
ryanswann-amd Feb 17, 2026
f7612bd
Apply Ruff auto-fixes
github-actions[bot] Feb 17, 2026
51bccb5
Updates to benchmark and kernel
ryanswann-amd Feb 17, 2026
9b71523
Merge branch 'ryaswann/iris_xops_perf' of github.com:ROCm/iris into r…
ryanswann-amd Feb 17, 2026
cbe2aff
Apply Ruff auto-fixes
github-actions[bot] Feb 17, 2026
11d9001
Add predictive params, fix pointer overflows, fix race conditions
Mar 3, 2026
3c4cb4d
Apply Ruff auto-fixes
github-actions[bot] Mar 3, 2026
f2f755a
Merge branch 'neoblizz/iris-xops-perf' into ryaswann/iris_xops_perf
ryanswann-amd Mar 3, 2026
77eff5b
Reverse 2D block translate
Mar 3, 2026
dcafd2a
Properly use iris tracing APIs
Mar 3, 2026
6fdad6d
Apply Ruff auto-fixes
github-actions[bot] Mar 3, 2026
08755b7
Remove test.sh
Mar 3, 2026
88f7767
All gather matmul with improved performance. (#415)
ryanswann-amd Mar 5, 2026
f558293
Fix CI: restore vectorization hints, align tritonBLAS versions, remov…
ryanswann-amd Mar 6, 2026
e5dd77f
Merge main into neoblizz/iris-xops-perf
ryanswann-amd Mar 6, 2026
477b472
Fix CI: increase default N to match FusedConfig block_size_n=256
ryanswann-amd Mar 6, 2026
76cc30d
Revert "Fix CI: increase default N to match FusedConfig block_size_n=…
ryanswann-amd Mar 6, 2026
9743b13
Remove unnecessary block size assertions — Triton handles masking
ryanswann-amd Mar 6, 2026
a86dc04
Initial plan
Copilot Mar 11, 2026
445b25c
Add vectorization hints and tests for HBM buffer all-gather matmul
Copilot Mar 12, 2026
2f0099f
Add vectorization hints and tests for HBM buffer all-gather matmul (#…
ryanswann-amd Mar 12, 2026
39c213d
Merge branch 'main' into neoblizz/iris-xops-perf
ryanswann-amd Mar 16, 2026
bad3422
Initial plan for PR cleanup
Copilot Apr 8, 2026
2a9f31a
Cleanup PR: address reviewer feedback
Copilot Apr 8, 2026
98d25bf
Clarify bias handling in matmul_reduce_scatter: raise NotImplementedE…
Copilot Apr 8, 2026
196bef7
Merge branch 'main' into neoblizz/iris-xops-perf
Copilot Apr 8, 2026
f4b4e75
Sync with main, remove unneeded scripts, minimize PR footprint
Copilot Apr 8, 2026
9d29d8c
Port HBM buffer benchmark to iris.bench, remove helper scripts
Copilot Apr 8, 2026
2c8b226
Replace shmem with ctx in hbm_buffer kernel and tests
Copilot Apr 9, 2026
1f7f6f1
Updated copilot instructions: you have GPUs, use them
mawad-amd Apr 9, 2026
9999273
Add benchmark comparison plots for HBM buffer vs baseline
Copilot Apr 9, 2026
e6b7114
Merge benchmarks and tests, remove dead code
Copilot Apr 9, 2026
5fac461
Update benchmark comparison plots with MxNxK x-axis labels
Copilot Apr 9, 2026
184331c
Extend trace events with categorized ID ranges and fix tracing abuse
mawad-amd Apr 9, 2026
1b6df88
Apply Ruff auto-fixes
github-actions[bot] Apr 9, 2026
6b70059
Bump trace schema version to 1.2 for new event categories
mawad-amd Apr 9, 2026
8607e38
Add RCCL baseline and rename algorithms to one_shot/prefetch
mawad-amd Apr 9, 2026
63c978b
Fix RCCL benchmark: use regular CUDA memory, not iris symmetric heap
mawad-amd Apr 9, 2026
6a8ad6b
Fix RCCL benchmark: use dist.get_world_size() instead of ctx
mawad-amd Apr 9, 2026
292ee11
Update HBM buffer kernel defaults and benchmark for parameter sweep
Copilot Apr 9, 2026
6979787
Update benchmark plots with new vs previous defaults comparison
Copilot Apr 9, 2026
02ea2b6
Fix preamble FusedConfig() defaults and add shape-adaptive auto-config
ryanswann-amd Apr 11, 2026
64a631f
Fix collective ordering deadlock in fd_passing at ws<8
ryanswann-amd Apr 11, 2026
7d3f476
Apply Ruff auto-fixes
github-actions[bot] Apr 11, 2026
ef0a173
Port auto-config system from ryanswann-amd/iris feature/auto-config-x…
Copilot Apr 15, 2026
2528e8e
Add docs/benchmark-results/ to .gitignore
Copilot Apr 15, 2026
caed8a5
Remove accidentally committed .github/agents and benchmark images
Copilot Apr 15, 2026
9c99965
Fix: add tl.debug_barrier() before atomic.xchg, fix tests k_per_flag,…
Copilot Apr 22, 2026
2dedbce
Add state.skip() when iris disabled by auto-config, fix benchmark ran…
Copilot Apr 22, 2026
e42c7a3
Use per-tensor Generator for seeding in benchmark, use ctx.randn for …
Copilot Apr 22, 2026
7f163a0
Add bar chart: iris vs RCCL vs expected for tuned shapes at ws=8 (MI3…
Copilot Apr 22, 2026
95dce96
Fix rccl benchmark: use dist.all_gather+cat(dim=1) for correct K-conc…
Copilot Apr 22, 2026
6 changes: 3 additions & 3 deletions .github/scripts/run_tests.sh
@@ -74,10 +74,10 @@ EXIT_CODE=0
# shellcheck disable=SC2086
"$SCRIPT_DIR/container_exec.sh" $GPU_ARG "
set -e

echo \"Installing iris using method: $INSTALL_METHOD\"
$INSTALL_CMD

# Run tests in the specified directory
for test_file in tests/$TEST_DIR/test_*.py; do
if [ -f \"\$test_file\" ]; then
@@ -88,4 +88,4 @@ EXIT_CODE=0
" || { EXIT_CODE=$?; }

# GPU cleanup is now handled by workflow-level release_gpus.sh step
exit $EXIT_CODE
exit $EXIT_CODE
2 changes: 2 additions & 0 deletions .gitignore
@@ -28,6 +28,8 @@ omni*.pdf
slurm*.out

*.egg-info
*.backup
*.with_chunked

examples/gemm/results/*
asm/
2 changes: 1 addition & 1 deletion apptainer/iris.def
@@ -30,7 +30,7 @@ From: rocm/pytorch:rocm7.1_ubuntu24.04_py3.13_pytorch_release_2.9.1
cd /opt
git clone https://github.com/triton-lang/triton.git \$TRITON_PATH
cd \$TRITON_PATH
git checkout bcbcabdd0cff6539c7168299075992b2a23ff38e
git checkout bcbcabdd0cff6539c7168299075992b2a23ff38e
pip3 install -e .
"
