
test: add qwen3 decode A3/A5 PTO cases #491

Draft
HecreReed wants to merge 3 commits into hw-native-sys:main from HecreReed:codex/qwen3-decode-a3-a5-cases


@HecreReed
Collaborator

Summary

  • vendor the A3 and A5 qwen3_decode_incore_*.pto fragments regenerated from pypto-lib/examples/models/qwen3/qwen3_32b_decode.py
  • add per-case custom golden coverage for all A3/A5 decode PTO kernels
  • wire runop.sh and generate_testcase.py so these direct .pto samples use the right default flags and golden assets

Validation

  • python3 -m py_compile test/npu_validation/scripts/generate_testcase.py test/samples/Qwen3DecodeA3/qwen3_decode_golden_lib.py test/samples/Qwen3DecodeA5/qwen3_decode_golden_lib.py test/samples/Qwen3DecodeA3/*_golden.py test/samples/Qwen3DecodeA5/*_golden.py
  • bash -n test/samples/runop.sh
  • offline smoke: all 17 A3 and 17 A5 cases passed ptoas -> generate_testcase -> custom golden

@HecreReed
Collaborator Author

/run a3 qwen3_decode_incore_0 qwen3_decode_incore_1 qwen3_decode_incore_2 qwen3_decode_incore_3 qwen3_decode_incore_4 qwen3_decode_incore_5 qwen3_decode_incore_6 qwen3_decode_incore_7 qwen3_decode_incore_8 qwen3_decode_incore_9 qwen3_decode_incore_10 qwen3_decode_incore_11 qwen3_decode_incore_12 qwen3_decode_incore_13 qwen3_decode_incore_14 qwen3_decode_incore_15 qwen3_decode_incore_16

@HecreReed
Collaborator Author

/run a5 qwen3_decode_incore_0 qwen3_decode_incore_1 qwen3_decode_incore_2 qwen3_decode_incore_3 qwen3_decode_incore_4 qwen3_decode_incore_5 qwen3_decode_incore_6 qwen3_decode_incore_7 qwen3_decode_incore_8 qwen3_decode_incore_9 qwen3_decode_incore_10 qwen3_decode_incore_11 qwen3_decode_incore_12 qwen3_decode_incore_13 qwen3_decode_incore_14 qwen3_decode_incore_15 qwen3_decode_incore_16 --pto-level=level3
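
The two `/run` comments in this thread follow one pattern: the board name, the case names `qwen3_decode_incore_0` through `qwen3_decode_incore_16`, then any optional PTOAS flags. A minimal sketch for generating such a comment, assuming a hypothetical helper `build_run_command` that is not part of the PR or of the board-test bot:

```python
def build_run_command(board, case_count, extra_flags=()):
    """Build a '/run' comment for cases 0..case_count-1 on the given board.

    Hypothetical helper for illustration; the case-name prefix is taken
    from the comments in this thread.
    """
    cases = [f"qwen3_decode_incore_{i}" for i in range(case_count)]
    return " ".join(["/run", board, *cases, *extra_flags])


a3_command = build_run_command("a3", 17)
a5_command = build_run_command("a5", 17, extra_flags=["--pto-level=level3"])
```

Building the list programmatically avoids skipped or duplicated case indices when the number of kernels changes.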


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for Qwen3 decode kernels for A3 and A5 architectures by adding the necessary test case generation logic and PTO kernel fragments. The changes include updates to the test case generation script, the addition of new sample directories, and modifications to the runop.sh script to handle these new targets. My feedback suggests moving the growing configuration dictionary in the generation script to an external file for better maintainability and verifying the glob pattern in the shell script to avoid potential redundant file processing.

Comment on lines +96 to +142
"qwen3_decode_incore_4": {
    "v11": 1,
    "v12": 0,
    "v13": 1,
},
"qwen3_decode_incore_5": {
    "v4": 1,
    "v5": 1,
    "v6": 1,
    "v7": 0,
},
"qwen3_decode_incore_6": {
    "v5": 1,
    "v6": 1,
    "v7": 0,
},
"qwen3_decode_incore_7": {
    "v4": 1,
    "v5": 1,
    "v6": 1,
    "v7": 0,
},
"qwen3_decode_incore_8": {
    "v5": 2,
    "v6": 1,
},
"qwen3_decode_incore_9": {
    "v4": 1,
    "v5": 64,
},
"qwen3_decode_incore_10": {
    "v4": 1,
    "v5": 64,
},
"qwen3_decode_incore_12": {
    "v4": 256,
},
"qwen3_decode_incore_13": {
    "v4": 256,
},
"qwen3_decode_incore_15": {
    "v4": 128,
},
"qwen3_decode_incore_16": {
    "v4": 1,
    "v5": 128,
},

medium

The dictionary CASE_POINTER_COUNT_MINIMUMS is growing large. Consider moving these configuration values to a separate JSON or YAML file to improve maintainability and keep the script clean.
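
One way this suggestion could be acted on, sketched under assumptions: the file name `case_pointer_count_minimums.json` and the helper `load_case_pointer_count_minimums` are hypothetical and do not appear in the PR. Only the dictionary name and the two sample entries come from the diff above.

```python
import json
import tempfile
from pathlib import Path


def load_case_pointer_count_minimums(path):
    """Load {case_name: {arg_name: minimum_count}} from a JSON file.

    Hypothetical helper: rejects anything that is not a mapping of
    mappings to non-negative integers.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    for case, minimums in data.items():
        if not isinstance(minimums, dict):
            raise ValueError(f"{case}: expected an object of argument minimums")
        for arg, count in minimums.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(count, bool) or not isinstance(count, int) or count < 0:
                raise ValueError(f"{case}.{arg}: expected a non-negative integer")
    return data


# Demo: two entries copied from the diff above, written to a temporary file.
sample = {
    "qwen3_decode_incore_8": {"v5": 2, "v6": 1},
    "qwen3_decode_incore_12": {"v4": 256},
}
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "case_pointer_count_minimums.json"
    config_path.write_text(json.dumps(sample, indent=2), encoding="utf-8")
    CASE_POINTER_COUNT_MINIMUMS = load_case_pointer_count_minimums(config_path)
```

Validating at load time keeps a malformed entry from surfacing later as a confusing failure during test-case generation. The trade-off is one more file to keep in sync with the vendored `.pto` samples.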

Comment thread test/samples/runop.sh
fi

-for asset in "${sample_dir}"/*_golden.py "${sample_dir}"/*_compare.py; do
+for asset in "${sample_dir}"/*_golden.py "${sample_dir}"/*_compare.py "${sample_dir}"/*_golden_*.py; do

medium

The glob pattern "${sample_dir}"/*_golden_*.py might be redundant if *_golden.py already covers the necessary files. Ensure this does not lead to double-processing or unexpected file inclusion.
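
To check whether the patterns overlap, the matching can be simulated offline. The sketch below uses Python's `fnmatch.fnmatchcase`, which for these simple basename patterns behaves like the shell glob. `qwen3_decode_golden_lib.py` is a file name from the PR's validation command, while `example_golden.py` and `a_golden_b_golden.py` are made-up names used only to probe the patterns.

```python
from fnmatch import fnmatchcase

# The three basename patterns from the updated loop in runop.sh.
PATTERNS = ("*_golden.py", "*_compare.py", "*_golden_*.py")


def matching_patterns(filename):
    """Return the patterns that match the given basename, in loop order."""
    return [p for p in PATTERNS if fnmatchcase(filename, p)]


def dedupe(paths):
    """Drop repeated paths while preserving first-seen order."""
    return list(dict.fromkeys(paths))


lib_matches = matching_patterns("qwen3_decode_golden_lib.py")
case_matches = matching_patterns("example_golden.py")
overlap_matches = matching_patterns("a_golden_b_golden.py")
```

Here `lib_matches` contains only `*_golden_*.py`, so the new pattern is what picks up the shared golden library, and `case_matches` contains only `*_golden.py`. A name can match both only if it contains `_golden_` and also ends in `_golden.py`, as `a_golden_b_golden.py` does. If such names can occur, de-duplicating the expanded list (as `dedupe` does) would avoid processing a file twice.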

@reedhecre

reedhecre commented Apr 15, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: #491 test: add qwen3 decode A3/A5 PTO cases
  • Author: HecreReed
  • Base/Head: main / codex/qwen3-decode-a3-a5-cases
  • Head SHA: 8a5195652e73
  • Trigger: new commits pushed to the PR
  • Generated At: 2026-04-15T13:05:31Z
  • Previous Head SHA: c2361cc63664
  • Status: failed at codex-review (exit=1)

Summary

Review failed at stage codex-review: exit=1

Findings

No structured findings were generated because the review process failed early.

Log Tail

 .../Qwen3Tilelet/qwen3_decode_layer_incore_12.pto  |  31 --
 .../Qwen3Tilelet/qwen3_decode_layer_incore_13.pto  | 116 -----
 .../qwen3_decode_layer_incore_13_golden.py         |  73 ---
 .../Qwen3Tilelet/qwen3_decode_layer_incore_14.pto  |  75 ----
 .../qwen3_decode_layer_incore_14_golden.py         |  61 ---
 .../Qwen3Tilelet/qwen3_decode_layer_incore_15.pto  |  47 --
 .../Qwen3Tilelet/qwen3_decode_layer_incore_16.pto  |  49 ---
 .../Qwen3Tilelet/qwen3_decode_layer_incore_17.pto  | 104 -----
 .../Qwen3Tilelet/qwen3_decode_layer_incore_18.pto  |  75 ----
 .../Qwen3Tilelet/qwen3_decode_layer_incore_19.pto  |  36 --
 .../qwen3_decode_layer_incore_1_golden.py          |  77 ----
 .../Qwen3Tilelet/qwen3_decode_layer_incore_2.pto   | 146 ------
 .../qwen3_decode_layer_incore_2_golden.py          |  86 ----
 .../Qwen3Tilelet/qwen3_decode_layer_incore_3.pto   |  45 --
 .../Qwen3Tilelet/qwen3_decode_layer_incore_4.pto   |  46 --
 .../Qwen3Tilelet/qwen3_decode_layer_incore_5.pto   |  46 --
 .../Qwen3Tilelet/qwen3_decode_layer_incore_6.pto   |  88 ----
 .../Qwen3Tilelet/qwen3_decode_layer_incore_7.pto   |  92 ----
 .../Qwen3Tilelet/qwen3_decode_layer_incore_8.pto   |  30 --
 .../Qwen3Tilelet/qwen3_decode_layer_incore_9.pto   |  49 ---
 test/samples/runop.sh                              |  57 ++-
 101 files changed, 3736 insertions(+), 1843 deletions(-)
===== END STAGE clone rc=0 @ 2026-04-15 21:05:05 =====

===== STAGE codex-review @ 2026-04-15 21:05:05 =====
set -euo pipefail
cd '/tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/repo'
'codex' exec -C '/tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/repo' -s read-only -c 'model_provider="codereview"' -c 'model="gpt-5.4"' -c 'model_reasoning_effort="xhigh"' --output-schema '/tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/review_schema.json' -o '/tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/codex_last_message.json' --color never - < '/tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/review_prompt.txt'
OpenAI Codex v0.115.0 (research preview)
--------
workdir: /tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/repo
model: gpt-5.4
provider: codereview
approval: never
sandbox: read-only
reasoning effort: xhigh
reasoning summaries: none
session id: 019d913e-d039-7351-8a2e-7334a1b71975
--------
user
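You are now reviewing a GitHub PR.

Repository: hw-native-sys/PTOAS
PR: #491 test: add qwen3 decode A3/A5 PTO cases
Author: HecreReed
base branch: origin/main
head branch: HEAD (already checked out at the PR head)

Requirements:
1. Review only this PR's changes relative to origin/main; consult context files when necessary.
2. Focus on real correctness / regression / contract mismatch / CI / runtime / compatibility issues.
3. Do not raise purely stylistic suggestions or low-value speculation.
4. Report strictly by priority:
   - P1: highly likely to cause wrong results, compile/runtime failures, severe regressions, or release blockers
   - P2: important defects, behavior regressions, missing validation/tests, significant compatibility issues
   - P3: minor but clearly fixable issues
5. If there are no issues, write the summary as: no issues found in PR #491, and return findings=[].
6. If there are issues, keep the summary concise, and give each finding:
   - severity
   - title
   - body (explain why it is a problem, as specifically as possible)
   - file (a relative path where possible)
   - line (an integer if it can be determined, otherwise null)

Suggested starting points:
- git status --short
- git diff --stat origin/main...HEAD
- git diff --unified=80 origin/main...HEAD

The final output must strictly match the JSON schema.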

mcp startup: no servers
Reconnecting... 1/5 (unexpected status 503 Service Unavailable: Service temporarily unavailable, url: https://codex.0u0o.com/responses, request id: 22d43e09-d82b-4483-bf29-844a7afc1834)
Reconnecting... 2/5 (unexpected status 503 Service Unavailable: Service temporarily unavailable, url: https://codex.0u0o.com/responses, request id: 02a43d26-6d9b-41be-a757-b94c4cdcb8c7)
Reconnecting... 3/5 (unexpected status 503 Service Unavailable: Service temporarily unavailable, url: https://codex.0u0o.com/responses, request id: 1fb031c3-2368-4e4e-97f1-a2b28c3d3ed2)
Reconnecting... 4/5 (unexpected status 503 Service Unavailable: Service temporarily unavailable, url: https://codex.0u0o.com/responses, request id: 506c1d7d-e7b0-409d-8959-c8aa261baadf)
Reconnecting... 5/5 (unexpected status 503 Service Unavailable: Service temporarily unavailable, url: https://codex.0u0o.com/responses, request id: 9d101060-b369-415d-a4e0-04c472d0f42b)
ERROR: unexpected status 503 Service Unavailable: Service temporarily unavailable, url: https://codex.0u0o.com/responses, request id: 8fbfbf26-7a7a-414f-a282-3f51ba52883a
Warning: no last agent message; wrote empty content to /tmp/ptoas-pr-review-monitor/runs/20260415_210502_pr491/codex_last_message.json
===== END STAGE codex-review rc=1 @ 2026-04-15 21:05:31 =====

@reedhecre

A3 board test passed

  • Trigger mode: manual
  • Source commit: 487dd9251daa
  • Result summary: OK 17 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260415_184505_manual_pr491.log
  • Result TSV: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260415_184505_manual_pr491.tsv
  • Manual command: /run a3 qwen3_decode_incore_0 qwen3_decode_incore_1 qwen3_decode_incore_2 qwen3_decode_incore_3 qwen3_decode_incore_4 qwen3_decode_incore_5 qwen3_decode_incore_6 qwen3_decode_incore_7 qwen3_decode_incore_8 qwen3_decode_incore_9 qwen3_decode_incore_10 qwen3_decode_incore_11 qwen3_decode_incore_12 qwen3_decode_incore_13 qwen3_decode_incore_14 qwen3_decode_incore_15 qwen3_decode_incore_16
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_incore_0,qwen3_decode_incore_1,qwen3_decode_incore_2,qwen3_decode_incore_3,qwen3_decode_incore_4,qwen3_decode_incore_5,qwen3_decode_incore_6,qwen3_decode_incore_7,qwen3_decode_incore_8,qwen3_decode_incore_9,qwen3_decode_incore_10,qwen3_decode_incore_11,qwen3_decode_incore_12,qwen3_decode_incore_13,qwen3_decode_incore_14,qwen3_decode_incore_15,qwen3_decode_incore_16
  • Triggering comment: test: add qwen3 decode A3/A5 PTO cases #491 (comment)

@reedhecre

A5 board test passed

  • Trigger mode: manual
  • Source commit: 487dd9251daa
  • Source strategy: origin/main + PR merge commit 487dd9251daa
  • Result summary: OK 17 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260415_184511_manual_pr491.log
  • Result TSV: /root/ptoas-board-monitor-a5/logs/20260415_184511_manual_pr491.tsv
  • Manual command: /run a5 qwen3_decode_incore_0 qwen3_decode_incore_1 qwen3_decode_incore_2 qwen3_decode_incore_3 qwen3_decode_incore_4 qwen3_decode_incore_5 qwen3_decode_incore_6 qwen3_decode_incore_7 qwen3_decode_incore_8 qwen3_decode_incore_9 qwen3_decode_incore_10 qwen3_decode_incore_11 qwen3_decode_incore_12 qwen3_decode_incore_13 qwen3_decode_incore_14 qwen3_decode_incore_15 qwen3_decode_incore_16 --pto-level=level3
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_incore_0,qwen3_decode_incore_1,qwen3_decode_incore_2,qwen3_decode_incore_3,qwen3_decode_incore_4,qwen3_decode_incore_5,qwen3_decode_incore_6,qwen3_decode_incore_7,qwen3_decode_incore_8,qwen3_decode_incore_9,qwen3_decode_incore_10,qwen3_decode_incore_11,qwen3_decode_incore_12,qwen3_decode_incore_13,qwen3_decode_incore_14,qwen3_decode_incore_15,qwen3_decode_incore_16
  • PTOAS arguments: --pto-level=level3
  • Triggering comment: test: add qwen3 decode A3/A5 PTO cases #491 (comment)
