fix: fall back illegal A5 auto-sync pairs to barrier #482

Closed
HecreReed wants to merge 2 commits into hw-native-sys:main from HecreReed:codex/fix-a5-illegal-sync-426

@HecreReed
Collaborator

Summary

  • fall back A5 auto-inserted low-level sync to pto.barrier <PIPE_ALL> when either pipe is not legal for A5 low-level set_flag/wait_flag
  • keep existing low-level sync behavior unchanged for legal A5 pairs and for non-A5 targets
  • update A5 basic checks to assert barrier fallback instead of illegal pipe pairs

Root Cause

PR426 exposed an existing A5 sync codegen bug rather than introducing it.

For auto-inserted sync, PTOAS could emit low-level pairs like:

  • PIPE_MTE2 -> PIPE_MTE1
  • PIPE_MTE1 -> PIPE_M
  • PIPE_M -> PIPE_FIX

Ascend950/A5 rejects those pipe ids in low-level set_flag/wait_flag, so generated C++ failed in A5 SIM compile.

Scope

This patch is intentionally narrow:

  • only auto-generated sync emitted by SyncCodegen is changed
  • user-authored low-level sync ops are untouched
  • unsupported A5 low-level pairs now degrade to PIPE_ALL barrier
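
The decision described in the Summary, Root Cause, and Scope sections can be sketched roughly as follows. This is a minimal, standalone illustration and not the PTOAS implementation: the enums and the names `isLegalA5FlagPipe` and `chooseSync` are hypothetical, and the set of legal pipes is an assumption taken from the whitelist quoted in the Codex review comment on this PR (PIPE_S, PIPE_V, PIPE_MTE2, PIPE_MTE3).

```cpp
// Hypothetical stand-ins for illustration; the real PTOAS types differ.
enum class Pipe { S, V, M, MTE1, MTE2, MTE3, FIX, ALL };
enum class Arch { A3, A5 };
enum class SyncKind { SetWaitFlag, BarrierAll };

// Assumption: pipes accepted by A5 low-level set_flag/wait_flag, based on
// the whitelist quoted in the review (PIPE_S, PIPE_V, PIPE_MTE2, PIPE_MTE3).
constexpr bool isLegalA5FlagPipe(Pipe p) {
  return p == Pipe::S || p == Pipe::V || p == Pipe::MTE2 || p == Pipe::MTE3;
}

// Decide what to emit for one sync between a producer and a consumer pipe.
constexpr SyncKind chooseSync(Arch arch, bool autoInserted, Pipe src, Pipe dst) {
  // Non-A5 targets and user-authored sync keep the existing behavior.
  if (arch != Arch::A5 || !autoInserted)
    return SyncKind::SetWaitFlag;
  // On A5, both pipes must be legal; otherwise degrade to a PIPE_ALL barrier.
  if (isLegalA5FlagPipe(src) && isLegalA5FlagPipe(dst))
    return SyncKind::SetWaitFlag;
  return SyncKind::BarrierAll;
}

// The three pairs listed under Root Cause all degrade on A5 in this model.
static_assert(chooseSync(Arch::A5, true, Pipe::MTE2, Pipe::MTE1) == SyncKind::BarrierAll, "");
static_assert(chooseSync(Arch::A5, true, Pipe::MTE1, Pipe::M) == SyncKind::BarrierAll, "");
static_assert(chooseSync(Arch::A5, true, Pipe::M, Pipe::FIX) == SyncKind::BarrierAll, "");
// A pair where both pipes are in the assumed whitelist is left unchanged.
static_assert(chooseSync(Arch::A5, true, Pipe::MTE2, Pipe::V) == SyncKind::SetWaitFlag, "");
```

In this sketch a single illegal pipe on either side is enough to trigger the fallback, which matches the "when either pipe is not legal" wording in the Summary.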

Validation

  • ptoas --pto-arch a5 --enable-insert-sync test/basic/tmov_acc_mat_pipe_selection.pto | FileCheck ...
  • ptoas --pto-arch a5 --enable-insert-sync test/basic/tinsert_a5_pipe_selection.pto | FileCheck ...
  • ptoas --pto-arch a5 --enable-insert-sync test/basic/tmov_acc_to_vec_mode_a5_emitc.pto | FileCheck --check-prefix=A5 ...
  • ptoas --pto-arch a3 test/basic/tmov_acc_to_vec_mode_a5_emitc.pto | FileCheck --check-prefix=A3 ...
  • ptoas --pto-arch a5 --enable-insert-sync test/basic/textract_a5_scaling_pipe_selection.pto | FileCheck ...
  • smoke-checked PR426 Qwen tilelet cases qwen3_decode_layer_incore_{3,4,5,8}.pto with --pto-level=level3 --enable-insert-sync; illegal A5 pairs no longer appear in generated C++

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a fallback synchronization mechanism for the A5 architecture, replacing specific flag-based synchronization with a global pipe barrier (PIPE_ALL) when low-level pipes are illegal. This change affects both single-buffer and multi-buffer synchronization generation. Feedback indicates that the fallback barrier creation in CreateSetWaitOpForMultiBuffer lacks deduplication logic, which could lead to redundant barrier operations.

Comment on lines +391 to +400
if (shouldUseA5BarrierFallback(func_, sync)) {
  auto pipeAllAttr = getPipeAttr(rewriter, PipelineType::PIPE_ALL);
  if (beforeInsert || op->hasTrait<OpTrait::IsTerminator>()) {
    rewriter.setInsertionPoint(op);
  } else {
    rewriter.setInsertionPointAfter(op);
  }
  rewriter.create<pto::BarrierOp>(op->getLoc(), pipeAllAttr);
  return;
}

medium

The fallback barrier creation in CreateSetWaitOpForMultiBuffer also lacks deduplication logic. While the insertion point logic cannot be easily moved to the top due to the GetBufferSelected call later in the function, the barrier creation itself should still check for existing identical barriers to avoid redundancy.
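
One way to picture the deduplication the reviewer asks for is the sketch below. It models the op sequence as a plain `std::vector` rather than using the MLIR rewriter API, so `Op`, `isBarrierAll`, and `insertBarrierAllDeduped` are all hypothetical names introduced for illustration. It also assumes that "identical barrier" means a PIPE_ALL barrier directly adjacent to the insertion point; the real check in SyncCodegen may need to be broader or narrower.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of an op list; this is not the MLIR API.
struct Op {
  std::string name;  // e.g. "pto.barrier" or any other op name
  std::string pipe;  // only meaningful for barriers, e.g. "PIPE_ALL"
};

inline bool isBarrierAll(const Op &op) {
  return op.name == "pto.barrier" && op.pipe == "PIPE_ALL";
}

// Insert a PIPE_ALL barrier at index `pos` unless an identical barrier already
// sits immediately before or after that position.
// Returns true if a new barrier was inserted.
inline bool insertBarrierAllDeduped(std::vector<Op> &ops, std::size_t pos) {
  if (pos > ops.size())
    pos = ops.size();  // clamp to "append at end"
  const bool prevIsBarrier = pos > 0 && isBarrierAll(ops[pos - 1]);
  const bool nextIsBarrier = pos < ops.size() && isBarrierAll(ops[pos]);
  if (prevIsBarrier || nextIsBarrier)
    return false;  // a back-to-back PIPE_ALL barrier would be redundant
  ops.insert(ops.begin() + static_cast<std::ptrdiff_t>(pos),
             Op{"pto.barrier", "PIPE_ALL"});
  return true;
}
```

Checking only the immediate neighbors keeps the sketch cheap and conservative: it never drops a barrier that is separated from an existing one by real work.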

@reedhecre

reedhecre commented Apr 14, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: fix: fall back illegal A5 auto-sync pairs to barrier #482
  • Author: HecreReed
  • Base/Head: main / codex/fix-a5-illegal-sync-426
  • Head SHA: 7b45c747a15f
  • Trigger: new commits pushed to the PR
  • Generated At: 2026-04-14T12:43:21Z
  • Previous Head SHA: 47c51e8e38ee
  • Status: completed

Summary

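Found 1 P1 issue: the PR changes the output shape of A5 auto-sync (some set/wait pairs become PIPE_ALL barriers) but does not update the existing basic FileCheck assertions to match, which will very likely cause CI failures.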

Findings

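  1. P1 A5 fallback behavior has changed, but the corresponding basic test assertions were not updated (lib/PTO/Transforms/InsertSync/SyncCodegen.cpp:358)

The new A5 fallback logic in SyncCodegen degrades set/wait on any non-whitelisted pipe (anything other than PIPE_S/PIPE_V/PIPE_MTE2/PIPE_MTE3) directly to pto.barrier(PIPE_ALL). This changes the output that the current tests expect: for example, test/basic/textract_a5_scaling_pipe_selection.pto still asserts set_flag/wait_flag(PIPE_MTE2, PIPE_MTE1), and test/basic/tinsert_a5_pipe_selection.pto, test/basic/tmov_acc_mat_pipe_selection.pto, and test/basic/tmov_acc_to_vec_mode_a5_emitc.pto still assert set/wait for PIPE_M -> PIPE_FIX / PIPE_FIX -> PIPE_MTE1. The current PR does not include these assertion updates, so the FileCheck results will conflict with the new code path.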

@HecreReed
Collaborator Author

/run a5 qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19 --pto-level=level3

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: 5fc508713bf8
  • Source strategy: origin/main + PR merge commit 5fc508713bf8
  • Result summary: OK 14 / FAIL 6 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260415_111512_manual_pr482.log
  • Manual command: /run a5 qwen3_decode_layer_incore_0 qwen3_decode_layer_incore_1 qwen3_decode_layer_incore_2 qwen3_decode_layer_incore_3 qwen3_decode_layer_incore_4 qwen3_decode_layer_incore_5 qwen3_decode_layer_incore_6 qwen3_decode_layer_incore_7 qwen3_decode_layer_incore_8 qwen3_decode_layer_incore_9 qwen3_decode_layer_incore_10 qwen3_decode_layer_incore_11 qwen3_decode_layer_incore_12 qwen3_decode_layer_incore_13 qwen3_decode_layer_incore_14 qwen3_decode_layer_incore_15 qwen3_decode_layer_incore_16 qwen3_decode_layer_incore_17 qwen3_decode_layer_incore_18 qwen3_decode_layer_incore_19 --pto-level=level3
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19
  • PTOAS arguments: --pto-level=level3
  • Triggering comment: fix: fall back illegal A5 auto-sync pairs to barrier #482 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • qwen3_decode_layer_incore_8 (run, exit=2)
  • qwen3_decode_layer_incore_6 (run, exit=1)
  • qwen3_decode_layer_incore_5 (run, exit=2)
  • qwen3_decode_layer_incore_4 (run, exit=2)
  • qwen3_decode_layer_incore_3 (run, exit=2)
  • qwen3_decode_layer_incore_17 (run, exit=1)

@reedhecre

A5 board test failure details: PR #482

qwen3_decode_layer_incore_8

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:505:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:173:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
            load_cbuf_to_ca(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:282:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
            load_cbuf_to_cb(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:282:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:152:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
    mad(c, a, b, m, k, n, static_cast<uint8_t>(Phase), gemvCtrl, cmatrixSource, cmatrixInitVal);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:150:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:232:5: error: function type 'void (unsigned long) noexcept' of 'set_loop3_para' does not support the given target feature
    set_loop3_para(config);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
    copy_matrix_cc_to_gm(dstGlobalAddr, srcTileAddr, xmReg, xtReg);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
17 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_8_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_8_kernel.dir/qwen3_decode_layer_incore_8_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_8_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 11:17:35] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_8

qwen3_decode_layer_incore_6

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_6/main.cpp:141)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 105878] 2026-04-15-11:18:01.207.229 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 10, there is an aivec error exception, core id is 0, error code = 95, dump info: pc start: 0x100040800000, current: 0x1000408004ac, sc error info: 0xffffffffffff, su error info: 0xf7f7d23d139c5bd7,0xcc3fd0e010009bfd, mte error info: 0x200a1, vec error info: 0xe7dbff9e0017db84, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0x80000000.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(95) errorStr: The DDR address of the MTE instruction is out of range. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z27qwen3_decode_layer_incore_6PfS_Pu6__bf16S_S_S_S0_S_ii, fault kernel info ext=_Z27qwen3_decode_layer_incore_6PfS_Pu6__bf16S_S_S_S0_S_ii, program id=0, hash=1851969691321044957.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-04-15 11:18:05] ERROR: testcase failed (exit 1): qwen3_decode_layer_incore_6

qwen3_decode_layer_incore_5

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:173:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
            load_cbuf_to_ca(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:301:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
            load_cbuf_to_cb(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 1);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:301:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:152:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
    mad(c, a, b, m, k, n, static_cast<uint8_t>(Phase), gemvCtrl, cmatrixSource, cmatrixInitVal);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:150:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:232:5: error: function type 'void (unsigned long) noexcept' of 'set_loop3_para' does not support the given target feature
    set_loop3_para(config);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
    copy_matrix_cc_to_gm(dstGlobalAddr, srcTileAddr, xmReg, xtReg);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_5_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_5_kernel.dir/qwen3_decode_layer_incore_5_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_5_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 11:18:07] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_5

qwen3_decode_layer_incore_4

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:173:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
            load_cbuf_to_ca(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:301:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
            load_cbuf_to_cb(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 1);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:301:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:152:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
    mad(c, a, b, m, k, n, static_cast<uint8_t>(Phase), gemvCtrl, cmatrixSource, cmatrixInitVal);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:150:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:232:5: error: function type 'void (unsigned long) noexcept' of 'set_loop3_para' does not support the given target feature
    set_loop3_para(config);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
    copy_matrix_cc_to_gm(dstGlobalAddr, srcTileAddr, xmReg, xtReg);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_4_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_4_kernel.dir/qwen3_decode_layer_incore_4_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_4_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 11:18:09] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_4
qwen3_decode_layer_incore_3

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:337:5: error: function type 'void (unsigned long) noexcept' of 'set_mte2_nz_para' does not support the given target feature
    set_mte2_nz_para(mte2NzPara);                                      // only set once
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned char, unsigned short, unsigned int, unsigned long, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
            copy_gm_to_cbuf_multi_nd2nz(reinterpret_cast<__cbuf__ uint16_t *>(dst),
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TLoad.hpp:250:13: error: function type 'void (__cbuf__ unsigned short *, __gm__ unsigned short *, unsigned char, unsigned long, unsigned short, unsigned int, unsigned long, bool, bool) noexcept' of 'copy_gm_to_cbuf_multi_nd2nz' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:173:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
            load_cbuf_to_ca(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:135:13: error: function type 'void (__ca__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_ca' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:301:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned char, unsigned short, unsigned short, unsigned char, bool, unsigned int) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
            load_cbuf_to_cb(dstAddr, srcAddr, mStartPosition, kStartPosition, mStep, kStep, srcStride, dstStride, 1);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TExtract.hpp:301:13: error: function type 'void (__cb__ __bf16 *, __cbuf__ __bf16 *, unsigned short, unsigned short, unsigned char, unsigned char, short, unsigned short, bool) noexcept' of 'load_cbuf_to_cb' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:152:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
    mad(c, a, b, m, k, n, static_cast<uint8_t>(Phase), gemvCtrl, cmatrixSource, cmatrixInitVal);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:38:5: error: function type 'void (__cc__ float *, __ca__ __bf16 *, __cb__ __bf16 *, unsigned short, unsigned short, unsigned short, unsigned char, bool, bool, bool) noexcept' of 'mad' does not support the given target feature
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:32:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/pto-inst.hpp:23:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr.hpp:18:
In file included from /tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/common/pto_instr_impl.hpp:150:
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:232:5: error: function type 'void (unsigned long) noexcept' of 'set_loop3_para' does not support the given target feature
    set_loop3_para(config);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
    copy_matrix_cc_to_gm(dstGlobalAddr, srcTileAddr, xmReg, xtReg);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/payload/pto-isa/include/pto/npu/a5/TStore.hpp:233:5: error: function type 'void (__gm__ float *, __cc__ float *, unsigned long, unsigned long) noexcept' of 'copy_matrix_cc_to_gm' does not support the given target feature
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_3_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_3_kernel.dir/qwen3_decode_layer_incore_3_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_3_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 11:18:11] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_3
qwen3_decode_layer_incore_17

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507015 (/tmp/ptoas-board-monitor-a5/runs/20260415_111512_manual_pr482/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_17/main.cpp:108)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 110830] 2026-04-15-11:19:38.491.647 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 11, there is an aicore error exception, core id is 0, error code = 95, dump info: pc start: 0x100040800008, current: 0x100040800188, sc error info: 0xffffffffffff, su error info: 0xfefffeec1efe9387,0x7fddfebff8007fff, mte error info: 0x22601000002005d, vec error info: 0, cube error info: 0, l1 error info: 0xffbf0017f6ee, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0x80000000.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(95) errorStr: The DDR address of the MTE instruction is out of range. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       The error from device(chipId:0, dieId:0), serial number is 12, there is an aivec error exception, core id is 0, error code = 0, dump info: pc start: 0x100040800764, current: 0x1000408008b8, sc error info: 0xffffffffffff, su error info: 0xf7f7d23d139c5bd7,0xcc3fd0e010009bfd, mte error info: 0x200a1, vec error info: 0xe7dbff9e0017db84, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
       The extend info: errcode:(0) errorStr: timeout or trap error. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       The error from device(chipId:0, dieId:0), serial number is 12, there is an aivec error exception, core id is 1, error code = 0, dump info: pc start: 0x100040800764, current: 0x100040800b68, sc error info: 0xffffffffffff, su error info: 0x2985b4fc1dfeefdf,0xe64ef56bc000acdb, mte error info: 0xdebf63730007defe, vec error info: 0x4d6c3f7f001cfccf, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
       Kernel task happen error, retCode=0x26, [aicore exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AICORE Kernel task happen error, retCode=0x26.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:(no result)[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z28qwen3_decode_layer_incore_17Pu6__bf16S_S_S_i, fault kernel info ext=_Z28qwen3_decode_layer_incore_17Pu6__bf16S_S_S_i, program id=0, hash=726938774213693469.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507015[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-04-15 11:19:44] ERROR: testcase failed (exit 1): qwen3_decode_layer_incore_17

@HecreReed HecreReed closed this Apr 15, 2026