Skip to content

fix: allow addptr for pipe gm_addr lowering#486

Open
FangRui0 wants to merge 1 commit intohw-native-sys:mainfrom
FangRui0:fix_verify
Open

fix: allow addptr for pipe gm_addr lowering#486
FangRui0 wants to merge 1 commit intohw-native-sys:mainfrom
FangRui0:fix_verify

Conversation

@FangRui0
Copy link
Copy Markdown
Contributor

@FangRui0 FangRui0 commented Apr 15, 2026

fix issue #481
问题
./build/tools/ptoas/ptoas --pto-arch=a3 --enable-insert-sync addptr.pto 失败,报 addptr 类型/使用链路相关错误。

原因
aic/aiv_initialize_pipe 会先降到 initialize_l2g2l_pipe,但 addptr 在这条 gm_addr 路径上没有被及时消解;同时原有规则对 addptr 使用场景限制过严。

解决方案
在 PTOViewToMemref 中新增对 initialize_l2g2l_pipe(gm_addr) 场景的 addptr 折叠(转为 memref.reinterpret_cast offset 语义),并移除强制失败限制,让未折叠 addptr 交由后续流程处理。这样目标命令可通过。

@FangRui0
Copy link
Copy Markdown
Contributor Author

/run a3

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new transformation stage (Stage 1.75) in PTOViewToMemref.cpp to fold AddPtrOp chains used by InitializeL2G2LPipeOp into memref.reinterpret_cast operations, and removes the previous strict error checking for unhandled AddPtrOp instances. Feedback highlights that the current logic is too restrictive because it only processes AddPtrOp if all its users are InitializeL2G2LPipeOp, which causes failures when pointers are shared across different operations. It is recommended to refactor the logic to iterate over InitializeL2G2LPipeOp instances directly and to move the IRRewriter instantiation outside the loop for better performance.

Comment thread lib/PTO/Transforms/PTOViewToMemref.cpp Outdated
Comment on lines +912 to +953
SmallVector<mlir::pto::AddPtrOp, 8> addPtrsForPipeInit;
func.walk([&](mlir::pto::AddPtrOp op) {
bool eligible = !op->use_empty();
for (Operation *user : op->getUsers()) {
auto init = dyn_cast<mlir::pto::InitializeL2G2LPipeOp>(user);
if (!init || init.getGmAddr() != op.getResult()) {
eligible = false;
break;
}
}
if (eligible)
addPtrsForPipeInit.push_back(op);
});

for (auto op : addPtrsForPipeInit) {
IRRewriter rewriter(ctx);
rewriter.setInsertionPoint(op);
Location loc = op.getLoc();

Value base = op.getPtr();
Value totalOffset = ensureIndex(rewriter, loc, op.getOffset(), op);
while (auto add = base.getDefiningOp<mlir::pto::AddPtrOp>()) {
Value off = ensureIndex(rewriter, loc, add.getOffset(), add);
totalOffset = rewriter.create<arith::AddIOp>(loc, totalOffset, off);
base = add.getPtr();
}

auto baseMrTy = dyn_cast<MemRefType>(base.getType());
if (!baseMrTy || baseMrTy.getRank() != 1)
continue;

int64_t dyn = ShapedType::kDynamic;
auto layout = StridedLayoutAttr::get(ctx, dyn, {dyn});
auto targetTy = MemRefType::get({dyn}, baseMrTy.getElementType(), layout,
baseMrTy.getMemorySpace());
SmallVector<OpFoldResult, 1> sizes{rewriter.getIndexAttr(1)};
SmallVector<OpFoldResult, 1> strides{rewriter.getIndexAttr(1)};
auto rc = rewriter.create<memref::ReinterpretCastOp>(
loc, targetTy, base, OpFoldResult(totalOffset), sizes, strides);
rc->setAttr("pto.addptr_trace", rewriter.getUnitAttr());
rewriter.replaceOp(op, rc.getResult());
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of Stage 1.75 is too restrictive because it only folds an AddPtrOp if all of its users are InitializeL2G2LPipeOp. This logic fails in scenarios where an addptr is shared between an initialize_l2g2l_pipe and another operation (even if that other operation is already handled, like a load_scalar or another addptr in a chain).

For example, in a chain arg -> add1 -> init1 and arg -> add1 -> add2 -> init2, add1 will not be collected because it has a user (add2) that is not an init_pipe. Consequently, init1 will continue to use add1, which will have an invalid memref input after Stage 0, leading to verification failures.

It is recommended to follow the pattern used in Stage 1 (MakeTensorView): walk the InitializeL2G2LPipeOp operations instead, and fold the addptr chain specifically for the gm_addr operand. The existing cleanup loop will then correctly erase any addptr that becomes unused.

Additionally, the IRRewriter should be instantiated once outside the loop for better efficiency.

      SmallVector<mlir::pto::InitializeL2G2LPipeOp, 8> pipeInits;
      func.walk([&](mlir::pto::InitializeL2G2LPipeOp op) { pipeInits.push_back(op); });

      IRRewriter rewriter(ctx);
      for (auto op : pipeInits) {
        Value gmAddr = op.getGmAddr();
        auto addPtr = gmAddr.getDefiningOp<mlir::pto::AddPtrOp>();
        if (!addPtr)
          continue;

        rewriter.setInsertionPoint(op);
        Location loc = op.getLoc();

        Value base = addPtr.getPtr();
        Value totalOffset = ensureIndex(rewriter, loc, addPtr.getOffset(), addPtr);
        while (auto add = base.getDefiningOp<mlir::pto::AddPtrOp>()) {
          Value off = ensureIndex(rewriter, loc, add.getOffset(), add);
          totalOffset = rewriter.create<arith::AddIOp>(loc, totalOffset, off);
          base = add.getPtr();
        }

        auto baseMrTy = dyn_cast<MemRefType>(base.getType());
        if (!baseMrTy || baseMrTy.getRank() != 1)
          continue;

        int64_t dyn = ShapedType::kDynamic;
        auto layout = StridedLayoutAttr::get(ctx, dyn, {dyn});
        auto targetTy = MemRefType::get({dyn}, baseMrTy.getElementType(), layout,
                                        baseMrTy.getMemorySpace());
        SmallVector<OpFoldResult, 1> sizes{rewriter.getIndexAttr(1)};
        SmallVector<OpFoldResult, 1> strides{rewriter.getIndexAttr(1)};
        auto rc = rewriter.create<memref::ReinterpretCastOp>(
            loc, targetTy, base, OpFoldResult(totalOffset), sizes, strides);
        rc->setAttr("pto.addptr_trace", rewriter.getUnitAttr());
        op.setGmAddr(rc.getResult());
      }

@reedhecre
Copy link
Copy Markdown

reedhecre commented Apr 15, 2026

Codex Review

该评论由 review 机器人自动更新。

  • PR: fix: allow addptr for pipe gm_addr lowering #486 fix: allow addptr for pipe gm_addr lowering
  • Author: FangRui0
  • Base/Head: main / fix_verify
  • Head SHA: 73ee8e1fa1cc
  • Trigger: PR 有新提交
  • Generated At: 2026-04-16T02:20:21Z
  • Previous Head SHA: 6c290ac2d022
  • Status: completed

Summary

未检查到 PR #486 存在问题,并返回 findings=[].

Findings

No issues found.

@reedhecre
Copy link
Copy Markdown

A3 板测完成(有跳过)

  • 触发方式:manual
  • 源码提交:71a6946a12ed
  • 结果汇总:OK 184 / FAIL 0 / SKIP 1
  • 日志:/home/zhongxuan/ptoas-board-monitor/runtime/logs/20260415_160204_manual_pr486.log
  • 结果 TSV:/home/zhongxuan/ptoas-board-monitor/runtime/logs/20260415_160204_manual_pr486.tsv
  • 手动指令:/run a3
  • 触发人:FangRui0
  • 触发评论:fix: allow addptr for pipe gm_addr lowering #486 (comment)

@FangRui0 FangRui0 force-pushed the fix_verify branch 3 times, most recently from 6c290ac to 45f61f3 Compare April 16, 2026 02:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants