Skip to content

Feat/issue 356 aicpu launch new interface#537

Open
puddingfjz wants to merge 6 commits intohw-native-sys:mainfrom
puddingfjz:feat/issue-356-aicpu-launch-new-interface
Open

Feat/issue 356 aicpu launch new interface#537
puddingfjz wants to merge 6 commits intohw-native-sys:mainfrom
puddingfjz:feat/issue-356-aicpu-launch-new-interface

Conversation

@puddingfjz
Copy link
Copy Markdown

Summary

  • Migrate AICPU kernel launch from legacy rtAicpuKernelLaunchExWithArgs API to new rtsLaunchCpuKernel / rtsBinaryLoadFromFile / rtsFuncGetByName interface
  • Affects both a2a3 and a5 platforms (onboard backend)
  • Uses JSON-based loading approach matching pypto reference

Changes

a2a3 platform

  • Add AicpuLoader abstraction class with conditional compilation
  • Implement new interface path: rtsBinaryLoadFromFile with cpuKernelMode=0
  • Generate temporary JSON descriptor with filename-only .so reference
  • Update device_runner to use AicpuLoader

a5 platform

  • Same implementation as a2a3 for consistency
  • Use rtsBinaryLoadFromFile instead of rtsBinaryLoadFromData

Build

  • Add BUILD_WITH_NEW_CANN compile flag (default: ON)
  • New RTS header include path for CANN 7.0+

Test plan

  • a2a3 onboard hardware test PASSED
  • All three new APIs actually called

Fixes #356

puddingfjz and others added 2 commits April 13, 2026 17:29
Implements Issue hw-native-sys#356: Migrate from legacy rtAicpuKernelLaunchExWithArgs
to the new rtsLaunchCpuKernel / rtsBinaryLoadFromFile / rtsFuncGetByName
interface available in CANN 7.0+.

Changes:
- Add AicpuLoader abstraction class supporting both legacy and new interfaces
- New interface: Use rtsBinaryLoadFromFile with JSON descriptor (pypto approach)
  - JSON contains only filename (libaicpu_kernel.so), runtime finds via library path
  - No temporary .so file creation needed
  - cpuKernelMode=0 (JSON only mode)
- Legacy interface: Unchanged behavior when BUILD_WITH_NEW_CANN=OFF
- BUILD_WITH_NEW_CANN compile flag controls which interface is used

The new interface provides forward compatibility and aligns with pypto's
approach. The legacy interface remains as fallback for older CANN versions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align a5 implementation with a2a3 by using rtsBinaryLoadFromFile with
cpuKernelMode=0 instead of rtsBinaryLoadFromData. This matches the
pypto approach and ensures consistency across platforms.

Changes:
- Use rtsBinaryLoadFromFile + JSON descriptor (cpuKernelMode=0)
- Generate temporary JSON file with filename-only .so reference
- Add json_file_path_ member for cleanup

Fixes hw-native-sys#356

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an AicpuLoader abstraction to support both legacy and new CANN 7.0+ interfaces for launching AICPU kernels across the a2a3 and a5 platforms. The implementation includes build system updates, runtime JSON descriptor generation, and integration into the DeviceRunner. Feedback focuses on improving build portability by avoiding hardcoded architecture paths and enhancing the robustness of manual JSON construction. Additionally, the removal of a default parameter in the a2a3 platform's header is identified as a breaking change that violates cross-platform consistency. Suggestions were also made to reduce coupling in the kernel name mapping.


target_link_directories(host_runtime
PRIVATE
${ASCEND_HOME_PATH}/aarch64-linux/lib64
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding the architecture-specific path aarch64-linux/lib64 reduces the portability of the build system. The previous implementation used more generic paths. Consider using ${CMAKE_SYSTEM_PROCESSOR}-linux or a similar variable to maintain flexibility across different Ascend platforms or simulation environments.

Comment on lines +46 to +74
static bool GenerateAicpuOpJson(const std::string& json_path, const std::vector<AicpuOpConfig>& op_configs) {
std::ofstream json_file(json_path);
if (!json_file.is_open()) {
LOG_ERROR("Failed to open JSON file for writing: %s", json_path.c_str());
return false;
}

json_file << "{\n";
for (size_t i = 0; i < op_configs.size(); ++i) {
const auto& config = op_configs[i];
json_file << " \"" << config.opType << "\": {\n";
json_file << " \"opInfo\": {\n";
json_file << " \"functionName\": \"" << config.functionName << "\",\n";
json_file << " \"kernelSo\": \"" << config.kernelSo << "\",\n";
json_file << " \"opKernelLib\": \"" << config.opKernelLib << "\",\n";
json_file << " \"computeCost\": \"" << config.computeCost << "\",\n";
json_file << " \"engine\": \"" << config.engine << "\",\n";
json_file << " \"flagAsync\": \"" << config.flagAsync << "\",\n";
json_file << " \"flagPartial\": \"" << config.flagPartial << "\",\n";
json_file << " \"userDefined\": \"" << config.userDefined << "\"\n";
json_file << " }\n";
json_file << " }" << (i < op_configs.size() - 1 ? "," : "") << "\n";
}
json_file << "}\n";
json_file.close();

LOG_INFO("Generated AICPU op info JSON: %s", json_path.c_str());
return true;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Manual JSON construction using string concatenation is fragile and does not handle character escaping. If any of the configuration strings (such as functionName or opType) contain special characters like double quotes, the resulting JSON will be invalid. It is safer to use a dedicated JSON library or at least implement basic string escaping for these fields.

Comment on lines +99 to +102
std::unordered_map<std::string, std::string> name_mapping = {
{"DynTileFwkKernelServerInit", "DynTileFwkBackendKernelServerInit"},
{"DynTileFwkKernelServer", "DynTileFwkBackendKernelServer"}
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The name_mapping between opType and functionName is hardcoded within the AicpuLoader implementation. This creates a tight coupling and makes the code harder to maintain if new kernels are added in the future. Consider passing this mapping as an argument to init_with_binary or defining it in a more central configuration location.

int
run(Runtime &runtime, int block_dim, int device_id, const std::vector<uint8_t> &aicpu_so_binary,
const std::vector<uint8_t> &aicore_kernel_binary, int launch_aicpu_num = 1);
const std::vector<uint8_t> &aicore_kernel_binary, int launch_aicpu_num);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The default value for launch_aicpu_num has been removed from the run method declaration. This is a breaking change for any existing callers that rely on the default value. Additionally, this change is inconsistent with the a5 platform's device_runner.h, which still retains the default value. Per repository rules, files shared across different runtimes should maintain identical code for consistency. Consider restoring the default value to maintain backward compatibility.

References
  1. For files shared across different runtimes, maintain identical code for consistency, even if a more robust implementation is possible.


target_link_directories(host_runtime
PRIVATE
${ASCEND_HOME_PATH}/aarch64-linux/lib64
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding the architecture-specific path aarch64-linux/lib64 reduces the portability of the build system. Consider using ${CMAKE_SYSTEM_PROCESSOR}-linux or a similar variable to maintain flexibility across different platforms.

Comment on lines +46 to +74
static bool GenerateAicpuOpJson(const std::string& json_path, const std::vector<AicpuOpConfig>& op_configs) {
std::ofstream json_file(json_path);
if (!json_file.is_open()) {
LOG_ERROR("Failed to open JSON file for writing: %s", json_path.c_str());
return false;
}

json_file << "{\n";
for (size_t i = 0; i < op_configs.size(); ++i) {
const auto& config = op_configs[i];
json_file << " \"" << config.opType << "\": {\n";
json_file << " \"opInfo\": {\n";
json_file << " \"functionName\": \"" << config.functionName << "\",\n";
json_file << " \"kernelSo\": \"" << config.kernelSo << "\",\n";
json_file << " \"opKernelLib\": \"" << config.opKernelLib << "\",\n";
json_file << " \"computeCost\": \"" << config.computeCost << "\",\n";
json_file << " \"engine\": \"" << config.engine << "\",\n";
json_file << " \"flagAsync\": \"" << config.flagAsync << "\",\n";
json_file << " \"flagPartial\": \"" << config.flagPartial << "\",\n";
json_file << " \"userDefined\": \"" << config.userDefined << "\"\n";
json_file << " }\n";
json_file << " }" << (i < op_configs.size() - 1 ? "," : "") << "\n";
}
json_file << "}\n";
json_file.close();

LOG_INFO("Generated AICPU op info JSON: %s", json_path.c_str());
return true;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Manual JSON construction using string concatenation is fragile and does not handle character escaping. If any of the configuration strings contain double quotes, the resulting JSON will be invalid. It is safer to use a dedicated JSON library or implement string escaping.

puddingfjz and others added 4 commits April 13, 2026 18:14
Remove profiling_logs_* and *.log entries that were added
during development and testing. These are not needed for
the production codebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Apply clang-format pointer/reference spacing
- Fix copyright header: "WITHOUT WARRANTIES OF ANY KIND" (not "OR")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Revert hardcoded aarch64-linux path in CMakeLists.txt, use portable paths
- Restore default parameter for launch_aicpu_num in device_runner.h
- Add documentation explaining JSON construction and name_mapping design

The JSON construction uses manual string concatenation without a library.
This is safe because kernel names are controlled strings without special
characters, matching pypto's approach for similar AICPU op descriptors.

The name_mapping from opType to functionName is specific to the Ascend
tile framework kernels and is unlikely to change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace rtAicpuKernelLaunchExWithArgs with rtsBinaryLoadFromFile/
  rtsFuncGetByName/rtsLaunchCpuKernel (BUILD_WITH_NEW_CANN=ON path)
- Add two-layer architecture: outer libaicpu_dispatcher.so + inner
  runtime SO. Dispatcher saves inner SO to /tmp/aicpu_kernels/ and
  forwards kernel calls via dlopen/dlsym
- Add LoadAicpuOp class to manage JSON generation and function handles
- Generate JSON in SO directory with matching basename (cpuKernelMode=1)
- Remove static linking (-static-libstdc++ -static-libgcc), add
  -Wl,-Bsymbolic,--build-id, strip debug info to match CANN built-ins
- Update a2a3 and a5 platforms identically

Known issue: scheduler fails with 507018 (symbol lookup failure for
DynTileFwkKernelServerNull in libaicpu_dispatcher.so). Host-side
API calls succeed, scheduler-side dlsym fails despite symbol being
present in SO binary. Investigation ongoing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Migrate AICPU launch to new rtsLaunchCpuKernel interface (BUILD_WITH_NEW_CANN)

1 participant