Expose CoreML compute units as gpu device IDs in onnx-coreml backend#2401
Merged
borg323 merged 2 commits into LeelaChessZero:master on Mar 29, 2026
Conversation
Replace the hardcoded ProfileComputePlan=1 provider option with a configurable MLComputeUnits option (default: ALL). Accepts the same values as the CoreML MLComputeUnits enum: ALL, CPU_AND_NE, CPU_ONLY, CPU_AND_GPU. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
On a MacBook Pro M5 Max this PR gives a substantial improvement in nps using multiplexing compared to simple onnx-coreml. Results for nets 792013, 11248, 771473 and BT4-1024x15x32h-swa-6147500.pb.gz:
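For reference, a multiplexed invocation of this kind might look roughly like the following. This is a sketch based on lc0's general `--backend-opts` syntax; the exact option string and weights file name are assumptions, not taken from this PR, so check `lc0 --help` for the syntax your build accepts.

```shell
# Hypothetical example: run two onnx-coreml instances behind the
# multiplexing backend, one on the GPU (gpu=0) and one on the
# Neural Engine (gpu=1). The weights file name is illustrative.
./lc0 benchmark \
  --backend=multiplexing \
  --backend-opts="(backend=onnx-coreml,gpu=0),(backend=onnx-coreml,gpu=1)" \
  --weights=BT4-1024x15x32h-swa-6147500.pb.gz
```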
Menkib64
reviewed
Mar 20, 2026
batch_size_ = opts.GetOrDefault<int>("batch", default_batch);
steps_ = opts.GetOrDefault<int>("steps", default_steps);
min_batch_size_ = opts.GetOrDefault<int>("min_batch", default_min_batch);
compute_units_ = opts.GetOrDefault<std::string>("compute_units", "ALL");
Contributor
Backend can expose different compute units as separate GPUs. GPU should get id 0 and neural engine id 1. This would use an existing option in a logical way.
Replace the custom `compute_units` string option with the standard `gpu` integer option used by all other GPU backends. gpu=0 (default) selects CPUAndGPU, gpu=1 selects CPUAndNeuralEngine, and any other value uses ALL. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
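The id-to-enum mapping described in this commit can be sketched as a small helper. This is illustrative only, not the actual lc0 code; the function name and the string return type are invented for the example.

```cpp
#include <cassert>
#include <string>

// Illustrative mapping from the backend's "gpu" option to the CoreML
// MLComputeUnits value, following the rule in the commit message above:
// gpu=0 -> CPUAndGPU, gpu=1 -> CPUAndNeuralEngine, anything else -> ALL.
std::string ComputeUnitsForGpuId(int gpu) {
  switch (gpu) {
    case 0:
      return "CPUAndGPU";           // default: CPU + GPU
    case 1:
      return "CPUAndNeuralEngine";  // target the Neural Engine
    default:
      return "ALL";                 // any other value: all hardware
  }
}
```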
Menkib64
approved these changes
Mar 21, 2026
Summary
- Expose a gpu device ID parameter in the onnx-coreml backend, allowing users to control which Apple hardware accelerators CoreML uses for inference.
- Remove ProfileComputePlan=1 from the CoreML provider options (it previously caused ~111 s warm session creation; without it, warm startup is ~46 s, a ~58% improvement).

Usage
The gpu parameter selects which CoreML compute units to use:

- gpu=0 (default): CPU + GPU (CPUAndGPU)
- gpu=1: CPU + Neural Engine (CPUAndNeuralEngine)
- gpu=2 or higher: all available hardware (ALL: CPU, GPU, Neural Engine)

This is most useful with the multiplexing backend to run separate instances targeting different compute units simultaneously.

Motivation
On Apple Silicon, the GPU and Neural Engine are separate hardware units that can run simultaneously. By running two onnx-coreml instances via the multiplexing backend, one targeting gpu=0 (CPUAndGPU) and one targeting gpu=1 (CPUAndNeuralEngine), both accelerators can be kept busy in parallel, increasing overall inference throughput compared to a single instance.

Using the existing gpu parameter (instead of a custom compute_units string option) keeps the interface consistent with other backends like CUDA, where gpu=N selects a device.

Test plan
- Run the onnx-coreml backend with gpu=0 and gpu=1 and verify inference produces valid results
- Confirm the faster session creation after removing ProfileComputePlan=1
- Verify that comparing output between gpu=0 and gpu=1 shows acceptable numerical differences

🤖 Generated with Claude Code