[CoreML] Add ONE_BLOB multimethod weight sharing strategy #18531
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18531
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Cancelled Job, 8 Unrelated Failures as of commit ea5ee0a with merge base e0e10cc
NEW FAILURE - The following job has failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
any issues @lucylq ?
def test_multifunction_one_blob_simple_model(self):
    """Test exporting a simple model using ONE_BLOB weight sharing strategy."""
    model = self.SimpleModel()
For ONE_BLOB, I think it's also worth testing a model with different partitions in forward/prefill and make sure it still works - I guess that is the major difference between it and the POSITIONAL strategy.
f"'{first_method}' has {num_partitions}. POSITIONAL weight sharing "
"strategy requires all methods to have the same number of partitions. "
"Use MULTIMETHOD_WEIGHT_SHARING_STRATEGY.DISABLED if methods should "
"be processed independently."
nit: user can also select MULTIMETHOD_WEIGHT_SHARING_STRATEGY.ONE_BLOB now
method_spec = method_model.get_spec()
input_names = [inp.name for inp in method_spec.description.input]
output_names = [out.name for out in method_spec.description.output]
methods_metadata[method_name] = MethodMetadata(
Is this a problem? If a single method has multiple partitions, this is overwritten.
)
return MULTIMETHOD_WEIGHT_SHARING_STRATEGY.POSITIONAL
return MULTIMETHOD_WEIGHT_SHARING_STRATEGY.DISABLED
Import to internal? See if this change breaks anything.
I'll import to internal. I think RL is explicitly setting this, though.
915723b to ea5ee0a
@metascroy has imported this pull request. If you are a Meta employee, you can view this in D99755766.
std::string method_name_str = [methodName UTF8String];
const MethodMetadata* method_metadata = metadataValue.get_method_metadata(method_name_str);
if (functionName == nil || functionName.length == 0) {
TODO: make sure these changes are tested against new cache system stack.
Adds a new ONE_BLOB weight sharing strategy for CoreML multifunction models that combines all partitions from all methods into a single multifunction model, stored as one entry in NamedDataStore.
Motivation
The existing POSITIONAL strategy requires all methods to have the same number of partitions and creates one multifunction model per partition index. This works when partitions at the same index are aligned so they naturally share weights, but we want to relax that restriction.
Design
POSITIONAL (existing): N blobs, one per partition index. Each blob contains that partition from every method. Requires partition count alignment.
combined_partition_0.mlpackage → functions: {forward, prefill}
combined_partition_1.mlpackage → functions: {forward, prefill}
combined_partition_2.mlpackage → functions: {forward, prefill}
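The alignment requirement can be sketched as a simple validation pass. This is a minimal sketch, not the PR's actual code; the helper name `check_positional_alignment` and the `method_partitions` mapping are hypothetical:

```python
from typing import Dict, List

def check_positional_alignment(method_partitions: Dict[str, List[str]]) -> int:
    """Ensure every method has the same number of partitions (hypothetical helper).

    Returns the shared partition count, or raises if counts differ,
    mirroring the error POSITIONAL raises in the PR.
    """
    counts = {name: len(parts) for name, parts in method_partitions.items()}
    first_method, first_count = next(iter(counts.items()))
    for name, count in counts.items():
        if count != first_count:
            raise ValueError(
                f"'{first_method}' has {first_count} partitions but '{name}' has "
                f"{count}. POSITIONAL weight sharing requires equal partition counts."
            )
    # POSITIONAL then builds one multifunction model per partition index:
    # blob i contains partition i of every method.
    return first_count
```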
ONE_BLOB (new): 1 blob containing all partition × method combinations. No partition count alignment required. Function names use {method}__{partition_idx} encoding for CoreML dispatch; metadata is keyed by method name for runtime compatibility.
combined_all.mlpackage → functions: {forward__0, forward__1, forward__2,
prefill__0, prefill__1, prefill__2}
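The `{method}__{partition_idx}` encoding can be sketched like this. These helpers are illustrative only (the names are not the PR's actual code); they show how all method x partition combinations fold into one function list and how a function name decodes back to a method for metadata lookup:

```python
from typing import Dict, List, Tuple

def encode_function_name(method: str, partition_idx: int) -> str:
    """Build the per-function name stored inside the single ONE_BLOB model."""
    return f"{method}__{partition_idx}"

def decode_function_name(function_name: str) -> Tuple[str, int]:
    """Recover (method, partition_idx); metadata stays keyed by method name."""
    method, _, idx = function_name.rpartition("__")
    return method, int(idx)

def one_blob_function_names(method_partition_counts: Dict[str, int]) -> List[str]:
    """All method x partition combinations that go into the single blob."""
    return [
        encode_function_name(method, i)
        for method, n in method_partition_counts.items()
        for i in range(n)
    ]
```

Note that unlike POSITIONAL, nothing here requires the per-method counts to match, which is the point of ONE_BLOB.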
No runtime changes required. The existing CMJR JSON reference mechanism (functionName field) was designed to support arbitrary function names.
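As an illustration of why no runtime change is needed, a compiled-model reference of roughly this shape already carries a free-form function name. The field names below are assumptions for illustration, not copied from the actual CMJR schema:

```python
import json

# Hypothetical shape of a CMJR-style JSON reference. Only the idea that
# the functionName field is free-form matters; key names are assumed.
ref_positional = {"modelKey": "combined_partition_1", "functionName": "prefill"}
ref_one_blob = {"modelKey": "combined_all", "functionName": "prefill__1"}

# The runtime looks up the blob by key and selects the function by name,
# so an encoded name like "prefill__1" dispatches through the same path
# as a plain name like "prefill".
serialized = json.dumps(ref_one_blob)
```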
Test plan
Existing CI + new unit test