I ran into this while trying to write an end-to-end regression test for a Metal conv_general output-offset bug.
Reproducer
import mlx.core as mx
n = (2**31) // (64 * 64) + 2
x = mx.ones((n, 8, 8, 1), dtype=mx.float16)
w = mx.ones((1, 1, 1, 1), dtype=mx.float16)
y = mx.conv_general(x, w, input_dilation=(9, 9))
mx.eval(y)
Output shape: (524290, 64, 64, 1), total elements: 2147491840.
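The shape arithmetic can be checked without MLX. This is a minimal sketch in plain Python, assuming the usual input-dilation rule (an extent of `size` becomes `(size - 1) * dilation + 1`) and default stride 1 with no padding; the helper names are hypothetical and not part of MLX.

```python
# Pure-Python shape arithmetic; does not import or call MLX.

def dilated_extent(size: int, dilation: int) -> int:
    """Extent after inserting (dilation - 1) zeros between neighbouring input elements."""
    return (size - 1) * dilation + 1


def conv_out_extent(in_extent: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Standard convolution output extent along one spatial axis."""
    return (in_extent + 2 * padding - kernel) // stride + 1


n = (2**31) // (64 * 64) + 2                              # batch size from the reproducer
side = conv_out_extent(dilated_extent(8, 9), kernel=1)   # 8 -> 64 with input_dilation=9
out_shape = (n, side, side, 1)
total = n * side * side

print(out_shape, total)
```

Under those assumptions this gives the same shape and element count as reported above.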
Expected
Either the array evaluates successfully, or MLX raises a clean size-limit error with the true requested size.
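To illustrate what a clean size-limit error could look like, here is a rough sketch of a guard that computes the element count in arbitrary-precision integers and reports the true size. `checked_size` and its `limit` default are hypothetical and are not MLX API; the limit of 2^31 - 1 is an assumption based on the boundary observed in this report.

```python
import math

INT32_MAX = 2**31 - 1  # assumed limit, taken from the boundary observed in this report


def checked_size(shape: tuple, limit: int = INT32_MAX) -> int:
    """Hypothetical helper: return the element count, or raise with the true size."""
    total = math.prod(shape)  # Python ints are arbitrary precision, so this cannot wrap
    if total > limit:
        raise ValueError(
            f"array of shape {shape} has {total} elements, "
            f"which exceeds the supported maximum of {limit}"
        )
    return total


print(checked_size((524287, 64, 64, 1)))  # below the limit, returns the count

try:
    checked_size((524290, 64, 64, 1))     # the reproducer's output shape
except ValueError as err:
    print(err)
```

The point of the sketch is only that the error message carries 2147491840 rather than a wrapped value.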
Actual
mx.eval(y) fails with a wrapped allocation size:
RuntimeError: [metal::malloc] Attempting to allocate 18446744069414600704 bytes which is greater than the maximum allowed buffer size of 86586540032 bytes.
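18446744069414600704 is 2^64 - 4294950912, and 4294950912 is 2 × 2147475456 (the negative element count shown in the failures below, times the 2-byte float16 item size). That is consistent with signed-to-unsigned overflow in size bookkeeping.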
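The wrapped byte count can be reproduced with fixed-width integers from the standard library. This is a sketch of one chain of conversions that yields the observed number; the assumption that the element count passes through a signed 32-bit value and the byte count is then converted to an unsigned 64-bit value is mine, not something read from the MLX source.

```python
import ctypes

ITEMSIZE = 2                          # bytes per float16 element
true_elements = 524290 * 64 * 64      # 2147491840, the real output size

# Assumption: the element count is stored in a signed 32-bit integer and wraps.
wrapped_elements = ctypes.c_int32(true_elements).value

# Assumption: the (negative) byte count is then converted to an unsigned 64-bit size.
wrapped_bytes = ctypes.c_uint64(wrapped_elements * ITEMSIZE).value

print(wrapped_elements)
print(wrapped_bytes)
print(2**64 - wrapped_bytes)
```

With these assumptions `wrapped_bytes` matches the size in the metal::malloc message.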
Related failures from the same y
mx.reshape(y, (-1,)) # ValueError: [reshape] Cannot reshape array of size 2147491840 into shape (-2147475456).
mx.take(y, mx.array([0], dtype=mx.uint32)) # ValueError: [gather] Slice sizes must be in [0, a.shape(i)]. Got (1) for array with shape (-2147475456).
Lazy slice y[-1] reports the correct shape (64, 64, 1), but mx.eval(y[-1]) triggers the same wrapped metal::malloc error.
The inferred shape -2147475456 is 2147491840 interpreted as a signed int32 (2147491840 - 2^32), confirming that the overflow is in shape/size accounting, not just the allocator.
Boundary
Verified on mlx 0.31.1 and local 0.31.2.dev20260325+84099a14b:
| output elements | reshape | eval | slice materialization |
|---|---|---|---|
| 2147479552 (2³¹ - 4096, n=524287) | ✓ | ✓ | ✓ |
| 2147483648 (exactly 2³¹, n=524288) | ✗ negative shape | ✗ wrapped alloc | ✗ wrapped alloc |
| 2147491840 (reproducer, n=524290) | ✗ negative shape | ✗ wrapped alloc | ✗ wrapped alloc |
The overflow triggers at exactly 2^31 elements. The last passing case, 2^31 - 4096, is one 64×64 slice below the boundary, which is the closest the (n, 64, 64, 1) shape grid can get.
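The three tested sizes can be placed against the int32 limit with a few lines of plain Python. This is only arithmetic, not a measurement of MLX; `to_int32` is a hypothetical helper that reinterprets a value as two's-complement signed 32-bit.

```python
INT32_MAX = 2**31 - 1
PLANE = 64 * 64  # elements in one (64, 64, 1) output slice


def to_int32(value: int) -> int:
    """Reinterpret an integer as a two's-complement signed 32-bit value."""
    return (value + 2**31) % 2**32 - 2**31


for n in (524287, 524288, 524290):
    total = n * PLANE
    status = "fits" if total <= INT32_MAX else "overflows int32"
    print(n, total, to_int32(total), status)
```

Only n=524287 stays within int32; the other two wrap to negative values, in line with the negative shapes in the table.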
Root cause
The exact 2^31 boundary is consistent with ShapeElem = int32_t in mlx/array.h. The wrapped allocation and negative shapes appear to be downstream symptoms of shape/size overflow at the host level before Metal is involved. Additional affected paths likely exist.
Environment
- Device: Apple M3 Max
- Memory: 137438953472 bytes
- Max buffer length: 86586540032 bytes
- Reproduced three times in independent fresh Python processes with identical results
- Reproduced on both installed wheel (mlx 0.31.1) and local main build