Commit 1432557
fix: store Metal completion handler errors instead of throwing
Metal completion handlers run on dispatch queues where C++ exceptions
cannot propagate — throwing causes std::terminate → SIGABRT, crashing
the process with no diagnostic information.
Instead, store the error message atomically in the CommandEncoder and
check it at the next synchronous point (commit, synchronize). This
converts fatal crashes into catchable runtime_error exceptions that
the application can handle gracefully.
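
The store-then-check pattern described above can be sketched as follows. This is a minimal illustration, not the code from this commit: `PendingError`, `record`, `rethrow_if_set`, and `simulate_failed_handler` are hypothetical names, a `std::thread` stands in for the dispatch queue that runs a Metal completion handler, and a mutex-guarded string with an atomic flag is one possible way to store the message safely.

```cpp
#include <atomic>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper, not MLX's actual CommandEncoder layout.
class PendingError {
 public:
  // Called from the callback thread. Nothing may escape this function,
  // so it is noexcept and swallows internal failures. Only the first
  // error is kept.
  void record(std::string message) noexcept {
    try {
      std::lock_guard<std::mutex> lock(mutex_);
      if (!is_set_.load(std::memory_order_relaxed)) {
        message_ = std::move(message);
        is_set_.store(true, std::memory_order_release);
      }
    } catch (...) {
      // Swallow: an exception leaving a noexcept function calls std::terminate.
    }
  }

  // Called on the caller's thread at a synchronous point (the analogue of
  // commit or synchronize). Throws once, then clears the stored error.
  void rethrow_if_set() {
    if (!is_set_.load(std::memory_order_acquire)) {
      return;
    }
    std::string message;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      message = std::move(message_);
      message_.clear();
      is_set_.store(false, std::memory_order_relaxed);
    }
    throw std::runtime_error(message);
  }

 private:
  std::mutex mutex_;
  std::atomic<bool> is_set_{false};
  std::string message_;
};

// Simulates a failing completion handler on another thread, then checks
// for the error on the calling thread where it can be caught normally.
std::string simulate_failed_handler() {
  PendingError pending;
  std::thread handler([&pending] { pending.record("simulated GPU failure"); });
  handler.join();
  try {
    pending.rethrow_if_set();
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "";
}
```

The point of the design is that the throw happens on a thread with a normal call stack, where a `try`/`catch` (or a language binding's exception translation) can intercept it, rather than inside the callback.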
Root cause analysis: the crash at 262K+ context reported as mlx#3216
was actually TWO separate issues:
1. Thread safety in stream management (fixed by PR #3281)
2. C++ exceptions thrown from Metal completion handler callbacks
   (fixed by this commit)
The GPU watchdog error (kIOGPUCommandBufferCallbackErrorImpactingInteractivity)
is a separate concern — macOS kills command buffers that block the GPU
beyond the watchdog threshold. This commit ensures that error is reported
as a Python RuntimeError instead of SIGABRT.

1 parent f35ce26 commit 1432557
3 files changed: 70 additions and 7 deletions