Rosetta: fix some uncaught async exceptions causing 502s #18720
Open
glyh wants to merge 2 commits into compatible from
…uter

The previous try...with only caught synchronous exceptions during deferred construction, missing exceptions raised inside async callbacks (e.g. Yojson.Basic.from_string in graphql.ml when the daemon returns a 200 with non-JSON body). Those escaped to on_handler_error and could crash the process. Monitor.try_with catches exceptions across the full async deferred chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dkijania
approved these changes
Apr 3, 2026
Member
It would be nice to have some tests if the current behavior is not covered.
Problem

The Rosetta /network/status endpoint (and all other routes) was producing 502 errors under daemon stress. Two bugs combined to cause this:

Bug 1: Yojson.Basic.from_string could throw an uncaught exception (graphql.ml)

When the Mina daemon returns an HTTP 200 with a non-JSON body (e.g. during restart or under load), Yojson.Basic.from_string throws Yojson.Json_error. This call lived inside an Async deferred callback, so it was never caught by the synchronous try...with in rosetta.ml. The exception escaped to on_handler_error, leaving the HTTP connection in a broken state (or killing the process if MINA_ROSETTA_TERMINATE_ON_SERVER_ERROR was set). Either outcome is visible as a 502 from any reverse proxy.

Bug 2: try...with in router does not cover async exceptions (rosetta.ml)

The top-level exception handler in router was a plain OCaml try...with. In Async, this only catches exceptions raised synchronously during deferred construction. Any exception raised inside a let%bind / let%map callback fires later in the event loop, outside the try...with scope, and propagates unhandled to on_handler_error.
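
The difference between the two handlers can be sketched in isolation. This is a minimal illustration, not the code in this PR: parse_later, guarded and _try_with_only are hypothetical names, and the sketch assumes the core, async and yojson libraries are available. It uses plain >>| instead of let%map so that no ppx is required.

```ocaml
(* Sketch only: assumes the core, async and yojson libraries are installed. *)
open Core
open Async

(* Hypothetical stand-in for the GraphQL helper: the parse happens inside an
   Async callback, so it runs later, from the scheduler, not at the call site. *)
let parse_later (body : string) : Yojson.Basic.t Deferred.t =
  return () >>| fun () -> Yojson.Basic.from_string body

(* Shape of the old handler (defined for comparison, never called here).
   The try only covers building the deferred; an exception raised later
   inside the callback is outside its scope, so the handler does not fire. *)
let _try_with_only (body : string) : (Yojson.Basic.t, string) Result.t Deferred.t =
  try parse_later body >>| fun json -> Ok json
  with exn -> return (Error (Exn.to_string exn))

(* Monitor.try_with runs the function under its own monitor, so an exception
   raised by a later callback is returned as Error instead of escaping. *)
let guarded (body : string) : (Yojson.Basic.t, string) Result.t Deferred.t =
  Monitor.try_with (fun () -> parse_later body)
  >>| Result.map_error ~f:Exn.to_string

let describe label = function
  | Ok _ -> printf "%s: parsed\n" label
  | Error msg -> printf "%s: error (%s)\n" label msg

let () =
  (* block_on_async_exn drives the Async scheduler until the deferred is determined. *)
  let bad, good =
    Thread_safe.block_on_async_exn (fun () ->
        guarded "<html>Bad Gateway</html>" >>= fun bad ->
        guarded {|{"ok": true}|} >>| fun good -> (bad, good))
  in
  describe "non-JSON body" bad;
  describe "JSON body" good
```

With the guarded version, the non-JSON body should come back as an Error value that a route can turn into a proper error response, rather than an exception reaching on_handler_error.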