Skip to content

Fix ContinuationStore desynchronization#8741

Open
tlively wants to merge 2 commits into
mainfrom
interpreter-suspend-fix
Open

Fix ContinuationStore desynchronization#8741
tlively wants to merge 2 commits into
mainfrom
interpreter-suspend-fix

Conversation

@tlively
Copy link
Copy Markdown
Member

@tlively tlively commented May 21, 2026

When a primary module execution traps or suspends to the host, its
continuation store is cleared. Previously, this was done by reassigning
the shared_ptr to a new ContinuationStore instance. However, secondary
(linked) modules that were instantiated prior to this still hold the
original shared_ptr to the old ContinuationStore. This led to
desynchronization, where the secondary module would run with stale
continuation state (including leaked continuations and resuming flags),
eventually causing crashes like assertion failures in visitSuspend.

This fix changes clearContinuationStore to clear the ContinuationStore
in-place (clearing the continuations vector and resetting resuming flag)
instead of reassigning the shared_ptr, ensuring all linked modules
continue to share the same cleared state.

Added a lit test to verify the fix and prevent regression.

When a primary module execution traps or suspends to the host, its
continuation store is cleared. Previously, this was done by reassigning
the shared_ptr to a new ContinuationStore instance. However, secondary
(linked) modules that were instantiated prior to this still hold the
original shared_ptr to the old ContinuationStore. This led to
desynchronization, where the secondary module would run with stale
continuation state (including leaked continuations and resuming flags),
eventually causing crashes like assertion failures in visitSuspend.

This fix changes clearContinuationStore to clear the ContinuationStore
in-place (clearing the continuations vector and resetting resuming flag)
instead of reassigning the shared_ptr, ensuring all linked modules
continue to share the same cleared state.

Added a lit test to verify the fix and prevent regression.
@tlively tlively requested a review from kripken May 21, 2026 01:05
@tlively tlively requested a review from a team as a code owner May 21, 2026 01:05
Copy link
Copy Markdown
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with that

Comment thread src/wasm-interpreter.h
#endif
continuationStore = std::make_shared<ContinuationStore>();
continuationStore->continuations.clear();
continuationStore->resuming = false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put this in a clear() method on the class, and add a comment here why we call clear to clear it in-place as opposed to wiping it from orbit.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(putting in the class makes it less likely we forget to add something to clear() if we add something to the class)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants