Fix GIL deadlocks during raw C/Perl forks by integrating PyOS fork handlers #1
Open
dshivashankar1994 wants to merge 1 commit into
…ndlers

When `pyperl` executes a raw POSIX `fork()` while background Python threads (such as OpenTelemetry metric exporters) are active, the newly created child process inherits an exact memory clone of the parent. Crucially, if a background thread held the Python GIL or other internal CPython locks (like the import lock) at the exact time of the fork, the child process inherits those locks in a permanently locked state held by non-existent "ghost" threads.

When the child process subsequently attempts to execute Python code, or even attempts to terminate (which triggers `Py_Finalize` for garbage collection), it attempts to acquire the GIL, resulting in a permanent deadlock.

Furthermore, because the `fork()` originates from C/Perl, Python's internal fork lifecycle is bypassed. High-level fork safety mechanisms registered via `os.register_at_fork()` (which are designed to gracefully pause background threads before a fork) are never executed.

This patch implements `pthread_atfork` hooks at the C-extension layer to properly synchronize raw C-level forks with the Python interpreter (leveraging APIs introduced in Python 3.7+).

* **`_atfork_prepare`:** Safely acquires the Python GIL via `PyGILState_Ensure()` and explicitly calls `PyOS_BeforeFork()`. This forces the Python interpreter to execute all registered Python-level `before` fork handlers, allowing background threads to safely park themselves and relinquish internal locks *before* the OS duplicates the memory.
* **`_atfork_parent` / `_atfork_child`:** Calls the respective `PyOS_AfterFork_Parent()` and `PyOS_AfterFork_Child()` C-API functions. This triggers the Python-level `after_in_parent` and `after_in_child` resume handlers, sanitizes the interpreter state in the new process, and finally releases the GIL back to the system via `PyGILState_Release()`.

By acquiring the GIL and delegating the fork preparation to Python's native C-API, we guarantee the child process inherits a strictly safe, single-threaded Python memory state.
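
To make the handler lifecycle concrete, here is a minimal Python sketch of the three callbacks that `PyOS_BeforeFork()`, `PyOS_AfterFork_Parent()`, and `PyOS_AfterFork_Child()` are meant to trigger. It is not code from this patch: it uses `os.fork()` as a stand-in, since that call already runs the handlers, and the `events` list and `fork_and_report()` helper are hypothetical names chosen for illustration.

```python
import os

# Hypothetical log of which at-fork handlers ran in the current process.
events = []

# Register one callback per phase; these are the Python-level handlers
# that the interpreter runs around a fork it knows about.
os.register_at_fork(
    before=lambda: events.append("before"),
    after_in_parent=lambda: events.append("after_in_parent"),
    after_in_child=lambda: events.append("after_in_child"),
)


def fork_and_report():
    """Fork with os.fork() and return (parent_events, child_events)."""
    del events[:]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: send the handlers observed here to the parent, then exit
        # immediately without running interpreter cleanup.
        try:
            os.close(read_fd)
            os.write(write_fd, ",".join(events).encode())
        finally:
            os._exit(0)
    # Parent: close the write end so the read below sees end-of-file.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_events = reader.read().decode().split(",")
    os.waitpid(pid, 0)
    return list(events), child_events
```

Calling `fork_and_report()` on a POSIX system should show `before` followed by `after_in_parent` in the parent and `before` followed by `after_in_child` in the child. A raw `fork()` issued from C or Perl skips all three callbacks unless the extension invokes the `PyOS_*` functions itself, which is the gap the hooks described above are intended to close.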