fix(browser): close orphaned browser before relaunch to stop headless_shell leak#6
Open
wrdo-dev wants to merge 1 commit into
Open
Conversation
…_shell leak launchBrowser() assigned `_browser = await chromium.launch()` without closing any existing browser first. When the connection state desyncs from the OS process — the `disconnected` handler nulls `_browser` while the headless_shell process can still be alive, or a post-launch step (newContext/newPage) throws and the catch-block launches a *second* browser — the previous browser process was abandoned. ensureBrowser() then relaunches on the next tool call, so idle headless_shell processes accumulate (~6 per relaunch cycle) until the host OOMs. Fix: - Add discardBrowser(): best-effort close that never blocks a relaunch. - launchBrowser() closes the current browser before launching a new one. - The fallback (system-Chrome) path discards the partially-launched primary browser before its second launch attempt. - The `disconnected` handler now closes the browser (deterministic Playwright handle cleanup) and only clears shared state if it still owns `_browser`, so a stale event from a prior browser can't null the current one. Verified on a host with chromium installed: two forced disconnect→relaunch cycles hold the headless_shell process count flat instead of growing by ~6 each cycle.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
launchBrowser()insrc/browser/manager.tsassigns_browser = await chromium.launch()without closing any existing browser first. The browser model is a single shared_browsersingleton, but several paths abandon a live OS process:disconnectedhandler sets_browser = null, but the underlyingheadless_shellprocess can still be alive (e.g. a transient disconnect, or the JS handle dropped while the process lingers).ensureBrowser()then sees!_browserand callslaunchBrowser()again → a fresh Chromium spawns and the previous process is orphaned.chromium.launch()on the primary path succeeds but a later step (newContext()/newPage()) throws, control jumps to thecatchblock, which launches a second browser (system-Chrome fallback) and never closes the first.Because nothing closes the abandoned process, idle
headless_shellprocesses accumulate — roughly one browser cluster (~6 processes on Linux) per relaunch cycle — until the host runs out of memory and the OOM-killer fires.Observed in production (v1.1.0): glance running as a child MCP accumulated ~7 orphaned browser clusters (40+
headless_shell, ~0.6 GB unique RSS) over ~90 minutes, all idle (<1% CPU), eventually triggering a kernel OOM that took down sibling processes.Fix
discardBrowser()helper — a best-effortclose()that swallows errors so a close failure can never block a relaunch.launchBrowser()now closes the current browser (and clears shared state) before launching a new one.chromium.launch().disconnectedhandler nowclose()s the browser (deterministic Playwright-side handle cleanup) and only clears the shared_browser/_context/_pagesstate if it still owns_browser— so a latedisconnectedevent from a previously-replaced browser can't null out the current one.The duplicated launch→context→page setup is factored into a shared
attach()closure to keep the primary and fallback paths in sync.Verification
Built with
npm run build(esbuild, clean). On a host with chromium installed, drove the manager through repeated forced disconnect → relaunch cycles (SIGKILL the browser main process, then navigate again). With this change theheadless_shellprocess count stays flat at one browser cluster across cycles; onmainit grows by a full cluster each cycle.No API or behavior change — same single-browser model, just without leaking the old process on replacement.