Stdio MCP server busy-spins at 100% CPU forever when its client dies (orphaned on editor reload) #799

@Venkateshvenki404224

Summary

When the MCP client that spawned codegraph serve --mcp dies (e.g. a VS Code window
reload kills the Claude Code session), the codegraph stdio server does not exit on
stdin EOF. Instead its main JS thread enters a tight userspace loop, apparently polling
the dead stream, and pins one CPU core at 100% indefinitely. The orphan is reparented to
PID 1 and keeps burning a full core until manually killed (observed for 28+ minutes,
with nothing that would stop it).

The problem is aggravated by the npx launch chain
(npm exec → sh → bin shim → bundled node binary): the client's kill signal hits the
top of the chain but does not propagate to the actual server binary at the bottom of
this four-process chain, so every editor reload can leak one of these spinners.

Environment

codegraph: @colbymchenry/codegraph 0.9.7 (npm latest at time of report: 0.9.9)
Bundled runtime: node v24.16.0 (@colbymchenry/codegraph-linux-x64, launched with --liftoff-only)
OS: Linux 6.8.0-124-generic x86_64 (Ubuntu)
MCP client: Claude Code VS Code extension 2.1.173
Launch config (.mcp.json): {"command": "npx", "args": ["-y", "@colbymchenry/codegraph", "serve", "--mcp"]}
Workspace: ~4.3 MB .codegraph/codegraph.db, daemon v0.9.7 healthy

Steps to reproduce

  1. Configure codegraph as a stdio MCP server via npx (config above) in any MCP client.
  2. Let the client spawn the server, then kill the client abruptly (VS Code window
    reload / kill the client process). A reload race also reproduces it: if the client
    dies while npm exec is still resolving, the server starts with stdin already at EOF
    and spins from its very first second.
  3. Observe the leftover codegraph serve --mcp process at 100% CPU, reparented to PID 1.

Evidence

1. 100% CPU, and CPU time ≥ wall time — spinning since birth (evidence/03-orphan-chain-cpu.txt):

    PID    PPID                  STARTED     ELAPSED     TIME %CPU STAT CMD
2923221       1 (reparented to systemd)                            npm exec @colbymchenry/codegraph serve --mcp
2924314 2924305 Thu Jun 11 14:32:05 2026       28:07 00:28:09  100 Rl   .../codegraph-linux-x64/node --liftoff-only .../bin/codegraph.js serve --mcp

Elapsed 28:07, CPU time 28:09 → ≥100% CPU for the process's entire life.

2. Pure userspace spin, not I/O or indexing (evidence/05-proc-stat.txt):

utime(user): 1690.8s   stime(kernel): 23.01s     (98.7% userspace)
State: R (running)   Threads: 7
nonvoluntary_ctxt_switches: 92193

The graph DB was last written a day earlier — no indexing was happening.
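
The arithmetic behind evidence items 1 and 2 can be checked with a small sketch. The helper names (parseClock, lifetimeCpuRatio, userShare) are illustrative only; the inputs are the ELAPSED/TIME values from the ps line and the utime/stime values quoted above.

```javascript
// Convert a ps ELAPSED or TIME value ([[dd-]hh:]mm:ss) to seconds.
function parseClock(text) {
  const [clock, days = '0'] = text.trim().split('-').reverse();
  const parts = clock.split(':').map(Number);
  while (parts.length < 3) parts.unshift(0); // pad a missing hours field
  const [hours, minutes, seconds] = parts;
  return ((Number(days) * 24 + hours) * 60 + minutes) * 60 + seconds;
}

// Average CPU utilisation over the process lifetime (1.0 = one full core).
function lifetimeCpuRatio(elapsed, cpuTime) {
  return parseClock(cpuTime) / parseClock(elapsed);
}

// Fraction of the consumed CPU time that was spent in user mode.
function userShare(utimeSeconds, stimeSeconds) {
  return utimeSeconds / (utimeSeconds + stimeSeconds);
}

const ratio = lifetimeCpuRatio('28:07', '00:28:09'); // 1689 s of CPU over 1687 s of wall time
const share = userShare(1690.8, 23.01);              // about 0.9866
console.log(ratio >= 1, share.toFixed(4));
```

A ratio at or slightly above 1.0 means the process has kept one core busy for its entire life, and a user share this close to 1 points at a JavaScript-level loop rather than blocking I/O.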

3. Only the main JS thread spins; workers idle (evidence/04-thread-breakdown.txt):

    LWP %CPU S WCHAN                          COMMAND
2924314 99.4 R -                              MainThread
2924337  0.1 S futex_wait_queue               V8Worker
2924336  0.1 S futex_wait_queue               V8Worker
...

4. The stdio sockets have no peer — the client is gone (evidence/07-socket-peers.txt):

ss -xp shows fds 0/1/2 of the server are unix sockets whose only remaining users are
the codegraph-side processes themselves; the MCP client endpoint no longer exists.

5. Healthy instances for contrast: two other codegraph serve --mcp instances with
living clients (plus the shared daemon) sat at 0.0% CPU on the same machine at the same
time. Only the orphaned one spins. (evidence/01-top-snapshot.txt, 02-process-tree.txt)

6. Daemon side is unaffected (evidence/08-daemon.log): normal auto-syncs and idle
shutdowns; the spin is entirely in the per-client MCP stdio server.

Expected behavior

A stdio MCP server must treat stdin EOF / client disconnect as a shutdown signal and
exit. The read loop should never poll a closed stream in a tight loop.
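
A minimal sketch of that behavior, assuming a Node.js stdio server. The helper name installShutdownOnEof and its callback are hypothetical and not taken from codegraph's source; the point is only that end, close and error on the input stream each lead to a single shutdown call rather than another read attempt.

```javascript
// Hypothetical helper (not codegraph's code): run `onShutdown` exactly once
// when the input stream reports EOF, is closed, or fails.
function installShutdownOnEof(input, onShutdown) {
  let finished = false;
  const shutdown = (reason) => {
    if (finished) return; // 'end' is usually followed by 'close'; act once
    finished = true;
    onShutdown(reason);
  };
  input.once('end', () => shutdown('end'));     // peer closed its side: EOF
  input.once('close', () => shutdown('close')); // stream or fd was torn down
  input.once('error', () => shutdown('error')); // e.g. a reset socket
  return shutdown;
}

// Possible wiring in a server entry point (assumption, shown as a comment):
//   installShutdownOnEof(process.stdin, () => process.exit(0));
```

The stream still has to be consumed by the server's normal 'data' or 'readable' handler for 'end' to be emitted; the helper only adds the exit path.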

Suspected cause

The stdin read loop appears to re-schedule itself immediately (e.g.
setImmediate-style polling) when read() returns null/EOF instead of terminating on
the stream's end/close event, which yields an unbounded userspace loop. The
spin-from-birth case shows the EOF path is hit even before any MCP handshake.
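
To illustrate the suspected pattern and one way to avoid it, here is a sketch under stated assumptions. The commented-out poll function is a hypothetical reconstruction of the bug, not code from codegraph, and readUntilEof is likewise an illustrative name. In Node.js, stream.read() returns null both when the buffer is momentarily empty and after the stream has ended, so null has to mean "wait for the next event", never "try again right away".

```javascript
// Suspected antipattern (hypothetical reconstruction, NOT codegraph's source):
//
//   function poll() {
//     const chunk = process.stdin.read();
//     if (chunk !== null) handle(chunk);
//     setImmediate(poll); // still re-armed after EOF: unbounded loop
//   }

// Event-driven alternative: only read when the stream says data is ready,
// and stop for good once it reports 'end'.
function readUntilEof(stream, onChunk, onEof) {
  const onReadable = () => {
    let chunk;
    // Drain what is buffered; null ends this pass and nothing is re-scheduled.
    while ((chunk = stream.read()) !== null) onChunk(chunk);
  };
  stream.on('readable', onReadable);
  stream.once('end', () => {
    stream.off('readable', onReadable); // no further reads after EOF
    onEof();
  });
}

// Possible wiring (assumption, shown as a comment):
//   readUntilEof(process.stdin, handleChunk, () => process.exit(0));
```

With this shape a stdin that is already at EOF when the server starts produces a single 'end' event and a clean exit, which would also cover the spin-from-birth case.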

Workarounds (for other users hitting this)

  • Run pgrep -af codegraph after editor reloads and kill orphans whose npm exec ancestor has PPID 1.
  • Launch the installed binary directly instead of via npx (shallower chain, kill
    signals reach the real process) — reduces leaks but doesn't fix the EOF spin itself.
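
The first workaround can be scripted. The following is a rough sketch that only lists candidate PIDs and leaves the killing to the user. The function names are illustrative, and the two match patterns (a command line ending in codegraph.js serve --mcp, and an ancestor whose command line starts with npm exec) are assumptions based on the ps output in this report.

```javascript
// Parse the output of `ps -eo pid=,ppid=,args=` into a Map keyed by PID.
function parsePs(text) {
  const procs = new Map();
  for (const line of text.split('\n')) {
    const m = /^\s*(\d+)\s+(\d+)\s+(.*)$/.exec(line);
    if (m) procs.set(Number(m[1]), { pid: Number(m[1]), ppid: Number(m[2]), args: m[3] });
  }
  return procs;
}

// Return PIDs of server processes whose `npm exec` ancestor has PPID 1.
// Both patterns below are assumptions derived from the report's ps output.
function findOrphanedServers(procs) {
  const isServer = (p) => /codegraph\.js serve --mcp\s*$/.test(p.args);
  const isNpmExec = (p) => p.args.startsWith('npm exec');
  const orphans = [];
  for (const proc of procs.values()) {
    if (!isServer(proc)) continue;
    // Walk up the parent chain until it leaves the process table.
    for (let cur = procs.get(proc.ppid); cur; cur = procs.get(cur.ppid)) {
      if (isNpmExec(cur) && cur.ppid === 1) {
        orphans.push(proc.pid);
        break;
      }
    }
  }
  return orphans;
}

// Possible use (assumption; prints candidates instead of killing them):
//   const { execFileSync } = require('node:child_process');
//   const out = execFileSync('ps', ['-eo', 'pid=,ppid=,args='], { encoding: 'utf8' });
//   console.log(findOrphanedServers(parsePs(out)));
```

Servers whose npm exec ancestor still has a living client as parent are left alone, matching the healthy instances described in the evidence.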
