Fix connection pool lifecycle to stop leaking connections by jens · Pull Request #177 · crystaldba/postgres-mcp

jens · 2026-06-05T14:49:26Z

Problem

The stdio server can outlive its client and hold pooled connections indefinitely, eventually exhausting the database role's connection limit. Once that happens, every newly started server fails to connect with FATAL: too many connections for role "...".

This is the leak described in #98.

Root cause

An MCP client signals shutdown by closing the server's stdin, which makes run_stdio_async() return. But if the client is force-killed (crash, kill -9, or a supervisor tearing down the process tree), stdin EOF never arrives — run_stdio_async() never returns and the process is reparented to init, lingering indefinitely.

Each lingering process keeps its pool open. With min_size=1 the pool pins at least one connection for the whole life of the process, so a handful of orphaned servers is enough to exhaust a modest per-role CONNECTION LIMIT.
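To make the arithmetic concrete, here is a minimal sketch of how orphaned servers eat into a per-role limit. The limit of 5 and the helper names are illustrative assumptions, not values from an affected deployment.

```python
def pinned_connections(orphaned_servers: int, min_size: int) -> int:
    """Lower bound on connections held by idle orphans: each pool keeps min_size open."""
    return orphaned_servers * min_size


def free_slots(connection_limit: int, orphaned_servers: int, min_size: int) -> int:
    """Slots left under the role's CONNECTION LIMIT for newly started servers."""
    return max(connection_limit - pinned_connections(orphaned_servers, min_size), 0)


# Hypothetical role limit, chosen only for illustration.
LIMIT = 5
for orphans in range(0, 6):
    with_min_1 = free_slots(LIMIT, orphans, min_size=1)
    with_min_0 = free_slots(LIMIT, orphans, min_size=0)
    print(f"orphans={orphans} free(min_size=1)={with_min_1} free(min_size=0)={with_min_0}")
```

With min_size=1 the free slots shrink by one per orphan until none remain; with min_size=0 idle orphans pin nothing.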

Fix

- Zero idle footprint. min_size=0 + max_idle=60, so an idle server holds no connections and drifts back to zero between queries; max_size lowered 5 -> 3.
- Orphan watchdog (stdio only). A lightweight task runs alongside the transport, detects orphaning (the process is reparented to PID 1), closes the pool, and exits. The transport coroutine can't be cancelled cleanly (it blocks in a stdin reader thread), so the watchdog releases the pool and hard-exits rather than trying to unwind it.
- Guaranteed release. The pool is closed in a finally on every clean exit path (stdin EOF, signal, error), and the watchdog is cancelled there.
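The watchdog and the finally-based release described above could look roughly like the following. This is a sketch under assumptions, not the PR's code: the function names, the close_pool and run_transport callables, the one-second poll interval, and the injectable getppid/hard_exit hooks (added here so the logic can be exercised without killing a process) are all hypothetical. Only os.getppid(), os._exit(), and the asyncio calls are real APIs.

```python
import asyncio
import os
from typing import Awaitable, Callable

AsyncFn = Callable[[], Awaitable[None]]


async def orphan_watchdog(
    close_pool: AsyncFn,
    *,
    poll_interval: float = 1.0,
    getppid: Callable[[], int] = os.getppid,
    hard_exit: Callable[[int], None] = os._exit,
) -> None:
    """Poll the parent PID; once reparented to PID 1, close the pool and exit."""
    while getppid() != 1:
        await asyncio.sleep(poll_interval)
    try:
        await close_pool()
    finally:
        # Hard exit: the stdio transport blocks in a reader thread and
        # cannot be unwound, so normal interpreter shutdown is skipped.
        hard_exit(0)


async def run_stdio(run_transport: AsyncFn, close_pool: AsyncFn, **watchdog_opts) -> None:
    """Run the transport with the watchdog alongside; always release the pool."""
    watchdog = asyncio.create_task(orphan_watchdog(close_pool, **watchdog_opts))
    try:
        await run_transport()
    finally:
        # Clean exit path (stdin EOF, signal, error): stop the watchdog first,
        # then close the pool.
        watchdog.cancel()
        await asyncio.gather(watchdog, return_exceptions=True)
        await close_pool()
```

In this sketch the sse/streamable-http transports would simply not call run_stdio, which matches the "stdio only" scope of the watchdog.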

Either change alone reduces the impact; together, an idle or orphaned server holds zero connections, and an orphaned one exits within a couple of seconds.
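As an illustration, the new pool sizing could be captured as below. The sketch assumes the pool is psycopg_pool's AsyncConnectionPool, whose constructor takes min_size, max_size, and max_idle; the PoolSettings class and the commented-out constructor call are hypothetical and not taken from the PR.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PoolSettings:
    min_size: int = 0        # keep no connections open while idle
    max_size: int = 3        # lowered from 5
    max_idle: float = 60.0   # seconds an unused connection may linger before it is closed


settings = asdict(PoolSettings())
print(settings)

# Assumed usage with psycopg_pool (not executed here):
# from psycopg_pool import AsyncConnectionPool
# pool = AsyncConnectionPool(conninfo, open=False, **settings)
```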

Tests

tests/unit/test_lifecycle.py covers: the watchdog closes the pool before exiting, waits until actually orphaned, is started exactly once and cancelled on clean stdio exit, and is never started for the sse/streamable-http transports.
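Two of these properties (pool closed before exit, no action until actually orphaned) can be checked with tests of roughly this shape. The watchdog below is a simplified stand-in so the example is self-contained; it is not the code under test in tests/unit/test_lifecycle.py, and the test names are hypothetical.

```python
import asyncio


async def watchdog(close_pool, getppid, hard_exit, poll_interval=0.001):
    # Hypothetical stand-in for the watchdog under test.
    while getppid() != 1:
        await asyncio.sleep(poll_interval)
    await close_pool()
    hard_exit(0)


def test_closes_pool_before_exiting():
    events = []

    async def close_pool():
        events.append("close_pool")

    # Parent is already PID 1, so the watchdog should act immediately.
    asyncio.run(watchdog(close_pool, lambda: 1, lambda code: events.append("hard_exit")))
    assert events == ["close_pool", "hard_exit"]


def test_waits_until_orphaned():
    events = []

    async def close_pool():
        events.append("close_pool")

    async def scenario():
        # Parent PID never becomes 1, so the watchdog must keep polling.
        task = asyncio.create_task(
            watchdog(close_pool, lambda: 4242, lambda code: events.append("hard_exit"))
        )
        await asyncio.sleep(0.02)  # let it poll several times
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)

    asyncio.run(scenario())
    assert events == []
```

Injecting getppid and hard_exit keeps the tests from depending on real process reparenting or terminating the test runner.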

Notes

Orphan detection keys on reparenting to PID 1, the common case on Linux and macOS. Under a PID namespace or a custom subreaper the parent may differ; there the watchdog simply doesn't fire, and the min_size=0 change still bounds the leak.

Fixes #98

The stdio server could outlive its client and hold pooled connections indefinitely. A client normally signals shutdown by closing stdin, which makes run_stdio_async() return. But if the client is force-killed, stdin EOF never arrives, run_stdio_async() never returns, and the process is reparented to init, lingering forever. With min_size=1 each such process pins at least one connection, so a few orphans exhaust a low per-role connection limit, after which every new server fails with "too many connections for role".

- Pool: min_size=0 + max_idle=60 so an idle server drifts back to zero connections; lower max_size 5 -> 3.
- stdio: run a lightweight watchdog alongside the transport that detects orphaning (parent reparented to PID 1), closes the pool, and exits. The transport coroutine can't be cancelled cleanly (it blocks in a stdin reader thread), so the watchdog releases the pool and hard-exits.
- Release the pool in a finally on every clean exit path (stdin EOF, signal, or error), and stop the watchdog there.

Add unit tests for the watchdog behavior and per-transport wiring.

jens wants to merge 1 commit into crystaldba:main from jens:fix/connection-pool-lifecycle
