Wait-event coverage gaps: where backends block but pg_stat_activity reports NULL (iteration 3) #49

@NikolayS

Background

The "wait events for server logging" patch (thread) is one instance of a broader pattern: a backend blocks on something, but pg_stat_activity.wait_event is NULL, so it reads as "on CPU" and the stall is invisible. This issue catalogs the similar gaps so they can be tracked/prioritized.

Confirmed against the current tree: src/backend/libpq/auth.c and contrib/postgres_fdw contain zero pgstat_report_wait_start / WAIT_EVENT_* instrumentation.

Genuine gaps — backend blocks but reports NULL

  1. Outbound libpq / FDW network waits. postgres_fdw, dblink, and libpqwalreceiver block on a remote server through libpq with no wait event — there is no ClientRead/ClientWrite equivalent for outbound connections. A backend stuck on a slow foreign server looks idle-on-CPU. Likely the biggest real-world gap.

  2. Authentication. auth.c is entirely uninstrumented: LDAP (ldap_search), PAM conversation, RADIUS socket, Kerberos/GSSAPI, SSPI, plus DNS (getaddrinfo) and reverse lookups during pg_hba matching. These block on external services, often for seconds. (Partly because auth runs before stats are fully set up.)

  3. TLS handshake. SSL_accept and cert/CRL loading during connection setup — the negotiation itself is not a wait event (only steady-state ClientRead/Write are).

  4. archive_command / restore_command. The archiver/startup process shells out via system() and blocks for the entire external command with no wait event. Recovery stalled on a slow restore_command is invisible.

  5. Non-VFD file ops. Directory scans and metadata syscalls outside the smgr/File layer — opendir/readdir over pg_wal, tablespaces; unlink/rename/stat during startup and checkpoint segment recycling; config-file and pg_hba/SSL-file reads on reload. Most are not wrapped.

  6. The syslogger's own disk writes. write_syslogger_file() fwrite() to the on-disk log file, plus rotation (fopen/fclose) and the current_logfiles metainfo write. This is where a slow log device actually blocks hardest. Caveat: the syslogger does not attach to shared memory and has no pg_stat_activity entry, so it currently cannot report wait events at all — fixing this needs a bigger change.

By-design NULL, but arguably the same problem

  1. Pure CPU work (sort, hash, aggregation, expression/qual eval, (de)compression). Genuinely on-CPU, so NULL is "correct" — but indistinguishable from an uninstrumented wait. This ambiguity is the meta-gap behind all of the above.

  2. Memory pressure: malloc/mmap/page faults/swap-in. The backend is blocked in the kernel but reports NULL. Hard to capture from userspace.

  3. Spinlocks. LWLocks are instrumented; spinlock spin-delays under contention are not (by design — meant to be short, but pathological cases hide here).

Structural root cause

NULL is overloaded: it means both "on CPU" and "blocked somewhere we didn't instrument." Two recurring directions worth pursuing:

  • Give auxiliary/early processes (syslogger, archiver, some auth contexts) real backend-status entries so they can report.
  • Add a way to positively signal "actually running" so NULL stops being a catch-all.
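To make the overloading concrete, here is a rough sketch of how an external sampler might approximate a "running" signal today by pairing the sampled wait event with the OS scheduler state. This is an assumption-laden illustration, not a proposal from the thread: `sketch_classify` and the `sketch_activity` enum are hypothetical names, the wait-event value 0 models a NULL wait_event, and the state letters are the Linux ones from /proc/<pid>/stat ('R' running, 'S' interruptible sleep, 'D' uninterruptible sleep).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-way classification; NULL alone cannot tell the last two apart. */
typedef enum
{
    SKETCH_INSTRUMENTED_WAIT,   /* a wait event was reported */
    SKETCH_ON_CPU,              /* NULL, and the OS says the process is runnable */
    SKETCH_UNINSTRUMENTED_WAIT  /* NULL, but the OS says the process is sleeping */
} sketch_activity;

/*
 * wait_event_info: 0 models a NULL wait_event.
 * os_state: Linux process state letter sampled for the same PID.
 */
static sketch_activity sketch_classify(uint32_t wait_event_info, char os_state)
{
    assert(os_state != '\0');

    if (wait_event_info != 0)
        return SKETCH_INSTRUMENTED_WAIT;

    /* NULL wait event: fall back to the scheduler's view of the process. */
    return (os_state == 'R') ? SKETCH_ON_CPU : SKETCH_UNINSTRUMENTED_WAIT;
}
```

Under these assumptions, a backend with a NULL wait event in state 'D' or 'S' is classified as an uninstrumented wait rather than CPU work. The two samples are not taken atomically, so a single observation can be wrong; this only becomes meaningful over many samples, which is part of why an in-core positive signal would be preferable.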

There is a generic Extension wait event that extensions can adopt, but in-core libpq users like postgres_fdw don't, which is why gap 1 (outbound libpq / FDW network waits) stays dark.


Catalog compiled while reviewing the server-logging wait-events patch; each item is a candidate for its own focused patch.
