fix: harden uninstall paths (stop jobs, narrow guards) #290
pg_cron / pg_timetable jobs are catalog rows, not dependent objects: dropping the pgque schema or extension leaves pgque_ticker, pgque_retry_events, pgque_maint and pgque_rotate_step2 behind, failing every 1-30 s forever. sql/pgque-tle-uninstall.sql now calls pgque.stop() before drop extension, and both uninstall scripts perform the drop in the same do block so a real stop() failure aborts the uninstall. The previous catch-all handler in sql/pgque_uninstall.sql is narrowed to undefined_function / invalid_schema_name (the sqlstates raised when pgque is not installed), keeping the scripts idempotent without swallowing real errors such as cron.unschedule permission failures. Addresses findings C4 and C6 of #283. https://claude.ai/code/session_01KAaEGkQZmey1D1xCsVGmqv
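The stop-then-drop pattern described in this commit can be sketched roughly as below. This is an illustration under stated assumptions, not the actual contents of sql/pgque_uninstall.sql: the notice text is invented, and the real script may structure the block differently. The TLE variant would use drop extension if exists pgque cascade in place of the schema drop.

```sql
-- Sketch only: not the actual contents of sql/pgque_uninstall.sql.
do $$
begin
  begin
    -- Unschedule pg_cron / pg_timetable jobs before anything is dropped.
    perform pgque.stop();
  exception
    -- Raised when pgque is not installed (sqlstates 3F000 / 42883);
    -- treated as "nothing to stop" so the script stays idempotent.
    when invalid_schema_name or undefined_function then
      raise notice 'pgque not installed, nothing to stop';  -- hypothetical text
  end;

  -- Reached only if stop() succeeded or pgque was absent. Any other error
  -- propagates out of the do block, so the drop below never runs.
  drop schema if exists pgque cascade;
end
$$;
```

Because the drop lives in the same do block, a failing stop() aborts the whole statement on the server side, independent of how the client handles errors between statements.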
When pgque is installed as an extension (pg_tle path), pgque.uninstall() ran drop schema pgque cascade, which fails with a confusing dependency error: extension pgque requires it. Detect pg_extension membership first and raise a clear exception pointing to drop extension pgque cascade and the pg_tle uninstall script. Regenerates sql/pgque.sql and sql/pgque-tle.sql; covered by a new assertion in tests/test_tle_install.sql (runs in the pg_tle CI job). Addresses finding C9 of #283. https://claude.ai/code/session_01KAaEGkQZmey1D1xCsVGmqv
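A minimal sketch of the membership guard follows, written as a standalone do block rather than as the body of pgque.uninstall(). It assumes a lookup by extname in pg_extension is sufficient; the real function may detect membership differently (for example via pg_depend), and the message and hint wording here are illustrative.

```sql
-- Sketch only: a standalone guard, not the body of pgque.uninstall().
do $$
begin
  -- Assumption: an extension named 'pgque' means the pg_tle install path.
  if exists (select 1 from pg_extension where extname = 'pgque') then
    raise exception 'pgque is installed as an extension'
      using hint = 'Use "drop extension pgque cascade" or sql/pgque-tle-uninstall.sql.';
  end if;

  -- Plain (non-extension) install: dropping the schema is safe.
  drop schema if exists pgque cascade;
end
$$;
```

Raising before the drop replaces the dependency error ("extension pgque requires it") with a message that names the correct uninstall path.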
The generated header claimed the file works in GUI tools, JDBC and libpq-direct callers, but the script uses \set ON_ERROR_STOP and \echo psql meta-commands, so any non-psql client fails on the first one. State that it is a psql script and point non-psql callers at pgtle.install_extension() with the sql/pgque.sql body instead. Regenerates sql/pgque-tle.sql. Addresses finding C7 of #283. https://claude.ai/code/session_01KAaEGkQZmey1D1xCsVGmqv
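For non-psql callers, a call to pgtle.install_extension() might look roughly like the sketch below. The version string, description, and dollar-quote tag are placeholders chosen for illustration, and the select 1 stands in for the contents of sql/pgque.sql, which are not reproduced here.

```sql
-- Sketch only: version, description, and body are placeholders.
select pgtle.install_extension(
  'pgque',                        -- extension name
  '0.0.1',                        -- hypothetical version string
  'pgque (example description)',  -- hypothetical description
  $_pgque_$
    -- Paste the contents of sql/pgque.sql here in place of this line.
    select 1;
  $_pgque_$
);
```

This is a single SQL statement with no psql meta-commands, so it can be sent from JDBC, libpq, or a GUI client; create extension pgque would follow once the real body is registered.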
In the pg_tle CI job, pgque is installed as an extension. Running sql/pgque-tle-uninstall.sql from test_uninstall_guard.sql there (correctly, after the C4/C9 fixes) drops the extension -- and the whole pgque schema with it -- mid-suite, so the "plain install must survive" assertion failed. Gate the sub-tests that execute the TLE uninstall script behind a psql \if on pgque not being an extension member, emitting the suite's usual SKIP notice; the extension path of the script is covered by tests/test_tle_install.sql. https://claude.ai/code/session_01KAaEGkQZmey1D1xCsVGmqv
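The gate can be sketched in psql as below. The variable name and the SKIP wording are illustrative, and the suite's actual notice format is not shown in this PR.

```sql
-- Sketch only: psql script; variable name and SKIP text are illustrative.
select exists (
  select 1 from pg_extension where extname = 'pgque'
) as pgque_is_extension \gset

\if :pgque_is_extension
  \echo 'SKIP: pgque is an extension member; TLE uninstall sub-tests not run'
\else
  \i sql/pgque-tle-uninstall.sql
\endif
```

\gset stores the boolean as t or f, which \if accepts directly, so no extra comparison is needed.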
CI follow-up: the "pg_tle install path" job is fixed in 313a87c.
Cause: in the pg_tle job, pgque is installed as an extension.
Fix: gate the two sub-tests that execute the TLE uninstall script behind a psql \if on pgque not being an extension member.
Verification (local, plain install, PG 16):
https://claude.ai/code/session_01KAaEGkQZmey1D1xCsVGmqv
Generated by Claude Code
Bugs
Four uninstall-path findings from #283:
- sql/pgque-tle-uninstall.sql ran drop extension if exists pgque cascade without calling pgque.stop() first. pg_cron / pg_timetable jobs are catalog rows, not dependent objects, so pgque_ticker (every 1 s), pgque_retry_events, pgque_maint, and pgque_rotate_step2 survived the drop and failed every 1–30 s forever ("schema pgque does not exist"), spamming logs and cron.job_run_details.
- sql/pgque_uninstall.sql wrapped pgque.stop() in exception when others then null, silently swallowing real failures (e.g. cron.unschedule permission errors, the "untrusted pg_timetable schema owner" exception) and dropping the schema anyway: the same orphaned-jobs outcome, but silent.
- The sql/pgque-tle.sql header claimed the file "works in psql, GUI tools (DBeaver, etc.), JDBC, libpq-direct callers", but the script uses \set ON_ERROR_STOP and \echo psql meta-commands, so any non-psql client fails on the first one.
- pgque.uninstall() ran drop schema pgque cascade, which fails on extension (pg_tle) installs with a confusing dependency error ("extension pgque requires it").
Fixes
- sql/pgque-tle-uninstall.sql now calls pgque.stop() before drop extension, with the drop in the same do block, so a real stop() failure aborts the uninstall before anything is dropped.
- The previous catch-all handler in sql/pgque_uninstall.sql is narrowed to undefined_function / invalid_schema_name, empirically the sqlstates raised when pgque is not installed (42883 / 3F000), keeping the scripts idempotent without swallowing real stop() failures. sql/pgque_uninstall.sql also performs the drop inside the same do block, so abort-before-drop holds regardless of client ON_ERROR_STOP settings.
- The generated sql/pgque-tle.sql header (build/transform.sh) now states the file is a psql script and points non-psql callers at pgtle.install_extension() with the sql/pgque.sql body. ON_ERROR_STOP behavior is kept.
- pgque.uninstall() detects pg_extension membership first and raises a clear exception pointing to drop extension pgque cascade / sql/pgque-tle-uninstall.sql.
- sql/pgque.sql and sql/pgque-tle.sql regenerated via bash build/transform.sh in the commits that touch sources.
Tests (red/green TDD)
- tests/test_uninstall_guard.sql (added to tests/run_all.sql): swaps pgque.stop() for instrumented fakes (real definition saved and restored), asserting (1) a raising stop() aborts sql/pgque_uninstall.sql before the schema drop, (2) sql/pgque-tle-uninstall.sql calls pgque.stop(), (3) a raising stop() aborts the TLE script too. Tests (1) and (2) failed at origin/main (schema dropped; stop never called); all pass after the fix. Runs without pg_cron / pg_tle.
- tests/test_tle_install.sql (pg_tle CI job): pgque.uninstall() on an extension install must raise an error pointing to drop extension pgque cascade, leaving schema and extension intact.
Verification (local PG 16, fresh scratch DBs)
pg_tle / pg_cron are not installed locally; the TLE-specific C9 assertion runs in the dedicated pg_tle CI job, and the new guard tests degrade cleanly without either extension.
Addresses findings C4, C6, C7, C9 of #283