Add /healthz (liveness) and /readyz (readiness) HTTP probes #6
Merged
Adds two unauthenticated HTTP endpoints intended for orchestrators and
load balancers:
- /healthz (liveness): always returns 200 with {"status": "ok"} as
long as the process can serve a request. Deliberately does NOT touch
the database — a database outage must not cause Kubernetes to
restart the pod, because a restart would not help.
- /readyz (readiness): runs SELECT 1 against the configured database
and verifies that all on-disk migrations have been applied. Returns
200 with {"status": "ready", ...} on success, or 503 with a
structured payload when the database is unreachable or pending
migrations exist, so the load balancer drops the pod from rotation
until the process is fully usable.
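
The split between the two probes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code from this PR: it assumes a SQLite database and represents migrations as plain sets of names. The function names `liveness` and `readiness` and the `pending_migrations` payload key are hypothetical; the `status`, `database` and `database_error` keys mirror the responses quoted in this PR.

```python
import sqlite3


def liveness():
    """Liveness: never touches the database, so it always answers 200."""
    return 200, {"status": "ok"}


def readiness(connect, applied, on_disk):
    """Readiness: 200 only if SELECT 1 works and no migration is pending.

    connect  -- zero-argument callable returning a sqlite3-style connection
    applied  -- names of migrations already recorded as applied
    on_disk  -- names of migration files found on disk
    """
    try:
        conn = connect()
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
    except Exception as exc:  # any database failure means "not ready"
        return 503, {
            "status": "not_ready",
            "database": "error",
            "database_error": str(exc),
        }

    pending = sorted(set(on_disk) - set(applied))
    if pending:
        # "pending_migrations" is a hypothetical key name for this sketch.
        return 503, {
            "status": "not_ready",
            "database": "ok",
            "pending_migrations": pending,
        }
    return 200, {"status": "ready", "database": "ok"}
```

Keeping the database check out of `liveness` is what prevents a restart loop during a database outage; only `readiness` reports the failure, which removes the pod from rotation without killing it.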
Both endpoints live outside the /api/ namespace, so the existing
api_auth_required_for_path() helper bypasses authentication without
any middleware change.
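
The body of api_auth_required_for_path() is not shown in this PR. As a rough sketch of why no middleware change is needed, assume the helper is a simple prefix check; the implementation below is a hypothetical stand-in, not the project's actual code.

```python
def api_auth_required_for_path(path: str) -> bool:
    """Hypothetical stand-in: only paths under /api/ require authentication."""
    return path.startswith("/api/")


if __name__ == "__main__":
    # The probe paths fall outside the prefix, so they are never challenged.
    for path in ("/healthz", "/readyz", "/api/integrations/alertmanager"):
        print(path, api_auth_required_for_path(path))
```

Under that assumption, placing the probes at the root is enough to keep them unauthenticated, and the regression guard in the tests protects against someone later moving them under /api/.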
The before_request hook is taught to skip its implicit db.connect()
for the two probe paths: /healthz must remain answerable when the DB
is down, and /readyz manages its own connection explicitly so it can
return a clean 503 on DB errors.
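
The hook change described above might look roughly like the following. This is a framework-free sketch: `PROBE_PATHS`, `FakeDatabase` and the `before_request(path, db)` signature are assumptions made for illustration, and the real hook presumably reads the path from the web framework's request object.

```python
PROBE_PATHS = frozenset({"/healthz", "/readyz"})


class FakeDatabase:
    """Stand-in for the real database handle; connect() can simulate an outage."""

    def __init__(self, down=False):
        self.down = down
        self.connect_calls = 0

    def connect(self):
        self.connect_calls += 1
        if self.down:
            raise ConnectionError("database is down")


def before_request(path, db):
    """Open the implicit per-request connection, except for probe paths."""
    if path in PROBE_PATHS:
        # /healthz needs no database; /readyz opens its own connection
        # so it can turn a failure into a clean 503.
        return
    db.connect()


if __name__ == "__main__":
    db = FakeDatabase(down=True)
    before_request("/healthz", db)  # does not raise during the outage
    before_request("/readyz", db)   # does not raise either
    print(db.connect_calls)
```

With the database marked as down, both probe paths pass through the hook without touching `connect()`, while any other path still hits the connection error as before.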
Tests (tests/test_health.py, 10 cases) cover:
- /healthz returns 200 even when init_database is patched to raise;
- /healthz needs no auth;
- /readyz returns 200 on the happy path (DB ok, no pending migrations);
- /readyz returns 503 when SELECT 1 raises (DB outage);
- /readyz returns 503 when a fake pending migration appears on disk;
- /readyz returns 503 when the migration check itself raises;
- both probes are outside /api/* (regression guard).
All 10 new tests pass locally.
Summary
- Adds /healthz (liveness, always 200, no DB) and /readyz (readiness, runs SELECT 1 and checks pending migrations, 200 or 503).
- Both endpoints live outside /api/, so the existing auth middleware bypasses them with no change.
- Skips the implicit db.connect() in before_request for the two probe paths, so /healthz keeps answering when the database is down.

This unblocks a Kubernetes Helm chart and any HTTP load balancer (HAProxy / nginx / ELB) that today can only fall back to a TCP socket check.
Test plan
- tests/test_health.py: 10 new cases covering happy path, DB outage (mocked), pending migrations (mocked), migration-check exceptions, no-auth requirement, and a regression guard that both paths stay outside /api/*. All pass locally.
- tests/test_incidents_api.py and tests/test_maintenance_windows.py: they already exist on main and the touched files are not modified here.
- curl against a running server, healthy and with the DB pointed at an unreadable path:

| Server state | /healthz | /readyz |
|---|---|---|
| Healthy | 200 {"status":"ok"} | 200 {"status":"ready","database":"ok",…} |
| DB at unreadable path | 200 {"status":"ok"} | 503 {"status":"not_ready","database":"error","database_error":"unable to open database file"} |

- POST /api/integrations/alertmanager (sanity): 401, /api/* auth is untouched.
- Tests workflow on this PR (Python 3.10 / 3.11 / 3.12).

This is an AI-assisted PR.