Commit 1590ba6

docs: add release readiness checklist runbook and notes
1 parent e0cbf60 commit 1590ba6

5 files changed

Lines changed: 163 additions & 10 deletions

docs/OPERATIONS_RUNBOOK.md

Lines changed: 78 additions & 7 deletions
@@ -2,13 +2,84 @@
22

33
## Scope
44

5-
Operational procedures for production incidents, recovery, and maintenance.
5+
Operational procedures for production incidents, recovery, and maintenance across Netlify + Supabase.
66

7-
## Runbook Areas
7+
## Ownership
88

9-
- Incident triage
10-
- Service degradation response
11-
- Data export job failures
12-
- Migration rollback steps
9+
- Incident Commander: on-call platform engineer.
10+
- Communications Lead: project owner or delegate.
11+
- Database Lead: engineer with Supabase migration permissions.
1312

14-
Detailed runbook tasks are tracked in `/docs/plan/M08-quality-ci-cd-and-observability.md` and `/docs/plan/M09-release-readiness-and-pilot.md`.
13+
## Severity levels
14+
15+
- `SEV-1`: Platform unavailable, cross-tenant risk, or data integrity risk.
16+
- `SEV-2`: Major feature degradation (submissions, moderation, exports, API unavailable).
17+
- `SEV-3`: Partial degradation with viable workaround.
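
The severity definitions above can be sketched as a small triage helper. This is a minimal illustration, not part of the commit: the `IncidentSignals` fields and the fallback to `SEV-3` when no stronger signal is present are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentSignals:
    """Hypothetical triage inputs; the field names are illustrative only."""
    platform_unavailable: bool = False
    cross_tenant_risk: bool = False
    data_integrity_risk: bool = False
    # Submissions, moderation, exports, or the public API are unavailable.
    major_feature_degraded: bool = False


def classify_severity(signals: IncidentSignals) -> str:
    """Map triage signals to the SEV labels used in the runbook."""
    if (signals.platform_unavailable
            or signals.cross_tenant_risk
            or signals.data_integrity_risk):
        return "SEV-1"
    if signals.major_feature_degraded:
        return "SEV-2"
    # Assumption: anything weaker is partial degradation with a workaround.
    return "SEV-3"
```

Checking the `SEV-1` conditions first means a data integrity concern is never downgraded because a feature outage was reported at the same time.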
18+
19+
## First 15 minutes
20+
21+
1. Declare incident severity and open incident channel.
22+
2. Confirm blast radius:
23+
- Public browse/commenting
24+
- Agency operations/moderation
25+
- Public API exports
26+
3. Freeze deploys until incident is stabilized.
27+
4. Capture current signals:
28+
- Netlify deploy health
29+
- Supabase project status
30+
- Recent function errors (`submit-comment`, `public-api`, `generate-export`)
31+
32+
## Core playbooks
33+
34+
### 1) Public submission failures
35+
36+
- Check edge function logs for `submit-comment` errors.
37+
- Validate required environment variables: `HCAPTCHA_SECRET_KEY`, Supabase keys.
38+
- Check abuse-event volume spikes indicating CAPTCHA/rate-limit pressure.
39+
- Mitigation:
40+
- Temporarily reduce traffic pressure using stricter edge throttles.
41+
- If a CAPTCHA provider outage occurs, switch to a moderated maintenance banner for submissions.
42+
43+
### 2) Public API degradation
44+
45+
- Check `public-api` function logs and latency.
46+
- Verify `api_rate_limits` table growth and cleanup behavior.
47+
- Confirm RPC dependencies (`get_public_dockets`, `get_docket_public_detail`, `get_comment_detail`) return expected responses.
48+
- Mitigation:
49+
- Increase edge function concurrency limits (where available).
50+
- Apply temporary lower rate limits for abusive routes.
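
One way to picture the "temporary lower rate limits" mitigation is a per-route override that can only tighten the default. The sketch below is hypothetical: the default of 60 requests per window, the override mapping, and the function names are assumptions, and it does not reflect the actual `api_rate_limits` schema.

```python
from typing import Mapping

# Assumed baseline for the example; the real default is not stated in the runbook.
DEFAULT_LIMIT_PER_WINDOW = 60


def effective_limit(route: str,
                    overrides: Mapping[str, int],
                    default: int = DEFAULT_LIMIT_PER_WINDOW) -> int:
    """Return the limit for a route, never higher than the default."""
    override = overrides.get(route)
    if override is None:
        return default
    return min(default, override)


def is_allowed(requests_in_window: int,
               route: str,
               overrides: Mapping[str, int]) -> bool:
    """Allow the request only while the window count is below the limit."""
    return requests_in_window < effective_limit(route, overrides)
```

Taking the minimum of the default and the override keeps an incident-time override from accidentally raising a limit.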
51+
52+
### 3) Export pipeline failures
53+
54+
- Inspect `exports` table rows in `failed`/stalled `processing`.
55+
- Check `generate-export` function errors and storage write failures.
56+
- Mitigation:
57+
- Requeue failed jobs by creating replacement job records.
58+
- Expire stale jobs and notify agency users.
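
The inspection step for this playbook could look roughly like the following. It is a sketch under stated assumptions: the row keys `id`, `status`, and `updated_at` and the 30-minute stall threshold are hypothetical, since the `exports` table columns are not shown here.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Mapping, Tuple

# Assumed threshold; tune to the real export duration profile.
STALL_THRESHOLD = timedelta(minutes=30)


def jobs_needing_attention(
    rows: Iterable[Mapping],
    now: datetime,
    stall_threshold: timedelta = STALL_THRESHOLD,
) -> Tuple[List[str], List[str]]:
    """Split export rows into failed job ids and stalled `processing` job ids."""
    failed: List[str] = []
    stalled: List[str] = []
    for row in rows:
        if row["status"] == "failed":
            failed.append(row["id"])
        elif (row["status"] == "processing"
              and now - row["updated_at"] > stall_threshold):
            # Still marked as processing but not updated recently.
            stalled.append(row["id"])
    return failed, stalled
```

Failed ids are candidates for replacement job records, while stalled ids are candidates for expiry and user notification.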
59+
60+
### 4) Data integrity or policy risk
61+
62+
- Use `audit_events` for timeline reconstruction.
63+
- Use `abuse_events` to detect suspicious submission patterns.
64+
- If cross-tenant risk is suspected: disable affected endpoints and enforce read-only mode for agency actions until triaged.
65+
66+
## Rollback procedures
67+
68+
### Application rollback (Netlify)
69+
70+
1. Identify last known good deploy.
71+
2. Promote previous deploy in Netlify UI/CLI.
72+
3. Re-run smoke checks on public + agency flows.
73+
74+
### Database rollback (Supabase)
75+
76+
1. Stop deploy pipeline.
77+
2. Identify last migration applied before incident.
78+
3. Execute approved rollback playbook for affected migration set.
79+
4. Validate RLS policies and key RPCs before re-opening write traffic.
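
Step 2 of the database rollback can be illustrated with a small helper. This is only a sketch: it assumes a list of `(version, applied_at)` pairs gathered by the operator, which is a hypothetical shape and not a statement about how Supabase records migration history.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple

# Hypothetical record shape: (migration version, time it was applied).
MigrationRecord = Tuple[str, datetime]


def last_migration_before(
    migrations: List[MigrationRecord],
    incident_start: datetime,
) -> Optional[str]:
    """Return the version of the most recent migration applied before the incident."""
    earlier = [m for m in migrations if m[1] < incident_start]
    if not earlier:
        return None
    # Pick the record with the latest applied_at among those before the incident.
    return max(earlier, key=lambda m: m[1])[0]
```

A result of `None` means no migration preceded the incident in the supplied history, which would point the investigation away from schema changes.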
80+
81+
## Post-incident actions
82+
83+
1. Publish incident summary with root cause and customer impact.
84+
2. Create follow-up tasks for preventive fixes.
85+
3. Update this runbook if a playbook gap was discovered.

docs/RELEASE_NOTES.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
1+
# Release Notes
2+
3+
## v1.0.0-rc1 (milestone-driven candidate)
4+
5+
### Platform foundation
6+
7+
- Added formal milestone tracking under `docs/plan` with append-only progress logs.
8+
- Reconciled schema contracts with application usage and added missing domain tables/RPCs.
9+
- Added CI pipeline checks for lint, typecheck, tests, and build.
10+
11+
### Agency operations
12+
13+
- Replaced mock docket list/dashboard/detail behavior with real Supabase-backed workflows.
14+
- Added docket edit-mode support in wizard route (`/agency/dockets/:id/edit`).
15+
- Canonicalized moderation queue behavior for `under_review` and `published` status families.
16+
17+
### Public workflow
18+
19+
- Hardened public comment submission with server-side CAPTCHA verification and rate limits.
20+
- Enforced per-docket identity policy and deterministic initial moderation status.
21+
- Added shared submission policy module and unit tests for identity/status rule coverage.
22+
23+
### Open data and exports
24+
25+
- Delivered public API v1 routes for dockets, comments, agencies, and exports.
26+
- Added API route rate limiting primitives and metadata behavior.
27+
- Completed export generation paths for CSV/ZIP/combined outputs.
28+
29+
### Security and compliance
30+
31+
- Added immutable `audit_events` and `abuse_events` telemetry.
32+
- Added PII-safe export masking controls.
33+
- Added control mapping and observability documentation.
34+
35+
### Verification additions
36+
37+
- Added schema contract tests ensuring `.from()`/`.rpc()` usage aligns with migrations.
38+
- Added tenant isolation policy tests for RLS assumptions.
39+
- Added submission policy tests for identity modes and status assignment.
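
The idea behind the schema contract tests can be sketched as a scan of application source for `.from()` and `.rpc()` names, compared against the names defined in migrations. This is an illustration of the technique, not the project's test code; the regular expression and function names are assumptions, and a plain regex will also match unrelated calls such as `Array.from('x')`.

```python
import re
from typing import List, Set, Tuple

# Matches .from('name') or .rpc("name") with a simple identifier argument.
CALL_PATTERN = re.compile(r"""\.(from|rpc)\(\s*["']([A-Za-z_]\w*)["']""")


def referenced_names(source: str) -> Tuple[Set[str], Set[str]]:
    """Collect table names used via .from() and function names used via .rpc()."""
    tables: Set[str] = set()
    rpcs: Set[str] = set()
    for kind, name in CALL_PATTERN.findall(source):
        if kind == "from":
            tables.add(name)
        else:
            rpcs.add(name)
    return tables, rpcs


def missing_from_schema(
    source: str,
    known_tables: Set[str],
    known_rpcs: Set[str],
) -> Tuple[List[str], List[str]]:
    """Return sorted table and RPC names used in source but absent from the schema."""
    tables, rpcs = referenced_names(source)
    return sorted(tables - known_tables), sorted(rpcs - known_rpcs)
```

A contract test would then assert that both returned lists are empty for every scanned source file.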
40+
41+
## Known open items before final GA tag
42+
43+
- Fresh local migration replay validation is blocked without Docker daemon.
44+
- Pilot execution evidence and pilot-driven hardening are still required.
45+
- Final WCAG 2.1 AA evidence pack for required pages remains open.

docs/RELEASE_READINESS_CHECKLIST.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# Release Readiness Checklist
2+
3+
This checklist is the operational gate for OpenComments release promotion.
4+
5+
## Current execution snapshot (2026-02-11)
6+
7+
- [x] `npm run lint` passes with warnings only (no lint errors).
8+
- [x] `npm run typecheck` passes.
9+
- [x] `npm run test:ci` passes.
10+
- [x] `npm run build` passes.
11+
- [ ] Local clean-database migration replay (`supabase db reset --local`) verified in this environment.
12+
- [x] Public API v1 route contracts documented.
13+
- [x] Submission workflow identity/rate/captcha controls implemented and tested.
14+
- [x] Tenant isolation policy assumptions are validated by tests.
15+
- [x] Export pipeline supports CSV/ZIP/combined outputs and status tracking.
16+
- [x] Immutable audit and abuse event logging are active in schema/migrations.
17+
- [ ] Pilot agency workflow sign-off completed.
18+
- [ ] Production cutover approval recorded.
19+
20+
## Gate criteria
21+
22+
A release candidate can be promoted only when:
23+
24+
1. All mandatory checks above are complete.
25+
2. Any unchecked items are explicitly accepted as risk by release owner.
26+
3. Rollback path is verified for both app deploy and database migration.
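
The gate criteria above could be evaluated mechanically along these lines. The sketch reads criteria 1 and 2 together as "every check is either complete or explicitly risk-accepted", which is an interpretation; the `CheckItem` type and `can_promote` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class CheckItem:
    """One checklist entry; risk_accepted models sign-off by the release owner."""
    name: str
    complete: bool
    risk_accepted: bool = False


def can_promote(
    checks: Iterable[CheckItem],
    app_rollback_verified: bool,
    db_rollback_verified: bool,
) -> Tuple[bool, List[str]]:
    """Return (promotable, blockers) for a release candidate."""
    # A check blocks promotion when it is neither complete nor risk-accepted.
    blockers = [c.name for c in checks if not (c.complete or c.risk_accepted)]
    if not app_rollback_verified:
        blockers.append("app deploy rollback path not verified")
    if not db_rollback_verified:
        blockers.append("database migration rollback path not verified")
    return (not blockers, blockers)
```

Returning the list of blockers alongside the boolean gives the release owner a concrete record of what still needs sign-off.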
27+
28+
## Required evidence
29+
30+
- CI run URL or captured command output for lint/typecheck/test/build.
31+
- Migration report for target environment.
32+
- API smoke test results for `/v1/dockets`, `/v1/comments`, `/v1/agencies`, `/v1/exports`.
33+
- Security checklist sign-off (abuse controls, audit logging, PII handling).
34+
- Rollback dry run notes.
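
The API smoke test evidence could be produced by a helper like the one below. It is a sketch only: the `get_status` callable is injected by the caller, and treating HTTP 200 as the sole passing status is an assumption, since the expected responses are not specified here.

```python
from typing import Callable, Dict

# Routes listed in the required evidence above.
SMOKE_ROUTES = ("/v1/dockets", "/v1/comments", "/v1/agencies", "/v1/exports")


def run_smoke_checks(
    base_url: str,
    get_status: Callable[[str], int],
) -> Dict[str, bool]:
    """Call each public API route once and record whether it returned 200."""
    results: Dict[str, bool] = {}
    for route in SMOKE_ROUTES:
        url = base_url.rstrip("/") + route
        try:
            results[route] = get_status(url) == 200
        except Exception:
            # Network errors and raised HTTP errors count as failures.
            results[route] = False
    return results
```

In practice `get_status` might wrap `urllib.request.urlopen(url).status`; injecting it keeps the helper testable without network access.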

docs/plan/M09-release-readiness-and-pilot.md

Lines changed: 5 additions & 2 deletions
@@ -14,10 +14,10 @@ Complete production readiness verification and pilot execution.
1414

1515
## Implementation checklist
1616

17-
- [ ] Define and execute readiness checklist.
17+
- [x] Define and execute readiness checklist.
1818
- [ ] Execute pilot and capture findings.
1919
- [ ] Apply pilot-driven fixes.
20-
- [ ] Publish runbook and release notes.
20+
- [x] Publish runbook and release notes.
2121

2222
## Acceptance criteria
2323

@@ -35,3 +35,6 @@ Complete production readiness verification and pilot execution.
3535
## Progress log (append-only)
3636

3737
- 2026-02-11: Milestone initialized.
38+
- 2026-02-11: Published `docs/RELEASE_READINESS_CHECKLIST.md` and executed current engineering quality gates (lint/typecheck/test/build) as readiness evidence.
39+
- 2026-02-11: Expanded `docs/OPERATIONS_RUNBOOK.md` from placeholder to actionable incident and rollback playbooks for production operations.
40+
- 2026-02-11: Published `docs/RELEASE_NOTES.md` with milestone-driven v1.0.0-rc1 candidate notes and open GA blockers.

docs/plan/README.md

Lines changed: 1 addition & 1 deletion
@@ -40,4 +40,4 @@ This folder tracks implementation progress for the OpenComments production roadm
4040
- `M06`: Completed
4141
- `M07`: In progress
4242
- `M08`: Completed
43-
- `M09`: Planned
43+
- `M09`: In progress
