Add protocol incident response runbook #175

Merged
punk6529 merged 4 commits into main from codex/protocol-incident-response-runbooks
Jun 12, 2026

@punk6529 punk6529 commented Jun 12, 2026

Summary

  • add a no-secret protocol incident-response runbook for stuck auctions, failed/stale randomness, bad Merkle roots, curator claims, metadata/dependency mistakes, signer compromise, drop-pause decisions, and release artifact/evidence mistakes
  • add a focused incident-response checker and tests, then wire them into Makefile, CI, and local shell/PowerShell check wrappers
  • link the runbook from security, release readiness, tooling, audit package, randomizer/dependency operations, roadmap, and release artifact docs, then refresh release manifest/checksum outputs

Validation

  • python scripts/test_incident_response.py
  • python scripts/check_incident_response.py
  • python scripts/test_release_readiness.py
  • python scripts/check_release_readiness.py
  • python scripts/test_audit_package.py
  • python scripts/check_audit_package.py
  • python scripts/test_release_manifest.py
  • python scripts/generate_release_manifest.py --check
  • python scripts/test_release_checksums.py
  • python scripts/generate_release_checksums.py --check
  • python scripts/check_changelog.py
  • bash -n scripts/check.sh
  • PowerShell parser check for scripts/check.ps1
  • python -m py_compile for touched scripts/tests
  • rg -n "^#|^##|^###" docs\incident-response.md docs\release-readiness.md docs\tooling.md docs\randomizer-operations.md docs\dependency-operations.md SECURITY.md ops\ROADMAP.md ops\AUTONOMOUS_RUN.md
  • git diff --check
  • make check
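
The `--check` invocations in the list above suggest a mode that verifies committed outputs without rewriting them. The following is a minimal sketch of that pattern, not the repository's actual `scripts/generate_release_checksums.py`, whose source is not shown on this page. The function names `sha256_of`, `render_sums`, and `check_sums` are hypothetical, and the `<hex>  <path>` line layout is assumed from the common `sha256sum` convention.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def render_sums(root: Path, relative_paths: list[str]) -> str:
    """Render one '<hex>  <path>' line per file, sorted for stable output."""
    lines = [f"{sha256_of(root / rel)}  {rel}" for rel in sorted(relative_paths)]
    return "\n".join(lines) + "\n"


def check_sums(root: Path, relative_paths: list[str], sums_file: Path) -> list[str]:
    """Hypothetical check mode: report drift instead of rewriting the file."""
    if not sums_file.is_file():
        return [f"missing checksum file: {sums_file}"]
    if sums_file.read_text(encoding="utf-8") != render_sums(root, relative_paths):
        return [f"stale checksum file: {sums_file}"]
    return []
```

A gate built this way would exit non-zero whenever `check_sums` returns a non-empty list, which is one way a documentation change could force the release checksum outputs to be refreshed in the same PR.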

CodeRabbit follow-up validation:

  • python scripts/test_release_readiness.py
  • python scripts/check_release_readiness.py
  • git diff --check

Closes #173

Summary by CodeRabbit

  • Documentation

    • Added an incident response runbook (scenarios, triage, evidence rules, reopening, maintenance) and updated operational, release-readiness, audit, tooling, dependency, randomizer, roadmap, and security guidance to reference it
  • Tests & Validation

    • Added automated checks and unit tests to verify the incident response runbook content and linked evidence
  • Chores

    • Integrated incident-response validation into CI, Makefile targets, release manifest generation, and release gating steps

@punk6529 punk6529 commented

@coderabbitai review

coderabbitai Bot commented Jun 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d7a8a9e-3c5c-4370-972d-8d79a44c009a

📥 Commits

Reviewing files that changed from the base of the PR and between db49a0e and 574804b.

📒 Files selected for processing (3)
  • release-artifacts/latest/SHA256SUMS
  • release-artifacts/latest/release-checksums.json
  • release-artifacts/latest/release-manifest.json
✅ Files skipped from review due to trivial changes (1)
  • release-artifacts/latest/SHA256SUMS
🚧 Files skipped from review as they are similar to previous changes (2)
  • release-artifacts/latest/release-checksums.json
  • release-artifacts/latest/release-manifest.json

Walkthrough

Adds an operator-facing incident response runbook, a CLI validator and unit tests, and integrates them into local/CI checks, release manifest generation, audit/readiness checkers, Makefile targets, tooling docs, and release-artifact metadata.

Changes

Incident Response Runbook and Validation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Incident response runbook definition | docs/incident-response.md | Operator-facing runbook defining severity levels, roles, universal triage procedure, evidence retention/no-secret rules, and focused runbooks for stuck auctions, failed/stale randomness, bad Merkle roots, bad metadata/dependency configuration, signer compromise, and release artifact mistakes, with reopening criteria and post-incident review requirements. |
| Runbook validator and test suite | scripts/check_incident_response.py, scripts/test_incident_response.py | CLI validator that enforces required Markdown headings, maturity/incident phrases, command substrings, and valid local link targets; comprehensive test suite verifying positive cases (committed runbook, minimal generated runbook) and negative cases (missing elements, dead links, path escapes). |
| Release & audit integration | scripts/generate_release_manifest.py, scripts/check_release_readiness.py, scripts/check_audit_package.py, scripts/test_release_manifest.py, scripts/test_release_readiness.py | Extends release readiness and audit package checkers to require incident-response phrase, commands, and documentation link; includes runbook in default governance docs embedded in release manifest; seeds test fixtures with docs/incident-response.md. |
| Build and CI gate wiring | Makefile, .github/workflows/ci.yml, scripts/check.sh, scripts/check.ps1 | Adds incident-response-check phony target and wires it into local make check and release-manifest targets; includes syntax checks in CI Repository Hygiene step; adds dedicated "Incident response" CI workflow step with separate logs; integrates validator calls into local check scripts. |
| Documentation references and release artifacts | SECURITY.md, docs/audit-package.md, docs/dependency-operations.md, docs/randomizer-operations.md, docs/release-readiness.md, docs/tooling.md, release-artifacts/README.md, release-artifacts/latest/{SHA256SUMS, release-checksums.json, release-manifest.json}, ops/AUTONOMOUS_RUN.md, ops/ROADMAP.md, CHANGELOG.md | Cross-references incident-response runbook from security, audit, tooling, and operations docs; updates roadmap baseline and operational maturity guidance; records PR #175 state in autonomous-run queue; refreshes release manifest/checksum hashes to reflect new/changed governance docs and changelog entry. |
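
The validator described in the summary above (required headings, required phrases, and local link targets that must exist and stay inside the repository) can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `scripts/check_incident_response.py`: the heading and phrase lists, the link regex, and the `check_runbook` name are all hypothetical, and the command-substring check is omitted for brevity.

```python
import re
from pathlib import Path

# Hypothetical requirements; the real checker's lists are not shown on this page.
REQUIRED_HEADINGS = ["## Severity Levels", "## Evidence Rules"]
REQUIRED_PHRASES = ["no-secret"]

# Matches inline Markdown links such as [text](target).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def check_runbook(repo_root: Path, runbook: Path) -> list[str]:
    """Return human-readable problems; an empty list means the runbook passes."""
    text = runbook.read_text(encoding="utf-8")
    stripped_lines = {line.strip() for line in text.splitlines()}
    problems = []
    for heading in REQUIRED_HEADINGS:
        if heading not in stripped_lines:
            problems.append(f"missing heading: {heading}")
    for phrase in REQUIRED_PHRASES:
        if phrase not in text:
            problems.append(f"missing phrase: {phrase}")
    root = repo_root.resolve()
    for target in LINK_PATTERN.findall(text):
        if SCHEME_PATTERN.match(target) or target.startswith("#"):
            continue  # external URLs and in-page anchors are out of scope here
        resolved = (runbook.parent / target.split("#", 1)[0]).resolve()
        if not resolved.is_relative_to(root):  # requires Python 3.9+
            problems.append(f"link escapes repository: {target}")
        elif not resolved.exists():
            problems.append(f"dead link: {target}")
    return problems
```

Resolving each target before comparing it with the repository root is what lets a check like this catch `../../` path escapes, one of the negative cases the summary says the test suite covers.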

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

"🐰 I hopped through docs and CI with care,
A runbook stitched, checked, and logged there,
Six playbooks in hand,
Tests passing on land,
Evidence tidy — auditors stare! 🎉"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add protocol incident response runbook' is concise and clearly summarizes the main change: introducing incident response documentation and related validation infrastructure. |
| Linked Issues check | ✅ Passed | The PR fully implements all coding-related requirements from issue #173: adds docs/incident-response.md, creates checker/test scripts, wires them into Makefile/CI/shell scripts, links from required docs, and refreshes release artifacts. |
| Out of Scope Changes check | ✅ Passed | All changes are in scope: incident response documentation, validation scripts, build system integration, documentation links, and release artifact updates directly support issue #173 objectives with no Solidity code or unrelated changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.83%, which is sufficient. The required threshold is 80.00%. |

coderabbitai Bot commented Jun 12, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/release-readiness.md (1)

38-43: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Keep the CI/local-gates summary in sync.

The readiness table still omits the incident-response gate even though make check and CI now run it. That makes the dashboard summary stale.

Suggested wording
-| CI and local gates | Passing local/CI baseline exists for build, tests, size, local deployment rehearsals, release artifacts, architecture/threat model, audit package, release manifest, checksums, and changelog | No | No, but release commit CI must be green |
+| CI and local gates | Passing local/CI baseline exists for build, tests, size, local deployment rehearsals, incident response, release artifacts, architecture/threat model, audit package, release manifest, checksums, and changelog | No | No, but release commit CI must be green |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/release-readiness.md` around lines 38 - 43, Update the readiness table
row under the "CI and local gates" summary to include the incident-response gate
so the dashboard matches current checks; specifically, add a cell noting that
"incident-response" is now run by make check/CI and mark its status
appropriately (e.g., Passing or Yes) alongside the existing build, tests, size,
local deployment rehearsals, release artifacts, etc.; ensure the table text that
starts with "CI and local gates" and the phrasing referencing "make check" and
CI are synchronized so the summary is no longer stale.
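
The staleness described in this finding is the kind of drift a small automated check could catch. Below is a rough sketch that parses a Markdown pipe-table row and reports any gate names it never mentions. The helper names and the `GATES` selection are hypothetical; the two row strings are copied from the suggested wording above. Nothing here is taken from `scripts/check_release_readiness.py`.

```python
def table_row_cells(markdown: str, row_label: str) -> list[str]:
    """Return the cells of the first pipe-table row whose first cell is row_label."""
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if cells and cells[0] == row_label:
            return cells
    raise ValueError(f"row not found: {row_label}")


def missing_gates(markdown: str, row_label: str, gates: list[str]) -> list[str]:
    """List the gate names that the row's remaining cells never mention."""
    summary = " ".join(table_row_cells(markdown, row_label)[1:]).lower()
    return [gate for gate in gates if gate.lower() not in summary]


# Hypothetical subset of gates that the summary row is expected to name.
GATES = ["build", "tests", "incident response", "release manifest", "checksums", "changelog"]

STALE_ROW = (
    "| CI and local gates | Passing local/CI baseline exists for build, tests, size, "
    "local deployment rehearsals, release artifacts, architecture/threat model, audit "
    "package, release manifest, checksums, and changelog | No | No, but release commit "
    "CI must be green |"
)
FIXED_ROW = STALE_ROW.replace(
    "local deployment rehearsals, release artifacts",
    "local deployment rehearsals, incident response, release artifacts",
)

print(missing_gates(STALE_ROW, "CI and local gates", GATES))  # ['incident response']
print(missing_gates(FIXED_ROW, "CI and local gates", GATES))  # []
```

Substring matching keeps the sketch short but is loose: a gate name that happens to appear inside another word would count as mentioned, so a stricter check might compare against the comma-separated items instead.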
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ops/AUTONOMOUS_RUN.md`:
- Around line 165-174: The runbook summary in AUTONOMOUS_RUN's "Queue Item 90"
must be expanded to match issue `#173` by re-adding the missing "curator-claims"
and "drop-pause" slices; update the summary section text so it explicitly lists
curator-claims and drop-pause alongside the existing items (stuck auctions,
failed/stale randomness, bad Merkle roots, bad metadata/dependency
configuration, signer compromise, and release artifact/evidence mistakes) and
ensure any links or acceptance-criteria bullets reference issue `#173` to keep
scope aligned.

---

Outside diff comments:
In `@docs/release-readiness.md`:
- Around line 38-43: Update the readiness table row under the "CI and local
gates" summary to include the incident-response gate so the dashboard matches
current checks; specifically, add a cell noting that "incident-response" is now
run by make check/CI and mark its status appropriately (e.g., Passing or Yes)
alongside the existing build, tests, size, local deployment rehearsals, release
artifacts, etc.; ensure the table text that starts with "CI and local gates" and
the phrasing referencing "make check" and CI are synchronized so the summary is
no longer stale.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 608b8b67-eccf-4ec7-9999-e3f0cb77a618

📥 Commits

Reviewing files that changed from the base of the PR and between 074ac3e and 0846615.

📒 Files selected for processing (25)
  • .github/workflows/ci.yml
  • CHANGELOG.md
  • Makefile
  • SECURITY.md
  • docs/audit-package.md
  • docs/dependency-operations.md
  • docs/incident-response.md
  • docs/randomizer-operations.md
  • docs/release-readiness.md
  • docs/tooling.md
  • ops/AUTONOMOUS_RUN.md
  • ops/ROADMAP.md
  • release-artifacts/README.md
  • release-artifacts/latest/SHA256SUMS
  • release-artifacts/latest/release-checksums.json
  • release-artifacts/latest/release-manifest.json
  • scripts/check.ps1
  • scripts/check.sh
  • scripts/check_audit_package.py
  • scripts/check_incident_response.py
  • scripts/check_release_readiness.py
  • scripts/generate_release_manifest.py
  • scripts/test_incident_response.py
  • scripts/test_release_manifest.py
  • scripts/test_release_readiness.py

Comment thread ops/AUTONOMOUS_RUN.md
@punk6529 punk6529 merged commit 4be2808 into main Jun 12, 2026
2 checks passed

@claude claude Bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

Development

Successfully merging this pull request may close these issues.

Add protocol incident response runbooks

1 participant