From a7a578a3c0d81cebea72d5bb6b6258e07155476e Mon Sep 17 00:00:00 2001 From: wheval Date: Sun, 28 Jun 2026 00:21:56 +0100 Subject: [PATCH] docs: add emergency pause runbook Document when to use the contract's pause/unpause mechanism, how to invoke it via CLI and JS SDK, what operations are blocked vs. still available while paused, steps to take while paused (snapshot, reproduce, fix, upgrade), and the criteria that must hold before calling unpause. Closes accesslayerorg/accesslayer-contracts#490 --- docs/emergency-pause-runbook.md | 192 ++++++++++++++++++++++++++++++++ 1 file changed, 192 insertions(+) create mode 100644 docs/emergency-pause-runbook.md diff --git a/docs/emergency-pause-runbook.md b/docs/emergency-pause-runbook.md new file mode 100644 index 0000000..a866d1e --- /dev/null +++ b/docs/emergency-pause-runbook.md @@ -0,0 +1,192 @@ +# Emergency Pause Runbook + +The `CreatorKeysContract` exposes `pause(admin)` and `unpause(admin)` entry points +that allow the protocol admin to halt all state-mutating operations without a contract +upgrade or migration. This document describes when to use them, how to execute them +safely, and what conditions must hold before calling `unpause`. + +--- + +## 1. Conditions That Justify an Emergency Pause + +Pausing is a last resort. Trigger it when any of the following is confirmed or +strongly suspected: + +| Trigger | Example | +|---|---| +| **Active exploit in a write path** | An attacker is draining the bonding-curve pool or minting supply without payment | +| **Critical logic bug discovered** | A fee calculation produces a negative payout; a division-by-zero path is reachable in production | +| **Treasury / fee-recipient drain** | Protocol fee balance is decreasing at an anomalous rate not explained by legitimate trades | +| **Unauthorized admin change** | `set_protocol_admin` was called by an unexpected account | +| **Oracle / price manipulation** | An upstream price feed is being exploited to affect buy/sell quotes | + +Do **not** pause for: + +- Routine upgrades — use the upgrade path instead +- Network congestion or high fees — not a contract-level concern +- Single failed transaction by a legitimate user + +--- + +## 2. Who Can Pause + +Only the address stored in the contract's `protocol_admin` storage key can call +`pause` or `unpause`. Any other caller receives `ContractError::Unauthorized` and +the call is reverted on-chain. + +The admin address is set via `set_protocol_admin(caller, new_admin)` where `caller` +must be the current admin (or the admin bootstrapping the first admin during +initialization). + +--- + +## 3. What Is Blocked While Paused + +All state-mutating operations return `ContractError::ProtocolPaused` immediately: + +- `buy_key` — purchasing creator keys +- `sell_key` / `buyback` — selling or buying back creator keys +- `register_creator` — new creator registration +- `distribute_dividend` / `claim_dividend` — dividend operations +- `transfer_keys` — peer-to-peer key transfers +- Any future write entry points added to the contract + +**Read-only views continue to work** while paused: + +- `get_is_paused`, `is_creator_registered`, `get_total_key_supply` +- `get_key_balance`, `get_creator_details`, `get_protocol_fee_view` +- `get_buy_quote`, `get_sell_quote` (price simulation only — no state changes) +- All other `get_*` / `is_*` view methods + +This means monitoring dashboards, explorers, and read-only integrations remain +operational and can be used to diagnose the issue while the contract is paused. + +--- + +## 4. How to Pause + +1. **Confirm admin key is available** — locate the admin keypair for the deployed + network (testnet or mainnet). Never use a hot key for the admin on mainnet. + +2. **Invoke `pause(admin)` on-chain:** + + Using the Stellar CLI: + ```sh + stellar contract invoke \ + --id \ + --source \ + --network \ + -- pause \ + --admin + ``` + + Using the JavaScript SDK: + ```ts + import { Contract, TransactionBuilder, Networks, Keypair } from '@stellar/stellar-sdk'; + + const contract = new Contract(CONTRACT_ID); + const adminKeypair = Keypair.fromSecret(ADMIN_SECRET); + const account = await server.loadAccount(adminKeypair.publicKey()); + const tx = new TransactionBuilder(account, { fee: BASE_FEE, networkPassphrase: Networks.PUBLIC }) + .addOperation(contract.call('pause', xdr.ScVal.scvAddress(...))) + .setTimeout(30) + .build(); + tx.sign(adminKeypair); + await server.sendTransaction(tx); + ``` + +3. **Verify the pause is in effect:** + ```sh + stellar contract invoke \ + --id \ + --network \ + -- get_is_paused + ``` + Expected output: `true` + +4. **Communicate to users** — post a status update on the protocol's public channels + (Discord, Twitter/X, status page) explaining that the contract is temporarily + paused and that funds are safe. Avoid disclosing exploit details until a fix is + deployed. + +--- + +## 5. Steps While the Contract Is Paused + +1. **Preserve state** — query all relevant on-chain state (`get_key_balance`, + `get_creator_details`, fee balances) immediately after pausing and save a snapshot. + +2. **Reproduce the issue** — reproduce the bug in a forked testnet or local `soroban` + environment. Do not run exploits against the live contract even while paused. + +3. **Assess impact** — determine which accounts were affected, what amounts are at + risk, and whether the bug can be triggered retroactively against existing state. + +4. **Prepare a fix** — implement and test the fix. The fix must pass: + - All existing Soroban unit tests (`cargo test`) + - A regression test that specifically covers the exploit scenario + - Manual review by at least one additional engineer + +5. **Deploy the fix** — use the `upgrade(new_wasm_hash)` entry point (admin-only) + to upgrade the contract WASM in place. Soroban upgrades preserve all storage state. + Verify the new WASM hash matches the expected build output before invoking. + +6. **Test on testnet first** — deploy the fixed WASM to testnet, reproduce the full + attack scenario, confirm it is blocked, and confirm legitimate operations work. + +--- + +## 6. Criteria That Must Hold Before Calling `unpause` + +Do **not** unpause until all of the following are true: + +- [ ] The vulnerability is **fully understood** — root cause identified, not just + symptoms suppressed +- [ ] A **fix is deployed** on mainnet (not just testnet) via `upgrade` +- [ ] The fix has been **independently verified** by a second engineer running the + exploit scenario against the upgraded contract +- [ ] Any **affected users have been identified** and a remediation plan exists (even + if not yet executed) +- [ ] A **post-mortem draft** is ready to publish within 24 hours of unpausing +- [ ] **Key rotation has been considered** — if the admin key or any privileged key + may have been exposed, rotate before unpausing + +If the fix cannot be deployed in time (e.g. requires a breaking state migration), +keep the contract paused and evaluate a controlled migration to a new contract +deployment with state replay. + +--- + +## 7. How to Unpause + +Once all criteria above are satisfied: + +```sh +stellar contract invoke \ + --id \ + --source \ + --network mainnet \ + -- unpause \ + --admin +``` + +Immediately verify: +```sh +stellar contract invoke --id --network mainnet -- get_is_paused +# Expected: false +``` + +Then verify a test `buy_key` transaction succeeds on mainnet with a small amount +before announcing to users that the protocol is operational again. + +--- + +## 8. Warning: Never Unpause Without a Deployed Fix + +Unpausing before the vulnerability is fixed re-exposes users to the same exploit. +Even under stakeholder pressure, **pausing again after an early unpause is worse +than staying paused** — it signals to the ecosystem that the protocol is reactive +rather than deliberate. + +If you are uncertain whether the fix is sufficient, keep the contract paused and +consult additional reviewers before proceeding.