Last Updated: 2026-03-29

Scope: Sentinel security-path incident response and recovery

Track and alert on:
- replay detections (token/proof)
- DPoP validation failures
- nonce challenge rate (use_dpop_nonce)
- SSF rejection and processing failure rates
- finance authorization-bounds-exceeded events
- cache dependency latency/error budget breaches
| Severity | Criteria |
|---|---|
| Sev-1 | security bypass suspected, widespread auth outage, or confirmed replay abuse campaign |
| Sev-2 | localized security-path degradation with user impact |
| Sev-3 | elevated warnings without confirmed user-impacting failures |
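
The severity table can be read as a first-match rule set. The following is a minimal sketch of that reading; the `IncidentSignals` flags and the `classify` function are hypothetical names introduced for illustration, not Sentinel fields or APIs.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    SEV1 = "Sev-1"
    SEV2 = "Sev-2"
    SEV3 = "Sev-3"


@dataclass(frozen=True)
class IncidentSignals:
    # Hypothetical triage flags, set by the responder during assessment.
    bypass_suspected: bool = False
    widespread_auth_outage: bool = False
    confirmed_replay_campaign: bool = False
    localized_degradation: bool = False
    user_impact: bool = False


def classify(signals: IncidentSignals) -> Severity:
    """Return the first matching row of the severity table, highest first."""
    if (signals.bypass_suspected
            or signals.widespread_auth_outage
            or signals.confirmed_replay_campaign):
        return Severity.SEV1
    if signals.localized_degradation and signals.user_impact:
        return Severity.SEV2
    # Elevated warnings without confirmed user-impacting failures.
    return Severity.SEV3
```

Checking Sev-1 criteria first means an incident that matches several rows is always assigned the higher severity.
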
- sudden increase in replay-related failures/alerts
- determine scope by route, client_id, subject, source network
- verify whether failures map to a single integration/client rollout
- check cache state health before concluding active abuse
- keep fail-closed behavior enabled
- notify security engineering if replay pattern is distributed/coordinated
- collect trace IDs and correlated logs for forensic timeline
- replay rate returns to baseline
- no permissive bypass behavior observed
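
Scoping replay failures by route, client_id, subject, and source network amounts to a grouped count over the failure events. The sketch below assumes the events are already exported as dictionaries; the key names (`route`, `client_id`, `subject`, `source_network`) are assumptions about the log shape, not a documented schema.

```python
from collections import Counter

# Assumed log field names; adjust to the actual structured-log schema.
DIMENSIONS = ("route", "client_id", "subject", "source_network")


def scope_breakdown(events, top=3):
    """Return the most frequent values per dimension among failure events."""
    return {
        dim: Counter(e.get(dim, "unknown") for e in events).most_common(top)
        for dim in DIMENSIONS
    }


def dominant_client_share(events):
    """Fraction of failures attributed to the single most frequent client_id."""
    if not events:
        return 0.0
    counts = Counter(e.get("client_id", "unknown") for e in events)
    return counts.most_common(1)[0][1] / len(events)
```

A share close to 1.0 suggests the failures map to a single integration or client rollout; a flat distribution across clients and networks is more consistent with a distributed pattern and is worth escalating to security engineering.
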
- increase in 401 responses with use_dpop_nonce
- client not persisting latest nonce
- intermediary stripping DPoP-Nonce or WWW-Authenticate headers
- nonce store/cache degradation
- verify challenge headers are emitted by API
- inspect edge/proxy header behavior
- validate cache health and latency
- work with affected clients on retry logic correctness (single retry with fresh nonce)
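
When reviewing client retry logic, the expected behavior is: persist the latest `DPoP-Nonce`, and on a `use_dpop_nonce` challenge retry exactly once with a fresh proof that carries the new nonce. The sketch below illustrates that shape only. `NonceAwareClient` and the injected `send` callable are hypothetical; `send(nonce)` is assumed to build a new DPoP proof for each call and return `(status, headers)`.

```python
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]
SendFn = Callable[[Optional[str]], Response]  # hypothetical transport hook


def is_nonce_challenge(status: int, headers: Mapping[str, str]) -> bool:
    """True for a 401 that carries use_dpop_nonce and a DPoP-Nonce header."""
    h = {k.lower(): v for k, v in headers.items()}
    return (status == 401
            and "use_dpop_nonce" in h.get("www-authenticate", "")
            and "dpop-nonce" in h)


class NonceAwareClient:
    def __init__(self, send: SendFn) -> None:
        self._send = send
        self.nonce: Optional[str] = None  # latest nonce seen from the server

    def _remember(self, headers: Mapping[str, str]) -> None:
        h = {k.lower(): v for k, v in headers.items()}
        if "dpop-nonce" in h:
            self.nonce = h["dpop-nonce"]

    def request(self) -> Response:
        status, headers = self._send(self.nonce)
        self._remember(headers)
        if is_nonce_challenge(status, headers):
            # Single retry with the fresh nonce; never loop on challenges.
            status, headers = self._send(self.nonce)
            self._remember(headers)
        return status, headers
```

Capping the retry at one attempt keeps a misbehaving intermediary or degraded nonce store from turning into a client-side retry storm.
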
- repeated 401/400 outcomes on SSF event endpoint
- identify failure class: auth token mismatch, signature/issuer, payload timing/shape
- verify IdP discovery/JWKS reachability
- check auth token config parity between sender and receiver
- rotate/update shared auth token if compromised or mismatched
- coordinate issuer key-rotation validation path
- isolate malformed sender batches to avoid event flood masking
- replay/nonce/session checks showing backend unavailability or timeouts
- requests on protected paths may fail closed (503 or denial, depending on the path)
- restore cache availability first; do not disable replay or blacklist checks
- verify state writes and reads recover
- run synthetic auth checks (nonce challenge + valid retry + replay rejection)
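
The synthetic check sequence (nonce challenge, valid retry, replay rejection) can be scripted as three probes against a protected endpoint. This is a sketch under stated assumptions: `call(nonce, proof_id)` is a hypothetical probe that sends a DPoP-protected request whose proof uses `proof_id` as its unique identifier and returns `(status, headers)`.

```python
import uuid


def _is_success(status):
    return 200 <= status < 300


def run_synthetic_auth_check(call):
    """Run the three probes in order and report which expectations held."""
    results = {}

    # 1. No nonce: expect a 401 challenge that supplies a DPoP-Nonce.
    status, headers = call(None, str(uuid.uuid4()))
    h = {k.lower(): v for k, v in headers.items()}
    nonce = h.get("dpop-nonce")
    results["nonce_challenge"] = (
        status == 401
        and nonce is not None
        and "use_dpop_nonce" in h.get("www-authenticate", "")
    )

    # 2. Fresh proof with the issued nonce: expect success.
    proof_id = str(uuid.uuid4())
    status, _ = call(nonce, proof_id)
    results["valid_retry"] = _is_success(status)

    # 3. Same proof again: expect a client-error rejection, not success.
    status, _ = call(nonce, proof_id)
    results["replay_rejected"] = 400 <= status < 500

    return results
```

If the server rotates nonces on every response, the third probe may be rejected for a stale nonce rather than for replay, so confirm the rejection reason in internal logs before treating replay protection as verified.
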
- increased authorization-bounds-exceeded warnings and 403 responses on transfer route
- compare expected signed bounds with submitted payload shapes
- detect client-side currency/amount normalization regressions
- assess for potential tampering/abuse signals
- preserve opaque external 403 response semantics
- use internal structured logs for detailed delta analysis
- coordinate fix rollout with affected client teams
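
For the internal delta analysis, it can help to classify how a submitted payload diverges from the signed bounds, separating likely client normalization regressions from genuine out-of-bounds requests. The sketch below is illustrative only; the field names (`currency`, `max_amount`, `amount`) and the finding labels are assumptions, not the actual Sentinel bounds format.

```python
from decimal import Decimal, InvalidOperation


def bounds_delta(signed, payload):
    """Return a list of findings describing how payload departs from signed bounds."""
    findings = []

    signed_cur = str(signed["currency"])
    sent_cur = str(payload.get("currency", ""))
    if signed_cur != sent_cur:
        # Same code after trimming/upper-casing points at client normalization.
        if signed_cur.upper() == sent_cur.strip().upper():
            findings.append("currency_normalization")
        else:
            findings.append("currency_mismatch")

    try:
        # Parse via str() so float inputs do not carry binary rounding noise.
        amount = Decimal(str(payload.get("amount")))
    except InvalidOperation:
        findings.append("amount_unparseable")
        return findings
    if not amount.is_finite():
        findings.append("amount_unparseable")
        return findings

    if amount > Decimal(str(signed["max_amount"])):
        findings.append("amount_exceeds_bound")
    return findings
```

These findings belong in internal structured logs only; the external response stays an opaque 403 regardless of which finding fired.
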
After mitigation:
- protected endpoints succeed for valid DPoP flows
- replay attempts are rejected
- nonce challenge volume normalizes
- SSF valid events are accepted and applied
- finance transfer policy denials match expected baseline
Always collect:
- incident time window
- impacted endpoints and prefixes
- trace IDs and correlation IDs
- dependency health snapshots
- mitigation actions and rollback conditions
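
One way to make the evidence checklist enforceable is a small record type that reports which items are still empty before an incident is closed. This is a sketch; `IncidentEvidence` and its field names are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class IncidentEvidence:
    window_start: Optional[datetime] = None
    window_end: Optional[datetime] = None
    impacted_endpoints: List[str] = field(default_factory=list)
    trace_ids: List[str] = field(default_factory=list)
    correlation_ids: List[str] = field(default_factory=list)
    dependency_snapshots: List[str] = field(default_factory=list)
    mitigation_actions: List[str] = field(default_factory=list)
    rollback_conditions: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of checklist items that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

An empty `missing()` result only confirms that every item has some entry; the quality of the evidence still needs human review.
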
Any incident-driven config or policy change in auth/security paths must trigger updates to:
- LIVING_THREAT_MODEL.md
- COMPLIANCE_AUDIT_MATRIX.md
- OPENAPI_3_1.yaml (if contract behavior changed)