Commit 455fca2

feat(governance): add notification threshold override audit
1 parent 9bec93d commit 455fca2

8 files changed

Lines changed: 938 additions & 53 deletions

docs/brainstorms/2026-04-16-mainline-ci-stabilization-and-m7-direction-requirements.md

Lines changed: 35 additions & 1 deletion
@@ -673,6 +673,40 @@ Deliverables:
   - `npm run test:agent-workspace:contracts`
   - `npm run verify:agent-workspace:runtime`
 
+### M7.22 (Now): Notification Threshold Overrides and Audit-Trail Governance (Lane Ops Bridge)
+
+Deliverables:
+
+- add file-backed notification-threshold override routes for operator governance updates.
+- expose bounded audit-trail review for override/reset actions with previous/next threshold snapshots.
+- keep notification-policy and notification-slo surfaces hydrated from persisted override state instead of implicit constant-only thresholds.
+
+#### M7.22 Progress Note (2026-04-16)
+
+- [Done] expanded `src/server.ts` with notification-threshold governance routes:
+  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds`,
+  - `POST /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds`,
+  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds/audit?limit=...`.
+- [Done] added file-backed override + audit artifacts:
+  - `runtime_data/agent_workspace_diagnostics/triage_remediation_escalation_notification_thresholds.v1.json`,
+  - `runtime_data/agent_workspace_diagnostics/triage_remediation_escalation_notification_thresholds_audit.v1.json`.
+- [Done] notification governance now hydrates persisted overrides in current operator surfaces:
+  - `/triage/remediation/escalation/notification-policy` returns override-backed `anomalyThresholdPolicy`,
+  - `/triage/remediation/escalation/notification-slo` evaluates breach thresholds from persisted overrides.
+- [Done] override audit entries now capture:
+  - `previousThresholds`,
+  - `nextThresholds`,
+  - `resetToDefault`,
+  - `source` / `reason`,
+  - delta summaries for suppression count, throttled digest count, and suppressed-to-emitted ratio.
+- [Done] expanded evidence coverage:
+  - `src/server.migration.test.ts` now validates override POST/GET/reset semantics, SLO effect under overridden thresholds, audit route payloads, and persisted file contents.
+  - `src/knowledge.api.contract.test.ts`, `src/agent_workspace.verification.contract.test.ts`, and `scripts/verify-agent-workspace-runtime.js` now fail fast on notification-threshold route and helper drift.
+- [Done] verification evidence:
+  - `npm test -- src/server.migration.test.ts --runInBand --testNamePattern "escalation notification threshold overrides and audit-trail governance stay deterministic"`
+  - `npm run test:agent-workspace:contracts`
+  - `npm run verify:agent-workspace:runtime`
+
 ## Success Criteria
 
 - CI failure mode that previously blocked the three agent-workspace suites is eliminated on mainline.
@@ -682,4 +716,4 @@ Deliverables:
 
 ## Next Step
 
-Proceed to `/prompts:ce-plan` using this document as the source for `M7.22` decomposition (notification threshold overrides and audit-trail governance), while preserving M7 lane boundary constraints.
+Proceed to `/prompts:ce-plan` using this document as the source for `M7.23` decomposition (notification-threshold rollback preview and drift-diff governance), while preserving M7 lane boundary constraints.
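
The audit-entry fields listed in the M7.22 progress note can be pictured with a small sketch. This is not the code in `src/server.ts`: the threshold field names (`suppressionCount`, `throttledDigestCount`, `suppressedToEmittedRatio`), the `delta` property, and the `buildThresholdAuditEntry` helper are assumptions made up for illustration. Only `previousThresholds`, `nextThresholds`, `resetToDefault`, `source`, and `reason` come from the note.

```typescript
// Hypothetical threshold shape; the three field names are assumptions.
interface NotificationThresholds {
  suppressionCount: number;
  throttledDigestCount: number;
  suppressedToEmittedRatio: number;
}

// Audit entry mirroring the fields named in the progress note.
// `delta` is an assumed name for the per-field delta summary.
interface ThresholdAuditEntry {
  previousThresholds: NotificationThresholds;
  nextThresholds: NotificationThresholds;
  resetToDefault: boolean;
  source: string;
  reason: string;
  delta: NotificationThresholds;
}

// Hypothetical helper: snapshots both threshold sets and records next minus previous.
function buildThresholdAuditEntry(
  previous: NotificationThresholds,
  next: NotificationThresholds,
  meta: { source: string; reason: string; resetToDefault: boolean },
): ThresholdAuditEntry {
  return {
    // Copy the objects so later mutation of the inputs cannot alter the audit record.
    previousThresholds: { ...previous },
    nextThresholds: { ...next },
    resetToDefault: meta.resetToDefault,
    source: meta.source,
    reason: meta.reason,
    delta: {
      suppressionCount: next.suppressionCount - previous.suppressionCount,
      throttledDigestCount: next.throttledDigestCount - previous.throttledDigestCount,
      suppressedToEmittedRatio:
        next.suppressedToEmittedRatio - previous.suppressedToEmittedRatio,
    },
  };
}
```

Storing full previous/next snapshots alongside the deltas keeps each entry self-describing, which is presumably what a later rollback preview (the M7.23 direction) would need.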

docs/diataxis/en/explanation/development-progress-dashboard.md

Lines changed: 21 additions & 14 deletions
@@ -599,22 +599,29 @@ Execution anchor:
   - `npm run test:agent-workspace:contracts`
   - `npm run verify:agent-workspace:runtime`
 
-## Latest Mainline Increment (2026-04-16 M7.21 Notification Escalation SLOs and Anomaly-Threshold Governance Lane)
-
-- Expanded `src/server.ts` with notification SLO route:
-  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-slo?limit=...`.
-- Expanded notification governance payload:
-  - `/triage/remediation/escalation/notification-policy` now includes explicit `anomalyThresholdPolicy`.
-- Added deterministic notification SLO synthesis:
-  - suppression-count warning threshold,
-  - throttled-digest warning threshold,
-  - suppressed-to-emitted ratio warning threshold.
+## Latest Mainline Increment (2026-04-16 M7.22 Notification Threshold Overrides and Audit-Trail Governance Lane)
+
+- Expanded `src/server.ts` with notification-threshold governance routes:
+  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds`,
+  - `POST /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds`,
+  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds/audit?limit=...`.
+- Added file-backed override governance artifacts:
+  - `runtime_data/agent_workspace_diagnostics/triage_remediation_escalation_notification_thresholds.v1.json`,
+  - `runtime_data/agent_workspace_diagnostics/triage_remediation_escalation_notification_thresholds_audit.v1.json`.
+- Hardened notification governance hydration:
+  - `/triage/remediation/escalation/notification-policy` now returns override-backed `anomalyThresholdPolicy`,
+  - `/triage/remediation/escalation/notification-slo` now evaluates breach thresholds from persisted overrides.
+- Added bounded audit-trail semantics:
+  - previous/next threshold snapshots,
+  - reset-to-default flag,
+  - operator `source` / `reason`,
+  - per-field delta summaries for suppression count, throttled digest count, and suppressed-to-emitted ratio.
 - Expanded executable evidence:
-  - `src/server.migration.test.ts` now validates anomaly-threshold policy payload and notification SLO route semantics.
+  - `src/server.migration.test.ts` now validates override POST/GET/reset semantics, SLO behavior under override, audit-route payloads, and persisted file contents.
 - Hardened runtime verification gate:
-  - `src/knowledge.api.contract.test.ts`, `src/agent_workspace.verification.contract.test.ts`, and `scripts/verify-agent-workspace-runtime.js` now fail fast on notification-slo route and anomaly-threshold/SLO helper drift.
+  - `src/knowledge.api.contract.test.ts`, `src/agent_workspace.verification.contract.test.ts`, and `scripts/verify-agent-workspace-runtime.js` now fail fast on notification-threshold route and helper drift.
 - Verification evidence:
-  - `npm test -- src/server.migration.test.ts --runInBand --testNamePattern \"escalation notification SLOs and anomaly-threshold governance stay deterministic\"`
+  - `npm test -- src/server.migration.test.ts --runInBand --testNamePattern \"escalation notification threshold overrides and audit-trail governance stay deterministic\"`
   - `npm run test:agent-workspace:contracts`
   - `npm run verify:agent-workspace:runtime`
 
@@ -666,7 +673,7 @@ This dashboard aligns against the following requirement chain:
 | L2 Retrieval | explainable hybrid/vector retrieval + governance | Expanded in branch-oriented plans | Mainline file-backed baseline only (`src/learning/store.ts`) | Re-enter lane after concrete module evidence lands on mainline |
 | L3 Learning | mastery diagnostics + path/session loop | Expanded in branch | Partially integrated | Contract and integration parity |
 | L4 Interaction | agent conversation + focus/path pane runtime | Implemented in branch | M1-M4 baseline integrated on mainline | Expand capability surface via typed contract only |
-| L5 Governance | runbook, diagnostics, replay/autonomy controls | Expanded in branch | Operator diagnostics persistence/triage/history/threshold governance + runbook automation/audit + adaptive simulation/remediation + remediation backtest/approval-gate + approval-policy hardening/regression-alarms + approval-policy drift/escalation + escalation acknowledgement lifecycle/audit + escalation SLA/reminder baseline + notification digest/suppression baseline + delivery-log observability + stale-cleanup health auditing + anomaly/retention governance + notification SLO governance integrated | M7.22: notification threshold overrides and audit-trail governance |
+| L5 Governance | runbook, diagnostics, replay/autonomy controls | Expanded in branch | Operator diagnostics persistence/triage/history/threshold governance + runbook automation/audit + adaptive simulation/remediation + remediation backtest/approval-gate + approval-policy hardening/regression-alarms + approval-policy drift/escalation + escalation acknowledgement lifecycle/audit + escalation SLA/reminder baseline + notification digest/suppression baseline + delivery-log observability + stale-cleanup health auditing + anomaly/retention governance + notification SLO governance + notification-threshold override/audit governance integrated | M7.23: notification-threshold rollback preview and drift-diff governance |
 
 ## Verification Baseline
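
The dashboard entry above says the notification-policy and notification-slo surfaces now read persisted overrides instead of constant-only thresholds. A minimal sketch of that idea follows, under loud assumptions: the field names, the placeholder default values, the `>=` comparison, and the zero-emitted handling are all invented for illustration and are not taken from `src/server.ts`.

```typescript
// Hypothetical threshold shape; field names are assumptions.
interface NotificationThresholds {
  suppressionCount: number;
  throttledDigestCount: number;
  suppressedToEmittedRatio: number;
}

// Hypothetical counters an SLO report might be computed from.
interface NotificationStats {
  suppressedCount: number;
  throttledDigestCount: number;
  emittedCount: number;
}

// Placeholder defaults, not the project's real constants.
const DEFAULT_THRESHOLDS: NotificationThresholds = {
  suppressionCount: 10,
  throttledDigestCount: 5,
  suppressedToEmittedRatio: 0.5,
};

// Overlay a persisted (possibly partial or absent) override on the defaults.
function resolveThresholds(
  defaults: NotificationThresholds,
  override: Partial<NotificationThresholds> | null,
): NotificationThresholds {
  return {
    suppressionCount: override?.suppressionCount ?? defaults.suppressionCount,
    throttledDigestCount: override?.throttledDigestCount ?? defaults.throttledDigestCount,
    suppressedToEmittedRatio:
      override?.suppressedToEmittedRatio ?? defaults.suppressedToEmittedRatio,
  };
}

// Return the names of breached thresholds. Treating "at or above" as a breach
// and a zero emitted count as ratio 0 are assumptions of this sketch.
function evaluateSloBreaches(
  stats: NotificationStats,
  thresholds: NotificationThresholds,
): string[] {
  const breaches: string[] = [];
  if (stats.suppressedCount >= thresholds.suppressionCount) {
    breaches.push('suppressionCount');
  }
  if (stats.throttledDigestCount >= thresholds.throttledDigestCount) {
    breaches.push('throttledDigestCount');
  }
  const ratio = stats.emittedCount > 0 ? stats.suppressedCount / stats.emittedCount : 0;
  if (ratio >= thresholds.suppressedToEmittedRatio) {
    breaches.push('suppressedToEmittedRatio');
  }
  return breaches;
}
```

With identical stats, resolving against a `null` override and against a stricter override can yield different breach lists, which is the "SLO behavior under override" that the migration test is described as checking.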

docs/diataxis/zh/explanation/development-progress-dashboard.md

Lines changed: 21 additions & 14 deletions
@@ -601,22 +601,29 @@
   - `npm run test:agent-workspace:contracts`
   - `npm run verify:agent-workspace:runtime`
 
-## Latest Mainline Increment (2026-04-16 M7.21 Notification Escalation SLO and Anomaly-Threshold Governance Lane)
-
-- Added the notification SLO route to `src/server.ts`:
-  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-slo?limit=...`
-- Extended notification governance output:
-  - `/triage/remediation/escalation/notification-policy` now outputs an explicit `anomalyThresholdPolicy`
-- Added deterministic notification SLO synthesis:
-  - suppressed count warning threshold,
-  - throttled digest warning threshold,
-  - suppressed-to-emitted ratio warning threshold.
+## Latest Mainline Increment (2026-04-16 M7.22 Notification Threshold Override and Audit-Trail Governance Lane)
+
+- Added notification-threshold governance routes to `src/server.ts`:
+  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds`
+  - `POST /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds`
+  - `GET /api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds/audit?limit=...`
+- Added file-backed override governance artifacts:
+  - `runtime_data/agent_workspace_diagnostics/triage_remediation_escalation_notification_thresholds.v1.json`
+  - `runtime_data/agent_workspace_diagnostics/triage_remediation_escalation_notification_thresholds_audit.v1.json`
+- Hardened override-state reads in the notification governance lane:
+  - `/triage/remediation/escalation/notification-policy` now returns an `anomalyThresholdPolicy` backed by persisted overrides
+  - `/triage/remediation/escalation/notification-slo` now computes breaches from persisted override thresholds.
+- Added bounded audit-trail semantics:
+  - `previousThresholds` / `nextThresholds` snapshots,
+  - `resetToDefault` flag,
+  - operator `source` / `reason`
+  - per-field delta summaries for suppression count, throttled digest count, and suppressed-to-emitted ratio.
 - Added executable evidence:
-  - `src/server.migration.test.ts` adds assertions for the anomaly-threshold policy payload and notification SLO route semantics
+  - `src/server.migration.test.ts` adds assertions for override POST/GET/reset, SLO behavior under override, audit route semantics, and persisted file contents
 - Hardened the runtime gate:
-  - `src/knowledge.api.contract.test.ts`, `src/agent_workspace.verification.contract.test.ts`, and `scripts/verify-agent-workspace-runtime.js` add fail-fast assertions for the notification-slo route and anomaly-threshold/SLO helpers.
+  - `src/knowledge.api.contract.test.ts`, `src/agent_workspace.verification.contract.test.ts`, and `scripts/verify-agent-workspace-runtime.js` add fail-fast assertions for the notification-threshold route and helpers.
 - Verification evidence:
-  - `npm test -- src/server.migration.test.ts --runInBand --testNamePattern \"escalation notification SLOs and anomaly-threshold governance stay deterministic\"`
+  - `npm test -- src/server.migration.test.ts --runInBand --testNamePattern \"escalation notification threshold overrides and audit-trail governance stay deterministic\"`
   - `npm run test:agent-workspace:contracts`
   - `npm run verify:agent-workspace:runtime`
 
@@ -668,7 +675,7 @@
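 | L2 Retrieval | explainable hybrid/vector retrieval + governance | Being expanded in branch planning | Mainline currently has only the file-backed baseline (`src/learning/store.ts`) | Converge once corresponding module evidence lands on mainline |
 | L3 Learning | mastery diagnostics + path/session loop | Being expanded in branch | Partially integrated on mainline | Contract and integration consistency |
 | L4 Interaction | agent conversation + focus/path pane runtime | Implemented in branch | M1-M4 landed in the mainline baseline | Keep expanding the action surface through typed contracts |
-| L5 Governance | runbook / diagnostics / replay and automation | Being expanded in branch | Mainline has integrated operator diagnostics persistence/triage/trend history/threshold governance + runbook automation/threshold audit + adaptive simulation/auto-remediation + backtest/approval gate + approval-policy hardening/regression alarms + approval-policy drift/escalation + escalation acknowledgement lifecycle/audit + escalation SLA/reminder baseline + notification digest/suppression baseline + delivery-log observability + stale-notification health auditing + anomaly/retention governance + notification SLO governance | M7.22: notification threshold overrides and audit-trail governance |
+| L5 Governance | runbook / diagnostics / replay and automation | Being expanded in branch | Mainline has integrated operator diagnostics persistence/triage/trend history/threshold governance + runbook automation/threshold audit + adaptive simulation/auto-remediation + backtest/approval gate + approval-policy hardening/regression alarms + approval-policy drift/escalation + escalation acknowledgement lifecycle/audit + escalation SLA/reminder baseline + notification digest/suppression baseline + delivery-log observability + stale-notification health auditing + anomaly/retention governance + notification SLO governance + notification-threshold override/audit governance | M7.23: notification-threshold rollback preview and drift-diff governance |
 
 ## Verification Baseline
 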
scripts/verify-agent-workspace-runtime.js

Lines changed: 24 additions & 0 deletions
@@ -156,6 +156,14 @@ function verifyAgentWorkspaceRuntime(repoRoot = path.resolve(__dirname, '..')) {
     serverSource.includes('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/reminders'),
     'Missing diagnostics remediation escalation reminders route in src/server.ts'
   );
+  assert(
+    serverSource.includes('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds'),
+    'Missing diagnostics remediation escalation notification-threshold route in src/server.ts'
+  );
+  assert(
+    serverSource.includes('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds/audit'),
+    'Missing diagnostics remediation escalation notification-threshold audit route in src/server.ts'
+  );
   assert(
     serverSource.includes('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-policy'),
     'Missing diagnostics remediation escalation notification policy route in src/server.ts'
@@ -316,6 +324,22 @@ function verifyAgentWorkspaceRuntime(repoRoot = path.resolve(__dirname, '..')) {
     serverSource.includes('getAgentWorkspaceDiagnosticsRemediationEscalationNotificationAnomalyThresholdPolicy'),
     'Missing remediation escalation notification anomaly threshold helper in src/server.ts'
   );
+  assert(
+    serverSource.includes('readAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdPolicy'),
+    'Missing remediation escalation notification threshold reader in src/server.ts'
+  );
+  assert(
+    serverSource.includes('persistAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdPolicy'),
+    'Missing remediation escalation notification threshold writer in src/server.ts'
+  );
+  assert(
+    serverSource.includes('readAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdAuditTrail'),
+    'Missing remediation escalation notification threshold audit reader in src/server.ts'
+  );
+  assert(
+    serverSource.includes('appendAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdAuditEntry'),
+    'Missing remediation escalation notification threshold audit writer in src/server.ts'
+  );
   assert(
     serverSource.includes('buildAgentWorkspaceDiagnosticsRemediationEscalationNotificationAnomalyReport'),
     'Missing remediation escalation notification anomaly report helper in src/server.ts'
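
One property of the substring checks in this file is worth noting: the `.../notification-thresholds` path is a prefix of `.../notification-thresholds/audit`, so the plain `includes` check for the base route also passes when only the audit route is present. If stricter drift detection were wanted, one option is to require a closing quote directly after the path. The sketch below assumes routes appear in the source as quoted string literals, which this diff does not confirm; `sourceDeclaresRoute` is a hypothetical helper, not part of the script.

```typescript
// Hypothetical stricter check: the route counts as declared only when a
// closing quote character follows the path directly, so a longer route that
// merely starts with the same path does not satisfy it.
function sourceDeclaresRoute(source: string, route: string): boolean {
  const closers = ["'", '"', '`'];
  return closers.some((quote) => source.includes(route + quote));
}

// Shortened, made-up route strings used only to demonstrate the difference.
const exampleSource = "app.get('/api/x/notification-thresholds/audit', handler);";
const baseRoute = '/api/x/notification-thresholds';

// The plain substring check is satisfied by the audit route alone.
const plainMatch = exampleSource.includes(baseRoute);
// The stricter check is not.
const strictMatch = sourceDeclaresRoute(exampleSource, baseRoute);
```

The trade-off is that a quote-terminated match would miss routes built by concatenation or template interpolation, so it only suits sources that declare paths as plain literals.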

src/agent_workspace.verification.contract.test.ts

Lines changed: 6 additions & 0 deletions
@@ -62,6 +62,8 @@ describe('agent workspace verification script contracts', () => {
     expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/audit');
     expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/sla');
     expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/reminders');
+    expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds');
+    expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds/audit');
     expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-policy');
     expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notifications');
     expect(runtimeSource).toContain('/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-health');
@@ -101,6 +103,10 @@ describe('agent workspace verification script contracts', () => {
     expect(runtimeSource).toContain('getAgentWorkspaceDiagnosticsRemediationEscalationNotificationRetentionPolicy');
     expect(runtimeSource).toContain('buildAgentWorkspaceDiagnosticsRemediationEscalationNotificationAnomalyReport');
     expect(runtimeSource).toContain('getAgentWorkspaceDiagnosticsRemediationEscalationNotificationAnomalyThresholdPolicy');
+    expect(runtimeSource).toContain('readAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdPolicy');
+    expect(runtimeSource).toContain('persistAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdPolicy');
+    expect(runtimeSource).toContain('readAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdAuditTrail');
+    expect(runtimeSource).toContain('appendAgentWorkspaceDiagnosticsRemediationEscalationNotificationThresholdAuditEntry');
     expect(runtimeSource).toContain('buildAgentWorkspaceDiagnosticsRemediationEscalationNotificationSloReport');
     expect(runtimeSource).toContain('applyAgentWorkspaceDiagnosticsRemediationEscalationReminderSuppressionPolicy');
     expect(runtimeSource).toContain('buildAgentWorkspaceDiagnosticsRemediationEscalationGovernanceContext');
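
The contract above pins an audit-trail reader and an audit-entry appender by name, and the commit describes the trail as bounded with a `?limit=...` query on the audit route. What "bounded" could mean in practice is sketched below. The helper names, the idea of a maximum entry count, the fallback limit, and the newest-first ordering are all assumptions of this sketch; the real helpers in `src/server.ts` may behave differently.

```typescript
// Hypothetical: append an entry and keep only the newest `maxEntries` items.
function appendBounded<T>(trail: readonly T[], entry: T, maxEntries: number): T[] {
  const next = [...trail, entry];
  return next.length > maxEntries ? next.slice(next.length - maxEntries) : next;
}

// Hypothetical: parse a `limit` query value, falling back when it is missing
// or invalid and clamping it to an upper bound.
function parseLimit(raw: string | undefined, fallback: number, max: number): number {
  const parsed = Number.parseInt(raw ?? '', 10);
  if (!Number.isFinite(parsed) || parsed < 1) {
    return fallback;
  }
  return Math.min(parsed, max);
}

// Hypothetical: return the latest `limit` entries, newest first.
function readLatest<T>(trail: readonly T[], limit: number): T[] {
  return trail.slice(-limit).reverse();
}
```

Bounding on both write and read keeps the JSON artifact from growing without limit and keeps route payloads predictable, at the cost of dropping the oldest history.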

src/knowledge.api.contract.test.ts

Lines changed: 2 additions & 0 deletions
@@ -26,6 +26,8 @@ describe('Knowledge mastery API contract wiring', () => {
       '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/audit',
       '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/sla',
       '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/reminders',
+      '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds',
+      '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-thresholds/audit',
       '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-policy',
       '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notifications',
       '/api/knowledge/operator/agent-workspace-diagnostics/triage/remediation/escalation/notification-health',
