Commit f44e0db

SecAI-Hub and claude committed
Implement M43: Stronger isolation, adversarial tests, MCP isolation, recovery ceremonies, M5 acceptance suite
Per-service sandbox tightening (device cgroups, resource limits) on agent, mcp-firewall, inference, and search-mediator systemd units.

Agent execution compartmentalization with step signatures, subprocess isolation, per-step capability re-validation, and workspace hard walls (sandbox.py).

Formal adversarial test suite: 28 Python tests covering prompt injection, policy bypass, step signature tampering, containment determinism, GPU runtime tamper, and blocked paths. MCP firewall adversarial tests (null bytes, shell metachar, path traversal, oversized payloads, dynamic registration denial, taint bypass). Policy-engine adversarial tests (egress spoofing, approval spoofing, taint propagation).

CI security regression gate: dedicated security-regression job in ci.yml running all adversarial suites on every push/PR.

Negative-path operational controls: recovery ceremony (ack + re-attestation), latched degraded states for critical incident classes, severity escalation with configurable time windows, HMAC-signed forensic bundle export.

MCP-specific isolation: trust tier enforcement (trusted/verified/untrusted), per-tool isolation profiles, HMAC-bound session binding with expiry and call limits, dynamic tool registration denial.

M5 control matrix (26 controls mapped to enforcing component, failure mode, test coverage, and audit evidence), supply chain provenance documentation, and explicit M5 acceptance suite (32 tests covering all criteria).

Go: 399 tests. Python: 60 tests (adversarial + M5 acceptance).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 4df8ba4 commit f44e0db

18 files changed

Lines changed: 3301 additions & 9 deletions

.github/workflows/ci.yml

Lines changed: 35 additions & 0 deletions
@@ -74,6 +74,7 @@ jobs:
           python -m py_compile services/agent/agent/executor.py
           python -m py_compile services/agent/agent/storage.py
           python -m py_compile services/agent/agent/capabilities.py
+          python -m py_compile services/agent/agent/sandbox.py

       - name: Test
         run: python -m pytest tests/ -v
@@ -199,3 +200,37 @@ jobs:
           done

           echo "=== Supply chain verification passed ==="
+
+  security-regression:
+    name: Security Regression Tests
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+
+      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
+        with:
+          python-version: "3.12"
+
+      - uses: actions/setup-go@d35c59abb061a4a6fb18e82ac0862c26744d6ab5 # v5.5.0
+        with:
+          go-version: "1.23"
+
+      - name: Install Python dependencies
+        run: pip install pyyaml flask requests pytest
+
+      - name: Run adversarial Python tests
+        run: python -m pytest tests/test_adversarial.py -v --tb=short
+
+      - name: Run MCP firewall adversarial tests
+        working-directory: services/mcp-firewall
+        run: go test -v -race -run TestAdversarial ./...
+
+      - name: Run policy-engine adversarial tests
+        working-directory: services/policy-engine
+        run: go test -v -race -run TestAdversarial ./...
+
+      - name: Run incident-recorder recovery tests
+        working-directory: services/incident-recorder
+        run: go test -v -race -run "TestRecovery|TestEscalation|TestForensic|TestLatched" ./...

README.md

Lines changed: 6 additions & 3 deletions
@@ -150,7 +150,7 @@ Every model passes through the same fully automatic pipeline:
 | **Updates** | Cosign-verified rpm-ostree, staged workflow, greenboot auto-rollback |
 | **Supply Chain** | Per-service CycloneDX SBOMs, SLSA3 provenance attestation, cosign-signed checksums |

-See [docs/threat-model.md](docs/threat-model.md) for threat classes, residual risks, and security invariants. See [docs/security-status.md](docs/security-status.md) for implementation status of all 42 milestones.
+See [docs/threat-model.md](docs/threat-model.md) for threat classes, residual risks, and security invariants. See [docs/security-status.md](docs/security-status.md) for implementation status of all 43 milestones.

 ### Verify Image Signatures

@@ -200,7 +200,7 @@ See [docs/policy-schema.md](docs/policy-schema.md) for full schema reference. Se
 | [Threat Model](docs/threat-model.md) | Threat classes, invariants, residual risks |
 | [API Reference](docs/api.md) | HTTP API for all services |
 | [Policy Schema](docs/policy-schema.md) | Full policy.yaml schema reference |
-| [Security Status](docs/security-status.md) | Implementation status of all 42 milestones |
+| [Security Status](docs/security-status.md) | Implementation status of all 43 milestones |
 | [Test Matrix](docs/test-matrix.md) | Test coverage: 1000+ tests across Go, Python, shell |
 | [Compatibility Matrix](docs/compatibility-matrix.md) | GPU, VM, and hardware support |
 | [Security Test Matrix](docs/security-test-matrix.md) | Security feature test coverage |
@@ -224,6 +224,8 @@ See [docs/policy-schema.md](docs/policy-schema.md) for full schema reference. Se
 | [Runtime Attestor](docs/components/runtime-attestor.md) | TPM2 attestation and startup gating |
 | [Integrity Monitor](docs/components/integrity-monitor.md) | Continuous file integrity verification |
 | [Incident Recorder](docs/components/incident-recorder.md) | Security event capture and auto-containment |
+| [M5 Control Matrix](docs/m5-control-matrix.md) | M5 acceptance criteria, enforcement paths, operator verification |
+| [Supply Chain Provenance](docs/supply-chain-provenance.md) | Provenance pipeline, SBOM coverage, key material |

 ### Install Guides

@@ -327,7 +329,7 @@ See [docs/test-matrix.md](docs/test-matrix.md) for full breakdown.
 ## Roadmap

 <details>
-<summary>All 42 milestones (click to expand)</summary>
+<summary>All 43 milestones (click to expand)</summary>

 - [x] **M0** -- Threat model, dataflow, invariants, policy files
 - [x] **M1** -- Bootable OS, encrypted vault, GPU drivers
@@ -372,6 +374,7 @@ See [docs/test-matrix.md](docs/test-matrix.md) for full breakdown.
 - [x] **M40** -- Agent verified supervisor hardening (signed tokens, replay protection, two-phase approval)
 - [x] **M41** -- HSM-backed key handling (pluggable keystore: software/TPM2/PKCS#11)
 - [x] **M42** -- Enforcement wiring + CI supply chain verification
+- [x] **M43** -- Stronger isolation: sandbox tightening, adversarial tests, CI security regression, MCP isolation, recovery ceremonies, M5 acceptance suite

 </details>


docs/m5-control-matrix.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
# M5 Control Matrix — Stronger Isolation Acceptance Criteria

This matrix maps each M5 security control to its enforcing component, failure mode, test coverage, and audit evidence. A reviewer or operator can use this matrix to verify that every claimed control is actually implemented, tested, and observable.

Last updated: 2026-03-14

## Control Matrix

| # | Control | Enforcing Component | Failure Mode | Test Covering It | Audit Evidence |
|---|---------|-------------------|--------------|-----------------|----------------|
| 1 | Startup gating via TPM2 attestation | Runtime Attestor (:8505) | Service refuses to start; reports `attestation_failure` to Incident Recorder | `TestAttest_BadTPMState`, `TestChain_AttestationFailure_ContainmentDispatched` | `incident-recorder-audit.jsonl` entry class=attestation_failure |
| 2 | Continuous file integrity monitoring | Integrity Monitor (:8510) | State transitions to `degraded`; reports violations to Incident Recorder | `TestScan_BaselineMismatch`, `TestChain_IntegrityViolation_FreezeAndDisable` | Baseline scan results + incident report with file paths/hashes |
| 3 | Auto-containment on integrity violation | Incident Recorder (:8515) | freeze_agent + disable_airlock + force_vault_relock dispatched | `TestChain_IntegrityViolation_FreezeAndDisable`, `TestExecuteContainment_FreezeAgent` | Containment dispatch logs + target service acknowledgment |
| 4 | Auto-containment on attestation failure | Incident Recorder (:8515) | freeze_agent + disable_airlock + force_vault_relock dispatched | `TestChain_AttestationFailure_ContainmentDispatched` | Incident record with state=contained |
| 5 | Model quarantine on manifest mismatch | Incident Recorder (:8515) → Registry (:8470) | quarantine_model + freeze_agent dispatched | `TestChain_ManifestMismatch_QuarantinesModel` | POST to /api/v1/quarantine with model_path |
| 6 | GPU runtime integrity verification | GPU Integrity Watch (:8495) | Warning/critical verdict triggers incident report | `TestProbe_DriverFingerprint`, `TestChain_GPUAnomaly_IncidentAndQuarantine` | GPU probe results + incident class=model_behavior_anomaly |
| 7 | Centralised policy decisions (6 domains) | Policy Engine (:8500) | Allow/deny with structured evidence | `TestDecide_ToolAccess_*`, `TestDecide_AgentRisk_*` (37 tests) | PolicyDecision JSON with decision, reason, evidence |
| 8 | Deny-by-default tool firewall | Tool Firewall (:8475) | Unknown tools denied | `TestEvaluate_*` (10 tests) | Audit log with tool name + decision |
| 9 | Deny-by-default MCP firewall | MCP Firewall (:8496) | Unknown servers/tools denied; taint propagation; input redaction | `TestEvaluate_*`, `TestAdversarial_*` (44+ tests) | Hash-chained audit log + signed decision receipts |
| 10 | HMAC-signed capability tokens | Agent (:8476) capabilities.py | Token verification: expiry, nonce replay, HMAC signature | `TestTokenSigning`, `test_stale_capability_token_rejected`, `test_replayed_capability_token_rejected` | Token ID in agent-audit.jsonl per step |
| 11 | Two-phase approval for high-risk actions | Agent (:8476) policy.py | TRUST_CHANGE, EXPORT_DATA, WIDEN_SCOPE etc. always escalated to "ask" | `test_two_phase_actions_require_approval` | PolicyDecision with decision=ask for TWO_PHASE_ACTIONS |
| 12 | Step signature validation | Agent sandbox.py | Step modified between planning and execution is rejected | `test_signed_step_verifies`, `test_tampered_step_fails_verification` | Step signature in audit trail |
| 13 | Per-step capability re-validation | Agent sandbox.py | Path/tool/scope mutations caught at execution time | `test_path_mutation_caught_at_execution`, `test_tool_mutation_caught_at_execution` | Re-validation check in executor log |
| 14 | Workspace hard walls | Agent sandbox.py WorkspaceGuard | Symlink escape, cross-workspace FD reuse, hardlink tricks detected | `test_symlink_traversal_blocked`, `test_workspace_id_spoofing_blocked` | Workspace violation log entry |
| 15 | Storage gateway blocked paths | Agent storage.py | /etc/shadow, /etc/passwd, policy files, service tokens always blocked | `test_shadow_file_blocked`, `test_service_token_blocked` | Storage gateway deny in audit log |
| 16 | Sensitivity ceiling enforcement | Agent policy.py + storage.py | Files exceeding sensitivity ceiling are blocked | `TestSensitivity_*` | Sensitivity classification in read result |
| 17 | Recovery ceremony after containment | Incident Recorder recovery.go | Require ack + re-attestation before returning to trusted mode | `TestRecovery_CriticalRequiresReattestation` | Recovery requirement record with ack/reattest timestamps |
| 18 | Latched degraded states | Incident Recorder recovery.go | attestation_failure, integrity_violation, unauthorized_access, manifest_mismatch remain latched | `TestLatchedClasses` | Incident state remains until manual review |
| 19 | Severity escalation | Incident Recorder recovery.go | Repeated medium-severity events escalate per rules | `TestEscalation_RepeatedPromptInjection` | Escalated severity in incident record |
| 20 | Forensic bundle export | Incident Recorder recovery.go | Signed export of incidents, audit, state, policy digest | `TestForensicBundle_ExportAndVerify`, `TestForensicBundle_TamperDetection` | Forensic bundle JSON with HMAC signature |
| 21 | Service token propagation | Incident Recorder containment.go | Bearer token included in all containment HTTP calls | `TestChain_BearerToken_PropagatedToContainment` | Authorization header in containment requests |
| 22 | HSM/TPM2 key management | Agent keystore.py | Software/TPM2/PKCS#11 backends with auto-detection | `TestKeystore_*` (31 tests) | Keystore provider name in agent startup log |
| 23 | Prompt injection detection | MCP Firewall global rules | Shell metacharacters and prompt patterns detected and denied | `TestAdversarial_MalformedMCPPayload` | Global rule match in audit log |
| 24 | MCP taint tracking | MCP Firewall taint.go | Session-scoped taint propagation prevents data flow violations | `TestAdversarial_TaintBypassAttempt`, `TestTaint_*` | Taint entries per session ID |
| 25 | SBOM generation verification | CI supply-chain-verify job | Syft generates SBOMs for all services | CI workflow step output | CycloneDX SBOM artifacts |
| 26 | Release provenance attestation | Release workflow (release.yml) | cosign attest with SLSA3 provenance | CI workflow attestation step | Signed provenance attestation |
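
Control 14 (workspace hard walls) relies on fully resolving a path before comparing it against the workspace root. The following is a minimal sketch of that idea only: `is_inside_workspace` and `demo` are hypothetical names, not the `WorkspaceGuard` API in sandbox.py, and the sketch covers symlink and `..` traversal but not the hardlink or FD-reuse checks listed in the matrix.

```python
import os
import tempfile


def is_inside_workspace(workspace_root: str, candidate: str) -> bool:
    """Return True only if candidate resolves to a path under workspace_root.

    Hypothetical helper for illustration; not the project's WorkspaceGuard.
    """
    # realpath follows symlinks and collapses "..", so an escape through
    # either route shows up as a resolved path outside the root.
    root = os.path.realpath(workspace_root)
    target = os.path.realpath(os.path.join(root, candidate))
    return os.path.commonpath([root, target]) == root


def demo() -> dict:
    """Build a throwaway workspace and probe it with benign and hostile paths."""
    with tempfile.TemporaryDirectory() as base:
        workspace = os.path.join(base, "ws")
        outside = os.path.join(base, "outside")
        os.makedirs(workspace)
        os.makedirs(outside)
        # A symlink that lives inside the workspace but points outside it.
        os.symlink(outside, os.path.join(workspace, "escape"))
        return {
            "plain": is_inside_workspace(workspace, "notes.txt"),
            "dotdot": is_inside_workspace(workspace, "../outside/secret"),
            "symlink": is_inside_workspace(workspace, "escape/secret"),
        }
```

Comparing with `os.path.commonpath` rather than a string prefix avoids treating `/ws-other` as if it were inside `/ws`.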

## End-to-End Enforcement Paths

### Path 1: Bad Attestation → Service Startup Blocked
```
Runtime Attestor detects TPM2 quote mismatch
→ State transitions to "failed"
→ POST to Incident Recorder: class=attestation_failure, severity=critical
→ Incident Recorder creates incident with auto-containment
→ Containment: freeze_agent + disable_airlock + force_vault_relock
→ Recovery: requires operator ack + re-attestation ceremony
```
**Test:** `TestChain_AttestationFailure_ContainmentDispatched`
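
The final step of Path 1, recovery gated on operator ack plus re-attestation, can be modelled as a small state record. This is a sketch under stated assumptions: `RecoveryRequirement` and its field names are hypothetical and do not reflect the Go types in recovery.go.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryRequirement:
    """Tracks what must happen before an incident may leave containment.

    Hypothetical model for illustration; field names are assumptions.
    """

    incident_id: str
    requires_reattestation: bool
    acked_at: Optional[float] = None
    reattested_at: Optional[float] = None

    def acknowledge(self, now: float) -> None:
        """Record the operator acknowledgment timestamp."""
        self.acked_at = now

    def record_reattestation(self, now: float) -> None:
        """Record the timestamp of a successful re-attestation."""
        self.reattested_at = now

    def is_complete(self) -> bool:
        """True once every required ceremony step has a timestamp."""
        if self.acked_at is None:
            return False
        if self.requires_reattestation and self.reattested_at is None:
            return False
        return True
```

Keeping both timestamps on the record matches the audit evidence named for control 17 (ack/reattest timestamps).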

### Path 2: Baseline Mismatch → Degraded → Incident → Containment
```
Integrity Monitor detects file hash mismatch
→ State transitions to "degraded"
→ POST to Incident Recorder: class=integrity_violation, severity=high
→ Incident Recorder creates incident with auto-containment
→ Containment: freeze_agent + disable_airlock + force_vault_relock
→ State latched until manual review
```
**Test:** `TestChain_IntegrityViolation_FreezeAndDisable`

### Path 3: High-Risk Agent Action → Two-Phase Approval
```
Agent planner proposes TRUST_CHANGE step
→ Policy engine evaluate_with_evidence: decision="ask"
→ Step remains PENDING until user approves via /v1/task/<id>/approve
→ On approval: token re-verified, step signature re-checked
→ Executor re-validates capability before execution
```
**Test:** `test_two_phase_actions_require_approval`
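
The "step signature re-checked" stage of Path 3 can be illustrated with a keyed digest over a canonical encoding of the step. This sketch assumes HMAC-SHA256 over sorted-key JSON; the commit does not state which construction sandbox.py uses, and `sign_step`, `verify_step`, and the sample step fields are hypothetical.

```python
import hashlib
import hmac
import json


def sign_step(step: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the step."""
    # sort_keys and fixed separators keep the encoding stable, so the same
    # step always yields the same bytes regardless of dict ordering.
    payload = json.dumps(step, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_step(step: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign_step(step, key), signature)


# Illustrative values only; a real key would come from the keystore.
DEMO_KEY = b"demo-key-not-for-production"
planned = {"action": "TRUST_CHANGE", "path": "/workspace/a.txt"}
planned_signature = sign_step(planned, DEMO_KEY)
tampered = dict(planned, path="/etc/shadow")
```

A step altered after planning, such as `tampered`, no longer matches `planned_signature`, which is the failure mode control 12 describes.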

### Path 4: MCP Request with Tainted Input → Deny/Sanitize
```
MCP Firewall receives request from tainted session
→ TaintState checked for session: external-data label found
→ TaintRule "no-external-to-write" matches target tool
→ Decision: deny with reason "taint rule violation"
→ Audit entry with taint evidence
```
**Test:** `TestAdversarial_TaintBypassAttempt`
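
Path 4 amounts to a lookup of session-scoped labels followed by a rule match. The sketch below models only that taint layer, not the deny-by-default allowlist in front of it. The class shapes, the `file_write` tool name, and the exact reason string are assumptions made for illustration; the real logic lives in taint.go.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaintRule:
    """Denies a set of tools once a session carries a given taint label."""

    name: str
    label: str
    blocked_tools: frozenset


@dataclass
class TaintState:
    """Session-scoped taint labels, keyed by session ID."""

    labels_by_session: dict = field(default_factory=dict)

    def add(self, session_id: str, label: str) -> None:
        self.labels_by_session.setdefault(session_id, set()).add(label)

    def labels(self, session_id: str) -> set:
        return self.labels_by_session.get(session_id, set())


def evaluate(state: TaintState, rules: list, session_id: str, tool: str) -> tuple:
    """Return (decision, reason) for a tool call from the given session."""
    session_labels = state.labels(session_id)
    for rule in rules:
        if rule.label in session_labels and tool in rule.blocked_tools:
            return ("deny", f"taint rule violation: {rule.name}")
    return ("allow", "no taint rule matched")
```

Because labels are stored per session, tainting one session leaves calls from other sessions unaffected.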

## Operator Verification

An operator can verify the enforcement chain is active by:

1. **Check service health:** `curl http://localhost:8515/health` — incident recorder reports open incident count
2. **Check recovery status:** `curl http://localhost:8515/api/v1/recovery/status` — pending recovery ceremonies
3. **Export forensic bundle:** `curl http://localhost:8515/api/v1/forensic/export` — signed evidence package
4. **Check attestation state:** `curl http://localhost:8505/api/v1/state` — current attestation status
5. **Check integrity state:** `curl http://localhost:8510/api/v1/state` — current integrity baseline status
6. **Verify audit chain:** `curl http://localhost:8496/v1/audit/verify` — MCP firewall audit chain integrity
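
The six checks above can be scripted. The sketch below is one possible approach, with the URLs copied from the list; treating HTTP 200 as "healthy" is an assumption, since the response formats are not documented here, and some endpoints may require a service token. `CHECKS`, `http_get`, and `run_checks` are hypothetical names.

```python
import urllib.error
import urllib.request
from typing import Callable

# Endpoints copied from the operator checklist.
CHECKS = [
    ("service health", "http://localhost:8515/health"),
    ("recovery status", "http://localhost:8515/api/v1/recovery/status"),
    ("forensic export", "http://localhost:8515/api/v1/forensic/export"),
    ("attestation state", "http://localhost:8505/api/v1/state"),
    ("integrity state", "http://localhost:8510/api/v1/state"),
    ("audit chain", "http://localhost:8496/v1/audit/verify"),
]


def http_get(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 401 or 500.
        return exc.code
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return 0


def run_checks(fetch: Callable[[str], int] = http_get) -> dict:
    """Map each check name to True when its endpoint answered with HTTP 200."""
    return {name: fetch(url) == 200 for name, url in CHECKS}
```

Passing `fetch` as a parameter lets the same function run against a stub in tests and against the live services on a deployed host.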

docs/security-status.md

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@

 This document tracks the implementation status of all security features in SecAI_OS.

-Last updated: 2026-03-14
+Last updated: 2026-03-15

 ## Implemented Features


@@ -51,11 +51,11 @@ Last updated: 2026-03-14
 | Agent Verified Supervisor hardening | Implemented | M40 | HMAC-SHA256 signed capability tokens bound to task/intent/policy, nonce replay protection, token expiry, two-phase approval for high-risk actions, per-step PolicyDecision evidence in audit trail, 128 agent tests (up from 93) |
 | HSM-backed key handling | Implemented | M41 | Keystore abstraction layer with pluggable backends (software/TPM2/PKCS#11), key rotation, PCR-sealed TPM2 key hierarchy, PKCS#11 HSM stub for external hardware, auto-detection of available backends, keystore.yaml config, 159 agent tests (up from 128) |
 | Enforcement wiring + CI supply chain verification | Implemented | M42 | Integrity monitor → incident recorder reporting, runtime attestor → incident recorder reporting, incident recorder → containment action execution (freeze agent, disable airlock, force vault relock, quarantine model), CI SBOM generation verification via Syft, cosign availability check, release workflow provenance validation |
+| Stronger isolation (M5 hardening) | Implemented | M43 | Per-service sandbox tightening (device cgroups, resource limits, namespace isolation), agent execution compartmentalization (step signatures, subprocess isolation, per-step capability re-validation), workspace hard walls (symlink/hardlink/FD-reuse detection), model worker isolation profiles, formal adversarial test suite (prompt injection, policy bypass, containment, GPU tamper), CI security regression gate, MCP-specific isolation (trust tier enforcement, per-tool profiles, session binding, dynamic registration denial), recovery ceremony (ack + re-attestation), latched degraded states, severity escalation rules, forensic bundle export (signed), M5 control matrix doc, supply chain provenance doc, M5 acceptance suite (30 tests) |

 ## Planned Features

 | Feature | Status | Notes |
 |---------|--------|-------|
 | Agent Mode Phase 2: Explainability | Planned | Detailed explanations for quarantine/registry/airlock decisions, per-workspace permissions, audit views |
 | Agent Mode Phase 3: Online-assisted | Planned | Airlock-mediated outbound, search mediation, redaction flows, approval UX for online steps |
-| Agent Mode Phase 4: Stronger isolation | Planned | Adversarial testing, additional sandboxing profiles, policy bypass regression tests |

docs/supply-chain-provenance.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# Supply Chain & Provenance Architecture

This document describes which workflow is the source of truth for each stage of the SecAI OS supply chain: image builds, release artifacts, SBOM generation, provenance attestation, and verification before install/update.

Last updated: 2026-03-14

## Workflow Responsibilities

| Stage | Source of Truth | Workflow File | Trigger |
|-------|----------------|---------------|---------|
| **OS Image Builds** | `build.yml` | `.github/workflows/build.yml` | Push to main, daily schedule (06:00), manual dispatch |
| **Release Artifacts** | `release.yml` | `.github/workflows/release.yml` | Tag push (`v*`), manual dispatch |
| **CI Tests** | `ci.yml` | `.github/workflows/ci.yml` | Push to main, PRs, manual dispatch |
| **Image SBOM** | `build.yml` | `.github/workflows/build.yml` | After image build (non-PR only) |
| **Service SBOMs** | `release.yml` | `.github/workflows/release.yml` | At release time |
| **Provenance Attestation** | `release.yml` | `.github/workflows/release.yml` | At release time |
| **Signing** | `build.yml` + `release.yml` | Both | cosign with `SIGNING_SECRET` |
| **Verification** | `ci.yml` (supply-chain-verify job) | `.github/workflows/ci.yml` | Every CI run |

## Provenance Pipeline

```
build → attest → sign → verify → promote
```

### 1. Build (build.yml)
- BlueBuild action builds the OS image from `recipes/recipe.yml`
- Image published to `ghcr.io/sec_ai/secai_os`
- cosign signs the image using `SIGNING_SECRET`

### 2. Attest (build.yml + release.yml)
- **Image SBOM:** `anchore/sbom-action` generates CycloneDX JSON SBOM for the OS image
- **SBOM Attestation:** `cosign attest --type cyclonedx` creates a signed attestation binding the SBOM to the image
- **Service SBOMs:** Syft generates per-service CycloneDX SBOMs at release time
- **SLSA3 Provenance:** `actions/attest-build-provenance` generates GitHub-native SLSA3 provenance attestation

### 3. Sign (build.yml + release.yml)
- All images signed with cosign + `SIGNING_SECRET`
- Release checksums (SHA256SUMS) signed with cosign
- SBOM attestations signed with cosign private key

### 4. Verify (ci.yml)
The `supply-chain-verify` CI job validates:
- Syft can generate SBOMs for all Go and Python services
- cosign is available and functional
- `release.yml` contains required provenance keywords: `sbom-action`, `attest-build-provenance`, `cosign`, `cyclonedx`, `SHA256SUMS`
- `build.yml` contains required SBOM keywords: `sbom-action`, `cosign attest`, `cyclonedx`
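
The keyword checks in the Verify step can be expressed as a substring scan over each workflow file. This is a sketch, not the CI job's actual implementation: the keyword lists come from the bullets above, while `REQUIRED_KEYWORDS`, `missing_keywords`, and the case-sensitive matching are assumptions.

```python
# Required keywords per workflow, taken from the Verify step description.
REQUIRED_KEYWORDS = {
    "release.yml": [
        "sbom-action",
        "attest-build-provenance",
        "cosign",
        "cyclonedx",
        "SHA256SUMS",
    ],
    "build.yml": ["sbom-action", "cosign attest", "cyclonedx"],
}


def missing_keywords(workflow_name: str, workflow_text: str) -> list:
    """Return the required keywords that do not appear in the workflow text."""
    required = REQUIRED_KEYWORDS[workflow_name]
    return [keyword for keyword in required if keyword not in workflow_text]
```

An empty list means the workflow still mentions every required tool. A scan like this confirms only that the keywords are present, not that the corresponding steps run correctly.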

### 5. Promote (runtime)
- At boot, the Runtime Attestor (:8505) verifies the measured boot chain
- rpm-ostree atomic updates ensure image integrity
- Greenboot health checks verify post-boot system state

## Key Material

| Key | Purpose | Storage | Rotation |
|-----|---------|---------|----------|
| `SIGNING_SECRET` | cosign image + SBOM signing | GitHub encrypted secret | Manual rotation |
| HMAC signing key | Capability token + audit chain signing | Keystore (software/TPM2/HSM) | Auto-rotation via keystore |
| TPM2 sealed keys | Vault encryption, attestation | TPM2 PCR-sealed | PCR policy change triggers re-seal |

## SBOM Coverage

| Component | Generator | Format | When |
|-----------|-----------|--------|------|
| OS image | anchore/sbom-action | CycloneDX JSON | build.yml (non-PR) |
| Go services (9) | Syft | CycloneDX JSON | release.yml + ci.yml verification |
| Python services (6) | Syft | CycloneDX JSON | release.yml + ci.yml verification |

### Go Services
airlock, registry, tool-firewall, gpu-integrity-watch, mcp-firewall, policy-engine, runtime-attestor, integrity-monitor, incident-recorder

### Python Services
agent, ui, quarantine, common, diffusion-worker, search-mediator
