Harden agent IPC, reduce audit sensitivity, expand test docs

SecAI-Hub · claude · SecAI-Hub · commit f368598b5d07 · 2026-03-10T16:19:59.000-07:00
- Tighten UI→Agent IPC from loopback TCP to Unix domain socket
  (/run/secure-ai/agent.sock), eliminating TCP attack surface
- Change log_file_paths default to false to reduce audit sensitivity
- Document dev-mode auth bypass as non-production (SECAI_DEV_MODE=1
  required; never set on appliance image)
- Expand test-matrix.md with per-class agent test breakdown (11 classes,
  93 tests with exact counts and categories)
- Update security-status.md M31 entry to reflect current truth

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/docs/components/agent.md b/docs/components/agent.md
@@ -55,9 +55,9 @@ User Intent
 
 | Property | Value |
 |----------|-------|
-| Port | 8476 |
+| Socket | `/run/secure-ai/agent.sock` (Unix domain socket) |
 | Language | Python (Flask) |
-| Bind | 127.0.0.1 (loopback only) |
+| Bind | Unix socket in production; TCP 127.0.0.1:8476 in dev mode |
 | Systemd unit | `secure-ai-agent.service` |
 | Policy file | `/etc/secure-ai/policy/agent.yaml` |
 | Audit log | `/var/lib/secure-ai/logs/agent-audit.jsonl` |
@@ -125,8 +125,9 @@ The agent systemd service uses the same defense-in-depth as other services, with
 
 The agent communicates with other services (registry, tool firewall, airlock, inference) over loopback HTTP. Authentication and access control:
 
-- **Loopback-only binding**: All services bind to `127.0.0.1`, never `0.0.0.0`. Only processes on the local machine can reach service endpoints.
-- **Service tokens**: The agent reads a shared service token from `/run/secure-ai/service-token` (mounted read-only). This Bearer token authenticates requests to peer services with mutating endpoints. If the token file is absent (dev mode), auth is bypassed.
+- **Unix socket IPC (UI→Agent)**: The UI communicates with the agent over a Unix domain socket at `/run/secure-ai/agent.sock`, eliminating TCP attack surface for this channel. The agent still uses loopback TCP for outbound calls to Go services (registry, tool firewall, airlock) which do not support Unix sockets.
+- **Loopback-only binding**: Go services bind to `127.0.0.1`, never `0.0.0.0`. Only processes on the local machine can reach their endpoints.
+- **Service tokens**: The agent reads a shared service token from `/run/secure-ai/service-token` (mounted read-only). This Bearer token authenticates requests to peer services with mutating endpoints. **Production (appliance):** The token file MUST exist; if absent, the agent refuses to start. **Development only:** When `SECAI_DEV_MODE=1` is set explicitly, auth is bypassed to allow local testing without the full service stack. Dev-mode bypass is never enabled on the appliance image — the systemd unit does not set this variable, and the token file is provisioned at boot by `secure-ai-init.service`.
 - **UI→Agent auth**: The UI proxies agent requests through `/api/agent/*` endpoints. These are protected by session-based authentication (scrypt passphrase) and are not in the public endpoint list. All state-changing endpoints (approve, deny, cancel) require an authenticated session.
 - **CSRF protection**: The UI applies CSRF token validation on all POST requests, including agent proxy endpoints. Direct agent-to-agent calls are backend-only (no browser origin).
 - **Fail-closed**: If any peer service is unreachable, the agent returns an error rather than bypassing the service (e.g., tool firewall unreachable → tool invocation fails, airlock unreachable → outbound request fails).
diff --git a/docs/policy-schema.md b/docs/policy-schema.md
@@ -396,7 +396,7 @@ Minimal-logging policy for agent audit records.
 | `log_step_actions` | boolean | `true` | Log which actions were executed |
 | `log_raw_prompts` | boolean | `false` | Log raw LLM prompts (privacy risk — keep false) |
 | `log_raw_content` | boolean | `false` | Log raw file content (privacy risk — keep false) |
-| `log_file_paths` | boolean | `true` | Log which files were accessed (not their content) |
+| `log_file_paths` | boolean | `false` | Log which files were accessed (not their content) — off by default to reduce audit sensitivity; enable explicitly if needed |
 
 ### audit retention
 
diff --git a/docs/security-status.md b/docs/security-status.md
@@ -39,7 +39,7 @@ Last updated: 2026-03-10
 | Weight distribution fingerprinting | Implemented | M28 | Statistical fingerprinting of model weight distributions |
 | Garak LLM vulnerability scanner | Implemented | M29 | Garak integration for LLM vulnerability scanning |
 | gguf-guard deep GGUF integrity scanner | Implemented | M30 | Deep GGUF file format integrity and safety scanning |
-| Agent Mode (Phase 1: safe local autopilot) | Implemented | M31 | Policy-bound agent on :8476 with deny-by-default policy engine, capability tokens, hard budgets, storage gateway, 82 tests |
+| Agent Mode (Phase 1: safe local autopilot) | Implemented | M31 | Policy-bound agent with deny-by-default policy engine, capability tokens, hard budgets, storage gateway, workspace ID abstraction, Unix socket IPC (UI→Agent), 93 tests across 11 classes |
 
 ## Planned Features
 
diff --git a/docs/test-matrix.md b/docs/test-matrix.md
@@ -35,7 +35,23 @@ Last updated: 2026-03-10
 | test_canary_tripwire.py | tests/ | ~49 | Canary token placement, tripwire monitoring, alerts |
 | test_emergency_wipe.py | tests/ | ~65 | 3-level panic wipe, secure deletion, escalation |
 | test_update_rollback.py | tests/ | ~74 | Signed update verification, rollback triggers, recovery |
-| test_agent.py | tests/ | ~93 | Agent policy engine, capability tokens, storage gateway, budgets, planner, executor, API, workspace validation, security invariants |
+| test_agent.py | tests/ | 93 | Agent policy engine, capability tokens, storage gateway, budgets, planner, executor, API, workspace validation, security invariants |
+
+### Agent test breakdown (test_agent.py)
+
+| Class | Tests | Category | Description |
+|-------|-------|----------|-------------|
+| TestClassifyRisk | 3 | Unit | Risk-level classification for agent actions |
+| TestPolicyEngine | 15 | Unit / Security | Deny-by-default evaluation, always-deny invariants, hard-approval gates |
+| TestCapabilityTokens | 8 | Unit | Token creation, workspace scoping, mode-specific capabilities |
+| TestBudgets | 7 | Unit | Budget enforcement, limit checking, sensitive-mode tighter limits |
+| TestStorageGateway | 14 | Unit / Security | Path scope validation, sensitive file blocking, sensitivity ceiling, file size limits |
+| TestPlannerHeuristic | 8 | Unit | Heuristic plan decomposition, keyword-to-action mapping |
+| TestPlannerLLMParsing | 4 | Unit | LLM response parsing, malformed plan rejection |
+| TestExecutor | 6 | Integration | Step execution dispatch, tool firewall calls, budget tracking |
+| TestAgentAPI | 17 | Integration | HTTP endpoint contracts, input validation, task CRUD lifecycle, workspace ID resolution |
+| TestSecurityInvariants | 7 | Security | Fail-closed behavior, airlock/firewall bypass prevention, service-down handling |
+| TestDataModels | 4 | Unit | Task/step serialisation, status enum coverage |
 
 ## Shell Checks
 
diff --git a/files/system/etc/secure-ai/policy/agent.yaml b/files/system/etc/secure-ai/policy/agent.yaml
@@ -93,4 +93,4 @@ logging:
   log_step_actions: true
   log_raw_prompts: false
   log_raw_content: false
-  log_file_paths: true  # log which files were accessed (not content)
+  log_file_paths: false  # log which files were accessed (not content) — off by default to reduce audit sensitivity
diff --git a/files/system/usr/lib/systemd/system/secure-ai-agent.service b/files/system/usr/lib/systemd/system/secure-ai-agent.service
@@ -6,7 +6,7 @@ Requires=secure-ai-registry.service secure-ai-tool-firewall.service
 [Service]
 Type=simple
 ExecStart=/usr/libexec/secure-ai/agent
-Environment=BIND_ADDR=127.0.0.1:8476
+Environment=BIND_ADDR=unix:/run/secure-ai/agent.sock
 Environment=INFERENCE_URL=http://127.0.0.1:8465
 Environment=REGISTRY_URL=http://127.0.0.1:8470
 Environment=TOOL_FIREWALL_URL=http://127.0.0.1:8475
@@ -20,7 +20,8 @@ Environment=SERVICE_TOKEN_PATH=/run/secure-ai/service-token
 # Filesystem isolation
 DynamicUser=yes
 ReadOnlyPaths=/etc/secure-ai
-ReadOnlyPaths=/run/secure-ai
+ReadOnlyPaths=/run/secure-ai/service-token
+ReadWritePaths=/run/secure-ai/agent.sock
 ReadOnlyPaths=/var/lib/secure-ai/vault/user_docs
 ReadWritePaths=/var/lib/secure-ai/vault/outputs
 ReadWritePaths=/var/lib/secure-ai/logs
diff --git a/files/system/usr/lib/systemd/system/secure-ai-ui.service b/files/system/usr/lib/systemd/system/secure-ai-ui.service
@@ -11,7 +11,7 @@ Environment=INFERENCE_URL=http://127.0.0.1:8465
 Environment=REGISTRY_URL=http://127.0.0.1:8470
 Environment=TOOL_FIREWALL_URL=http://127.0.0.1:8475
 Environment=AIRLOCK_URL=http://127.0.0.1:8490
-Environment=AGENT_URL=http://127.0.0.1:8476
+Environment=AGENT_SOCKET=/run/secure-ai/agent.sock
 Environment=SEARCH_MEDIATOR_URL=http://127.0.0.1:8485
 Environment=DIFFUSION_URL=http://127.0.0.1:8455
 Environment=APPLIANCE_CONFIG=/etc/secure-ai/config/appliance.yaml
diff --git a/services/agent/agent/app.py b/services/agent/agent/app.py
@@ -471,16 +471,44 @@ def main():
         format="%(asctime)s %(name)s %(levelname)s %(message)s",
     )
 
-    host, port_str = _BIND_ADDR.rsplit(":", 1)
-    port = int(port_str)
-
-    log.info("agent service starting on %s:%d", host, port)
     log.info("policy: %s", _POLICY_PATH)
     log.info("vault: %s", _VAULT_ROOT)
 
-    _audit_log("service_started", {"bind": _BIND_ADDR})
-
-    app.run(host=host, port=port, debug=False, threaded=True)
+    if _BIND_ADDR.startswith("unix:"):
+        # Production: listen on a Unix domain socket (no TCP attack surface).
+        import socket as _socket
+        from wsgiref.simple_server import WSGIServer, make_server
+
+        sock_path = _BIND_ADDR[len("unix:"):]
+
+        # Remove stale socket file if present (e.g. after unclean shutdown).
+        try:
+            os.unlink(sock_path)
+        except FileNotFoundError:
+            pass
+
+        class _UnixWSGIServer(WSGIServer):
+            address_family = _socket.AF_UNIX
+
+        srv = make_server("", 0, app, server_class=_UnixWSGIServer)
+        # Replace the TCP socket with a Unix one bound to sock_path.
+        srv.socket.close()
+        sock = _socket.socket(_socket.AF_UNIX, _socket.SOCK_STREAM)
+        sock.bind(sock_path)
+        os.chmod(sock_path, 0o660)
+        sock.listen(128)
+        srv.socket = sock
+
+        log.info("agent service starting on unix:%s", sock_path)
+        _audit_log("service_started", {"bind": _BIND_ADDR})
+        srv.serve_forever()
+    else:
+        # Dev / fallback: plain TCP on loopback.
+        host, port_str = _BIND_ADDR.rsplit(":", 1)
+        port = int(port_str)
+        log.info("agent service starting on %s:%d (TCP — dev mode)", host, port)
+        _audit_log("service_started", {"bind": _BIND_ADDR})
+        app.run(host=host, port=port, debug=False, threaded=True)
 
 
 if __name__ == "__main__":
diff --git a/services/ui/ui/app.py b/services/ui/ui/app.py
@@ -164,7 +164,8 @@ def add_security_headers(response):
 REGISTRY_URL = os.getenv("REGISTRY_URL", "http://127.0.0.1:8470")
 TOOL_FIREWALL_URL = os.getenv("TOOL_FIREWALL_URL", "http://127.0.0.1:8475")
 AIRLOCK_URL = os.getenv("AIRLOCK_URL", "http://127.0.0.1:8490")
-AGENT_URL = os.getenv("AGENT_URL", "http://127.0.0.1:8476")
+AGENT_SOCKET = os.getenv("AGENT_SOCKET", "")  # Unix socket path (production)
+AGENT_URL = os.getenv("AGENT_URL", "http://127.0.0.1:8476")  # TCP fallback (dev)
 SEARCH_MEDIATOR_URL = os.getenv("SEARCH_MEDIATOR_URL", "http://127.0.0.1:8485")
 APPLIANCE_CONFIG = os.getenv("APPLIANCE_CONFIG", "/etc/secure-ai/config/appliance.yaml")
 QUARANTINE_DIR = Path(os.getenv("QUARANTINE_DIR", "/var/lib/secure-ai/quarantine"))
@@ -1695,35 +1696,75 @@ def update_health():
 
 
 # ---------------------------------------------------------------------------
-# Agent mode endpoints (proxy to agent service at :8476)
+# Agent IPC helper (Unix socket in production, TCP fallback for dev)
+# ---------------------------------------------------------------------------
+
+def _agent_request(method: str, path: str, *, json_body=None, params=None, timeout=10):
+    """Send an HTTP request to the agent service.
+
+    Uses a Unix domain socket when AGENT_SOCKET is set (production),
+    falls back to TCP via AGENT_URL for local development.
+    """
+    if AGENT_SOCKET:
+        import http.client
+        import json as _json
+        import socket as _socket
+
+        conn = http.client.HTTPConnection("localhost")
+        sock = _socket.socket(_socket.AF_UNIX, _socket.SOCK_STREAM)
+        sock.settimeout(timeout)
+        sock.connect(AGENT_SOCKET)
+        conn.sock = sock
+
+        headers = {"Host": "localhost"}
+        body = None
+        if json_body is not None:
+            body = _json.dumps(json_body).encode()
+            headers["Content-Type"] = "application/json"
+        if params:
+            from urllib.parse import urlencode
+            path = f"{path}?{urlencode(params)}"
+
+        conn.request(method, path, body=body, headers=headers)
+        resp = conn.getresponse()
+        data = resp.read()
+        conn.close()
+        return _json.loads(data), resp.status
+    else:
+        url = f"{AGENT_URL}{path}"
+        if method == "GET":
+            resp = requests.get(url, params=params, timeout=timeout)
+        else:
+            resp = requests.post(url, json=json_body, timeout=timeout)
+        return resp.json(), resp.status_code
+
+
+# ---------------------------------------------------------------------------
+# Agent mode endpoints (proxy to agent service)
 # ---------------------------------------------------------------------------
 
 @app.route("/api/agent/task", methods=["POST"])
 def agent_submit_task():
     """Submit a task to the agent service."""
     body = request.get_json(silent=True) or {}
     try:
-        resp = requests.post(
-            f"{AGENT_URL}/v1/task",
-            json=body,
-            timeout=30,
-        )
+        data, status = _agent_request("POST", "/v1/task", json_body=body, timeout=30)
         _ui_audit.append("agent_task_submitted", {
             "intent_length": len(body.get("intent", "")),
             "mode": body.get("mode", "standard"),
         })
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
 
 
 @app.route("/api/agent/task/<task_id>")
 def agent_get_task(task_id):
     """Get task status from agent service."""
     try:
-        resp = requests.get(f"{AGENT_URL}/v1/task/{task_id}", timeout=10)
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        data, status = _agent_request("GET", f"/v1/task/{task_id}")
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
 
 
@@ -1732,14 +1773,10 @@ def agent_approve_steps(task_id):
     """Approve pending steps in an agent task."""
     body = request.get_json(silent=True) or {}
     try:
-        resp = requests.post(
-            f"{AGENT_URL}/v1/task/{task_id}/approve",
-            json=body,
-            timeout=10,
-        )
+        data, status = _agent_request("POST", f"/v1/task/{task_id}/approve", json_body=body)
         _ui_audit.append("agent_steps_approved", {"task_id": task_id})
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
 
 
@@ -1748,29 +1785,21 @@ def agent_deny_steps(task_id):
     """Deny pending steps in an agent task."""
     body = request.get_json(silent=True) or {}
     try:
-        resp = requests.post(
-            f"{AGENT_URL}/v1/task/{task_id}/deny",
-            json=body,
-            timeout=10,
-        )
+        data, status = _agent_request("POST", f"/v1/task/{task_id}/deny", json_body=body)
         _ui_audit.append("agent_steps_denied", {"task_id": task_id})
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
 
 
 @app.route("/api/agent/task/<task_id>/cancel", methods=["POST"])
 def agent_cancel_task(task_id):
     """Cancel an agent task."""
     try:
-        resp = requests.post(
-            f"{AGENT_URL}/v1/task/{task_id}/cancel",
-            json={},
-            timeout=10,
-        )
+        data, status = _agent_request("POST", f"/v1/task/{task_id}/cancel", json_body={})
         _ui_audit.append("agent_task_cancelled", {"task_id": task_id})
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
 
 
@@ -1779,23 +1808,19 @@ def agent_list_tasks():
     """List agent tasks."""
     limit = request.args.get("limit", 50)
     try:
-        resp = requests.get(
-            f"{AGENT_URL}/v1/tasks",
-            params={"limit": limit},
-            timeout=10,
-        )
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        data, status = _agent_request("GET", "/v1/tasks", params={"limit": limit})
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
 
 
 @app.route("/api/agent/modes")
 def agent_list_modes():
     """List available agent operating modes."""
     try:
-        resp = requests.get(f"{AGENT_URL}/v1/modes", timeout=5)
-        return jsonify(resp.json()), resp.status_code
-    except requests.RequestException as e:
+        data, status = _agent_request("GET", "/v1/modes", timeout=5)
+        return jsonify(data), status
+    except Exception as e:
         return jsonify({"error": f"agent service unavailable: {e}"}), 503
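
Reviewer sketch (not part of the commit): the UI→Agent channel above is HTTP carried over a Unix domain socket. The standalone example below shows one way to do that round trip with only the standard library, as a minimal sketch rather than the project's implementation. The names `UnixWSGIServer`, `QuietHandler`, `UnixHTTPConnection`, `serve_on_unix_socket`, `agent_request`, and `demo_app` are hypothetical and do not appear in the diff. Unlike the `app.py` hunk, the sketch binds the socket path inside `server_bind` instead of swapping sockets after construction, which avoids calling `bind` with a `(host, port)` tuple on an `AF_UNIX` socket, and it closes the client connection in a `finally` block.

```python
import http.client
import json
import os
import socket
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer


class UnixWSGIServer(WSGIServer):
    """WSGIServer bound to a filesystem path instead of a (host, port) pair."""

    address_family = socket.AF_UNIX

    def server_bind(self):
        # HTTPServer.server_bind() unpacks (host, port), which does not apply
        # to AF_UNIX, so bind the path directly and fill in the WSGI environ.
        self.socket.bind(self.server_address)
        self.server_name = "localhost"
        self.server_port = 0
        self.setup_environ()

    def get_request(self):
        # accept() returns an empty peer address for Unix sockets; the request
        # handler indexes client_address[0], so supply a placeholder tuple.
        request, _ = self.socket.accept()
        return request, ("local", 0)


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging to stderr in this sketch


def serve_on_unix_socket(app, socket_path):
    """Create (but do not start) a WSGI server listening on socket_path."""
    try:
        os.unlink(socket_path)  # remove a stale socket from an unclean exit
    except FileNotFoundError:
        pass
    server = UnixWSGIServer(socket_path, QuietHandler)
    server.set_app(app)
    os.chmod(socket_path, 0o660)  # owner and group only
    return server


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix socket path."""

    def __init__(self, socket_path, timeout=10):
        super().__init__("localhost", timeout=timeout)
        self._socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self._socket_path)
        self.sock = sock


def agent_request(socket_path, method, path, json_body=None, timeout=10):
    """Send one JSON request over the socket; return (parsed_body, status)."""
    conn = UnixHTTPConnection(socket_path, timeout=timeout)
    try:
        headers = {}
        body = None
        if json_body is not None:
            body = json.dumps(json_body).encode()
            headers["Content-Type"] = "application/json"
        conn.request(method, path, body=body, headers=headers)
        resp = conn.getresponse()
        return json.loads(resp.read()), resp.status
    finally:
        conn.close()  # release the socket even if the request fails


def demo_app(environ, start_response):
    """Tiny WSGI app that echoes the request method and path as JSON."""
    payload = json.dumps(
        {"method": environ["REQUEST_METHOD"], "path": environ["PATH_INFO"]}
    ).encode()
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

To try it, create the server with `serve_on_unix_socket(demo_app, path)`, run `serve_forever()` in a background thread, and call `agent_request(path, "GET", "/v1/modes")` from the main thread. Keep the socket path short, since Unix socket paths are limited to roughly 100 bytes on common platforms.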