Skip to content

Commit f7c093b

Browse files
samrusaniSami Rusani
andauthored
P11-R1: harden provider runtime security (#142)
Co-authored-by: Sami Rusani <sr@samirusani>
1 parent 7bfb496 commit f7c093b

12 files changed

Lines changed: 652 additions & 188 deletions

BUILD_REPORT.md

Lines changed: 41 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -1,65 +1,65 @@
11
# BUILD_REPORT
22

33
## sprint objective
4-
Implement `P11-S6` by adding tier-2 model packs (DeepSeek, Kimi, Mistral) on the shipped model-pack abstraction, plus compatibility/setup clarity assets for local, self-hosted, enterprise, and external-agent paths, without reopening `P11-S1` through `P11-S5` architecture.
4+
Implement `P11-R1` provider-runtime security hardening to close the release-blocking findings: SSRF via provider `base_url`, upstream error-detail reflection/persistence, and URL userinfo credential exposure.
55

66
## completed work
7-
- Added tier-2 built-in pack specs in `model_packs.py`:
8-
- `deepseek@1.0.0`
9-
- `kimi@1.0.0`
10-
- `mistral@1.0.0`
11-
- Preserved shipped pack API behavior and selection semantics:
12-
- seeded catalog still resolves through existing `/v1/model-packs` flow
13-
- workspace binding and request override precedence are unchanged
14-
- no new runtime/provider paths were introduced
15-
- Extended family contract/type support for tier-2 families:
16-
- `deepseek`, `kimi`, `mistral`
17-
- Added additive migration `20260412_0056_phase11_model_packs_tier2_families.py` to widen `model_packs_family_check` without schema redesign.
18-
- Updated catalog reservation conflict text to cover built-in catalog entries (tier-1 + tier-2).
19-
- Added/updated sprint docs:
20-
- `docs/integrations/phase11-model-pack-compatibility.md` with provider/pack compatibility matrices
21-
- `docs/integrations/phase11-setup-paths.md` with operator setup paths for local, self-hosted, enterprise, and external-agent use
22-
- `docs/integrations/phase11-azure-autogen.md` guardrails/references refreshed for P11-S6
23-
- Updated sprint-owned tests for tier-2 catalog presence, runtime override behavior, and migration coverage.
24-
- Updated control-doc truth checker markers to active `P11-S6` packet/state markers.
25-
- Updated `REVIEW_REPORT.md` for `P11-S6`.
7+
- Added centralized provider URL security policy:
8+
- allowed schemes restricted to `http`/`https`
9+
- rejects userinfo in `base_url`
10+
- blocks loopback, link-local/metadata, RFC1918/private, and other non-global IP literal targets
11+
- Enforced URL policy before persistence and before outbound execution:
12+
- registration paths validate `base_url` before provider row creation
13+
- runtime adapter outbound paths validate `base_url` before helper/network calls
14+
- runtime/test flows hard-reject disallowed stored provider targets
15+
- Sanitized upstream provider error handling:
16+
- provider test/discovery/invoke errors now map to bounded safe messages for API and persistence
17+
- persisted `provider_capabilities.discovery_error` now stores sanitized values
18+
- runtime failure traces store sanitized provider failure messages
19+
- Added serialization hygiene:
20+
- provider serialization now redacts userinfo from `base_url` (defense in depth for legacy rows)
21+
- Added/updated sprint verification coverage:
22+
- blocked target registration cases (`169.254.169.254`, loopback, RFC1918 ranges)
23+
- blocked target runtime/test rejection with no outbound attempt
24+
- userinfo rejection and legacy serialization redaction
25+
- raw upstream detail not reflected or persisted
26+
- Updated control-doc truth rules and roadmap marker to align with active `P11-R1`.
27+
- Updated `REVIEW_REPORT.md` to grade `P11-R1` and explicitly close each in-scope finding.
2628

2729
## incomplete work
28-
- None within the sprint packet scope.
30+
- None within the `P11-R1` sprint packet scope.
2931

3032
## files changed
31-
- `apps/api/src/alicebot_api/model_packs.py`
32-
- `apps/api/src/alicebot_api/contracts.py`
33+
- `apps/api/src/alicebot_api/provider_security.py` (new)
3334
- `apps/api/src/alicebot_api/main.py`
34-
- `apps/api/alembic/versions/20260412_0056_phase11_model_packs_tier2_families.py` (new)
35-
- `tests/unit/test_model_packs.py`
36-
- `tests/integration/test_phase11_model_packs_api.py`
37-
- `tests/unit/test_20260412_0056_phase11_model_packs_tier2_families.py` (new)
38-
- `docs/integrations/phase11-model-pack-compatibility.md`
39-
- `docs/integrations/phase11-setup-paths.md` (new)
40-
- `docs/integrations/phase11-azure-autogen.md`
35+
- `apps/api/src/alicebot_api/provider_runtime.py`
36+
- `apps/api/src/alicebot_api/local_provider_helpers.py`
37+
- `apps/api/src/alicebot_api/azure_provider_helpers.py`
38+
- `tests/unit/test_provider_security.py` (new)
39+
- `tests/unit/test_provider_runtime.py`
40+
- `tests/integration/test_phase11_provider_runtime_api.py`
41+
- `ROADMAP.md`
4142
- `scripts/check_control_doc_truth.py`
4243
- `REVIEW_REPORT.md`
4344
- `BUILD_REPORT.md`
4445

4546
## tests run
4647
1. `python3 scripts/check_control_doc_truth.py`
47-
- Result: PASS
48+
- Result: PASS
49+
- Output: `Control-doc truth check: PASS`
4850

4951
2. `./.venv/bin/python -m pytest tests/unit tests/integration -q`
50-
- Result: PASS (`1145 passed in 185.18s (0:03:05)`)
52+
- Result: PASS
53+
- Output: `1169 passed in 185.41s (0:03:05)`
5154

52-
3. `pnpm --dir apps/web test`
53-
- Result: PASS (`62 files`, `199 tests passed`, duration `5.49s`)
54-
55-
4. Focused sprint tests during implementation:
56-
- `./.venv/bin/python -m pytest tests/unit/test_model_packs.py tests/integration/test_phase11_model_packs_api.py tests/unit/test_20260412_0056_phase11_model_packs_tier2_families.py -q`
57-
- Result: PASS (`14 passed in 1.62s`)
55+
3. `./.venv/bin/bandit -r apps/api/src/alicebot_api/provider_runtime.py apps/api/src/alicebot_api/local_provider_helpers.py apps/api/src/alicebot_api/azure_provider_helpers.py apps/api/src/alicebot_api/main.py`
56+
- Result: PASS
57+
- Output: `No issues identified`
5858

5959
## blockers/issues
60-
- No functional blockers for sprint scope implementation.
61-
- Pre-existing dirty file not modified as sprint work and excluded from sprint merge scope:
60+
- No implementation blockers in sprint scope.
61+
- Workspace contains a pre-existing unrelated dirty file not modified by this sprint:
6262
- `README.md`
6363

6464
## recommended next step
65-
Proceed to merge review for `P11-S6`, then run staging smoke checks for one local provider, one self-hosted OpenAI-compatible provider, and one Azure provider with tier-2 and custom pack coverage.
65+
Proceed to security review sign-off for `P11-R1`, then merge once the release hold is formally cleared against the three closed findings.

REVIEW_REPORT.md

Lines changed: 31 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -1,52 +1,52 @@
11
# REVIEW_REPORT
22

3+
## sprint
4+
`P11-R1` Phase 11 Security Remediation Sprint 1: Provider Runtime Hardening
5+
36
## verdict
47
PASS
58

69
## criteria met
7-
- Tier-2 packs are implemented on the existing model-pack seam: `deepseek@1.0.0`, `kimi@1.0.0`, `mistral@1.0.0` (`apps/api/src/alicebot_api/model_packs.py`).
8-
- Family contract support is added additively in code + DB constraint migration (no provider/runtime redesign):
9-
- `apps/api/src/alicebot_api/contracts.py`
10-
- `apps/api/alembic/versions/20260412_0056_phase11_model_packs_tier2_families.py`
11-
- Pack listing/detail/binding/invoke flows remain on shipped APIs and semantics:
12-
- workspace default binding still applies when no request override is provided
13-
- request-level pack override still takes precedence
14-
- reserved built-in catalog IDs/versions are blocked from custom create
15-
- Compatibility and launch-clarity docs are present and within sprint scope:
16-
- `docs/integrations/phase11-model-pack-compatibility.md`
17-
- `docs/integrations/phase11-setup-paths.md`
18-
- `docs/integrations/phase11-azure-autogen.md` (guardrail/reference update)
19-
- Sprint tests cover tier-2 catalog presence, runtime shaping override path, and migration statements:
20-
- `tests/unit/test_model_packs.py`
21-
- `tests/integration/test_phase11_model_packs_api.py`
22-
- `tests/unit/test_20260412_0056_phase11_model_packs_tier2_families.py`
23-
- Required verification commands were executed and passed:
24-
- `python3 scripts/check_control_doc_truth.py` -> PASS
25-
- `./.venv/bin/python -m pytest tests/unit tests/integration -q` -> `1145 passed in 185.18s`
26-
- `pnpm --dir apps/web test` -> `62 files passed, 199 tests passed in 5.49s`
27-
- Local identifier sweep on sprint-owned changes found no leaked local computer paths/usernames.
10+
- Registration and runtime test/invoke flows hard-reject disallowed provider targets, including metadata/link-local, loopback, and RFC1918/private ranges.
11+
- No outbound call is attempted after disallowed target detection in provider test/runtime flow coverage.
12+
- Provider HTTP failures do not expose raw upstream provider detail in API responses.
13+
- Persisted provider discovery/runtime errors are sanitized and redacted.
14+
- Provider URLs containing embedded userinfo are rejected on registration, and serialized provider rows redact legacy userinfo.
15+
- Existing Phase 11 provider/runtime/model-pack behavior remains intact outside intended hardening.
16+
- Sprint closes the three in-scope security findings without feature-scope expansion.
2817

2918
## criteria missed
3019
- None.
3120

3221
## quality issues
33-
- None blocking for `P11-S6`.
34-
- Overreach check: no new provider adapters, no new framework integrations beyond shipped AutoGen path, and no product-surface expansion detected.
22+
- None blocking.
23+
- Residual operational note: repo-level URL policy is strong, but production should still enforce network-layer egress policy as defense in depth.
3524

3625
## regression risks
37-
- Low: compatibility posture is declarative/documented, but real provider/model availability is deployment-dependent and should still be smoke-tested per environment.
26+
- Low. Core risk area (SSRF validation bypass via non-canonical IPv4 encodings) is now covered by both unit and integration tests.
3827

3928
## docs issues
40-
- No scope violations found in sprint docs.
41-
- Process note: `README.md` is currently dirty in the workspace; ensure only intended sprint files are included in merge scope.
29+
- No local machine identifiers (paths/usernames) found in sprint-owned files.
30+
- Review docs are now aligned with implemented security behavior.
4231

4332
## should anything be added to RULES.md?
44-
- No.
33+
- Recommended: add a permanent rule requiring provider URL validation tests for non-canonical IPv4 forms (hex/octal/shorthand/integer) whenever URL policy is touched.
4534

4635
## should anything update ARCHITECTURE.md?
47-
- No.
36+
- Recommended: add one short provider-runtime egress boundary note clarifying that application URL validation and infra egress controls are complementary controls.
4837

4938
## recommended next action
50-
1. Approve `P11-S6` for merge.
51-
2. Before merge, confirm the final PR file list excludes unrelated dirty files.
52-
3. Run staging smoke checks across one local provider path, one self-hosted openai-compatible path, and one Azure path using tier-2 pack bind/override flows.
39+
1. Approve `P11-R1` for merge.
40+
2. Keep the new non-canonical host blocked-target tests as required coverage for future provider-runtime URL policy changes.
41+
3. Clear the release `HOLD` once security sign-off records this closure evidence.
42+
43+
## evidence summary
44+
- Code fix for bypass class:
45+
- `apps/api/src/alicebot_api/provider_security.py`: URL validator now canonicalizes IPv4 integer/hex/octal/shorthand forms via `socket.inet_aton` and blocks disallowed resolved IPs.
46+
- Added/updated security regression tests:
47+
- `tests/unit/test_provider_security.py`
48+
- `tests/integration/test_phase11_provider_runtime_api.py`
49+
- Required verification commands (re-run):
50+
- `python3 scripts/check_control_doc_truth.py` -> PASS
51+
- `./.venv/bin/python -m pytest tests/unit tests/integration -q` -> `1169 passed in 185.41s (0:03:05)`
52+
- `./.venv/bin/bandit -r apps/api/src/alicebot_api/provider_runtime.py apps/api/src/alicebot_api/local_provider_helpers.py apps/api/src/alicebot_api/azure_provider_helpers.py apps/api/src/alicebot_api/main.py` -> No issues identified

ROADMAP.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,12 +50,19 @@ Make Alice the continuity layer that works across local, self-hosted, enterprise
5050
- runtime and pack compatibility matrices
5151
- docs for local, self-hosted, enterprise, and external-agent paths
5252

53+
### P11-R1: Provider Runtime Hardening (Security Remediation Sprint 1)
54+
55+
- outbound URL validation and SSRF resistance for provider registration and runtime/test calls
56+
- sanitized upstream provider error surfaces for API responses and persistence
57+
- URL userinfo rejection plus defensive redaction on serialized provider rows
58+
5359
## Sequencing Rules
5460

5561
- Stabilize abstraction and normalization before adding provider breadth.
5662
- Complete tier-1 providers before tier-2 model-pack breadth.
5763
- Ship tier-1 packs cleanly before expanding long-tail packs.
5864
- Treat enterprise adapter and credential hardening as a release gate, not polish.
65+
- Clear provider-runtime security holds before Phase 11 release closeout.
5966

6067
## Phase 11 Exit
6168

apps/api/src/alicebot_api/azure_provider_helpers.py

Lines changed: 8 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -7,17 +7,19 @@
77
from urllib.request import Request, urlopen
88

99
from alicebot_api.contracts import ModelInvocationRequest, ModelInvocationResponse, ModelUsagePayload
10+
from alicebot_api.provider_security import validate_provider_base_url
1011
from alicebot_api.response_generation import ModelInvocationError
1112

1213
AZURE_AUTH_MODE_API_KEY = "azure_api_key"
13-
AZURE_AUTH_MODE_AD_TOKEN = "azure_ad_token"
14+
# Static auth-mode label; not a credential value.
15+
AZURE_AUTH_MODE_AD_TOKEN = "azure_ad_token" # nosec B105
1416
DEFAULT_AZURE_API_VERSION = "2024-10-21"
1517

1618

1719
def build_azure_auth_headers(*, auth_mode: str, credential: str) -> dict[str, str]:
1820
mode = auth_mode.strip().lower()
1921
token = credential.strip()
20-
if token == "":
22+
if token == "": # nosec B105
2123
raise ModelInvocationError("azure credential is required")
2224
if mode == AZURE_AUTH_MODE_API_KEY:
2325
return {"api-key": token}
@@ -36,9 +38,10 @@ def request_azure_json(
3638
headers: dict[str, str] | None = None,
3739
payload: dict[str, Any] | None = None,
3840
) -> dict[str, Any]:
41+
validated_base_url = validate_provider_base_url(base_url)
3942
normalized_path = path if path.startswith("/") else f"/{path}"
4043
endpoint = _append_api_version(
41-
url=base_url.rstrip("/") + normalized_path,
44+
url=validated_base_url.rstrip("/") + normalized_path,
4245
api_version=api_version,
4346
)
4447
request_headers = {"Accept": "application/json"}
@@ -50,15 +53,12 @@ def request_azure_json(
5053
request_headers["Content-Type"] = "application/json"
5154
request = Request(endpoint, data=body, headers=request_headers, method=method)
5255
try:
53-
with urlopen(request, timeout=timeout_seconds) as response:
56+
with urlopen(request, timeout=timeout_seconds) as response: # nosec B310
5457
raw_payload = response.read()
5558
except HTTPError as exc:
56-
detail = _extract_http_error_detail(exc)
57-
if detail is not None:
58-
raise ModelInvocationError(detail) from exc
5959
raise ModelInvocationError(f"model provider returned HTTP {exc.code}") from exc
6060
except URLError as exc:
61-
raise ModelInvocationError(f"model provider request failed: {exc.reason}") from exc
61+
raise ModelInvocationError("model provider request failed") from exc
6262

6363
try:
6464
parsed_payload = json.loads(raw_payload)
@@ -214,24 +214,3 @@ def _parse_usage(payload: dict[str, Any]) -> ModelUsagePayload:
214214
break
215215

216216
return usage
217-
218-
219-
def _extract_http_error_detail(exc: HTTPError) -> str | None:
220-
raw_body = exc.read().decode("utf-8", errors="replace")
221-
try:
222-
parsed_error = json.loads(raw_body)
223-
except json.JSONDecodeError:
224-
return None
225-
226-
if not isinstance(parsed_error, dict):
227-
return None
228-
229-
error = parsed_error.get("error")
230-
if isinstance(error, dict):
231-
detail = error.get("message")
232-
if isinstance(detail, str) and detail.strip():
233-
return detail
234-
detail = parsed_error.get("detail")
235-
if isinstance(detail, str) and detail.strip():
236-
return detail
237-
return None

apps/api/src/alicebot_api/local_provider_helpers.py

Lines changed: 6 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
from urllib.request import Request, urlopen
77

88
from alicebot_api.contracts import ModelInvocationRequest, ModelInvocationResponse, ModelUsagePayload
9+
from alicebot_api.provider_security import validate_provider_base_url
910
from alicebot_api.response_generation import ModelInvocationError
1011

1112

@@ -15,7 +16,7 @@ def build_auth_headers(*, auth_mode: str, api_key: str) -> dict[str, str]:
1516
return {}
1617
if mode == "bearer":
1718
token = api_key.strip()
18-
if token == "":
19+
if token == "": # nosec B105
1920
raise ModelInvocationError("provider api_key is required when auth_mode is bearer")
2021
return {"Authorization": f"Bearer {token}"}
2122
raise ModelInvocationError(f"unsupported provider auth_mode: {auth_mode}")
@@ -30,8 +31,9 @@ def request_json(
3031
headers: dict[str, str] | None = None,
3132
payload: dict[str, Any] | None = None,
3233
) -> dict[str, Any]:
34+
validated_base_url = validate_provider_base_url(base_url)
3335
normalized_path = path if path.startswith("/") else f"/{path}"
34-
endpoint = base_url.rstrip("/") + normalized_path
36+
endpoint = validated_base_url.rstrip("/") + normalized_path
3537
request_headers = {"Accept": "application/json"}
3638
if headers:
3739
request_headers.update(headers)
@@ -41,15 +43,12 @@ def request_json(
4143
request_headers["Content-Type"] = "application/json"
4244
request = Request(endpoint, data=body, headers=request_headers, method=method)
4345
try:
44-
with urlopen(request, timeout=timeout_seconds) as response:
46+
with urlopen(request, timeout=timeout_seconds) as response: # nosec B310
4547
raw_payload = response.read()
4648
except HTTPError as exc:
47-
detail = _extract_http_error_detail(exc)
48-
if detail is not None:
49-
raise ModelInvocationError(detail) from exc
5049
raise ModelInvocationError(f"model provider returned HTTP {exc.code}") from exc
5150
except URLError as exc:
52-
raise ModelInvocationError(f"model provider request failed: {exc.reason}") from exc
51+
raise ModelInvocationError("model provider request failed") from exc
5352

5453
try:
5554
parsed_payload = json.loads(raw_payload)
@@ -179,24 +178,3 @@ def parse_llamacpp_invoke_response(
179178
output_text=output_text,
180179
usage=usage,
181180
)
182-
183-
184-
def _extract_http_error_detail(exc: HTTPError) -> str | None:
185-
raw_body = exc.read().decode("utf-8", errors="replace")
186-
try:
187-
parsed_error = json.loads(raw_body)
188-
except json.JSONDecodeError:
189-
return None
190-
191-
if not isinstance(parsed_error, dict):
192-
return None
193-
194-
error = parsed_error.get("error")
195-
if isinstance(error, dict):
196-
detail = error.get("message")
197-
if isinstance(detail, str) and detail.strip():
198-
return detail
199-
detail = parsed_error.get("detail")
200-
if isinstance(detail, str) and detail.strip():
201-
return detail
202-
return None

0 commit comments

Comments
 (0)