
P11-S3: vLLM adapter and telemetry#137

Open
samrusani wants to merge 1 commit into main from codex/phase11-sprint-3-vllm-adapter-selfhosted

Conversation


@samrusani samrusani commented Apr 11, 2026

Summary

This PR delivers Phase 11 Sprint 3 by adding the vLLM self-hosted adapter path on top of the shipped provider abstraction from P11-S1 and the local-adapter work from P11-S2.

What changed

  • adds the vllm adapter and self-hosted helper wiring behind the existing provider registry
  • adds POST /v1/providers/vllm/register and GET /v1/providers/{provider_id}/telemetry
  • keeps POST /v1/providers/test, POST /v1/runtime/invoke, GET /v1/providers, and GET /v1/providers/{provider_id} working through the shipped normalized runtime seam
  • adds bounded vLLM passthrough options, normalized latency and usage telemetry persistence, and telemetry exposure for provider test/runtime invoke flows
  • adds self-hosted docs and a runnable end-to-end helper script for the vLLM path
  • updates build/review evidence and the control-doc truth checker to match the committed P11-S3 payload
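For illustration, here is a minimal sketch of how a client might exercise the two new endpoints. The base URL, payload field names, and option names are assumptions for the sketch, not the shipped API contract:

```python
import json
import urllib.request

# Assumed base URL for the API; adjust to your deployment.
BASE_URL = "http://localhost:8000"


def register_vllm_request(name: str, base_url: str, options: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the new vLLM register endpoint.

    The payload field names here are illustrative assumptions.
    """
    payload = json.dumps({"name": name, "base_url": base_url, "options": options}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/v1/providers/vllm/register",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def telemetry_request(provider_id: str) -> urllib.request.Request:
    """Build a GET for the per-provider telemetry endpoint."""
    return urllib.request.Request(f"{BASE_URL}/v1/providers/{provider_id}/telemetry")


reg = register_vllm_request("vllm-local", "http://localhost:8001", {"max_tokens": 512})
print(reg.get_method(), reg.full_url)
```

Sending these prepared requests (for example via `urllib.request.urlopen`) would then return the normalized provider and telemetry payloads described above.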

Upgrade Overview

Protected Areas

  • continuity APIs
  • memory schema
  • trust rules

Compatibility Impact

The P11-S3 changes are additive on top of the shipped provider abstraction. Existing P11-S1 OpenAI-compatible flows, P11-S2 local-provider flows, and v0/responses behavior remain on the same normalized runtime seam; the new API surface is the additive POST /v1/providers/vllm/register and GET /v1/providers/{provider_id}/telemetry path.

Migration / Rollout

Apply the new 20260411_0054_phase11_vllm_telemetry migration before using the vLLM telemetry path. Roll out by registering vLLM providers through the new endpoint and validating healthcheck, invoke, and telemetry behavior in one workspace before broader self-hosted adoption.

Operator Action

Operators need to run the normal API migration flow, keep the vLLM service reachable at its configured base URL, and register the provider with only the bounded passthrough options supported by the adapter. No manual data backfill is required for existing providers.
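The "bounded passthrough options" constraint could be enforced along these lines; the allowlist below is a hypothetical example, not the adapter's actual option set:

```python
# Hypothetical allowlist of vLLM options the adapter passes through;
# the real adapter defines its own bounded set.
ALLOWED_VLLM_OPTIONS = {"max_tokens", "temperature", "top_p"}


def validate_passthrough_options(options: dict) -> dict:
    """Reject any option outside the bounded allowlist instead of forwarding it."""
    unknown = set(options) - ALLOWED_VLLM_OPTIONS
    if unknown:
        raise ValueError(f"unsupported vLLM passthrough options: {sorted(unknown)}")
    return dict(options)


validate_passthrough_options({"max_tokens": 256, "temperature": 0.2})  # accepted
```

Rejecting unknown options at registration time, rather than silently dropping them, keeps the provider record consistent with what the adapter will actually forward.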

Validation

Validation for this branch head: python3 scripts/check_control_doc_truth.py passes, ./.venv/bin/python -m pytest tests/unit tests/integration -q passes (1122 passed in 170.62s), and pnpm --dir apps/web test passes (62 passed files, 199 passed tests). The sprint also adds targeted runtime and telemetry coverage in the updated provider-runtime unit/integration tests and a new unit test for the telemetry migration.

Rollback

Rollback is the standard application rollback plus a revert of the P11-S3 commit and schema change if the vLLM path must be withdrawn. If rollout issues appear after registration, operators can stop using the registered vLLM provider path without affecting the shipped OpenAI-compatible, Ollama, or llama.cpp flows.

Verification

  • python3 scripts/check_control_doc_truth.py
    • PASS
  • ./.venv/bin/python -m pytest tests/unit tests/integration -q
    • PASS (1122 passed in 170.62s)
  • pnpm --dir apps/web test
    • PASS (62 passed files, 199 passed tests)
    • duration 4.86s

Merge Scope Notes

  • README.md, ARCHITECTURE.md, and PRODUCT_BRIEF.md remain locally dirty and are explicitly excluded from this sprint merge scope.

