
P124 — Codex Subscription Proxy Re-Architecture (Phase 1)#1

Open
sterling-prog wants to merge 10 commits into main from build/P124-ccproxy-codex


Conversation

@sterling-prog
Owner

Summary

  • Module 1: Forked ccproxy-api at v0.2.6, stripped to Codex + operational plugins, port :3462, Claude Code adapter retained but disabled, fork documented in FORK-README.md
  • Module 2 (Codex Integration & Hardening): rate limiting (60 req/min), /ready alias, all 8 Codex models registered with correct context windows (spark: 128K, rest: 272K), smoke test suite (9 pass, 1 manual skip), auth verified (grady@athenscooks.com, pro, active until 2026-04-21), tool-policy gap documented
  • Module 3: PM2 service (ccproxy-codex) registered and running, dump saved; openclaw.json gateway switch pending Grady's explicit approval

openclaw.json change required (needs explicit authorization)

The only remaining step for Module 3 is a one-line config change:

# in openclaw.json, codex provider:
"baseUrl": "http://localhost:3462/v1"   # change from :3460 to :3462

Per AGENTS.md policy (gateway config = explicit approval required), I'm holding on this until Grady signs off.

Test plan

  • Smoke test: 9 pass, 1 skip (long-running, manual)
  • Non-streaming chat completion confirmed working
  • Streaming SSE confirmed working
  • Auth valid via ccproxy auth status codex
  • All 8 Codex models in /v1/models with correct context windows
  • Metrics endpoint populated
  • PM2 process online and healthy
  • openclaw.json baseUrl switch (pending Grady approval)
  • Gateway-routed integration test (post-switch)
  • Shadow validation / 24h auth cycle (Module 4)

🤖 Generated with Claude Code

Sterling and others added 10 commits March 26, 2026 17:38
…gure for port 3462

- Remove plugins: copilot, credential_balancer, analytics, dashboard, duckdb_storage, pricing, docker
- Fix access_log/plugin.py: make analytics.ingest import lazy (was hard top-level import)
- Fix testing/endpoints/config.py: remove copilot import + PROVIDER_CONFIGS/TOOL_ACCUMULATORS entries
- Update config/settings.py: remove copilot from DEFAULT_ENABLED_PLUGINS
- Update pyproject.toml: remove entry points for deleted plugins
- Add config.toml (gitignored): port 3462, codex-only, claude disabled, concurrency tuned
- Add FORK-README.md documenting what was kept vs removed and why

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
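The lazy-import fix in access_log/plugin.py can be pictured roughly as follows. This is a minimal sketch, not the fork's code: `load_optional`, `log_request`, and the placeholder module name are hypothetical, and falling back silently when the module is missing is an assumption about the intended behavior.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional(module_name: str) -> Optional[ModuleType]:
    """Import a module at call time; return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def log_request(record: dict, ingest_module: str = "analytics_ingest_placeholder") -> bool:
    """Return True if the record was forwarded to the optional ingest hook."""
    # The import happens here, on use, instead of at module top level,
    # so deleting the analytics plugin no longer breaks plugin loading.
    mod = load_optional(ingest_module)
    if mod is None:
        return False  # assumption: access logging simply continues without ingestion
    mod.ingest(record)
    return True
```

With a top-level import, removing the analytics plugin would make the access_log plugin fail at import time; deferring the import limits the failure to the one code path that needs it.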
- Rate limiting middleware (60 req/min sliding window, P124 spec invariant)
- /ready alias endpoint for health checks
- All 8 Codex models registered with correct context windows (gpt-5.3-codex-spark: 128K, all others: 272K)
- Concurrency config in config.toml: max 5 concurrent, queue 20, timeout 900s
- Tool execution policy gap documented in docs/tool-policy-gap.md
- Smoke test suite in scripts/smoke-test.sh (9 pass, 1 manual skip)
- Live test confirmed: non-streaming, streaming, /health, /ready, /v1/models, /metrics all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ecosystem.config.cjs: PM2 config for ccproxy-codex on port :3462
- Process registered, health verified, PM2 dump saved
- openclaw.json gateway switch pending Grady explicit approval (see below)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…GE_GAP)

Default path now verifies timeout config (request_timeout=900s, queue_timeout=120s)
instead of unconditionally skipping. Live >60s request test available via
SMOKE_LONG_RUNNING=1. Smoke results: 10/10 pass, 0 skip.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. middleware.py: Remove redundant if/else TYPE_CHECKING guard where both
   branches were identical — replace with a direct unconditional import of
   BaseHTTPMiddleware. Remove now-unused TYPE_CHECKING from typing import.

2. config.toml: Set allow_credentials = false. The CORS spec prohibits
   Access-Control-Allow-Credentials: true when Allow-Origin is '*'.
   This proxy is localhost-only; credentials flag was unneeded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PM2 defaults to the Node.js interpreter when no extension is present.
The ccproxy binary has a Python shebang that PM2 ignores without
interpreter: 'none', causing SyntaxError crash loops (430 restarts).
Setting interpreter: 'none' lets the OS honour the shebang directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1. rate_limit.py: Add asyncio.Lock around check-and-append to prevent
   concurrent coroutines from both passing the capacity check before
   either records its timestamp. Lock also guards the headers snapshot.

2. scripts/smoke-test.sh: Replace predictable /tmp/smoke_*.json names
   with mktemp -d temp directory + EXIT trap cleanup. Eliminates symlink
   TOCTOU race against world-writable /tmp.

3. config.toml: Disable command_replay plugin (not tracked in git but
   updated on disk). Replay scripts include Authorization: Bearer headers
   — credential exposure to any local user with /tmp access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
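The check-and-append race described in item 1 can be illustrated with a small sliding-window limiter. This is a sketch under stated assumptions, not the contents of rate_limit.py: the class name, method name, and the injectable `now` parameter are hypothetical and exist only to make the example testable.

```python
from __future__ import annotations

import asyncio
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical sliding-window limiter (illustrative only)."""

    def __init__(self, limit: int = 60, window_s: float = 60.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self._stamps: deque[float] = deque()
        self._lock = asyncio.Lock()

    async def try_acquire(self, now: float | None = None) -> bool:
        now = time.monotonic() if now is None else now
        # Hold the lock across the capacity check AND the append, so two
        # coroutines cannot both pass the check before either records itself.
        async with self._lock:
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window_s:
                self._stamps.popleft()
            if len(self._stamps) >= self.limit:
                return False
            self._stamps.append(now)
            return True


async def demo() -> list[bool]:
    # Create the limiter inside the running loop, then race three requests
    # against a limit of two.
    limiter = SlidingWindowLimiter(limit=2, window_s=60.0)
    return list(await asyncio.gather(*(limiter.try_acquire(now=0.0) for _ in range(3))))
```

In this sketch nothing awaits between the check and the append, so the lock is strictly defensive; it becomes necessary as soon as any `await` (for example, an async store lookup) sits inside that critical section.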
apply_to_app() was silently swallowing exceptions from add_middleware()
and continuing startup. A security-layer middleware (rate limiter, auth)
failing to register would leave the proxy unprotected with no error at
startup — the server would appear healthy but serve unguarded traffic.

Fix: re-raise as RuntimeError if a SECURITY-layer middleware (priority
<= 100) fails to register. Plugin middleware still logs and continues
(a plugin failing should not crash the service).

Also: /metrics smoke test now skips on 503 instead of failing — 503
means prometheus-client is not installed, which is not a regression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
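The fail-closed rule for security-layer middleware might look roughly like this. It is a sketch, assuming a hypothetical `register_all` helper and a `(priority, class, kwargs)` spec tuple; only the priority threshold of 100 and the log-and-continue behavior for plugins come from the commit message.

```python
import logging
from typing import Any, Callable, Dict, List, Tuple

logger = logging.getLogger("middleware")

# From the commit text: priority <= 100 marks the security layer.
SECURITY_PRIORITY_MAX = 100

Spec = Tuple[int, type, Dict[str, Any]]  # hypothetical (priority, class, kwargs) shape


def register_all(add_middleware: Callable[..., None], specs: List[Spec]) -> List[str]:
    """Register middleware; abort startup if a security-layer entry fails."""
    registered: List[str] = []
    for priority, cls, kwargs in specs:
        try:
            add_middleware(cls, **kwargs)
        except Exception as e:
            logger.error("middleware registration failed: %s", cls.__name__, exc_info=e)
            if priority <= SECURITY_PRIORITY_MAX:
                # Fail closed. Chained with 'from e' as in this commit;
                # the following commit switches to 'from None'.
                raise RuntimeError(
                    f"security middleware {cls.__name__} failed to register"
                ) from e
            continue  # plugin layer: log and keep starting up
        registered.append(cls.__name__)
    return registered
```

The design choice is that a missing plugin degrades a feature, while a missing rate limiter or auth layer changes what traffic is allowed, so only the latter is treated as fatal.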
…y middleware error

The previous fix raised RuntimeError with 'from e', chaining the original
exception. kwargs passed to add_middleware() may include secrets or tokens
(e.g., auth keys for future security plugins). Chaining via __cause__ would
propagate those sensitive values up the call stack and into any exception
handler that inspects __context__ or prints the full traceback.

Fix: use 'from None' to suppress the chain. The original exception is
already captured in the structured logger (exc_info=e) before the re-raise,
so no diagnostic information is lost.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
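The effect of `from None` can be checked directly on the exception object. The sketch below uses hypothetical function names and a made-up secret value; it shows what standard Python does, not the proxy's actual registration code.

```python
import traceback


def register_security_middleware(secret: str) -> None:
    """Hypothetical stand-in: registration fails and the error text holds a secret."""
    try:
        raise ValueError(f"bad kwargs: token={secret}")
    except ValueError:
        # Suppress the chain so the ValueError is not rendered with the new error.
        raise RuntimeError("security middleware failed to register") from None


def inspect_failure(secret: str) -> dict:
    """Capture how the re-raised error looks to a handler and in a traceback."""
    try:
        register_security_middleware(secret)
    except RuntimeError as err:
        rendered = "".join(
            traceback.format_exception(type(err), err, err.__traceback__)
        )
        return {
            "cause": err.__cause__,
            "suppress_context": err.__suppress_context__,
            "context_type": type(err.__context__).__name__,
            "secret_in_traceback": secret in rendered,
        }
    return {}
```

One caveat worth keeping in mind: `from None` sets `__cause__` to `None` and `__suppress_context__` to `True`, which removes the original exception from rendered tracebacks, but the interpreter still stores it on `__context__`. A handler that reads `__context__` explicitly can therefore still reach the original exception.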
