Skip to content

Address deferred items from issue #48#59

Merged
anilmurty merged 1 commit into
mainfrom
fix/issue-48-deferred-items
May 12, 2026
Merged

Address deferred items from issue #48#59
anilmurty merged 1 commit into
mainfrom
fix/issue-48-deferred-items

Conversation

@anilmurty
Copy link
Copy Markdown
Contributor

Closes #48.

Three fixes, one per sub-item in the issue.

1. FastAPI lifespan migration (Greptile P2 from #47)

`tokenjam/cli/cmd_serve.py` had two `@app.on_event("startup" | "shutdown")` handlers — the legacy FastAPI API that's deprecated in newer releases. Converted to a single `@asynccontextmanager`-decorated lifespan and threaded it through `create_app(..., lifespan=...)`. `create_app()` grows an optional `lifespan` parameter (default `None`) so existing test callers keep working without changes.

Side benefit: `scheduler.start()` now runs inside the lifespan too — same reasoning as the existing comment about `server.state`: a failed-to-bind `tj serve` no longer spins up an orphan retention scheduler before the port-bind failure.

2. Doctor's drift-baseline check no longer warns on every new agent

`DriftConfig.enabled` defaults to `True`, so every brand-new agent (0–9 completed sessions) was tripping a `warning` on every `tj doctor` run. Collection-in-progress is the expected state, so this was just noise.

Changes in `_check_drift_inactive`:

  • Downgraded `warning` → `info` for in-progress agents.
  • Reworded as `Collecting baseline: agent (N/M)` so the line is informative rather than alarming.
  • Multiple in-progress agents are reported in one line instead of returning at the first match.
  • Threshold-reached agents stay silent (`ok`); agents with drift explicitly disabled are ignored.

Also tightened the spans-stats check's "skipped" messages — they now say "running through the HTTP API fallback" instead of the misleading "non-DuckDB backend". The backend IS DuckDB; the CLI session just doesn't have direct access while `tj serve` holds the write lock.

3. `cost_budget_daily` now fires in the budget breach demo

Diagnosed as a pricing-reference mismatch, not an alert-engine bug. The demo used model id `claude-sonnet-4-20250514` which isn't in `pricing/models.toml` (the table only carries the short-form ids `claude-{opus,sonnet,haiku}-4-X`). The cost engine fell back to default rates (`$0.50`/`$2.00` per MTok), making the 10-call total $0.0315 — below the $0.05 daily budget, so daily never breached. The session alert still fired because session budget is $0.02.

Switched the demo to `claude-sonnet-4-6` (which is in the table at $3/$15 per MTok), bringing total spend to $0.2205. Both alerts now fire as the demo's comments describe:

```
17:38:11 ⚠ CRITICAL cost_budget_session budget-demo Session cost $0.2205 exceeds budget $0.0200
17:38:11 ⚠ CRITICAL cost_budget_daily budget-demo Daily cost $0.2205 exceeds budget $0.0500
```

Not in this PR (worth tracking separately)

Adding dated-form Anthropic model ids (e.g. `claude-sonnet-4-20250514`) to `pricing/models.toml` is a broader concern — real users running real LLM calls will hit the same missing-pricing fallback and see surprisingly low costs. Worth filing as its own issue: either a fixture of common dated aliases, or normalization in `get_rates()` to strip the date suffix.

Test plan

  • `pytest tests/unit/ tests/synthetic/ tests/agents/ tests/integration/` → 416 passed
  • `ruff check tokenjam/` → clean
  • `mypy tokenjam/` → "Success: no issues found in 78 source files"
  • Live smoke test: `tj serve` starts cleanly under the new lifespan, writes `server.state` after port-bind, shuts down cleanly under `tj stop`
  • `tj doctor` against a real DB: drift-in-progress reported as `i Drift detection: Collecting baseline: claude-code-tokenjam (0/10)` (info, not warning)
  • Live re-run of the budget breach demo: both `cost_budget_session` and `cost_budget_daily` alerts fire

🤖 Generated with Claude Code

Three fixes, one per sub-item.

1. FastAPI lifespan migration (Greptile P2 from #47)

Convert the two `@app.on_event("startup" | "shutdown")` handlers in
`cmd_serve.py` to a single `@asynccontextmanager`-decorated lifespan and
pass it to `create_app(..., lifespan=...)`. The legacy `on_event` API is
deprecated in newer FastAPI releases. `create_app()` grows an optional
`lifespan` parameter (default `None`) so existing test callers keep working
without changes.

As a side benefit, this also delays `scheduler.start()` until after uvicorn
binds the port — matching the existing reasoning for the `server.state`
write. A failed bind no longer spins up an orphan retention scheduler.

2. Tune doctor's drift baseline check

`DriftConfig.enabled` defaults to True, so every new agent (0–9 completed
sessions) was tripping a `warning` on every `tj doctor` run — pure noise,
since collection-in-progress is the expected state. Downgraded to `info`
and reworded as "Collecting baseline: agent (N/M)". Threshold-reached
agents are silent ("ok"); agents with drift explicitly disabled are
ignored. Multiple in-progress agents are reported in one line instead of
returning at the first match.

Also tightened the two "skipped" messages in the spans-stats check so they
say "running through the HTTP API fallback" instead of the misleading
"non-DuckDB backend" — the backend IS DuckDB, the CLI session just doesn't
have direct access because `tj serve` holds the write lock.

3. cost_budget_daily now fires in the budget breach demo

Pricing reference mismatch, not a bug in the alert engine. The demo used
model id `claude-sonnet-4-20250514` which isn't in `pricing/models.toml`
(only the short-form ids `claude-{opus,sonnet,haiku}-4-X` are listed). The
cost engine fell back to default rates ($0.50/$2.00 per MTok), making the
10-call total $0.0315 — below the $0.05 daily budget, so daily never
breached. Switching the demo to `claude-sonnet-4-6` brings total spend to
$0.2205 with real Sonnet rates, and both `cost_budget_session` and
`cost_budget_daily` fire as the demo's comments describe.

(Adding dated-form model ids like `claude-sonnet-4-20250514` to
`pricing/models.toml` is a broader concern — real users running real LLM
calls will see them too — but that's a separate change.)

Verified: 416 tests pass, ruff + mypy clean, `tj serve` starts and shuts
down cleanly with the new lifespan, `tj doctor` reports drift-in-progress
as info, demo fires both budget alerts on re-run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anilmurty anilmurty merged commit e1b89e4 into main May 12, 2026
4 checks passed
@anilmurty anilmurty deleted the fix/issue-48-deferred-items branch May 12, 2026 23:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Issues Deferred from 0.1.9

1 participant