Address deferred items from issue #48#59
Merged
Conversation
Three fixes, one per sub-item. 1. FastAPI lifespan migration (Greptile P2 from #47) Convert the two `@app.on_event("startup" | "shutdown")` handlers in `cmd_serve.py` to a single `@asynccontextmanager`-decorated lifespan and pass it to `create_app(..., lifespan=...)`. The legacy `on_event` API is deprecated in newer FastAPI releases. `create_app()` grows an optional `lifespan` parameter (default `None`) so existing test callers keep working without changes. As a side benefit, this also delays `scheduler.start()` until after uvicorn binds the port — matching the existing reasoning for the `server.state` write. A failed bind no longer spins up an orphan retention scheduler. 2. Tune doctor's drift baseline check `DriftConfig.enabled` defaults to True, so every new agent (0–9 completed sessions) was tripping a `warning` on every `tj doctor` run — pure noise, since collection-in-progress is the expected state. Downgraded to `info` and reworded as "Collecting baseline: agent (N/M)". Threshold-reached agents are silent ("ok"); agents with drift explicitly disabled are ignored. Multiple in-progress agents are reported in one line instead of returning at the first match. Also tightened the two "skipped" messages in the spans-stats check so they say "running through the HTTP API fallback" instead of the misleading "non-DuckDB backend" — the backend IS DuckDB, the CLI session just doesn't have direct access because `tj serve` holds the write lock. 3. cost_budget_daily now fires in the budget breach demo Pricing reference mismatch, not a bug in the alert engine. The demo used model id `claude-sonnet-4-20250514` which isn't in `pricing/models.toml` (only the short-form ids `claude-{opus,sonnet,haiku}-4-X` are listed). The cost engine fell back to default rates ($0.50/$2.00 per MTok), making the 10-call total $0.0315 — below the $0.05 daily budget, so daily never breached. Switching the demo to `claude-sonnet-4-6` brings total spend to $0.2205 with real Sonnet rates, and both `cost_budget_session` and `cost_budget_daily` fire as the demo's comments describe. (Adding dated-form model ids like `claude-sonnet-4-20250514` to `pricing/models.toml` is a broader concern — real users running real LLM calls will see them too — but that's a separate change.) Verified: 416 tests pass, ruff + mypy clean, `tj serve` starts and shuts down cleanly with the new lifespan, `tj doctor` reports drift-in-progress as info, demo fires both budget alerts on re-run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #48.
Three fixes, one per sub-item in the issue.
1. FastAPI lifespan migration (Greptile P2 from #47)
`tokenjam/cli/cmd_serve.py` had two `@app.on_event("startup" | "shutdown")` handlers — the legacy FastAPI API that's deprecated in newer releases. Converted to a single `@asynccontextmanager`-decorated lifespan and threaded it through `create_app(..., lifespan=...)`. `create_app()` grows an optional `lifespan` parameter (default `None`) so existing test callers keep working without changes.
Side benefit: `scheduler.start()` now runs inside the lifespan too — same reasoning as the existing comment about `server.state`: a failed-to-bind `tj serve` no longer spins up an orphan retention scheduler before the port-bind failure.
2. Doctor's drift-baseline check no longer warns on every new agent
`DriftConfig.enabled` defaults to `True`, so every brand-new agent (0–9 completed sessions) was tripping a `warning` on every `tj doctor` run. Collection-in-progress is the expected state, so this was just noise.
Changes in `_check_drift_inactive`:
Also tightened the spans-stats check's "skipped" messages — they now say "running through the HTTP API fallback" instead of the misleading "non-DuckDB backend". The backend IS DuckDB; the CLI session just doesn't have direct access while `tj serve` holds the write lock.
3. `cost_budget_daily` now fires in the budget breach demo
Diagnosed as a pricing-reference mismatch, not an alert-engine bug. The demo used model id `claude-sonnet-4-20250514` which isn't in `pricing/models.toml` (the table only carries the short-form ids `claude-{opus,sonnet,haiku}-4-X`). The cost engine fell back to default rates (`$0.50`/`$2.00` per MTok), making the 10-call total $0.0315 — below the $0.05 daily budget, so daily never breached. The session alert still fired because session budget is $0.02.
Switched the demo to `claude-sonnet-4-6` (which is in the table at $3/$15 per MTok), bringing total spend to $0.2205. Both alerts now fire as the demo's comments describe:
```
17:38:11 ⚠ CRITICAL cost_budget_session budget-demo Session cost $0.2205 exceeds budget $0.0200
17:38:11 ⚠ CRITICAL cost_budget_daily budget-demo Daily cost $0.2205 exceeds budget $0.0500
```
Not in this PR (worth tracking separately)
Adding dated-form Anthropic model ids (e.g. `claude-sonnet-4-20250514`) to `pricing/models.toml` is a broader concern — real users running real LLM calls will hit the same missing-pricing fallback and see surprisingly low costs. Worth filing as its own issue: either a fixture of common dated aliases, or normalization in `get_rates()` to strip the date suffix.
Test plan
🤖 Generated with Claude Code