feat(llm): implement token usage tracking and cost monitoring (#361) #416

Open
Francis6-git wants to merge 1 commit into Traqora:main from Francis6-git:feature/llm-token-usage-cost-monitoring-issue-361

@Francis6-git

Description

Implements process-wide token usage tracking, latency monitoring, and cost estimation for LLM calls, as requested in #361. The tracker is provider-agnostic: any integrated model or service wrapper reports a call by passing a standardized LLMUsage payload to the central tracker.
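
To make the idea of a standardized payload concrete, here is a minimal sketch of what a provider-agnostic usage record could look like. It is not the PR's actual LLMUsage class: the field names (provider, model, prompt_tokens, completion_tokens, latency_seconds, cost_usd) and the class name UsagePayloadSketch are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsagePayloadSketch:
    """Hypothetical stand-in for an LLMUsage payload; all field names are assumed."""

    provider: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_seconds: float
    # None means the provider did not report a cost for this call.
    cost_usd: Optional[float] = None

    @property
    def total_tokens(self) -> int:
        # Prompt and completion tokens are kept separate so they can be
        # priced differently, but a combined count is often convenient.
        return self.prompt_tokens + self.completion_tokens


# Any provider wrapper would build the same record, whatever its native
# response format looks like.
usage = UsagePayloadSketch(
    provider="example-provider",
    model="model-a",
    prompt_tokens=120,
    completion_tokens=30,
    latency_seconds=0.8,
)
```

Keeping the record immutable (frozen=True) means an event cannot be altered after it has been handed to a shared tracker, which is one plausible way to keep a process-wide log trustworthy.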

Key Changes

  • Core Tracker (astroml/tracking/llm_usage_tracker.py): Created the thread-safe LLMUsageTracker, backed by a collections.deque ring buffer that retains the most recent 5,000 events in insertion order. Includes optional fallback cost estimation from a static LLMPrices mapping.
  • Metrics Exposure: Added Prometheus counters and histograms (astroml_llm_calls_total, astroml_llm_tokens_total, astroml_llm_latency_seconds, astroml_llm_cost_usd_total), guarded by a conditional import so the module still loads when prometheus_client is not installed.
  • API Endpoints (api/routers/llm_usage.py): Exposed the FastAPI endpoints /api/v1/llm/usage/recent and /api/v1/llm/usage/summary, which serve lightweight rolling analytics from the in-memory buffer.
  • Monitoring Infrastructure:
    • Added a Grafana dashboard (api_llm_cost_dashboard.json) covering real-time throughput, token delta splits, latencies, and cumulative cost.
    • Added a baseline Prometheus alert rule group (alert_rules_llm_cost.yml) that raises warnings when cost budgets are exceeded.
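
The core tracker described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in this PR: the names UsageEvent, SketchUsageTracker, PLACEHOLDER_PRICES, and estimate_cost are hypothetical, and the prices are invented placeholders rather than real provider pricing. It shows a bounded collections.deque guarded by a lock, with a static price table used only when an event carries no cost.

```python
import threading
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional, Tuple

# Placeholder prices in USD per 1,000 tokens: (prompt, completion).
# These numbers are invented for illustration, not real provider pricing.
PLACEHOLDER_PRICES: Dict[str, Tuple[float, float]] = {"model-a": (0.001, 0.002)}


@dataclass(frozen=True)
class UsageEvent:
    """Hypothetical usage event; field names are assumed."""

    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_seconds: float
    cost_usd: Optional[float] = None


def estimate_cost(event: UsageEvent) -> float:
    """Use the reported cost if present, else fall back to the static table."""
    if event.cost_usd is not None:
        return event.cost_usd
    prompt_price, completion_price = PLACEHOLDER_PRICES.get(event.model, (0.0, 0.0))
    return (
        event.prompt_tokens * prompt_price
        + event.completion_tokens * completion_price
    ) / 1000.0


class SketchUsageTracker:
    """Keeps only the newest max_events events, in insertion order."""

    def __init__(self, max_events: int = 5000) -> None:
        # deque(maxlen=...) silently drops the oldest item once it is full.
        self._events: Deque[UsageEvent] = deque(maxlen=max_events)
        self._lock = threading.Lock()

    def record(self, event: UsageEvent) -> None:
        with self._lock:
            self._events.append(event)

    def recent(self, limit: int = 50) -> List[UsageEvent]:
        # Copy under the lock, then slice outside it to keep the lock short.
        with self._lock:
            snapshot = list(self._events)
        return snapshot[-limit:] if limit > 0 else []

    def summary(self) -> Dict[str, float]:
        with self._lock:
            snapshot = list(self._events)
        return {
            "calls": len(snapshot),
            "tokens": sum(e.prompt_tokens + e.completion_tokens for e in snapshot),
            "cost_usd": sum(estimate_cost(e) for e in snapshot),
        }
```

Taking a snapshot under the lock before aggregating means readers such as a summary endpoint never iterate the deque while another thread appends to it, which would otherwise raise a RuntimeError in CPython.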

Closes #361

drips-wave Bot commented Jun 26, 2026

@Francis6-git Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

Development

Successfully merging this pull request may close these issues.

[LLM] Implement token usage tracking and cost monitoring
