[token-consumption] Daily Token Consumption Report - 2026-05-21 #33751

### Executive Summary

In the last 24 hours (2026-05-20 12:46 UTC → 2026-05-21 12:46 UTC), agentic workflows in `github/gh-aw` consumed approximately **213.4M total tokens** across **~310 unique workflow runs** spanning **~100 unique workflows**. The three per-PR review pipelines (`Test Quality Sentinel`, `Matt Pocock Skills Reviewer`, `PR Code Quality Reviewer`) are the largest consumers in aggregate because they run frequently: together they account for ~71.7M tokens, about a third of the total. Several single-run daily reports (`Daily Firewall Logs Collector and Reporter`, `Daily Community Attribution Updater`, `Package Specification Librarian`, `Linter Miner`) consume 3–6M tokens *per run* and are the most expensive per invocation. The companion `errors` and `logs` datasets returned no events, so the telemetry pipeline appears healthy.

### Key Metrics

| Metric | Value |
|---|---|
| Events analyzed (gen_ai spans w/ token data) | ~1,107 |
| Events with token data | ~1,107 (100%) |
| Total input tokens | ~211M |
| Total output tokens | ~2.4M |
| Total tokens | ~213.4M |
| Unique workflows | ~100 |
| Unique workflow runs | ~310 |
| Avg tokens / run | ~688K |
| P95 tokens / run (est.) | ~3.5M |

*Note*: each workflow run emits 2–5 (typically 4) sibling `gen_ai` spans that report the same token totals. To avoid double-counting, the aggregate sums above were divided by the per-workflow `count() / count_unique(gh-aw.run.id)` ratio. The raw aggregate `sum(gen_ai.usage.total_tokens)` across all spans was ~854M (= 213.4M × a duplication factor of ~4).
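
The correction described in the note can be sketched as follows. This is a minimal illustration, not the query the report actually ran: the span dictionaries and their keys (`workflow`, `run_id`, `total_tokens`) are hypothetical stand-ins for the Sentry span attributes, and the sample data is invented.

```python
from collections import defaultdict


def dedup_totals(spans):
    """Estimate per-workflow token totals from duplicated sibling spans.

    Divides each workflow's raw token sum by its
    count() / count_unique(run_id) ratio, as the note describes.
    """
    raw_sum = defaultdict(int)
    span_count = defaultdict(int)
    run_ids = defaultdict(set)
    for span in spans:
        wf = span["workflow"]
        raw_sum[wf] += span["total_tokens"]
        span_count[wf] += 1
        run_ids[wf].add(span["run_id"])
    return {
        wf: raw_sum[wf] / (span_count[wf] / len(run_ids[wf]))
        for wf in raw_sum
    }


# Invented data: two runs of one workflow, each reported by 4 sibling spans.
spans = (
    [{"workflow": "demo", "run_id": 1, "total_tokens": 1000}] * 4
    + [{"workflow": "demo", "run_id": 2, "total_tokens": 3000}] * 4
)
totals = dedup_totals(spans)
print(totals)
```

The ratio-based estimate is exact only when every run of a workflow emits the same number of sibling spans. When the sibling count varies between runs, runs with more siblings are over-weighted in the raw sum and the estimate drifts.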

### Top 10 Workflows by Token Consumption

| Workflow | Runs | Input Tokens | Output Tokens | Total Tokens | Avg/Run |
|---|---:|---:|---:|---:|---:|
| Test Quality Sentinel | 39 | 28,351,864 | 288,579 | **28,640,443** | 734,370 |
| Matt Pocock Skills Reviewer | 31 | 21,382,720 | 162,941 | **21,545,661** | 695,021 |
| PR Code Quality Reviewer | 40 | 21,343,753 | 158,485 | **21,502,238** | 537,556 |
| PR Sous Chef | 22 | ~18.2M | ~497K | **18,701,464** | 850,066 |
| Chaos PR Bundle Fuzzer | 7 | 10,115,455 | 65,192 | **10,180,647** | 1,454,378 |
| Contribution Check | 5 | 9,969,561 | 79,687 | **10,049,248** | 2,009,850 |
| Daily Firewall Logs Collector and Reporter | 1 | 5,835,404 | 39,262 | **5,874,666** | 5,874,666 |
| Daily Community Attribution Updater | 1 | 4,221,504 | 21,054 | **4,242,558** | 4,242,558 |
| Package Specification Librarian | 1 | 3,478,460 | 51,904 | **3,530,364** | 3,530,364 |
| Linter Miner | 1 | 3,118,624 | 31,975 | **3,150,599** | 3,150,599 |
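
As a quick consistency check, the Total Tokens and Avg/Run columns can be recomputed from the Runs, Input Tokens, and Output Tokens columns. The sketch below copies a few rows from the table by hand. `PR Sous Chef` is left out because its input and output figures are only approximate, and the function name `summarize` is purely illustrative.

```python
# Rows copied from the table: (workflow, runs, input_tokens, output_tokens).
rows = [
    ("Test Quality Sentinel", 39, 28_351_864, 288_579),
    ("Matt Pocock Skills Reviewer", 31, 21_382_720, 162_941),
    ("PR Code Quality Reviewer", 40, 21_343_753, 158_485),
    ("Chaos PR Bundle Fuzzer", 7, 10_115_455, 65_192),
    ("Contribution Check", 5, 9_969_561, 79_687),
]


def summarize(rows):
    """Return {workflow: (total_tokens, avg_tokens_per_run)}, avg rounded."""
    return {
        name: (inp + out, round((inp + out) / runs))
        for name, runs, inp, out in rows
    }


summary = summarize(rows)
reviewers = [
    "Test Quality Sentinel",
    "Matt Pocock Skills Reviewer",
    "PR Code Quality Reviewer",
]
reviewer_total = sum(summary[name][0] for name in reviewers)

for name, (total, avg) in summary.items():
    print(f"{name}: total={total:,} avg/run={avg:,}")
print(f"PR review pipelines combined: {reviewer_total:,}")
```

The combined figure for the three review pipelines is the basis for the ~71.7M number quoted in the recommendations.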

<details>
<summary>Highest-token single runs</summary>

| Workflow | Run ID | Total Tokens |
|---|---|---:|
| Daily Firewall Logs Collector and Reporter | [26203472250](https://github.com/github/gh-aw/actions/runs/26203472250) | 5,874,666 |
| PR Sous Chef (highest single run) | [26222262708](https://github.com/github/gh-aw/actions/runs/26222262708) | 4,319,402 |
| Daily Community Attribution Updater | [26203736817](https://github.com/github/gh-aw/actions/runs/26203736817) | 4,242,558 |
| PR Sous Chef (second-highest) | [26218530960](https://github.com/github/gh-aw/actions/runs/26218530960) | 3,926,768 |
| Package Specification Librarian | [26167560886](https://github.com/github/gh-aw/actions/runs/26167560886) | 3,530,364 |
| Linter Miner | [26180896986](https://github.com/github/gh-aw/actions/runs/26180896986) | 3,150,599 |
| Daily SPDD Spec Planner | [26176947757](https://github.com/github/gh-aw/actions/runs/26176947757) | 3,070,226 |
| Daily Security Observability Report | [26176856527](https://github.com/github/gh-aw/actions/runs/26176856527) | 2,772,393 |
| Contribution Check (highest single run) | [26166594512](https://github.com/github/gh-aw/actions/runs/26166594512) | 2,443,973 |
| Chaos PR Bundle Fuzzer (highest single run) | [26218516360](https://github.com/github/gh-aw/actions/runs/26218516360) | 2,414,476 |
| UK AI Operational Resilience | [26175963806](https://github.com/github/gh-aw/actions/runs/26175963806) | 2,287,188 |
| Daily MCP Tool Concurrency Analysis | [26220221074](https://github.com/github/gh-aw/actions/runs/26220221074) | 2,212,930 |
| Daily Agent of the Day Blog Writer | [26172025729](https://github.com/github/gh-aw/actions/runs/26172025729) | 2,202,516 |
| Test Quality Sentinel (highest single run) | [26203507094](https://github.com/github/gh-aw/actions/runs/26203507094) | 2,106,996 |
| Daily Regulatory Report Generator | [26191963827](https://github.com/github/gh-aw/actions/runs/26191963827) | 2,099,186 |

</details>

<details>
<summary>Token output ratios — workflows with unusually high output share</summary>

Most workflows are input-heavy (output is under 5% of input). The workflows below are the exceptions: output exceeds input, which suggests generative, long-form behavior. Output Share is output / (input + output).

| Workflow | Input | Output | Output Share |
|---|---:|---:|---:|
| Daily Code Metrics and Trend Tracking Agent | 17,583 | 32,735 | 65.1% |
| Daily Reliability Review | ~7,047 | ~20,674 | 74.6% |
| Agentic Workflow Audit Agent | 10,068 | 34,819 | 77.6% |
| Sergo - Serena Go Expert | 19,646 | 35,409 | 64.3% |
| Daily Sub-Agent Optimizer | 18,122 | 36,804 | 67.0% |
| [aw] Failure Investigator (6h) | ~7,200 | ~20,000–32,000 | ~70–82% |
| Copilot Session Insights | 21,604 | 27,081 | 55.6% |
| GitHub API Consumption Report Agent | 14,700 | 27,649 | 65.3% |
| DeepReport - Intelligence Gathering Agent | 13,810 | 21,843 | 61.3% |

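These small-input/large-output workflows likely run in generation-heavy mode (analysis reports, documentation rewrites), possibly on Claude/Anthropic models. The model is unconfirmed, since `gen_ai.request.model` is populated on only ~25% of spans.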
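
The Output Share column appears to be output tokens divided by the sum of input and output tokens, which reproduces the exact rows in the table. A minimal sketch of that calculation, with `output_share` as an illustrative helper name:

```python
def output_share(input_tokens: int, output_tokens: int) -> float:
    """Output tokens as a percentage of input + output tokens."""
    total = input_tokens + output_tokens
    if total == 0:
        return 0.0  # avoid division by zero for spans without token data
    return 100.0 * output_tokens / total


# Agentic Workflow Audit Agent (output-heavy row from the table above)
print(f"{output_share(10_068, 34_819):.1f}%")
# Test Quality Sentinel (input-heavy, from the top-10 table)
print(f"{output_share(28_351_864, 288_579):.1f}%")
```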

</details>

<details>
<summary>Data Quality and Gaps</summary>

- **Span duplication**: each workflow run emits 2–5 sibling `gen_ai` spans that report the same token totals. The aggregation divided the raw `sum()` by the per-workflow `count()/count_unique(gh-aw.run.id)` ratio (typically 4) to avoid inflation. This is the dominant source of estimation error.
- **Pagination coverage**: 700+ raw spans were paginated explicitly; the aggregate `sum()`/`count()` query covered the full 24h window for the top 100 workflows. An estimated 5–10 small-volume workflows fall below the 100-result truncation and are excluded from the grand total; their combined contribution is estimated at <500K tokens.
- **Companion datasets**: `errors` and `logs` datasets returned **zero events** in 24h for the `gh-aw` project. No telemetry-emit failures or runtime errors observed.
- **Workflow name attribution**: 100% of token-bearing spans had `gh-aw.workflow.name` populated. Two appearances of `[Filtered]` (2 unique runs, 701,298 + 141,418 = 842,716 tokens) reflect Sentry's PII scrubbing of the workflow name attribute, not actual workflow naming. Worth confirming why those specific workflow names trigger PII filters in `actions/setup/js/send_otlp_span.cjs`.
- **Missing model attribution**: `gen_ai.request.model` was populated on ~25% of spans (typically only the leaf span carries it). Most reported `claude-sonnet-4.5`, `gpt-5-mini`, `claude-haiku-4.5`, or `auto`. Spotted anomaly: one span tagged `gpt-5.4` and one `gpt-5.4-mini` — possibly typos or experimental model strings worth verifying.
- **`gh-aw.workflow.name` vs `github.workflow`**: The OTLP emit code populates the gh-aw-prefixed attribute, but the standard `github.workflow` attribute was empty across all spans inspected. Worth aligning the two for consumers that expect the conventional OTel/GitHub attribute name.

</details>

### Recommendations

1. **Investigate PR Sous Chef cost variance**: observed runs range from 178K to 4.32M tokens (a ~24x spread). Runs `26222262708` (4.32M) and `26218530960` (3.93M) used `gpt-5-mini` yet consumed token volumes more typical of Claude Sonnet runs. Add a per-run input-context audit and consider prompt caching or truncation if the high-token runs replay large PR diffs.
2. **Cap context for single-run "daily" workflows above 3M tokens**: `Daily Firewall Logs Collector and Reporter` (5.87M), `Daily Community Attribution Updater` (4.24M), `Package Specification Librarian` (3.53M), `Linter Miner` (3.15M), and `Daily SPDD Spec Planner` (3.07M) each consume 3–6M tokens in a single run. Audit their MCP queries and Sentry query windows for over-fetching, and consider pagination or summarization passes before sending data to the model.
3. **De-duplicate sibling `gen_ai` spans at emit time**: the ~4x span duplication inflates all OTel sums and confuses cost dashboards. Either emit token usage only on the leaf span, or add a `gen_ai.span.role` attribute (`parent`/`leaf`) so consumers can filter cleanly. Reference: `actions/setup/js/send_otlp_span.cjs`.
4. **Right-size the high-frequency PR review fleet**: `Test Quality Sentinel`, `Matt Pocock Skills Reviewer`, and `PR Code Quality Reviewer` ran 39, 31, and 40 times respectively in 24h (110 runs), consuming ~71.7M tokens combined. Consider merging overlapping reviewer agents, tightening triggers (for example, skipping draft/WIP PRs), or sharing a cached repo-context blob across the three review pipelines.
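
To illustrate how a consumer might use the `gen_ai.span.role` attribute proposed in recommendation 3, here is a rough sketch. The attribute does not exist yet, the span dictionaries are a hypothetical shape rather than Sentry's actual export format, and the fallback (count the first span seen for a run when no leaf is tagged) relies on the observation that sibling spans report the same totals.

```python
def sum_tokens_once_per_run(spans):
    """Sum total tokens, counting each workflow run exactly once.

    Prefers the span tagged gen_ai.span.role == "leaf" (a proposed,
    not-yet-existing attribute). Runs without a leaf-tagged span fall
    back to the first span seen for that run.
    """
    per_run = {}
    for span in spans:
        run_id = span["gh-aw.run.id"]
        role = span.get("gen_ai.span.role")
        tokens = span["gen_ai.usage.total_tokens"]
        if role == "leaf" or run_id not in per_run:
            per_run[run_id] = tokens
    return sum(per_run.values())


# Invented data: run 1 carries the proposed role tag, run 2 does not.
spans = [
    {"gh-aw.run.id": 1, "gen_ai.span.role": "parent", "gen_ai.usage.total_tokens": 500},
    {"gh-aw.run.id": 1, "gen_ai.span.role": "leaf", "gen_ai.usage.total_tokens": 500},
    {"gh-aw.run.id": 2, "gen_ai.usage.total_tokens": 200},
    {"gh-aw.run.id": 2, "gen_ai.usage.total_tokens": 200},
]
print(sum_tokens_once_per_run(spans))
```

Counting once per run avoids the ratio-based correction entirely, at the cost of assuming a run has a single authoritative token total.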

### References

- Sentry trace explorer (filtered query, 24h): https://github.sentry.io/explore/traces/?query=span.op:gen_ai*+has:gen_ai.usage.total_tokens&project=4511347087179777&statsPeriod=24h
- Sentry aggregate by workflow: https://github.sentry.io/explore/traces/?query=span.op:gen_ai*+has:gen_ai.usage.total_tokens&project=4511347087179777&aggregateField=%7B%22groupBy%22:%22gh-aw.workflow.name%22%7D&mode=aggregate&statsPeriod=24h
- This report's workflow run: [§26226782084](https://github.com/github/gh-aw/actions/runs/26226782084)

> Generated by [📊 Daily Token Consumption Report (Sentry OTel)](https://github.com/github/gh-aw/actions/runs/26226782084) · ● 36.2M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-token-consumption-report%22&type=issues)
> - [x] expires on May 22, 2026, 12:58 PM UTC
