[agent-efficiency] Agent runs fail after pick-three sub-agent fanout

## Run summary

Lookback window: 2026-06-16T17:38:18Z to 2026-06-23T17:38:18Z. Counts cover runs of agentic workflows whose path contains `trigger-` or `gh-aw-`. Pass and fail rates are calculated over `success + failure` runs only; startup failures and all other statuses (`action_required`, `skipped`, `cancelled`, `in_progress`) are shown separately.

| Repository | Total runs | Success | Failure | Startup failure | Other statuses | Pass rate | Fail rate |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| `elastic/ai-github-actions` | 1,115 | 84 | 109 | 36 | 886 | 43.5% | 56.5% |

Other statuses for this repository were `action_required` 777, `skipped` 106, `cancelled` 2, and `in_progress` 1. No downstream repository rows are included because downstream discovery failed in this run: GitHub code search (`search_code`) returned 429 and the unauthenticated public REST API returned 401, so no downstream repositories could be reliably discovered.
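
The rates in the table can be reproduced from the raw counts. The following is a minimal sketch, assuming rounding to one decimal place; the function name `pass_fail_rates` is illustrative and not part of any workflow tooling.

```python
def pass_fail_rates(success: int, failure: int) -> tuple[float, float]:
    """Return (pass %, fail %) computed over success + failure runs only."""
    terminal = success + failure
    if terminal == 0:
        return (0.0, 0.0)
    return (
        round(100 * success / terminal, 1),
        round(100 * failure / terminal, 1),
    )


# Counts copied from the elastic/ai-github-actions row above.
success, failure, startup_failure = 84, 109, 36
other = {"action_required": 777, "skipped": 106, "cancelled": 2, "in_progress": 1}

pass_rate, fail_rate = pass_fail_rates(success, failure)
total = success + failure + startup_failure + sum(other.values())
print(pass_rate, fail_rate, total)  # prints: 43.5 56.5 1115
```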

## Findings

### 1. Pick-three workflows fan out to sub-agents and then fail on provider auth or AI credit limits

Affected runs: 4 of the 9 downloaded failed-run logs in which the agent process started: `trigger-docs-patrol`, `trigger-text-auditor`, `trigger-framework-best-practices`, and `trigger-bug-hunter`.

**Evidence:**

- Docs Patrol run https://github.com/elastic/ai-github-actions/actions/runs/28035785847 launched three `General-purpose(gpt-5.3-codex)` sub-agents at `/tmp/gh-aw/logs/28035785847/4_run _ agent.txt:1684-1691`, then each read returned `Authentication failed with provider at (172.30.0.30/redacted) (HTTP 403)` at lines 1699, 1705, and 1711. The run ended with `Tokens     ↑ 1.8m (1.5m cached) • ↓ 21.5k` at line 1717 and `failureClass=authentication_failed` at line 1720.
- Text Auditor run https://github.com/elastic/ai-github-actions/actions/runs/28031348044 launched three `General-purpose(gpt-5.3-codex)` sub-agents at `/tmp/gh-aw/logs/28031348044/4_run _ agent.txt:1595-1602`, then logged provider 403s at lines 1613, 1619, 1625, and 1631. The run ended with `Tokens     ↑ 1.8m (1.3m cached) • ↓ 24.5k` at line 1637 and `failureClass=authentication_failed` at line 1640.
- Framework Best Practices run https://github.com/elastic/ai-github-actions/actions/runs/28031380597 logged repeated provider 403s at `/tmp/gh-aw/logs/28031380597/4_run _ agent.txt:1626-1650` and again at lines 1669, 1672, and 1682. It ended with `Tokens     ↑ 1.9m (1.0m cached) • ↓ 26.3k` at line 1688 and `failureClass=authentication_failed` at line 1691.
- Bug Hunter run https://github.com/elastic/ai-github-actions/actions/runs/28024841420 launched three `General-purpose(gpt-5.3-codex)` sub-agents at `/tmp/gh-aw/logs/28024841420/4_run _ agent.txt:1754-1761`, then hit `CAPIError: 429 Maximum AI credits exceeded (1057.625500 / 1000)` at lines 1775 and 1777. The run ended with `Tokens     ↑ 2.0m (1.4m cached) • ↓ 20.9k` at line 1782 and `AI credits budget exceeded — not retrying` at line 1786.
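
The two failure signatures in the evidence above can be told apart mechanically when triaging more logs. The following is a rough sketch, assuming the log lines keep the exact wording quoted above; the `credits_exceeded` label and the `classify_failures` helper are hypothetical names chosen for illustration, not identifiers from the logs or from gh-aw.

```python
import re

# Patterns copied from the quoted log excerpts; they assume the wording is stable.
AUTH_RE = re.compile(r"Authentication failed with provider .*\(HTTP 403\)")
CREDITS_RE = re.compile(
    r"429 Maximum AI credits exceeded \((?P<used>[\d.]+) / (?P<limit>[\d.]+)\)"
)


def classify_failures(lines):
    """Count provider 403s and credit-limit 429s; report the last credit overage."""
    counts = {"authentication_failed": 0, "credits_exceeded": 0}
    overage = None
    for line in lines:
        if AUTH_RE.search(line):
            counts["authentication_failed"] += 1
        match = CREDITS_RE.search(line)
        if match:
            counts["credits_exceeded"] += 1
            # Credits used beyond the limit, e.g. 1057.6255 against 1000.
            overage = round(float(match["used"]) - float(match["limit"]), 4)
    return counts, overage


sample = [
    "Authentication failed with provider at (172.30.0.30/redacted) (HTTP 403)",
    "CAPIError: 429 Maximum AI credits exceeded (1057.625500 / 1000)",
]
counts, overage = classify_failures(sample)
print(counts, overage)
```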

**Root cause:** The shared prompt fragments require this fanout without a budget or model-availability guard. `.github/workflows/gh-aw-fragments/pick-three-keep-many.md:3-15` says to spawn multiple independent sub-agents, make each prompt fully self-contained, and wait for all sub-agents. `.github/workflows/gh-aw-fragments/pick-three-keep-one.md:3-15` similarly requires exactly three sub-agents and explicitly says a 10,000-token prompt is preferable to a short one. These fragments are included by the affected workflow sources at `.github/workflows/gh-aw-docs-patrol.md:15`, `.github/workflows/gh-aw-text-auditor.md:14`, `.github/workflows/gh-aw-framework-best-practices.md:14`, and `.github/workflows/gh-aw-bug-hunter.md:15`.
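
One possible shape for the missing guard is a pre-fanout check that caps the number of sub-agents by remaining credits and skips fanout when the sub-agent model is unavailable. The sketch below only illustrates that decision logic. `FanoutBudget`, `allowed_subagents`, and the per-agent cost estimate are all assumptions; the fragments are prompt text, so an actual guard would probably live in the prompt or in a workflow pre-step.

```python
from dataclasses import dataclass


@dataclass
class FanoutBudget:
    credit_limit: float           # e.g. the 1000-credit cap seen in the Bug Hunter log
    credits_used: float           # credits consumed before fanout
    est_credits_per_agent: float  # assumption: the caller supplies an estimate


def allowed_subagents(budget: FanoutBudget, requested: int = 3,
                      model_available: bool = True) -> int:
    """Return how many sub-agents may be launched (0 means skip fanout)."""
    if budget.est_credits_per_agent <= 0:
        raise ValueError("est_credits_per_agent must be positive")
    if not model_available:
        return 0
    remaining = budget.credit_limit - budget.credits_used
    affordable = int(remaining // budget.est_credits_per_agent)
    return max(0, min(requested, affordable))


# Hypothetical numbers: 100 credits left at an estimated 40 per agent allows two agents.
print(allowed_subagents(FanoutBudget(1000, 900, 40)))
# Already over the cap, as in the Bug Hunter run: no fanout.
print(allowed_subagents(FanoutBudget(1000, 1057.6255, 40)))
```

Returning a count instead of a yes/no answer lets a workflow degrade to fewer sub-agents, or to a single-agent pass, rather than fail outright.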

### 2. Broad GitHub MCP issue searches overflowed in failed runs

Affected runs: 2 of the 9 downloaded failed-run logs in which the agent process started: `trigger-framework-best-practices` and `trigger-bug-hunter`.

**Evidence:**

- Framework Best Practices run https://github.com/elastic/ai-github-actions/actions/runs/28031380597 called `search_issues (MCP: github)` with an open-issue title query at `/tmp/gh-aw/logs/28031380597/4_run _ agent.txt:1594`, and the tool response was `Output too large to read at once (57.8 KB). Saved to: /tmp/1782222986168...` at line 1595.
- Bug Hunter run https://github.com/elastic/ai-github-actions/actions/runs/28024841420 called `search_issues (MCP: github) · bug hunter OR bug-hunter` at `/tmp/gh-aw/logs/28024841420/4_run _ agent.txt:1730`, and the tool response was `Output too large to read at once (59.7 KB). Saved to: /tmp/1782216509451...` at line 1731.

**Root cause:** The agent used broad MCP issue searches for duplicate checks in workflows whose prompts also require sub-agent fanout. The repository prompt includes MCP pagination guidance, but the observed queries did not constrain result size: each returned a response of roughly 58 to 60 KB that was dumped to a file instead of being read inline, before the later sub-agent and provider failures.
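
A duplicate check usually needs only a few fields per issue, so result size can be bounded by paging in small batches and stopping at a byte budget. The following is a minimal sketch of that idea, not the MCP server's API: `fetch_page` is a hypothetical callable standing in for whatever performs the search, and the default limits are arbitrary.

```python
import json
from typing import Callable


def collect_within_budget(
    fetch_page: Callable[[int, int], list[dict]],
    per_page: int = 10,
    max_bytes: int = 20_000,
    max_pages: int = 5,
) -> tuple[list[dict], bool]:
    """Collect slimmed issue records; return (records, truncated)."""
    kept: list[dict] = []
    used = 0
    for page in range(1, max_pages + 1):
        items = fetch_page(page, per_page)
        if not items:
            return kept, False
        for item in items:
            # Keep only the fields a duplicate check needs.
            slim = {key: item.get(key) for key in ("number", "title", "state")}
            size = len(json.dumps(slim).encode("utf-8"))
            if used + size > max_bytes:
                return kept, True
            kept.append(slim)
            used += size
        if len(items) < per_page:
            return kept, False
    # Page limit reached while pages were still full: more results may exist.
    return kept, True


# Stand-in data source with 25 fake issues, for illustration only.
FAKE = [{"number": n, "title": f"Issue {n}", "state": "open", "body": "x" * 500}
        for n in range(1, 26)]


def fake_fetch(page: int, per_page: int) -> list[dict]:
    start = (page - 1) * per_page
    return FAKE[start:start + per_page]


records, truncated = collect_within_budget(fake_fetch)
print(len(records), truncated)
```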

## Duplicate check

Open issue searches for `Authentication failed with provider` + `sub-agent`, `Maximum AI credits exceeded` + `sub-agent`, `Pick Three` + `sub-agents` + `AI credits`, and `pick-three-keep` + `Authentication failed` found no matching open tracking issue. The separate `UV_PATH` startup failure in `trigger-code-duplication-detector` and `trigger-code-complexity-detector` is already covered by open draft PR #1385, so it is not reported here.




> Generated by [Internal: Agent Efficiency](https://github.com/elastic/ai-github-actions/actions/runs/28044288088) · 936.5 AIC · ⌖ 12.1 AIC · ⊞ 24.5K · [◷](https://github.com/search?q=repo%3Aelastic%2Fai-github-actions+is%3Aissue+%22gh-aw-workflow-call-id%3A+elastic%2Fai-github-actions%2Fagent-efficiency%22&type=issues)
> - [x] expires on Jun 30, 2026, 5:54 PM UTC
