Problem
The Pre-fetch PR diff step (step 19) in the Matt Pocock Skills Reviewer agent job fails with ##[error]Process completed with exit code 141. Exit code 141 is 128 + 13, i.e. the process was terminated by SIGPIPE. The failure occurs before agent activation: the agent never produces a turn, so there is no audit output beyond the step exit code. The step failed in 9 of the 30 runs in the last 30 days (a 30% failure rate), and 6 of those failures occurred in the last 11 hours alone.
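The 128 + 13 decomposition can be checked in any shell. The following is a minimal sketch, assuming bash, whose `kill -l` builtin maps a signal number to its name; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Decode a shell exit status above 128 into the signal that ended the process.
exit_code=141
signal_number=$((exit_code - 128))          # 141 - 128 = 13
signal_name="$(kill -l "$signal_number")"   # builtin lookup: number -> name
echo "exit $exit_code = 128 + $signal_number (SIG$signal_name)"
```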
Affected workflow and runs (last 6h window)
| Run | Branch | Duration | Step that failed |
| --- | --- | --- | --- |
| §26222142424 | copilot/fix-github-actions-job-agent | 5.3m | Pre-fetch PR diff (1s) |
| §26222141685 | copilot/fix-github-actions-job-agent | 2.9m | Pre-fetch PR diff (1s) |
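Additional same-signature failures in the prior 24h (same workflow, same step, same exit code), listed by branch:
- copilot/refactor-oversized-functions-parser-workflow
- copilot/decouple-engine-permission-mode
- copilot/chore-bump-mcpg-to-v0316-and-firewall-to-v02550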
Root cause
The Pre-fetch PR diff step in .github/workflows/mattpocock-skills-reviewer.lock.yml (around line 539) runs this shape:
    set -euo pipefail
    mkdir -p /tmp/gh-aw/agent
    gh pr diff "$PR_NUMBER" --repo $EXPR_GITHUB_REPOSITORY \
      --exclude '**/*.lock.yml' \
      --exclude '**/generated/**' \
      --exclude '**/dist/**' \
      --exclude '**/build/**' \
      | head -n 3000 \
      > /tmp/gh-aw/agent/pr-diff.patch
    ...
With set -o pipefail, when the PR diff exceeds 3000 lines, head -n 3000 exits 0 after reading 3000 lines, which closes the read end of the pipe. The still-running gh pr diff then receives SIGPIPE (signal 13) on its next write and terminates with status 141 (128 + 13). pipefail makes that 141 the status of the whole pipeline, and set -e aborts the step with it. The step fails in ~1 second, before the agent runs.
This is a classic set -o pipefail + | head pitfall. The failure affects PRs whose post-exclude diff is large, i.e. exactly the PRs that the truncation is designed to handle. Runs on PRs with small diffs succeed; runs on PRs with large diffs consistently fail at this step.
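The mechanism can be reproduced locally without gh. The sketch below assumes bash and the default SIGPIPE disposition, and uses `seq` purely as a stand-in producer that, like a large `gh pr diff`, is still writing when `head` exits.

```shell
#!/usr/bin/env bash
# Reproduce the failure mode: a producer that outlives `head` under pipefail.
set -uo pipefail   # -e omitted on purpose so the statuses can be inspected

seq 1 1000000 | head -n 3000 > /dev/null
statuses=("${PIPESTATUS[@]}")   # capture immediately: [producer, head]

pipeline_status=0
seq 1 1000000 | head -n 3000 > /dev/null || pipeline_status=$?

echo "producer=${statuses[0]} head=${statuses[1]} pipeline=${pipeline_status}"
```

Under those assumptions the producer reports 141 while `head` reports 0, and with pipefail the pipeline as a whole takes the producer's status.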
Proposed fix
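The source is the .md workflow file (.github/workflows/mattpocock-skills-reviewer.md); the .lock.yml is generated from it, so the fix belongs in the .md file. Update the Pre-fetch PR diff step body to one of the following.
Option A (preferred, fewest invariants changed): ignore the exit status of gh pr diff for this command only. Because || true masks every gh pr diff failure, not just SIGPIPE, a genuine error (for example, an authentication failure) would leave an empty or partial patch instead of failing the step: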
    set -euo pipefail
    mkdir -p /tmp/gh-aw/agent
    { gh pr diff "$PR_NUMBER" --repo $EXPR_GITHUB_REPOSITORY \
        --exclude '**/*.lock.yml' \
        --exclude '**/generated/**' \
        --exclude '**/dist/**' \
        --exclude '**/build/**' \
        || true; } | head -n 3000 > /tmp/gh-aw/agent/pr-diff.patch
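If masking every gh failure is a concern, the same shape can accept only status 141. This is a sketch of that variant, not the fix adopted in the workflow; `large_diff` and `broken_diff` are hypothetical stand-ins for `gh pr diff` so the example runs without network access.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for `gh pr diff`: one large producer, one that fails.
large_diff() { seq 1 1000000; }
broken_diff() { return 1; }

out="$(mktemp)"

# Status 141 (SIGPIPE after head exits) is accepted; anything else propagates.
{ large_diff || [ "$?" -eq 141 ]; } | head -n 3000 > "$out"
lines=$(wc -l < "$out")

# A genuine producer failure still surfaces as a non-zero pipeline status.
status=0
{ broken_diff || [ "$?" -eq 141 ]; } | head -n 3000 > /dev/null || status=$?

echo "lines=$((lines)) broken_status=$status"
```

The trade-off is a slightly less obvious one-liner in exchange for keeping authentication or network failures visible.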
Option B: write the full diff to a temp file, then truncate (no pipe, no SIGPIPE). With set -e still in effect, a genuine gh pr diff failure still fails the step:
    gh pr diff "$PR_NUMBER" --repo $EXPR_GITHUB_REPOSITORY \
      --exclude '**/*.lock.yml' --exclude '**/generated/**' \
      --exclude '**/dist/**' --exclude '**/build/**' \
      > /tmp/gh-aw/agent/pr-diff.full
    head -n 3000 /tmp/gh-aw/agent/pr-diff.full > /tmp/gh-aw/agent/pr-diff.patch
Option C: localize the pipefail relaxation: set +o pipefail; gh pr diff ... | head ...; set -o pipefail. Like Option A, this also ignores genuine gh pr diff failures, because the pipeline then reports only the status of head.
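One way to keep the Option C relaxation from leaking into later commands is to scope it to a subshell. A minimal sketch, assuming bash, with `seq` standing in for `gh pr diff`:

```shell
#!/usr/bin/env bash
set -euo pipefail

out="$(mktemp)"

# pipefail is switched off only inside the subshell, so the pipeline's
# status there is head's (0) even though the producer dies of SIGPIPE.
( set +o pipefail; seq 1 1000000 | head -n 3000 ) > "$out"

lines=$(wc -l < "$out")
if [[ -o pipefail ]]; then parent_pipefail=on; else parent_pipefail=off; fi
echo "lines=$((lines)) parent_pipefail=$parent_pipefail"
```

Because the option change dies with the subshell, there is no trailing `set -o pipefail` to forget.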
Option A is the minimal change and keeps the streaming-truncate property. Other agent workflows that combine a gh ... | head pipeline with set -o pipefail should be checked for the same pattern and given the same fix.
Success criteria / verification
- The Pre-fetch PR diff step exits 0 on a PR whose post-exclude diff exceeds 3000 lines (e.g. re-run on copilot/fix-github-actions-job-agent).
- pr-diff.patch contains exactly the first 3000 lines of the (excluded) diff.
- Across the next 24h of scheduled runs, the workflow's failure-on-step-19 rate drops to 0.
- No regression: small-diff PRs continue to succeed and pr-diff.patch still has fewer than 3000 lines for them.
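The line-count and prefix criteria above can be rehearsed locally before re-running the workflow. This is a rough sketch under stated assumptions: `truncate_diff` is a hypothetical helper mirroring the Option A shape, and `seq` stands in for the post-exclude diff of a large and a small PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring Option A: run a producer, keep 3000 lines.
truncate_diff() {
  { "$@" || true; } | head -n 3000
}

# Large input is truncated to 3000 lines; small input passes through whole.
large_lines=$(truncate_diff seq 1 200000 | wc -l)
small_lines=$(truncate_diff seq 1 100 | wc -l)

# The truncated output must equal the first 3000 lines of the full output.
if diff <(truncate_diff seq 1 200000) \
        <(seq 1 200000 | sed -n '1,3000p') > /dev/null; then
  prefix_ok=yes
else
  prefix_ok=no
fi

echo "large=$((large_lines)) small=$((small_lines)) prefix_ok=$prefix_ok"
```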
Confidence / unknowns
- High confidence on root cause: exit code 141 is unambiguous (SIGPIPE), the step body matches the pipefail + head pattern exactly, all observed failures occurred at step 19 in ~1 second with no errors in any other step, and the failures correlate with branches likely to have large diffs (refactor / regex-cleanup / chore-bump-mcpg).
- Unknown: whether other workflows in this repo use the same pattern and silently mask similar SIGPIPE failures (e.g., not in step-fail mode). Worth a quick grep -n '| head -n' .github/workflows/*.lock.yml follow-up scan.
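The follow-up scan could be narrowed to files that contain both halves of the pattern. The sketch below is a file-level heuristic only: it does not prove the two occur in the same step. `find_pipefail_head` and the `WORKFLOW_DIR` variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print every *.lock.yml in a directory that mentions pipefail and also
# pipes into head. Matches are candidates for manual review, not verdicts.
find_pipefail_head() {
  local dir="$1" file
  for file in "$dir"/*.lock.yml; do
    [ -e "$file" ] || continue   # glob matched nothing
    if grep -q 'pipefail' "$file" &&
       grep -qE '\|[[:space:]]*head([[:space:]]|$)' "$file"; then
      printf '%s\n' "$file"
    fi
  done
}

find_pipefail_head "${WORKFLOW_DIR:-.github/workflows}"
```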
Parent report: #33620
References:
Generated by 🔍 [aw] Failure Investigator (6h)
Related to [aw-failures] Agentic workflow failures (last 6h): MCP telemetry regression + codex model 404 + 4 smaller fixes #33620