feat(test-impact): classify infrastructure failures separately#926

Open
Copilot wants to merge 6 commits into main from copilot/add-infrastructure-failure-category

Contributor

Copilot AI commented May 19, 2026

Summary

  • Adds infrastructure_failure classification for OOM, process termination, network, timeout, and exit-code-137 failure text.
  • Narrows the Killed matcher so that domain assertion text such as "expected process to be killed" still follows normal regression classification.
  • Includes focused classifier/barrel coverage, a pending release fragment, and rebuilt tracked dist/ artifacts.
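
For readers without the diff open, the matching behavior described above can be sketched roughly as follows. This is an illustrative sketch only: the names FailureClass, INFRA_PATTERNS, and classifyFailureSketch and every regular expression are assumptions made for the example, not the code in src/test-impact/failure-classifier.ts.

```typescript
// Hypothetical sketch; names and patterns are assumptions, not the PR's code.
type FailureClass = "infrastructure_failure" | "new_regression";

// One pattern per failure family named in the summary.
const INFRA_PATTERNS: RegExp[] = [
  /\bout of memory\b/i, // OOM
  /^\s*Killed\s*$/m, // bare "Killed" on its own line, as a shell prints after SIGKILL
  /\b(SIGKILL|SIGTERM)\b/, // process termination
  /\b(ECONNREFUSED|ECONNRESET|ENOTFOUND|ETIMEDOUT)\b/, // network
  /\btimed out\b/i, // timeout
  /\bexit(ed with)? code 137\b/i, // 137 = 128 + 9 (SIGKILL)
];

function classifyFailureSketch(text: string): FailureClass {
  // Anything that does not look like infrastructure falls through to the
  // normal regression path (collapsed to a single label in this sketch).
  return INFRA_PATTERNS.some((p) => p.test(text))
    ? "infrastructure_failure"
    : "new_regression";
}

console.log(classifyFailureSketch("Killed"));
console.log(classifyFailureSketch("expected process to be killed"));
```

The Killed pattern is anchored to a whole line and is case-sensitive, which is one plausible way to keep lowercase "killed" inside an assertion message from matching.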

Invariant audit

  • 1 (plugin init): not touched - no plugin-init code changed; node scripts/repro-704.mjs printed OK for T1/T2/T3, but the process then failed to exit, so the single identified node PID was stopped.
  • 2 (runtime portability): touched - bun run build, bun test tests/unit/build/bundle-portability.test.ts tests/unit/build/bundle-plugin-shape.test.ts tests/unit/build/bundle-node-load.test.ts, node --input-type=module -e "await import('./dist/index.js'); console.log('dist import OK')", and post-commit git diff --exit-code -- dist passed.
  • 3 (subprocesses): not touched - no source subprocess code changed.
  • 4 (.swarm containment): not touched - no .swarm path or working-directory logic changed.
  • 5 (plan durability): not touched - no plan ledger/projection code changed.
  • 6 (test_runner safety): not touched - repo validation used shell commands, not the OpenCode test_runner tool.
  • 7 (test writing): touched - loaded .opencode/skills/writing-tests/SKILL.md; added a bun:test regression in the existing classifier test file with no mocks.
  • 8 (session state): not touched - no session/global state changed.
  • 9 (guardrails/retry): not touched - no guardrail/retry logic changed.
  • 10 (chat/system msg): not touched - no chat/system-message hooks changed.
  • 11 (tool registration): not touched - no tool map or registration changed; barrel type export coverage was updated only for the classifier type surface.
  • 12 (release/cache): touched - added docs/releases/pending/failure-classifier-infrastructure-failure.md; no cache/version files changed.

Test plan

  • bun test src/test-impact/__tests__/failure-classifier.test.ts src/test-impact/__tests__/failure-classifier.adversarial.test.ts src/tools/__tests__/barrel-exports.test.ts
  • Runtime probes: assertion text containing killed returns new_regression; bare Killed returns infrastructure_failure
  • bun run typecheck
  • bunx biome ci src/test-impact/failure-classifier.ts src/test-impact/__tests__/failure-classifier.test.ts src/tools/__tests__/barrel-exports.test.ts docs/releases/pending/failure-classifier-infrastructure-failure.md
  • bunx biome ci . exited 0 with 10 warning-only findings in unrelated files
  • bun run build
  • git diff --exit-code -- dist after rebuild
  • node --input-type=module -e "await import('./dist/index.js'); console.log('dist import OK')"
  • bun test tests/unit/build/bundle-portability.test.ts tests/unit/build/bundle-plugin-shape.test.ts tests/unit/build/bundle-node-load.test.ts
  • git diff --check

Validation notes

  • The broader src/tools/__tests__/test-runner-history.test.ts suite still has an unrelated, pre-existing assertion mismatch for the scope "all" error message.
  • node scripts/repro-704.mjs printed OK for all three documented scenarios but did not exit, so the single identified node process was stopped. This PR does not change plugin init code.

Copilot AI changed the title from "[WIP] Add infrastructure failure category to failure classifier" to "Add infrastructure_failure classification to test impact failure classifier" May 19, 2026
Copilot AI requested a review from zaxbysauce May 19, 2026 21:35
zaxbysauce changed the title from "Add infrastructure_failure classification to test impact failure classifier" to "feat(test-impact): classify infrastructure failures separately" May 20, 2026
zaxbysauce marked this pull request as ready for review May 20, 2026 00:14
@zaxbysauce
Owner

@copilot review this PR

Contributor Author

Copilot AI commented May 20, 2026

@copilot review this PR

Reviewed and addressed follow-up risks in 2e5f4ea. I tightened the context required for infrastructure matches, added assertion-error guard handling in the classifier, and validated the change with the targeted classifier and bundle portability tests (all passing).
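
As a rough illustration of what an assertion-error guard can look like, here is a minimal sketch. It is not the code from 2e5f4ea: ASSERTION_MARKERS, looksLikeAssertion, and classifyWithGuard are hypothetical names, and the marker patterns are assumptions chosen for the example.

```typescript
// Hypothetical sketch of an assertion-error guard; not the code in 2e5f4ea.
type GuardedClass = "infrastructure_failure" | "new_regression";

// Assumed markers suggesting the text came from a test assertion.
const ASSERTION_MARKERS: RegExp[] = [
  /\bAssertionError\b/,
  /\bexpect\(/,
  /\bexpected\b.*\bto\b/i,
];

function looksLikeAssertion(text: string): boolean {
  return ASSERTION_MARKERS.some((p) => p.test(text));
}

function classifyWithGuard(
  text: string,
  isInfra: (t: string) => boolean,
): GuardedClass {
  // Guard first: assertion output never reaches the infrastructure matcher,
  // even when it happens to contain words such as "timed out" or "killed".
  if (looksLikeAssertion(text)) return "new_regression";
  return isInfra(text) ? "infrastructure_failure" : "new_regression";
}

// Stand-in infrastructure matcher used only for this example.
const timeoutMatcher = (t: string): boolean => /\btimed out\b/i.test(t);

console.log(classifyWithGuard("Test timed out after 5000ms", timeoutMatcher));
console.log(
  classifyWithGuard("AssertionError: expected status to be 'timed out'", timeoutMatcher),
);
```

Running the guard before the infrastructure patterns trades a small risk of labelling a genuine infrastructure failure as a regression for fewer false infrastructure labels on domain assertions.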


Development

Successfully merging this pull request may close these issues.

Failure classifier: add infrastructure_failure category
