Skip to content

init: import whichtests sources#1

Draft
ethanndickson wants to merge 14 commits into
mainfrom
init
Draft

init: import whichtests sources#1
ethanndickson wants to merge 14 commits into
mainfrom
init

Conversation

@ethanndickson
Copy link
Copy Markdown
Member

@ethanndickson ethanndickson commented May 20, 2026

Initial import of whichtests, the Go test-plan generator that drives the flake-go CI workflow in coder/coder. It was extracted from scripts/testselect/ in coder/coder and rewritten over the course of this PR's review rounds.

What it does

Given a base/head git revision pair (or a GitHub Actions event), whichtests walks the diff, parses each changed *_test.go file, picks the smallest set of tests to rerun, and writes a workflow matrix (include array of {package, run_regex, test_count}) plus a Markdown summary.

The flake-go workflow consumes that matrix to rerun each selected test with -count=N -race.

Selection invariant

whichtests selects only the tests a human reviewer would say were directly changed: a top-level Test*, Fuzz*, or Example* function in a changed *_test.go file whose body or signature lines overlap a diff hunk in that file.

Concretely:

  • Only changed *_test.go files are considered. Non-test Go files, docs, frontend, configs, dependency bumps: zero tests selected.
  • Within a changed test file, only top-level runnable functions are selected. Method-style suite tests (func (s *X) TestY), helpers, TestMain, init, package-level vars/consts/types, and import changes are all ignored.
  • Selection is hunk-overlap on the runnable test's declaration range. Context lines around an unchanged neighbor test do not pick that neighbor up.
  • For renamed or cross-package moves, the new-side runnable test is selected for the new package's head inventory. Independently, old-side tests whose names still exist in the old package's head inventory can be selected for the old package.
  • Unsafe test names that cannot be passed safely through -run cause the tool to return an error, rather than silently falling back to a whole-package run.

There is no broadening, no whole-package fallback, no matrix overflow row. Changes that do not directly modify a runnable test body produce an empty matrix.

File layout

Single package main, split into focused files:

File Responsibility
cli.go main, flag parsing, command orchestration (runCommand)
config.go config / commandConfig types and defaults
request.go runRequest, diffRange, fetchSpec, outputSinks, revision validation
gitexec.go gitRunner / gitFetcher types and the real exec.Command implementation
diff.go git diff listing and parsing, change kinds, hunks, line ranges
snapshot.go AST parsing, fileSnapshot, top-level runnable test ranges
selection.go Per-change direct test selection (selectChange, hunk overlap)
inventory.go inventoryCache for revision validity, file existence, parsed snapshots, and package inventories
plan.go Plan construction, matrix and summary rendering
githubactions.go GitHub Actions event request builder and fetch fallback chain
publish.go Matrix and summary output sinks (file, stdout, GITHUB_OUTPUT, GITHUB_STEP_SUMMARY)

Test files are 1:1 with sources (<source>_test.go), plus helpers_test.go, gitfake_test.go, and integration_test.go.

Boilerplate

.gitignore, .golangci.yaml, .prettierrc, LICENSE, Makefile, the CI workflow, and the setup-go composite action are imported from coder/paralleltestctx. CI runs quality / fmt, quality / lint, and quality / test.

Follow-up

coder/coder's scripts/testselect/ directory and the .github/workflows/flake-go.yaml reference still need to be migrated to consume this repo. That migration is out of scope for this PR.

Relates to CODAGT-381

Imports the testselect Go test-plan generator from coder/coder@bf6a0d953f
(branch ethan/go-test-flake-detector) and splits the single ~1.6k-line
main.go into ten focused files inside package main:

- cli.go: entrypoint, flag parsing, command orchestration
- config.go: config / commandConfig types and defaults
- request.go: runRequest, diffRange, revision validation
- gitexec.go: gitRunner / gitFetcher types and exec.Command impl
- diff.go: git diff parsing, change kinds, hunks, line ranges
- snapshot.go: AST snapshot parsing, fileSnapshot, sharedDecl, fallbacks
- broadening.go: per-kind broadening rules (broadeningScope)
- selection.go: per-change selection logic
- inventory.go: inventoryCache for package/directory test discovery
- plan.go: plan construction, matrix and summary rendering

githubactions.go and publish.go are imported unchanged. Tests
(githubactions_test.go, integration_test.go, main_test.go) are kept as
single files initially.

go vet, go build, go test, and go test -race are all green.
Split main_test.go (2297 lines) into focused *_test.go files mirroring
the source layout, and move two helpers (mergePackageSelection,
packagePattern) from plan.go to selection.go so layering is consistent
(plan depends on selection, not vice versa).

Test files now line up 1:1 with their production counterparts:
  cli_test.go        - run/runCommand end-to-end tests
  diff_test.go       - hunk/change parsing tests
  snapshot_test.go   - parseFileSnapshot/fallback tests
  selection_test.go  - selectTestsForSnapshots and merge tests
  plan_test.go       - buildExecutionPlan/renderSummary tests
  publish_test.go    - publishPlan tests (from githubactions_test.go)
  githubactions_test.go - GitHub event/range tests
  integration_test.go   - real-git integration (unchanged)
  helpers_test.go    - shared line-range + inventory helpers
  gitfake_test.go    - fakeGitRepo mock + dispatcher methods

go vet, go build, go test, and go test -race all pass.
Mirrors the parallel testctx repo's structure: Makefile, golangci-lint
v2 config, prettier config, gitignore, MIT license, CI workflow with
fmt/lint/test jobs, and the composite setup-go action (default Go
version bumped to 1.26.2 to match go.mod).

Drive-by gofumpt fix: blank line between needsOldSnapshot and
addMatchingTests in selection.go.
- publish.go: surface Close errors via named return on appendFile,
  fixing errcheck. Close errors on a write path can indicate lost data,
  so propagating beats silently discarding.
- plan.go: drop `:=` in selectTestPlan loop so the inner err reuses
  the outer binding, fixing govet shadow.
- githubactions.go: drop the redundant cfg.config selector and rely on
  the promoted method, fixing staticcheck QF1008.
- helpers_test.go: drop both dir and packageName parameters from
  mustPackageInventory; all 16 call sites pass the same synthetic
  values, so hardcode them and document the helper, fixing unparam.

go vet, go build, make test, and make fmt are clean. make lint still
reports six gosec findings (G301/G304) covering directory permissions
and reads from user-provided paths; those are policy decisions.
The previous name was honest but generic — many things 'select tests.'
'whichtests' is the question this tool answers: given a Go PR's diff,
which tests should run? Renaming makes the intent obvious in workflow
YAML, CLI help, and the module path.

Touchpoints:
- go.mod module github.com/coder/whichtests
- Makefile builds build/whichtests
- cli.go doc comment
- README.md heading and invocation examples

Internal symbol names like selectTestsForSnapshots are untouched —
they describe what the function does, not the tool's identity.
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for extracting this from coder/coder. The algorithm is small, the git/AST/matrix boundaries are kept clean, and the regex gates on safeTestNameRE/safePackagePatternRE plus the GITHUB_OUTPUT CRLF + 1 MiB guard close the highest-impact injection paths. The two real-git integration tests are a particularly strong asset.

Reviewer context: this binary has exactly one consumer — coder/coder's flake-go.yaml, invoked as whichtests --repo-root . --github-actions --out-matrix "$RUNNER_TEMP/flake-matrix.json" after a per-PR go install from a pinned SHA (no binary caching). Severity is calibrated to that flow.

11 reviewers in parallel cross-checked into 6 P2, 8 P3, 1 perf bundle, 2 nit bundles across 17 inline comments. Most P2s are deletions (dead config, fossil fields, a test-only run() parallel entry that 9 reviewers independently flagged) plus two algorithm findings the smoke test exposed. None block merge.

Comment thread request.go
Comment thread selection.go Outdated
Comment thread broadening.go Outdated
Comment thread broadening.go Outdated
Comment thread cli.go Outdated
Comment thread plan.go Outdated
Comment thread request.go
Comment thread inventory.go
Comment thread go.mod
Comment thread config.go Outdated
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 1 cleanups all landed cleanly at 4acef74. The testDecl flattening, run() shim removal, fallback-parsing deletion, pull_request.head.sha strictness, broadening max-scope correction, summary filename quoting, and xerrors removal are all in place; the resulting code is noticeably smaller and clearer.

Round 2 findings are simplification opportunities in the same vein: dead struct fields (matching the round-1 testDecl pattern), single-use helpers, double-AST-parsing with unreachable fallback branches, and a defaults function pair that risks drift. 9 P3, 4 nits, 1 obs across 14 inline comments. None block merge.

Comment thread gitexec.go
Comment thread plan.go Outdated
Comment thread selection.go Outdated
Comment thread snapshot.go Outdated
Comment thread selection.go Outdated
Comment thread selection.go Outdated
Comment thread publish.go Outdated
Comment thread plan.go Outdated
Comment thread diff.go Outdated
Comment thread diff.go
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Redundant follow-up; the round-2 review above already contains all 14 comments.)

Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up to the round-2 review: one P3 simplification that I previously catalogued as an observation. On reflection, since PR #1 is the initial import, every line lands now, so the "future-shape" framing was the wrong lens. This one is a real simplification with clear safety, so promoting it to P3 and posting inline.

Comment thread selection.go Outdated
Apply deletion-first cleanups identified in the round 2 deep review:

- gitexec: shrink gitResult to just Stdout; keep stderr/exit local to execGit
- config: collapse defaultConfig through withDefaults via cmp.Or
- publish: inline the trailing-newline append at its single call site
- snapshot: stop caching testingDotImport on fileSnapshot; pass it directly
- plan: drop unused executionAccumulator.Package, fold appendUniqueNote,
  rename unsafeRunRegexTestNames -> unsafeRunRegexTestCount
- diff: inline oldRevisionPath/newRevisionPath into pathspecs; fold
  isRunnableGoTestPath into isRunnableTestFilePath
- selection: parse each changed file once via parsedFileSnapshot; thread
  cache through selectChange instead of re-passing cfg/git; drop unreachable
  parse fallback branches; simplify allDirectoryTestsSelection
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 3 cleanup landed cleanly at f3da5e5 — every Resolved finding from rounds 1 and 2 matches the new code, and the deferred testFileChange.Kind observation still looks like the right call to defer.

Fresh review across 11 reviewer roles: 8 P3, 4 Obs, and 1 grouped Nit across 13 inline comments. The two convergent findings are readFileAtRevision per-file caching (5 reviewers; round 1 cited this cost but the caching that landed addressed package inventories, not the per-file git work) and appendGitHubOutput's outputSizeLimit test-only seam (3 reviewers; same shape round 1 removed from runRequest.OutputSizeLimit).

Comment thread inventory.go
Comment thread githubactions.go Outdated
Comment thread plan.go
Comment thread gitexec.go
Comment thread plan.go Outdated
Comment thread selection.go
Comment thread config.go
Comment thread selection.go
Comment thread plan.go Outdated
Comment thread broadening.go Outdated
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 4 cleanup landed cleanly at 7405766. The inventoryCache rewrite, selectChange state machine with expectsOldFile/expectsNewFile, the ensureConcreteRangeAvailable ordered-recovery chain, the summaryReport.Notes/outputSizeLimit removals, and the selectionFixture01..22 hoist all match the round-3 plan, and the new tests honestly pin each contract (the counter.total cache assertions and the errors.Join substring locks are especially clean).

Fresh review across 9 reviewer roles: 2 P3 and 4 observations across 6 inline comments. The two convergent P3s are an orphan readFileAtRevision in diff.go whose body the cache reimplements inline (Structural Analyst + Duplication Checker), and one branch in ensureConcreteRangeAvailable that still discards the attempts slice (Go Architect + Edge Case Analyst). The five-way convergent finding on expectsOldFile/expectsNewFile's trailing return true is observation-level , today the fallthrough is dead code, but every reviewer wanted the intent locked in writing.

Comment thread diff.go Outdated
Comment thread githubactions.go Outdated
Comment thread selection.go
Comment thread inventory.go Outdated
Comment thread inventory.go
Comment thread selection_test.go
Copy link
Copy Markdown
Member Author

@ethanndickson ethanndickson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the round-4 fix-ups. The cachedFile collapse cleanly merges the two parallel caches, noteFileExists cuts the per-file ls-tree round-trips, expectedFileSides centralizes the changeKind to file-side mapping, and the new fetch-spec tests pin the right failure shape.

Fresh review found no defects: 5 observations and 1 nit across 6 inline comments. All are cumulative debt notes about contracts and invariants that now live in caller discipline rather than the type.

Comment thread inventory.go
Comment thread inventory.go
Comment thread inventory.go
Comment thread selection.go Outdated
Comment thread githubactions.go
Comment thread selection_test.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant