bash: speed up the Git test suite on Windows via spawnve() and a built-in expr by dscho · Pull Request #288 · git-for-windows/MSYS2-packages

dscho · 2026-05-13T12:36:24Z

Git's test suite on Windows runs in roughly three to four times the wall-clock time of the same suite on Linux, and the dominant cause is fork(). Every shell script, every helper, every $(expr ...) substitution, every short cmd | cmd pipeline forces Cygwin's runtime to snapshot the parent's address space so that POSIX copy-on-write semantics can be emulated on top of CreateProcess. The test suite spawns a lot of those, and that cost is paid on every push to git-for-windows/git, by every contributor running the suite locally before opening a PR, and on every retry of a flaky job in CI.

This branch teaches bash to side-step fork() for the cases where the regular fork path would have behaved identically anyway, and to handle expr internally rather than spawning /usr/bin/expr for every arithmetic substitution. The cumulative payoff matters in two distinct ways: contributors get noticeably shorter iteration loops when running the test suite locally on Windows, and CI runs spend less time on a Windows runner per push. The Git for Windows CI in particular pays the fork() cost every time, and the savings compound across PRs, releases, and re-runs.
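To get a rough feel for the per-spawn cost described above, the two loops below compute the same sum, once through `expr` and once through shell arithmetic. This is only an illustrative sketch, not the benchmark behind the numbers in this PR: the function names are made up for the example, and it assumes that `expr` resolves to the external /usr/bin/expr, as it does under a stock bash.

```shell
#!/bin/bash
# Sum 1..n, invoking expr twice per iteration. Under a stock bash every
# $(expr ...) spawns an external process.
sum_with_expr () {
	local n=$1 i=1 total=0
	while [ "$i" -le "$n" ]
	do
		total=$(expr "$total" + "$i")
		i=$(expr "$i" + 1)
	done
	echo "$total"
}

# Same sum using shell arithmetic, which never leaves the shell process.
sum_with_arith () {
	local n=$1 i=1 total=0
	while [ "$i" -le "$n" ]
	do
		total=$((total + i))
		i=$((i + 1))
	done
	echo "$total"
}
```

Running `time sum_with_expr 1000` and `time sum_with_arith 1000` in the same shell gives a crude upper bound on what avoiding the external `expr` can save on a given machine.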

Empirical evidence

Whole Git test suite, 17-way matrix, identical minimal SDK image, identical git-artifacts, the only variable being the bash binary:

| | wall-clock total | run |
| --- | --- | --- |
| stock `bash` 5.3.009 | 32,401 s | `bash-spawnve-baseline` 25742171983 |
| patched `bash` | 24,334 s | `bash-spawnve-fast-path` 25742172112 |

That is a 25% reduction across 878 tests. No tests fail.
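For reviewers who want to re-derive the percentage from the two totals, a small helper along these lines works. The function name is hypothetical and the only inputs are the wall-clock totals from the table above.

```shell
#!/bin/bash
# Print the percentage reduction from a baseline to a new measurement,
# rounded to one decimal place.
reduction_percent () {
	awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) * 100 / base }'
}

reduction_percent 32401 24334   # prints 24.9
```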

A handful of tests looked like regressions in single-shot timings; the worst was t6433-merge-toplevel.sh at a 1.24x ratio. To rule out CI scheduler noise, the sibling drill-bash-regression workflow re-runs a single test 20 times under each variant on the same runner in randomized paired order, swapping both bash.exe and sh.exe between every iteration. For t6433-merge-toplevel.sh the result was:

| | mean over 20 runs |
| --- | --- |
| stock `bash` | 16.714 s |
| patched `bash` | 14.226 s |

Every one of the 20 paired iterations showed the patched build faster. The apparent regression in the whole-suite comparison was scheduler noise: by mechanism, the patched build has no code path that adds work over the baseline, only paths that skip work. The same harness is available for any other test a reviewer wants spot-checked.
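The randomized paired order can be pictured with the sketch below. It is not the drill-bash-regression workflow itself; the function name and the variant labels "stock" and "patched" are placeholders chosen for this example.

```shell
#!/bin/bash
# Emit one line per iteration naming the order in which the two variants
# run. Both variants appear in every iteration; only the order is random.
paired_order () {
	local iterations=$1 i
	for ((i = 1; i <= iterations; i++))
	do
		# RANDOM is a bash built-in variable; its low bit decides
		# which variant goes first in this iteration.
		if ((RANDOM % 2))
		then
			echo "$i stock patched"
		else
			echo "$i patched stock"
		fi
	done
}
```

A harness would read these lines, install the named variant before each timed run, and compare the two timings within each iteration. Pairing keeps slow and fast phases of the runner from favoring one variant.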

The workflow that produced the whole-suite numbers ships on this branch as .github/workflows/bash-with-git-tests.yml so the comparison can be reproduced on demand.

Scope

The fast path is strictly gated on __CYGWIN__. Non-Windows MSYS2 environments are unaffected. The work deliberately does not touch the Cygwin runtime itself: a proper posix_spawn() integration there would require driving Jeremy Drake's stalled patches to completion and is much more invasive. Deferring that keeps this change isolated to the bash package alone while still delivering the bulk of the available speedup.

…t-in expr

Git's CI runs on Windows take far longer than on Linux primarily because the test suite is a tree of shell scripts orchestrated by Perl's "prove", and on Windows both Bash and Perl must go through msys2-runtime's POSIX emulation. fork() in particular is expensive on Windows: Cygwin has to snapshot the parent's entire address space to fake POSIX copy-on-write semantics, and every external command the test suite spawns pays that cost. Comparing per-test timings between the Linux and Windows CI runs makes the pattern obvious: scripts that hammer plain external commands such as /usr/bin/expr or short "cmd | cmd" pipelines are the worst offenders, with Windows/Linux runtime ratios in the high single digits to low double digits even when the underlying work is trivial.

This series, developed and benchmarked in isolation against a clean bash-5.3 source tree, attacks both factors directly. An "expr" built-in handles the test suite's many $(expr ...) invocations inside the shell process, removing the fork()+execve()+exit() round trip entirely. A spawnve()-based fast path in execute_cmd.c then sends synchronous external commands straight to CreateProcess() via Cygwin's spawnve(), bypassing fork() for the simple-command case.

The fast path is grown incrementally across the series to cover pipeline stages, simple redirections (filename targets, fd dup and close, here-docs and here-strings, fd numbers >= 3), non-interactive asynchronous commands, and the awkward edge cases: a sentinel return value disambiguates "did not attempt" from "attempted and failed" so we don't print duplicate diagnostics, and an unwind-protect frame makes the whole window resilient against longjmps from OOM or fatal traps.

Standalone benchmarks against the distro bash show roughly 2x on plain external commands, 1.9x on redirected commands, and 1.1x on short pipelines; the test-suite-level effect is what this branch exists to measure.

The fast path is gated on __CYGWIN__ and only kicks in for cases the regular fork path would have handled with no observable difference, so non-Windows MSYS2 environments and corner cases like job control or signal-sensitive backgrounded interactive jobs continue to use the existing fork()+execve() path unchanged.

pkgrel is bumped so the freshly built package is preferred over the in-distro 5.3.009-1 when a CI job drops it on top of a git-sdk-64 minimal SDK.

Assisted-by: Claude Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
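As a reading aid, the command shapes named in the commit message above look like this in a script. This is a sketch only: the function name and the strings are invented for illustration, and which of these commands the patched bash actually routes through spawnve() is decided by the patches, not by this example.

```shell
#!/bin/bash
# One external command per shape listed in the commit message.
command_shapes () {
	local tmp
	tmp=$(mktemp) || return 1
	echo "from a file" >"$tmp"

	cat <"$tmp"                       # filename target
	cat <"$tmp" 2>&1                  # fd duplication
	cat <"$tmp" 2>&-                  # fd close
	cat <<EOF
from a here-doc
EOF
	cat <<<"from a here-string"       # here-string
	cat 3<"$tmp" <&3                  # fd number >= 3
	printf 'from a pipeline\n' | cat  # short pipeline
	cat <"$tmp" &                     # asynchronous command
	wait "$!"

	rm -f "$tmp"
}
```

Running the same function under the stock and the patched bash and diffing the output is a quick way to check that the two paths behave identically for these shapes.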

…branch

The patches added in the previous commit are meant to be evaluated against Git's own test suite on Windows, not in isolation: standalone microbenchmarks are useful for sanity-checking individual cases, but the question that matters is whether the spawnve() fast path and the expr built-in measurably shrink the end-to-end runtime of the test scripts that "prove" orchestrates on the Git for Windows CI.

Add a workflow that builds the package on a windows-latest runner via setup-git-for-windows-sdk@v2 (full flavor, so makepkg's build dependencies are available), uploads the resulting bash-*.pkg.tar.*, and then prepares a minimal SDK image with the freshly built bash dropped on top.

The image-preparation job follows the same shape as msys2-runtime's build.yaml: download the latest successful git-for-windows/git-sdk-64 ci-artifacts run for the minimal SDK tarball and the git-artifacts tarball, extract the SDK to a private directory, overlay our package by tar-extracting it (we don't have a pacman database in the stand-alone tree, so "pacman -U" would refuse to operate; tar extraction is what an in-place pacman install would do anyway, modulo the pacman bookkeeping that we deliberately exclude), sanity-check the resulting "bash --version", and repackage as the git-sdk-x86_64-minimal artifact the downstream reusable workflow expects.

Hand off to git-for-windows/git-sdk-64's test-ci-artifacts.yml reusable workflow, which fans Git's test suite out across 17 parallel matrix shards. This is the same harness msys2-runtime's CI uses, so the runtime-vs-bash variable is isolated cleanly: each PR run produces 17 per-shard logs whose timings can be compared directly against those of a vanilla-bash run on the same SDK revision.

Trigger on push/PR for bash/** and the workflow file itself so the harness is exercised whenever the patches change, and on workflow_dispatch for ad-hoc reruns.

Assisted-by: Claude Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

dscho added 2 commits May 12, 2026 09:56

dscho wants to merge 2 commits into git-for-windows:main from dscho:bash-spawnve-fast-path
