bash: speed up the Git test suite on Windows via spawnve() and a built-in expr #288

Open: dscho wants to merge 2 commits into git-for-windows:main from dscho:bash-spawnve-fast-path

@dscho dscho commented May 13, 2026

Git's test suite takes roughly three to four times as long in wall-clock terms on Windows as on Linux, and the dominant cause is fork(). Every shell script, every helper, every $(expr ...) substitution, every short cmd | cmd pipeline forces Cygwin's runtime to snapshot the parent's address space so that POSIX copy-on-write semantics can be emulated on top of CreateProcess. The test suite spawns a lot of those, and that cost is paid on every push to git-for-windows/git, by every contributor running the suite locally before opening a PR, and on every retry of a flaky job in CI.

This branch teaches bash to side-step fork() for the cases where the regular fork path would have behaved identically anyway, and to handle expr internally rather than spawning /usr/bin/expr for every arithmetic substitution. The cumulative payoff matters in two distinct ways: contributors get noticeably shorter iteration loops when running the test suite locally on Windows, and CI runs spend less time on a Windows runner per push. The Git for Windows CI in particular pays the fork() cost every time, and the savings compound across PRs, releases, and re-runs.
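
To make concrete what the expr change targets, here is a minimal sketch of a counting loop written two ways. It is not taken from Git's test suite, and the function names are made up for illustration. The first form asks for /usr/bin/expr once per iteration through a command substitution, which is the pattern described above; the second uses the shell's own arithmetic expansion and needs no external program.

```shell
# Count up to $1 by running the external expr command once per iteration.
# Under a stock bash, each $(expr ...) launches /usr/bin/expr.
count_with_expr() {
    n=$1 i=0
    while [ "$i" -lt "$n" ]; do
        i=$(expr "$i" + 1)
    done
    echo "$i"
}

# Same loop using POSIX arithmetic expansion: no external command involved.
count_with_arith() {
    n=$1 i=0
    while [ "$i" -lt "$n" ]; do
        i=$((i + 1))
    done
    echo "$i"
}
```

Running `time count_with_expr 200` and `time count_with_arith 200` under a stock bash on Windows gives a rough feel for the per-spawn cost; absolute numbers vary by machine. The point of the built-in expr is that scripts written in the first style would not need to be rewritten to benefit.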

Empirical evidence

Whole Git test suite, 17-way matrix, identical minimal SDK image, identical git-artifacts, the only variable being the bash binary:

| | wall-clock total | run |
|---|---|---|
| stock bash 5.3.009 | 32,401 s | bash-spawnve-baseline 25742171983 |
| patched bash | 24,334 s | bash-spawnve-fast-path 25742172112 |

That is a 25% reduction across 878 tests. No tests fail.
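
The percentage can be re-derived from the two wall-clock totals with integer arithmetic alone. The helper below is an illustrative sketch (the name `pct_reduction` is not part of this branch); it rounds to one decimal place.

```shell
# Print the percentage reduction from $1 (before) to $2 (after),
# rounded to one decimal place, using integer arithmetic only.
pct_reduction() {
    before=$1 after=$2
    # Work in tenths of a percent; adding before/2 rounds to nearest.
    tenths=$(( ((before - after) * 1000 + before / 2) / before ))
    echo "$((tenths / 10)).$((tenths % 10))"
}

pct_reduction 32401 24334   # prints 24.9
```

The figure quoted above is this value rounded to the nearest whole percent.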

A handful of tests looked like regressions in single-shot timings; the worst was t6433-merge-toplevel.sh at a 1.24x ratio. To rule out CI scheduler noise, the sibling drill-bash-regression workflow re-runs a single test 20 times under each variant on the same runner in randomized paired order, swapping both bash.exe and sh.exe between every iteration. For t6433-merge-toplevel.sh the result was:

| | mean over 20 runs |
|---|---|
| stock bash | 16.714 s |
| patched bash | 14.226 s |

Every one of the 20 paired iterations showed the patched build faster. The apparent regression in the whole-suite comparison was scheduler noise, which is consistent with the mechanism: the patched build has no code path that adds work over the baseline, only paths that skip it. The same harness is available for any other test a reviewer wants spot-checked.
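
The idea of a randomized paired order can be sketched in a few lines of shell. This is not the drill-bash-regression workflow's code: the function names are hypothetical, the variants are passed in as command names, the sketch does not swap bash.exe and sh.exe the way the workflow does, and the timing helper assumes GNU date for `%N`.

```shell
# Milliseconds since the epoch (assumes GNU date, which supports %N).
now_ms() {
    echo $(( $(date +%s%N) / 1000000 ))
}

# Run the command named by $1 once and print "<name> <elapsed ms>".
time_variant() {
    start=$(now_ms)
    "$1" >/dev/null 2>&1 || true
    end=$(now_ms)
    echo "$1 $((end - start))"
}

# Run $2 and $3 back to back, $1 times, flipping a coin for which one goes
# first in each pair so that drift on the runner affects both equally.
# RANDOM is a bash feature; without it the order is simply fixed.
run_paired() {
    rounds=$1 first=$2 second=$3
    round=1
    while [ "$round" -le "$rounds" ]; do
        if [ $(( ${RANDOM:-0} % 2 )) -eq 0 ]; then
            time_variant "$first"
            time_variant "$second"
        else
            time_variant "$second"
            time_variant "$first"
        fi
        round=$((round + 1))
    done
}
```

A call such as `run_paired 20 run_stock run_patched`, where the two names are placeholders for wrappers that invoke the test under each bash, yields 40 lines that can be averaged per variant or compared pair by pair.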

The workflow that produced the whole-suite numbers ships on this branch as .github/workflows/bash-with-git-tests.yml so the comparison can be reproduced on demand.

Scope

The fast path is strictly gated on __CYGWIN__. Non-Windows MSYS2 environments are unaffected. The work deliberately does not touch the Cygwin runtime itself: a proper posix_spawn() integration there would require driving Jeremy Drake's stalled patches to completion and is much more invasive. Deferring that keeps this change isolated to the bash package alone while still delivering the bulk of the available speedup.

dscho added 2 commits May 12, 2026 09:56
…t-in expr

Git's CI runs on Windows take far longer than on Linux primarily
because the test suite is a tree of shell scripts orchestrated by
Perl's "prove", and on Windows both Bash and Perl must go through
msys2-runtime's POSIX emulation.  fork() in particular is expensive
on Windows: Cygwin has to snapshot the parent's entire address space
to fake POSIX copy-on-write semantics, and every external command the
test suite spawns pays that cost.  Comparing per-test timings between
the Linux and Windows CI runs makes the pattern obvious: scripts that
hammer plain external commands such as /usr/bin/expr or short
"cmd | cmd" pipelines are the worst offenders, with Windows/Linux
runtime ratios in the high single digits to low double digits even
when the underlying work is trivial.

This series, developed and benchmarked in isolation against a clean
bash-5.3 source tree, attacks both factors directly.  An "expr"
built-in handles the test suite's many $(expr ...) invocations
inside the shell process, removing the fork()+execve()+exit() round
trip entirely.  A spawnve()-based fast path in execute_cmd.c then
sends synchronous external commands straight to CreateProcess() via
Cygwin's spawnve(), bypassing fork() for the simple-command case.
The fast path is grown incrementally across the series to cover
pipeline stages, simple redirections (filename targets, fd dup and
close, here-docs and here-strings, fd numbers >= 3), non-interactive
asynchronous commands, and the awkward edge cases: a sentinel return
value disambiguates "did not attempt" from "attempted and failed"
so we don't print duplicate diagnostics, and an unwind-protect frame
makes the whole window resilient against longjmps from OOM or fatal
traps.  Standalone benchmarks against the distro bash show roughly
2x on plain external commands, 1.9x on redirected commands, and 1.1x
on short pipelines; the test-suite-level effect is what this branch
exists to measure.

The fast path is gated on __CYGWIN__ and only kicks in for cases
the regular fork path would have handled with no observable
difference, so non-Windows MSYS2 environments and corner cases like
job control or signal-sensitive backgrounded interactive jobs
continue to use the existing fork()+execve() path unchanged.

pkgrel is bumped so the freshly built package is preferred over the
in-distro 5.3.009-1 when a CI job drops it on top of a git-sdk-64
minimal SDK.

Assisted-by: Claude Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…branch

The patches added in the previous commit are meant to be evaluated
against Git's own test suite on Windows, not in isolation: standalone
microbenchmarks are useful for sanity-checking individual cases, but
the question that matters is whether the spawnve() fast path and the
expr built-in measurably shrink the end-to-end runtime of the test
scripts that "prove" orchestrates on the Git for Windows CI.

Add a workflow that builds the package on a windows-latest runner via
setup-git-for-windows-sdk@v2 (full flavor, so makepkg's build
dependencies are available), uploads the resulting bash-*.pkg.tar.*,
and then prepares a minimal SDK image with the freshly built bash
dropped on top.  The image-preparation job follows the same shape as
msys2-runtime's build.yaml: download the latest successful
git-for-windows/git-sdk-64 ci-artifacts run for the minimal SDK
tarball and the git-artifacts tarball, extract the SDK to a private
directory, overlay our package by tar-extracting it (we don't have a
pacman database in the stand-alone tree, so "pacman -U" would refuse
to operate; tar extraction is what an in-place pacman install would
do anyway, modulo the pacman bookkeeping that we deliberately
exclude), sanity-check the resulting "bash --version", and repackage
as the git-sdk-x86_64-minimal artifact the downstream reusable
workflow expects.
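
As a rough illustration of the overlay step (a sketch, not the
workflow's actual commands), a package archive can be unpacked over an
extracted SDK tree with tar alone. The function name is hypothetical,
and the list of excluded files is an assumption: these are the metadata
entries pacman packages conventionally carry at the archive root.

```shell
# Overlay the package archive $1 onto the extracted SDK tree rooted at $2
# without involving pacman. The excluded names are pacman's per-package
# metadata files; the set the real workflow excludes is an assumption here.
overlay_package() {
    pkg=$1 sdk_root=$2
    tar --exclude='.PKGINFO' --exclude='.MTREE' \
        --exclude='.BUILDINFO' --exclude='.INSTALL' \
        -C "$sdk_root" -xf "$pkg"
}
```

After such an overlay, running the tree's own bash with --version is a
cheap way to confirm that the new binary is the one that got picked up.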

Hand off to git-for-windows/git-sdk-64's test-ci-artifacts.yml
reusable workflow, which fans Git's test suite out across 17 parallel
matrix shards.  This is the same harness msys2-runtime's CI uses, so
the runtime-vs-bash variable is isolated cleanly: each PR run produces
17 per-shard logs whose timings can be compared directly against
those of a vanilla-bash run on the same SDK revision.

Trigger on push/PR for bash/** and the workflow file itself so the
harness is exercised whenever the patches change, and on
workflow_dispatch for ad-hoc reruns.

Assisted-by: Claude Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>