bash: speed up the Git test suite on Windows via spawnve() and a built-in expr#288
Open
dscho wants to merge 2 commits into
Open
bash: speed up the Git test suite on Windows via spawnve() and a built-in expr#288dscho wants to merge 2 commits into
dscho wants to merge 2 commits into
Conversation
…t-in expr Git's CI runs on Windows take far longer than on Linux primarily because the test suite is a tree of shell scripts orchestrated by Perl's "prove", and on Windows both Bash and Perl must go through msys2-runtime's POSIX emulation. fork() in particular is expensive on Windows: Cygwin has to snapshot the parent's entire address space to fake POSIX copy-on-write semantics, and every external command the test suite spawns pays that cost. Comparing per-test timings between the Linux and Windows CI runs makes the pattern obvious: scripts that hammer plain external commands such as /usr/bin/expr or short "cmd | cmd" pipelines are the worst offenders, with Windows/Linux runtime ratios in the high single digits to low double digits even when the underlying work is trivial. This series, developed and benchmarked in isolation against a clean bash-5.3 source tree, attacks both factors directly. An "expr" built-in handles the test suite's many $(expr ...) invocations inside the shell process, removing the fork()+execve()+exit() round trip entirely. A spawnve()-based fast path in execute_cmd.c then sends synchronous external commands straight to CreateProcess() via Cygwin's spawnve(), bypassing fork() for the simple-command case. The fast path is grown incrementally across the series to cover pipeline stages, simple redirections (filename targets, fd dup and close, here-docs and here-strings, fd numbers >= 3), non-interactive asynchronous commands, and the awkward edge cases: a sentinel return value disambiguates "did not attempt" from "attempted and failed" so we don't print duplicate diagnostics, and an unwind-protect frame makes the whole window resilient against longjmps from OOM or fatal traps. Standalone benchmarks against the distro bash show roughly 2x on plain external commands, 1.9x on redirected commands, and 1.1x on short pipelines; the test-suite-level effect is what this branch exists to measure. The fast path is gated on __CYGWIN__ and only kicks in for cases the regular fork path would have handled with no observable difference, so non-Windows MSYS2 environments and corner cases like job control or signal-sensitive backgrounded interactive jobs continue to use the existing fork()+execve() path unchanged. pkgrel is bumped so the freshly built package is preferred over the in-distro 5.3.009-1 when a CI job drops it on top of a git-sdk-64 minimal SDK. Assisted-by: Claude Opus 4.7 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…branch The patches added in the previous commit are meant to be evaluated against Git's own test suite on Windows, not in isolation: standalone microbenchmarks are useful for sanity-checking individual cases, but the question that matters is whether the spawnve() fast path and the expr built-in measurably shrink the end-to-end runtime of the test scripts that "prove" orchestrates on the Git for Windows CI. Add a workflow that builds the package on a windows-latest runner via setup-git-for-windows-sdk@v2 (full flavor, so makepkg's build dependencies are available), uploads the resulting bash-*.pkg.tar.*, and then prepares a minimal SDK image with the freshly built bash dropped on top. The image-preparation job follows the same shape as msys2-runtime's build.yaml: download the latest successful git-for-windows/git-sdk-64 ci-artifacts run for the minimal SDK tarball and the git-artifacts tarball, extract the SDK to a private directory, overlay our package by tar-extracting it (we don't have a pacman database in the stand-alone tree, so "pacman -U" would refuse to operate; tar extraction is what an in-place pacman install would do anyway, modulo the pacman bookkeeping that we deliberately exclude), sanity-check the resulting "bash --version", and repackage as the git-sdk-x86_64-minimal artifact the downstream reusable workflow expects. Hand off to git-for-windows/git-sdk-64's test-ci-artifacts.yml reusable workflow, which fans Git's test suite out across 17 parallel matrix shards. This is the same harness msys2-runtime's CI uses, so the runtime-vs-bash variable is isolated cleanly: each PR run produces 17 per-shard logs whose timings can be compared directly against those of a vanilla-bash run on the same SDK revision. Trigger on push/PR for bash/** and the workflow file itself so the harness is exercised whenever the patches change, and on workflow_dispatch for ad-hoc reruns. Assisted-by: Claude Opus 4.7 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Git's test suite on Windows runs in roughly three to four times the wall-clock time of the same suite on Linux, and the dominant cause is
fork(). Every shell script, every helper, every$(expr ...)substitution, every shortcmd | cmdpipeline forces Cygwin's runtime to snapshot the parent's address space so that POSIX copy-on-write semantics can be emulated on top ofCreateProcess. The test suite spawns a lot of those, and that cost is paid every push togit-for-windows/git, every contributor running the suite locally on their machine before opening a PR, every retry of a flaky job in CI.This branch teaches
bashto side-stepfork()for the cases where the regular fork path would have behaved identically anyway, and to handleexprinternally rather than spawning/usr/bin/exprfor every arithmetic substitution. The cumulative payoff matters in two distinct ways: contributors get noticeably shorter iteration loops when running the test suite locally on Windows, and CI runs spend less time on a Windows runner per push. The Git for Windows CI in particular pays thefork()cost every time, and the savings compound across PRs, releases, and re-runs.Empirical evidence
Whole Git test suite, 17-way matrix, identical minimal SDK image, identical
git-artifacts, the only variable being thebashbinary:bash5.3.009bash-spawnve-baseline25742171983bashbash-spawnve-fast-path25742172112That is a 25% reduction across 878 tests. No tests fail.
A handful of tests looked like regressions in single-shot timings; the worst was
t6433-merge-toplevel.shat a 1.24x ratio. To rule out CI scheduler noise, the siblingdrill-bash-regressionworkflow re-runs a single test 20 times under each variant on the same runner in randomized paired order, swapping bothbash.exeandsh.exebetween every iteration. Fort6433-merge-toplevel.shthe result was:bashbashEvery one of the 20 paired iterations showed the patched build faster. The apparent regression in the whole-suite comparison was scheduler noise; the patched build has no code path that adds work over the baseline by mechanism, only paths that skip it. The same harness is available for any other test a reviewer wants spot-checked.
The workflow that produced the whole-suite numbers ships on this branch as
.github/workflows/bash-with-git-tests.ymlso the comparison can be reproduced on demand.Scope
The fast path is strictly gated on
__CYGWIN__. Non-Windows MSYS2 environments are unaffected. The work deliberately does not touch the Cygwin runtime itself: a properposix_spawn()integration there would require driving Jeremy Drake's stalled patches to completion and is much more invasive. Deferring that keeps this change isolated to thebashpackage alone while still delivering the bulk of the available speedup.