Improve performance of status on windows#2547
Improve performance of status on windows#2547Special Bread (special-bread) wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c27c3aebe0
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".
| pub fn normalize_path(path: &[u8]) -> BString { | ||
| use bstr::ByteSlice; | ||
| path.to_str_lossy().to_lowercase().into() |
There was a problem hiding this comment.
Respect case-sensitive worktrees in metadata keys
Do not lowercase every cache key unconditionally here: on Windows repositories living in case-sensitive directories (e.g. per-directory case sensitivity enabled, typically with core.ignoreCase=false), distinct tracked paths like Foo.txt and foo.txt collapse to the same key and one entry overwrites the other. index_as_worktree then reads the wrong cached stat for at least one file, which can misreport tracked-file status (clean/modified/removed) instead of merely causing a cache miss.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Shameful display on my part, I believe this may be left over from a different solution. There are various combinations of the folder on windows being case sensitive/not and there being collisions with the entries in the git tree/not, so either there is some mapping on what to do for each of the cases, or the cache should be case sensitive where any miss results in a miss, which slows down the status a tiny bit, but also maintains correctness. I believe that the worst case is if every path has incorrect casing between tree and disk, which would be strange and in practice I found it to be only a handful of files across multiple repos, with the typical case being zero. In the worst case this should revert back to the original performance, so not being slower than it used to be.
Ill make a commit that addresses this here
|
My apologies for the late response. I guess that I thought I would wait until CI is green. But maybe it makes sense to align and see what your plans are with the PR. Meantime, let me put it back to draft and you can bring it out of draft once you think it's ready for review. |
|
No worries at all, I too am very busy! This is me trying to squeeze it out since there were simply too many things and it was a shame to let the work sit without making it upstream. If anything I should be apologizing for taking so long! (Edit - I fixed a doc link didnt resolve on non windows machines and reran cargo fmt, but otherwise it seemed like previous tests were mostly passing, but a few failed to load and so failed overall, and now rerunning them without changing any of the code seems to make them all pass) I think that some general guidance would be helpful on whether we want to land a minimal scope first (i.e. this) which matches to libgit performance, or look into more invasive changes which will get more speed. Also on the minimal implementation it is also worth asking the question if its a bit much even in the current state as perhaps the idea of the cache should be cleaned up into a windows only preprocessing step for the status, mostly the same thing, but the idea of maybe-later caching seems perhaps more work than its work since invalidating the cache is as expensive as recreating one in many cases. |
71f7b63 to
64cdbf4
Compare
|
I'd definitely think that as a first pass, this PR doesn't need to get larger, so if you are happy with the Windows speedups, we should work on cashing those in.
Then let's not expose it, and if you want, even keep it Windows only. I am running some performance tests right now, so let's see.
There is a untracked-cache in the Git index which I believe is at least related, and #2503 might be worth a look. If you think it's related, I could prioritize its review - it's a biggie and of unknown quality (i.e. it might be faster to re-prompt it), but maybe it's good to get additional work off the ground more quickly or prevents reinventing something. Codex says it's related, but I don't know how easy it is for you to evaluate beyond trying the other PR in isolation. Codex
The cache is used only in the tracked-file status path. Call path:
So yes, it affects Windows directly, and only Windows. The module, fields, API methods, and cache plumbing are behind PR #2503 overlaps conceptually but not at the same layer:
They are complementary in status: PR #2503 speeds the dirwalk/untracked side; this branch speeds tracked-file metadata checks, especially on Windows. There is likely merge overlap, though. Both touch status plumbing and outcomes/tests, and both are sensitive to stat semantics. PR #2503 also fixes index ctime/mtime decoding, which is adjacent to this branch’s Windows cached-stat comparison. I would treat them as interacting PRs: separate goals, compatible design, but expect conflicts or review coordination around status outcomes, stat comparison correctness, and cache invalidation semantics.
Yes, let's not rush this.
It can't hurt the let people try the PR! I hope my answers help somewhat, and my plan is to wait until it comes out of draft for a proper review. Meantime, here is some generated info at your discretion (I just skimmed it). Codex
Review outcome: changes requested. I confirmed the P2 correctness issue: Windows directory reparse points can be cached with both Performance note: these benchmarks ran on macOS, so they do not exercise the Windows-only metadata cache path gated by Environment:
Compared binaries:
Repos selected from
Command shape: hyperfine --style basic --warmup 2 --min-runs 10 \
--command-name "main <repo>" "/tmp/gitoxide-bench-bins/gix-main-8af2691270 --no-verbose -r '<repo-path>' status --no-write > /dev/null" \
--command-name "head <repo>" "/tmp/gitoxide-bench-bins/gix-head-64cdbf47e1 --no-verbose -r '<repo-path>' status --no-write > /dev/null"Results:
Conclusion: no meaningful non-Windows performance regression showed up across the selected repo sizes. A Windows benchmark is still needed to validate the intended metadata-cache speedup and to verify the corrected symlink/reparse-point semantics once fixed. |
There was a problem hiding this comment.
Pull request overview
This PR introduces a Windows-only metadata cache for status to avoid per-file lstat calls (slow on Windows) by pre-enumerating directories and reusing the collected metadata during tracked-file checks, aiming to bring gix status performance closer to libgit2/Git for Windows.
Changes:
- Add a Windows-only
gix_status::metadata_cachemodule that walks the worktree via Windows APIs and builds a path→metadata map. - Thread the optional metadata cache through
gixstatus configuration (MetadataCacheConfig) and into thegix-statusindex-as-worktree pipeline. - Extend mode-change logic to accept pre-extracted file-type/permission bits (to support cached metadata).
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| gix/src/status/platform.rs | Adds Windows-only builder methods to provide/disable/prepare the metadata cache from the status() platform. |
| gix/src/status/mod.rs | Introduces MetadataCacheConfig and a Windows-only helper to build the cache (gitignore-aware). |
| gix/src/status/iter/mod.rs | Wires MetadataCacheConfig into iterator construction and passes cache refs into index-worktree status. |
| gix/src/status/index_worktree.rs | Extends index_worktree_status() signature/context to accept an optional Windows metadata cache. |
| gix-status/src/lib.rs | Exposes the Windows-only metadata_cache module and re-exports its types. |
| gix-status/src/metadata_cache.rs | Implements the cache builder and Windows directory enumeration walker. |
| gix-status/src/index_as_worktree/types.rs | Adds an optional Windows-only metadata_cache field to the tracked-modifications context. |
| gix-status/src/index_as_worktree/function.rs | Consults the cache on Windows before falling back to live syscalls; adapts mode/stat extraction to cached metadata. |
| gix-status/src/index_as_worktree_with_renames/types.rs | Threads the optional cache through the “with renames” context on Windows. |
| gix-status/src/index_as_worktree_with_renames/mod.rs | Passes the cache into the underlying tracked-modifications status call on Windows. |
| gix-status/tests/status/index_as_worktree.rs | Updates test contexts to include the new Windows-only metadata_cache field. |
| gix-status/tests/status/index_as_worktree_with_renames.rs | Updates test contexts to include the new Windows-only metadata_cache field. |
| gix-status/Cargo.toml | Adds hashbrown and Windows-only windows-sys dependency for the cache implementation. |
| gix-index/src/entry/mode.rs | Adds change_to_match_fs_with_values() to reuse mode-change logic with cached metadata bits. |
| Cargo.lock | Records new dependency resolutions for added crates/features. |
| thread_limit: Option<usize>, | ||
| ) -> Result<gix_status::MetadataCache, crate::status::index_worktree::Error> { | ||
| let workdir = repo | ||
| .workdir() | ||
| .ok_or(crate::status::index_worktree::Error::MissingWorkDir)?; | ||
| let sync_repo = repo.clone().into_sync(); | ||
| let index = repo.index_or_empty()?; | ||
| let index_state: &gix_index::State = &index; |
Follow up to git status performance improvement, this fixes an edge case where a case sensitive entry in the cache gets lowercased and matches a second case sensitive entry in the tree, potentially resulting in incorrect git status entries. Skipping lowercasing entirely results in those cases being a cache miss instead making it more transparent.
64cdbf4 to
45fc1e0
Compare
45fc1e0 to
0efcd29
Compare
|
I cleaned up the code to be more minimal and instead of styling it as a cache its now styled as a windows only preprocessing step. There is still a parameter to run status without the cache but that is intended mostly for regression tests and such. In theory there is a tiny amount of overhead due to starting the walk first, then doing git status afterwards. But in practice in every repo with more than a few dozen files in it should be faster since your directory wide walk will capture those in one go as opposed to multiple single file queries Re untracked caches: I admit i didnt comb through the other pr, so this is just my thoughts on those: In my experience on windows the main slow down is the lstat approach to querying files, and untracked cache doesnt fix that, so it can only provide a modest speedup, and in my experience never helped that much. I believe that you can get a really fast status without it on windows The linux story for that is nicer - where in linux you dont get any speedup from directory wide queries, but I believe that you can use the untracked cache to skip some comparisons if you have a bunch of files that are untracked, and directories too. Also you can swap some queries for cheaper queries if you rely on it, so thats nice. But im not on a linux machine so this is mostly theory not practice. An untracked cache is additive on top of a good status implementation, and with windows in particular there is a bit of tension since the fastest way to walk is by directory anyway so it does little good to have an untracked cache, which mostly only works for where entire directories are untracked so you can return early. For that reason in my custom implementation I didnt add support for that since it was more complexity and didnt really help. I havent looked super deeply into why the current implementation is 350ms and not 100ms for example, but I imagine that those speedups would carry over to linux too, since that would be touching the core status implementation, not just adding a windows preprocessing step. So to speed things up more i would focus on that next, i.e. making sure that the status is fast without an untracked cache first, then once that is fast, looking at what the cache can do - i.e. -> make the current code strong first, then add complexity after Also to manage expectations a bit - I am sadly amazingly busy so wont be able to devote my time to a full rewrite of status (which is also terrifying since this is public code and must be completely correct and there are lots of edge cases to status!) Hope this helps :> |
This creates a cache of file metadata that is then prefilled by windows API calls to allow per-directory walking instead of per file. As a result performance is much faster.
The cache method is made to minimise the surface area of the change, and is also windows-only where other targets should be unaffected.
Testing status on the linux repo improves speed from ~1000ms to ~300ms - putting this to be roughly on par with libgit2. Faster speeds are possible but would require larger changes, so this is an initial pass while avoiding doing too much.
Additional things to consider and discuss perhaps:
Given that this is a common piece of functionality I would love for someone else to test this too, I myself have been embarrassingly busy recently so this PR cooked for a while, and I may have missed some stuff while working on it on and off.