gix-pack: dynamic work queue for parallel delta resolution#2592
gix-pack: dynamic work queue for parallel delta resolution#2592Vinaya Mandke (vmandke) wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 325b86c31c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".
| size.fetch_add(base_bytes.len(), Ordering::Relaxed); | ||
| } | ||
| } | ||
| queue.finish_item(); |
There was a problem hiding this comment.
Make queue accounting panic-safe
When a worker panics after queue.pop() has incremented in_progress—for example from the existing delta-size assert_eq! on malformed input or from a panicking inspector—this explicit finish_item() is skipped. Any other workers waiting in pop() then see an empty queue with in_progress > 0 and block forever, so the scoped join cannot unwind; use a guard or otherwise ensure the counter is decremented on panic paths.
Useful? React with 👍 / 👎.
…a resolution Replaces: resolve::deltas Modifies: resolve::delta_mt Previously each thread was assigned a fixed subtree via in_parallel_with_slice, causing load imbalance when one root had a deep chain of deltas. Thread synchronization also used a busy wait of poll_interval. This PR rewrites deltas_mt to use a Workqueue with Condvar blocking, so all threads pull work evenly regardless of tree shape. Co-authored-by: Claude Code
325b86c to
c308380
Compare
|
edit:: updated commit message, as this is not a work stealing queue, rather a global work queue |
|
Thanks a lot! Without looking at the code in depth, I just had it run benchmarks to see what the algorithm change does given different and somewhat extreme packs. My takeaway is that the user time of the new algorithm is probably more similar to Git because it will use threads more evenly, but these threads duplicate more work as well. So the idea behind the implementation is that the work is organized so that no duplicate work exists. For now, however, I would be ok to use more memory if it's in the service to ultimately doing less work. To do that, the previous implementation already implemented something like a work stealing model, but apparently the heuristics didn't make it kick in for a PHPstan. Maybe this can be fixed, or maybe the algorithm behind that can be improved to allow incorporating threads not only per root, but let idle threads work together with other threads that still have work to do. If this works, then a follow up could deal with reducing the memory footprint. There should be a lot of potential, given that right now it keeps in memory all the bases in the delta tree, even though many of these at some point will not be used anymore. And from what we know, apparently this could halve the memory usage in case of PHPstan. Please note that I put this PR back into draft while the details of the algorithm are still not figured out. Codex Benchmark Results
Pack Verification Performance ReportCompared this branch’s Binaries:
Methodology:
Linux Fixture PackPack index: Pack size: about Runtime
Detailed gix passes:
On this pack, the branch is about Peak RSS
The branch uses about PHPStan PackPack index: Pack size: about Runtime
Detailed gix passes:
On this pack, the branch is about Peak RSS
The branch uses about Summary
Overall, this branch is strongly beneficial for the PHPStan pack, both in wall-clock time and peak RSS. On the Linux fixture, it reduces peak RSS modestly but regresses runtime compared to mainline gix, with notably higher system CPU time. |
|
Yes, I tried to work out something similar to how Git was parallelising. And I was only looking at the phpstan pack :) Will benchmark this more, and look into how the previous implementation can be fixed. Thanks for the review. Will get back with something by next week. |
|
Thanks. Meantime, I also added Git as a baseline. Overall, Gitoxide is much faster in mainline and in this branch, but it uses significantly more memory. However, my feeling is that it would be better to optimize the current algorithm to also deal with PHP Stan and similar packs. It seems more like a bug in the current implementation than something that needs a complete change to be more like Git. And in the second step, clearly, Gitoxide needs an algorithm that is just like the one in Git to reduce memory usage. Despite Git being much slower, it's very respectable how small the memory footprint is that it needs to handle these large packs. There's clearly a trade-off and I personally would always want it to be fast but we'd want to offer a choice here, and with it, a more balanced algorithm. With all that said, it also seems that this branch already contains an implementation that is similar to Git, given It's memory footprint, and all that speaks against it, a proper review outstanding is that it is slower on the Linux pack. So maybe what we really want is to implement this as alternative and further reduce its memory footprint instead. I will leave it to you. |
Fixes #2424
Replaces: resolve::deltas
Modifies: resolve::delta_mt
Previously each thread was assigned a fixed subtree via in_parallel_with_slice, causing load imbalance
when one root had a deep chain of deltas. Thread synchronization also used a busy wait of poll_interval.
This PR rewrites deltas_mt to use a Workqueue with Condvar blocking, so all threads pull work evenly
regardless of tree shape.
Timings
--- gix main (in_parallel_with_slice) ---
/tmp/gix-main free pack index create --pack-path /Users/vmandke/phpstan.pack /tmp/bench2-main
index: bf450685c28fb13ad44fcafe90a5401b7b54141a
pack: 73555a206b95a6590d30e8671eacb5bd4ff55a70
real 2m36.613s
user 6m13.277s
sys 0m15.543s
--- gix fix (work-stealing queue) ---
/tmp/gix-fix free pack index create --pack-path /Users/vmandke/phpstan.pack /tmp/bench2-fix
index: bf450685c28fb13ad44fcafe90a5401b7b54141a
pack: 73555a206b95a6590d30e8671eacb5bd4ff55a70
real 1m14.710s
user 6m43.032s
sys 0m10.444s
Profiling
Gix on Main
Gix on Fix
Git on same pack
AI Usage disclaimer
Used Claude Code for benchmark scripting, fixes and cleanup.
Also used for understanding both Gitoxide and Git codebases.