Skip to content

Add combined AddTempRootReturningPathInfo op to speed up nix eval#15546

Open
bouk wants to merge 2 commits intoNixOS:masterfrom
bouk:bouk/combined-write-derivation-op
Open

Add combined AddTempRootReturningPathInfo op to speed up nix eval#15546
bouk wants to merge 2 commits intoNixOS:masterfrom
bouk:bouk/combined-write-derivation-op

Conversation

@bouk
Copy link
Copy Markdown
Member

@bouk bouk commented Mar 24, 2026

Motivation

writeDerivation() currently makes 2 separate daemon IPC round-trips per derivation: addTempRoot then isValidPath. For a NixOS system with thousands of derivations, this adds a lot of IPC overhead.

This adds a new Op::AddTempRootReturningPathInfo that combines both operations into a single round-trip. writeDerivation() now uses addTempRootReturningPathInfo() which for RemoteStore uses the combined op when the daemon supports it, and falls back to sequential calls on the same connection otherwise.

Context

I was profiling the nix eval of a big system we have and figured out that nix eval --read-only was taking ~9 seconds but nix eval was taking 20 seconds.

With the help of Claude I figured out that this was caused by IPC overhead, writeDerivation does two IPCs which adds up since in this case the NixOS system has +8000 derivations. On my system (Apple M3 Pro MacBook Pro, macOS 26.3.1) the overhead per IPC is about 1ms(!).

This PR combines two IPCs into one, which in my case brought nix eval from 20 seconds to 15 seconds, 25% faster! I think we could bring it down even more by putting the writeDerivations into a background queue or something so the eval path isn't dependent on the IPC latency, but that would be better as a separate PR.

Improvement

Command 2.34.2+1 This PR
nix eval --read-only 9s 9s
nix eval 20s 12.5s

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@bouk bouk requested a review from Ericson2314 as a code owner March 24, 2026 13:22
@github-actions github-actions Bot added the store Issues and pull requests concerning the Nix store label Mar 24, 2026
@edolstra
Copy link
Copy Markdown
Member

The general idea looks good to me, but it may be better to combine this with QueryPathInfo rather than IsValidPath. The former is better because it can be cached by pathInfoCache. isValidPath() can use but not populate the pathInfoCache. This is especially a problem when using lazy trees (see DeterminateSystems#157).

Note that the overhead of the daemon calls is much less an issue when using the async path writer (#13798) because then the evaluator doesn't need to wait for the calls to addTempRoot() / isValidPath().

@xokdvium
Copy link
Copy Markdown
Contributor

Yup, async writing from the evaluator is definitely on the menu. Note that this can also be applied to all the other cases where we cache things client-side (like fetching sources to the store).

@bouk
Copy link
Copy Markdown
Member Author

bouk commented Mar 24, 2026

Ah great, #13798 is exactly the next step I had in mind.

I can change this PR to combine it with QueryPathInfo, it still makes the async path writer faster.

@bouk bouk force-pushed the bouk/combined-write-derivation-op branch from 2abc777 to 221aa20 Compare March 25, 2026 10:40
@bouk bouk changed the title Add combined AddTempRootAndCheck op to speed up nix eval Add combined AddTempRootReturningPathInfo op to speed up nix eval Mar 25, 2026
@bouk bouk force-pushed the bouk/combined-write-derivation-op branch from 221aa20 to 49f7655 Compare March 25, 2026 10:42
}
// Fallback for older daemons: use the default implementation
// which calls addTempRoot + queryPathInfo separately.
return Store::addTempRootReturningPathInfo(path);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work correctly? I don't use C++ often

@bouk
Copy link
Copy Markdown
Member Author

bouk commented Mar 25, 2026

@edolstra I've amended the PR to query path info instead

@bouk bouk force-pushed the bouk/combined-write-derivation-op branch from 49f7655 to 658c555 Compare March 25, 2026 10:48
@bouk
Copy link
Copy Markdown
Member Author

bouk commented Mar 25, 2026

The ...ReturningPathInfo version seems to be even faster, now 12.5s when checking against the big NixOS system I have locally (not a very controlled benchmark tbh!)

@bouk bouk force-pushed the bouk/combined-write-derivation-op branch from 658c555 to c3b6f08 Compare March 25, 2026 16:04
@bouk
Copy link
Copy Markdown
Member Author

bouk commented Mar 25, 2026

The tests are failing after switching to AddTempRootReturningPathInfo but I don't really understand why :(

@bouk bouk marked this pull request as draft March 27, 2026 09:52
@bouk bouk force-pushed the bouk/combined-write-derivation-op branch from c3b6f08 to a785b46 Compare March 27, 2026 13:02
@bouk
Copy link
Copy Markdown
Member Author

bouk commented Mar 27, 2026

I've not been able to reproduce the CI failure locally, not sure how to proceed. Someone with deeper knowledge of nix internals would be more qualified to take a stab at this

Comment thread src/libstore/include/nix/store/store-api.hh Outdated
@xokdvium xokdvium self-assigned this Mar 27, 2026
@tomberek
Copy link
Copy Markdown
Contributor

tomberek commented Mar 29, 2026

I get the same error as CI:

nix-functional-tests> 165/208 nix-functional-tests:local-overlay-store / delete-refs                       FAIL            0.78s   exit status 1                                                            
...
      > 159/208 nix-functional-tests:local-overlay-store / build                    FAIL            0.82s   exit status 1                                                                                  
       > 165/208 nix-functional-tests:local-overlay-store / delete-refs              FAIL            0.78s   exit status 1                                                                                  
       > 172/208 nix-functional-tests:local-overlay-store / gc                       FAIL            0.78s   exit status 1                                                                                  
   

I'm pretty sure you don't run those on OSX, explaining why it didn't fail in your local case. EIther way, it seems related to being able to check for validity in both the upper and lower stores and that handling needs some care.

@bouk bouk force-pushed the bouk/combined-write-derivation-op branch 2 times, most recently from fbab903 to 72586ef Compare April 1, 2026 08:40
@bouk bouk marked this pull request as ready for review April 1, 2026 11:35
@bouk
Copy link
Copy Markdown
Member Author

bouk commented Apr 1, 2026

I've fixed CI, it seems my change unearthed a problem in queryPathInfoUncached. Please let me know if this fix is good

bouk added 2 commits April 1, 2026 14:29
writeDerivation() currently makes 2 separate daemon IPC round-trips
per derivation: addTempRoot then isValidPath. For a NixOS system with
~8265 derivations, this adds a lot of IPC overhead.

This adds a new Op::AddTempRootReturningPathInfo (opcode 48) that
combines both operations into a single round-trip.
writeDerivation() now uses addTempRootReturningPathInfo() which for
RemoteStore uses the combined op when the daemon supports it, and
falls back to sequential calls otherwise.
@bouk bouk force-pushed the bouk/combined-write-derivation-op branch from 72586ef to 31e0d1d Compare April 1, 2026 12:29
@bouk bouk requested a review from xokdvium April 1, 2026 12:30
@Ericson2314
Copy link
Copy Markdown
Member

I have talked to @xokdvium about making a write back cache store rather than introducing a bunch of new operations to accelerate eval. (See also the similar PR from Eelco.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

store Issues and pull requests concerning the Nix store

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants