various: ephemeral distributed build agents #102
Open
NotAShelf wants to merge 36 commits into
Give each workflow a dedicated, single-writer cache key. Each key now has exactly one writer, so the saved cache always holds the complete closure its consumers expect. Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Id26fbe1099b59d96ecad2e213d7b12946a6a6964
Since we're leaning further into remote builder setups, and a non-zero number of users will probably think SSH builders suffice, I've decided to do some basic hardening of SSH remote builder setups. Namely:
- StrictHostKeyChecking=yes,
- Add IdentitiesOnly/IdentityAgent=none/BatchMode/ConnectTimeout,
- Add ssh_require_host_key to refuse unpinned builders.
Users should use proper *agents* anyway, but as long as we support building over SSH I prefer to keep connections clean. Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ic5066f2a6d980cabe0d78a9732856e916a6a6964
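For reviewers who want the hardening above in concrete terms, here is a minimal Python sketch of how those options could be assembled into `ssh -o` arguments. It is not the project's code: `build_ssh_options`, `UnpinnedBuilderError`, and the 10-second default timeout are illustrative assumptions. Only the OpenSSH option names and the `ssh_require_host_key` behaviour come from the commit message.

```python
from typing import List, Optional


class UnpinnedBuilderError(Exception):
    """Raised when a builder has no pinned host key but one is required."""


def build_ssh_options(
    pinned_host_key: Optional[str],
    require_host_key: bool,
    connect_timeout_secs: int = 10,  # assumed default, not taken from the PR
) -> List[str]:
    """Return `-o` arguments for a hardened, non-interactive SSH connection."""
    if require_host_key and not pinned_host_key:
        # Mirrors the intent of `ssh_require_host_key`: refuse unpinned builders.
        raise UnpinnedBuilderError("builder has no pinned host key")

    options = {
        "StrictHostKeyChecking": "yes",  # never auto-accept unknown host keys
        "IdentitiesOnly": "yes",         # only offer explicitly configured keys
        "IdentityAgent": "none",         # do not consult a running ssh-agent
        "BatchMode": "yes",              # fail instead of prompting for input
        "ConnectTimeout": str(connect_timeout_secs),
    }
    args: List[str] = []
    for key, value in options.items():
        args += ["-o", f"{key}={value}"]
    return args
```

With `require_host_key=True` and no pinned key, the function raises before any connection is attempted, which is the "refuse unpinned builders" behaviour in miniature.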
Ephemeral agents mint a fresh unpersisted machine ID, take a unique name, drain the queue under max-builds/idle/lifetime limits, then exit instead of reconnecting. Advertise it via `AgentInfo.ephemeral`; the runner persists ephemeral/auth_kind and prunes stale CI sessions. Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ic011da0de951d6d0c5310c1b29b8f6836a6a6964
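The lifecycle described in this commit can be sketched roughly as follows, in Python for brevity. This is an assumption-laden illustration rather than the agent's actual loop: `EphemeralLimits`, `run_ephemeral`, the `ephemeral-<id>` naming scheme, and the injected `clock`/`build` callables are all hypothetical, and a real agent would block on the queue instead of polling.

```python
import uuid
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Tuple


@dataclass
class EphemeralLimits:
    max_builds: int
    idle_timeout_secs: float
    lifetime_secs: float


def run_ephemeral(
    queue: Deque[str],
    limits: EphemeralLimits,
    clock: Callable[[], float],
    build: Callable[[str], None],
) -> Tuple[str, str]:
    """Drain `queue` until a limit is hit; return (agent name, exit reason)."""
    machine_id = uuid.uuid4().hex         # fresh ID, never written to disk
    name = f"ephemeral-{machine_id[:8]}"  # hypothetical unique-name scheme
    started = last_active = clock()
    done = 0
    while True:
        now = clock()
        if done >= limits.max_builds:
            return name, "max-builds"
        if now - started >= limits.lifetime_secs:
            return name, "lifetime"
        if queue:
            build(queue.popleft())
            done += 1
            last_active = clock()
        elif now - last_active >= limits.idle_timeout_secs:
            return name, "idle"
        # A real agent would wait for work here rather than spin.
```

The important property is that every exit path returns instead of reconnecting, so the process ends once any one of the three limits is reached.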
Agents may now present an OIDC JWT (e.g. a GitHub Actions ID token) in the register authToken field instead of a bearer token. Yay! The runner verifies the RS256 signature against the issuer's JWKS, checks iss/aud/exp/sub, and gates on an owner/repo allowlist. Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I3f07e93b3ff526c91a7bde0f21294fb56a6a6964
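As a rough illustration of the claim checks listed above, the following Python sketch validates a claims dictionary that is assumed to have already passed RS256 signature verification (that step needs a JWT library and is deliberately left out). `check_claims` and `ClaimError` are hypothetical names. The `repository` claim holding the `owner/repo` slug is how GitHub Actions ID tokens are shaped; whether the runner reads that exact claim is an assumption.

```python
import time
from typing import Any, Dict, Iterable, Optional


class ClaimError(Exception):
    """Raised when a verified token still fails a policy check."""


def check_claims(
    claims: Dict[str, Any],
    *,
    issuer: str,
    audience: str,
    allowed_repos: Iterable[str],
    now: Optional[float] = None,
) -> str:
    """Validate signature-verified claims and return the `sub` claim."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ClaimError("unexpected issuer")
    aud = claims.get("aud")
    # Per the JWT spec, `aud` may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise ClaimError("unexpected audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ClaimError("token expired or missing exp")
    sub = claims.get("sub")
    if not isinstance(sub, str) or not sub:
        raise ClaimError("missing sub")
    # Assumption: the allowlist is matched against the "owner/repo" slug.
    if claims.get("repository") not in set(allowed_repos):
        raise ClaimError("repository not in allowlist")
    return sub
```

Keeping the policy checks separate from signature verification makes each allowlist decision easy to test without generating real keys.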
59e8fe2 to 18f07fa
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Id733430e7c3a3c2f4fac8f0addd18c476a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I7ff05f5c55dbae038e15af6aef230d336a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I554ed9e8934a12f836d966178d14c22a6a6a6964
…ow dispatch Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I425fd270e79cd3a5131c83d4e64c03f56a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I38f83da9b762f0f2e413e86759f498d66a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Icab25eb763e12f94d47a0078ac8483326a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I143ffdfe0481679a4781f26f0f8c07fe6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I72e4592cc33ca4b073a9c3bd306be0796a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Iffbc49849945a209f81655a8c94027f86a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ic607145b08061b71b30f076b38c94c606a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I66684e6558c9103bd483f1718f2b01d66a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I233c61041a8b48d4d379cc0171172ad56a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I99354146c80574403668afee357f5d3c6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ic41cb7f76f29b27a8e7eec61427b91596a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I6cdee25234a1d4c9949df5e1cf32efad6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ia8019cd9dc2c1a6756fc9420713e0ac06a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ifb94186d354f264fbf4afbfae01b8b876a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I3ae6d4c08a9d058eca3227b55acdc34d6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I5123344b54d6324890b6c56448de163d6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: If6d7dd7bab338750c08fe4427f6de3986a6a6964
7d84c5a to 2f53407
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I4f20cd9cbe4bfdd9b90f65da57642ff86a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Iccd209c1b8e77a9de672cf91dc8207166a6a6964
Only the build drv outputs may be presigned, so an agent cannot sign a narinfo for an arbitrary input-addressed path. Fix the fingerprint to the canonical 1-prefixed sha256 base32 form with sorted refs and bound the verify stream.
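To make the fingerprint shape concrete, here is a small Python sketch. The layout `1;<store path>;<nar hash>;<nar size>;<comma-separated refs>` follows Nix's narinfo signing format as I understand it. The function names, the regex-based hash check, and the `may_presign` helper are illustrative assumptions and not the PR's code.

```python
import re
from typing import Iterable, Set

# Nix's base32 alphabet omits e, o, t and u; a sha256 digest encodes to 52 chars.
_NIX_SHA256 = re.compile(r"sha256:[0-9abcdfghijklmnpqrsvwxyz]{52}")


def narinfo_fingerprint(
    store_path: str, nar_hash: str, nar_size: int, references: Iterable[str]
) -> str:
    """Build the string that a narinfo signature is computed over."""
    if not _NIX_SHA256.fullmatch(nar_hash):
        raise ValueError("nar_hash must be in sha256:<nix-base32> form")
    # Sorting makes the fingerprint independent of the order refs arrive in.
    refs = ",".join(sorted(references))
    return f"1;{store_path};{nar_hash};{nar_size};{refs}"


def may_presign(store_path: str, build_outputs: Set[str]) -> bool:
    """Only outputs of the build's own derivation are eligible for presigning."""
    return store_path in build_outputs
```

Because the references are sorted before joining, two agents that report the same closure in different orders produce the same fingerprint.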
Cap one streamed import at `MAX_IMPORT_TOTAL_BYTES` so a trusted-ref agent cannot push unbounded bytes into the runner store.
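A byte cap on a streamed import can be sketched as a generator that counts as it forwards chunks. This is only an illustration in Python: the constant's value below is a placeholder (the PR text does not state the real limit), and `bounded_chunks` and `ImportTooLarge` are hypothetical names.

```python
from typing import Iterable, Iterator

# Placeholder value; the actual MAX_IMPORT_TOTAL_BYTES is not given in the PR text.
MAX_IMPORT_TOTAL_BYTES = 8 * 1024 ** 3


class ImportTooLarge(Exception):
    """Raised as soon as a single import exceeds the configured cap."""


def bounded_chunks(
    chunks: Iterable[bytes], limit: int = MAX_IMPORT_TOTAL_BYTES
) -> Iterator[bytes]:
    """Yield chunks unchanged, aborting once the running total passes `limit`."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > limit:
            raise ImportTooLarge(f"import exceeded {limit} bytes")
        yield chunk
```

Checking before yielding means the chunk that crosses the limit is never handed to the store.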
Require `rpc.tls` when `auth_tokens` or OIDC are set, and force the GHA `runner_url` onto `circus+tls`, so credentials are never sent in cleartext.
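The TLS requirement amounts to a config-time validation plus a scheme rewrite. The Python sketch below assumes a hypothetical `RpcConfig` shape with `tls`, `auth_tokens`, and `oidc_issuer` fields. Only the rule itself and the `circus+tls` scheme come from the commit message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RpcConfig:  # hypothetical shape, for illustration only
    tls: bool = False
    auth_tokens: List[str] = field(default_factory=list)
    oidc_issuer: Optional[str] = None


def validate(cfg: RpcConfig) -> None:
    """Reject configs that would send credentials over a cleartext channel."""
    if (cfg.auth_tokens or cfg.oidc_issuer) and not cfg.tls:
        raise ValueError("rpc.tls is required when auth_tokens or oidc are set")


def force_tls_scheme(runner_url: str) -> str:
    """Rewrite whatever scheme the URL carries to `circus+tls`."""
    _, sep, rest = runner_url.partition("://")
    if not sep:
        raise ValueError("runner_url must include a scheme")
    return "circus+tls://" + rest
```

Failing at validation time, rather than at connect time, means a misconfigured runner never gets as far as accepting a token.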
Enforce a minimum interval between JWKS refreshes to guard against unknown-kid floods, pin the discovered `jwks_uri` to the issuer origin, and derive agent ephemerality from the OIDC identity rather than the agent flag. Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Id97cbb2916646a1d03da1cbcc7756e546a6a6964
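Two of the JWKS protections in this commit lend themselves to a short Python sketch: comparing the discovered `jwks_uri` against the issuer's origin, and refusing to refetch keys more often than a fixed interval. `same_origin` and `JwksRefreshGate` are hypothetical names, and treating a missing port as 443 and requiring `https` are assumptions of this sketch.

```python
from typing import Optional
from urllib.parse import urlsplit


def same_origin(issuer: str, jwks_uri: str) -> bool:
    """True when both URLs are https and share host and (effective) port."""
    a, b = urlsplit(issuer), urlsplit(jwks_uri)
    if a.scheme != "https" or b.scheme != "https":
        return False
    return a.hostname == b.hostname and (a.port or 443) == (b.port or 443)


class JwksRefreshGate:
    """Allow at most one JWKS refresh per `min_interval_secs`."""

    def __init__(self, min_interval_secs: float) -> None:
        self.min_interval_secs = min_interval_secs
        self._last_refresh: Optional[float] = None

    def should_refresh(self, now: float) -> bool:
        last = self._last_refresh
        if last is not None and now - last < self.min_interval_secs:
            return False  # a flood of unknown kids cannot force refetches
        self._last_refresh = now
        return True
```

With the gate in place, a token carrying an unknown `kid` triggers at most one fetch per interval, no matter how many such tokens arrive.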
Require the host-key type at index 0 or 1 and reject a leading comment token.
Apply the `ssh_require_host_key` gate in `reserve_venue` so it matches `try_remote_build`. Reject whitespace in `NIX_SSHOPTS` paths and stop double-bracketing IPv6 `known_hosts` entries.
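The host-key parsing and `known_hosts` formatting rules above might look like the following in Python. This is a sketch under assumptions: the accepted key-type set is a subset of real OpenSSH types, "comment token" is read as a token starting with `#`, and `split_host_key` and `known_hosts_host` are hypothetical helper names.

```python
from typing import Tuple

# A subset of real OpenSSH host-key types, enough for illustration.
HOST_KEY_TYPES = {
    "ssh-ed25519",
    "ssh-rsa",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def split_host_key(pinned: str) -> Tuple[str, str]:
    """Return (key_type, base64_key) from 'type key' or 'host type key'."""
    tokens = pinned.split()
    if not tokens or tokens[0].startswith("#"):
        raise ValueError("empty or commented-out host key")
    for idx in (0, 1):
        # The key type must be followed by at least one more token (the key).
        if idx < len(tokens) - 1 and tokens[idx] in HOST_KEY_TYPES:
            return tokens[idx], tokens[idx + 1]
    raise ValueError("host-key type must be the first or second token")


def known_hosts_host(host: str, port: int = 22) -> str:
    """Format a host for known_hosts without double-bracketing IPv6 literals."""
    bare = host.strip("[]")  # tolerate input that is already bracketed
    return bare if port == 22 else f"[{bare}]:{port}"
```

For example, `known_hosts_host("[::1]", 2222)` yields `[::1]:2222` rather than `[[::1]]:2222`.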
Subtract already-registered agents from the inflight count so a launched runner is not double-counted. Add a total order to slot matching, cooldown on dispatch failure, client timeouts, and a `machine_id` tie-break. Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: Ic35307d8e625475ba8e2d67cca255bad6a6a6964
Also prune connected rows whose `last_seen` is stale, since a force-killed runner never flips connected to false.
Reserve against `max_builds` at assign time so an agent cannot accept one past the cap.
Signed-off-by: NotAShelf <raf@notashelf.dev> Change-Id: I4d96dda2d3c75a1fc78102a1d9eeeb746a6a6964
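Reserving against `max_builds` at assign time, as described in the commit above, is essentially a check-and-increment under a lock. The Python sketch below shows that idea only. `BuildBudget` and `try_reserve` are hypothetical names and say nothing about how the runner actually stores the counter.

```python
import threading


class BuildBudget:
    """Tracks how many builds have been reserved against a fixed cap."""

    def __init__(self, max_builds: int) -> None:
        self._max = max_builds
        self._reserved = 0
        self._lock = threading.Lock()

    def try_reserve(self) -> bool:
        """Atomically claim one slot; return False once the cap is reached."""
        with self._lock:
            if self._reserved >= self._max:
                return False
            self._reserved += 1
            return True
```

Because the slot is claimed when the build is assigned, not when it finishes, two concurrent assignments cannot both squeeze in under the same remaining slot.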
2f53407 to 78e3612
Kind of WIP. The main logic is done, but polish is lacking. Each commit is self-contained enough to be individually reviewable.
Finally introduces support for ephemeral (single-session) agent runs, primarily intended for CI environments like GHA. New config, CLI support, and a bit of agent/remote cleanup. We can now have robust and collision-free ephemeral agent runs for CI pipelines (starting with GHA). More forges could be supported in the future, if we decide it's valuable.