various: ephemeral distributed build agents#102

Open
NotAShelf wants to merge 36 commits into main from notashelf/push-wkzslqwowkku

@NotAShelf NotAShelf commented Jun 11, 2026

Member

Kind of WIP. The main logic is done, but it still lacks polish. Each commit is self-contained enough to be reviewed individually.

Finally introduces support for ephemeral (single-session) agent runs, primarily intended for CI environments like GitHub Actions (GHA). This adds new config, CLI support, and a bit of agent/remote cleanup. We can now have robust and collision-free ephemeral agent runs for CI pipelines (starting with GHA). More forges could be supported in the future, if we decide it's valuable.

Give each workflow a dedicated, single-writer cache key. Each key now 
has exactly one writer, so the saved cache always holds the complete
closure its consumers expect.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Id26fbe1099b59d96ecad2e213d7b12946a6a6964
Since we're leaning further into remote builder setups, and a non-zero
number of users will probably think SSH builders suffice, I've decided to
do a bit of basic hardening of SSH remote builder setups. Namely:

- Set `StrictHostKeyChecking=yes`,
- Add `IdentitiesOnly`/`IdentityAgent=none`/`BatchMode`/`ConnectTimeout`,
- Add `ssh_require_host_key` to refuse unpinned builders.

Users should use proper *agents* anyway, but as long as we support
building over SSH I prefer to keep connections clean.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ic5066f2a6d980cabe0d78a9732856e916a6a6964
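
For reviewers less familiar with these OpenSSH options, the hardening listed in the commit above could look roughly like the sketch below. The option names come from the commit message; the values for `IdentitiesOnly`, `BatchMode`, and `ConnectTimeout`, and the helper name `hardened_ssh_options`, are assumptions made for illustration and may not match what the PR actually emits.

```python
from typing import List


def hardened_ssh_options(connect_timeout_secs: int = 10) -> List[str]:
    """Build `-o Key=Value` arguments for a non-interactive, pinned SSH session.

    Hypothetical helper: the option names are real OpenSSH options, but the
    concrete values (other than StrictHostKeyChecking=yes and
    IdentityAgent=none) are assumptions, not taken from the PR.
    """
    options = {
        # Refuse to connect to hosts whose key is not already known.
        "StrictHostKeyChecking": "yes",
        # Only offer explicitly configured identities.
        "IdentitiesOnly": "yes",
        # Do not consult any running ssh-agent.
        "IdentityAgent": "none",
        # Never prompt for passwords or passphrases.
        "BatchMode": "yes",
        # Give up on unreachable builders after a bounded wait.
        "ConnectTimeout": str(connect_timeout_secs),
    }
    args: List[str] = []
    for key, value in options.items():
        args += ["-o", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(["ssh"] + hardened_ssh_options() + ["builder.example"]))
```
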
Ephemeral agents mint a fresh unpersisted machine ID, take a unique
name, drain the queue under max-builds/idle/lifetime limits, then exit
instead of reconnecting. Advertise it via `AgentInfo.ephemeral`; the
runner persists ephemeral/auth_kind and prunes stale CI sessions.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ic011da0de951d6d0c5310c1b29b8f6836a6a6964
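
The drain-then-exit behaviour described in the commit above can be sketched as a small decision function. This is only an illustration: the field names, the reason strings, and the order in which the limits are checked are assumptions, and the real agent may structure this differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EphemeralLimits:
    """Hypothetical limits for one ephemeral session; None means unlimited."""
    max_builds: Optional[int] = None
    idle_timeout_secs: Optional[float] = None
    max_lifetime_secs: Optional[float] = None


def exit_reason(limits: EphemeralLimits, builds_done: int, started_at: float,
                last_activity_at: float, now: float) -> Optional[str]:
    """Return why the agent should exit, or None if it may keep draining."""
    if limits.max_builds is not None and builds_done >= limits.max_builds:
        return "max-builds"
    if (limits.max_lifetime_secs is not None
            and now - started_at >= limits.max_lifetime_secs):
        return "lifetime"
    if (limits.idle_timeout_secs is not None
            and now - last_activity_at >= limits.idle_timeout_secs):
        return "idle"
    return None
```

An agent loop would call `exit_reason` between builds and, on a non-`None` result, shut down instead of reconnecting.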
Agents may now present an OIDC JWT (e.g. a GitHub Actions ID token) in the
register authToken field instead of a bearer token. Yay! The runner verifies the
RS256 signature against the issuer's JWKS, checks iss/aud/exp/sub, and gates
on an owner/repo allowlist;

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I3f07e93b3ff526c91a7bde0f21294fb56a6a6964
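
A rough sketch of the claim checks named in the commit above, for readers who want to see the shape of the gate. It deliberately omits the RS256/JWKS signature verification, which must succeed before any claim is trusted. Matching the allowlist against GitHub's `repository` claim, and the function and parameter names, are assumptions for illustration.

```python
import time
from typing import Any, Dict, Iterable, Optional

# Issuer used by GitHub Actions ID tokens.
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def check_claims(claims: Dict[str, Any], *, issuer: str, audience: str,
                 allowed_repos: Iterable[str],
                 now: Optional[float] = None) -> str:
    """Validate already signature-verified claims; return the owner/repo.

    Hypothetical helper: raises ValueError on the first failed check.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise ValueError("audience mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise ValueError("token expired or missing exp")
    sub = claims.get("sub")
    if not isinstance(sub, str) or not sub:
        raise ValueError("missing subject")
    repo = claims.get("repository")
    if repo not in set(allowed_repos):
        raise ValueError("repository not on the allowlist")
    return repo
```
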
@NotAShelf NotAShelf force-pushed the notashelf/push-wkzslqwowkku branch from 59e8fe2 to 18f07fa Compare June 11, 2026 20:55
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Id733430e7c3a3c2f4fac8f0addd18c476a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I7ff05f5c55dbae038e15af6aef230d336a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I554ed9e8934a12f836d966178d14c22a6a6a6964
…ow dispatch

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I425fd270e79cd3a5131c83d4e64c03f56a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I38f83da9b762f0f2e413e86759f498d66a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Icab25eb763e12f94d47a0078ac8483326a6a6964
@NotAShelf NotAShelf marked this pull request as ready for review June 12, 2026 07:37
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I143ffdfe0481679a4781f26f0f8c07fe6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I72e4592cc33ca4b073a9c3bd306be0796a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Iffbc49849945a209f81655a8c94027f86a6a6964
@NotAShelf NotAShelf changed the title from "ephemeral agents" to "various: ephemeral distributed build agents" Jun 12, 2026
NotAShelf added 11 commits June 12, 2026 12:09
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ic607145b08061b71b30f076b38c94c606a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I66684e6558c9103bd483f1718f2b01d66a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I233c61041a8b48d4d379cc0171172ad56a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I99354146c80574403668afee357f5d3c6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ic41cb7f76f29b27a8e7eec61427b91596a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I6cdee25234a1d4c9949df5e1cf32efad6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ia8019cd9dc2c1a6756fc9420713e0ac06a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ifb94186d354f264fbf4afbfae01b8b876a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I3ae6d4c08a9d058eca3227b55acdc34d6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I5123344b54d6324890b6c56448de163d6a6a6964
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: If6d7dd7bab338750c08fe4427f6de3986a6a6964
@NotAShelf NotAShelf force-pushed the notashelf/push-wkzslqwowkku branch 2 times, most recently from 7d84c5a to 2f53407 Compare June 15, 2026 06:11
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I4f20cd9cbe4bfdd9b90f65da57642ff86a6a6964
NotAShelf and others added 11 commits June 15, 2026 09:11
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Iccd209c1b8e77a9de672cf91dc8207166a6a6964
Only the build drv outputs may be presigned, so an agent cannot sign a
narinfo for an arbitrary input-addressed path. Fix the fingerprint to
the canonical 1-prefixed sha256 base32 form with sorted refs and bound
the verify stream.
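
As a reminder of what the fingerprint looks like, here is a minimal sketch based on my understanding of Nix's narinfo signing format: a `1;`-prefixed string of store path, `sha256:`-prefixed base32 NAR hash, NAR size, and comma-joined sorted references. The function name and the strictness of the validation are assumptions, not the PR's code.

```python
from typing import Iterable

# Nix's base32 alphabet omits the letters e, o, t and u.
NIX_BASE32 = set("0123456789abcdfghijklmnpqrsvwxyz")


def narinfo_fingerprint(store_path: str, nar_hash: str, nar_size: int,
                        references: Iterable[str]) -> str:
    """Build the string that a narinfo signature is computed over.

    `nar_hash` must look like "sha256:<52 Nix-base32 chars>";
    `references` are full store paths.
    """
    algo, _, digest = nar_hash.partition(":")
    if algo != "sha256" or len(digest) != 52 or not set(digest) <= NIX_BASE32:
        raise ValueError("nar_hash is not a base32 sha256 hash")
    # Sort (and de-duplicate) so every signer produces identical bytes.
    refs = ",".join(sorted(set(references)))
    return f"1;{store_path};{nar_hash};{nar_size};{refs}"
```
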
Cap one streamed import at `MAX_IMPORT_TOTAL_BYTES` so a trusted-ref
agent cannot push unbounded bytes into the runner store.
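
The import cap can be pictured as a bounded stream copy, sketched below. Only the name `MAX_IMPORT_TOTAL_BYTES` comes from the commit message; its value here, the helper name, and the chunk size are placeholders chosen for illustration.

```python
import io
from typing import BinaryIO

# Placeholder value; the real constant's value is not stated in this PR text.
MAX_IMPORT_TOTAL_BYTES = 1 << 30


def copy_bounded(src: BinaryIO, dst: BinaryIO,
                 limit: int = MAX_IMPORT_TOTAL_BYTES,
                 chunk_size: int = 64 * 1024) -> int:
    """Copy src to dst, aborting as soon as more than `limit` bytes arrive."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)
        if total > limit:
            # Fail before writing the chunk that crosses the limit.
            raise ValueError(f"import exceeds {limit} bytes")
        dst.write(chunk)


if __name__ == "__main__":
    out = io.BytesIO()
    print(copy_bounded(io.BytesIO(b"x" * 100), out, limit=100))
```

Counting while streaming, rather than trusting a declared length, means the limit holds even if the sender misreports the size.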
Require `rpc.tls` when `auth_tokens` or oidc are set, and force the gha
`runner_url` onto `circus+tls`, so credentials are never sent in
cleartext.
Floor JWKS refreshes against unknown-kid floods, pin the discovered
`jwks_uri` to the issuer origin, and derive agent ephemerality from the
OIDC identity rather than the agent flag.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Id97cbb2916646a1d03da1cbcc7756e546a6a6964
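
Two of the JWKS measures from the commit above, pinning `jwks_uri` to the issuer origin and flooring refreshes, can be sketched as follows. The class and function names and the 60-second default are assumptions; the sketch also requires `https`, which the commit message does not state explicitly.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"https": 443, "http": 80}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return (scheme, host, port) with the default port filled in."""
    parts = urlsplit(url)
    port = parts.port or _DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, (parts.hostname or "").lower(), port)


def jwks_uri_is_pinned(issuer: str, jwks_uri: str) -> bool:
    """Accept a discovered jwks_uri only on the issuer's own https origin."""
    return urlsplit(jwks_uri).scheme == "https" and origin(issuer) == origin(jwks_uri)


class JwksRefreshGate:
    """Allow at most one JWKS refresh per `min_interval_secs`.

    This keeps a flood of tokens with unknown `kid` values from turning
    into a flood of outbound fetches.
    """

    def __init__(self, min_interval_secs: float = 60.0) -> None:
        self.min_interval_secs = min_interval_secs
        self._last_refresh: Optional[float] = None

    def may_refresh(self, now: float) -> bool:
        if (self._last_refresh is not None
                and now - self._last_refresh < self.min_interval_secs):
            return False
        self._last_refresh = now
        return True
```
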
Require the host-key type at index 0 or 1 and reject a leading comment
token.
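
One way to read the host-key rule above is sketched below, assuming an entry is either `<key-type> <base64>` or `<host> <key-type> <base64>`. The accepted entry shapes, the function name, and the exact set of key types are assumptions; the PR may accept a different set.

```python
# Standard OpenSSH public key type names (an assumed allowlist).
KNOWN_KEY_TYPES = frozenset({
    "ssh-ed25519",
    "ssh-rsa",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
    "sk-ssh-ed25519@openssh.com",
    "sk-ecdsa-sha2-nistp256@openssh.com",
})


def key_type_index(entry: str) -> int:
    """Return 0 or 1, the position of the key type in `entry`.

    Raises ValueError for empty entries, entries that start with a
    comment token, and entries whose key type is missing or elsewhere.
    """
    tokens = entry.split()
    if not tokens or tokens[0].startswith("#"):
        raise ValueError("empty or comment-led host key entry")
    for index in (0, 1):
        if index < len(tokens) and tokens[index] in KNOWN_KEY_TYPES:
            if index + 1 >= len(tokens):
                raise ValueError("key type is not followed by key data")
            return index
    raise ValueError("no known key type at index 0 or 1")
```
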
Apply the `ssh_require_host_key` gate in `reserve_venue` so it matches
`try_remote_build`. Reject whitespace in `NIX_SSHOPTS` paths and stop
double-bracketing IPv6 `known_hosts` entries.
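
For context on the double-bracketing fix: OpenSSH writes `known_hosts` host fields as a bare host on the default port and as `[host]:port` otherwise. A minimal sketch that tolerates input that is already bracketed might look like this; the function name is hypothetical.

```python
def known_hosts_host(host: str, port: int = 22) -> str:
    """Format the host field of a known_hosts entry.

    Accepts IPv6 literals with or without surrounding brackets and never
    emits nested brackets such as "[[::1]]:2222".
    """
    # Strip one pair of brackets if the caller already supplied them.
    bare = host[1:-1] if host.startswith("[") and host.endswith("]") else host
    return bare if port == 22 else f"[{bare}]:{port}"
```
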
Subtract already-registered agents from the inflight count so a
launched runner is not double-counted. Add a total order to slot
matching, cooldown on dispatch failure, client timeouts, and a
`machine_id` tie-break.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ic35307d8e625475ba8e2d67cca255bad6a6a6964
Also prune connected rows whose `last_seen` is stale, since a
force-killed runner never flips connected to false.
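
The pruning rule above can be sketched as a filter over session rows. The row shape (dicts with `machine_id`, `ephemeral`, `connected`, `last_seen`) and the staleness threshold are assumptions made for illustration, not the runner's actual schema.

```python
from typing import Any, Dict, Iterable, List


def stale_session_ids(rows: Iterable[Dict[str, Any]], now: float,
                      stale_after_secs: float) -> List[str]:
    """Return machine IDs of ephemeral sessions that should be pruned.

    A session is pruned if it is disconnected, or if it still claims to be
    connected but has not been seen for `stale_after_secs` (for example a
    force-killed runner that never reported a disconnect).
    """
    pruned = []
    for row in rows:
        if not row["ephemeral"]:
            continue
        is_stale = now - row["last_seen"] >= stale_after_secs
        if not row["connected"] or is_stale:
            pruned.append(row["machine_id"])
    return pruned
```
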
Reserve against `max_builds` at assign time so an agent cannot accept
one past the cap.
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I4d96dda2d3c75a1fc78102a1d9eeeb746a6a6964
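
Reserving against `max_builds` at assign time, as the commit above describes, amounts to counting a build when it is handed out rather than when it finishes. A minimal sketch, with a hypothetical class name and locking strategy:

```python
import threading
from typing import Optional


class BuildBudget:
    """Track how many builds an agent has been assigned in this session."""

    def __init__(self, max_builds: Optional[int]) -> None:
        self._max_builds = max_builds  # None means unlimited
        self._reserved = 0
        self._lock = threading.Lock()

    def try_reserve(self) -> bool:
        """Atomically claim one slot; return False once the cap is reached."""
        with self._lock:
            if self._max_builds is not None and self._reserved >= self._max_builds:
                return False
            self._reserved += 1
            return True
```

Because the check and the increment happen under one lock, two concurrent assignments cannot both take the last slot.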
@NotAShelf NotAShelf force-pushed the notashelf/push-wkzslqwowkku branch from 2f53407 to 78e3612 Compare June 15, 2026 06:12