feat: scaffold Foreman as an opt-in add-on (M0 + M1) #501
Merged
Defilan added a commit to Defilan/LLMKube that referenced this pull request on May 20, 2026
PR defilantech#501 shipped scaffolding (CRDs, foreman-operator, foreman-agent
Registrar, capability providers) without test coverage, on the plan that
'unit + envtest coverage lands with M2'. This commit fronts that work onto
defilantech#501 so M0+M1 ship with proper tests instead of a deferred promise.
Refs defilantech#500.
Coverage delta on the new packages:
  pkg/foreman/agent             0.0% -> 87.0%
  internal/foreman/controller   0.0% -> 85.7%
What's covered:
cmd/foreman-agent/main_test.go (stdlib testing.T):
- clampInt32: negative/zero/MaxInt32-bound/overflow paths.
- sanitizeName: DNS-1123 cleanup (lowercase, invalid-char collapse,
  leading/trailing hyphen trim, empty-and-all-invalid fallback, 63-char
  truncation, macOS '<name>.local' hostname case).
- splitCSV: empty / single / multi / whitespace / empty-entries /
  separator-only cases. Found a real inconsistency along the way: empty
  input returned nil but separator-only returned []string{}; splitCSV now
  collapses both to nil so the FleetNodeSpec.Roles and
  CapabilityOptions.InstalledModels fields see one 'absent' representation.
  No external callers depended on the distinction.
pkg/foreman/agent/capability_darwin_test.go (//go:build darwin):
- bytesToGB: zero, sub-1GB rounding, 36GB/128GB sanity, MaxInt32 edge, and
  uint64-max saturation.
- NewCapability: default-metal accelerator, explicit override honored,
  flag-supplied InstalledModels/MaxContextTokens/TokensPerSecond
  propagation.
- Live memory probe sanity: TotalRAMGB > 0 and AvailableRAMGB <= TotalRAMGB
  on a real Darwin host; skip if sysctl unavailable (CI sandbox).
pkg/foreman/agent/capability_other_test.go (//go:build !darwin):
- Stub provider propagates all flag-supplied fields.
- AvailableRAMGB == StaticTotalRAMGB in v0.1 until M4 wires up live Linux
  probing.
- Empty Accelerator is preserved (no silent default on non-darwin).
pkg/foreman/agent/fleetnode_test.go (stdlib + fake client):
- specEqual: 7 table-driven cases including role-ordering sensitivity.
- Registrar.Upsert: creates if missing; updates if spec changed; no-ops (no
  resourceVersion bump) if spec identical.
- Registrar.PatchHeartbeat: writes phase, fresh LastHeartbeatTime, full
  Capability snapshot.
- Registrar.Run: heartbeats while running; drains (phase=Draining) on ctx
  cancel; exits cleanly within 2s.
internal/foreman/controller/suite_test.go (Ginkgo + envtest):
- Mirrors internal/controller/suite_test.go: BeforeSuite starts envtest,
  loads config/crd/bases/, registers foremanv1alpha1 into scheme.
  AfterSuite tears down.
- Same getFirstFoundEnvTestBinaryDir helper for IDE-run support.
internal/foreman/controller/{agentictask,workload,fleetnode}_controller_test.go:
- Stub-smoke contracts: each M0/M1 reconciler is exercised against a real
  apiserver and must (1) return no error for missing resources,
  (2) reconcile an existing resource without erroring, (3) leave .status
  unmutated. M2 deliberately breaks the agentictask contract with a
  corresponding test update.
CI: no .github/workflows/*.yml changes needed. The existing test.yml
(.github/workflows/test.yml) runs make test, which globs the foreman
packages automatically.
Signed-off-by: Christopher Maher <chris@mahercode.io>
Foreman is an opt-in add-on layered on LLMKube that schedules agentic
workloads (Workload, AgenticTask) across a fleet of nodes (FleetNode).
Installing LLMKube alone does not install or require it.
M0 is the scaffolding milestone: types, controller stubs, operator
binary, Helm chart skeleton. The reconcilers log and return for now;
real scheduling lands in M2, the planner in M6.
New API group foreman.llmkube.dev/v1alpha1:
- Workload: the v0.1 entrypoint, a natural-language intent the planner
decomposes into AgenticTasks.
- AgenticTask: a dispatchable unit of work (issue-fix, verify, freeform),
with RequiredCapability for capability-aware scheduling.
- FleetNode: cluster-scoped registry entry the FleetAgent owns; carries
the heartbeat and the capability the scheduler matches against.
New paths:
- api/foreman/v1alpha1/          the three CRD types + groupversion_info
- internal/foreman/controller/   empty reconciler stubs (one per kind)
- cmd/foreman-operator/          the new operator binary, separate from
                                 cmd/main.go; only registers the foreman
                                 group, leader-election ID is its own
- charts/foreman/                new Helm chart, dependsOn llmkube
Core touches are surgical and inference-flow-byte-identical:
- scripts/sync-crds.sh now scopes its glob to inference.llmkube.dev_*
so foreman CRDs are not pulled into the llmkube chart.
- Makefile gains foreman-chart-crds (mirrors chart-crds for the foreman
chart). manifests / generate / chart-crds are untouched in behavior;
they still produce exactly the same inference outputs.
- config/rbac/role.yaml grows by the kubebuilder:rbac markers on the
three foreman reconcilers (auto-regenerated by make manifests).
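The effect of the narrowed glob in scripts/sync-crds.sh can be illustrated in Go with filepath.Match. This is only a sketch of the filtering idea, not the script itself (which is shell); the file names assume controller-gen's usual <group>_<plural>.yaml naming, and the inference file name in particular is hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// inferenceCRDGlob is the narrowed pattern: only CRDs in the inference group.
const inferenceCRDGlob = "inference.llmkube.dev_*.yaml"

// selectInferenceCRDs keeps the paths whose base name matches the narrowed
// glob, which is what keeps foreman CRDs out of the llmkube chart.
func selectInferenceCRDs(files []string) ([]string, error) {
	var kept []string
	for _, f := range files {
		ok, err := filepath.Match(inferenceCRDGlob, filepath.Base(f))
		if err != nil {
			return nil, err
		}
		if ok {
			kept = append(kept, f)
		}
	}
	return kept, nil
}

func main() {
	files := []string{
		"config/crd/bases/inference.llmkube.dev_models.yaml", // hypothetical name
		"config/crd/bases/foreman.llmkube.dev_fleetnodes.yaml",
		"config/crd/bases/foreman.llmkube.dev_workloads.yaml",
	}
	kept, err := selectInferenceCRDs(files)
	if err != nil {
		panic(err)
	}
	for _, f := range kept {
		fmt.Println("copy to charts/llmkube:", f)
	}
}
```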
Verification:
- make generate produces api/foreman/v1alpha1/zz_generated.deepcopy.go.
- make manifests produces the three foreman CRD YAMLs.
- make foreman-chart-crds copies them into charts/foreman/templates/crds.
- make chart-crds remains inference-only (verified: charts/llmkube/templates/crds
has only the three inference CRDs).
- make test passes the full envtest suite; no core regressions.
- make lint passes (0 issues).
- go build ./cmd/foreman-operator produces a working binary.
- kubectl apply of each foreman CRD against kind-llmkube-local accepts
a real object; the operator's three reconcilers log the reconcile and
return ctrl.Result{} as the stub design intends.
Part of the Foreman v0.1 MVP plan: M0 done; M1 (FleetNode heartbeat) next.
Signed-off-by: Christopher Maher <chris@mahercode.io>
…t (v0.1 M1)
The Foreman node-side daemon. One foreman-agent runs per fleet host. In
M1 it owns a single responsibility: keep this host's FleetNode CR
present and current so the scheduler (lands in M2) can target it.
Lifecycle:
- on startup: upsert the FleetNode (create if missing, update spec if
flag-supplied identity changed since last run);
- every --heartbeat-interval (default 30s): patch FleetNode.status with
phase=Ready, fresh lastHeartbeatTime, current capability snapshot;
- on SIGTERM/SIGINT: best-effort drain patch (phase=Draining) so the
scheduler stops dispatching to this node before the process exits.
Cross-platform:
- capability_darwin.go uses the metal-agent's existing
DarwinMemoryProvider (sysctl hw.memsize + vm_stat) so available RAM
is live, not flag-supplied. Defaults accelerator=metal.
- capability_other.go is a stub for linux/amd64 so the binary builds
cross-arch from day one. Live probing on Linux + NVIDIA lands at M4
when ShadowStack joins the fleet.
Reuse, not modification: pkg/foreman/agent imports
pkg/agent.DarwinMemoryProvider but does not touch it. The LLMKube
metal-agent's behavior is unchanged.
Flags:
--fleet-node-name, --tailscale-addr, --roles, --accelerator,
--installed-models, --max-context-tokens, --tokens-per-second,
--total-ram-gb, --heartbeat-interval, --kube-context,
--workspace-dir, --opencode-bin (last two are placeholders the M3
executor will require).
--kubeconfig is auto-registered by controller-runtime's config init.
New paths:
- pkg/foreman/agent/fleetnode.go           Registrar (Upsert/Run/PatchHeartbeat)
- pkg/foreman/agent/capability.go          CapabilityOptions
- pkg/foreman/agent/capability_darwin.go   DarwinMemoryProvider backed
- pkg/foreman/agent/capability_other.go    !darwin stub
- cmd/foreman-agent/main.go                the binary
Verification on kind-llmkube-local, --heartbeat-interval=3s, 10s run:
- kubectl get fleetnodes
  NAME     PHASE   ACCELERATOR   RAM   CURRENT TASK   HEARTBEAT   AGE
  m5-max   Ready   metal         22                   1s          10s
- status.capability.totalRAMGB=128 (live sysctl), availableRAMGB=22
(live vm_stat), installedModels=[minimax-m2-7],
maxContextTokens=131072, tokensPerSecond=47.
- 3 heartbeat patches over 10s, all successful.
- SIGTERM produced phase=Draining; agent exited cleanly.
- make test (full envtest), make lint (0 issues), go vet all clean.
Signed-off-by: Christopher Maher <chris@mahercode.io>
…y lll
CI's golangci-lint v2.4.0 on linux caught a 123-character line in the
//go:build !darwin variant of the capability test that the M5 Max local
lint missed (the file does not compile on darwin, so the darwin-side lint
never sees it). Wrapped the t.Errorf to keep all lines under the 120-char
limit.
Signed-off-by: Christopher Maher <chris@mahercode.io>
…ueAfter
controller-runtime deprecated Result.Requeue (bool) in favor of expressing
'no requeue' as RequeueAfter == 0. The neighboring
Expect(res.RequeueAfter).To(BeZero()) already covers the assertion, so
dropping the Result.Requeue check resolves SA1019 staticcheck without
changing test semantics. Caught locally via GOOS=linux golangci-lint after
the previous darwin-only run missed it; same cross-arch gotcha covered in
feedback_cross_arch_lint.md.
Signed-off-by: Christopher Maher <chris@mahercode.io>
09db359 to
662dadc
What
Scaffolds Foreman, an opt-in add-on layered on LLMKube that schedules
agentic workloads (Workload, AgenticTask) across a fleet of nodes
(FleetNode). This PR covers v0.1 milestones M0 and M1:
- M0: foreman.llmkube.dev/v1alpha1 (Workload, AgenticTask, FleetNode), empty
  reconciler stubs, foreman-operator binary, charts/foreman skeleton.
- M1: foreman-agent node-side daemon with FleetNode self-registration, 30s
  heartbeat, drain on SIGTERM. Cross-platform via build tags (darwin uses
  live sysctl + vm_stat memory probing; linux/amd64 builds via a stub until
  live probing lands in M4).
Why
Refs #500
Foreman is the fleet-scale evolution of the single-node autofix pipeline.
It is the fleet-aware control plane LLMKube's North Star always pointed at
("treat intelligence as a workload"): one layer up, from serving models to
running agentic workloads on them.
The LLMKube core stays untouched: a user who only wants Kubernetes-managed
local LLM serving installs LLMKube exactly as today, sees nothing new in
their cluster, RBAC, kubectl api-resources, or values file. Foreman is a
separate API group, a separate operator binary, a separate node-agent, a
separate Helm chart with dependsOn: llmkube. Same pattern as cert-manager /
trust-manager, Istio base / istiod, kube-prometheus-stack /
kube-state-metrics.
How
Packaging is the design choice: same repo for iteration velocity, fully
separate everything that ships.
LLMKube core | Foreman
inference.llmkube.dev/v1alpha1 | foreman.llmkube.dev/v1alpha1
cmd/main.go → llmkube-operator | cmd/foreman-operator/main.go
cmd/metal-agent (untouched) | cmd/foreman-agent (separate)
api/v1alpha1/, internal/controller/, pkg/agent/ | api/foreman/v1alpha1/, internal/foreman/controller/, pkg/foreman/
charts/llmkube (no new fields) | charts/foreman (dependsOn: llmkube)
Surgical core changes (the only non-foreman files this PR touches,
all additive):
- Makefile: new foreman-chart-crds target. Existing manifests / generate /
  chart-crds / test / lint targets are unchanged in behavior.
- scripts/sync-crds.sh: narrow the glob from *.yaml to
  inference.llmkube.dev_*.yaml so the foreman group is not pulled into
  charts/llmkube. Inference CRDs continue to copy identically.
- config/rbac/role.yaml: auto-regenerated; gains the foreman RBAC marker
  output. The LLMKube chart's ClusterRole is hand-authored in
  charts/llmkube/templates/clusterrole.yaml and lists only
  inference.llmkube.dev, so the LLMKube operator pod gains zero new
  privileges from this PR.
LLMKube core inference flow is byte-identical: no changes in api/v1alpha1/,
internal/controller/, cmd/main.go, cmd/metal-agent/, pkg/agent/,
charts/llmkube/, go.mod, or go.sum. pkg/foreman/agent imports
pkg/agent.DarwinMemoryProvider without modifying it.
M1 verification (kind-llmkube-local, --heartbeat-interval=3s, 10-second run):
status.capability.totalRAMGB=128 (live sysctl hw.memsize),
availableRAMGB=22 (live vm_stat), installedModels=[minimax-m2-7],
maxContextTokens=131072, tokensPerSecond=47. Three heartbeat patches
over 10 s, all successful. SIGTERM produces phase=Draining; agent exits
cleanly.
M0 verification: kubectl apply of smoke-test Workload, AgenticTask,
and FleetNode against kind-llmkube-local all accepted; printer columns
render. The foreman-operator binary starts cleanly against kind; all
three reconcilers log reconciles on the smoke-test objects and return
ctrl.Result{} as the stub design intends.
What comes next (not in this PR; tracked by the epic): M2 lands the
capability-aware scheduler and a function-calling smoke test against
MiniMax M2.7 on the M5 Max. M3 builds the native agent loop (OAI-style
function-calling, in-process tool execution) and the Agent CRD. Foreman's
agentic loop is owned in Go natively, not by wrapping opencode.
Checklist
- make test passes locally
- make lint passes locally
- Signed off (git commit -s) per DCO
Tests note: M0+M1 are scaffolding; reconcilers are empty stubs and
the Registrar runs against a live apiserver. Unit + envtest coverage for
the Foreman packages lands with M2 (scheduler), when there is concrete
reconcile logic to test. The M1 demo above is the integration check.
Docs note: README + chart README updates land alongside M6 (the v0.1
ship gate), when Foreman is user-installable end-to-end. M0+M1 are not
yet user-installable: there is no operator Deployment or RBAC in
charts/foreman, only CRDs.