feat: add --llama-server-port for a fixed llama-server runtime port by Defilan · Pull Request #499 · defilantech/LLMKube

Defilan · 2026-05-19T22:55:53Z

What

Add a --llama-server-port flag to the metal-agent so the spawned llama-server can bind a fixed port instead of an ephemeral one.

Why

Refs #406.

Per #406: the metal-agent allocates a dynamic port per spawned child, so host-side clients (an agentic coding tool, a quick curl, anything using an OpenAI SDK against localhost) have no stable target. They have to be re-pointed every time the agent respawns the process. The mlx-server runtime already has --mlx-server-port for exactly this reason; this brings the llama-server runtime to parity.

Scope note: this PR addresses the llama-server portion of #406. The vllm-swift runtime and absorbing vllm-swift-proxy.py into the metal-agent are still TBD; I'd suggest tracking those as follow-ups so #406 isn't closed by this single change.

How

New --llama-server-port int CLI flag on the metal-agent. Default 0 keeps the historical ephemeral-port behavior; a non-zero value pins the spawned llama-server to that port.
MetalAgentConfig.LlamaServerPort is wired through to MetalExecutor.SetPort(int). SetPort clamps negative values back to 0.
MetalExecutor.StartProcess uses the fixed port when non-zero; otherwise falls back to allocatePort() exactly as before.
waitForHealthy polls whichever port was resolved, so a fixed port works end to end with no further change.
Test TestMetalExecutorSetPort covers the default-zero, set-to-8080, and negative-clamped-to-zero paths.
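
For readers without the diff open, the port-selection logic described above might look roughly like the following. This is a minimal sketch, not the merged code: the `executor` type, `resolvePort`, and `parsePort` are hypothetical stand-ins, and only the flag name, the `SetPort` clamping rule, and the zero-means-ephemeral fallback come from the description.

```go
package main

import (
	"flag"
	"fmt"
	"net"
)

// executor is a hypothetical stand-in for MetalExecutor; only port handling is modeled.
type executor struct {
	port int // 0 means "allocate an ephemeral port per spawn"
}

// SetPort records the fixed port. Negative values are clamped back to 0,
// matching the behavior described in the PR.
func (e *executor) SetPort(p int) {
	if p < 0 {
		p = 0
	}
	e.port = p
}

// allocatePort asks the kernel for a free loopback TCP port by binding port 0.
// The real allocatePort may be implemented differently; this is an assumption.
func allocatePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

// resolvePort returns the fixed port when one is set, otherwise an ephemeral one.
func (e *executor) resolvePort() (int, error) {
	if e.port != 0 {
		return e.port, nil
	}
	return allocatePort()
}

// parsePort reads --llama-server-port from args; the default 0 keeps ephemeral behavior.
func parsePort(args []string) (int, error) {
	fs := flag.NewFlagSet("metal-agent", flag.ContinueOnError)
	port := fs.Int("llama-server-port", 0, "fixed llama-server port (0 = ephemeral)")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *port, nil
}

func main() {
	port, err := parsePort([]string{"--llama-server-port", "8080"})
	if err != nil {
		panic(err)
	}
	e := &executor{}
	e.SetPort(port)
	resolved, err := e.resolvePort()
	if err != nil {
		panic(err)
	}
	fmt.Println("llama-server would bind port", resolved)
}
```

Keeping 0 as the "unset" sentinel means the flag default and the clamped negative case both land on the pre-existing ephemeral path, which is what makes the change additive.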

Non-breaking: every change is additive. With the flag unset, behavior is byte-identical to pre-patch. No public-API signature changes.

Checklist

Tests added/updated (TestMetalExecutorSetPort in pkg/agent/executor_test.go)
`make test` passes locally (go test ./pkg/agent/... -> ok)
`make lint` passes locally
Commit messages follow conventional commits (feat:)
All commits are signed off (git commit -s) per DCO
Documentation updated — n/a, internal operator flag

The llama-server runtime allocated an ephemeral port for every spawned process, so a native OpenAI-compatible client (e.g. an agentic coding tool pointed at localhost) had no stable endpoint and had to be re-pointed whenever the agent respawned the process.

Add a --llama-server-port flag, mirroring the existing --mlx-server-port for the mlx-server runtime. Zero (the default) keeps the historical ephemeral-port behavior; a non-zero value fixes the port the spawned llama-server binds, which the health check then polls consistently.

Signed-off-by: Christopher Maher <chris@mahercode.io>

codecov · 2026-05-19T23:00:17Z

Codecov Report

❌ Patch coverage is 25.00000% with 12 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
pkg/agent/executor.go	40.00%	6 Missing ⚠️
cmd/metal-agent/main.go	0.00%	5 Missing ⚠️
pkg/agent/agent.go	0.00%	1 Missing ⚠️

Defilan mentioned this pull request May 19, 2026

feat(metal-agent): expose stable host-side endpoint listener (absorb vllm-swift-proxy) #406

Open

8 tasks

Defilan merged commit cc30b0d into defilantech:main May 20, 2026
21 checks passed

github-actions Bot mentioned this pull request May 20, 2026

chore: release 0.7.10 #497

Open

Defilan merged 1 commit into defilantech:main from Defilan:feat/metal-agent-llama-server-port
