feat(ops-controller): model-config control-plane API (/model-config) #55

Merged: AlienWalker1995 merged 3 commits into main from feat/model-config-control-plane on Jun 23, 2026
@AlienWalker1995

What

Backend for dashboard-driven llama.cpp model control. Every launch flag (model, ctx, rope/YaRN, override-kv, KV quant, MTP, mmproj, gen-caps, …) becomes a first-class, validated, editable knob via one API.

Paradigm (remote-config / feature-flags): the registry is the per-model override store (the control plane); .env is the rendered baseline the container reads. There is exactly one write path (POST /model-config), so the registry↔.env drift that previously shipped the wrong model is now structurally impossible.

Pieces

  • llamacpp_flags.py — declarative flag schema (types/ranges/enums), per-flag validation, baseline defaults, baseline ⊕ override merge with reset-to-default, MTP↔EXTRA_ARGS folding, JSON-safe descriptors for the UI.
  • GET /model-config — flags + defaults + active model + overrides + effective + on-disk model/mmproj lists (one call drives the whole UI).
  • POST /model-config — validate → persist to the active registry record → render into .env (per-line upsert; commented presets survive) → recreate llamacpp (+ model-gateway when ctx changes).
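
To make the schema and merge ideas above concrete, here is a minimal sketch of a declarative flag table with per-flag validation and a baseline ⊕ override merge. It is not the code in llamacpp_flags.py: the `Flag` class, the `merge` helper, the flag names, the ranges, and the convention that a `None` override means reset-to-default are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Flag:
    """One launch flag: its type, baseline default, and optional range or enum."""
    name: str
    kind: type
    default: Any
    min: Optional[int] = None
    max: Optional[int] = None
    choices: Optional[tuple] = None

    def validate(self, value: Any) -> Any:
        # bool is a subclass of int in Python, so reject it explicitly for int flags.
        if self.kind is int and isinstance(value, bool):
            raise ValueError(f"{self.name}: expected int, got bool")
        if not isinstance(value, self.kind):
            raise ValueError(f"{self.name}: expected {self.kind.__name__}")
        if self.min is not None and value < self.min:
            raise ValueError(f"{self.name}: {value} is below {self.min}")
        if self.max is not None and value > self.max:
            raise ValueError(f"{self.name}: {value} is above {self.max}")
        if self.choices is not None and value not in self.choices:
            raise ValueError(f"{self.name}: {value!r} is not one of {self.choices}")
        return value


# Hypothetical flags; the real schema's names, defaults, and ranges may differ.
SCHEMA = {
    f.name: f
    for f in (
        Flag("CTX_SIZE", int, 8192, min=512, max=1048576),
        Flag("KV_CACHE_TYPE", str, "f16", choices=("f16", "q8_0", "q4_0")),
        Flag("MTP", bool, False),
    )
}


def merge(overrides: dict) -> dict:
    """Return baseline defaults with validated overrides applied on top.

    An override of None is treated as reset-to-default (an assumed convention).
    """
    effective = {name: flag.default for name, flag in SCHEMA.items()}
    for name, value in overrides.items():
        if name not in SCHEMA:
            raise ValueError(f"unknown flag: {name}")
        if value is None:
            continue  # keep the baseline default
        effective[name] = SCHEMA[name].validate(value)
    return effective
```

Under these assumptions, `merge({"CTX_SIZE": 262144, "MTP": None})` keeps the default for `MTP` and `KV_CACHE_TYPE` and overrides only the context size, while an out-of-range or unknown flag raises `ValueError` before anything is persisted.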

Tests

27 schema/render + 8 endpoint tests; full ops-controller suite = 69 passing, ruff clean.

Follow-ups

  • Dashboard flag-card (Firebase/LaunchDarkly-style) UI consuming this API.
  • Hermes reads the registry for model identity (supersedes the interim .env grep).

🤖 Generated with Claude Code

Backend for dashboard-driven llama.cpp model control: every launch flag becomes
a first-class, validated, editable knob through one API, with the registry as the
per-model override store and .env as the rendered baseline — ONE write path, so
the registry/.env drift that bit us before is structurally impossible.

- llamacpp_flags.py: declarative flag schema (types/ranges/enums), per-flag
  validation, baseline defaults, baseline+override merge with reset-to-default,
  MTP<->extra_args folding, and JSON-safe descriptors for the UI.
- GET /model-config: flags + defaults + active model + overrides + effective +
  on-disk model/mmproj lists (drives the whole UI in one call).
- POST /model-config: validate -> persist to the active registry record ->
  render into .env (per-line upsert; commented presets survive) -> recreate
  llamacpp (+ model-gateway when ctx changes).
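
The "per-line upsert; commented presets survive" step can be sketched roughly as follows. This is an illustrative stand-in, not the ops-controller implementation: `upsert_env` and the key names in the usage note are hypothetical, and it assumes a plain `KEY=value` .env with no quoting or `export` prefixes.

```python
def upsert_env(text: str, updates: dict) -> str:
    """Replace existing KEY=value lines in place and append keys that are missing.

    Comment lines (including commented-out presets such as "# CTX_SIZE=4096")
    and blank lines are passed through untouched.
    """
    pending = dict(updates)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip() if "=" in stripped else None
        if stripped.startswith("#") or key not in pending:
            out.append(line)  # comment, blank line, or a key we are not changing
        else:
            out.append(f"{key}={pending.pop(key)}")  # rewrite only this line
    # Keys that never appeared in the file are appended at the end.
    out.extend(f"{k}={v}" for k, v in pending.items())
    return "\n".join(out) + "\n"
```

For example, applying `{"CTX_SIZE": 262144, "MTP": 1}` to a file containing `# CTX_SIZE=4096`, `CTX_SIZE=8192`, and `MODEL=a.gguf` rewrites only the live `CTX_SIZE` line, leaves the commented preset alone, and appends `MTP=1`.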

Tests: 27 (schema/render) + 8 (endpoints); full ops-controller suite = 69 passing.
The dashboard flag-card UI that consumes this lands in a follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hermes Bot and others added 2 commits June 23, 2026 11:38
…ployed .env

Found during a local rebuild + validation:
- Dockerfile COPY missed llamacpp_flags.py -> ops-controller crash-looped at
  startup (FileNotFoundError). Added it to the COPY list.
- GET /model-config computed `effective` from the (possibly stale) registry
  record, so it could disagree with what's actually deployed (e.g. reported
  MTP off while the running .env had it on). It now reads the deployed .env as
  the source of truth for current state; an override = an effective value that
  differs from the baseline default.
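
The second fix amounts to deriving current state from the deployed .env rather than from the registry record. A minimal sketch of that rule, assuming string-valued flags and hypothetical helper names (`parse_env`, `current_state`), might look like:

```python
def parse_env(text: str) -> dict:
    """Parse live KEY=value lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def current_state(env_text: str, defaults: dict) -> tuple:
    """Return (effective, overrides) computed from the deployed .env.

    effective: the deployed value for each known flag, else its baseline default.
    overrides: the effective values that differ from the baseline default.
    """
    deployed = parse_env(env_text)
    effective = {name: deployed.get(name, default) for name, default in defaults.items()}
    overrides = {name: value for name, value in effective.items() if value != defaults[name]}
    return effective, overrides
```

With this rule a stale registry record cannot make the reported state disagree with what the container actually reads, because the registry is not consulted when computing `effective`.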

Validated live: /model-config reports model=Qwen3.6-27B, ctx=262144, MTP on,
vision on; full dashboard proxy chain works.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a HELP map + descriptors().help so the dashboard can show a one-line
explanation per llama.cpp flag. Test asserts every flag has help.
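
A completeness check of the kind described above could be sketched like this. The flag names, help strings, and the shape of `descriptors()` are assumptions for illustration; the real module's descriptor format may differ.

```python
# Hypothetical flag names and help strings.
FLAGS = ("CTX_SIZE", "KV_CACHE_TYPE", "MTP")

HELP = {
    "CTX_SIZE": "Context window size in tokens.",
    "KV_CACHE_TYPE": "Quantization type used for the KV cache.",
    "MTP": "Enable multi-token prediction.",
}


def descriptors() -> list:
    """Return JSON-safe descriptors, each carrying a one-line help string."""
    return [{"name": name, "help": HELP.get(name, "")} for name in FLAGS]


def test_every_flag_has_help() -> None:
    # Fails as soon as a flag is added to FLAGS without a matching HELP entry.
    missing = [d["name"] for d in descriptors() if not d["help"].strip()]
    assert not missing, f"flags without help text: {missing}"
```

Collecting the missing names before asserting makes the failure message list every undocumented flag at once instead of stopping at the first.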

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AlienWalker1995 merged commit bb4a83c into main on Jun 23, 2026
5 checks passed
@AlienWalker1995 deleted the feat/model-config-control-plane branch on June 23, 2026 20:47
AlienWalker1995 added a commit that referenced this pull request Jun 23, 2026
…56)

* feat(dashboard): Model Control flag-card UI (consumes /model-config)

LaunchDarkly/Firebase-style control plane for llama.cpp launch params: a "Model"
tab where every flag (model, ctx, rope/YaRN, override-kv, KV quant, MTP, mmproj,
gen-caps, ...) is a typed, validated field with an inherited/override pill + reset,
a model dropdown, and one "Apply & restart" that POSTs the diff.

- routes_model_config.py: GET/POST /api/model-config proxy to ops-controller
  (token injected server-side, like routes_registry).
- index.html: Model tab + loadModelControl()/renderModelControl()/apply (grouped
  flag cards from GET /model-config; Apply -> POST overrides -> recreate).

Depends on the control-plane API in #55. JS validated via node --check.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(dashboard): flag tooltips (title + info icon) from descriptor help

Each flag label shows a native tooltip + a hover (i) marker sourced from the
schema's help text (GET /model-config descriptors).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hermes Bot <hermes@ordo-ai-stack.local>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>