feat(ops-controller): model-config control-plane API (/model-config) #55
Merged
Backend for dashboard-driven llama.cpp model control: every launch flag becomes a first-class, validated, editable knob through one API, with the registry as the per-model override store and .env as the rendered baseline. There is ONE write path, so the registry/.env drift that bit us before is structurally impossible.

- llamacpp_flags.py: declarative flag schema (types/ranges/enums), per-flag validation, baseline defaults, baseline+override merge with reset-to-default, MTP<->extra_args folding, and JSON-safe descriptors for the UI.
- GET /model-config: flags + defaults + active model + overrides + effective + on-disk model/mmproj lists (drives the whole UI in one call).
- POST /model-config: validate -> persist to the active registry record -> render into .env (per-line upsert; commented presets survive) -> recreate llamacpp (+ model-gateway when ctx changes).

Tests: 27 (schema/render) + 8 (endpoints); full ops-controller suite = 69 passing.

The dashboard flag-card UI that consumes this lands in a follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
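For readers skimming the thread, the "declarative schema + per-flag validation + baseline/override merge with reset-to-default" idea can be sketched roughly as below. This is a minimal illustration, not the contents of llamacpp_flags.py: `FlagSpec`, `SCHEMA`, `validate`, `merge`, the three flag names, their ranges and defaults, and the convention that `None` means "reset to default" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FlagSpec:
    """Hypothetical descriptor for one launch flag (not the real schema)."""
    kind: type
    default: Any
    minimum: Optional[int] = None
    maximum: Optional[int] = None
    choices: Optional[tuple] = None


# Illustrative flags only; names, ranges and defaults are assumptions.
SCHEMA = {
    "ctx": FlagSpec(int, 8192, minimum=512, maximum=1_048_576),
    "kv_quant": FlagSpec(str, "f16", choices=("f16", "q8_0", "q4_0")),
    "mtp": FlagSpec(bool, False),
}


def validate(name: str, value: Any) -> Any:
    """Check one value against its spec; raise ValueError if it does not fit."""
    spec = SCHEMA.get(name)
    if spec is None:
        raise ValueError(f"unknown flag: {name}")
    # bool is a subclass of int in Python, so reject it explicitly for int flags.
    if spec.kind is int and isinstance(value, bool):
        raise ValueError(f"{name}: expected int, got bool")
    if not isinstance(value, spec.kind):
        raise ValueError(f"{name}: expected {spec.kind.__name__}")
    if spec.minimum is not None and value < spec.minimum:
        raise ValueError(f"{name}: below minimum {spec.minimum}")
    if spec.maximum is not None and value > spec.maximum:
        raise ValueError(f"{name}: above maximum {spec.maximum}")
    if spec.choices is not None and value not in spec.choices:
        raise ValueError(f"{name}: must be one of {spec.choices}")
    return value


def merge(overrides: dict) -> dict:
    """Start from baseline defaults, then apply validated overrides.

    An override of None is treated as "reset to default" (assumed convention).
    """
    effective = {name: spec.default for name, spec in SCHEMA.items()}
    for name, value in overrides.items():
        if name not in SCHEMA:
            raise ValueError(f"unknown flag: {name}")
        if value is None:
            continue  # keep the baseline default
        effective[name] = validate(name, value)
    return effective
```

With these assumed defaults, `merge({"ctx": 262144, "mtp": None})` keeps `kv_quant` and `mtp` at baseline and only overrides `ctx`.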
3 tasks
…ployed .env

Found during a local rebuild + validation:

- Dockerfile COPY missed llamacpp_flags.py -> ops-controller crash-looped at startup (FileNotFoundError). Added it to the COPY list.
- GET /model-config computed `effective` from the (possibly stale) registry record, so it could disagree with what's actually deployed (e.g. reported MTP off while the running .env had it on). It now reads the deployed .env as the source of truth for current state; an override = an effective value that differs from the baseline default.

Validated live: /model-config reports model=Qwen3.6-27B, ctx=262144, MTP on, vision on; full dashboard proxy chain works.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
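The "deployed .env is the source of truth; an override is an effective value that differs from the baseline default" rule can be illustrated like this. It is a sketch under assumptions: the variable names `LLAMACPP_CTX` and `LLAMACPP_MTP`, the `BASELINE` values, and the helpers `parse_env` and `current_state` are hypothetical and not taken from the PR.

```python
# Hypothetical baseline defaults, keyed by .env variable name (assumed names).
BASELINE = {"LLAMACPP_CTX": "8192", "LLAMACPP_MTP": "0"}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and commented-out lines."""
    values = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = stripped.partition("=")
        values[key.strip()] = value.strip()
    return values


def current_state(env_text: str) -> tuple:
    """Return (effective, overrides) computed from the deployed .env text."""
    deployed = parse_env(env_text)
    # Effective value: what is deployed, falling back to the baseline default.
    effective = {key: deployed.get(key, default) for key, default in BASELINE.items()}
    # Override: any effective value that differs from the baseline default.
    overrides = {key: value for key, value in effective.items() if value != BASELINE[key]}
    return effective, overrides
```

Because nothing here consults the registry record, a stale record cannot make the reported state disagree with what the container actually reads.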
Adds a HELP map + descriptors().help so the dashboard can show a one-line explanation per llama.cpp flag. Test asserts every flag has help.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AlienWalker1995 added a commit that referenced this pull request Jun 23, 2026
…56)

* feat(dashboard): Model Control flag-card UI (consumes /model-config)

LaunchDarkly/Firebase-style control plane for llama.cpp launch params: a "Model" tab where every flag (model, ctx, rope/YaRN, override-kv, KV quant, MTP, mmproj, gen-caps, ...) is a typed, validated field with an inherited/override pill + reset, a model dropdown, and one "Apply & restart" that POSTs the diff.

- routes_model_config.py: GET/POST /api/model-config proxy to ops-controller (token injected server-side, like routes_registry).
- index.html: Model tab + loadModelControl()/renderModelControl()/apply (grouped flag cards from GET /model-config; Apply -> POST overrides -> recreate).

Depends on the control-plane API in #55. JS validated via node --check.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(dashboard): flag tooltips (title + info icon) from descriptor help

Each flag label shows a native tooltip + a hover (i) marker sourced from the schema's help text (GET /model-config descriptors).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hermes Bot <hermes@ordo-ai-stack.local>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Backend for dashboard-driven llama.cpp model control. Every launch flag (model, ctx, rope/YaRN, override-kv, KV quant, MTP, mmproj, gen-caps, …) becomes a first-class, validated, editable knob via one API.
Paradigm (remote-config / feature-flags): the registry is the per-model override store (control plane); .env is the rendered baseline the container reads. Exactly one write path (POST /model-config) → the registry↔.env drift that previously shipped the wrong model is now structurally impossible.
Pieces
- llamacpp_flags.py: declarative flag schema (types/ranges/enums), per-flag validation, baseline defaults, baseline ⊕ override merge with reset-to-default, MTP↔EXTRA_ARGS folding, JSON-safe descriptors for the UI.
- GET /model-config: flags + defaults + active model + overrides + effective + on-disk model/mmproj lists (one call drives the whole UI).
- POST /model-config: validate → persist to the active registry record → render into .env (per-line upsert; commented presets survive) → recreate llamacpp (+ model-gateway when ctx changes).
Tests
27 schema/render + 8 endpoint tests; full ops-controller suite = 69 passing, ruff clean.
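The "per-line upsert; commented presets survive" step of POST /model-config could look roughly like the sketch below. It is an illustration only: `upsert_env` and the variable names in the usage note are hypothetical, and the real renderer may handle quoting, duplicate keys, or ordering differently.

```python
def upsert_env(text: str, updates: dict) -> str:
    """Rewrite only the KEY=VALUE lines named in `updates`.

    Comment lines (including commented-out presets), blank lines and
    unrelated keys are copied through unchanged. Keys that are not yet
    present in the file are appended at the end.
    """
    pending = dict(updates)  # copy so the caller's dict is not mutated
    out = []
    for line in text.splitlines():
        stripped = line.lstrip()
        key = stripped.partition("=")[0].strip()
        is_assignment = not stripped.startswith("#") and "=" in stripped
        if is_assignment and key in pending:
            # Replace the first live assignment for this key in place.
            out.append(f"{key}={pending.pop(key)}")
        else:
            out.append(line)
    # Anything left over was not in the file yet.
    out.extend(f"{key}={value}" for key, value in pending.items())
    return "\n".join(out) + "\n"
```

For example, applying `{"LLAMACPP_CTX": "262144"}` to a file that contains both `# preset: LLAMACPP_CTX=32768` and `LLAMACPP_CTX=8192` rewrites only the live assignment and leaves the commented preset untouched.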
Follow-ups
.env grep).
🤖 Generated with Claude Code