Thinking-level support across all inference kinds (LLM tab) by mattmezza · Pull Request #21 · mattmezza/mpa

mattmezza · 2026-06-25T08:09:51Z

Closes #9.

Adds a thinking / reasoning-effort control to the LLM tab — for the main inference model and every background inference kind.

Inference kinds covered

Main inference, memory extraction, memory consolidation, goal decomposition, task reflection, and history compaction — each with its own independent thinking-level setting.

UI

A thinking-level field (type-or-pick datalist: low / medium / high) next to every model field. It is always visible — defaults to Off, so it's harmless when unused, and lets you set effort for any provider/model.
Anthropic autodiscovery: a Fetch levels button hits POST /setup/thinking-levels, which reads the model's capabilities via the Anthropic Models API (capabilities.effort.*) and populates the real supported values (low/medium/high/xhigh/max).
Other providers: an effort docs ↗ link per provider (OpenAI/Google/Grok/DeepSeek) so you can enter the correct value by hand.
Shared-config badge: a background kind that targets the same provider+model as Main inference shows "· same config as Main inference".
Non-blocking hint: if you set a level on a model id that isn't a recognized reasoning model, an amber note warns the call may error. (modelSupportsThinking() is only this hint now — it does not hide the control.)
Levels are saved per-kind; the value you type is stored as-is.
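
The autodiscovery step amounts to a small filter over the model's capability payload. A minimal sketch, assuming the payload nests one entry per effort level under capabilities.effort and that each entry carries a boolean "supported" flag. That shape, the name extract_effort_levels, and the ordering tuple are assumptions for illustration, not the PR's actual code or a confirmed API schema.

```python
from typing import Any

# Preferred display order for known levels (assumed); anything else is
# appended alphabetically so unexpected levels are still offered.
_LEVEL_ORDER = ("low", "medium", "high", "xhigh", "max")


def extract_effort_levels(capabilities: dict[str, Any]) -> list[str]:
    """Return the effort levels flagged as supported in a capabilities dict."""
    effort = capabilities.get("effort") or {}
    if not isinstance(effort, dict):
        return []
    # Keep only nested entries that are dicts with supported == True;
    # scalar siblings (e.g. a top-level "supported" flag) are ignored.
    supported = [
        name
        for name, entry in effort.items()
        if isinstance(entry, dict) and entry.get("supported") is True
    ]
    known = [lvl for lvl in _LEVEL_ORDER if lvl in supported]
    extra = sorted(lvl for lvl in supported if lvl not in _LEVEL_ORDER)
    return known + extra
```

The returned list is what a datalist like the one described above could be populated with; an empty list would leave the field as free text.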

Backend

AgentConfig.thinking_level + per-kind fields on MemoryConfig / GoalDecompositionConfig / TaskReflectionConfig / CompactionConfig.
LLMClient._reasoning_kwargs() maps the level per provider: Anthropic → thinking={"type":"adaptive"} + output_config={"effort": level}; OpenAI-compatible → reasoning_effort=level. Applied to both generate() (main loop) and generate_text() (background tasks). Nothing is sent when off (""), so untouched configs are byte-identical to before.
_background_llm() clones the main client (sharing the SDK connection) to override only the level, so each background task gets its own effort independent of main inference.
POST /setup/thinking-levels for Anthropic capability autodiscovery.
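
The per-provider mapping described for LLMClient._reasoning_kwargs() can be sketched as a pure function. This is an illustration under stated assumptions, not the PR's implementation: the standalone name reasoning_kwargs and the "anthropic" provider string are hypothetical, and every non-Anthropic provider is treated as OpenAI-compatible.

```python
from typing import Any


def reasoning_kwargs(provider: str, level: str) -> dict[str, Any]:
    """Map a thinking level to provider-specific request kwargs."""
    if not level:
        # Off (""): contribute nothing, so the request is unchanged.
        return {}
    if provider == "anthropic":  # provider key is an assumed spelling
        return {
            "thinking": {"type": "adaptive"},
            "output_config": {"effort": level},
        }
    # Assumed fallback: treat all other providers as OpenAI-compatible.
    return {"reasoning_effort": level}
```

Because the off state returns an empty dict, merging the result into an existing kwargs dict leaves untouched configurations exactly as they were. The level is passed through as typed, matching the "stored as-is" behavior above.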

Tests

tests/test_llm.py covers per-provider kwargs and effort emission on both generate() and generate_text() (off by default). Stubs in test_compaction.py / test_scheduler.py updated for the new signature.

316 passed

Note on issue scope

#9 asked to show the control "only for compatible models". A substring heuristic for that hid every control for real non-Anthropic ids (e.g. deepseek-v4-flash) and conflicted with the follow-up requirement to enter effort manually for other providers — so the final design shows the field always and guides correctness via autodiscovery (Anthropic), docs links (others), and the amber hint, rather than hiding it.

Add a 'thinking level' (low/medium/high) selector to the LLM tab, shown only when the selected model supports reasoning. The value maps to output_config.effort + adaptive thinking for Anthropic, and reasoning_effort for OpenAI-compatible providers. Off by default; never sent for models that don't support thinking. Closes #9

Add independent thinking-level config for memory extraction/consolidation, goal decomposition, task reflection, and compaction. generate_text() now honors the client's level via a shared _reasoning_kwargs() helper, and _background_llm() clones the main client (sharing the SDK connection) to override only the level — so background tasks get their own effort setting.
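
The clone-and-override idea behind _background_llm() can be sketched with a shallow copy. SketchLLMClient below is a hypothetical stand-in with only the fields needed for the illustration; the real LLMClient's attributes and the real helper's signature are not shown in this PR text, so treat every name here as an assumption.

```python
import copy
from dataclasses import dataclass


@dataclass
class SketchLLMClient:
    """Hypothetical stand-in for LLMClient (illustration only)."""

    provider: str
    model: str
    sdk: object  # stand-in for the shared SDK connection
    thinking_level: str = ""


def background_llm(main: SketchLLMClient, level: str) -> SketchLLMClient:
    """Shallow-copy the main client and override only the thinking level."""
    clone = copy.copy(main)  # shallow copy: `sdk` is shared, not duplicated
    clone.thinking_level = level  # the main client's level is left untouched
    return clone
```

A shallow copy keeps one SDK connection for all inference kinds while letting each background task carry its own level.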

Add POST /setup/thinking-levels (Anthropic capability lookup via Models API) and pass per-kind thinking levels to the LLM tab template.

Add levelOptions/fetchThinkingLevels (Anthropic autodiscovery), providerDocsUrl (info links for other providers), and sameAsMain (shared-config badge).

Reusable Jinja macro renders a thinking-level field next to every inference kind's model (main, extraction, consolidation, goal decomposition, task reflection, compaction). Each is a type-or-pick datalist; Anthropic gets a 'Fetch levels' button (capability autodiscovery), other providers get an 'effort docs' link to enter the value manually. Background kinds show a 'same config as Main inference' badge when they target the main provider+model. Each level is saved to its own config key, cleared when the model can't think.

The modelSupportsThinking() substring heuristic hid every control for real model ids (deepseek-v4-flash, claude-haiku-4-5, …). Drop the visibility gate so the field shows for all providers/models — matching the 'enter effort manually for other providers' requirement — and save the typed value as-is. The heuristic is demoted to a non-blocking amber hint shown only when a level is set on an unrecognized model.
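
The demoted heuristic can be pictured as a function that only produces a warning and never gates the control. A minimal sketch: the name thinking_hint, the message wording, and the markers passed in are all hypothetical, since the substrings the real modelSupportsThinking() checks are not listed here.

```python
from typing import Optional


def thinking_hint(model_id: str, level: str, markers: tuple[str, ...]) -> Optional[str]:
    """Return a warning string, or None when no hint is needed."""
    if not level:
        return None  # control is off: nothing to warn about
    if any(marker in model_id.lower() for marker in markers):
        return None  # model id looks like a recognized reasoning model
    # Non-blocking: the caller still saves and sends the level as typed.
    return f"'{model_id}' is not a recognized reasoning model; the call may error."
```

The caller shows the returned text as a note next to the field and otherwise proceeds unchanged, which is the difference from the earlier visibility gate.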

mattmezza added 6 commits June 25, 2026 10:09

feat(admin): thinking-level autodiscovery endpoint + per-kind context

9567e8b

feat(ui): Alpine state + helpers for per-kind thinking levels

910091d

test(llm): cover generate_text reasoning + per-provider kwargs

f36aaea

mattmezza changed the title from "Add thinking-level support in LLM tab for compatible models" to "Thinking-level support across all inference kinds (LLM tab)" Jun 25, 2026

mattmezza merged commit a5c7ceb into main Jun 25, 2026
1 check passed

mattmezza deleted the feat/thinking-level-llm-tab branch June 25, 2026 11:58

mattmezza merged 7 commits into main from feat/thinking-level-llm-tab
