Thinking-level support across all inference kinds (LLM tab) #21

Merged: mattmezza merged 7 commits into main from feat/thinking-level-llm-tab on Jun 25, 2026

@mattmezza commented Jun 25, 2026

Closes #9.

Adds a thinking / reasoning-effort control to the LLM tab — for the main inference model and every background inference kind.

Inference kinds covered

Main inference, memory extraction, memory consolidation, goal decomposition, task reflection, and history compaction — each with its own independent thinking-level setting.

UI

  • A thinking-level field (type-or-pick datalist: low / medium / high) next to every model field. The field is always visible and defaults to Off, so it is harmless when unused and lets you set an effort for any provider/model.
  • Anthropic autodiscovery: a Fetch levels button hits POST /setup/thinking-levels, which reads the model's capabilities via the Anthropic Models API (capabilities.effort.*) and populates the real supported values (low/medium/high/xhigh/max).
  • Other providers: an effort docs ↗ link per provider (OpenAI/Google/Grok/DeepSeek) so you can enter the correct value by hand.
  • Shared-config badge: a background kind that targets the same provider+model as Main inference shows "· same config as Main inference".
  • Non-blocking hint: if you set a level on a model id that isn't a recognized reasoning model, an amber note warns that the call may error. (modelSupportsThinking() now drives only this hint; it no longer hides the control.)
  • Levels are saved per-kind; the value you type is stored as-is.
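
The badge and hint rules above can be sketched as follows. This is a minimal illustration in Python, not the template's actual code: the function names, the marker substrings, and the hint wording are assumptions made for the example, and the real heuristic's substrings are not shown in this PR.

```python
from typing import Optional, Sequence

# Hypothetical substrings; the real modelSupportsThinking() list is not shown here.
ASSUMED_MARKERS: Sequence[str] = ("reasoner", "thinking")


def same_as_main(kind_provider: str, kind_model: str,
                 main_provider: str, main_model: str) -> bool:
    """True when a background kind targets the same provider+model as main."""
    return (kind_provider, kind_model) == (main_provider, main_model)


def thinking_hint(model_id: str, level: str,
                  markers: Sequence[str] = ASSUMED_MARKERS) -> Optional[str]:
    """Return a warning string, or None when no hint should be shown.

    The hint never blocks saving; it only flags a level that was set on a
    model id the heuristic does not recognize.
    """
    if not level:  # Off: nothing is sent, so there is nothing to warn about
        return None
    if any(marker in model_id.lower() for marker in markers):
        return None
    return "Model id is not a recognized reasoning model; the call may error."
```

Because the heuristic only produces a message, an unrecognized id such as deepseek-v4-flash still keeps its field and its typed value.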

Backend

  • AgentConfig.thinking_level + per-kind fields on MemoryConfig / GoalDecompositionConfig / TaskReflectionConfig / CompactionConfig.
  • LLMClient._reasoning_kwargs() maps the level per provider: Anthropic → thinking={"type":"adaptive"} + output_config={"effort": level}; OpenAI-compatible → reasoning_effort=level. It is applied to both generate() (main loop) and generate_text() (background tasks). Nothing is sent when the level is off (""), so requests from untouched configs are byte-identical to before.
  • _background_llm() clones the main client (sharing the SDK connection) to override only the level, so each background task gets its own effort independent of main inference.
  • POST /setup/thinking-levels for Anthropic capability autodiscovery.
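
The per-provider mapping and the clone-and-override step can be sketched like this. It is a simplified stand-in rather than the real LLMClient: the class name, its fields, and the use of copy.copy are assumptions for illustration, while the emitted keyword arguments follow the mapping described above.

```python
import copy
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SketchLLMClient:
    """Hypothetical stand-in for LLMClient with only the fields needed here."""
    provider: str             # "anthropic", or any OpenAI-compatible provider
    model: str
    thinking_level: str = ""  # "" means off
    sdk: Any = None           # placeholder for the shared SDK connection

    def reasoning_kwargs(self) -> Dict[str, Any]:
        """Extra request kwargs for the configured level; empty when off."""
        if not self.thinking_level:
            return {}
        if self.provider == "anthropic":
            return {
                "thinking": {"type": "adaptive"},
                "output_config": {"effort": self.thinking_level},
            }
        return {"reasoning_effort": self.thinking_level}


def background_llm(main: SketchLLMClient, level: str) -> SketchLLMClient:
    """Clone the main client and override only the thinking level."""
    clone = copy.copy(main)  # shallow copy: clone.sdk is the same object
    clone.thinking_level = level
    return clone
```

A shallow copy keeps one SDK connection while letting each background kind carry its own level, and the main client's level is left unchanged.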

Tests

tests/test_llm.py covers per-provider kwargs and effort emission on both generate() and generate_text() (off by default). Stubs in test_compaction.py / test_scheduler.py updated for the new signature.

316 passed

Note on issue scope

#9 asked to show the control "only for compatible models". A substring heuristic for that hid every control for real non-Anthropic ids (e.g. deepseek-v4-flash) and conflicted with the follow-up requirement to enter effort manually for other providers — so the final design shows the field always and guides correctness via autodiscovery (Anthropic), docs links (others), and the amber hint, rather than hiding it.

Add a 'thinking level' (low/medium/high) selector to the LLM tab, shown
only when the selected model supports reasoning. The value maps to
output_config.effort + adaptive thinking for Anthropic, and reasoning_effort
for OpenAI-compatible providers. Off by default; never sent for models that
don't support thinking.

Closes #9

Add independent thinking-level config for memory extraction/consolidation,
goal decomposition, task reflection, and compaction. generate_text() now
honors the client's level via a shared _reasoning_kwargs() helper, and
_background_llm() clones the main client (sharing the SDK connection) to
override only the level, so background tasks get their own effort setting.

Add POST /setup/thinking-levels (Anthropic capability lookup via Models API)
and pass per-kind thinking levels to the LLM tab template.

Add levelOptions/fetchThinkingLevels (Anthropic autodiscovery), providerDocsUrl
(info links for other providers), and sameAsMain (shared-config badge).

Reusable Jinja macro renders a thinking-level field next to every inference
kind's model (main, extraction, consolidation, goal decomposition, task
reflection, compaction). Each is a type-or-pick datalist; Anthropic gets a
'Fetch levels' button (capability autodiscovery), other providers get an
'effort docs' link to enter the value manually. Background kinds show a
'same config as Main inference' badge when they target the main provider+model.
Each level is saved to its own config key, cleared when the model can't think.

@mattmezza changed the title from "Add thinking-level support in LLM tab for compatible models" to "Thinking-level support across all inference kinds (LLM tab)" on Jun 25, 2026

The modelSupportsThinking() substring heuristic hid every control for real
model ids (deepseek-v4-flash, claude-haiku-4-5, …). Drop the visibility gate
so the field shows for all providers/models (matching the 'enter effort
manually for other providers' requirement) and save the typed value as-is.
The heuristic is demoted to a non-blocking amber hint shown only when a level
is set on an unrecognized model.

@mattmezza merged commit a5c7ceb into main on Jun 25, 2026
1 check passed
@mattmezza deleted the feat/thinking-level-llm-tab branch on June 25, 2026 11:58