Tools tab (gh) + prompt caching + conversation compaction#4
Merged
Conversation
- Inject live date/time into each user-message turn instead of the static system prompt, so the prompt prefix is stable and cacheable. - Snapshot the static system prompt once per session (session mode) and reuse it until /new, instead of rebuilding and re-sending it every turn. - Move the per-request execution plan out of the static prompt into the per-turn preamble. - Add Anthropic prompt caching (cache_control) on the system/tools prefix. - Add a Tools tab + tools config registry (core/tools.py); first tool is the GitHub CLI (gh): GH_TOKEN auth, prompt advertisement when active, token test endpoint, and gh installed in the container image. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add /clear as an alias for /new (clears conversation context). - Add client-side conversation compaction for session mode: when the real context size (from provider usage) crosses a threshold, summarize the oldest turns with a small model and keep recent turns verbatim. Works across all providers (no server-side compaction dependency); never splits a tool_use from its tool_result. - Capture token usage (input/output/cache → context_tokens) in LLMResponse for both Anthropic and OpenAI-compatible providers. - Threshold configurable in the History tab (% of context window or absolute tokens, plus keep-recent-turns); compaction model configurable in the LLM tab. - Notify the user with a system message when compaction triggers (delivered as a separate follow-up on Telegram/WhatsApp via AgentResponse.system_notice). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Branch bundles several related changes to prompt assembly, history, and tooling.
1. Per-turn datetime + cached static prompt
/new; per-request execution plan moved to the per-turn preamble.cache_control) on the system/tools prefix.2. Tools tab + GitHub CLI (
gh)core/tools.pyregistry: enabled tools are authenticated (env) and advertised to the LLM; disabled tools stay hidden.gh: token stored astools.gh.token(secret), injected asGH_TOKEN; Tools tab with enable/token/Test;ghinstalled in the container image.3.
/clearalias/clearnow works as an alias for/new(clears the conversation).4. Conversation compaction (session mode)
usage—input + cache_read + cache_creation, model-agnostic, no tokenizer) crosses a threshold, the oldest turns are summarized by a small model and recent turns kept verbatim. Client-side so it works across all providers; never splits atool_usefrom itstool_result.AgentResponse.system_noticeon Telegram/WhatsApp).Tests
tests/test_tools.py(13) +tests/test_compaction.py(14) added; full suite 252 passed. Lint clean.🤖 Generated with Claude Code