Tools tab (gh) + prompt caching + conversation compaction by mattmezza · Pull Request #4 · mattmezza/mpa

mattmezza · 2026-06-07T18:46:06Z

Branch bundles several related changes to prompt assembly, history, and tooling.

1. Per-turn datetime + cached static prompt

Live date/time injected into each user-message turn (not the static system prompt), so the prefix is stable and cacheable.
Session mode: static system prompt snapshotted once per conversation, reused until /new; per-request execution plan moved to the per-turn preamble.
Anthropic prompt caching (cache_control) on the system/tools prefix.

2. Tools tab + GitHub CLI (`gh`)

core/tools.py registry: enabled tools are authenticated (env) and advertised to the LLM; disabled tools stay hidden.
gh: token stored as tools.gh.token (secret), injected as GH_TOKEN; Tools tab with enable/token/Test; gh installed in the container image.

3. `/clear` alias

/clear now works as an alias for /new (clears the conversation).

4. Conversation compaction (session mode)

When the real context size (from provider usage — input + cache_read + cache_creation, model-agnostic, no tokenizer) crosses a threshold, the oldest turns are summarized by a small model and recent turns kept verbatim. Client-side so it works across all providers; never splits a tool_use from its tool_result.
Threshold configurable in the History tab (% of context window or absolute tokens, + keep-recent-turns); compaction model in the LLM tab.
User is notified with a system message when compaction triggers (delivered as a separate follow-up via AgentResponse.system_notice on Telegram/WhatsApp).

Tests

tests/test_tools.py (13) + tests/test_compaction.py (14) added; full suite 252 passed. Lint clean.

🤖 Generated with Claude Code

- Inject live date/time into each user-message turn instead of the static system prompt, so the prompt prefix is stable and cacheable. - Snapshot the static system prompt once per session (session mode) and reuse it until /new, instead of rebuilding and re-sending it every turn. - Move the per-request execution plan out of the static prompt into the per-turn preamble. - Add Anthropic prompt caching (cache_control) on the system/tools prefix. - Add a Tools tab + tools config registry (core/tools.py); first tool is the GitHub CLI (gh): GH_TOKEN auth, prompt advertisement when active, token test endpoint, and gh installed in the container image. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Add /clear as an alias for /new (clears conversation context). - Add client-side conversation compaction for session mode: when the real context size (from provider usage) crosses a threshold, summarize the oldest turns with a small model and keep recent turns verbatim. Works across all providers (no server-side compaction dependency); never splits a tool_use from its tool_result. - Capture token usage (input/output/cache → context_tokens) in LLMResponse for both Anthropic and OpenAI-compatible providers. - Threshold configurable in the History tab (% of context window or absolute tokens, plus keep-recent-turns); compaction model configurable in the LLM tab. - Notify the user with a system message when compaction triggers (delivered as a separate follow-up on Telegram/WhatsApp via AgentResponse.system_notice). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mattmezza and others added 5 commits June 7, 2026 20:42

test: cover tools registry, datetime preamble, session prompt snapshot

d9c6594

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

docs: document Tools tab (gh CLI) and per-turn datetime / cached prompt

9fc5d1b

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

docs: document /clear alias and conversation compaction

e19e6a1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mattmezza changed the title ~~Per-turn datetime, cached static prompt, Tools tab (gh CLI)~~ Tools tab (gh) + prompt caching + conversation compaction Jun 7, 2026

mattmezza mentioned this pull request Jun 7, 2026

Improve memory consolidation: unified ADD/UPDATE/DELETE/NOOP pipeline + semantic dedup #5

Closed

mattmezza merged commit 0870df7 into main Jun 7, 2026
1 check passed

mattmezza deleted the feat/tools-tab-and-prompt-caching branch June 7, 2026 19:37

mattmezza mentioned this pull request Jun 7, 2026

feat(memory): unified update pipeline + embeddings + forgetting + hygiene (closes #5) #7

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tools tab (gh) + prompt caching + conversation compaction#4

Tools tab (gh) + prompt caching + conversation compaction#4
mattmezza merged 5 commits into
mainfrom
feat/tools-tab-and-prompt-caching

mattmezza commented Jun 7, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mattmezza commented Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1. Per-turn datetime + cached static prompt

2. Tools tab + GitHub CLI (gh)

3. /clear alias

4. Conversation compaction (session mode)

Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mattmezza commented Jun 7, 2026 •

edited

Loading

2. Tools tab + GitHub CLI (`gh`)

3. `/clear` alias