Commit d071816 (1 parent: 88a0928)
Author: Mateusz

feat: auto-continue/proceed removal - strip mechanical re-enablement messages from remote LLM submissions

When sessions are interrupted (connectivity issues), users typically type "continue" or "proceed" to re-enter the agent loop. These messages pollute context without semantic value.

This feature detects when the last user message is exactly "continue" or "proceed" (trimmed, case-insensitive) and tags it as NEVER_FORWARD, so the proxy omits it from backend transmissions while preserving it in the local agent context window.

Default: enabled. Disable via --disable-auto-continue-removal, the AUTO_CONTINUE_REMOVAL_ENABLED=false environment variable, or session.auto_continue_removal_enabled: false in config. Uses the existing non-forwardable tagging and enforcement infrastructure. Fail-open: errors during tagging are logged and the request proceeds.
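The exact-match rule described in the commit message can be sketched as follows. This is a simplified illustration, not the proxy's actual code; the function name is hypothetical:

```python
def is_auto_continue(message: dict) -> bool:
    """Return True when a message is a bare re-enablement prompt.

    Matches only when the role is "user" and the content, after
    trimming whitespace and lowercasing, is exactly "continue" or
    "proceed" -- phrases such as "please continue" do not match.
    """
    content = message.get("content")
    return (
        message.get("role") == "user"
        and isinstance(content, str)
        and content.strip().lower() in {"continue", "proceed"}
    )

# Only the last user message of a request is ever checked:
assert is_auto_continue({"role": "user", "content": "  Continue \n"})
assert not is_auto_continue({"role": "user", "content": "please continue"})
assert not is_auto_continue({"role": "assistant", "content": "proceed"})
```

Because the match is exact after trimming and lowercasing, intentional instructions that merely contain the word "continue" are always forwarded.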

14 files changed: 685 additions & 68 deletions
config/config.example.yaml

Lines changed: 23 additions & 0 deletions

@@ -112,6 +112,9 @@ session:
   # Fix improperly formatted <think> tags in model responses
   fix_think_tags_enabled: false # Set to true to enable think tags correction
+
+  # When the last user message is exactly "continue" or "proceed",
+  # tag it as non-forwardable so it is excluded from remote LLM submissions.
+  auto_continue_removal_enabled: true

   # Planning phase: Route initial requests to a strong model for better planning
   planning_phase:
     enabled: false # Set to true to enable planning phase

@@ -471,6 +474,26 @@ resilience:
   # Force shared scoping for selected backends (optional override).
   shared_backend_types: []

+# Scheduled provider warm-up for sliding usage windows.
+# Sends lightweight prompts at fixed local server times to intentionally start
+# request windows at more favorable times of day.
+# Only explicit backend:model routes are allowed. Aliases, model-only selectors,
+# and composite selectors using ^ or | are rejected.
+usage_window_warmup:
+  enabled: false
+  entries: []
+  # Example:
+  # entries:
+  #   - model: "openai-codex:gpt-5.4-mini"
+  #     time: "08:00"
+  #     execute_on_weekend: false
+  #   - model: "gemini.2:google/gemini-2.5-flash"
+  #     time: "13:30"
+  #     execute_on_weekend: false
+  #   - model: "gemini.2:google/gemini-2.5-flash"
+  #     time: "18:45"
+  #     execute_on_weekend: true
+
 # Model name rewrite rules (optional)
 # These rules allow you to dynamically rewrite model names before they are processed
 # Rules are processed in order, and the first matching rule is applied

config/schemas/app_config.schema.yaml

Lines changed: 18 additions & 0 deletions

@@ -230,6 +230,7 @@ properties:
     fix_think_tags_streaming_buffer_size: { type: integer, minimum: 1024 }
     droid_path_fix_enabled: { type: boolean }
     double_ampersand_fixes_for_windows_enabled: { type: boolean }
+    auto_continue_removal_enabled: { type: boolean }
     max_per_session_backends: { type: integer, minimum: 1 }
     session_continuity:
       type: object

@@ -554,6 +555,23 @@ properties:
         method: { type: string, enum: [GET, HEAD] }
         path: { type: string }
         accept_any_response: { type: boolean }
+  usage_window_warmup:
+    type: object
+    additionalProperties: false
+    properties:
+      enabled: { type: boolean }
+      entries:
+        type: array
+        items:
+          type: object
+          additionalProperties: false
+          required: [model, time]
+          properties:
+            model: { type: string }
+            time:
+              type: string
+              pattern: "^(?:[01]\\d|2[0-3]):[0-5]\\d$"
+            execute_on_weekend: { type: boolean }
   failure_handling:
     type: object
     additionalProperties: false
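The `time` pattern in the schema enforces 24-hour `HH:MM` values (hours 00-23, minutes 00-59). Its behavior can be verified directly with the regex copied verbatim from the schema:

```python
import re

# Same pattern as the schema (YAML's "\\d" becomes "\d" in the raw string).
TIME_PATTERN = re.compile(r"^(?:[01]\d|2[0-3]):[0-5]\d$")

# Valid warm-up times: two-digit hours and minutes, within range.
for value in ["08:00", "13:30", "18:45", "23:59"]:
    assert TIME_PATTERN.match(value), value

# Rejected: out-of-range hour/minute, single-digit hour, missing colon.
for value in ["24:00", "8:00", "12:60", "1230"]:
    assert not TIME_PATTERN.match(value), value
```

Note that single-digit hours such as `8:00` are rejected; entries must be zero-padded.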

docs/user_guide/cli-parameters.md

Lines changed: 1 addition & 0 deletions

@@ -441,6 +441,7 @@ Prevent duplicate requests from exhausting rate limits. See [Request Deduplicati
 | CLI Argument | Environment Variable | Description |
 | :--- | :--- | :--- |
 | `--fix-think-tags` | `FIX_THINK_TAGS_ENABLED=true` | Enable correction of `<think>` tags. |
+| `--disable-auto-continue-removal` | `AUTO_CONTINUE_REMOVAL_ENABLED=false` | Disable automatic removal of trailing "continue"/"proceed" user messages. |
 | `--disable-binary-file-edit-steering` | N/A | Disable binary file edit steering (overrides config). |
 | `--disable-dangerous-git-commands-protection` | `DANGEROUS_COMMAND_PREVENTION_ENABLED=false` | Disable dangerous command protection. |
 | N/A | `DANGEROUS_COMMAND_STEERING_MESSAGE` | Custom message for dangerous commands. |

docs/user_guide/configuration.md

Lines changed: 50 additions & 6 deletions

@@ -248,6 +248,11 @@ session:
   # Fixes
   fix_think_tags_enabled: false
   fix_think_tags_streaming_buffer_size: 4096
+
+  # Auto continue/proceed removal
+  # When the last user message is exactly "continue" or "proceed",
+  # tag it as non-forwardable so it is excluded from remote LLM submissions.
+  auto_continue_removal_enabled: true

   # Quality Verifier
   quality_verifier_model: null # "backend:model"

@@ -521,12 +526,51 @@ health_check:
 | `ping.interval_seconds` | float | `30.0` | Seconds between ping checks |
 | `ping.timeout_seconds` | float | `5.0` | Ping timeout |
 | `ping.failure_threshold` | int | `3` | Failures before unhealthy |
 | `http.enabled` | bool | `true` | Enable HTTP probe checks |
 | `http.interval_seconds` | float | `60.0` | Seconds between HTTP checks |
 | `http.timeout_seconds` | float | `10.0` | HTTP request timeout |
 | `http.failure_threshold` | int | `2` | Failures before unhealthy |

+### Usage Window Warm-up (`usage_window_warmup`)
+
+Schedules lightweight background prompts at fixed local server times to intentionally start
+sliding provider request windows at more favorable times of day.
+
+- Runs automatically while the server is up.
+- Accepts explicit `backend:model` routes, including numbered backends such as
+  `gemini.2:google/gemini-2.5-flash`.
+- Rejects aliases (`alias:` / `auto:`), model-only selectors, and composite routing
+  expressions using `^` or `|`.
+- Adds random jitter between 5 and 35 seconds before each scheduled request.
+- Sends prompts like `Hi, how much is it 1234 times 567 plus 8901` and retries once
+  when a temporary error prevents a valid response.
+- For `openai-codex:<model>` entries, warm-up fans out across all currently eligible
+  managed OAuth accounts, so each account's window is warmed independently.
+
+```yaml
+usage_window_warmup:
+  enabled: true
+  entries:
+    - model: "openai-codex:gpt-5.4-mini"
+      time: "08:00"
+      execute_on_weekend: false
+    - model: "gemini.2:google/gemini-2.5-flash"
+      time: "13:30"
+      execute_on_weekend: false
+    - model: "gemini.2:google/gemini-2.5-flash"
+      time: "18:45"
+      execute_on_weekend: true
+```
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `enabled` | bool | `false` | Enable the background warm-up scheduler |
+| `entries` | list | `[]` | Scheduled warm-up entries |
+| `entries[].model` | str | required | Explicit `backend:model` route; numbered backends allowed |
+| `entries[].time` | str | required | Local server time in `HH:MM` 24-hour format |
+| `entries[].execute_on_weekend` | bool | `false` | Allow this entry to run on Saturday and Sunday |
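The scheduling rules documented for `usage_window_warmup` (a fixed local `HH:MM` slot, an optional weekend skip, and 5-35 seconds of random jitter) can be sketched roughly as below. This is an illustration of the documented behavior, not the actual scheduler; `next_run_at` is a hypothetical name:

```python
import random
from datetime import datetime, timedelta

def next_run_at(entry: dict, now: datetime) -> datetime:
    """Compute the next warm-up moment for one config entry.

    `entry` mirrors the config shape:
    {"model": ..., "time": "HH:MM", "execute_on_weekend": bool}.
    Local server time is assumed throughout.
    """
    hour, minute = map(int, entry["time"].split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has already passed
    # Skip Saturday (weekday 5) and Sunday (6) unless the entry opts in.
    while not entry.get("execute_on_weekend", False) and candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    # Random jitter of 5-35 seconds before the actual request.
    return candidate + timedelta(seconds=random.uniform(5, 35))
```

For example, an `08:00` entry with `execute_on_weekend: false` evaluated on a Friday at 09:00 would next fire shortly after 08:00 on Monday.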
 ### ProxyMem (Cross-Session Memory)

 ProxyMem provides persistent context across sessions by capturing interactions, generating LLM summaries, and injecting relevant history into new sessions.

Lines changed: 72 additions & 0 deletions

@@ -0,0 +1,72 @@
+# Auto Continue/Proceed Removal
+
+Automatically detect and exclude mechanical "continue" / "proceed" user messages from backend submissions after connectivity interruptions, keeping context windows clean.
+
+## Overview
+
+When a coding agent session is interrupted (network drop, timeout, etc.), users commonly type `continue` or `proceed` to resume. These messages serve the purely mechanical purpose of re-enabling the agent loop and provide no semantic value to the remote LLM. Without this feature, they pollute the context window and are sent to every backend on every subsequent turn.
+
+The Auto Continue/Proceed Removal feature detects when the very last user message is exactly `continue` or `proceed` (trimmed, case-insensitive) and tags it as non-forwardable. The existing non-forwardable enforcement layer then silently excludes it from outbound payloads to remote LLMs. The message remains in the agent's local context history, so the coding agent continues building a complete window; only the transmission to the remote model is affected.
+
+## Key Features
+
+- **Exact match only**: Only pure `continue` or `proceed` strings are matched (case-insensitive, trimmed). Phrases like `please continue` or `continue working` are **not** affected.
+- **Last-message scope**: Only the final user message in the request is checked. Earlier occurrences are left untouched.
+- **Non-forwardable tagging**: Uses the existing `NEVER_FORWARD` mechanism, so the proxy keeps the message in local history but excludes it from all backend transmissions.
+- **Default enabled**: Active by default; disable explicitly when needed.
+- **Fail-open**: If the non-forwardable registry or identity service is unavailable, the feature degrades gracefully without breaking requests.
+
+## How It Works
+
+1. During the request transform pipeline, the proxy inspects the last message.
+2. If the message role is `user` and its content (trimmed, lowercased) is exactly `continue` or `proceed`, the proxy computes a deterministic identity for it and tags it with `NEVER_FORWARD` and reason `auto_continue_removal`.
+3. Later, just before the backend call, the non-forwardable message enforcer filters tagged messages out of the outbound payload.
+4. On subsequent turns, the coding agent resubmits the same context window; the tag persists for the session lifetime, so the message continues to be excluded.
+
+## Configuration
+
+The feature is **enabled by default**. Configuration follows the usual precedence: CLI > Environment > Config File.
+
+### CLI Flag
+
+```bash
+# Disable the feature
+python -m src.core.cli --disable-auto-continue-removal
+```
+
+### Environment Variable
+
+```bash
+# Disable the feature
+export AUTO_CONTINUE_REMOVAL_ENABLED=false
+```
+
+### Config File
+
+```yaml
+# config.yaml
+session:
+  auto_continue_removal_enabled: false
+```
+
+## When to Disable
+
+- You want the remote LLM to see literal `continue` / `proceed` prompts (e.g. for debugging agent behavior).
+- Your workflow uses custom continue-like keywords that should reach the model.
+- You are testing context window behavior and need every message forwarded verbatim.
+
+## Logging
+
+When a message is tagged, the proxy logs at INFO level:
+
+```
+Auto continue removal: tagged last user message for session abc-123, reason=auto_continue_removal
+```
+
+Debug-level logging shows when messages are checked but not matched.
+
+## Related Features
+
+- [Non-Forwardable Message Tagging](../features/non-forwardable-message-tagging.md) - Underlying tagging and enforcement mechanism
+- [Quality Verifier System](quality-verifier.md) - Verifies individual responses for quality
+- [Context Window Enforcement](context-window-enforcement.md) - Enforces per-model context limits
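The tag-then-filter flow described in "How It Works" can be sketched end to end as below. All names here (`tag_auto_continue`, `outbound_payload`, the set-based registry, the identity scheme) are hypothetical simplifications of the proxy's real registry and enforcer:

```python
import hashlib

NEVER_FORWARD: set[str] = set()  # registry of tagged message identities

def identity(msg: dict) -> str:
    # Deterministic identity: the same message resubmitted on a later
    # turn maps to the same value, so exclusion persists (step 4).
    raw = f'{msg["role"]}|{msg["content"].strip().lower()}'
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def tag_auto_continue(messages: list[dict]) -> None:
    """Step 2: tag a bare trailing 'continue'/'proceed' user message."""
    if not messages:
        return
    last = messages[-1]
    content = last.get("content")
    if (
        last.get("role") == "user"
        and isinstance(content, str)
        and content.strip().lower() in {"continue", "proceed"}
    ):
        NEVER_FORWARD.add(identity(last))

def outbound_payload(messages: list[dict]) -> list[dict]:
    """Step 3: drop tagged messages from the backend-bound payload."""
    return [m for m in messages if identity(m) not in NEVER_FORWARD]

history = [
    {"role": "user", "content": "Refactor the parser"},
    {"role": "assistant", "content": "Done."},
    {"role": "user", "content": "continue"},
]
tag_auto_continue(history)
assert len(outbound_payload(history)) == 2  # "continue" is not forwarded
assert len(history) == 3                    # but it stays in local history
```

Note the asymmetry the feature relies on: the local history list is never mutated; only the outbound copy is filtered.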

docs/user_guide/index.md

Lines changed: 1 addition & 0 deletions

@@ -46,6 +46,7 @@ Advanced features that enhance the proxy's capabilities:
 ### Response Processing

 - **[Think Tags Fix](features/think-tags-fix.md)** - Correct improperly formatted thinking tags in model responses
+- **[Auto Continue/Proceed Removal](features/auto-continue-removal.md)** - Strip mechanical "continue"/"proceed" messages from remote LLM submissions after interruptions
 - **[Edit Precision Tuning](features/edit-precision.md)** - Automatically adjust temperature and top_p for code editing tasks

 ### Session Memory

src/core/cli_support/applicators/session_applicator.py

Lines changed: 20 additions & 8 deletions

@@ -621,20 +621,32 @@ def _apply_session_flags(
             origin="--disable-dangerous-git-commands-protection",
         )

         if (
             getattr(args, "disable_double_ampersand_fixes_for_windows", None)
             is not None
         ):
             session = overrides.setdefault("session", {})
             session["double_ampersand_fixes_for_windows_enabled"] = (
                 not args.disable_double_ampersand_fixes_for_windows
             )
             resolution.record(
                 "session.double_ampersand_fixes_for_windows_enabled",
                 not args.disable_double_ampersand_fixes_for_windows,
                 ParameterSource.CLI,
                 origin="--disable-double-ampersand-fixes-for-windows",
             )

+        if getattr(args, "disable_auto_continue_removal", None) is not None:
+            session = overrides.setdefault("session", {})
+            session["auto_continue_removal_enabled"] = (
+                not args.disable_auto_continue_removal
+            )
+            resolution.record(
+                "session.auto_continue_removal_enabled",
+                not args.disable_auto_continue_removal,
+                ParameterSource.CLI,
+                origin="--disable-auto-continue-removal",
+            )

     def _apply_strict_command_detection(
         self,

src/core/config/env/from_env_part1b.py

Lines changed: 15 additions & 8 deletions

@@ -195,14 +195,21 @@ def _optional_int(value: str) -> int | None:
         path="session.fix_think_tags_streaming_buffer_size",
         resolution=resolution,
     ),
     "double_ampersand_fixes_for_windows_enabled": _env_to_bool(
         "DOUBLE_AMPERSAND_FIXES_FOR_WINDOWS_ENABLED",
         True,
         env,
         path="session.double_ampersand_fixes_for_windows_enabled",
         resolution=resolution,
     ),
+    "auto_continue_removal_enabled": _env_to_bool(
+        "AUTO_CONTINUE_REMOVAL_ENABLED",
+        True,
+        env,
+        path="session.auto_continue_removal_enabled",
+        resolution=resolution,
+    ),
     "planning_phase": {
         "enabled": _env_to_bool(
             "PLANNING_PHASE_ENABLED",
             False,
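For context, `_env_to_bool` evidently resolves a boolean from the environment with a per-key default (`True` for `AUTO_CONTINUE_REMOVAL_ENABLED`). A minimal sketch of that behavior, assuming conventional truthy/falsy string handling; the real helper also records the resolution source via `path`/`resolution`, omitted here:

```python
def env_to_bool(name: str, default: bool, env: dict[str, str]) -> bool:
    """Parse a boolean env var; unset or unrecognized values fall back
    to the supplied default."""
    raw = env.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off"}:
        return False
    return default

assert env_to_bool("AUTO_CONTINUE_REMOVAL_ENABLED", True, {}) is True
assert env_to_bool(
    "AUTO_CONTINUE_REMOVAL_ENABLED", True,
    {"AUTO_CONTINUE_REMOVAL_ENABLED": "false"},
) is False
```

With a default of `True`, the feature stays enabled unless the variable is explicitly set to a falsy value.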

src/core/config/models/session.py

Lines changed: 6 additions & 5 deletions

@@ -273,11 +273,12 @@ class SessionConfig(DomainModel):
     test_execution_reminder_enabled: bool | None = None
     test_execution_reminder_message: str | None = None
     droid_path_fix_enabled: bool = False
     fix_think_tags_enabled: bool = False
     fix_think_tags_streaming_buffer_size: int = 4096
     double_ampersand_fixes_for_windows_enabled: bool = True
     """Whether automatic && to ; replacement is enabled for Windows clients."""
+    auto_continue_removal_enabled: bool = True
     planning_phase: PlanningPhaseConfig = Field(default_factory=PlanningPhaseConfig)
     max_per_session_backends: int = 32
     session_continuity: SessionContinuityConfig = Field(
         default_factory=SessionContinuityConfig

Comments (0)