Commit 4c58c73

Mateusz committed
feat: composite model routing with ordered failover and weighted random selection
Add composite routing subsystem supporting ordered failover (|) and weighted random (^) selector syntax with deprecation bridge for legacy random replacement. Includes routing coordinator, weighted branch selector, diagnostics publisher, failure recovery bridge, CLI parsing, schema validation, and comprehensive test coverage.
1 parent 857067a commit 4c58c73

58 files changed

Lines changed: 17199 additions & 11161 deletions

.kiro/specs/archive_allowlist.json

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@
     {
       "spec": "compression-layer-rtk-inspired",
       "reason": "Spec completed and retained in active folder during final validation pass"
+    },
+    {
+      "spec": "composite-model-routing-failover-weighted-random",
+      "reason": "Wave C2 finalized and retained in active folder for immediate follow-up validation"
     }
   ]
 }

.kiro/specs/composite-model-routing-failover-weighted-random/design.md

Lines changed: 710 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
# Gap Analysis: composite-model-routing-failover-weighted-random

## Analysis Summary

- Requirements are not yet approved in `spec.json`, but gap analysis can still use them to shape the design phase.
- The codebase already has a strong shared routing spine for main requests and auxiliary reroutes through `BackendModelResolver` and `BackendRequestPreparer`, and quality-verifier calls already flow through `IBackendService`.
- The largest gaps are not basic routing primitives; they are a missing composite-selector grammar/parser, a missing shared composite-routing decision object, bounded nested failover accounting across retry layers, and a migration bridge from random model replacement.
- Existing failover and retry mechanisms are spread across legacy config-driven failover, failure strategy retries, and quality-verifier/replacement flags, so the main design challenge is unifying behavior without duplicating or multiplying attempts.
- The most viable direction for design is a hybrid approach: add a dedicated composite-routing layer and context/diagnostics model, while reusing the current resolver, routing, availability, and execution collaborators underneath.

## Document Status

- Analysis approach: loaded spec + requirements + all steering files + `gap-analysis.md`, then inspected current routing, failover, auxiliary, quality-verifier, and replacement code paths.
- Status warning: requirements are generated but not approved yet in `spec.json`.

## Current State Investigation

### Key assets already in place

- Shared request target resolution already exists in `src/core/services/backend_model_resolver.py`; it preserves alias resolution, `backend:model`, model-only routing, URI params, and static-route handling.
- Auxiliary routing already re-enters the shared resolver path in `src/core/services/backend_completion_flow/backend_request_preparer.py`, especially the auxiliary reroute flow.
- Backend instance/model-only routing and availability-aware candidate filtering already exist in `src/core/services/backend_routing_service.py`, with model-only discovery and ranking built in.
- Execution-time availability checks already classify unsupported / unavailable / rate-limited states in `src/core/services/backend_completion_flow/availability_checker.py`.
- Retry/failover bookkeeping already exists via `retry_attempt` context metadata in `src/core/services/backend_completion_flow/service.py` and `src/core/services/backend_completion_flow/failure_recovery_executor.py`.
- Quality Verifier is already routed as an internal backend call through `IBackendService` in `src/core/services/quality_verifier_orchestrator.py`.
- Random model replacement already mutates request routing state and marks context flags in `src/core/services/request_processor_service.py` and is implemented in `src/core/services/model_replacement_service.py`.

### Existing conventions and constraints

- Current selector parsing in `src/core/domain/model_utils.py` is intentionally conservative: a `:` counts as the explicit-backend separator only when it appears before `/`, and URI params are parsed after `?`.
- Backend routing in `src/core/services/backend_model_resolver.py` today expects a single resolved backend/model pair, not a composite parse tree.
- Legacy failover config in `src/core/services/failover_service.py` is model-keyed config data, not inline selector grammar.
- Failure handling in `src/core/services/backend_completion_flow/failure_recovery_executor.py` already has its own retry/failover loop, which means composite failover must avoid stacking independent attempt budgets.

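The conservative single-selector convention described above can be illustrated with a minimal sketch. `ParsedSelector` and `parse_selector` are hypothetical names, not the contents of `model_utils.py`, and the sketch assumes that a `:` with no `/` at all (plain `backend:model`) also marks an explicit backend.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import parse_qsl


@dataclass(frozen=True)
class ParsedSelector:
    """Hypothetical result type for a single (non-composite) selector."""
    backend: Optional[str]
    model: str
    params: Dict[str, str] = field(default_factory=dict)


def parse_selector(selector: str) -> ParsedSelector:
    # Everything after the first '?' is treated as URI params.
    target, _, query = selector.partition("?")
    params = dict(parse_qsl(query, keep_blank_values=True))

    colon = target.find(":")
    slash = target.find("/")
    # Assumption: ':' names a backend only if it comes before any '/'.
    if colon != -1 and (slash == -1 or colon < slash):
        return ParsedSelector(target[:colon], target[colon + 1:], params)
    # Otherwise the whole target is a model-only selector.
    return ParsedSelector(None, target, params)
```

Under these assumptions, `backend-a:org/model-x?x=y` yields an explicit backend, while `org/model-x:tag` stays model-only because its `:` follows the `/`.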
## Requirements Feasibility Analysis

### Requirement-to-Asset Map

| Requirement Area | Existing Assets | Status | Gap Notes |
|---|---|---:|---|
| R1 Unified composite routing entry point | `src/core/services/backend_model_resolver.py`, `src/core/services/backend_completion_flow/backend_request_preparer.py`, `src/core/services/quality_verifier_orchestrator.py` | Constraint | Main and auxiliary paths are close to unified already; quality verifier uses shared backend service but not the resolver directly as a first-class composite-routing entry point. |
| R2 Ordered failover `\|` selectors | `src/core/services/failover_service.py`, `src/core/services/backend_completion_flow/failure_recovery_executor.py` | Missing | Existing failover is config-driven or error-driven, not selector-driven; no inline ordered composite selector support. |
| R3 Weighted random `^` selectors with `[weight=N]` | `src/core/services/backend_routing_service.py`, `src/core/services/model_replacement_service.py` | Missing | Round-robin and random replacement exist, but no weighted random selector grammar or shared weighted chooser. |
| R4 Deterministic parsing and validation | `src/core/domain/model_utils.py` | Constraint | Deterministic single-selector parsing exists, but there is no composite grammar, nesting policy, or validation error taxonomy for composite selectors. |
| R5 Nested failover safety / bounded retries | `src/core/services/backend_completion_flow/service.py`, `src/core/services/backend_completion_flow/failure_recovery_executor.py`, `config/schemas/app_config.schema.yaml` | Constraint | Retry metadata and max failover hops exist, but they are not clearly shared across nested composite layers plus legacy failure strategy plus quality-verifier calls. |
| R6 Backward compatibility for existing selectors | `src/core/domain/model_utils.py`, `src/core/services/backend_model_resolver.py` | Present | Existing non-composite semantics are explicit and stable; compatibility risk is mainly parser precedence and migration behavior. |
| R7 Deprecate random model replacement | `src/core/services/model_replacement_service.py`, `src/core/services/request_processor_service.py`, `config/config.example.yaml` | Missing | Feature exists, but no deprecation signaling, no compatibility bridge into composite selectors, and no N+1 removal messaging. |
| R8 Observability and diagnosability | `src/core/services/backend_completion_flow/service.py`, `src/core/services/quality_verifier_orchestrator.py`, usage and capture surfaces | Constraint | Context surfaces exist, but there is no structured composite-routing trace explaining parsed branches, selected branch, skipped targets, or exhaustion cause. |

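To make the R8 gap note concrete, here is one possible shape for a structured composite-routing trace. Every name below (`CompositeRoutingTrace`, `SkippedTarget`, the field names) is hypothetical; the sketch only mirrors the four facts the note asks for: parsed branches, selected branch, skipped targets, and exhaustion cause.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SkippedTarget:
    selector: str
    reason: str  # e.g. "unavailable", "rate_limited" (illustrative values)


@dataclass
class CompositeRoutingTrace:
    """Hypothetical diagnostics record for one composite routing decision."""
    raw_selector: str
    parsed_branches: List[List[str]]
    selected_branch: Optional[int] = None
    selected_target: Optional[str] = None
    skipped: List[SkippedTarget] = field(default_factory=list)
    exhaustion_cause: Optional[str] = None

    def skip(self, selector: str, reason: str) -> None:
        # Record a target that was considered but not used.
        self.skipped.append(SkippedTarget(selector, reason))

    def to_payload(self) -> Dict[str, Any]:
        # Plain dict, suitable for logs or capture surfaces.
        return asdict(self)
```

A single serializable record like this could be attached to whichever observability surface is chosen first, without each surface re-deriving the decision.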
### Missing capabilities

- No parser/AST for composite selectors with operators, precedence, nesting, whitespace, or weight annotations.
- No typed "composite routing plan/decision" object flowing through resolver, execution, and observability layers.
- No single place that distinguishes selection failure, availability failure, and execution failure before meaningful output when deciding whether to progress to the next composite target.
- No unified failover-hop budget spanning selector-level failover, legacy failover planning, failure-strategy retry/failover, and internal verifier calls when they themselves use composite selectors.
- No deprecation contract for replacement configuration to map or reject legacy replacement rules.

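As an illustration of the first missing capability, the sketch below parses a flat composite selector. It rests on assumptions the analysis leaves open: `|` binds loosest and separates ordered failover branches, `^` separates weighted alternatives inside a branch, `[weight=N]` is a trailing annotation with a default weight of 1, and there is no nesting or grouping. All type and function names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Trailing "[weight=N]" annotation on a single target (assumed syntax).
_WEIGHT_RE = re.compile(r"^(?P<target>.*?)\s*\[weight=(?P<weight>[^\]]*)\]$")


class CompositeSelectorError(ValueError):
    """Raised when a composite selector fails validation."""


@dataclass(frozen=True)
class WeightedTarget:
    selector: str
    weight: int = 1


@dataclass(frozen=True)
class FailoverBranch:
    alternatives: Tuple[WeightedTarget, ...]


def _parse_leaf(text: str) -> WeightedTarget:
    text = text.strip()
    if not text:
        raise CompositeSelectorError("empty target in composite selector")
    match = _WEIGHT_RE.match(text)
    if match is None:
        return WeightedTarget(text)
    target = match.group("target").strip()
    raw = match.group("weight").strip()
    if not target:
        raise CompositeSelectorError("weight annotation without a target")
    if not (raw.isascii() and raw.isdigit()) or int(raw) == 0:
        raise CompositeSelectorError(
            f"weight must be a positive integer, got {raw!r}"
        )
    return WeightedTarget(target, int(raw))


def parse_composite(selector: str) -> List[FailoverBranch]:
    # Assumption: '|' splits failover branches first, '^' splits within one.
    return [
        FailoverBranch(tuple(_parse_leaf(part) for part in branch.split("^")))
        for branch in selector.split("|")
    ]
```

For example, `backend-a:model-x[weight=3] ^ backend-b:model-y | backend-c:model-z` would parse into two branches: a weighted pair, then a single fallback target. Operator characters inside URI params are not handled here and would need an explicit policy.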
### Constraints from current architecture

- `BackendModelResolver` currently returns one `BackendTarget`, so adding composites there directly may overload its responsibility unless a pre-resolution composite layer is introduced.
- `BackendRoutingService` is backend-instance oriented; turning it into parser + executor + diagnostics would likely bloat a hot-path class.
- `FailureRecoveryExecutor` already increments retry metadata and decides retries; if composite failover also increments independently, attempts can explode.
- `RequestProcessorService` already treats replacement and quality-verifier scheduling specially; migration must preserve these interactions.

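The attempt-explosion risk noted for `FailureRecoveryExecutor` is easiest to see with a shared counter. The sketch below is a hypothetical `HopBudget` that every layer would draw from, instead of each layer keeping its own limit; it assumes that each hop costs exactly one unit regardless of which layer spends it, which the analysis still lists as an open question.

```python
from dataclasses import dataclass, field
from typing import List


class HopBudgetExhausted(RuntimeError):
    """Raised when no failover/retry hops remain for this request."""


@dataclass
class HopBudget:
    """Hypothetical per-request budget shared by all retry/failover layers."""
    max_hops: int
    used: int = 0
    history: List[str] = field(default_factory=list)

    @property
    def remaining(self) -> int:
        return self.max_hops - self.used

    def consume(self, layer: str) -> None:
        # Any layer (selector failover, failure strategy, verifier call)
        # spends from the same counter, so totals cannot multiply.
        if self.used >= self.max_hops:
            raise HopBudgetExhausted(
                f"{layer} requested a hop but all {self.max_hops} are used"
            )
        self.used += 1
        self.history.append(layer)
```

With independent limits of 3 per layer, two nested layers could produce up to 9 attempts; with one shared `HopBudget(max_hops=3)` the total stays at 3, and `history` shows which layer spent each hop.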
## Implementation Approach Options

### Option A: Extend existing components in place

**Description**

Add composite parsing and execution behavior directly into `BackendModelResolver`, `BackendRoutingService`, and `FailureRecoveryExecutor`.

**Likely touch points**

- `src/core/domain/model_utils.py`
- `src/core/services/backend_model_resolver.py`
- `src/core/services/backend_routing_service.py`
- `src/core/services/backend_completion_flow/failure_recovery_executor.py`
- `src/core/services/request_processor_service.py`
- `src/core/services/quality_verifier_orchestrator.py`

**Compatibility assessment**

- Preserves most existing call sites.
- Minimizes DI and stage wiring churn.
- High risk of mixing parsing, policy, execution, and observability responsibilities across already-important hot-path services.

**Trade-offs**

- Pros: smallest surface-area change to calling code; fastest initial implementation.
- Cons: highest risk of resolver/routing bloat, harder to test composite grammar separately, more fragile nested retry accounting.

### Option B: Create new composite-routing components

**Description**

Introduce dedicated components such as:

- composite selector parser / validator,
- composite routing plan model,
- composite routing executor / coordinator,
- composite diagnostics payload builder.

Existing resolver/routing services remain underneath as leaf primitives for single-target resolution.

**Likely integration points**

- New services under `src/core/services/`
- New interfaces under `src/core/interfaces/`
- Resolver integration at `src/core/services/backend_model_resolver.py`
- Backend execution integration at `src/core/services/backend_completion_flow/service.py`
- Replacement bridge integration at `src/core/services/request_processor_service.py`

**Responsibility boundaries**

- Parser owns syntax, weights, nesting, and validation.
- Coordinator owns branch progression and bounded hop accounting.
- Existing resolver/routing service still owns single-target backend/model resolution and candidate eligibility.
- Observability layer consumes structured routing decision objects.

**Trade-offs**

- Pros: cleaner separation, best long-term maintainability, easier unit/property testing.
- Cons: more files, more DI wiring, higher design overhead.

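The coordinator's branch-progression responsibility in Option B could look roughly like the sketch below. It assumes branches are tried in order, that unavailable targets are filtered out before the weighted draw, and that eligibility is supplied by an injected `is_available` callable standing in for the existing routing service. `select_target` and the tuple-based types are hypothetical.

```python
import random
from typing import Callable, Optional, Sequence, Tuple

Target = Tuple[str, int]      # (selector, weight); weight assumed positive
Branch = Sequence[Target]


def select_target(
    branches: Sequence[Branch],
    is_available: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the chosen selector, or None when every branch is exhausted."""
    if rng is None:
        rng = random.Random()
    for branch in branches:
        # Eligibility stays with the caller-supplied check (leaf primitive).
        eligible = [(sel, w) for sel, w in branch if is_available(sel)]
        if not eligible:
            continue  # ordered failover: move on to the next branch
        selectors = [sel for sel, _ in eligible]
        weights = [w for _, w in eligible]
        # Weighted random pick among the eligible targets of this branch.
        return rng.choices(selectors, weights=weights, k=1)[0]
    return None
```

Passing a seeded `random.Random` keeps the weighted draw reproducible in unit and property tests, which is one of the testing benefits Option B claims.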
### Option C: Hybrid incremental migration

**Description**

Introduce a dedicated composite parser + decision context first, but reuse current single-target resolver/routing and failure executor underneath. Then layer deprecation/migration and observability on top.

**Combination strategy**

- New:
  - parser/validator,
  - composite decision model,
  - shared hop-budget context,
  - deprecation bridge adapter for replacement.
- Extend:
  - `BackendModelResolver` to delegate composite selectors,
  - `FailureRecoveryExecutor` to respect shared composite hop accounting,
  - request/quality-verifier flows to emit composite diagnostics.

**Risk mitigation**

- Preserve non-composite path unchanged.
- Gate composite selector handling on operator presence (`|`, `^`, weight syntax).
- Make migration bridge explicit and reversible in config behavior.
- Add regression tests for main, auxiliary, and quality-verifier surfaces before broad rollout.

**Trade-offs**

- Pros: best balance for a brownfield codebase; reuses proven primitives while isolating new grammar/policy.
- Cons: temporary mixed architecture until all routing surfaces fully converge.

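The operator-presence gate from the risk-mitigation list can be sketched as a cheap pre-check that leaves the legacy path untouched. `is_composite_selector` and `resolve` are hypothetical names, and the check deliberately ignores the open question of whether `|` or `^` may legitimately appear inside URI params.

```python
import re
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed weight annotation syntax: "[weight=...]" anywhere in the selector.
_WEIGHT_ANNOTATION = re.compile(r"\[weight=[^\]]*\]")


def is_composite_selector(selector: str) -> bool:
    """True when the selector uses any composite syntax."""
    return (
        "|" in selector
        or "^" in selector
        or _WEIGHT_ANNOTATION.search(selector) is not None
    )


def resolve(
    selector: str,
    resolve_single: Callable[[str], T],
    resolve_composite: Callable[[str], T],
) -> T:
    # Non-composite selectors take the existing single-target path unchanged.
    if is_composite_selector(selector):
        return resolve_composite(selector)
    return resolve_single(selector)
```

Because a plain `backend:model?x=y` selector never reaches the composite branch, existing behavior can be regression-tested independently of the new grammar.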
## Complexity and Risk

- Effort: `L` (touches hot-path routing, failure handling, config/deprecation behavior, and three routing surfaces: main, auxiliary, verifier).
- Risk: `High` (parser precedence, retry explosion, and backward-compatibility regressions could affect core proxy behavior and operator trust).

## Design-Phase Recommendations

### Preferred direction to evaluate further

- Option C looks strongest for design: keep existing single-target resolution intact, but add a new composite-routing layer above it.
- Option B is the cleanest end-state and may become the target architecture if Option C is used as a staged migration.

### Research Needed

1. How should operator precedence and nesting work between `|`, `^`, URI params, and `[weight=N]` without breaking current `backend:model?x=y` semantics?
2. What exact event should consume a composite failover hop: parse rejection, candidate ineligibility, backend unavailability, retry attempt, or only branch transitions?
3. How should composite routing interact with existing failure strategy in `src/core/services/backend_completion_flow/failure_recovery_executor.py` so there is one bounded attempt budget?
4. What is the safest compatibility mapping from replacement rules in `src/core/services/model_replacement_service.py` into composite weighted-random selectors, and when must mapping fail explicitly?
5. Which observability surfaces should carry composite metadata first: wire capture, usage records, logs, diagnostics endpoints, or all of them?
6. Should quality verifier and auxiliary requests allow full composite syntax, or should some surfaces restrict explicit backend requirements for safety and clarity?

## Next Steps

1. Approve or refine the requirements in `.kiro/specs/composite-model-routing-failover-weighted-random/requirements.md`.
2. Move to `/kiro:spec-design composite-model-routing-failover-weighted-random`, carrying forward the hybrid approach and research items above.
3. In design, define a single composite-routing contract that all three surfaces use, plus a single shared attempt-budget model.
