| 1 | +# Gap Analysis: composite-model-routing-failover-weighted-random |
| 2 | + |
| 3 | +## Analysis Summary |
| 4 | + |
| 5 | +- Requirements are not yet approved in `spec.json`, but gap analysis can still use them to shape the design phase. |
| 6 | +- The codebase already has a strong shared routing spine for main requests and auxiliary reroutes through `BackendModelResolver` and `BackendRequestPreparer`, plus quality-verifier calls already flow through `IBackendService`. |
| 7 | +- The largest gaps are not basic routing primitives; they are a missing composite-selector grammar/parser, a missing shared composite-routing decision object, bounded nested failover accounting across retry layers, and a migration bridge from random model replacement. |
| 8 | +- Existing failover and retry mechanisms are spread across legacy config-driven failover, failure strategy retries, and quality-verifier/replacement flags, so the main design challenge is unifying behavior without duplicating or multiplying attempts. |
| 9 | +- The most viable design direction is a hybrid approach: add a dedicated composite-routing layer and context/diagnostics model, while reusing the current resolver, routing, availability, and execution collaborators underneath. |
| 10 | + |
| 11 | +## Document Status |
| 12 | + |
| 13 | +- Analysis approach: loaded spec + requirements + all steering files + `gap-analysis.md`, then inspected current routing, failover, auxiliary, quality-verifier, and replacement code paths. |
| 14 | +- Status warning: requirements are generated but not approved yet in `spec.json`. |
| 15 | + |
| 16 | +## Current State Investigation |
| 17 | + |
| 18 | +### Key assets already in place |
| 19 | + |
| 20 | +- Shared request target resolution already exists in `src/core/services/backend_model_resolver.py`; it preserves alias resolution, `backend:model`, model-only routing, URI params, and static-route handling. |
| 21 | +- Auxiliary routing already re-enters the shared resolver path in `src/core/services/backend_completion_flow/backend_request_preparer.py`, especially the auxiliary reroute flow. |
| 22 | +- Backend instance/model-only routing and availability-aware candidate filtering already exist in `src/core/services/backend_routing_service.py`, with model-only discovery and ranking built in. |
| 23 | +- Execution-time availability checks already classify unsupported / unavailable / rate-limited states in `src/core/services/backend_completion_flow/availability_checker.py`. |
| 24 | +- Retry/failover bookkeeping already exists via `retry_attempt` context metadata in `src/core/services/backend_completion_flow/service.py` and `src/core/services/backend_completion_flow/failure_recovery_executor.py`. |
| 25 | +- Quality Verifier is already routed as an internal backend call through `IBackendService` in `src/core/services/quality_verifier_orchestrator.py`. |
| 26 | +- Random model replacement already mutates request routing state and marks context flags in `src/core/services/request_processor_service.py` and is implemented in `src/core/services/model_replacement_service.py`. |
| 27 | + |
| 28 | +### Existing conventions and constraints |
| 29 | + |
| 30 | +- Current selector parsing in `src/core/domain/model_utils.py` is intentionally conservative: a `:` marks an explicit backend only when it appears before any `/`, and URI params are parsed after `?`. |
| 31 | +- Backend routing today expects a single resolved backend/model pair, not a composite parse tree, in `src/core/services/backend_model_resolver.py`. |
| 32 | +- Legacy failover config is model-keyed config data, not inline selector grammar, in `src/core/services/failover_service.py`. |
| 33 | +- Failure handling already has its own retry/failover loop, which means composite failover must avoid stacking independent attempt budgets, in `src/core/services/backend_completion_flow/failure_recovery_executor.py`. |
| 34 | + |
| 35 | +## Requirements Feasibility Analysis |
| 36 | + |
| 37 | +### Requirement-to-Asset Map |
| 38 | + |
| 39 | +| Requirement Area | Existing Assets | Status | Gap Notes | |
| 40 | +|---|---|---:|---| |
| 41 | +| R1 Unified composite routing entry point | `src/core/services/backend_model_resolver.py`, `src/core/services/backend_completion_flow/backend_request_preparer.py`, `src/core/services/quality_verifier_orchestrator.py` | Constraint | Main and auxiliary paths are close to unified already; quality verifier uses shared backend service but not the resolver directly as a first-class composite-routing entry point. | |
| 42 | +| R2 Ordered failover `|` selectors | `src/core/services/failover_service.py`, `src/core/services/backend_completion_flow/failure_recovery_executor.py` | Missing | Existing failover is config-driven or error-driven, not selector-driven; no inline ordered composite selector support. | |
| 43 | +| R3 Weighted random `^` selectors with `[weight=N]` | `src/core/services/backend_routing_service.py`, `src/core/services/model_replacement_service.py` | Missing | Round-robin and random replacement exist, but no weighted random selector grammar or shared weighted chooser. | |
| 44 | +| R4 Deterministic parsing and validation | `src/core/domain/model_utils.py` | Constraint | Deterministic single-selector parsing exists, but there is no composite grammar, nesting policy, or validation error taxonomy for composite selectors. | |
| 45 | +| R5 Nested failover safety / bounded retries | `src/core/services/backend_completion_flow/service.py`, `src/core/services/backend_completion_flow/failure_recovery_executor.py`, `config/schemas/app_config.schema.yaml` | Constraint | Retry metadata and max failover hops exist, but they are not clearly shared across nested composite layers plus legacy failure strategy plus quality-verifier calls. | |
| 46 | +| R6 Backward compatibility for existing selectors | `src/core/domain/model_utils.py`, `src/core/services/backend_model_resolver.py` | Present | Existing non-composite semantics are explicit and stable; compatibility risk is mainly parser precedence and migration behavior. | |
| 47 | +| R7 Deprecate random model replacement | `src/core/services/model_replacement_service.py`, `src/core/services/request_processor_service.py`, `config/config.example.yaml` | Missing | Feature exists, but there is no deprecation signaling, no compatibility bridge into composite selectors, and no messaging that removal follows in release N+1. | |
| 48 | +| R8 Observability and diagnosability | `src/core/services/backend_completion_flow/service.py`, `src/core/services/quality_verifier_orchestrator.py`, usage and capture surfaces | Constraint | Context surfaces exist, but there is no structured composite-routing trace explaining parsed branches, selected branch, skipped targets, or exhaustion cause. | |
| 49 | + |
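The shared weighted chooser that R3 lists as missing could look roughly like the sketch below. It is illustrative only: the function name `choose_weighted`, the `(selector, weight)` tuple shape, and the rule that weights must be positive integers are assumptions, not existing code. An injectable `random.Random` keeps the choice reproducible in tests.

```python
import random
from typing import Optional, Sequence, Tuple


def choose_weighted(
    branches: Sequence[Tuple[str, int]],
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one selector with probability proportional to its weight.

    `branches` is a sequence of (selector, weight) pairs. Both the name and
    the shape are hypothetical; the real grammar is still an open design item.
    """
    if not branches:
        raise ValueError("at least one branch is required")
    for selector, weight in branches:
        if weight <= 0:
            raise ValueError(
                f"weight for {selector!r} must be positive, got {weight}"
            )
    # Fall back to an unseeded generator when the caller does not inject one.
    rng = rng or random.Random()
    selectors = [selector for selector, _ in branches]
    weights = [weight for _, weight in branches]
    return rng.choices(selectors, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` from the caller is one way to make weighted selection deterministic under test while leaving production behavior random.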
| 50 | +### Missing capabilities |
| 51 | + |
| 52 | +- No parser/AST for composite selectors with operators, precedence, nesting, whitespace, or weight annotations. |
| 53 | +- No typed "composite routing plan/decision" object flowing through resolver, execution, and observability layers. |
| 54 | +- No single place that classifies a failure as a selection failure, an availability failure, or an execution failure before meaningful output; that classification is what should decide whether a composite selector advances to its next target. |
| 55 | +- No unified failover-hop budget spanning selector-level failover, legacy failover planning, failure-strategy retry/failover, and internal verifier calls when they themselves use composite selectors. |
| 56 | +- No deprecation contract for replacement configuration to map or reject legacy replacement rules. |
| 57 | + |
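To make the first gap concrete, here is a minimal parser sketch. The grammar is an assumption, since precedence and nesting are still open research items: `|` separates ordered failover groups and binds loosest, `^` separates weighted alternatives inside a group, `[weight=N]` is a trailing annotation that defaults to 1, and there is no parenthesized nesting. All type and function names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Trailing "[weight=N]" annotation; everything before it is the plain selector.
_WEIGHT_RE = re.compile(r"^(?P<target>.*?)\s*\[weight=(?P<weight>\d+)\]$")


class CompositeSelectorError(ValueError):
    """Raised when a composite selector is malformed (hypothetical type)."""


@dataclass(frozen=True)
class WeightedTarget:
    selector: str  # single-target selector, e.g. "backend:model?x=y"
    weight: int = 1


@dataclass(frozen=True)
class FailoverGroup:
    targets: Tuple[WeightedTarget, ...]


def parse_composite(selector: str) -> List[FailoverGroup]:
    """Parse "a[weight=3] ^ b | c" into ordered groups of weighted targets."""
    groups: List[FailoverGroup] = []
    for raw_group in selector.split("|"):
        targets: List[WeightedTarget] = []
        for raw_target in raw_group.split("^"):
            text = raw_target.strip()
            if not text:
                raise CompositeSelectorError(f"empty branch in {selector!r}")
            weight = 1
            match = _WEIGHT_RE.match(text)
            if match:
                text = match.group("target")
                weight = int(match.group("weight"))
                if not text or weight <= 0:
                    raise CompositeSelectorError(f"bad weight in {selector!r}")
            elif "[weight" in text:
                raise CompositeSelectorError(f"malformed weight in {selector!r}")
            targets.append(WeightedTarget(text, weight))
        groups.append(FailoverGroup(tuple(targets)))
    return groups
```

Each leaf `selector` string is left untouched, so it could still be handed to the existing single-target parsing path. This sketch does not handle `|` or `^` inside URI parameter values; it would split on them.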
| 58 | +### Constraints from current architecture |
| 59 | + |
| 60 | +- `BackendModelResolver` currently returns one `BackendTarget`, so adding composites there directly may overload its responsibility unless a pre-resolution composite layer is introduced. |
| 61 | +- `BackendRoutingService` is backend-instance oriented; turning it into parser + executor + diagnostics would likely bloat a hot-path class. |
| 62 | +- `FailureRecoveryExecutor` already increments retry metadata and decides retries; if composite failover also increments independently, attempts can explode. |
| 63 | +- `RequestProcessorService` already treats replacement and quality-verifier scheduling specially; migration must preserve these interactions. |
| 64 | + |
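One way to avoid the attempt explosion described above is a single budget object that every retry layer draws from. The sketch below is an assumption-laden illustration: `HopBudget`, `HopBudgetExhausted`, and the layer labels are hypothetical names, and which events should consume a hop is still an open question for design.

```python
from dataclasses import dataclass, field
from typing import List


class HopBudgetExhausted(RuntimeError):
    """Raised when no failover hops remain (hypothetical type)."""


@dataclass
class HopBudget:
    """One shared counter for all layers that may trigger another attempt."""

    max_hops: int
    used: int = 0
    history: List[str] = field(default_factory=list)

    @property
    def remaining(self) -> int:
        return self.max_hops - self.used

    def consume(self, layer: str, reason: str) -> None:
        """Spend one hop on behalf of `layer`, or fail if none are left."""
        if self.used >= self.max_hops:
            raise HopBudgetExhausted(
                f"{layer} requested a hop ({reason}) but all "
                f"{self.max_hops} hops are spent: {self.history}"
            )
        self.used += 1
        self.history.append(f"{layer}: {reason}")
```

If the composite coordinator and the failure-recovery loop both receive the same `HopBudget` instance through the request context, the total number of attempts stays bounded by `max_hops` regardless of how the layers nest.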
| 65 | +## Implementation Approach Options |
| 66 | + |
| 67 | +### Option A: Extend existing components in place |
| 68 | + |
| 69 | +**Description** |
| 70 | + |
| 71 | +Add composite parsing and execution behavior directly into `BackendModelResolver`, `BackendRoutingService`, and `FailureRecoveryExecutor`. |
| 72 | + |
| 73 | +**Likely touch points** |
| 74 | + |
| 75 | +- `src/core/domain/model_utils.py` |
| 76 | +- `src/core/services/backend_model_resolver.py` |
| 77 | +- `src/core/services/backend_routing_service.py` |
| 78 | +- `src/core/services/backend_completion_flow/failure_recovery_executor.py` |
| 79 | +- `src/core/services/request_processor_service.py` |
| 80 | +- `src/core/services/quality_verifier_orchestrator.py` |
| 81 | + |
| 82 | +**Compatibility assessment** |
| 83 | + |
| 84 | +- Preserves most existing call sites. |
| 85 | +- Minimizes DI and stage wiring churn. |
| 86 | +- High risk of mixing parsing, policy, execution, and observability responsibilities across already-important hot-path services. |
| 87 | + |
| 88 | +**Trade-offs** |
| 89 | + |
| 90 | +- Pros: smallest surface-area change to calling code; fastest initial implementation. |
| 91 | +- Cons: highest risk of resolver/routing bloat, harder to test composite grammar separately, more fragile nested retry accounting. |
| 92 | + |
| 93 | +### Option B: Create new composite-routing components |
| 94 | + |
| 95 | +**Description** |
| 96 | + |
| 97 | +Introduce dedicated components such as: |
| 98 | + |
| 99 | +- composite selector parser / validator, |
| 100 | +- composite routing plan model, |
| 101 | +- composite routing executor / coordinator, |
| 102 | +- composite diagnostics payload builder. |
| 103 | + |
| 104 | +Existing resolver/routing services remain underneath as leaf primitives for single-target resolution. |
| 105 | + |
| 106 | +**Likely integration points** |
| 107 | + |
| 108 | +- New services under `src/core/services/` |
| 109 | +- New interfaces under `src/core/interfaces/` |
| 110 | +- Resolver integration at `src/core/services/backend_model_resolver.py` |
| 111 | +- Backend execution integration at `src/core/services/backend_completion_flow/service.py` |
| 112 | +- Replacement bridge integration at `src/core/services/request_processor_service.py` |
| 113 | + |
| 114 | +**Responsibility boundaries** |
| 115 | + |
| 116 | +- Parser owns syntax, weights, nesting, and validation. |
| 117 | +- Coordinator owns branch progression and bounded hop accounting. |
| 118 | +- Existing resolver/routing service still owns single-target backend/model resolution and candidate eligibility. |
| 119 | +- Observability layer consumes structured routing decision objects. |
| 120 | + |
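The structured routing decision object that the observability layer would consume might be shaped like the sketch below. Field names, the set of outcome labels, and the class names are all assumptions for illustration; the real contract belongs to the design phase.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional

# Hypothetical outcome labels for a single branch.
_OUTCOMES = {"selected", "skipped", "failed"}


@dataclass
class BranchAttempt:
    selector: str
    outcome: str
    reason: Optional[str] = None


@dataclass
class CompositeRoutingDecision:
    """Trace of how a composite selector was walked for one request."""

    raw_selector: str
    attempts: List[BranchAttempt] = field(default_factory=list)
    selected: Optional[str] = None
    exhaustion_cause: Optional[str] = None

    def record(
        self, selector: str, outcome: str, reason: Optional[str] = None
    ) -> None:
        """Append one branch result and remember the selected branch."""
        if outcome not in _OUTCOMES:
            raise ValueError(f"unknown outcome {outcome!r}")
        self.attempts.append(BranchAttempt(selector, outcome, reason))
        if outcome == "selected":
            self.selected = selector

    def to_log_payload(self) -> Dict[str, Any]:
        """Return a plain dict suitable for logs or capture surfaces."""
        return asdict(self)
```

Because the payload is a plain dictionary, the same object could feed logs, usage records, or wire capture without each surface re-deriving the routing story.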
| 121 | +**Trade-offs** |
| 122 | + |
| 123 | +- Pros: cleaner separation, best long-term maintainability, easier unit/property testing. |
| 124 | +- Cons: more files, more DI wiring, higher design overhead. |
| 125 | + |
| 126 | +### Option C: Hybrid incremental migration |
| 127 | + |
| 128 | +**Description** |
| 129 | + |
| 130 | +Introduce a dedicated composite parser + decision context first, but reuse current single-target resolver/routing and failure executor underneath. Then layer deprecation/migration and observability on top. |
| 131 | + |
| 132 | +**Combination strategy** |
| 133 | + |
| 134 | +- New: |
| 135 | + - parser/validator, |
| 136 | + - composite decision model, |
| 137 | + - shared hop-budget context, |
| 138 | + - deprecation bridge adapter for replacement. |
| 139 | +- Extend: |
| 140 | + - `BackendModelResolver` to delegate composite selectors, |
| 141 | + - `FailureRecoveryExecutor` to respect shared composite hop accounting, |
| 142 | + - request/quality-verifier flows to emit composite diagnostics. |
| 143 | + |
| 144 | +**Risk mitigation** |
| 145 | + |
| 146 | +- Preserve non-composite path unchanged. |
| 147 | +- Gate composite selector handling on operator presence (`|`, `^`, weight syntax). |
| 148 | +- Make migration bridge explicit and reversible in config behavior. |
| 149 | +- Add regression tests for main, auxiliary, and quality-verifier surfaces before broad rollout. |
| 150 | + |
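The operator-presence gate from the risk-mitigation list can be sketched as a small predicate plus a dispatcher. The names `is_composite_selector` and `resolve`, and the two resolver callables, are hypothetical; the marker list mirrors the operators named above.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Markers whose presence routes a selector to the composite path (assumed).
_COMPOSITE_MARKERS = ("|", "^", "[weight=")


def is_composite_selector(selector: str) -> bool:
    """Return True when the selector contains any composite marker."""
    return any(marker in selector for marker in _COMPOSITE_MARKERS)


def resolve(
    selector: str,
    resolve_single: Callable[[str], T],
    resolve_composite: Callable[[str], T],
) -> T:
    """Send plain selectors down the existing path, composites to the new one."""
    if is_composite_selector(selector):
        return resolve_composite(selector)
    return resolve_single(selector)
```

A plain substring check would also match `|` or `^` appearing inside a URI parameter value, so the design would need to decide whether such values must be escaped or whether the gate should be smarter.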
| 151 | +**Trade-offs** |
| 152 | + |
| 153 | +- Pros: best balance for a brownfield codebase; reuses proven primitives while isolating new grammar/policy. |
| 154 | +- Cons: temporary mixed architecture until all routing surfaces fully converge. |
| 155 | + |
| 156 | +## Complexity and Risk |
| 157 | + |
| 158 | +- Effort: `L` — touches hot-path routing, failure handling, config/deprecation behavior, and three routing surfaces (main, auxiliary, verifier). |
| 159 | +- Risk: `High` — parser precedence, retry explosion, and backward-compatibility regressions could affect core proxy behavior and operator trust. |
| 160 | + |
| 161 | +## Design-Phase Recommendations |
| 162 | + |
| 163 | +### Preferred direction to evaluate further |
| 164 | + |
| 165 | +- Option C looks strongest for design: keep existing single-target resolution intact, but add a new composite-routing layer above it. |
| 166 | +- Option B is the cleanest end-state and may become the target architecture if Option C is used as a staged migration. |
| 167 | + |
| 168 | +### Research Needed |
| 169 | + |
| 170 | +1. How should operator precedence and nesting work between `|`, `^`, URI params, and `[weight=N]` without breaking current `backend:model?x=y` semantics? |
| 171 | +2. What exact event should consume a composite failover hop: parse rejection, candidate ineligibility, backend unavailability, retry attempt, or only branch transitions? |
| 172 | +3. How should composite routing interact with existing failure strategy in `src/core/services/backend_completion_flow/failure_recovery_executor.py` so there is one bounded attempt budget? |
| 173 | +4. What is the safest compatibility mapping from replacement rules in `src/core/services/model_replacement_service.py` into composite weighted-random selectors, and when must mapping fail explicitly? |
| 174 | +5. Which observability surfaces should carry composite metadata first: wire capture, usage records, logs, diagnostics endpoints, or all of them? |
| 175 | +6. Should quality verifier and auxiliary requests allow full composite syntax, or should some surfaces restrict explicit backend requirements for safety and clarity? |
| 176 | + |
| 177 | +## Next Steps |
| 178 | + |
| 179 | +1. Approve or refine the requirements in `.kiro/specs/composite-model-routing-failover-weighted-random/requirements.md`. |
| 180 | +2. Move to `/kiro:spec-design composite-model-routing-failover-weighted-random`, carrying forward the hybrid approach and research items above. |
| 181 | +3. In design, define a single composite-routing contract that all three surfaces use, plus a single shared attempt-budget model. |