Commit 87ea3fd

docs: capture architecture redesign target
Document the approved ports-and-use-cases end state so the codebase can refactor toward a stable, testable architecture.
1 parent a215311 commit 87ea3fd

1 file changed

Lines changed: 214 additions & 0 deletions

@@ -0,0 +1,214 @@
# Codebase Architecture Redesign

Date: 2026-03-09
Status: Approved

## Goal

Define an ideal end-state architecture for the Video RSS Aggregator that improves
maintainability, reliability, delivery velocity, and testability.

## Context

The current codebase works, but several design pressures now slow it down:

- `Pipeline` mixes composition, orchestration, feed ingestion, processing,
  persistence, runtime reporting, and RSS generation.
- Data contracts leak across layers, especially between summarization,
  persistence, and RSS rendering.
- CLI and API do not consistently share one application-level workflow model.
- Setup/runtime data is shaped in multiple places across Python, HTML, and JS.
- Tests rely heavily on monkeypatching concrete modules instead of stable seams.

## Chosen Approach

Adopt a ports-and-use-cases architecture with four layers:

1. Adapters
2. Application
3. Domain
4. Infrastructure

This was selected over a lighter workflow-slice refactor or a stabilized version
of the current layering because the goal is an ideal long-term architecture,
not only a lower-risk cleanup.

## Architectural Principles

- Dependencies point inward only.
- Business workflows live in application use cases, not transport adapters.
- Domain types are stable and independent of storage, HTTP, CLI, and model APIs.
- Infrastructure implements ports instead of owning business rules.
- Composition happens once in a single composition root.

## Target Architecture

### Adapters

Adapters translate external interactions into application requests and map
application responses back out.

- FastAPI routes
- CLI commands
- GUI/setup endpoints and page models
- RSS HTTP delivery surface

Adapters should not contain orchestration logic beyond request mapping,
validation, and response formatting.

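To make the adapter boundary concrete, here is a minimal sketch of a CLI adapter. It uses the standard-library `argparse` as a stand-in for the project's Click commands so that it runs without dependencies; `IngestFeedRequest`, `IngestFeedResponse`, `run_ingest_command`, and all field names are hypothetical and not taken from the codebase.

```python
import argparse
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class IngestFeedRequest:
    """Hypothetical application request; field names are assumptions."""
    feed_url: str
    process: bool


@dataclass(frozen=True)
class IngestFeedResponse:
    """Hypothetical application response."""
    stored: int
    processed: int


def run_ingest_command(
    argv: Sequence[str],
    ingest_feed: Callable[[IngestFeedRequest], IngestFeedResponse],
) -> str:
    """CLI adapter: parse arguments, call the use case, format the reply."""
    parser = argparse.ArgumentParser(prog="ingest")
    parser.add_argument("feed_url")
    parser.add_argument("--process", action="store_true")
    args = parser.parse_args(argv)

    # Request mapping and validation are the adapter's only responsibilities.
    if not args.feed_url.startswith(("http://", "https://")):
        raise ValueError("feed_url must be an http(s) URL")

    # The workflow itself lives behind the injected use case.
    response = ingest_feed(IngestFeedRequest(args.feed_url, args.process))
    return f"stored={response.stored} processed={response.processed}"
```

A FastAPI route would follow the same shape: build the same request object from the HTTP payload, call the same use case, and format the same response differently.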
### Application

Application use cases coordinate business workflows through explicit ports.

Core use cases:

- `BootstrapRuntime`
- `GetRuntimeStatus`
- `IngestFeed`
- `ProcessSource`
- `RenderRssFeed`

The current `Pipeline` class should be replaced by these focused use cases.

### Domain

Domain types define the stable language of the system.

Illustrative types:

- `SourceItem`
- `VideoRecord`
- `Transcript`
- `PreparedMedia`
- `SummaryDraft`
- `SummaryResult`
- `ProcessOutcome`
- `DiagnosticReport`

These types must not depend on SQLite rows, Ollama payloads, FastAPI models,
Click commands, or subprocess output.

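As a rough illustration of what dependency-free domain types could look like, the sketch below models a few of the names above as frozen dataclasses built only from the standard library. The fields and the `OutcomeStatus` enum are assumptions made for the example; the document names the types but does not define their shape.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OutcomeStatus(Enum):
    """Assumed status values, mirroring the outcome names used in this document."""
    SUCCESS = "success"
    PARTIAL_SUCCESS = "partial_success"
    FAILURE = "failure"


@dataclass(frozen=True)
class SourceItem:
    source_url: str
    title: str


@dataclass(frozen=True)
class Transcript:
    text: str
    language: Optional[str] = None


@dataclass(frozen=True)
class SummaryResult:
    text: str
    model: str


@dataclass(frozen=True)
class ProcessOutcome:
    # Only domain types and primitives appear here: no rows, payloads, or models.
    source: SourceItem
    status: OutcomeStatus
    summary: Optional[SummaryResult] = None
    error: Optional[str] = None
```

Freezing the dataclasses keeps values immutable once created, which makes them safe to pass between layers.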
### Infrastructure

Infrastructure adapters implement application ports.

- SQLite repositories
- Ollama client adapter
- Feed fetching adapter
- Media preparation adapter around yt-dlp, ffmpeg, and filesystem artifacts
- RSS rendering adapter
- Runtime inspection adapter

Infrastructure owns transport and tool integration details, but not workflow
decisions.

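One way an infrastructure adapter might look is sketched below: a SQLite-backed repository that keeps SQL and row handling inside the adapter and returns only domain types. The table layout, the `save`/`get` method names, and the two `VideoRecord` fields are assumptions for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoRecord:
    """Domain type; the two fields are assumed for illustration."""
    source_url: str
    title: str


class SqliteVideoRepository:
    """Hypothetical adapter behind a `VideoRepository` port."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS videos ("
            "source_url TEXT PRIMARY KEY, title TEXT NOT NULL)"
        )

    def save(self, video: VideoRecord) -> None:
        # Insert or overwrite by primary key; SQL details stay in the adapter.
        self._conn.execute(
            "INSERT OR REPLACE INTO videos (source_url, title) VALUES (?, ?)",
            (video.source_url, video.title),
        )
        self._conn.commit()

    def get(self, source_url: str) -> Optional[VideoRecord]:
        row = self._conn.execute(
            "SELECT source_url, title FROM videos WHERE source_url = ?",
            (source_url,),
        ).fetchone()
        # Rows are converted to a domain type before they leave the adapter.
        return VideoRecord(*row) if row is not None else None
```

Because the connection is injected, a repository integration test can pass `sqlite3.connect(":memory:")` and exercise the real SQL without touching disk.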
## Ports

The application layer should depend on explicit interfaces such as:

- `FeedSource`
- `VideoRepository`
- `SummaryRepository`
- `MediaPreparationService`
- `Summarizer`
- `RuntimeInspector`
- `PublicationRenderer`
- `ArtifactStore`

This creates narrow seams for tests and keeps adapters replaceable.

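In Python, ports like these could be expressed as `typing.Protocol` classes, which lets any object with matching methods satisfy the port without inheriting from it. The sketch below shows two of the listed ports; the method names and signatures are assumptions, since the document only names the interfaces.

```python
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class Summarizer(Protocol):
    """Port: turn transcript text into summary text (assumed signature)."""

    def summarize(self, transcript: str) -> str: ...


@runtime_checkable
class ArtifactStore(Protocol):
    """Port: store and fetch binary artifacts by key (assumed signature)."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryArtifactStore:
    """Test double; satisfies `ArtifactStore` structurally, with no inheritance."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._items.get(key)
```

Note that `runtime_checkable` only verifies that the methods exist when used with `isinstance`; signature conformance is left to a static type checker.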
## Data Flow

### Startup

The composition root loads `Config`, builds infrastructure adapters, wires use
cases, and exposes them to FastAPI and CLI entry points. Web startup and
shutdown should be owned by FastAPI lifespan rather than by a prebuilt runtime
object created externally.

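The startup sequence described above could be sketched as a context manager that builds adapters, wires use cases, and tears everything down in one place. `Config` here is a one-field stand-in, and `Application` and `composition_root` are hypothetical names; the real wiring would cover every adapter and use case.

```python
import sqlite3
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, Dict, Iterator


@dataclass(frozen=True)
class Config:
    """Stand-in for the real `Config`; this single field is an assumption."""
    database_path: str


@dataclass(frozen=True)
class Application:
    """The wired use cases handed to FastAPI and CLI entry points."""
    get_runtime_status: Callable[[], Dict[str, str]]


@contextmanager
def composition_root(config: Config) -> Iterator[Application]:
    # 1. Build infrastructure adapters from configuration.
    connection = sqlite3.connect(config.database_path)

    # 2. Wire use cases to those adapters (one trivial use case here).
    def get_runtime_status() -> Dict[str, str]:
        connection.execute("SELECT 1").fetchone()
        return {"database": "ok"}

    try:
        # 3. Expose the use cases to whichever entry point asked for them.
        yield Application(get_runtime_status=get_runtime_status)
    finally:
        # Shutdown lives in the same place as startup.
        connection.close()
```

A CLI entry point can use this directly in a `with` block. On the web side, an async lifespan handler passed to `FastAPI(lifespan=...)` could enter the same context manager, which would keep startup and shutdown owned by the framework as the section requires.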
### Ingest

`IngestFeed` fetches and parses a feed, normalizes entries into domain types,
stores feed/video metadata, and optionally delegates processing to
`ProcessSource`. It should not own processing internals.

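A minimal sketch of that flow follows. Plain callables stand in for the `FeedSource` and `VideoRepository` ports to keep the example short, and the entry keys (`link`, `title`) and the `execute` method name are assumptions rather than the project's actual contract.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class SourceItem:
    source_url: str
    title: str


@dataclass
class IngestFeed:
    fetch_entries: Callable[[str], Iterable[dict]]  # stands in for FeedSource
    save_item: Callable[[SourceItem], None]         # stands in for VideoRepository
    process_source: Optional[Callable[[SourceItem], object]] = None

    def execute(self, feed_url: str) -> List[SourceItem]:
        items: List[SourceItem] = []
        for entry in self.fetch_entries(feed_url):
            # Normalize raw feed entries into a domain type.
            item = SourceItem(entry["link"].strip(), entry.get("title", "").strip())
            self.save_item(item)
            items.append(item)
            # Processing is delegated; IngestFeed knows nothing about its internals.
            if self.process_source is not None:
                self.process_source(item)
        return items
```

Leaving `process_source` unset gives an ingest-only run, which matches the "optionally delegates" wording above.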
### Processing

`ProcessSource` asks `MediaPreparationService` for `PreparedMedia`, passes the
result to `Summarizer`, then persists a typed `ProcessOutcome`.

This use case owns the decision about whether a result is successful, degraded,
or failed.

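The sketch below shows one possible shape for this use case, with the classification decision made inside it rather than in an adapter. The `used_fallback` flag and the rule "empty summary means failure" are invented for the example; the real degradation policy is not specified in this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    SUCCESS = "success"
    PARTIAL_SUCCESS = "partial_success"
    FAILURE = "failure"


@dataclass(frozen=True)
class PreparedMedia:
    transcript: str
    used_fallback: bool  # assumed flag, e.g. a lower-quality transcript source


@dataclass(frozen=True)
class ProcessOutcome:
    source_url: str
    status: Status
    summary: Optional[str] = None


@dataclass
class ProcessSource:
    prepare: Callable[[str], PreparedMedia]    # stands in for MediaPreparationService
    summarize: Callable[[str], str]            # stands in for Summarizer
    persist: Callable[[ProcessOutcome], None]  # stands in for a repository

    def execute(self, source_url: str) -> ProcessOutcome:
        media = self.prepare(source_url)
        summary = self.summarize(media.transcript).strip()
        # The use case, not an adapter, classifies the result.
        if not summary:
            outcome = ProcessOutcome(source_url, Status.FAILURE)
        elif media.used_fallback:
            outcome = ProcessOutcome(source_url, Status.PARTIAL_SUCCESS, summary)
        else:
            outcome = ProcessOutcome(source_url, Status.SUCCESS, summary)
        self.persist(outcome)
        return outcome
```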
### Publication

`RenderRssFeed` reads published summaries through repositories and passes stable
publication models to a renderer. RSS generation should not depend on storage
row types.

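As an illustration, the sketch below separates a publication model from a small RSS 2.0 renderer built on the standard library's `xml.etree.ElementTree`. `PublishedSummary`, `render_rss`, and their fields are hypothetical; the project's renderer may use a different library and carry more metadata.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class PublishedSummary:
    """Publication model; fields are assumptions, not storage columns."""
    title: str
    link: str
    summary: str


def render_rss(channel_title: str, channel_link: str,
               items: Iterable[PublishedSummary]) -> str:
    """Minimal RSS 2.0 renderer; stands in for `PublicationRenderer`."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_title
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item.title
        ET.SubElement(node, "link").text = item.link
        ET.SubElement(node, "description").text = item.summary
    return ET.tostring(rss, encoding="unicode")


@dataclass
class RenderRssFeed:
    list_published: Callable[[], List[PublishedSummary]]  # repository read
    render: Callable[[str, str, Iterable[PublishedSummary]], str]

    def execute(self, channel_title: str, channel_link: str) -> str:
        # The renderer only ever sees publication models, never storage rows.
        return self.render(channel_title, channel_link, self.list_published())
```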
### Setup and Runtime

`BootstrapRuntime` and `GetRuntimeStatus` should return one application-level
view model shared by API and GUI, replacing duplicated config/setup shaping.

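One way to picture the shared view model is a single frozen dataclass that both surfaces consume without reshaping. Every field and function name below is an assumption chosen for the example.

```python
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class RuntimeStatusView:
    """Single application-level view model; every field here is assumed."""
    ollama_reachable: bool
    database_ready: bool
    missing_tools: Tuple[str, ...] = ()

    @property
    def ready(self) -> bool:
        return self.ollama_reachable and self.database_ready and not self.missing_tools


def to_api_payload(view: RuntimeStatusView) -> dict:
    """API adapter: serialize the shared view model without reshaping it."""
    return {**asdict(view), "ready": view.ready}


def to_gui_banner(view: RuntimeStatusView) -> str:
    """GUI adapter: derive display text from the same view model."""
    if view.ready:
        return "Runtime ready"
    detail = ", ".join(view.missing_tools) or "service unavailable"
    return "Setup incomplete: " + detail
```

With this shape, the readiness rule is written once in `ready`, and the API and GUI cannot drift apart on what "ready" means.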
## Error Handling Model

Replace implicit fallback-heavy success semantics with explicit outcome types:

- `Success`
- `PartialSuccess`
- `Failure`

Rules:

- Adapter-specific exceptions are translated at the use-case boundary.
- `PartialSuccess` is used when the system produced a degraded but valid result.
- `Failure` means the business goal was not achieved and must not be presented
  as a normal success.
- Persistence records outcome status explicitly rather than relying on summary
  text to reveal degradation.
- API and CLI surfaces report status directly.

Diagnostics remain separate from processing outcomes.

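The rules above might translate into code along these lines. The three outcome classes follow the names in this section, while their fields, the `SummarizerUnavailable` exception, and the `to_record` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Success:
    summary: str


@dataclass(frozen=True)
class PartialSuccess:
    summary: str
    reason: str


@dataclass(frozen=True)
class Failure:
    reason: str


Outcome = Union[Success, PartialSuccess, Failure]


class SummarizerUnavailable(Exception):
    """Hypothetical adapter-specific exception."""


def summarize_with_outcome(summarize: Callable[[str], str], transcript: str) -> Outcome:
    # Adapter exceptions are translated here, at the use-case boundary.
    try:
        return Success(summarize(transcript))
    except SummarizerUnavailable as exc:
        return Failure(f"summarizer unavailable: {exc}")


def to_record(outcome: Outcome) -> dict:
    """Persist the status as its own field instead of encoding it in text."""
    if isinstance(outcome, Success):
        return {"status": "success", "summary": outcome.summary, "reason": None}
    if isinstance(outcome, PartialSuccess):
        return {"status": "partial_success", "summary": outcome.summary,
                "reason": outcome.reason}
    return {"status": "failure", "summary": None, "reason": outcome.reason}
```

Because `Failure` carries no summary field at all, a failed run cannot be stored or rendered as if it had produced one.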
## Testing Strategy

The main regression net should move to application-boundary contract tests for:

- `BootstrapRuntime`
- `GetRuntimeStatus`
- `IngestFeed`
- `ProcessSource`
- `RenderRssFeed`

Supporting tests should be split into:

- repository integration tests
- Ollama adapter tests
- media adapter tests
- FastAPI adapter tests
- CLI adapter tests
- policy tests for model selection, degradation classification, normalization,
  and retention/publication rules

The desired end state is that most business behavior can be tested without
FastAPI, Click, SQLite, subprocesses, or a live Ollama runtime.

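An application-boundary contract test in this style might look like the sketch below, which drives a tiny stand-in `GetRuntimeStatus` through an in-memory fake instead of monkeypatching modules. The `has_tool` port method, the required-tools list, and the returned dictionary shape are assumptions for the example.

```python
import unittest
from typing import Dict, List


class FakeRuntimeInspector:
    """In-memory double for a `RuntimeInspector` port (assumed method name)."""

    def __init__(self, available: List[str]) -> None:
        self._available = set(available)

    def has_tool(self, name: str) -> bool:
        return name in self._available


class GetRuntimeStatus:
    """Minimal stand-in use case; the required-tools list is an assumption."""

    REQUIRED = ("yt-dlp", "ffmpeg")

    def __init__(self, inspector) -> None:
        self._inspector = inspector

    def execute(self) -> Dict[str, object]:
        missing = [t for t in self.REQUIRED if not self._inspector.has_tool(t)]
        return {"ready": not missing, "missing_tools": missing}


class GetRuntimeStatusContract(unittest.TestCase):
    # No FastAPI, Click, SQLite, subprocess, or Ollama is involved here.

    def test_ready_when_all_tools_present(self):
        inspector = FakeRuntimeInspector(["yt-dlp", "ffmpeg"])
        status = GetRuntimeStatus(inspector).execute()
        self.assertEqual(status, {"ready": True, "missing_tools": []})

    def test_reports_missing_tools(self):
        inspector = FakeRuntimeInspector(["ffmpeg"])
        status = GetRuntimeStatus(inspector).execute()
        self.assertEqual(status, {"ready": False, "missing_tools": ["yt-dlp"]})
```

The fake is passed through the constructor, so the test depends only on the port's shape and keeps working when the concrete inspector changes.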
## Non-Goals

- Defining the full migration sequence in this document
- Implementing the redesign directly from adapters inward
- Preserving the current `Pipeline` shape as a compatibility constraint

## Expected Benefits

- clearer ownership of business workflows
- lower coupling between persistence, summarization, and presentation
- consistent behavior across API and CLI
- easier unit and contract testing
- safer future changes to model policy, media tooling, and publishing

## Next Step

Create a dedicated implementation plan that stages the migration from the
current codebase into this target architecture.
