diff --git a/ONBOARDING.md b/ONBOARDING.md index 2f46b86..57f3478 100644 --- a/ONBOARDING.md +++ b/ONBOARDING.md @@ -127,13 +127,13 @@ ______________________________________________________________________ These are the load-bearing oddities — read these before changing code in their area. - **Riverpod v3 internal import.** A few places (`languages.dart`, `updates.dart`) import `package:riverpod/src/framework.dart` to access `$RefArg` for `family` providers when overriding in tests. This is intentional, suppressed via `// ignore` comments. -- **`MustOverrideProvider`** in `globals.dart` is a custom helper that throws unless overridden in a `ProviderScope`. It is used for `sharedPrefsProvider` and `packageInfoProvider` so that test environments **must** provide explicit fakes. +- **`MustOverrideProvider`** in `globals.dart` is a custom helper that throws unless overridden in a `ProviderScope`. It is used for `sharedPrefsProvider`, `packageInfoProvider`, and `languageDownloaderProvider` so that test environments **must** provide explicit fakes. +- **`LanguageDownloader`** (`lib/data/language_downloader.dart`) is the in-house module that downloads + unzips a language's HTML + PDF into a staging directory and swaps it into place atomically. Failed downloads never destroy prior offline content, concurrent `download()` calls are serialized (peak memory bounded by two zips), and a `.staging` leftover from a crashed run is cleaned up on the next attempt. It replaced the third-party `download_assets` package and is wired up in `main.dart` and the background isolate via `languageDownloaderProvider.overrideWithValue(...)`. - **`flutter_html` error filter.** `main.dart` installs a `FlutterError.onError` shim that swallows four specific assertions thrown by `flutter_html_table` 3.0.0. Read the long docstring there before touching it — the four assertion variants are documented in detail. - **HTML pre-processing.** `widgets/html_view.dart` calls `sanitize()` to fix bugs in `flutter_html` (e.g. percent table widths, fuzzy translations, stylized subtitles). Some workarounds are also in the HTML generator (`pywikitools`) upstream. - **Background task is half-disabled.** Big chunks of `BackgroundScheduler.schedule()` and `Workmanager().initialize` in `main.dart` are commented out, gated on a "version 0.9" milestone. Don't delete them — they are the working scaffolding for the next release. - **No package on pub.dev.** `pubspec.yaml` has `publish_to: 'none'`. - **License is AGPL** with an Apple App Store exception. See `LICENSE` and `COPYING.iOS`. -- **`download_assets` is pinned to 4.0.0** because 4.1.0 breaks GitHub archive URLs (Content-Length check). - **`test` is pinned to ^1.29.0** because 1.30+ needs a newer `test_api` than `flutter_test` allows. ______________________________________________________________________ diff --git a/docs/architecture.md b/docs/architecture.md index 7c76785..124ad2c 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -51,10 +51,11 @@ There is no app-owned backend; all content comes from public GitHub repos and th ▼ ▼ ▼ ┌───────────────┐ ┌──────────────┐ ┌──────────────────────┐ │ File system │ │ SharedPrefs │ │ Network │ -│ (download_ │ │ user prefs + │ │ - download_assets │ -│ assets) │ │ last-checked │ │ (zip download) │ -│ HTML + PDF │ │ timestamps │ │ - http (GitHub API) │ -│ per language │ │ │ │ commits since… │ +│ LanguageDown- │ │ user prefs + │ │ - dio (zip download │ +│ loader: stag- │ │ last-checked │ │ via Language- │ +│ ing + atomic │ │ timestamps │ │ Downloader) │ +│ swap of HTML │ │ │ │ - http (GitHub API, │ +│ + PDF per lang│ │ │ │ commits since…) │ └───────────────┘ └──────────────┘ └──────────────────────┘ ``` @@ -116,10 +117,19 @@ ViewPage(page, langCode) ``` DownloadLanguageButton(langCode).onPressed - → LanguageController(langCode).download(force?) - • _initController() → DownloadAssetsController.init(assetDir: 'assets-') - • startDownload([htmlZipUrl]) ← github.com/4training/html-/archive/main.zip - • startDownload([pdfZipUrl]) ← github.com/4training/pdf-/archive/main.zip + → LanguageController(langCode).download() + • ref.read(languageDownloaderProvider).download(langCode) + – serialize against any in-flight download (max one at a time) + – rm -rf assets-.staging ← crash recovery + – Future.wait([ + dio.get(htmlZipUrl, responseType: bytes), + dio.get(pdfZipUrl, responseType: bytes), + ]) ← github.com/4training/{html,pdf}-/archive/main.zip + – ZipDecoder.decodeBytes(...) → write each entry into .staging via FileSystem + – on any failure: rm -rf .staging and rethrow (prior data untouched) + – rename assets- → assets-.old (if it existed) + – rename .staging → assets- (atomic swap) + – best-effort rm -rf .old • _load(): – read structure/contents.json – build pages: Map, pageIndex: List, @@ -130,7 +140,7 @@ DownloadLanguageButton(langCode).onPressed → snackbar, button stops spinning ``` -The two zip downloads are issued sequentially because `download_assets` errors when both URLs share a filename (`main.zip`). See `LanguageController._download`. +The HTML and PDF zips are fetched **concurrently**; the in-house `LanguageDownloader` owns staging directory naming, so there is no longer a same-filename collision to work around. See `lib/data/language_downloader.dart` and `LanguageController._download`. ## Data flow: checking for updates diff --git a/docs/background-tasks.md b/docs/background-tasks.md index 8b5a9a5..1a2a826 100644 --- a/docs/background-tasks.md +++ b/docs/background-tasks.md @@ -24,9 +24,10 @@ void backgroundTask() { ### `backgroundMain()` 1. Gets a fresh `SharedPreferences` instance (the background isolate has its own memory). -2. Builds a new `ProviderContainer` with `sharedPrefsProvider` overridden. -3. Calls `backgroundCheck(ref)`. -4. Currently, the task only checks for updates, never auto-downloads. +2. Resolves `getApplicationDocumentsDirectory()` and builds a real `LanguageDownloaderImpl` with a fresh `Dio` and the local `FileSystem`. +3. Builds a new `ProviderContainer` with `sharedPrefsProvider` **and** `languageDownloaderProvider` overridden — the background isolate has the same atomicity and concurrency guarantees as the foreground path. +4. Calls `backgroundCheck(ref)`. +5. Currently, the task only checks for updates, never auto-downloads. ### `backgroundCheck(ProviderContainer ref)` For each language code in `availableLanguagesProvider`: @@ -97,8 +98,8 @@ Two tests, both running on a real Android emulator (CI uses `reactivecircus/andr 2. **"Test synchronization with main isolate"** — full happy-path: mount `App4Training` with `appLanguage='de'`, open settings to warm `languageStatusProvider`, fire the background task, then open a worksheet and verify the `foundBgActivity` snackbar appears. -The fixtures use `MemoryFileSystem` and `MockDownloadAssetsController` from `lib/background/background_test.dart`. +The fixtures use `MemoryFileSystem` and `FakeLanguageDownloader` from `lib/background/background_test.dart`. ## Why the test fixtures live in `lib/` -`background_test.dart` lives in `lib/background/` rather than `test/` because the integration test imports it through the production `background_task.dart` path (`backgroundTestMain` calls `createTestFileSystem()` and `createMockDownloadAssetsController()` from inside the isolate). Test code under `test/` can't be imported by code under `lib/`, so the helpers have to be co-located with production. There's a comment to that effect at the top of the file. +`background_test.dart` lives in `lib/background/` rather than `test/` because the integration test imports it through the production `background_task.dart` path (`backgroundTestMain` calls `createTestFileSystem()` and constructs a `FakeLanguageDownloader` from inside the isolate). Test code under `test/` can't be imported by code under `lib/`, so the helpers have to be co-located with production. There's a comment to that effect at the top of the file. diff --git a/docs/conventions.md b/docs/conventions.md index d50f10e..3e4b737 100644 --- a/docs/conventions.md +++ b/docs/conventions.md @@ -55,7 +55,7 @@ The unwritten-but-consistent patterns you'll see across the codebase. Follow the The repo's existing code is sparing with comments — but where comments exist, they're substantive (e.g. `_installHtmlTableSemanticsFilter` and `sanitize()` have multi-paragraph docstrings explaining the "why"). Match that bar: - Don't write comments that restate the code. -- **Do** write comments when the *why* is non-obvious — e.g. why two `download_assets` calls instead of one, why an internal Riverpod import is needed, why a button is intentionally clickable while looking disabled. +- **Do** write comments when the *why* is non-obvious — e.g. why an internal Riverpod import is needed, why a button is intentionally clickable while looking disabled. - `// FIXME` is used to mark workarounds that should ideally move upstream (`pywikitools`, `flutter_html`). - `// TODO for version 0.9` marks the v0.9 milestone code. @@ -63,7 +63,8 @@ The repo's existing code is sparing with comments — but where comments exist, - **Don't** add a service/repository layer. Riverpod notifiers ARE the repository. - **Don't** reintroduce `custom_lint` (removed for analyzer-version conflict). -- **Don't** unpin `download_assets` or `test` without first reading the comments in `pubspec.yaml`. +- **Don't** unpin `test` without first reading the comments in `pubspec.yaml`. +- **Don't** reintroduce `download_assets` (replaced by the in-house `LanguageDownloader` in `lib/data/language_downloader.dart`, which gives atomic staging-and-swap downloads, a concurrency cap, and no same-filename workaround). - **Don't** delete commented-out v0.9 code without project agreement — it's the working scaffold for the next release. - **Don't** introduce a navigation library (`go_router`, etc.) — Navigator 1.0 is intentional and it interplays cleanly with the URL-mirroring `/view//` scheme. - **Don't** delete the `flutter_html_table` error filter — the four assertion variants are real and well-documented. diff --git a/docs/data-layer.md b/docs/data-layer.md index 7eeb1a1..9fac1a9 100644 --- a/docs/data-layer.md +++ b/docs/data-layer.md @@ -4,7 +4,7 @@ Everything in `lib/data/` and the file-system / GitHub interactions it owns. ## On-device file system layout -When a language is downloaded, `download_assets` creates a per-language assets directory (whose path is platform-defined; the Riverpod `LanguageController` doesn't care about the absolute path). Inside it: +When a language is downloaded, `LanguageDownloaderImpl` creates a per-language assets directory at `/assets-` (the root is resolved once at startup via `getApplicationDocumentsDirectory()` and passed into the downloader; the `LanguageController` itself reads the absolute path from `languageDownloaderProvider.pathFor(langCode)`). Inside it: ``` assets-/ @@ -100,14 +100,12 @@ The branch name (`main`) is hardcoded; switching branches would require a code c ## Download flow -`LanguageController.download({force=false})`: +`LanguageController.download()`: -1. (force) `deleteResources()` → `_controller.clearAssets()` and reset state to empty. -2. `_download()`: - - `DownloadAssetsController.startDownload([htmlZipUrl])` — fetches and unzips into `assets-/`. - - Then `startDownload([pdfZipUrl])` — same. **Two separate calls** because `download_assets` errors when both URLs share the filename `main.zip`. - - On any throw, calls `_controller.clearAssets()` so the next attempt starts clean. -3. `_load()`: +1. `_download()`: + - `await ref.read(languageDownloaderProvider).download(languageCode)`. The downloader handles the full atomic flow internally (see below); on success the on-disk directory at `pathFor(langCode)` is the new content. On failure it throws and the prior on-disk directory (if any) is left untouched. + - The caller wraps in a `try/catch` to preserve the existing `Future` shape. +2. `_load()`: - Recompute `assetsDirAlreadyExists()`. - Sum file sizes recursively → `sizeInKB`. - Read `contents.json` mtime → `downloadTimestamp` (UTC). @@ -115,9 +113,33 @@ The branch name (`main`) is hardcoded; switching branches would require a code c - Scan `pdf--main/` for `.pdf` files; match each `worksheet.pdf` → `Page.pdfPath`. - Scan `html--main/files/` for images; build `Map`. - `_checkConsistency()` warns about HTML files referenced but missing, or HTML files present but unreferenced. - - On any throw: log, call `_controller.clearAssets()`, reset to empty `Language`, return `false`. + - On any throw: log, call `deleteResources()` (which delegates to `languageDownloader.delete(languageCode)`), reset to empty `Language`, return `false`. -`init()` calls `_load()` only — no network. `lazyInit()` only checks for `contents.json` existence and returns a sparse `Language(languageCode, {}, [], {}, path, 0, timestamp)` without parsing — used by the background isolate which doesn't need page details. +`init()` calls `_load()` only — no network. `lazyInit()` only checks for `contents.json` existence and returns a sparse `Language(languageCode, {}, [], {}, path, 0, timestamp)` without parsing — used by the background isolate which doesn't need page details. Both `lazyInit()` and `_load()` read the path from `ref.read(languageDownloaderProvider).pathFor(languageCode)`. + +### Inside `LanguageDownloaderImpl` (`lib/data/language_downloader.dart`) + +The downloader owns the atomicity, concurrency, and crash-recovery guarantees so callers don't need to reason about partial state. One `download(langCode)` call performs: + +1. **Serialize** against any in-flight download — at most one zip pair is held in memory at a time (concurrency cap; protects low-end devices from rapid taps on the per-language download buttons). +2. **Crash recovery** — `rm -rf .staging` so a leftover from a prior crashed run never accumulates. +3. **Fetch concurrently** — `Future.wait` over two `dio.get(..., responseType: bytes)` calls for the HTML and PDF zips. +4. **Extract into staging** — `ZipDecoder().decodeBytes(...)` over each response, writing every `ArchiveFile` via the injected `FileSystem`. Not `extractArchiveToDisk` (it is tied to `dart:io` and not testable against `MemoryFileSystem`). +5. **Atomic swap** — rename existing `assets-` → `assets-.old` (if any), rename `.staging` → `assets-` (rename is atomic on a single filesystem), then best-effort `rm -rf .old`. +6. **On any throw mid-flight** — `rm -rf .staging` and `rethrow`. The prior on-disk directory (if any) is never touched until step 5, so a failed update never destroys offline content. + +The interface is intentionally small: + +```dart +abstract interface class LanguageDownloader { + String pathFor(String langCode); // synchronous; /assets- + Future isDownloaded(String langCode); // cheap directory-exists check + Future download(String langCode); // throws on failure; cleans up partial state + Future delete(String langCode); // idempotent +} +``` + +`dio` and `archive` exceptions bubble; the downloader does not introduce a custom exception type. There are no progress callbacks — the codebase has no callers for them. ## `pageContentProvider` diff --git a/docs/dependencies.md b/docs/dependencies.md index 00a8a6e..96addf4 100644 --- a/docs/dependencies.md +++ b/docs/dependencies.md @@ -18,14 +18,14 @@ Snapshot of the dependency tree as of `pubspec.yaml`. Group-by-purpose, with not | `riverpod: ^3.2.1` | Core Riverpod. **Note**: a few places import `package:riverpod/src/framework.dart` for `$RefArg` to access the family argument inside a `Notifier.build()` — this is a known v3 internal escape hatch | ### Content download / file system -| Package | Why | Notes | -| --- | --- | --- | -| `download_assets: 4.0.0` | Downloads + unzips the GitHub `archive/main.zip` per language | **Pinned to 4.0.0** — comment in `pubspec.yaml`: "the Content-Length check breaks GitHub archive URLs in 4.1.0" | -| `path_provider: ^2.0.11` | Used in `background_task.dart` (`getApplicationDocumentsDirectory()`) for the debug log file | +| Package | Why | +| --- | --- | +| `dio: ^5.0.0` | HTTP client used by `LanguageDownloaderImpl` to fetch each language's HTML + PDF zip from GitHub. Mocked in `language_downloader_test.dart` | +| `archive: ^4.0.0` | `ZipDecoder` used by `LanguageDownloaderImpl` to unpack the zip bytes into the staging directory via the injected `FileSystem` | +| `path_provider: ^2.0.11` | `getApplicationDocumentsDirectory()` resolved at startup (`main.dart`) and passed as the `root` for the real `LanguageDownloader`; also used by `background_task.dart` for the debug log file | | `path: ^1.8.2` | `join()` for cross-platform paths | -| `file: ^7.0.0` | Abstract `FileSystem` API. Backs `fileSystemProvider`, lets tests inject `MemoryFileSystem` | +| `file: ^7.0.0` | Abstract `FileSystem` API. Backs `fileSystemProvider`, lets tests (including `LanguageDownloader` tests) inject `MemoryFileSystem` | | `http: ^1.0.0` | The GitHub Commits API call (`Globals.getCommitsSince`). Backs `httpClientProvider` | -| `dio: ^5.0.0` | Transitive dependency surfaced in `pubspec.yaml`; used inside `download_assets` and referenced from tests via the `Dio` type | ### Persistence | Package | Why | @@ -60,7 +60,7 @@ These are listed under `dependencies:` rather than `dev_dependencies:` because s | Package | Why | | --- | --- | | `test: ^1.29.0` | **Pinned**: 1.30+ needs `test_api > 0.7.9` but `flutter_test` pins `test_api 0.7.9` | -| `mocktail: ^1.0.3` | Used by `MockDownloadAssetsController` etc. | +| `mocktail: ^1.0.3` | Used for `Dio` mocks in `language_downloader_test.dart` and ad-hoc mocks elsewhere | ## Dev dependencies diff --git a/docs/testing.md b/docs/testing.md index 6cce865..195e800 100644 --- a/docs/testing.md +++ b/docs/testing.md @@ -24,7 +24,8 @@ test/ ├── html_view_test.dart ├── language_selection_button_test.dart ├── languages_table_test.dart -├── languages_test.dart ← shared helpers (TestLanguageController, FakeDownloadAssetsController, ...) +├── language_downloader_test.dart ← LanguageDownloaderImpl contract: staging/swap, concurrency, crash recovery +├── languages_test.dart ← shared helpers (TestLanguageController, MemoryFileSystem fixtures, ...) ├── main_drawer_test.dart ├── routes_test.dart ├── set_update_prefs_page_test.dart @@ -68,13 +69,22 @@ generates `test/full_coverage_test.dart` that imports every file under `lib/`. T ## Shared test helpers +### `lib/background/background_test.dart` +- **`FakeLanguageDownloader`** — the single fake for the new `LanguageDownloader` interface. Implements `pathFor`, `isDownloaded`, `download`, `delete` against a configurable backing store, and exposes counters (e.g. `downloadCalls`) so tests can assert on call shape. Lives under `lib/` rather than `test/` because the integration test imports it via the production `background_task.dart` path — see [background-tasks.md](background-tasks.md). All download-related tests (`languages_test.dart`, `update_language_button_test.dart`, `download_language_button_test.dart`, `background_task_test.dart`, and the integration test) inject this fake via `languageDownloaderProvider.overrideWithValue(...)`. + ### `test/languages_test.dart` -- **`MockDownloadAssetsController`** (mocktail) — straight `Mock` impl. -- **`FakeDownloadAssetsController`** — minimal hand-written fake exposing `initCalled`, `clearAssetsCalled`, `startDownloadCalls` so tests can assert on call counts. -- **`ThrowingDownloadAssetsController`** — fake that throws from `startDownload` to simulate network failures. - **`TestLanguageController`** — overrides `LanguageController` to short-circuit `init`/`download` to a configured boolean. Used pervasively by widget tests that don't care about disk state. - **Helper functions** to build `MemoryFileSystem` instances pre-populated with the German/English fixtures from `test/assets-de/`, `test/assets-en/`. +### `test/language_downloader_test.dart` +Exercises `LanguageDownloaderImpl` directly against `MemoryFileSystem` + a mocked `Dio`. Covers the contract: +- **Happy path** — both zips download and extract into `pathFor(langCode)`. +- **Network failure** — one `dio.get` throws; nothing left at `pathFor(langCode)`, no `.staging` leftover. +- **Corrupted-zip failure** — bytes fail to decode; same cleanup invariant. (Note: `archive`'s `ZipDecoder` silently returns an empty archive for pure garbage — to force a real decode error the test feeds bytes with a `PK` header followed by garbage.) +- **Atomic update preserves prior data** — `pathFor(langCode)` is seeded with files; a failing download leaves the seed intact and readable. +- **Concurrent calls serialized** — two `download()` calls fired without awaiting; the second only starts after the first completes (observable via dio mock ordering). +- **Crash recovery** — a pre-seeded `.staging` directory is wiped by the next successful `download()`. + ### `test/updates_test.dart` - **`TestLanguageStatus`** — overrides `LanguageStatusNotifier` for tests that don't want to mock HTTP. - **`mockCheckResponse({'de': 2, 'en': 0, ...})`** — builds an `http.Client` (via `mocktail`) that returns the right number of fake commits per language code.