@@ -33,6 +33,7 @@ High-level project roadmap. For detailed phase documentation, see [docs/roadmap.
 | 15.13 | Provider comparison table + LinkedIn | ✅ |
 | 15.14 | Code quality refactoring | 📋 |
 | 15.16 | Ancestry photo upload | ✅ |
+| 15.17 | Data integrity + bulk discovery | ✅ |
 | 16 | Multi-platform sync architecture | 📋 |
 | 17 | Real-time event system (Socket.IO) | 📋 |
 
@@ -255,6 +256,52 @@ See [docs/architecture.md](./docs/architecture.md) for full details.
 
 ## Next Steps
 
+### Phase 15.17: Data Integrity Page + Bulk Discovery
+
+Database maintenance dashboard with automated parent ID discovery:
+
+- **Integrity Service**: `integrity.service.ts` - SQL-based checks for data quality
+  - `getIntegritySummary()` - Counts for all check types
+  - `getProviderCoverageGaps()` - Persons with some but not all provider links
+  - `getParentLinkageGaps()` - Parent edges where parent lacks provider link the child has
+  - `getOrphanedEdges()` - Parent edges referencing non-existent person records
+  - `getStaleProviderData()` - Provider cache files older than N days
+- **Bulk Discovery Service**: `bulk-discovery.service.ts` - Database-wide parent ID discovery
+  - Async generator yielding `BulkDiscoveryProgress` events for SSE streaming
+  - Deduplicates by child_id (one scrape discovers both parents)
+  - Reuses existing `parentDiscoveryService.discoverParentIds()` per child
+  - Rate limited via `PROVIDER_DEFAULTS[provider].rateLimitDefaults`
+  - In-memory cancellation via `Set<operationId>` checked between iterations
+- **API Endpoints** (`/api/integrity/:dbId`):
+  - `GET /` - Full integrity summary
+  - `GET /coverage` - Provider coverage gaps (?providers=fs,ancestry)
+  - `GET /parents` - Parent linkage gaps (?provider=familysearch)
+  - `GET /orphans` - Orphaned edges
+  - `GET /stale` - Stale records (?days=30)
+  - `POST /discover-all` - Start bulk discovery
+  - `GET /discover-all/events` - SSE stream for progress
+  - `POST /discover-all/cancel` - Cancel running operation
+- **UI**: `IntegrityPage.tsx` at `/db/:dbId/integrity`
+  - Summary cards (4 check types with counts, clickable)
+  - Tabbed interface: Parents | Coverage | Orphans | Stale
+  - Parents tab: provider selector, "Discover All" button with SSE progress bar + cancel
+  - Coverage tab: table of persons with linked/missing provider badges
+  - Orphans tab: table of broken parent edges
+  - Stale tab: configurable days threshold, table with age coloring
+  - Sidebar nav item with ShieldCheck icon
+- **Shared Types**: `IntegritySummary`, `ProviderCoverageGap`, `ParentLinkageGap`, `OrphanedEdge`, `StaleRecord`, `BulkDiscoveryProgress`
+- **Files Created**:
+  - `server/src/services/integrity.service.ts`
+  - `server/src/services/bulk-discovery.service.ts`
+  - `server/src/routes/integrity.routes.ts`
+  - `client/src/components/integrity/IntegrityPage.tsx`
+- **Files Modified**:
+  - `shared/src/index.ts` - New types
+  - `server/src/index.ts` - Route mount
+  - `client/src/services/api.ts` - API methods + type re-exports
+  - `client/src/App.tsx` - Route
+  - `client/src/components/layout/Sidebar.tsx` - Nav item
+
 ### Phase 15.14: Code Quality Refactoring (Pre-Phase 16 Cleanup)
 
 Code audit identified DRY/YAGNI/performance issues to address before Phase 16:
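The Bulk Discovery Service bullets in the Phase 15.17 hunk describe an async generator with child_id deduplication, rate limiting, and `Set`-based cancellation. The following is a minimal sketch of how those pieces might fit together, not the contents of `bulk-discovery.service.ts`. The event fields, `ParentEdge`, `requestCancel`, `uniqueChildIds`, the `discoverForChild` callback (standing in for `parentDiscoveryService.discoverParentIds()`), and the `delayMs` parameter (standing in for a value taken from `PROVIDER_DEFAULTS[provider].rateLimitDefaults`) are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the roadmap names BulkDiscoveryProgress but does not list its fields.
interface ParentEdge {
  child_id: string;
  parent_id: string;
}

type BulkDiscoveryProgress =
  | { type: "progress"; operationId: string; childId: string; current: number; total: number }
  | { type: "cancelled"; operationId: string; current: number; total: number }
  | { type: "done"; operationId: string; total: number };

// In-memory cancellation registry: IDs of operations that have been asked to stop.
const cancelledOperations = new Set<string>();

function requestCancel(operationId: string): void {
  cancelledOperations.add(operationId);
}

// One scrape of a child discovers both parents, so each child_id is visited once.
function uniqueChildIds(edges: ParentEdge[]): string[] {
  return Array.from(new Set(edges.map((edge) => edge.child_id)));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// discoverForChild and delayMs are illustrative stand-ins (see the paragraph above).
async function* discoverAll(
  operationId: string,
  edges: ParentEdge[],
  discoverForChild: (childId: string) => Promise<void>,
  delayMs: number,
): AsyncGenerator<BulkDiscoveryProgress> {
  const childIds = uniqueChildIds(edges);
  const total = childIds.length;
  try {
    for (let i = 0; i < total; i++) {
      // Cancellation is checked between iterations, never in the middle of a scrape.
      if (cancelledOperations.has(operationId)) {
        yield { type: "cancelled", operationId, current: i, total };
        return;
      }
      await discoverForChild(childIds[i]);
      yield { type: "progress", operationId, childId: childIds[i], current: i + 1, total };
      // Crude fixed-delay rate limit between consecutive scrapes.
      if (i < total - 1) await sleep(delayMs);
    }
    yield { type: "done", operationId, total };
  } finally {
    // Clear any cancel flag once the operation ends.
    cancelledOperations.delete(operationId);
  }
}
```

Under these assumptions, a route handler for `GET /discover-all/events` could consume the generator with `for await` and write each event to a `text/event-stream` response as `data: ${JSON.stringify(event)}\n\n`, while `POST /discover-all/cancel` would only need to call `requestCancel(operationId)`.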