v0.3.21

@github-actions github-actions released this 23 Jan 06:04
· 324 commits to main since this release
0604ab0

Release v0.3.21 - SQLite Storage Layer & Migration Framework

Released: YYYY-MM-DD

Overview

Major architectural upgrade introducing SQLite as a high-performance index layer while maintaining JSON files as the source of truth. This release adds canonical ULID-based identities, full-text search via FTS5, recursive CTEs for path finding, and a comprehensive data migration framework.

🎉 New Features

SQLite Storage Layer

  • Added SQLite database (data/sparsetree.db) as fast query index
  • FTS5 full-text search for person names, bios, and occupations
  • Recursive CTEs for efficient ancestor/descendant path finding
  • WAL mode for better read concurrency
  • Enabled automatically when the database exists and contains data; falls back to JSON otherwise

Canonical Identity System

  • ULID-based canonical IDs (26-char, sortable, collision-resistant)
  • External identity mappings for multiple providers (FamilySearch, Ancestry, WikiTree, 23andMe)
  • Bidirectional ID lookup with in-memory LRU cache
  • Confidence scoring for identity assertions
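
As a rough illustration of why a ULID is 26 characters long and sorts by creation time, here is a minimal sketch that follows the public ULID layout (10 Crockford base32 characters of millisecond timestamp followed by 16 characters of randomness). The function name newCanonicalId is hypothetical and this is not necessarily how SparseTree generates its IDs.

```typescript
import { randomBytes } from "node:crypto";

// Crockford base32 alphabet (no I, L, O, U), as used by the ULID spec.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode a millisecond timestamp as 10 base32 characters, most significant first.
function encodeTime(ms: number): string {
  let remaining = ms;
  let out = "";
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD.charAt(remaining % 32) + out;
    remaining = Math.floor(remaining / 32);
  }
  return out;
}

// 16 random characters of 5 bits each = 80 bits of randomness.
function encodeRandom(): string {
  let out = "";
  for (const byte of randomBytes(16)) {
    out += CROCKFORD.charAt(byte % 32); // 256 is a multiple of 32, so no modulo bias
  }
  return out;
}

// Hypothetical helper: timestamp prefix + random suffix = 26 characters.
function newCanonicalId(now: number = Date.now()): string {
  return encodeTime(now) + encodeRandom();
}
```

Because the timestamp occupies the leading characters, IDs created in different milliseconds compare in creation order under plain string comparison. Ordering within the same millisecond is not guaranteed by this sketch.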

Content-Addressed Blob Storage

  • SHA-256 hash-based media storage (data/blobs/{hash[:2]}/{hash}.{ext})
  • Automatic deduplication of identical photos
  • Media records linked to persons with primary photo support
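
The path pattern above can be sketched as follows. This is a minimal illustration, assuming the hash is the lowercase hex SHA-256 digest of the file contents; the function name blobPath and its parameters are hypothetical, not taken from the SparseTree source.

```typescript
import { createHash } from "node:crypto";

// Derive the storage path for a media file from its contents.
// Identical bytes always hash to the same path, which is what makes
// deduplication automatic: a second copy of a photo maps onto the first.
function blobPath(content: Uint8Array, ext: string, root = "data/blobs"): string {
  const hash = createHash("sha256").update(content).digest("hex");
  // The first two hex characters become a subdirectory (up to 256 buckets).
  return `${root}/${hash.slice(0, 2)}/${hash}.${ext}`;
}

// Usage: two uploads with the same bytes resolve to one location.
const first = blobPath(new TextEncoder().encode("abc"), "jpg");
const second = blobPath(new TextEncoder().encode("abc"), "jpg");
console.log(first === second);
```

Sharding by the first two characters of the hash keeps any single directory from accumulating every blob.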

Data Migration Framework

  • Automatic schema and data migrations on update
  • Migration tracking in data/.data-version and SQLite migration table
  • Dry-run mode for previewing changes
  • Rollback support for reversible migrations
  • Commands: npm run migrate, npm run migrate:status, npm run migrate:dry-run
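
The relationship between migrating, dry-run mode, and rollback can be sketched as below. This is an in-memory illustration only: the Migration and Store shapes and the function names are assumptions, and the real framework tracks versions in data/.data-version and a SQLite table rather than a plain object.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
interface Store {
  version: number;
  tables: string[];
}

interface Migration {
  version: number;
  name: string;
  up(store: Store): void;
  down?(store: Store): void; // omitted for irreversible migrations
}

// Apply every migration newer than the store's version, in order.
// With dryRun set, report what would run without touching the store.
function migrate(store: Store, migrations: Migration[], dryRun = false): string[] {
  const pending = migrations
    .filter((m) => m.version > store.version)
    .sort((a, b) => a.version - b.version);
  const names: string[] = [];
  for (const m of pending) {
    if (!dryRun) {
      m.up(store);
      store.version = m.version;
    }
    names.push(m.name);
  }
  return names;
}

// Undo the most recently applied migration, if it is reversible.
function rollback(store: Store, migrations: Migration[]): string | undefined {
  const current = migrations.find((m) => m.version === store.version);
  if (!current) return undefined;
  if (!current.down) throw new Error(`Migration ${current.name} is not reversible`);
  current.down(store);
  const earlier = migrations.map((m) => m.version).filter((v) => v < current.version);
  store.version = earlier.length > 0 ? Math.max(...earlier) : 0;
  return current.name;
}
```

Recording the version only after up() succeeds means a failed migration is retried on the next run instead of being skipped.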

Update Script

  • New ./update.sh for one-command updates
  • Pulls latest code, installs deps, builds, migrates, restarts
  • Supports --dry-run, --no-restart, --branch=NAME options
  • Safe: checks for uncommitted changes before updating

Simplified Genealogy Provider UI (v0.3.10)

  • Consolidated genealogy provider management to a single /providers/genealogy page
  • Removed over-engineered /providers/scraper and /providers/genealogy/:id/edit routes
  • Simplified BrowserSettingsPage to focus on CDP connection settings only
  • Added "Login with Google" SSO option for FamilySearch
  • Browser connection controls now available directly on the providers page
  • Provider credentials and auto-login moved to the consolidated page

Fixed Database Refresh Hang (v0.3.11)

  • Fixed refresh button hanging the server on large trees (138k+ persons)
  • Changed recursive CTE query to use database_membership table when available
  • Added hard limits (50 generations, 500k persons) to prevent runaway queries
  • Converted refresh to SSE-based background task to avoid HTTP timeouts
  • Client now uses EventSource for non-blocking refresh with progress updates
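
To show how generation and person caps bound a tree traversal, here is an in-memory analogue of the limited query. It is a sketch, not the project's SQL: the actual fix applies the limits inside a recursive CTE, and the names collectAncestors and parentsOf are hypothetical.

```typescript
interface Limits {
  maxGenerations: number;
  maxPersons: number;
}

interface TraversalResult {
  ids: Set<string>;
  truncated: boolean; // true when a limit cut the walk short
}

// Breadth-first walk up the tree, one generation per loop iteration.
// Defaults mirror the limits quoted above (50 generations, 500k persons).
function collectAncestors(
  rootId: string,
  parentsOf: Map<string, string[]>,
  limits: Limits = { maxGenerations: 50, maxPersons: 500_000 },
): TraversalResult {
  const seen = new Set<string>([rootId]);
  let frontier = [rootId];
  for (let gen = 0; gen < limits.maxGenerations && frontier.length > 0; gen++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const parent of parentsOf.get(id) ?? []) {
        if (seen.has(parent)) continue; // also guards against cycles in bad data
        if (seen.size >= limits.maxPersons) return { ids: seen, truncated: true };
        seen.add(parent);
        next.push(parent);
      }
    }
    frontier = next;
  }
  // If the generation cap stopped the loop, unvisited parents may remain.
  const truncated = frontier.some((id) =>
    (parentsOf.get(id) ?? []).some((parent) => !seen.has(parent)),
  );
  return { ids: seen, truncated };
}
```

Reporting truncated separately from the result lets a caller distinguish a complete tree from one that hit a cap.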

Documentation Restructure (v0.3.12)

  • Created docs/ folder with modular documentation:
    • architecture.md - Data model, storage, identity system
    • api.md - API endpoint reference
    • cli.md - CLI command reference
    • development.md - Development setup guide
    • providers.md - Genealogy provider configuration
    • roadmap.md - Detailed phase documentation
  • Simplified PLAN.md to high-level summary with links
  • Streamlined CLAUDE.md with quick reference format
  • Simplified README.md with user-focused content
  • Added Phase 17 (Socket.IO real-time event system) to roadmap

Project Structure Consolidation (v0.3.13)

  • Moved root lib/ directory into server/src/lib/:
    • lib/config.ts - Application configuration with TypeScript types
    • lib/graph/ - Path finding algorithms (shortest, longest, random)
    • lib/familysearch/ - FamilySearch API client, fetcher, and transformer
    • lib/sqlite-writer.ts - SQLite dual-write logic during indexing
  • Moved root CLI scripts to scripts/:
    • index.ts - Main ancestry indexer
    • find.ts - Path finder
    • print.ts - Chronological print
    • purge.ts - Purge cached records
    • prune.ts - Prune orphan files
    • rebuild.ts - Rebuild databases
    • migrate-favorites.ts - Favorites migration
  • Added server/src/utils/sleep.ts and randInt.ts utilities
  • Updated indexer.service.ts to spawn npx tsx scripts/index.ts
  • Removed deprecated tsv.js (redundant with /api/export/:dbId/tsv)
  • All scripts now run via npx tsx for TypeScript support

Socket.IO Real-Time Events (v0.3.14)

  • Implemented Socket.IO for bidirectional real-time communication
  • Added server/src/services/socket.service.ts with room-based event broadcasting
  • Added client/src/services/socket.ts singleton socket client
  • Added client/src/hooks/useSocket.ts with React hooks for socket events:
    • useSocketConnection() - Manage socket connection lifecycle
    • useSocketEvent() - Subscribe to specific socket events
    • useDatabaseEvents() - Database refresh notifications
    • useBrowserEvents() - Browser status updates
    • useIndexerEvents() - Indexing progress
  • Converted database refresh from SSE to Socket.IO events
  • Browser status broadcasts via Socket.IO in addition to SSE (backwards compatible)
  • Added in-memory LRU cache for SQL queries (server/src/services/cache.service.ts):
    • Configurable TTL and max size
    • Separate caches for queries, persons, and lists
    • Cache invalidation per database or person
    • Cache statistics for monitoring
  • Optimized /api/favorites endpoint:
    • Replaced N+1 queries with single JOIN query
    • Removed loading of entire database contents
    • Added index on favorite.added_at for faster sorting
    • Added person_count column to database_info table
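
The cache behaviour listed above (TTL, maximum size, invalidation, statistics) can be sketched with a Map, which preserves insertion order. This is a minimal illustration and not the contents of cache.service.ts; the class name, the prefix-based invalidation, and the key format are assumptions.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

class LruCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private maxSize: number;
  private ttlMs: number;
  private now: () => number;
  hits = 0;
  misses = 0;

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      if (entry) this.entries.delete(key); // drop the expired entry
      this.misses++;
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  // Assumed convention: keys start with a scope such as "db:<id>:".
  invalidatePrefix(prefix: string): number {
    let removed = 0;
    for (const key of Array.from(this.entries.keys())) {
      if (key.startsWith(prefix)) {
        this.entries.delete(key);
        removed++;
      }
    }
    return removed;
  }
}
```

Scoping keys by database or person ID is one way to support targeted invalidation without clearing the whole cache.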

🔧 Improvements

Service Layer Updates

  • database.service.ts - Queries from SQLite with JSON fallback
  • search.service.ts - Uses FTS5 for text search, supports cross-database global search
  • path.service.ts - Recursive CTEs for shortest/longest/random paths
  • favorites.service.ts - SQLite storage with JSON backup
  • augmentation.service.ts - External identity registration

Schema Design

  • Normalized tables: person, external_identity, parent_edge, spouse_edge
  • Extensible claims system for facts with provenance
  • Vital events with date parsing (supports BC notation)
  • Database membership for multi-tree support
  • Full schema in server/src/db/schema.sql

Developer Experience

  • Updated CLAUDE.md with comprehensive migration documentation
  • Added npm scripts for migration management
  • Better error messages for ID resolution failures

📦 Installation

git clone https://github.com/atomantic/SparseTree.git
cd SparseTree
npm run install:all
npm run migrate
pm2 start ecosystem.config.cjs

🔄 Upgrading from v0.2.x

Run the update script to automatically migrate:

./update.sh

Or manually:

git pull origin main
npm install
npm run build
npm run migrate
pm2 restart ecosystem.config.cjs

🔗 Full Changelog

Full Diff: v0.2.11...v0.3.21