All notable changes to RecallForge will be documented in this file.
Nothing yet.
RecallForge 0.2.0 makes the project much more usable as a local multimodal memory MCP. This release improves cross-modal retrieval, adds a stronger MCP surface for agents, and lays the foundation for memory-level results instead of isolated file fragments.
Note: Python 3.12 and 3.13 are supported. Python 3.14 is not yet supported because of the pinned pyarrow wheel range for this release.
- First-class `memory_id` support in storage and search
- Parent/child memory linkage for derived assets such as frames, transcripts, and document sections
- Memory-centric MCP capabilities including memory lookup and listing
- `memory://` resources for stable memory access
- Memory-level result rollup so related assets can surface as a single memory hit
- Hybrid search support for image and video retrieval paths
- Parallel `search_batch` MCP tool for multi-query workflows
- `explain_results` MCP tool for search transparency and score inspection
- Ingest-time image/video captioning to improve BM25 and keyword recall
- Optional vision-language query expansion and stronger cross-modal benchmark coverage
- PDF vision fallback so scanned or image-heavy documents remain searchable
- HTTP transport for the MCP server with persistent model loading
- Configurable model selection via environment variables and MCP config
- Qwen3.5-0.8B captioning for lighter-weight media understanding
- Version-aware server reporting and improved health/config surfaces
- Collection management API for better multi-namespace organization
- Fast media retrieval defaults for better real-world latency
- Query-side media handling fixes so reranking is no longer blind to media queries
- Thread-safe model swaps plus explicit memory cleanup
- Video frame sampling and cleanup fixes to reduce OOMs and tensor retention
- Expanded UAT, observability, score audits, and release validation coverage
- Trusted publishing workflow for tag-based PyPI releases
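The memory-level rollup described above can be pictured as grouping per-asset hits (frames, transcript chunks, document sections) by their parent memory and keeping the best score per memory. This is an illustrative sketch only; the names `SearchHit` and `rollup_by_memory` are hypothetical, not RecallForge's actual API.

```python
from dataclasses import dataclass

@dataclass
class SearchHit:
    # Illustrative stand-in for a per-asset search result;
    # not RecallForge's real result type.
    asset_id: str
    memory_id: str
    score: float

def rollup_by_memory(hits):
    """Collapse per-asset hits into one hit per parent memory,
    keeping the highest-scoring asset as the representative."""
    best = {}
    for hit in hits:
        cur = best.get(hit.memory_id)
        if cur is None or hit.score > cur.score:
            best[hit.memory_id] = hit
    # Return memory-level results, highest score first.
    return sorted(best.values(), key=lambda h: h.score, reverse=True)
```

The effect is that three matching frames from one video surface as a single memory hit rather than three near-duplicate fragments.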
First public release. The local multimodal memory engine for AI agents.
Note: Python 3.12 and 3.13 are supported. Python 3.14 is not yet supported due to pyarrow wheel availability.
- Cross-modal search across text, images, documents (PDF/DOCX/PPTX), and video in one unified query
- Shared embedding space via Qwen3-VL (2048-dim vectors for all modalities)
- 3-stage retrieval pipeline: embedding → reranking → query expansion (all multimodal)
- Tiered search modes: embed (~1.7GB), hybrid (~3.4GB), full (~4.4GB) on MLX 4-bit
- MLX backend for Apple Silicon with 4-bit and bf16 quantization
- PyTorch backend for CUDA, MPS, and CPU
- Auto-detection of best available backend
- First-run model download UX with progress reporting
- 17 MCP tools: `search`, `search_fts`, `search_vec`, `ingest`, `index_document`, `index_image`, `index_folder`, `memory_add`, `memory_update`, `memory_delete`, `status`, `rebuild_fts`, `list_collections`, `list_namespaces`, `batch`, `get_config`, `set_config`
- Structured error responses with 4 standard codes
- Batch operation support (up to 20 ops per call)
- Runtime config inspection and adjustment
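The 20-op limit on the `batch` tool can be handled client-side by chunking larger workloads; a minimal sketch (the `chunk_ops` helper is hypothetical — only the 20-op cap comes from these notes):

```python
MAX_BATCH_OPS = 20  # per-call limit stated in the release notes

def chunk_ops(ops, limit=MAX_BATCH_OPS):
    """Split a long list of operations into batches that
    a capped batch tool will accept (at most `limit` ops per call)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [ops[i:i + limit] for i in range(0, len(ops), limit)]
```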
- `recallforge index` - index files and folders
- `recallforge search` - text, image, and video search
- `recallforge serve` - MCP server mode
- `recallforge status` - health check
- `recallforge watch` - folder monitoring with start/stop/list/status
- LanceDB + Tantivy FTS hybrid backend
- Schema migration for namespace columns on existing stores
- Content-addressable deduplication
- Collection and namespace support
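Content-addressable deduplication means identical bytes always map to the same storage key, so re-ingesting a file is a no-op. A minimal sketch of the idea (the hash choice and the `DedupStore` class are illustrative assumptions, not RecallForge's implementation):

```python
import hashlib

def content_key(data: bytes) -> str:
    """Content-addressable key: identical bytes hash to the same
    digest, regardless of filename or path. SHA-256 is assumed here."""
    return hashlib.sha256(data).hexdigest()

class DedupStore:
    """Toy store keyed by content hash (illustrative only)."""
    def __init__(self):
        self._blobs = {}

    def ingest(self, data: bytes) -> tuple[str, bool]:
        """Store `data` once; return (key, whether it was new)."""
        key = content_key(data)
        is_new = key not in self._blobs
        if is_new:
            self._blobs[key] = data
        return key, is_new
```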
- Warm search p50: 53ms (MLX 4-bit, Mac mini M4)
- Warm search p95: 55ms
- Cold start: 7.6s
- Text indexing: 5.0 docs/sec
- FTS miss fallback optimization (no expensive BM25 fallback on empty results)
- Bulk mode context manager for deferred FTS rebuilds during batch ingest
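The bulk-mode context manager amortizes the FTS rebuild across a batch: individual adds skip the rebuild, and one rebuild runs when the block exits. A self-contained sketch of that pattern, assuming a hypothetical `ToyIndex` (the real Tantivy-backed index is more involved):

```python
from contextlib import contextmanager

class ToyIndex:
    """Illustrative index that normally rebuilds FTS after every add."""
    def __init__(self):
        self.docs = []
        self.rebuilds = 0
        self._bulk = False

    def add(self, doc):
        self.docs.append(doc)
        if not self._bulk:
            self._rebuild_fts()

    def _rebuild_fts(self):
        self.rebuilds += 1  # stands in for an expensive FTS rebuild

    @contextmanager
    def bulk(self):
        """Defer FTS rebuilds until the batch finishes."""
        self._bulk = True
        try:
            yield self
        finally:
            self._bulk = False
            self._rebuild_fts()  # single rebuild for the whole batch
```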
- COCO retrieval benchmark (50 images, MLX 4-bit embed mode):
- Text → Image: R@1 23.6%, R@5 36.8%, R@10 46.4%
- Image → Text: R@1 30.0%, R@5 42.0%, R@10 54.0%
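For readers unfamiliar with the R@k metric used above: it is the fraction of queries whose correct item appears in the top k results. A minimal sketch, assuming one relevant item per query as in a caption-to-image setup:

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of queries whose relevant item appears in the top k.

    ranked_ids: one ranked result list per query.
    relevant_ids: the single correct item id per query.
    """
    hits = sum(1 for ranked, rel in zip(ranked_ids, relevant_ids)
               if rel in ranked[:k])
    return hits / len(relevant_ids)
```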
- 256+ unit tests, 35 SQL validation tests, comprehensive UAT suite
- GitHub Actions CI (Python 3.12/3.13 matrix)
- LRU embedding cache for repeat queries
- Watch folder with path-keyed dedup and flush-on-shutdown
- Configurable max file size on ingest
- MIT License
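The LRU embedding cache listed above lets repeat queries skip the embedding model entirely. A self-contained sketch of the pattern (class and parameter names are illustrative, not RecallForge's code):

```python
from collections import OrderedDict

class LRUEmbeddingCache:
    """Tiny LRU cache wrapping an embedding function: repeat queries
    return the cached vector instead of re-running the model."""
    def __init__(self, embed_fn, capacity=128):
        self._embed = embed_fn
        self._capacity = capacity
        self._cache = OrderedDict()
        self.hits = 0

    def __call__(self, query: str):
        if query in self._cache:
            self._cache.move_to_end(query)  # mark as recently used
            self.hits += 1
            return self._cache[query]
        vec = self._embed(query)
        self._cache[query] = vec
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return vec
```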