High-performance multi-model database with native AI/LLM integration
📚 Documentation · 🚀 Quick Start · 🛠️ Setup
The code in this repository was partially or fully generated by an AI tool (e.g., AI-assisted coding, vibe coding, or automated code generation). While efforts were made to ensure correctness and functionality, AI-generated code may:

- Lack modular consistency: cross-module dependencies, naming conventions, or architectural patterns may be inconsistent or suboptimal.
- Contain subtle bugs: logic errors, edge-case oversights, or inefficient patterns may persist.
- Require manual review: human oversight is essential for refactoring, testing, and alignment with project standards.
- Be outdated: AI models may not reflect the latest best practices, libraries, or security standards.

Use this code at your own risk. Contributions, feedback, and improvements are welcome to address these limitations.

This is an active development project. Current synchronized status snapshot (source-based):
- ✅ 15 modules are `PRODUCTION_CANDIDATE`
- 🟡 45 modules are `HARDENING`
- 🔴 2 modules are `EXPERIMENTAL` (`llama_cpp`, `stable_diffusion`)
- ⚪ 4 modules are `THIN/PLACEHOLDER` (`ai_working`, `distributed_tensor`, `evaluation`, `retrieval`)

See ROADMAP.md for the full 66-module table.
Evidence artifacts:
- logs/module_status_66_refined.csv
- logs/module_test_include_refs_66.csv
- logs/module_status_66_classified_v2.csv
- Root-level markdown documentation was reviewed and synchronized.
- Current wire/themis verification baseline:
  - `cmake --build --preset windows-release --target themis_tests --parallel 16`
  - `themis_tests --gtest_filter=WireProtocolServer.SingleThreadedIoContextPrunesSessionsAfterDisconnect`
  - `ctest --preset windows-release -R ThemisWireProtocolV1Tests --output-on-failure`
- Recent technical hardening reflected in docs/changelog:
  - fail-closed wire bootstrap behaviour retained for deprecated bridge-only setup
  - single-threaded wire server session pruning regression covered by dedicated test
  - `multi_lora_manager` opaque adapter handle consistency fix (`void*`)
- The current gap-scan status is maintained via the worklist:
  - `ai_working/gap_scan_report_ollama_gemma4.md`
  - `ai_working/gap_scan_report_ollama_gemma4.smoke.md`
- Scope rule: `themis_core` findings are actionable, `third_party` findings are informational only.
- Active tracking issue for the current baseline scope: #5475 ([P0-HIGH] INCLUDE Module - Current Gap Worklist Tracking (2026-06-11))
- Consolidation status of GitHub issues:
  - Historical v3-P0 and cross-module trackers were closed (superseded by #5475).
  - Duplicate tracker #5474 was closed.
  - The legacy migration issues #5363 through #5366 intentionally remain open.

ThemisDB is a high-performance multi-model database engine in active development that aims to combine relational, graph, vector, document, geospatial, and time-series storage in a single system with native AI/LLM integration.
Current Status (2026-06-14, source-evidence based): 66 modules are tracked in src; 15 are PRODUCTION_CANDIDATE, 45 are HARDENING, 2 are EXPERIMENTAL, and 4 are THIN/PLACEHOLDER. See ROADMAP.md for detailed per-module status.
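
The per-status counts can be re-derived from a status export. The following is a minimal sketch, assuming a CSV with `module` and `status` columns. The actual column layout of the files under `logs/` is not documented in this README, so the header names and the three sample rows are hypothetical placeholders.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; the real export may use different column names.
SAMPLE_CSV = """module,status
llama_cpp,EXPERIMENTAL
stable_diffusion,EXPERIMENTAL
retrieval,THIN/PLACEHOLDER
"""


def tally_status(csv_text: str, status_column: str = "status") -> Counter:
    """Count how many modules carry each status label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[status_column].strip() for row in reader)


if __name__ == "__main__":
    for status, count in sorted(tally_status(SAMPLE_CSV).items()):
        print(f"{status}: {count}")
```

To run this against a real export, read the file with `open(path, newline="")` and pass its contents to `tally_status`, adjusting `status_column` to the actual header.
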
Key capabilities at a glance:
| Capability | Details |
|---|---|
| Multi-model storage | Relational · Graph · Vector (HNSW/FAISS) · Document · Geospatial · Time-series |
| ACID transactions | MVCC, SSI, 2PC, SAGA orchestration, HLC-based global ordering |
| Distributed | Raft consensus, mTLS replication, consistent-hash sharding, auto-failover |
| AI/LLM native | Embedded LLM inference (llama.cpp, ONNX), RAG pipeline, prompt engineering, LoRA fine-tuning |
| Full-text search | BM25 + vector hybrid search (RRF), faceted, conversational, multi-modal |
| Observability | Prometheus metrics, OpenTelemetry tracing, PagerDuty/Slack alerting |
| Security | AES-256-GCM field encryption, RLS, Zero-Trust policy, eIDAS timestamping, HSM/Vault |
| Editions | MINIMAL · COMMUNITY · ENTERPRISE · MILITARY · HYPERSCALER |
For a consistent onboarding flow, use these pages in order:
- QUICKSTART.md — install + first successful run
- SETUP.md — complete local development environment
- SUPPORT.md — support and escalation paths
- RELEASE_STRATEGY.md — release lanes and version lifecycle
- INDEX.md — full root navigation map
```bash
docker pull ghcr.io/makr-code/themisdb:latest
docker run -d --name themisdb -p 8765:8765 -p 8766:8766 ghcr.io/makr-code/themisdb:latest
```

Connect via the wire protocol on port 8766 or the REST/HTTP API on port 8765.
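
Once the container is running, the REST port can also be exercised from a script. This is a minimal sketch using only the Python standard library. It targets the `/health` and `/v2/query` endpoints and the `{"query": ...}` request body that appear in the curl examples in this README. The response schema is not specified here, so the sketch returns the raw body without parsing it. The helper names are illustrative and not part of any ThemisDB client.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # REST/HTTP port published by `docker run`


def build_health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the /health endpoint."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def build_query_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /v2/query with a JSON body {"query": "..."}."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v2/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(send(build_health_request()))
    print(send(build_query_request("SELECT 1 AS hello")))
```

Keeping request construction separate from sending makes the helpers easy to check without a running server.
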
```bash
git clone https://github.com/makr-code/ThemisDB.git
cd ThemisDB

# Install dependencies and configure build environment
./scripts/setup-pre-commit.sh            # Linux/macOS
# pwsh ./scripts/setup-third-party.ps1   # all platforms (vcpkg, llama.cpp, ffmpeg)

cmake --preset linux-release             # Linux x64
cmake --build --preset linux-release

# Windows (run from VS Developer Command Prompt):
# cmake --preset windows-release && cmake --build --preset windows-release
```

See QUICKSTART.md for a step-by-step guide, and SETUP.md for a full development-environment walkthrough.

ThemisDB is available in five editions, selected at CMake build time:
| Edition | Use case | Branch | Build flag |
|---|---|---|---|
| MINIMAL | Embedded / resource-constrained | `minimal` | `-DTHEMIS_EDITION=MINIMAL` |
| COMMUNITY | Open-source, self-hosted | `community` | `-DTHEMIS_EDITION=COMMUNITY` |
| ENTERPRISE | Commercial, SLA-backed | `enterprise` | `-DTHEMIS_EDITION=ENTERPRISE` |
| MILITARY | Hardened / air-gapped | `military` | `-DTHEMIS_EDITION=MILITARY` |
| HYPERSCALER | Cloud/OEM, Kubernetes operator | `hyperscaler` | `-DTHEMIS_EDITION=HYPERSCALER` |

Feature sets are nested: MINIMAL ⊂ COMMUNITY ⊂ ENTERPRISE ⊂ HYPERSCALER.
See RELEASE_STRATEGY.md for the full feature comparison and edition matrix.
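
For scripted builds, the edition flag from the table can be assembled programmatically. This is a minimal sketch that validates the edition name and builds a configure command. It assumes the flag is passed on the same command line as `--preset`. Whether the repository's presets already pin an edition is not stated here, so treat the combination as an assumption. `configure_command` is a hypothetical helper name.

```python
import shlex

# The five editions listed in the table above.
EDITIONS = ("MINIMAL", "COMMUNITY", "ENTERPRISE", "MILITARY", "HYPERSCALER")


def configure_command(edition: str, preset: str = "linux-release") -> list[str]:
    """Return the CMake configure argv for the requested edition."""
    name = edition.upper()
    if name not in EDITIONS:
        raise ValueError(f"unknown edition {edition!r}; expected one of {EDITIONS}")
    return ["cmake", "--preset", preset, f"-DTHEMIS_EDITION={name}"]


if __name__ == "__main__":
    # prints: cmake --preset linux-release -DTHEMIS_EDITION=COMMUNITY
    print(shlex.join(configure_command("community")))
```

Returning an argv list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting concerns.
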
GitHub workflow and branch governance references:
- BRANCHING_STRATEGY.md - canonical branch model (`develop`, `minimal`, `community`, `enterprise`, `hyperscaler`, `military`)
- .github/GOVERNANCE.md - labels, milestones, and issue/PR metadata standards
- .github/pull_request_template.md - required PR evidence sections

After startup, verify health and run a first query:
```bash
curl http://localhost:8765/health
curl -X POST http://localhost:8765/v2/query \
  -H 'Content-Type: application/json' \
  -d '{"query":"SELECT 1 AS hello"}'
```

ThemisDB is organised into tracked source modules under `src/`, grouped into four logical layers:
┌─────────────────────────────────────────────────────┐
│ API Layer REST · GraphQL · gRPC · Wire V2 │
├─────────────────────────────────────────────────────┤
│ Query Layer AQL · Optimizer · Planner · Cache │
├─────────────────────────────────────────────────────┤
│ Storage Layer RocksDB · MVCC · WAL · Sharding │
├─────────────────────────────────────────────────────┤
│ Distributed Raft · Replication · Failover · CDC │
└─────────────────────────────────────────────────────┘
→ Full architecture reference: ARCHITECTURE.md
→ Module list and status: ROADMAP.md
For security and hardening reviews, use the tier model (T0 Trusted Core -> T5 Plugin Boundary) as the default classification.
| What | Where |
|---|---|
| Tier model and trust boundaries | ARCHITECTURE.md |
| Normative security rules per tier | SECURITY.md |
| Contributor checklist for tier/boundary evidence | CONTRIBUTING.md |
| PR template section (required for runtime changes) | .github/pull_request_template.md |
| Tier-to-test verification mapping | CTEST.md |
Rule of thumb: architecture is layered, but security acceptance is tier-based.
| Document | Description |
|---|---|
| QUICKSTART.md | Get running in minutes |
| SETUP.md | Full development environment setup |
| ARCHITECTURE.md | System design and module overview |
| VERSIONING.md | Versioning policy and release cadence |
| RELEASE_STRATEGY.md | Branch model, edition matrix, CI/CD pipeline |
| CHANGELOG.md | Release notes (Keep a Changelog format) |
| PERFORMANCE_EXPECTATIONS.md | Benchmarks and performance targets |
| SOP.md | Standard operating procedures (release, hotfix, incident) |
| GOVERNANCE.md | Project governance: roles, decision-making, contribution policy |
| MAINTAINERS.md | Maintainer roster and module ownership |
| SECURITY.md | Security policy and vulnerability reporting |
| CONTRIBUTING.md | How to contribute |
| CODE_OF_CONDUCT.md | Community guidelines |
| SUPPORT.md | Where to get help |
| INDEX.md | Full project structure index |
| docs/ | Extended documentation (API reference, guides, research) |
| compendium/ | In-depth technical compendium |
ThemisDB follows Semantic Versioning 2.0.0. The current version is stored in the VERSION file and in CHANGELOG.md. Pre-release identifiers use the form -rcN (release candidate) or -alphaN / -betaN.
See VERSIONING.md for the full versioning policy.
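
The version format described above can be validated in tooling. This is a minimal sketch that accepts `MAJOR.MINOR.PATCH` with an optional `-alphaN`, `-betaN`, or `-rcN` suffix. The ordering helper is an assumption: it compares `N` numerically, whereas strict Semantic Versioning 2.0.0 compares alphanumeric identifiers such as `rc10` and `rc2` in ASCII order. The version strings used with it are illustrative and are not actual ThemisDB releases.

```python
import re
from typing import NamedTuple, Optional

_VERSION_RE = re.compile(
    r"^(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<stage>alpha|beta|rc)(?P<n>\d+))?$"
)

# Pre-release stages sort before the final release (None).
_STAGE_RANK = {"alpha": 0, "beta": 1, "rc": 2, None: 3}


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    stage: Optional[str]  # "alpha", "beta", "rc", or None for a final release
    n: Optional[int]      # pre-release counter, None for a final release


def parse_version(text: str) -> Version:
    """Parse a version string such as '1.2.3' or '1.2.3-rc1'."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised version string: {text!r}")
    stage = match["stage"]
    return Version(
        int(match["major"]), int(match["minor"]), int(match["patch"]),
        stage, int(match["n"]) if stage else None,
    )


def sort_key(version: Version) -> tuple:
    """Order by core version, then stage, then numeric pre-release counter."""
    return (version.major, version.minor, version.patch,
            _STAGE_RANK[version.stage], version.n or 0)
```

For example, `sorted(tags, key=lambda t: sort_key(parse_version(t)))` places every pre-release of a version before its final release.
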
Contributions are welcome! Please read CONTRIBUTING.md before submitting a pull request. All participants are expected to follow our Code of Conduct.
Good first issues are tagged good first issue in the issue tracker.
To report a security vulnerability, do not open a public issue. Follow the responsible disclosure process in SECURITY.md or use GitHub Security Advisories.
ThemisDB is released under the MIT License.
ThemisDB includes Gap Scanner V3 (GS3), a comprehensive multi-phase gap detection system with 46 specialized scanners organized across 4 phases.
```bash
# List all 46 scanners
python tools/gs3.py list-scanners

# Run fast scan on source code
python tools/gs3.py scan src include tests --scan-mode fast --output results.json

# Generate Markdown report
python tools/gs3.py report results.json --format md --output report.md

# Generate JSON report
python tools/gs3.py report results.json --format json
```

| Phase | Category | Count | Focus |
|---|---|---|---|
| Phase 1 | AI, Core C++, Checks | 18 | Baseline detection (AI-Vibe, memory, concurrency) |
| Phase 2 | Safety | 5 | Exception safety, input validation, type safety |
| Phase 3 | Security | 7 | Cryptography, data leaks, hardening |
| Phase 4 | Design & Quality | 16 | Architecture rules, documentation standards |
- tools/GS3_CLI_GUIDE.md — Complete CLI reference and usage guide
- tools/GS3_COMPLETE_GUIDE.md — System architecture and scanner design
- tools/GS3_PROJECT_COMPLETION_REPORT.md — Project deliverables and metrics
- tools/legacy/LEGACY_SCANNER_MAPPING.md — Legacy code archival and migration info
- 46 specialized scanners for AI-Vibe, C++, security, and design gaps
- Dual-axis classification: Severity (CRITICAL/HIGH/MEDIUM/LOW) × Impact (CRITICAL/HIGH/MEDIUM/LOW/THIRD_PARTY)
- Auto-discovery: Scanners automatically discovered from `tools/scanners/`
- Multiple output formats: JSON (machine-readable) and Markdown (human-readable)
- Scan modes: Fast (quick pass) and Thorough (detailed analysis)
- Phase-based execution: Sequential scanning through phases 1-4
```bash
# Fast scan for PR validation (< 3 minutes)
python tools/gs3.py scan src --scan-mode fast --output pr_scan.json

# Fail on critical blockers
if grep -q '"severity":"CRITICAL".*"impact_level":"CRITICAL"' pr_scan.json; then
  echo "FAILED: Critical blockers detected"
  exit 1
fi
```

See tools/GS3_CLI_GUIDE.md for more CI/CD examples.
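
The `grep` gate above depends on how the JSON is serialized. It matches only when both keys sit on one line, in that order, with no whitespace after the colons. On single-line output it can also pair the `severity` of one finding with the `impact_level` of another. A parser-based check avoids both problems. This is a minimal sketch that takes the `severity` and `impact_level` field names from the pattern above. The nesting of the report is not documented here, so the code walks the whole document rather than assuming a particular layout.

```python
import json
import sys
from typing import Any, Iterator


def iter_findings(node: Any) -> Iterator[dict]:
    """Yield every JSON object that carries both classification fields."""
    if isinstance(node, dict):
        if "severity" in node and "impact_level" in node:
            yield node
        for value in node.values():
            yield from iter_findings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_findings(item)


def critical_blockers(report: Any) -> list[dict]:
    """Return findings that are CRITICAL on both the severity and impact axes."""
    return [
        finding for finding in iter_findings(report)
        if finding["severity"] == "CRITICAL" and finding["impact_level"] == "CRITICAL"
    ]


def main(path: str) -> int:
    """Return exit code 1 if the report at `path` contains critical blockers."""
    with open(path, encoding="utf-8") as handle:
        blockers = critical_blockers(json.load(handle))
    if blockers:
        print(f"FAILED: {len(blockers)} critical blocker(s) detected")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "pr_scan.json"))
```

Saved under a name of your choice, the script can replace the `grep` step in a pipeline, since its exit code follows the same convention.
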
Per-module documentation lives in `src/<module>/README.md` and `include/<module>/`. This section is a navigation index.
Last reviewed (root sync): 2026-06-21