ThemisDB

High-performance multi-model database with native AI/LLM integration

📚 Documentation · 🚀 Quick Start · 🛠️ Setup · ⚠️ Status · 🆘 Support · Release Notes

Note on AI vibe coding

The code in this repository was partially or fully generated by an AI tool (e.g., AI-assisted coding, vibe coding, or automated code generation). While efforts were made to ensure correctness and functionality, AI-generated code may:

- Lack modular consistency: Cross-module dependencies, naming conventions, or architectural patterns may be inconsistent or suboptimal.
- Contain subtle bugs: Logic errors, edge-case oversights, or inefficient patterns may persist.
- Require manual review: Human oversight is essential for refactoring, testing, and alignment with project standards.
- Be outdated: AI models may not reflect the latest best practices, libraries, or security standards.

Use this code at your own risk. Contributions, feedback, and improvements are welcome to address these limitations.

⚠️ IMPORTANT: Module Status Snapshot (66/66)

This is an active development project. Current synchronized status snapshot (source-based):

✅ 15 modules are PRODUCTION_CANDIDATE
🟡 45 modules are HARDENING
🔴 2 modules are EXPERIMENTAL (llama_cpp, stable_diffusion)
⚪ 4 modules are THIN/PLACEHOLDER (ai_working, distributed_tensor, evaluation, retrieval)

See ROADMAP.md for the full 66-module table.

Evidence artifacts:

Documentation Sync (2026-05-26)

Root-level markdown documentation was reviewed and synchronized.
Current wire/themis verification baseline:
- cmake --build --preset windows-release --target themis_tests --parallel 16
- themis_tests --gtest_filter=WireProtocolServer.SingleThreadedIoContextPrunesSessionsAfterDisconnect
- ctest --preset windows-release -R ThemisWireProtocolV1Tests --output-on-failure
Recent technical hardening reflected in docs/changelog:
- fail-closed wire bootstrap behaviour retained for deprecated bridge-only setup
- single-threaded wire server session pruning regression covered by dedicated test
- multi_lora_manager opaque adapter handle consistency fix (void*)

Scanner Baseline Update (2026-06-11)

The current gap-scan status is maintained via the worklist:
- ai_working/gap_scan_report_ollama_gemma4.md
- ai_working/gap_scan_report_ollama_gemma4.smoke.md
Scope rule: themis_core findings are actionable; third_party findings are informational only.
Active tracking issue for the current baseline scope:
- #5475 ([P0-HIGH] INCLUDE Module - Current Gap Worklist Tracking (2026-06-11))
GitHub issue consolidation status:
- Historical v3-P0 and cross-module trackers have been closed (superseded by #5475).
- Duplicate tracker #5474 has been closed.
- The legacy migration issues #5363 through #5366 intentionally remain open.

What is ThemisDB?

ThemisDB is a high-performance multi-model database engine in active development that aims to combine relational, graph, vector, document, geospatial, and time-series storage in a single system with native AI/LLM integration.

Current Status (2026-06-14, source-evidence based): 66 modules are tracked in src; 15 are PRODUCTION_CANDIDATE, 45 are HARDENING, 2 are EXPERIMENTAL, and 4 are THIN/PLACEHOLDER. See ROADMAP.md for detailed per-module status.

Key capabilities at a glance:

Capability	Details
Multi-model storage	Relational · Graph · Vector (HNSW/FAISS) · Document · Geospatial · Time-series
ACID transactions	MVCC, SSI, 2PC, SAGA orchestration, HLC-based global ordering
Distributed	Raft consensus, mTLS replication, consistent-hash sharding, auto-failover
AI/LLM native	Embedded LLM inference (llama.cpp, ONNX), RAG pipeline, prompt engineering, LoRA fine-tuning
Full-text search	BM25 + vector hybrid search (RRF), faceted, conversational, multi-modal
Observability	Prometheus metrics, OpenTelemetry tracing, PagerDuty/Slack alerting
Security	AES-256-GCM field encryption, RLS, Zero-Trust policy, eIDAS timestamping, HSM/Vault
Editions	MINIMAL · COMMUNITY · ENTERPRISE · MILITARY · HYPERSCALER

Canonical Onboarding Path

For a consistent onboarding flow, use these pages in order:

QUICKSTART.md — install + first successful run
SETUP.md — complete local development environment
SUPPORT.md — support and escalation paths
RELEASE_STRATEGY.md — release lanes and version lifecycle
INDEX.md — full root navigation map

Installation

Docker (fastest)

docker pull ghcr.io/makr-code/themisdb:latest
docker run -d --name themisdb -p 8765:8765 -p 8766:8766 ghcr.io/makr-code/themisdb:latest

Connect via the wire protocol on port 8766 or the REST/HTTP API on port 8765.
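
Before connecting, it can help to confirm that both published ports accept TCP connections. The following is a minimal sketch, not part of ThemisDB: the helper name `port_open` and the use of `localhost` are assumptions for illustration, and only the two port numbers come from the `docker run` command above.

```python
import socket

# Port numbers are taken from the docker run command above.
HTTP_PORT = 8765   # REST/HTTP API
WIRE_PORT = 8766   # wire protocol


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `port_open("localhost", HTTP_PORT)` and `port_open("localhost", WIRE_PORT)` after `docker run` only shows that something is listening; it does not replace the `/health` check described under Usage.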

Build from source

git clone https://github.com/makr-code/ThemisDB.git
cd ThemisDB
# Install dependencies and configure build environment
./scripts/setup-pre-commit.sh          # Linux/macOS
# pwsh ./scripts/setup-third-party.ps1  # all platforms (vcpkg, llama.cpp, ffmpeg)

cmake --preset linux-release        # Linux x64
cmake --build --preset linux-release

# Windows (run from VS Developer Command Prompt):
# cmake --preset windows-release && cmake --build --preset windows-release

See QUICKSTART.md for a step-by-step guide, and SETUP.md for a full development-environment walkthrough.

Editions

ThemisDB is available in five editions, selected at CMake build time:

Edition	Use case	Branch	Build flag
MINIMAL	Embedded / resource-constrained	`minimal`	`-DTHEMIS_EDITION=MINIMAL`
COMMUNITY	Open-source, self-hosted	`community`	`-DTHEMIS_EDITION=COMMUNITY`
ENTERPRISE	Commercial, SLA-backed	`enterprise`	`-DTHEMIS_EDITION=ENTERPRISE`
MILITARY	Hardened / air-gapped	`military`	`-DTHEMIS_EDITION=MILITARY`
HYPERSCALER	Cloud/OEM, Kubernetes operator	`hyperscaler`	`-DTHEMIS_EDITION=HYPERSCALER`

Feature sets are nested: MINIMAL ⊂ COMMUNITY ⊂ ENTERPRISE ⊂ HYPERSCALER.
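
The nesting rule can be sketched as an ordered list. This is an illustration only: the helpers `includes` and `cmake_flag` are hypothetical names, and MILITARY is deliberately left out of the chain because the statement above does not place it.

```python
# Nesting chain exactly as stated above; MILITARY is not part of the documented chain.
NESTING = ["MINIMAL", "COMMUNITY", "ENTERPRISE", "HYPERSCALER"]


def includes(edition: str, other: str) -> bool:
    """Return True if `edition` offers at least the feature set of `other`.

    An edition trivially includes itself; editions outside the chain are rejected.
    """
    if edition not in NESTING or other not in NESTING:
        raise ValueError(f"edition outside the documented chain: {edition!r}, {other!r}")
    return NESTING.index(edition) >= NESTING.index(other)


def cmake_flag(edition: str) -> str:
    """Return the build flag listed in the Editions table for `edition`."""
    return f"-DTHEMIS_EDITION={edition}"
```

For example, `includes("ENTERPRISE", "COMMUNITY")` is true, while `includes("COMMUNITY", "ENTERPRISE")` is false.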

See RELEASE_STRATEGY.md for the full feature comparison and edition matrix.

GitHub workflow and branch governance references:

BRANCHING_STRATEGY.md - canonical branch model (develop, minimal, community, enterprise, hyperscaler, military)
.github/GOVERNANCE.md - labels, milestones, and issue/PR metadata standards
.github/pull_request_template.md - required PR evidence sections

Usage

After startup, verify health and run a first query:

curl http://localhost:8765/health
curl -X POST http://localhost:8765/v2/query \
  -H 'Content-Type: application/json' \
  -d '{"query":"SELECT 1 AS hello"}'
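
The same query can be issued from Python with the standard library. This is a hedged sketch that mirrors the curl call above: the function names are illustrative, the base URL assumes the default port mapping from the Docker example, and the response is returned as raw text because its schema is not described here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # assumption: default port mapping from the Docker example


def build_query_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request that mirrors the curl call to /v2/query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v2/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, base_url: str = BASE_URL, timeout: float = 5.0) -> str:
    """Send the query and return the raw response body as text."""
    request = build_query_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a running instance, `print(run_query("SELECT 1 AS hello"))` corresponds to the second curl command.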

Architecture

ThemisDB is organised into tracked source modules under src/, grouped into four logical layers:

┌─────────────────────────────────────────────────────┐
│  API Layer       REST · GraphQL · gRPC · Wire V2     │
├─────────────────────────────────────────────────────┤
│  Query Layer     AQL · Optimizer · Planner · Cache   │
├─────────────────────────────────────────────────────┤
│  Storage Layer   RocksDB · MVCC · WAL · Sharding     │
├─────────────────────────────────────────────────────┤
│  Distributed     Raft · Replication · Failover · CDC │
└─────────────────────────────────────────────────────┘

→ Full architecture reference: ARCHITECTURE.md
→ Module list and status: ROADMAP.md

Security Tiering Quick Reference

For security and hardening reviews, use the tier model (T0 Trusted Core -> T5 Plugin Boundary) as the default classification.

What	Where
Tier model and trust boundaries	ARCHITECTURE.md
Normative security rules per tier	SECURITY.md
Contributor checklist for tier/boundary evidence	CONTRIBUTING.md
PR template section (required for runtime changes)	.github/pull_request_template.md
Tier-to-test verification mapping	CTEST.md

Rule of thumb: architecture is layered, but security acceptance is tier-based.

Documentation

Document	Description
QUICKSTART.md	Get running in minutes
SETUP.md	Full development environment setup
ARCHITECTURE.md	System design and module overview
VERSIONING.md	Versioning policy and release cadence
RELEASE_STRATEGY.md	Branch model, edition matrix, CI/CD pipeline
CHANGELOG.md	Release notes (Keep a Changelog format)
PERFORMANCE_EXPECTATIONS.md	Benchmarks and performance targets
SOP.md	Standard operating procedures (release, hotfix, incident)
GOVERNANCE.md	Project governance: roles, decision-making, contribution policy
MAINTAINERS.md	Maintainer roster and module ownership
SECURITY.md	Security policy and vulnerability reporting
CONTRIBUTING.md	How to contribute
CODE_OF_CONDUCT.md	Community guidelines
SUPPORT.md	Where to get help
INDEX.md	Full project structure index
docs/	Extended documentation (API reference, guides, research)
compendium/	In-depth technical compendium

Versioning

ThemisDB follows Semantic Versioning 2.0.0. The current version is stored in the VERSION file and in CHANGELOG.md. Pre-release identifiers use the form -rcN (release candidate) or -alphaN / -betaN.
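
A version string in this form can be validated with a small parser. The sketch below is an assumption-laden illustration rather than project code: `parse_version` and `Version` are hypothetical names, and only the three pre-release forms named above are accepted, which is narrower than the full SemVer grammar.

```python
import re
from typing import NamedTuple, Optional

_VERSION_RE = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"  # MAJOR.MINOR.PATCH, no leading zeros
    r"(?:-(rc|alpha|beta)(\d+))?$"                # optional -rcN, -alphaN, or -betaN
)


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    pre_kind: Optional[str]    # "rc", "alpha", "beta", or None for a final release
    pre_number: Optional[int]  # the N in -rcN / -alphaN / -betaN, or None


def parse_version(text: str) -> Version:
    """Parse a version string, tolerating surrounding whitespace such as a trailing newline."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised version string: {text!r}")
    major, minor, patch, kind, number = match.groups()
    return Version(
        int(major),
        int(minor),
        int(patch),
        kind,
        int(number) if number is not None else None,
    )
```

Reading the VERSION file and passing its contents to `parse_version` would catch malformed entries before a release is tagged.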

See VERSIONING.md for the full versioning policy.

Contributing

Contributions are welcome! Please read CONTRIBUTING.md before submitting a pull request. All participants are expected to follow our Code of Conduct.

Good first issues are tagged `good first issue` in the issue tracker.

Security

To report a security vulnerability, do not open a public issue. Follow the responsible disclosure process in SECURITY.md or use GitHub Security Advisories.

License

ThemisDB is released under the MIT License.

Quality Assurance & Gap Scanning

ThemisDB includes Gap Scanner V3 (GS3), a comprehensive multi-phase gap detection system with 46 specialized scanners organized across 4 phases.

Quick Start with GS3

# List all 46 scanners
python tools/gs3.py list-scanners

# Run fast scan on source code
python tools/gs3.py scan src include tests --scan-mode fast --output results.json

# Generate Markdown report
python tools/gs3.py report results.json --format md --output report.md

# Generate JSON report
python tools/gs3.py report results.json --format json

Scanner Organization

Phase	Category	Count	Focus
Phase 1	AI, Core C++, Checks	18	Baseline detection (AI-Vibe, memory, concurrency)
Phase 2	Safety	5	Exception safety, input validation, type safety
Phase 3	Security	7	Cryptography, data leaks, hardening
Phase 4	Design & Quality	16	Architecture rules, documentation standards

Documentation

tools/GS3_CLI_GUIDE.md — Complete CLI reference and usage guide
tools/GS3_COMPLETE_GUIDE.md — System architecture and scanner design
tools/GS3_PROJECT_COMPLETION_REPORT.md — Project deliverables and metrics
tools/legacy/LEGACY_SCANNER_MAPPING.md — Legacy code archival and migration info

Key Features

46 specialized scanners for AI-Vibe, C++, security, and design gaps
Dual-axis classification: Severity (CRITICAL/HIGH/MEDIUM/LOW) × Impact (CRITICAL/HIGH/MEDIUM/LOW/THIRD_PARTY)
Auto-discovery: Scanners automatically discovered from tools/scanners/
Multiple output formats: JSON (machine-readable) and Markdown (human-readable)
Scan modes: Fast (quick pass) and Thorough (detailed analysis)
Phase-based execution: Sequential scanning through phases 1-4
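
The dual-axis classification amounts to a 4 × 5 grid of (severity, impact) cells. The following sketch tallies findings per cell. It assumes each finding is a mapping with `severity` and `impact_level` keys (the field names used by the grep in the CI/CD example); the real report schema may differ, and `classify` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, Mapping

# Axis labels as listed in the Key Features above.
SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")
IMPACTS = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "THIRD_PARTY")


def classify(findings: Iterable[Mapping[str, str]]) -> Counter:
    """Count findings per (severity, impact) cell, rejecting labels outside both axes."""
    grid: Counter = Counter()
    for finding in findings:
        severity = finding["severity"]
        impact = finding["impact_level"]  # assumed field name
        if severity not in SEVERITIES or impact not in IMPACTS:
            raise ValueError(f"unknown classification: {severity!r} x {impact!r}")
        grid[(severity, impact)] += 1
    return grid
```

Keeping the two axes separate lets a HIGH-severity finding in third-party code be counted without being treated like a HIGH-severity finding in project code.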

CI/CD Integration

# Fast scan for PR validation (< 3 minutes)
python tools/gs3.py scan src --scan-mode fast --output pr_scan.json

# Fail on critical blockers
if grep -q '"severity":"CRITICAL".*"impact_level":"CRITICAL"' pr_scan.json; then
  echo "FAILED: Critical blockers detected"
  exit 1
fi

See tools/GS3_CLI_GUIDE.md for more CI/CD examples.
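
The grep check above depends on the exact whitespace of the JSON output, and on a single-line file its `.*` can pair the severity of one finding with the impact of another. A parsed check avoids both issues. The sketch below is a possible alternative, not part of GS3: it assumes only that findings are JSON objects carrying `severity` and `impact_level` keys somewhere in the report, and it walks the whole document because the report layout is not specified here.

```python
import json
from typing import Any, Iterator, List


def iter_findings(node: Any) -> Iterator[dict]:
    """Yield every JSON object, at any depth, that carries a "severity" key."""
    if isinstance(node, dict):
        if "severity" in node:
            yield node
        for value in node.values():
            yield from iter_findings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_findings(item)


def critical_blockers(report: Any) -> List[dict]:
    """Return findings that are CRITICAL on both axes within the same object."""
    return [
        finding
        for finding in iter_findings(report)
        if finding.get("severity") == "CRITICAL"
        and finding.get("impact_level") == "CRITICAL"
    ]


def main(path: str) -> int:
    """Return 1 if the report at `path` contains critical blockers, else 0."""
    with open(path, encoding="utf-8") as handle:
        blockers = critical_blockers(json.load(handle))
    if blockers:
        print(f"FAILED: {len(blockers)} critical blocker(s) detected")
        return 1
    return 0
```

In a pipeline, the script would end with `sys.exit(main("pr_scan.json"))` (after `import sys`) so that the job fails on a non-zero return value.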

Module Documentation

Per-module documentation lives in src/<module>/README.md and include/<module>/. This section is a navigation index.

Last reviewed (root sync): 2026-06-21

