One API. Multiple engines. Zero friction.
UniOCR is a unified, multilingual OCR abstraction layer that wraps best-in-class OCR engines behind a single, clean interface. Throw any image or PDF at it — get back structured text, Markdown, and layout blocks — regardless of which engine runs under the hood.
Built for developers, AI agents, and automation pipelines (n8n, Dify, Telegram bots, etc.).
- 🖥️ Stunning Enterprise Dashboard — A fully re-engineered Glassmorphism Web UI featuring an interactive OCR playground, API generator, and live system monitoring.
- 📊 Geek-Level Hardware Radar — Real-time backend polling of physical sensor data: CPU/GPU frequencies, RAM/Swap allocation, Apple Neural Engine status, and active AI model library versions.
- 🔐 Military-Grade Security — Built-in local SQLite persistence. Full support for 2FA (TOTP), Admin master passwords, and one-click toggles between public/private API access.
- 🔑 Seamless API Key Management — Issue and revoke API tokens directly from the UI, with auto-generated, ready-to-use `curl` snippets for instant integration testing.
- 🔌 Pluggable Engines — PaddleOCR-VL (deep document AI) and Apple Vision (native macOS) with automatic priority fallback.
- ⚡ Zero-Config Acceleration — Auto-detects Apple Silicon → launches MLX-VLM → offloads to Neural Engine (NPU).
- 🚀 Zero-Delay Smart Cache (LRU) — Instantaneous format switching (TXT, JSON, MD, PDF download/preview) for recent files without re-running the neural network.
- 🐳 Docker Ready — Single-command deployment via Docker Compose for production-grade frontend & backend.
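
The smart cache described above can be pictured as a small LRU map keyed by file content, so that switching output formats reuses a stored result instead of re-running the model. The following is only a sketch under that assumption: `ResultCache`, `get_or_compute`, and the `run_ocr` callback are hypothetical names for illustration, not UniOCR's actual internals.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Dict


class ResultCache:
    """Hypothetical LRU cache of OCR results, keyed by a hash of the file bytes."""

    def __init__(self, capacity: int = 32) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[str, Dict]" = OrderedDict()

    @staticmethod
    def key_for(data: bytes) -> str:
        # Identical file contents map to the same key, whatever the filename.
        return hashlib.sha256(data).hexdigest()

    def get_or_compute(self, data: bytes, run_ocr: Callable[[bytes], Dict]) -> Dict:
        key = self.key_for(data)
        if key in self._items:
            # Cache hit: mark as most recently used and skip the OCR run.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: run the (expensive) OCR callback once and store the result.
        result = run_ocr(data)
        self._items[key] = result
        if len(self._items) > self.capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return result
```

With a layout like this, rendering the same document as TXT, JSON, or Markdown would read from the cached result, and only a file that has been evicted or never seen triggers a new OCR pass.
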
# Core only (lightweight)
pip install uniocr
# With PaddleOCR-VL (powerful document AI, ~1.8 GB model download on first run)
pip install "uniocr[paddle]"
# With Apple Vision (macOS only, uses built-in system OCR)
pip install "uniocr[apple]"
# Everything (Recommended for Dashboard & API)
pip install "uniocr[all]"

# Use Docker Compose (pulls and runs all components instantly)
curl -O https://raw.githubusercontent.com/yuanweize/uni-ocr/master/docker-compose.yml
docker compose up -d
# Check it's running
curl http://localhost:8000/health

from uniocr import UniOCR
ocr = UniOCR(engine="auto") # Auto-selects best available engine
doc = ocr.extract("invoice.pdf")
print(doc.text) # Plain text
print(doc.markdown) # Structured Markdown
print(doc.to_dict()) # JSON-serialisable dict

# Start the full Web UI Console & API server
uniocr serve --port 8000
# Extract text (outputs Markdown by default)
uniocr extract document.pdf -o result.md
# Generate a Searchable PDF
uniocr extract input_image.jpg -o output_searchable.pdf

Start the server:

uniocr serve --port 8000

- Web UI Console: http://localhost:8000/
- System Settings & Radar: http://localhost:8000/settings
- Interactive API Docs: http://localhost:8000/docs

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check & engine list |
| POST | `/extract` | Extract text from uploaded file (JSON/Markdown) |
| POST | `/extract/pdf` | Extract text and return a Searchable PDF file |
| POST | `/extract/url` | Extract text from URL |

(If Public API Access is disabled, these endpoints require an `Authorization: Bearer <API_KEY>` header.)

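
As a rough illustration of calling the API from Python, the sketch below builds a request for the `/health` endpoint and attaches the `Authorization: Bearer <API_KEY>` header only when a key is supplied. It uses the standard library alone. The helper names `build_request` and `check_health` are hypothetical, and the assumption that `/health` returns a JSON body is not confirmed by the text above.

```python
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, api_key: Optional[str] = None,
                  method: str = "GET") -> urllib.request.Request:
    """Build a request for a UniOCR endpoint (helper name is hypothetical)."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Only needed when Public API Access is disabled on the server.
        headers["Authorization"] = f"Bearer {api_key}"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers=headers, method=method)


def check_health(base_url: str = "http://localhost:8000",
                 api_key: Optional[str] = None,
                 timeout: float = 5.0) -> dict:
    """Call GET /health and decode the body, assuming the server replies with JSON."""
    request = build_request(base_url, "/health", api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `check_health()` against a running server should return whatever the health endpoint reports. The exact shape of the upload request for `/extract` is best taken from the interactive docs at `/docs`.
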
git clone https://github.com/yuanweize/uni-ocr.git
cd uni-ocr
docker compose up -d --build

| Priority | Engine | Best for | Speed |
|---|---|---|---|
| 1 | PaddleOCR-VL + MLX-VLM | Complex layouts, tables, formulas, 109 languages | ⚡⚡ |
| 2 | PaddleOCR-VL (CPU) | Same capabilities, without MLX acceleration | ⚡ |
| 3 | Apple Vision | Simple text, macOS only, instant | ⚡⚡⚡ |
Apple Silicon users: when `mlx-vlm` is installed, UniOCR automatically starts an MLX-VLM server for Neural Engine acceleration. No configuration needed.
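
The priority table above amounts to "try engines in order and take the first one that is usable". A minimal sketch of that idea follows. The `select_engine` function and the probe callables are hypothetical and do not reflect UniOCR's real selection code; the Apple Silicon check relies only on the standard `platform` module.

```python
import platform
from typing import Callable, List, Tuple

# Each probe pairs an engine name with a zero-argument availability check.
EngineProbe = Tuple[str, Callable[[], bool]]


def is_apple_silicon() -> bool:
    """True on macOS running natively on an arm64 (Apple Silicon) CPU."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


def select_engine(probes: List[EngineProbe]) -> str:
    """Return the first engine whose probe succeeds, in priority order."""
    for name, is_available in probes:
        try:
            if is_available():
                return name
        except Exception:
            # A failing probe (e.g. a missing dependency) means "fall through".
            continue
    raise RuntimeError("no OCR engine available")
```

A caller would pass the probes in the same order as the table, for example an MLX-accelerated PaddleOCR-VL probe gated on `is_apple_silicon()`, then the CPU variant, then Apple Vision.
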
MIT © 2026 Weize Yuan