GitHub - yuanweize/uni-ocr: UniOCR - Unified multilingual OCR service. Supports PaddleOCR / Apple Vision engines with deep MLX-VLM neural network hardware acceleration. A blazing-fast, local-first API for enterprise AI workflows (n8n/Bob).

One API. Multiple engines. Zero friction.

UniOCR is a unified, multilingual OCR abstraction layer that wraps best-in-class OCR engines behind a single, clean interface. Throw any image or PDF at it — get back structured text, Markdown, and layout blocks — regardless of which engine runs under the hood.

Built for developers, AI agents, and automation pipelines (n8n, Dify, Telegram bots, etc.).

✨ V3.0 Pro Max Highlights

🖥️ Stunning Enterprise Dashboard — A fully re-engineered Glassmorphism Web UI featuring an interactive OCR playground, API generator, and live system monitoring.
📊 Geek-Level Hardware Radar — Real-time backend polling of physical sensor data: CPU/GPU frequencies, RAM/Swap allocation, Apple Neural Engine status, and active AI model library versions.
🔐 Military-Grade Security — Built-in local SQLite persistence. Full support for 2FA (TOTP), Admin master passwords, and one-click toggles between public/private API access.
🔑 Seamless API Key Management — Issue and revoke API Tokens directly from the UI, with auto-generated ready-to-use curl snippets for instant integration testing.
🔌 Pluggable Engines — PaddleOCR-VL (deep document AI) and Apple Vision (native macOS) with automatic priority fallback.
⚡ Zero-Config Acceleration — Auto-detects Apple Silicon → launches MLX-VLM → offloads to Neural Engine (NPU).
🚀 Zero-Delay Smart Cache (LRU) — Instantaneous format switching (TXT, JSON, MD, PDF download/preview) for recent files without re-running the neural network.
🐳 Docker Ready — Single-command deployment via Docker Compose for production-grade frontend & backend.

🚀 Quick Start

Option 1: pip install

# Core only (lightweight)
pip install uniocr

# With PaddleOCR-VL (powerful document AI, ~1.8 GB model download on first run)
pip install "uniocr[paddle]"

# With Apple Vision (macOS only, uses built-in system OCR)
pip install "uniocr[apple]"

# Everything (Recommended for Dashboard & API)
pip install "uniocr[all]"

Option 2: Docker (recommended for servers)

# Use Docker Compose (pulls and runs all components instantly)
curl -O https://raw.githubusercontent.com/yuanweize/uni-ocr/master/docker-compose.yml
docker compose up -d

# Check it's running
curl http://localhost:8000/health

📖 Usage

Python SDK

from uniocr import UniOCR

ocr = UniOCR(engine="auto")          # Auto-selects best available engine
doc = ocr.extract("invoice.pdf")

print(doc.text)                       # Plain text
print(doc.markdown)                   # Structured Markdown
print(doc.to_dict())                  # JSON-serialisable dict

CLI

# Start the full Web UI Console & API server
uniocr serve --port 8000

# Extract text (outputs Markdown by default)
uniocr extract document.pdf -o result.md

# Generate a Searchable PDF
uniocr extract input_image.jpg -o output_searchable.pdf

REST API & Dashboard Console

Start the server:

uniocr serve --port 8000

Web UI Console: http://localhost:8000/
System Settings & Radar: http://localhost:8000/settings
Interactive API Docs: http://localhost:8000/docs

Endpoints

Method	Endpoint	Description
`GET`	`/health`	Health check & engine list
`POST`	`/extract`	Extract text from uploaded file (JSON/Markdown)
`POST`	`/extract/pdf`	Extract text and return a Searchable PDF file
`POST`	`/extract/url`	Extract text from URL

(If Public API Access is disabled, these endpoints require an Authorization: Bearer <API_KEY> header).

🐳 Docker Build

git clone https://github.com/yuanweize/uni-ocr.git
cd uni-ocr
docker compose up -d --build

🔧 Engine Priority

Priority	Engine	Best for	Speed
1	PaddleOCR-VL + MLX-VLM	Complex layouts, tables, formulas, 109 languages	⚡⚡
2	PaddleOCR-VL (CPU)	Same capabilities, without MLX acceleration	⚡
3	Apple Vision	Simple text, macOS only, instant	⚡⚡⚡

Apple Silicon users: when mlx-vlm is installed, UniOCR automatically starts an MLX-VLM server for Neural Engine acceleration. No configuration needed.

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
.github/workflows		.github/workflows
assets		assets
src/uniocr		src/uniocr
.env.example		.env.example
.gitignore		.gitignore
.python-version		.python-version
CLAUDE.md		CLAUDE.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
README_zh.md		README_zh.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

✨ V3.0 Pro Max Highlights

🚀 Quick Start

Option 1: pip install

Option 2: Docker (recommended for servers)

📖 Usage

Python SDK

CLI

REST API & Dashboard Console

Endpoints

🐳 Docker Build

🔧 Engine Priority

📄 License

About

Uh oh!

Releases 5

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

✨ V3.0 Pro Max Highlights

🚀 Quick Start

Option 1: pip install

Option 2: Docker (recommended for servers)

📖 Usage

Python SDK

CLI

REST API & Dashboard Console

Endpoints

🐳 Docker Build

🔧 Engine Priority

📄 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 5

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages