GitHub - AlienWalker1995/Ordo-AI-Stack: All in one AI stack that detects hardware, pulls models, and manages all services + MCP. Entirely local -- Entirely Free.

  ___          _       
 / _ \ _ __ __| | ___  
| | | | '__/ _` |/ _ \ 
| |_| | | | (_| | (_) |
 \___/|_|  \__,_|\___/

──────────────────────────────────────────────────
Docker Compose stack for local LLMs, chat UI, image/video (ComfyUI), and automation (n8n) — with a unified dashboard.

Overview

Ordo AI Stack packages a local-first, operator-deployed stack: llama.cpp-backed models behind an OpenAI-compatible LiteLLM model gateway, Open WebUI for chat, ComfyUI for diffusion workflows, n8n for workflows, and an MCP gateway for shared tools. A dashboard provides a single place to inspect dependencies, pull models, and control the stack.

Deployment model: Single homelab operator. All user-facing UIs sit behind a Caddy + oauth2-proxy + Tailscale + Google SSO front door — no UI service publishes a host port directly. The operator brings their own Tailscale tailnet and Google OAuth client; the stack stitches them together so every UI is reachable at https://${CADDY_TAILNET_HOSTNAME}/<service>/ after a single Google sign-in, with an email allowlist gating access. See docs/runbooks/auth.md for the one-time setup.

Who it is for: A homelab operator running the stack on their own hardware, exposed over their tailnet to a small allowlist of personal Google accounts. Local AI models, strong operator-deployment principles.

Docs: Getting started · Auth front door · Secrets · Configuration · Data · Hermes Agent · PRD index

Features

All UI ports below are internal (container-network). Operators reach them via the Caddy front door under https://${CADDY_TAILNET_HOSTNAME}/<path>/; the only host-published ports are Caddy :443 (tailnet-bound), and 127.0.0.1-bound publishes of model-gateway:11435, mcp-gateway:8811, and qdrant:6333 for host-side tools (Cursor, Cline, scripts).

Unified dashboard (internal 8080, front-door /dash/) — model lists, service links, dependency health, model pulls.
Model gateway (host 127.0.0.1:11435, also internal) — LiteLLM OpenAI-compatible API in front of llama.cpp backends.
Open WebUI (internal 8080, front-door /) — chat UI at the root of the tailnet hostname.
ComfyUI (internal 8188, front-door /comfy/) — workflows; large optional model downloads on demand.
n8n (internal 5678, front-door /n8n/) — automation.
MCP gateway (host 127.0.0.1:8811, also internal) — shared MCP tools for host clients and in-stack services.
Ops controller (internal 9000; no host port) — compose lifecycle from the dashboard with OPS_CONTROLLER_TOKEN.
Hermes dashboard (internal 9119, front-door /hermes/) — assistant-agent UI.
GPU profiles — scripts/detect_hardware.py generates overrides/compute.yml (gitignored) for NVIDIA / AMD / Intel / CPU paths.

Quickstart

Prerequisites:

Docker with Compose, and enough disk for models.
Tailscale installed on the host machine, with a Tailscale-issued TLS cert for the chosen tailnet hostname (tailscale cert ordo.<tailnet>.ts.net).
A Google Cloud OAuth 2.0 Web client for the SSO front door (Client ID + secret).
SOPS + age for secrets at rest.
For tests / lint, Python 3.12+ (see pyproject.toml).

Clone this repository and open a shell at the repo root.
Environment: If .env is missing, init scripts can create it from .env.example. Otherwise copy manually:
```
cp .env.example .env
```
Set at least BASE_PATH, CADDY_BIND (your tailnet IPv4 from tailscale ip -4), and CADDY_TAILNET_HOSTNAME (e.g. ordo.<tailnet>.ts.net). See comments in .env.example.
Auth front door (one-time): Follow docs/runbooks/auth.md to configure the Tailscale cert, Google OAuth client, cookie secret, and email allowlist.
Secrets (one-time): Follow docs/runbooks/secrets.md — generate an age keypair, register your public key in secrets/.sops.yaml, and run make decrypt-secrets to materialize runtime tokens at ~/.ai-toolkit/runtime/secrets/.
Full bring-up — the compose wrapper runs hardware detection, then builds and starts the stack:

Windows (PowerShell):
```
.\compose.ps1 up -d --build --force-recreate
```
Linux / macOS:
```
./compose up -d --build --force-recreate
```
From any device on your tailnet, browse to https://${CADDY_TAILNET_HOSTNAME}/ — Google sign-in gates the front door, then Open WebUI loads at /, the dashboard at /dash/, n8n at /n8n/, ComfyUI at /comfy/, and the Hermes UI at /hermes/.

Lighter bring-up (no forced rebuild/recreate; still runs hardware detection):

.\compose.ps1 up -d

./compose up -d

CPU-only / minimal services: bring up a subset after init, e.g. ./compose up -d ollama dashboard open-webui.

Installation

Runtime: Everything runs in containers; install Docker and use the repo from a fixed path (set BASE_PATH accordingly).
Development: Python 3.12+. Install test dependencies:
```
pip install -r tests/requirements.txt
```
On Linux/macOS you can use make test, make lint, and make smoke-test (see Makefile).

Configuration

Primary reference: .env.example (copy to .env).

Area	Variables (examples)
Paths	`BASE_PATH`, `DATA_PATH`
Models	`MODELS`, `DEFAULT_MODEL`
Security / APIs	`DASHBOARD_AUTH_TOKEN`, `OPS_CONTROLLER_TOKEN`, `WEBUI_AUTH`, `HF_TOKEN`, `GITHUB_PERSONAL_ACCESS_TOKEN`
MCP	`MCP_GATEWAY_SERVERS`
Compute	`COMPUTE_MODE`, `COMPOSE_FILE` (see comments for `overrides/*.yml`)
RAG profile	`EMBED_MODEL`, `QDRANT_PORT`, `RAG_COLLECTION`, …

Auto-generated: overrides/compute.yml (from hardware detection). Do not commit secrets; .env is gitignored.

Usage

Daily restart / full rebuild: same as Quickstart step 3.

On-demand one-off containers:

./compose run --rm model-puller
./compose run --rm comfyui-model-puller

RAG: docker compose --profile rag up -d and ingest paths per Getting started — RAG.
MCP clients: connect to http://localhost:8811/mcp (see mcp/README.md).

Dashboard

Reach the dashboard at https://${CADDY_TAILNET_HOSTNAME}/dash/ (Google SSO front door; allowlist via auth/oauth2-proxy/emails.txt). It lists models (Ollama and ComfyUI), links to other services, dependency health, and searchable model pulls. OPS_CONTROLLER_TOKEN lets it restart services and run POST /api/comfyui/install-node-requirements. DASHBOARD_AUTH_TOKEN is an optional bearer layer for non-browser API access; the browser path is gated by SSO at the proxy level.

After code changes affecting the dashboard image: .\compose.ps1 build dashboard then .\compose.ps1 up -d (or ./compose equivalents).

Ollama models

Pull lists and defaults come from .env (MODELS, DEFAULT_MODEL). Pull via the dashboard or:

./compose run --rm model-puller

ComfyUI (LTX-2)

Large optional downloads on demand; first run can take a long time. Pull via the dashboard or ./compose run --rm comfyui-model-puller.

Security

Front door: Caddy + oauth2-proxy + Google SSO gates all browser-reachable UIs at the network edge. Email allowlist in auth/oauth2-proxy/emails.txt (replace YOUR_ALLOWLIST_EMAIL locally — never commit your real email). See docs/runbooks/auth.md.
Open WebUI: runs with native auth disabled by default because Google SSO already gates it at the proxy; flip WEBUI_AUTH=True if you want a second auth layer for multi-user workspaces.
Dashboard: DASHBOARD_AUTH_TOKEN provides a bearer-token fallback for non-browser API access (e.g. host scripts). Browser traffic is SSO-gated.
Ops controller: requires OPS_CONTROLLER_TOKEN for dashboard-driven lifecycle and installs; no host port at all.
Secrets at rest: SOPS + age, with high-value tokens mounted as Docker secrets at /run/secrets/<name>. See docs/runbooks/secrets.md.
Never commit .env or any plaintext secret. Full notes: SECURITY.md.

GPU / compute

Hardware detection writes overrides/compute.yml. The compose wrapper runs detection before commands. No GPU: use a minimal service set (./compose up -d ollama dashboard open-webui); ComfyUI will be slower.

Architecture

Tailnet device → Caddy :443 (TLS) → oauth2-proxy (Google SSO + email allowlist)
                                          │
                                          ├── /          → Open WebUI
                                          ├── /dash/     → Dashboard
                                          ├── /n8n/      → n8n
                                          ├── /comfy/    → ComfyUI
                                          └── /hermes/   → Hermes dashboard
                                                  │
                                                  ├── Model Gateway → LiteLLM → llama.cpp / Ollama / (vLLM)
                                                  ├── MCP Gateway → shared tools (SearXNG, n8n, ComfyUI, …)
                                                  └── Ops Controller → Docker Compose lifecycle (token-auth, no host port)

Local-first AI; operator-deployed front door. Dashboard does not mount docker.sock. Details: PRD index.

Data

Bind mounts only. Set BASE_PATH (and optionally DATA_PATH). Ollama blobs under models/ollama. See docs/data.md.

MCP (Model Context Protocol)

MCP Gateway — configure servers with MCP_GATEWAY_SERVERS in .env. Endpoint: http://localhost:8811/mcp. See mcp/README.md.

Hermes Agent

Hermes Agent runs as two compose services (hermes-gateway + hermes-dashboard) with persistent state under data/hermes/. Setup and upgrade notes: docs/hermes-agent.md.

Development

Python layout: dashboard/, model-gateway/, ops-controller/, rag-ingestion/, scripts/; Ruff config in pyproject.toml.
Do not commit: .env, data/, models/, overrides/compute.yml, mcp/.env — see CONTRIBUTING.md.

Testing

pip install -r tests/requirements.txt
python -m pytest tests/ -v
python -m ruff check dashboard tests model-gateway ops-controller rag-ingestion scripts comfyui-mcp orchestration-mcp worker

Health / diagnostics:

.\scripts\doctor.ps1

./scripts/doctor.sh

Optional: DOCTOR_DEPS_TIMEOUT_SEC; DASHBOARD_AUTH_TOKEN from .env when probing the dashboard.

Smoke (Docker required):

.\scripts\smoke_test.ps1

./scripts/smoke_test.sh
# or: make smoke-test

CI (.github/workflows/ci.yml): TruffleHog secret scan; pytest + ruff; docker compose config; optional compose smoke via workflow dispatch.

Troubleshooting

Services won’t start or images are stale — Rebuild affected images and recreate, e.g. docker compose build dashboard model-gateway (or the compose wrapper), then up -d. Doctor WARN on missing /api/dependencies or /ready often indicates an old image.
Doctor warns on Ollama (11434) or MCP (8811) — Expected if those ports are not published; use overrides/ollama-expose.yml / overrides/mcp-expose.yml or set DOCTOR_STRICT=1 only when you intend strict probes (see doctor script comments in repo).
No GPU — Use a minimal service set or CPU-oriented overrides; ComfyUI will be slower.
Exposing to a network — Enable Open WebUI auth (WEBUI_AUTH=True), set DASHBOARD_AUTH_TOKEN, and harden n8n — see SECURITY.md.

Roadmap

Rolling changes: CHANGELOG.md.

Contributing

See CONTRIBUTING.md. Report security issues per SECURITY.md (do not use public issues for vulnerabilities).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Features

Quickstart

Installation

Configuration

Usage

Dashboard

Ollama models

ComfyUI (LTX-2)

Security

GPU / compute

Architecture

Data

MCP (Model Context Protocol)

Hermes Agent

Development

Testing

Troubleshooting

Roadmap

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 408 Commits
.cline		.cline
.github/workflows		.github/workflows
.vscode		.vscode
auth		auth
codebase-memory-mcp		codebase-memory-mcp
codebase-memory-ui		codebase-memory-ui
comfyui-mcp		comfyui-mcp
config		config
dashboard		dashboard
docs		docs
hermes		hermes
mcp		mcp
model-gateway		model-gateway
ops-controller		ops-controller
orchestration-mcp		orchestration-mcp
overrides		overrides
qdrant-rag-mcp		qdrant-rag-mcp
rag-ingestion		rag-ingestion
scripts		scripts
secrets		secrets
tests		tests
worker		worker
.cbmignore		.cbmignore
.dockerignore		.dockerignore
.env.example		.env.example
.gitattributes		.gitattributes
.gitignore		.gitignore
.mcp.json		.mcp.json
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
SECURITY.md		SECURITY.md
compose		compose
compose.ps1		compose.ps1
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

Overview

Features

Quickstart

Installation

Configuration

Usage

Dashboard

Ollama models

ComfyUI (LTX-2)

Security

GPU / compute

Architecture

Data

MCP (Model Context Protocol)

Hermes Agent

Development

Testing

Troubleshooting

Roadmap

Contributing

License

About

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages