GitHub - chernistry/bernstein: Deterministic orchestrator for 18 CLI AI coding agents. Git worktree isolation, HMAC audit trail, MCP server mode.

"To achieve great things, two things are needed: a plan and not quite enough time." — Leonard Bernstein

Orchestrate any AI coding agent. Any model. One command.

Bernstein in action: parallel AI agents orchestrated in real time

Documentation · Getting Started · Glossary · Limitations

Bernstein takes a goal, breaks it into tasks, assigns them to AI coding agents running in parallel, verifies the output, and merges the results. When agents succeed, the janitor merges verified work into main. Failed tasks retry or route to a different model.

Why deterministic coordination

LLMs write code well. They schedule work across other LLMs badly. Most agent orchestrators use an LLM as the coordinator and hit the same failure modes: non-reproducible plans, silent coordination drift, and token burn on meta-decisions that a 200-line event loop handles reliably. Bernstein inverts that. One LLM call upfront decomposes the goal; after that, scheduling, worktree isolation, quality gates, and HMAC-chained audit replay are all deterministic Python. Every run is bit-identically replayable.
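To illustrate what "HMAC-chained" means in general, here is a minimal sketch of a tamper-evident log in which each entry's MAC also covers the previous entry's MAC. It is not Bernstein's implementation: the entry layout, the `GENESIS` placeholder, and the function names `append_entry` and `verify_chain` are assumptions made for this example.

```python
import hashlib
import hmac
import json

# Placeholder "previous MAC" for the first entry (an assumption of this sketch).
GENESIS = "0" * 64


def _mac(key: bytes, prev_mac: str, event: dict) -> str:
    """HMAC-SHA256 over the previous MAC plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev_mac + payload).encode(), hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append an event; its MAC depends on every entry that came before it."""
    prev_mac = log[-1]["mac"] if log else GENESIS
    entry = {"prev": prev_mac, "event": event, "mac": _mac(key, prev_mac, event)}
    log.append(entry)
    return entry


def verify_chain(log: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edited or reordered entry breaks the chain."""
    prev_mac = GENESIS
    for entry in log:
        expected = _mac(key, prev_mac, entry["event"])
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True
```

Because each MAC folds in its predecessor, changing one recorded event invalidates that entry and every later one, which is what makes replaying such a log trustworthy.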

No framework to learn. No vendor lock-in. Agents are interchangeable workers. Swap any agent, any model, any provider.

pipx install bernstein
cd your-project && bernstein init
bernstein -g "Add JWT auth with refresh tokens, tests, and API docs"

$ bernstein -g "Add JWT auth"
[manager] decomposed into 4 tasks
[agent-1] claude-sonnet: src/auth/middleware.py  (done, 2m 14s)
[agent-2] codex:         tests/test_auth.py      (done, 1m 58s)
[verify]  all gates pass. merging to main.

Also available via pip, uv tool install, brew, dnf copr, and npx bernstein-orchestrator. See install options.

Supported agents

Bernstein auto-discovers installed CLI agents. Mix them in the same run: cheap local models for boilerplate, heavier cloud models for architecture.

18 CLI agent adapters: 17 third-party wrappers plus a generic wrapper for anything with --prompt.

Agent	Models	Install
Claude Code	Opus 4, Sonnet 4.6, Haiku 4.5	`npm install -g @anthropic-ai/claude-code`
Codex CLI	GPT-5, GPT-5 mini	`npm install -g @openai/codex`
OpenAI Agents SDK v2	GPT-5, GPT-5 mini, o4	`pip install 'bernstein[openai]'`
Gemini CLI	Gemini 2.5 Pro, Gemini Flash	`npm install -g @google/gemini-cli`
Cursor	Sonnet 4.6, Opus 4, GPT-5	Cursor app
Aider	Any OpenAI/Anthropic-compatible	`pip install aider-chat`
Amp	Amp-managed	`npm install -g @sourcegraph/amp`
Cody	Sourcegraph-hosted	`npm install -g @sourcegraph/cody`
Continue	Any OpenAI/Anthropic-compatible	`npm install -g @continuedev/cli` (binary: `cn`)
Goose	Any provider Goose supports	See Goose docs
IaC (Terraform/Pulumi)	Any provider the base agent uses	Built-in
Kilo	Kilo-hosted	See Kilo docs
Kiro	Kiro-hosted	See Kiro docs
Ollama + Aider	Local models (offline)	`brew install ollama`
OpenCode	Any provider OpenCode supports	See OpenCode docs
Qwen	Qwen Code models	`npm install -g @qwen-code/qwen-code`
Cloudflare Agents	Workers AI models	`bernstein cloud login`
Generic	Any CLI with `--prompt`	Built-in

Any adapter can also serve as Bernstein's internal LLM, so the entire stack runs without being tied to a specific provider:

internal_llm_provider: gemini            # or qwen, ollama, codex, goose, ...
internal_llm_model: gemini-2.5-pro

Tip: run `bernstein --headless` for CI pipelines. No TUI, structured JSON output, non-zero exit on failure.

Quick start

cd your-project
bernstein init                    # creates .sdd/ workspace + bernstein.yaml
bernstein -g "Add rate limiting"  # agents spawn, work in parallel, verify, exit
bernstein live                    # watch progress in the TUI dashboard
bernstein stop                    # graceful shutdown with drain

For multi-stage projects, define a YAML plan:

bernstein run plan.yaml           # skips LLM planning, goes straight to execution
bernstein run --dry-run plan.yaml # preview tasks and estimated cost

How it works

1. Decompose. The manager breaks your goal into tasks with roles, owned files, and completion signals.
2. Spawn. Agents start in isolated git worktrees, one per task. Main branch stays clean.
3. Verify. The janitor checks concrete signals: tests pass, files exist, lint clean, types correct.
4. Merge. Verified work lands in main. Failed tasks get retried or routed to a different model.

The orchestrator is a Python scheduler, not an LLM. Scheduling decisions are deterministic, auditable, and reproducible.
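The spawn and verify steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Bernstein's code: the `Task` fields mirror the description above, while the `.worktrees/` directory, the `task/<id>` branch naming, and the function names are hypothetical. Only the `git worktree add -b <branch> <path>` command itself is standard git.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Task:
    task_id: str
    role: str
    owned_files: tuple


def worktree_command(repo: Path, task: Task) -> list:
    """Build the git call that gives one task its own branch and checkout."""
    branch = f"task/{task.task_id}"               # hypothetical naming scheme
    path = repo / ".worktrees" / task.task_id     # hypothetical location
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def plan_spawns(repo: Path, tasks: list) -> list:
    """Sort by task id so the spawn order does not depend on input order."""
    return [worktree_command(repo, t) for t in sorted(tasks, key=lambda t: t.task_id)]


def ready_to_merge(signals: dict) -> bool:
    """A task merges only if it reported signals and every one of them passed."""
    return bool(signals) and all(signals.values())
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. The point of the sketch is that nothing here consults a model: the same task list always yields the same commands and the same merge decision.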

Cloud execution (Cloudflare)

Bernstein can run agents on Cloudflare Workers instead of locally. The bernstein cloud CLI handles deployment and lifecycle.

Workers. Agent execution on Cloudflare's edge, with Durable Workflows for multi-step tasks and automatic retry.
V8 sandbox isolation. Each agent runs in its own isolate, no container overhead.
R2 workspace sync. Local worktree state syncs to R2 object storage so cloud agents see the same files.
Workers AI (experimental). Use Cloudflare-hosted models as the LLM provider, no external API keys required.
D1 analytics. Task metrics and cost data stored in D1 for querying.
Vectorize. Semantic cache backed by Cloudflare's vector database.
Browser rendering. Headless Chrome on Workers for agents that need to inspect web output.
MCP remote transport. Expose or consume MCP servers over Cloudflare's network.

bernstein cloud login      # authenticate with Bernstein Cloud
bernstein cloud deploy     # push agent workers
bernstein cloud run plan.yaml  # execute a plan on Cloudflare

A bernstein cloud init scaffold for wrangler.toml and bindings is planned.

Capabilities

Core orchestration. Parallel execution, git worktree isolation, janitor verification, quality gates (lint, types, PII scan), cross-model code review, circuit breaker for misbehaving agents, token growth monitoring with auto-intervention.

Intelligence. Contextual bandit router for model/effort selection. Knowledge graph for codebase impact analysis. Semantic caching saves tokens on repeated patterns. Cost anomaly detection (burn-rate alerts). Behavior anomaly detection with Z-score flagging.
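As a rough illustration of Z-score flagging in general (not Bernstein's detector), the sketch below marks samples that sit more than a chosen number of standard deviations from the mean. The function name `zscore_flags`, the default threshold of 3.0, and the use of the population standard deviation are assumptions for this example.

```python
from statistics import mean, pstdev


def zscore_flags(samples: list, threshold: float = 3.0) -> list:
    """Return one boolean per sample: True where |x - mean| / stdev > threshold."""
    if len(samples) < 2:
        return [False] * len(samples)
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        # All samples identical: nothing can be an outlier.
        return [False] * len(samples)
    return [abs(x - mu) / sigma > threshold for x in samples]
```

For example, feeding it per-task token counts would flag a task whose usage is far outside the run's typical range. With very few samples no point can reach a high Z-score, so a detector like this needs a reasonable history before it is useful.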

Sandboxing. Pluggable SandboxBackend protocol — run agents in local git worktrees (default), Docker containers, E2B Firecracker microVMs, or Modal serverless containers (with optional GPU). Plugin authors can register custom backends through the bernstein.sandbox_backends entry-point group. Inspect installed backends with bernstein agents sandbox-backends.

Artifact storage. .sdd/ state can stream to pluggable ArtifactSink backends: local filesystem (default), S3, Google Cloud Storage, Azure Blob, or Cloudflare R2. BufferedSink keeps the WAL crash-safety contract by writing locally with fsync first and mirroring to the remote asynchronously.
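The local-first pattern described for BufferedSink can be sketched as follows: write and fsync on local disk before returning, then hand the bytes to a background thread for mirroring. This is a simplified stand-in, not Bernstein's BufferedSink. The class name `LocalFirstSink`, the `upload` callable, and the decision to swallow upload errors are all assumptions of the example.

```python
import os
import queue
import threading
from pathlib import Path
from typing import Callable


class LocalFirstSink:
    """Hypothetical sink: durable local write first, asynchronous remote mirror second."""

    def __init__(self, root: Path, upload: Callable[[str, bytes], None]) -> None:
        self._root = root
        self._upload = upload
        self._pending: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, name: str, data: bytes) -> Path:
        path = self._root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # data is on local disk before we return
        self._pending.put((name, data))  # mirroring happens off the write path
        return path

    def _drain(self) -> None:
        while True:
            item = self._pending.get()
            try:
                if item is None:  # sentinel pushed by close()
                    return
                self._upload(*item)
            except Exception:
                pass  # a real sink would retry or log; the local copy is already safe
            finally:
                self._pending.task_done()

    def close(self) -> None:
        """Ask the worker to stop and wait until queued uploads were attempted."""
        self._pending.put(None)
        self._pending.join()
```

The design choice worth noting is the ordering: a crash after `write` returns can lose at most the remote mirror, never the local record, which is why remote latency does not weaken the crash-safety guarantee.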

Skill packs. Progressive-disclosure skills (OpenAI Agents SDK pattern): only a compact skill index ships in every spawn's system prompt; agents pull full bodies on demand via the load_skill MCP tool. 17 built-in role packs plus third-party bernstein.skill_sources entry-points.
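The progressive-disclosure idea can be shown with a toy registry: the prompt carries one summary line per skill, and the full body is fetched only when requested. The data layout and the `skill_index` helper are assumptions for illustration. `load_skill` here is a plain function standing in for the MCP tool of the same name, whose real signature is not documented in this README.

```python
# Hypothetical in-memory registry; real skill packs live elsewhere.
SKILLS = {
    "backend": {
        "summary": "Server-side code, APIs, and data access.",
        "body": "Full backend guidance would go here...",
    },
    "qa": {
        "summary": "Tests, fixtures, and coverage checks.",
        "body": "Full QA guidance would go here...",
    },
}


def skill_index(skills: dict) -> str:
    """Compact index: one line per skill, cheap enough to ship in every prompt."""
    return "\n".join(f"- {name}: {s['summary']}" for name, s in sorted(skills.items()))


def load_skill(skills: dict, name: str) -> str:
    """Return the full body for one skill, fetched only when an agent asks for it."""
    return skills[name]["body"]
```

The saving comes from the size difference: every spawn pays for the short index, while the longer bodies cost tokens only for the agents that actually load them.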

Controls. HMAC-chained audit logs, policy engine, PII output gating, WAL-backed crash recovery (experimental multi-worker safety), OAuth 2.0 PKCE. SSO/SAML/OIDC support is in progress.

Observability. Prometheus /metrics, OTel exporter presets, Grafana dashboards. Per-model cost tracking (bernstein cost). Terminal TUI and web dashboard. Agent process visibility in ps.

Ecosystem. MCP server mode, A2A protocol support, GitHub App integration, pluggy-based plugin system, multi-repo workspaces, cluster mode for distributed execution, self-evolution via --evolve (experimental).

Full feature matrix: FEATURE_MATRIX.md · Recent features: What's New

How it compares

Feature	Bernstein	CrewAI	AutoGen ¹	LangGraph
Orchestrator	Deterministic code	LLM-driven (+ code Flows)	LLM-driven	Graph + LLM
Works with	Any CLI agent (18 adapters)	Python SDK classes	Python agents	LangChain nodes
Git isolation	Worktrees per agent	No	No	No
Pluggable sandboxes	Worktree, Docker, E2B, Modal	No	No	No
Verification	Janitor + quality gates	Guardrails + Pydantic output	Termination conditions	Conditional edges
Cost tracking	Built-in	`usage_metrics`	`RequestUsage`	Via LangSmith
State model	File-based (.sdd/)	In-memory + SQLite checkpoint	In-memory	Checkpointer
Remote artifact sinks	S3, GCS, Azure Blob, R2	No	No	No
Self-evolution	Built-in (experimental)	No	No	No
Declarative plans (YAML)	Yes	Yes (`agents.yaml`, `tasks.yaml`)	No	Partial (`langgraph.json`)
Model routing per task	Yes	Per-agent LLM	Per-agent `model_client`	Per-node (manual)
MCP support	Yes (client + server)	Yes	Yes (client + workbench)	Yes (client + server)
Agent-to-agent chat	Bulletin board	Yes (Crew process)	Yes (group chat)	Yes (supervisor, swarm)
Web UI	TUI + web dashboard	CrewAI AMP	AutoGen Studio	LangGraph Studio + LangSmith
Cloud hosted option	Yes (Cloudflare)	Yes (CrewAI AMP)	No	Yes (LangGraph Cloud)
Built-in RAG/retrieval	Yes (codebase FTS5 + BM25)	`crewai_tools`	`autogen_ext` retrievers	Via LangChain

Last verified: 2026-04-19. See full comparison pages for detailed feature matrices.

The table above compares Bernstein against LLM-orchestration frameworks (they orchestrate LLM calls). The table below covers the closer category — other tools that orchestrate CLI coding agents:

Feature	Bernstein	ComposioHQ/agent-orchestrator	emdash
Shape	Python CLI + library + MCP server	TypeScript CLI + local dashboard	Electron desktop app
Primary language	Python	TypeScript	TypeScript
Install	`pipx install bernstein`	`npm install -g @aoagents/ao`	`.dmg` / `.msi` / `.AppImage`
Agent adapters	18	3 (Claude Code, Codex, Aider)	23
Git worktree per agent	Yes	Yes	Yes
MCP server mode (exposes self as MCP)	Yes (stdio + HTTP/SSE)	No	No
Coordinator	Deterministic Python scheduler	LLM-driven	Not documented
HMAC-chained audit replay	Yes	No	No
Autonomous CI-fix / PR flow	No	Yes	No
Visual dashboard	TUI + web	Web	Desktop app
Backing	Solo OSS	Funded (Composio.dev)	YC W26
License	Apache 2.0	MIT	Apache 2.0

Bernstein's wedge in this category: Python-native, MCP-server-first, broad adapter coverage. If your stack is TypeScript and you want a product with a dashboard, Composio's @aoagents/ao is a better fit; if you want a polished desktop ADE, emdash is. If you want a primitive that imports into Python, exposes itself over MCP to any client, and covers a broad range of agents (including Qwen, Goose, Ollama, OpenAI Agents SDK, Cloudflare Agents, and more), Bernstein is the fit.

Monitoring

bernstein live       # TUI dashboard
bernstein dashboard  # web dashboard
bernstein status     # task summary
bernstein ps         # running agents
bernstein cost       # spend by model/task
bernstein doctor     # pre-flight checks
bernstein recap      # post-run summary
bernstein trace <ID> # agent decision trace
bernstein run-changelog --hours 48  # changelog from agent-produced diffs
bernstein explain <cmd>  # detailed help with examples
bernstein dry-run    # preview tasks without executing
bernstein dep-impact # API breakage + downstream caller impact
bernstein aliases    # show command shortcuts
bernstein config-path    # show config file locations
bernstein init-wizard    # interactive project setup
bernstein debug-bundle   # collect logs, config, and state for bug reports
bernstein skills list    # discoverable skill packs (progressive disclosure)
bernstein skills show <name>  # print a skill body with its references

bernstein fingerprint build --corpus-dir ~/oss-corpus  # build local similarity index
bernstein fingerprint check src/foo.py                 # check generated code against the index

Install

Method	Command
pip	`pip install bernstein`
pipx	`pipx install bernstein`
uv	`uv tool install bernstein`
Homebrew	`brew tap chernistry/bernstein && brew install bernstein`
Fedora / RHEL	`sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein`
npm (wrapper)	`npx bernstein-orchestrator`

Optional extras

Provider SDKs are optional so the base install stays lean. Pick what you need:

Extra	Enables
`bernstein[openai]`	OpenAI Agents SDK v2 adapter (`openai_agents`)
`bernstein[docker]`	Docker sandbox backend
`bernstein[e2b]`	E2B microVM sandbox backend (needs `E2B_API_KEY`)
`bernstein[modal]`	Modal sandbox backend, optional GPU (needs `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET`)
`bernstein[s3]`	S3 artifact sink (via `boto3`)
`bernstein[gcs]`	Google Cloud Storage artifact sink
`bernstein[azure]`	Azure Blob artifact sink
`bernstein[r2]`	Cloudflare R2 artifact sink (S3-compatible `boto3`)
`bernstein[grpc]`	gRPC bridge
`bernstein[k8s]`	Kubernetes integrations

Combine extras by listing them comma-separated inside the brackets, e.g. pip install 'bernstein[openai,docker,s3]'.

Editor extensions: VS Marketplace · Open VSX

Contributing

PRs welcome. See CONTRIBUTING.md for setup and code style.

Support

If Bernstein saves you time: GitHub Sponsors

Contact: forte@bernstein.run

License

Apache License 2.0

¹ AutoGen is in maintenance mode; its successor is Microsoft Agent Framework 1.0.
