How to run buddy server in production — systemd, Docker, Kubernetes,
reverse proxy, monitoring, and upgrades.
Scope. This covers the HTTP/WebSocket API server (
buddy server). For the multi-AI fleet specifics (peers,peer.*methods, autonomy daemon), seefleet-guide.md. For the API surface itself, seeinfrastructure.md.
One process, one port: an Express HTTP API plus the Gateway WebSocket
attached to the same port at path /ws.
buddy server --port 3000 --host 0.0.0.0 # production: JWT enforced
buddy server --port 3000 --no-auth # ONLY behind a trusted network| Flag | Default | Notes |
|---|---|---|
--port <port> |
3000 |
Also settable via PORT |
--host <host> |
0.0.0.0 |
Bind to 127.0.0.1 when fronted by a local reverse proxy |
--no-auth |
auth on | Disables JWT. Never expose a --no-auth server to an untrusted network — reserve it for loopback or a private overlay (Tailscale/WireGuard) |
Fleet convention: the chat/API server runs on 3000 and a second
instance acting as the fleet gateway runs on 3001. They are separate
processes of the same binary, not two listeners in one process.
JWT_SECRETis mandatory. WithNODE_ENV=productionand noJWT_SECRET, the server refuses to start (fail-closed). Generate one:openssl rand -hex 32.- CORS is localhost-only by default. Set
CORS_ORIGINSto the exact origins of your clients. Avoid*in production. - Behind a proxy, configure trusted proxies explicitly — the
gateway is origin-hardened (GHSA-5wcw-8jjv-m286); client IPs are only
read from
X-Forwarded-Forwhen the proxy is trusted. - Rate limit defaults to 100 req/min (
RATE_LIMIT_MAX,RATE_LIMIT_WINDOWto tune). - Fleet tool exposure is fail-closed.
peer.tool.invokestays disabled untilCODEBUDDY_PEER_TOOL_WORKSPACE_ROOTis set. - Optional device pairing.
CODEBUDDY_GATEWAY_REQUIRE_PAIRING=truegates unknown WebSocket devices behind an approval queue (buddy gateway-pairing pending|approve|reject|revoke). - Back up before upgrading:
buddy backup create(verify withbuddy backup verify).
| Variable | Purpose |
|---|---|
NODE_ENV |
Set to production to enforce JWT + production behavior |
PORT |
Server port (overridden by --port) |
JWT_SECRET |
Required in production. HMAC secret for API/WS auth |
CORS_ORIGINS |
Comma-separated allowed origins (default: localhost only) |
RATE_LIMIT_MAX / RATE_LIMIT_WINDOW |
Rate limiting (default 100 req / 60 s) |
GROK_API_KEY / OPENAI_API_KEY / ANTHROPIC_API_KEY / GEMINI_API_KEY |
Provider credentials (set only what you use) |
OLLAMA_HOST / VLLM_BASE_URL |
Local/bundled provider endpoints |
CODEBUDDY_PEER_TOOL_WORKSPACE_ROOT |
Enables peer.tool.invoke; unset = disabled (fail-closed) |
CODEBUDDY_PEER_TOOL_ALLOWLIST |
CSV override of the default view_file,list_directory,search |
CODEBUDDY_PEER_SESSION_IDLE_MS / CODEBUDDY_PEER_MAX_DEPTH / CODEBUDDY_PEER_ROLE |
Fleet limits / anti-loop guards |
CODEBUDDY_FLEET_MAX_CONCURRENCY |
Peer capacity → utilization in heartbeats + backpressure |
CODEBUDDY_GATEWAY_REQUIRE_PAIRING |
true to require device pairing on the gateway WS |
SENTRY_DSN / OTEL_EXPORTER_OTLP_ENDPOINT |
Error reporting / OpenTelemetry export |
MAX_COST |
Session cost budget (USD) |
Keep secrets out of unit files — use an EnvironmentFile (systemd) or
secrets (Docker/K8s). A common layout is ~/.codebuddy/fleet.env with
0600 permissions.
Tested in production on the Ministar fleet hub. Two variants below — prefer the built one (faster boot, no toolchain at runtime).
cd /opt/code-buddy && npm ci && npm run build# /etc/systemd/system/codebuddy.service
[Unit]
Description=Code Buddy HTTP/WebSocket API server
Documentation=https://github.com/phuetz/code-buddy
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=codebuddy
WorkingDirectory=/opt/code-buddy
EnvironmentFile=-/etc/codebuddy/server.env
Environment=NODE_ENV=production
ExecStart=/usr/bin/node dist/index.js server --port 3000 --host 127.0.0.1
Restart=on-failure
RestartSec=10
# Daily restart helps catch slow leaks before they accumulate.
RuntimeMaxSec=86400
# Hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=full
[Install]
WantedBy=multi-user.targetsudo systemctl daemon-reload
sudo systemctl enable --now codebuddy.service
journalctl -u codebuddy -fUseful on dev/staging boxes that track main. Note that systemd has no
inherited PATH — point it at your Node install explicitly:
[Service]
WorkingDirectory=/home/youruser/code-buddy
EnvironmentFile=-/home/youruser/.codebuddy/fleet.env
Environment=PATH=/home/youruser/.nvm/versions/node/v24.14.1/bin:/usr/local/bin:/usr/bin:/bin
ExecStart=/home/youruser/.nvm/versions/node/v24.14.1/bin/npx tsx src/index.ts server --port 3001 --host 0.0.0.0The
--no-authflag is acceptable only when the listener is bound to a private overlay network (e.g. a Tailscale tailnet) whose ACL you control. Switch to JWT before adding any out-of-ACL client.
The repo ships a multi-stage Dockerfile (non-root
user codebuddy, healthcheck on /api/health built in, EXPOSE 3000).
# Build
docker build -t codebuddy:latest --target production .
# Run the server
docker run -d --name codebuddy \
-p 3000:3000 \
-e NODE_ENV=production \
-e JWT_SECRET="$(openssl rand -hex 32)" \
-e GROK_API_KEY="$GROK_API_KEY" \
-e CORS_ORIGINS="https://app.example.com" \
-v codebuddy-data:/home/codebuddy/.codebuddy \
codebuddy:latest server --port 3000 --host 0.0.0.0Persist /home/codebuddy/.codebuddy — it holds the SQLite database,
sessions, memory, and peer-session state. Schema migrations run
automatically at startup (see Upgrades below).
docker-compose.ymlat the repo root targets interactive CLI usage, not server mode. For a composed server deployment, add a service that overrides the command withserver --port 3000 --host 0.0.0.0and mounts a named volume on/home/codebuddy/.codebuddy.
Ready-to-adapt manifests live in
deploy/kubernetes/:
| File | Purpose |
|---|---|
deployment.yaml |
Single-replica Deployment, ports 3000/3001, env from ConfigMap + Secret |
service.yaml |
ClusterIP service |
ingress.yaml |
Ingress (adapt host + TLS) |
configmap.yaml |
Non-secret configuration |
secret.yaml |
JWT_SECRET, provider API keys |
rbac.yaml |
ServiceAccount |
kind-setup.sh |
Local kind cluster bootstrap for testing |
kubectl apply -f deploy/kubernetes/
kubectl rollout status deployment/codebuddyNotes:
- Fill real values into
secret.yaml(or replace with ExternalSecrets/SealedSecrets) before applying. - Keep
replicas: 1unless you externalize state: the SQLite database and peer-session files are per-instance. Use a PVC mounted on/home/codebuddy/.codebuddy. - Wire liveness/readiness probes to
GET /api/health.
WebSocket upgrade must be forwarded for /ws:
server {
listen 443 ssl;
server_name buddy.example.com;
location / {
proxy_pass http://127.0.0.1:3000;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
location /ws {
proxy_pass http://127.0.0.1:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_read_timeout 3600s; # long-lived gateway connections
}
}Remember to declare the proxy as trusted (see checklist §3) so client IPs and origins are evaluated correctly.
| Endpoint | Purpose |
|---|---|
GET /api/health |
Liveness + apiHeartbeat (30 s provider probe loop) |
GET /api/metrics |
Prometheus metrics |
GET /api/heartbeat/status |
Heartbeat detail (?format=report for Cowork-ready JSON) |
GET /api/daemon/status |
Autonomy daemon status (?format=report) |
Observability backends: set SENTRY_DSN (errors) and/or
OTEL_EXPORTER_OTLP_ENDPOINT (traces). buddy doctor diagnoses a
misbehaving install (--fix applies auto-migrations).
buddy backup create(andbuddy backup verify).- Deploy the new version (rebuild image /
git pull && npm ci && npm run build/buddy update). - Restart the service. SQLite schema migrations run automatically at
startup — the manager walks
schema_versionup to the currentSCHEMA_VERSION(covered end-to-end bytests/database/migration-e2e.test.ts). - Legacy JSON installs (pre-SQLite
~/.codebuddy/*.json) are imported by the JSON→SQLite migration;buddy doctor --fixtriggers it if needed. - If anything goes wrong: stop the service,
buddy backup restore, redeploy the previous version.