feat(mcp): add codebase-memory MCP — code knowledge graph for Hermes#57

Merged: AlienWalker1995 merged 8 commits into main from feat/codebase-memory-mcp on Jun 23, 2026

@AlienWalker1995

Owner

What

Adds codebase-memory, a gateway-spawned stdio MCP server that gives Hermes a structural code knowledge graph of the repos under your code root — call graphs, trace paths, architecture views, symbol/snippet lookup — so it navigates code instead of grepping blindly. Wraps the upstream DeusData/codebase-memory-mcp release binary (MIT; bundled offline nomic-embed-code embeddings; no API keys).

Architecture note (design corrected during build)

The original design assumed a long-lived HTTP MCP service. Verification killed that: the binary is stdio-only (no HTTP/SSE MCP transport) and ships no Docker image. But stdio is exactly how docker/mcp-gateway already runs servers — spawned as sibling containers over stdio — so this mirrors qdrant-rag-mcp (build-only image + registry-custom.yaml catalog entry), which is simpler and consistent with every other MCP in the stack.

Changes

  • codebase-memory-mcp/Dockerfile: downloads and sha256-verifies the pinned portable/static binary (v0.8.1); CMD is the MCP stdio server. Also adds README.md.
  • docker-compose.yml: codebase-memory-mcp-image build service behind the codebase-memory profile (opt-in; ~266 MB binary); mcp-gateway gains CODE_ROOT + MCP_GATEWAY_DOCKER_BIND_ALLOWED_PATHS.
  • mcp/gateway/registry-custom.yaml: codebase-memory catalog entry with longLived: true, disableNetwork: true (100% local → anti-exfil), a /c/dev:ro host bind, and a codebase-memory-cache named volume for the persistent SQLite index.
  • mcp/gateway/gateway-wrapper.sh: substitutes PLACEHOLDER_CODE_ROOT from $CODE_ROOT.
  • .cbmignore (repo root): defense-in-depth secret/non-source excludes.
  • .env.example, CHANGELOG.md.
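
The checksum step in the Dockerfile can be pictured with a small sketch. This is not the Dockerfile's actual logic: it assumes the upstream checksums.txt uses the common sha256sum layout (one `<hex digest>  <filename>` pair per line), and the names `parse_checksums` and `verify_artifact` are illustrative only.

```python
import hashlib


def parse_checksums(text: str) -> dict:
    """Map filename -> hex digest from sha256sum-style lines (assumed layout)."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            entries[name.lstrip("*")] = digest.lower()
    return entries


def verify_artifact(data: bytes, name: str, checksums_text: str) -> bool:
    """Return True only if `name` is listed and its digest matches `data`."""
    expected = parse_checksums(checksums_text).get(name)
    if expected is None:
        return False  # fail closed: an unlisted artifact is never trusted
    return hashlib.sha256(data).hexdigest() == expected
```

Failing closed on a missing entry mirrors the intent of a pinned download: a build should stop rather than run an unverified binary.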

Why these gateway specifics

Verified against the docker/mcp-gateway source: the 2026-06 bind-mount hardening requires host-path binds to be read-only + allow-listed (hence /c/dev:ro + MCP_GATEWAY_DOCKER_BIND_ALLOWED_PATHS=CODE_ROOT), while named volumes are exempt (hence the cache uses one). This configuration works whether or not the running gateway has the hardening.
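
To make the rule concrete, here is a sketch of the policy as described above. It is not the gateway's code: the `source:target[:mode]` spec format, the function name `bind_permitted`, and the `/cache` target path are assumptions for illustration.

```python
from pathlib import PurePosixPath


def bind_permitted(spec: str, allowed_roots: list) -> bool:
    """Apply the described rule to one 'source:target[:mode]' mount spec."""
    parts = spec.split(":")
    if len(parts) < 2:
        return False
    source = parts[0]
    mode = parts[2] if len(parts) > 2 else "rw"
    if not source.startswith("/"):
        return True  # named volume: exempt from the host-path checks
    if "ro" not in mode.split(","):
        return False  # host-path binds must be read-only
    src = PurePosixPath(source)
    # Compare path components, so /c/device does not match the root /c/dev.
    return any(
        src == PurePosixPath(root) or PurePosixPath(root) in src.parents
        for root in allowed_roots
    )
```

Under these assumptions, `bind_permitted("/c/dev:/c/dev:ro", ["/c/dev"])` is accepted, the same bind without `:ro` is rejected, and `codebase-memory-cache:/cache` passes because it names a volume rather than a host path.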

Validation (evidence)

  • Build: image builds; checksums.txt SHA verified in-layer (/tmp/cbm.tar.gz: OK).
  • MCP stdio: initialize + tools/list return clean JSON-RPC; stdout has zero banner pollution (the n8n-style gateway-crash risk); runs under --network none (offline confirmed).
  • Secret exclusion (live): indexed a canary repo with secrets in indexable .py files — .cbmignore (secrets/) and .gitignore (ignored_secret.py) both yield 0 matches for the secret content, while a normal symbol is retrievable (positive control).
  • ✅ compose + registry YAML parse.
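
The MCP stdio check above can be approximated as follows. This is a sketch rather than the script used for this PR: the `clientInfo` values and the helper names are made up, and it assumes the stdio transport's newline-delimited JSON-RPC framing with protocol revision 2024-11-05.

```python
import json


def handshake_messages() -> list:
    """Lines to write to the server's stdin: initialize, initialized, tools/list."""
    messages = [
        {"jsonrpc": "2.0", "id": 1, "method": "initialize",
         "params": {"protocolVersion": "2024-11-05", "capabilities": {},
                    # clientInfo is a placeholder, not a real client.
                    "clientInfo": {"name": "smoke-test", "version": "0.0.0"}}},
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
    return [json.dumps(m) for m in messages]


def stdout_is_clean(stdout_text: str) -> bool:
    """True if every non-empty stdout line is a JSON-RPC 2.0 object (no banners)."""
    for line in stdout_text.splitlines():
        if not line.strip():
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            return False  # a banner or log line leaked onto stdout
        if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
            return False
    return True
```

One way to use it is to pipe the handshake lines into the image with `docker run -i --rm --network none <image>` and pass the captured stdout to `stdout_is_clean`; any stray banner line makes the check fail.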

Not yet validated (operator step)

Full live gateway-spawn + Hermes round-trip is operator-space (touches data/ + recreates the gateway container) and not in this PR. To enable on a deployment:

  1. Set CODE_ROOT in .env (host path = what Hermes sees at /c/dev).
  2. docker compose --profile codebase-memory build codebase-memory-mcp-image
  3. ./scripts/mcp_add.sh codebase-memory, then recreate the gateway so it picks up CODE_ROOT.
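
The gateway has to be recreated in step 3 because the catalog entry carries a PLACEHOLDER_CODE_ROOT token that gateway-wrapper.sh fills in from $CODE_ROOT. The sketch below restates that substitution in Python for illustration; the real wrapper is a shell script, and both the function name `render_catalog` and the choice to raise when CODE_ROOT is unset are assumptions.

```python
import os

PLACEHOLDER = "PLACEHOLDER_CODE_ROOT"


def render_catalog(template: str, env=None) -> str:
    """Replace the placeholder token with CODE_ROOT from the environment."""
    env = os.environ if env is None else env
    code_root = env.get("CODE_ROOT", "")
    if PLACEHOLDER in template and not code_root:
        # Assumed behaviour: refuse to emit a bind with an empty host path.
        raise ValueError("CODE_ROOT must be set before rendering the catalog")
    return template.replace(PLACEHOLDER, code_root)
```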

Follow-ups (surfaced, not silently accepted)

  • docker/mcp-gateway:v2 is a rolling tag (violates pin-don't-float). Pre-existing; recommend pinning the gateway by digest separately.
  • Spawned container runs as root inside the network-isolated, RO-code sandbox (named-volume cache is root-owned). Acceptable for v1; non-root + entrypoint chown is a hardening follow-up.

🤖 Generated with Claude Code

Hermes Bot and others added 2 commits June 23, 2026 13:26
Gateway-spawned stdio MCP server wrapping the upstream
DeusData/codebase-memory-mcp static binary (MIT; bundled offline
embeddings, no API keys). Gives Hermes structural code navigation
(search_graph, trace_path, get_architecture, get_code_snippet, ...)
over the repos under CODE_ROOT, mounted read-only at /c/dev.

Architecture: the binary is stdio-only, which is exactly how the
docker/mcp-gateway already runs MCP servers (spawned as sibling
containers over stdio), so this mirrors qdrant-rag-mcp rather than a
long-lived HTTP service (the originally-assumed shape, which the binary
does not support).

- codebase-memory-mcp/: Dockerfile (pinned, sha256-verified portable
  static binary v0.8.1) + README.
- docker-compose.yml: codebase-memory-mcp-image build service behind
  profile "codebase-memory" (opt-in; ~266MB binary); mcp-gateway gains
  CODE_ROOT + MCP_GATEWAY_DOCKER_BIND_ALLOWED_PATHS.
- registry-custom.yaml: codebase-memory catalog entry — longLived,
  disableNetwork (100% local, anti-exfil), /c/dev:ro host bind +
  codebase-memory-cache named volume for the persistent SQLite index.
- gateway-wrapper.sh: substitute PLACEHOLDER_CODE_ROOT from $CODE_ROOT.
- .cbmignore: defense-in-depth secret/non-source excludes.
- .env.example, CHANGELOG.md.

Validated: image builds with checksum verification; binary speaks clean
MCP stdio (no stdout pollution) under --network none; secret exclusion
proven live (.cbmignore + .gitignore both keep secret content out of the
index, positive control retrievable).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Folds the upstream visualization into the codebase-memory integration: a
long-lived `codebase-memory-ui` service that serves the interactive 3D
knowledge-graph from the SAME index the headless MCP builds (shared
`codebase-memory-cache` named volume).

- codebase-memory-ui/: Dockerfile (pinned, sha256-verified portable/static UI
  binary v0.8.1) + entrypoint. The UI binds 127.0.0.1 only, so the entrypoint
  bridges it to 0.0.0.0:9750 via socat; it also keeps stdin open so the
  MCP-stdio process doesn't EOF-exit and the UI stays up as a service.
- docker-compose.yml: codebase-memory-ui service (profile codebase-memory,
  shared cache volume) + a pinned `codebase-memory-cache` named volume + an
  extra oauth2-proxy --whitelist-domain for the :8443 origin.
- auth/caddy/Caddyfile: dedicated SSO-gated :8443 listener. The UI is an
  absolute-asset SPA (/assets, /api, /rpc at root) whose /api collides with
  Open WebUI's root catch-all, so it owns its origin rather than a subpath.
- overrides/codebase-memory-ui.yml: publishes Caddy :8443 (opt-in).
- README + CHANGELOG + .env.example.

Validated: UI image builds (checksum verified) and serves HTTP 200 on both the
localhost UI port and the socat-bridged port, staying alive as a service;
merged compose config valid; Caddyfile validates ("Valid configuration",
3 servers incl :8443). Live OAuth-across-:8443 + in-browser render need the
operator gateway/Caddy recreate (make up).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AlienWalker1995

Owner Author

Folded the 3D graph visualization UI into this PR (commit 0c14cf9).

New: codebase-memory-ui — a long-lived service serving the upstream interactive 3D knowledge-graph from the same index the headless MCP builds (shared codebase-memory-cache named volume).

Why it's shaped this way:

  • The UI binary binds 127.0.0.1 only → the image bridges it to 0.0.0.0:9750 with socat, and keeps stdin open so the MCP-stdio process doesn't EOF-exit (it stays up as a service).
  • The UI is an absolute-asset SPA (/assets, /api, /rpc at root) and its /api/* collides with Open WebUI's root catch-all → it can't be a subpath; it gets its own SSO-gated origin on Caddy :8443 (opt-in via overrides/codebase-memory-ui.yml; extra --whitelist-domain=…:8443 on oauth2-proxy for the cross-port login return).
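
The subpath collision in the second bullet can be shown with a toy prefix router. This is only an illustration of the reasoning, not Caddy's matching logic; the `/graph-ui/` prefix and the `route` function are hypothetical.

```python
def route(path: str, ui_prefix: str = "/graph-ui/") -> str:
    """Toy front-door router: one subpath for the UI, a catch-all for the rest."""
    if path.startswith(ui_prefix):
        return "codebase-memory-ui"
    return "open-webui"  # root catch-all


# The page itself loads from the subpath...
page_target = route("/graph-ui/index.html")
# ...but the SPA then requests root-absolute URLs, which miss the prefix.
api_target = route("/api/graph")
```

Here `page_target` resolves to the UI while `api_target` lands on the Open WebUI catch-all, which is why a plain subpath mount does not work for an SPA that bakes in root-absolute paths.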

Enable: COMPOSE_FILE=…;overrides/codebase-memory-ui.yml + docker compose --profile codebase-memory up -d --build, then browse https://<host>:8443/ (Google SSO).

Validated: UI image builds (checksum verified); serves HTTP 200 on both the localhost UI port and the socat-bridged port, staying alive as a service; merged compose config valid; Caddyfile validates (Valid configuration, 3 servers incl :8443).

Needs operator recreate to live-test: the OAuth-across-:8443 flow and the in-browser graph render require make up (SOPS secrets), so they're config-complete + syntax-validated but not yet exercised on the running stack. Note: gateway-spawned MCPs (this + qdrant-rag) never appear in the services dashboard; the UI is what gives you something visual.

Hermes Bot and others added 6 commits June 23, 2026 15:09
…shared-index docs

The upstream binary doesn't reliably flush its graph index to CBM_CACHE_DIR
across container exits, so the cache volume can't serve as a shared index between
the gateway-spawned MCP and the UI. Mount the code root read-only on the UI so its
own long-lived process indexes + visualizes the graph (the index is then in-memory;
re-index after a restart). Validated live: the UI indexed ordo-ai-stack (2062 nodes /
8411 edges, secrets/ + data/ excluded) and serves it; gateway loads codebase-memory
(14 tools); :8443 SSO listener returns 302→oauth2 with the cross-port rd intact;
:443 front door unaffected.

Corrects the now-inaccurate "shared index" comments in docker-compose.yml + README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add a `codebase-memory-ui` entry to dashboard SERVICES (health-checked at
http://codebase-memory-ui:9750/ over proxy-net) and to ops-controller
ALLOWED_SERVICES (so restart/logs/stats work for it). `url` is omitted so the
frontend builds https://<host>:8443 from the port (the SSO Caddy listener);
the localhost-only :9749 UI is bridged to :9750, which is what the dashboard
health-checks.

Validated live: /api/health lists codebase-memory-ui ok=true (9 services);
ruff + dashboard service tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The "Open" link was generated as http://<host>:8443 (serviceOpenHref's
no-url fallback defaults to http), which hit the TLS :8443 listener and
failed with "Client sent an HTTP request to an HTTPS server". Set the
catalog `url` to https://localhost:8443 so serviceOpenHref keeps the https
scheme + :8443 port and only swaps in the dashboard host -> https://<host>:8443/.

Validated live: /api/services returns url=https://localhost:8443, ok=true.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
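
The link behaviour described in this commit can be sketched as below. The real serviceOpenHref lives in the dashboard frontend and is not shown in this PR, so this Python version, including its signature, is a reconstruction from the commit message only.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def service_open_href(dashboard_host: str,
                      url: Optional[str] = None,
                      port: Optional[int] = None) -> str:
    """Build the "Open" link for a service as the commit message describes."""
    if url is None:
        # No-url fallback: the scheme defaults to http (the source of the bug).
        suffix = "" if port is None else f":{port}"
        return f"http://{dashboard_host}{suffix}/"
    parts = urlsplit(url)
    # Keep scheme and port from the catalog url; swap in the dashboard host.
    netloc = dashboard_host if parts.port is None else f"{dashboard_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, parts.fragment))
```

With no catalog url the link comes out as plain http on :8443, which the TLS listener rejects; with `url="https://localhost:8443"` the scheme and port survive and only the host changes.
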
…443)

Give the visualization a proper, port-less URL — https://<host>/codebase-memory/
on the shared :443 Google-SSO origin — instead of the dedicated :8443 listener.

The UI is an absolute-asset SPA with no base-path option, so the image now runs
nginx that proxies the localhost-only UI and sub_filter-rewrites its baked
/assets,/api,/rpc,/font-files to the /codebase-memory/ prefix — letting it live
under a subpath without colliding with Open WebUI's root. Caddy routes
/codebase-memory/* in the existing :443 SSO block; the dashboard links via
SSO_ROUTES. Removes the :8443 listener, its oauth2-proxy :8443 whitelist, and the
overrides/codebase-memory-ui.yml port override (socat replaced by nginx).

Validated live: /codebase-memory/ -> 302 SSO (same-origin rd); assets+api proxy
through nginx (re-index succeeded via /codebase-memory/rpc); :8443 removed;
:443 front door intact (/healthz 200, /dash/ 302); dashboard /api/health ok=true.
Re-verify the nginx sub_filter path list when CBM_VERSION is bumped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
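
The prefix rewrite that nginx performs with sub_filter can be approximated in a few lines. This is a sketch of the idea, not the nginx config from this commit: sub_filter does literal string replacement, whereas this version uses a regex and assumes the baked paths appear inside quotes.

```python
import re

PREFIX = "/codebase-memory"
ROOT_PATHS = ("assets", "api", "rpc", "font-files")

# Match a quote, then one of the root-absolute paths, then a path/quote/query
# boundary, so that "/apiary" is left alone.
_PATTERN = re.compile(
    r"(?P<q>[\"'])/(?P<p>" + "|".join(map(re.escape, ROOT_PATHS)) + r")(?=[/\"'?])"
)


def rewrite_body(body: str) -> str:
    """Prefix the SPA's root-absolute paths so they resolve under the subpath."""
    return _PATTERN.sub(lambda m: f"{m['q']}{PREFIX}/{m['p']}", body)
```

Because already-prefixed paths no longer start with a quote followed directly by one of the listed names, running the rewrite twice leaves the body unchanged. The path list has to track whatever the upstream UI bakes in, which is why the commit asks for a re-check when CBM_VERSION is bumped.
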
@AlienWalker1995 AlienWalker1995 merged commit 29b24a6 into main Jun 23, 2026
5 checks passed
@AlienWalker1995 AlienWalker1995 deleted the feat/codebase-memory-mcp branch June 23, 2026 20:47