Turn any SQL Server or Oracle database into a navigable knowledge graph.
DBGraph introspects your database schema — tables, views, stored procedures, functions, triggers, foreign keys, and cross-object dependencies — and builds a persistent knowledge graph with community detection, god-node ranking, and an interactive HTML visualization. Ask natural-language questions about your schema, trace dependency chains, and find surprising cross-schema couplings you didn't know existed.
Inspired by graphify — the same graph pipeline, applied to databases instead of code.
| Output | Description |
|---|---|
dbgraph-out/graph.html |
Interactive D3.js force graph — search, filter by type, hover for details |
dbgraph-out/GRAPH_REPORT.md |
God nodes, community map, surprising connections, suggested questions |
dbgraph-out/graph.json |
Raw graph (GraphRAG-ready, importable into Neo4j / FalkorDB) |
pip install dbgraph[sqlserver] # SQL Server
pip install dbgraph[oracle] # Oracle
pip install dbgraph[all] # both + best clustering (graspologic)python -m dbgraph --sqlserver "Server=myhost;Database=mydb;UID=sa;PWD=secret"python -m dbgraph --oracle "user/pass@host:1521/ORCL"Open dbgraph-out/graph.html in your browser.
DBGraph ships as a Claude Code skill. Once installed, type /dbgraph in any Claude Code chat to run the full pipeline interactively — with community labeling, natural-language queries, and guided exploration.
Install the skill:
# Windows
New-Item -ItemType Directory -Force "$env:USERPROFILE\.claude\skills\dbgraph"
Copy-Item "SKILL.md" "$env:USERPROFILE\.claude\skills\dbgraph\SKILL.md"
# macOS / Linux
mkdir -p ~/.claude/skills/dbgraph
cp SKILL.md ~/.claude/skills/dbgraph/SKILL.mdThen in Claude Code:
/dbgraph --sqlserver "Server=myhost;Database=mydb;UID=sa;PWD=secret"
The skill runs the full 7-step pipeline — introspect → extract → cluster → label → visualize — and guides you through exploring the result.
python -m dbgraph --sqlserver "..." # SQL Server connection string
python -m dbgraph --oracle "..." # Oracle DSN (user/pass@host:port/service)
python -m dbgraph --dsn ENV_VAR # read connection from environment variable
python -m dbgraph --schema dbo # limit to one schema
python -m dbgraph --schema "A,B,C" # limit to multiple schemas
python -m dbgraph --include-columns # add columns as graph nodes
python -m dbgraph --include-indexes # add indexes as graph nodes
python -m dbgraph --no-procs # skip stored procedures and functions
python -m dbgraph --mode deep # LLM semantic enrichment of proc bodies
python -m dbgraph --cluster-only # rerun community detection only
python -m dbgraph --no-viz # skip HTML output
python -m dbgraph --output ./my-out # custom output directory
python -m dbgraph query "What tables does sp_ProcessOrder touch?"
python -m dbgraph path "Orders" "InvoiceLines"
python -m dbgraph explain "sp_CalculateTax"
| Type | Description |
|---|---|
table |
Base table |
view |
View |
procedure |
Stored procedure |
function |
Scalar, table-valued, or aggregate function |
trigger |
DML/DDL trigger |
schema |
Schema/owner container |
column |
Column (--include-columns only) |
index |
Index (--include-indexes only) |
| Relation | Confidence | Description |
|---|---|---|
fk_references |
EXTRACTED | Foreign key constraint |
view_reads |
EXTRACTED | View selects from table/view |
proc_reads |
EXTRACTED | Proc/function reads from table/view |
proc_writes |
EXTRACTED | Proc/function inserts/updates/deletes table |
proc_calls |
EXTRACTED | Proc/function calls another proc/function |
trigger_on |
EXTRACTED | Trigger fires on table |
contains_column |
EXTRACTED | Table/view owns a column |
in_schema |
EXTRACTED | Object belongs to schema |
semantic_related |
INFERRED | LLM-inferred business relationship (--mode deep) |
EXTRACTED edges come from the database catalog (sys.sql_expression_dependencies / ALL_DEPENDENCIES) — definitive.
INFERRED edges are derived from regex parsing of procedure bodies — flagged as uncertain.
DBGraph uses Leiden clustering (via graspologic) with Louvain fallback. Oversized communities are split automatically. The result is a set of named business-domain clusters like:
- Order Processing — Orders, Bookings, Spots, related procs
- Invoice & Financials — Invoices, InvoiceDetails, Contracts
- Channel Configuration — Channels, ChannelGroups, SalesHouses
- Creative Management — Copies, SpotUpload, ReplayTypes
The most-connected objects in your schema — the tables or functions that everything else depends on. Renaming or dropping one of these without checking its 300+ dependents first is how production incidents happen.
Cross-community or cross-schema edges the graph surfaces automatically. Real examples from a production media planning database:
aos.Creative_Createwrites directly intodbo.Copieswith no validation layerdbo.Productsis read byaos.GetProductTreeacross the schema boundary- A copy batch allocation proc references
aos.MessageID(traced and confirmed as a regex false positive in this case — the audit trail tells you which edges to trust)
git clone https://github.com/RobertvanDijk-Operative/DBGraph.git
cd DBGraph
pip install -e ".[sqlserver]" # SQL Server support
pip install -e ".[oracle]" # Oracle support
pip install -e ".[all]" # everything| Package | Required for |
|---|---|
networkx |
Graph construction and traversal (always required) |
pyodbc |
SQL Server ([sqlserver]) |
oracledb |
Oracle ([oracle]) |
graspologic |
Best-quality Leiden clustering ([cluster]) |
If graspologic is not installed, DBGraph falls back to NetworkX's built-in Louvain algorithm — results are good, just slightly less stable across runs.
Oracle uses ALL_TABLES, ALL_VIEWS, ALL_PROCEDURES, ALL_SOURCE, ALL_CONSTRAINTS, ALL_DEPENDENCIES, and ALL_TRIGGERS. Procedure bodies are reconstructed by joining ALL_SOURCE rows in line order.
oracledb runs in thin mode by default — no Oracle Instant Client required. For thick mode (wallets, advanced auth), call oracledb.init_oracle_client() before connecting.
graphify parses files (code, docs, PDFs, images) into a knowledge graph. DBGraph does the same but the input is a live database connection:
| graphify | DBGraph | |
|---|---|---|
| Input | File folder | Database connection string |
| Extraction | tree-sitter AST + LLM | SQL catalog queries |
| Nodes | Functions, classes, docs | Tables, views, procs, triggers |
| Edges | Imports, calls, references | FK, proc_reads, proc_writes, view_reads |
| No-LLM mode | No (LLM required for semantic) | Yes — full graph with no LLM cost |
| LLM mode | Semantic enrichment of code | --mode deep: semantic enrichment of proc bodies |
- FK edges are EXTRACTED from the catalog — definitive.
- Dependency edges from
sys.sql_expression_dependencies/ALL_DEPENDENCIESare EXTRACTED. - Edges inferred from regex parsing of proc bodies are INFERRED — treat as hints, not facts.
- Encrypted or inaccessible proc bodies: any body-derived edges are omitted or marked INFERRED.
- The report always shows edge confidence breakdown so you know what to trust.
MIT — see LICENSE.
Built on top of the graphify architecture by @safishamsi.
Community detection via graspologic (Leiden) and NetworkX (Louvain).
Visualization via D3.js.