Commit b55748a

Initial commit: SysML indexer, Kuzu graph, MCP server, CLI, viewer
- Discovery (find-sysml, load-order), parser (sysml-v2-lsp), symbol-to-graph mapping
- Kuzu graph store (.kuzu path), registry, list/clean
- CLI: analyze, list, clean; path normalization and .kuzu DB path
- MCP: indexDbGraph, list_indexed, clean_index, cypher, query, context, impact, rename
- Scripts: export-graph, query-one, debug-index; viewer/view.html for graph viz
- Docs: grammar-and-mapping, detailed README usage
- Tests: discovery, graph-store, integration (parser mocked)

Made-with: Cursor
0 parents  commit b55748a

50 files changed

Lines changed: 6061 additions & 0 deletions

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
node_modules/
dist/
*.log
.env
.sysmledgraph/
graph-export.json

.nvmrc

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
20

README.md

Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
# sysmledgraph

Path-only SysML indexer: builds a knowledge graph from `.sysml` files and exposes it via **MCP** (query, context, impact, rename, cypher, list, clean) and **CLI** (analyze, list, clean). Follows the [GitNexus](https://github.com/abhigyanpatwari/GitNexus) pattern (Kuzu graph, MCP tools) but indexes SysML only; grouping is SysML-native.

**Design and plan:** SysML v2 model and development plan live in a separate repo (`sysml-v2-models/projects/sysmledgraph`). This repo is the implementation.

## Requirements

- **Node.js 20+** (see `.nvmrc`)

## Libraries

| Library | Purpose |
|--------|--------|
| **[Kuzu](https://kuzudb.com/)** (`kuzu`) | Embedded property graph database; Cypher queries; stores SysML nodes and edges. |
| **[@modelcontextprotocol/sdk](https://github.com/modelcontextprotocol/typescript-sdk)** | MCP server (stdio), tool and resource registration; used for the sysmledgraph MCP server. |
| **[Commander](https://github.com/tj/commander.js)** (`commander`) | CLI subcommands: `analyze`, `list`, `clean`. |
| **[fast-glob](https://github.com/mrmlnc/fast-glob)** | File discovery: find all `.sysml` / `.kerml` under path(s). |
| **[Zod](https://github.com/colinhacks/zod)** | Schema validation for MCP tool parameters. |
| **[sysml-v2-lsp](https://github.com/daltskin/sysml-v2-lsp)** | SysML v2 parser via LSP stdio; **required** for indexing. Provides document symbols (Package, PartDef, PartUsage, etc.) and IN_DOCUMENT / IN_PACKAGE edges. |

**sysml-v2-lsp (required)**
After `npm install`, build the LSP once:
`cd node_modules/sysml-v2-lsp && npm run build`
Or set **SYSMLLSP_SERVER_PATH** to your built `dist/server/server.js`. Indexing will fail with a clear error if the LSP is not found.

*Dev:* TypeScript, Vitest, @types/node.

## Install

```bash
npm install
npm run build
```

## Usage

### CLI

Run from the project root (after `npm run build`) or via `npx sysmledgraph` if linked/published.

| Command | Description |
|--------|-------------|
| **analyze** `<paths...>` | Index one or more directory trees: discover `.sysml` and `.kerml`, parse via LSP, build the graph. Paths are resolved to absolute and stored in the registry. |
| **list** | Print all indexed root paths (from the registry). |
| **clean** `[path]` | Remove the index for a given path, or for **all** indexed paths if `path` is omitted. Deletes the DB file and registry entry. |

**Examples:**

```bash
# Index a single model directory
npx sysmledgraph analyze ./path/to/sysml-models

# Index multiple roots (each gets its own DB)
npx sysmledgraph analyze ./repo1/models ./repo2/models

# See what is indexed
npx sysmledgraph list

# Remove index for one path
npx sysmledgraph clean ./path/to/sysml-models

# Remove all indexed paths
npx sysmledgraph clean
```

**Options and environment:**

- **`--storage <path>`**: Override the storage root (default: `~/.sysmledgraph`). Same as env **`SYSMEDGRAPH_STORAGE_ROOT`**.
- **`SYSMLLSP_SERVER_PATH`**: Optional. Path to the sysml-v2-lsp server JS (e.g. `dist/server/server.js`). If unset, the CLI uses `node_modules/sysml-v2-lsp/dist/server/server.js` (must be built).

**Storage layout:** Under the storage root: `registry.json` (list of indexed paths), and `db/<sanitized-path>.kuzu` (one Kuzu database per indexed path). On failure, the CLI writes errors to stderr and exits non-zero.

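The storage layout can be pictured with a minimal sketch. The helper names (`resolveStorageRoot`, `sanitizePath`, `dbPathFor`) and the sanitization rule are assumptions made for illustration; the actual implementation in `src/` may name and sanitize things differently. Only the `~/.sysmledgraph` default, the `SYSMEDGRAPH_STORAGE_ROOT` variable, and the `db/<sanitized-path>.kuzu` shape come from the text above.

```typescript
import * as os from "node:os";
import * as path from "node:path";

/** Storage root: explicit value (e.g. from --storage), else env, else ~/.sysmledgraph. */
export function resolveStorageRoot(explicit?: string): string {
  return path.resolve(
    explicit ??
      process.env.SYSMEDGRAPH_STORAGE_ROOT ??
      path.join(os.homedir(), ".sysmledgraph"),
  );
}

/** Assumed rule: collapse every run of characters outside [A-Za-z0-9._-] into "_". */
export function sanitizePath(absPath: string): string {
  return absPath.replace(/[^A-Za-z0-9._-]+/g, "_");
}

/** One database per indexed path: <root>/db/<sanitized-path>.kuzu. */
export function dbPathFor(storageRoot: string, indexedPath: string): string {
  const abs = path.resolve(indexedPath); // paths are resolved to absolute first
  return path.join(storageRoot, "db", `${sanitizePath(abs)}.kuzu`);
}
```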
---

### MCP

Server name: **sysmledgraph**. The MCP server uses the same storage root as the CLI (default `~/.sysmledgraph`), so tools operate on whatever paths you indexed via the CLI or via the **indexDbGraph** tool.

**Setup (Cursor):** Add to `.cursor/mcp.json` or Cursor MCP settings.

**Option A (local build):**

```json
{
  "mcpServers": {
    "sysmledgraph": {
      "command": "node",
      "args": ["C:/path/to/codebase-sysmledgraph/dist/mcp/index.js"],
      "env": {
        "SYSMEDGRAPH_STORAGE_ROOT": "C:/Users/you/.sysmledgraph",
        "SYSMLLSP_SERVER_PATH": "C:/path/to/sysml-v2-lsp/dist/server/server.js"
      }
    }
  }
}
```

**Option B (npx, when published):**

```json
{
  "mcpServers": {
    "sysmledgraph": {
      "command": "npx",
      "args": ["-y", "sysmledgraph-mcp"]
    }
  }
}
```

**Tools:**

| Tool | Parameters | Description |
|------|------------|-------------|
| **indexDbGraph** | `path` (string) or `paths` (string[]) | Build the graph for the given path(s). Same logic as CLI `analyze`. If no path is given, uses the first indexed path (no-op). |
| **list_indexed** | (none) | Return the list of indexed root paths (same as CLI `list`). |
| **clean_index** | `path` (string, optional) | Remove index for one path or all (same as CLI `clean`). |
| **cypher** | `query` (string) | Run a Cypher query on the graph for the **first** indexed path. Example: `MATCH (n:Node) RETURN n.id, n.label LIMIT 10`. |
| **query** | `query` (string), `kind` (string, optional) | Concept search over node names/labels. Filters by node label if `kind` is set. |
| **context** | `name` (string) | Get one node by id or name and its adjacent edges (types and targets). |
| **impact** | `target` (string), `direction` (`"upstream"` \| `"downstream"`, optional) | List nodes that depend on `target` (upstream) or that `target` depends on (downstream). |
| **rename** | `symbol` (string), `newName` (string), `dry_run` (boolean, optional) | Preview or perform a rename of a symbol across the graph. |

**Resources:**

- **sysmledgraph://context**: Index stats and list of indexed paths (Markdown).
- **sysmledgraph://schema**: Graph node and edge schema (Markdown).

Tools that need a graph (cypher, query, context, impact) use the **first** entry in the registry as the target DB. Ensure at least one path is indexed (CLI or indexDbGraph) before calling them.

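The two `direction` values of the **impact** tool can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the `Edge` shape, the transitive breadth-first walk, and the default direction are not taken from the repository, and the real tool queries the Kuzu graph rather than an array of edges.

```typescript
// Hypothetical edge shape: `from` depends on `to`.
interface Edge {
  from: string;
  to: string;
}

type Direction = "upstream" | "downstream";

/**
 * Breadth-first walk over dependency edges.
 * "upstream" collects nodes that (transitively) depend on `target`;
 * "downstream" collects nodes that `target` (transitively) depends on.
 */
export function impact(
  edges: Edge[],
  target: string,
  direction: Direction = "upstream", // default chosen arbitrarily for this sketch
): string[] {
  const seen = new Set<string>([target]);
  const queue: string[] = [target];
  const result: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const e of edges) {
      const match = direction === "upstream" ? e.to : e.from;
      const next = direction === "upstream" ? e.from : e.to;
      if (match === current && !seen.has(next)) {
        seen.add(next);
        result.push(next);
        queue.push(next);
      }
    }
  }
  return result;
}
```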
---

### Querying the graph (scripts)

For ad-hoc Cypher or exporting the graph without MCP:

- **Run one Cypher query** (uses first indexed path, outputs JSON):

  ```bash
  node scripts/query-one.mjs "MATCH (n:Node) RETURN count(n) AS total"
  node scripts/query-one.mjs "MATCH (n:Node) RETURN n.label, count(*) AS c ORDER BY c DESC LIMIT 5"
  ```

  The graph uses a single node table **Node**; always use the label in Cypher: `MATCH (n:Node) ...`.

- **Export graph for viewing:** Run `npm run export-graph` (writes `graph-export.json` in the current directory). Open `viewer/view.html` in a browser, click “Load graph.json”, and select that file for a force-directed view; click a node for details. Custom output path: `node scripts/export-graph.mjs path/to/out.json`.

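Because every node lives in the single `Node` table, the SysML kind is filtered through the `label` property rather than through a Cypher label. The sketch below builds such a query string; `cypherString` and `nodesByLabelQuery` are hypothetical helpers, not part of the repository, and the backslash escaping follows the usual Cypher string-literal convention, which is assumed to hold for Kuzu as well. The resulting string could be passed to `scripts/query-one.mjs` or to the **cypher** tool.

```typescript
/** Escape a value for use inside a single-quoted Cypher string literal. */
function cypherString(value: string): string {
  return `'${value.replace(/\\/g, "\\\\").replace(/'/g, "\\'")}'`;
}

/** Hypothetical helper: list nodes of one SysML kind from the single Node table. */
export function nodesByLabelQuery(label: string, limit = 10): string {
  const safeLimit = Math.max(0, Math.trunc(limit)); // keep LIMIT a non-negative integer
  return (
    `MATCH (n:Node) WHERE n.label = ${cypherString(label)} ` +
    `RETURN n.id, n.name LIMIT ${safeLimit}`
  );
}
```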
## Project layout

- `src/`: core modules (indexer, graph, mcp, cli, discovery, parser, symbol-to-graph, storage).
- `bin/cli.ts`: CLI entrypoint.
- `mcp/index.ts`: MCP server entrypoint (stdio).
- `test/`: unit and integration tests.

## Development

- `npm run build`: compile TypeScript.
- `npm run test`: run tests.
- `npm run test:watch`: watch mode.

## View the graph

See **Usage → Querying the graph (scripts)** for the full flow. Short version: `npm run export-graph` writes `graph-export.json`; open `viewer/view.html` in a browser and load that file for a force-directed graph (click nodes for id/label/path).

## Schema (graph)

Implemented with **Kuzu**: one `Node` table (id, name, path, label) and one rel table per edge type (Node→Node). Label values: Document, Package, PartDef, PartUsage, etc. Edge types: IN_DOCUMENT, IN_PACKAGE, PARENT, TYPES, REFERENCES, IMPORTS, SATISFY, DERIVE, VERIFY, BINDING, CONNECTION_END. See design doc (GITNEXUS_FEATURES.md) and deploy model.

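A rough idea of the DDL this schema implies, using Kuzu's `CREATE NODE TABLE` and `CREATE REL TABLE` statements. The column types, the choice of `id` as primary key, and the function names are assumptions for illustration; the authoritative definitions are in `src/graph/schema.ts`, and only a subset of the edge types is listed here.

```typescript
// Subset of the edge types named in the README, for illustration only.
const EDGE_TYPES = ["IN_DOCUMENT", "IN_PACKAGE", "PARENT", "TYPES", "REFERENCES"] as const;

/** DDL for the single Node table. STRING columns and PRIMARY KEY (id) are assumed. */
export function nodeTableDdl(): string {
  return "CREATE NODE TABLE Node(id STRING, name STRING, path STRING, label STRING, PRIMARY KEY (id))";
}

/** One Node -> Node rel table per edge type. */
export function relTableDdl(edgeTypes: readonly string[] = EDGE_TYPES): string[] {
  return edgeTypes.map((t) => `CREATE REL TABLE ${t}(FROM Node TO Node)`);
}
```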
## License

MIT

bin/cli.ts

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
#!/usr/bin/env node
/**
 * CLI entrypoint: analyze, list, clean.
 */

import { program } from 'commander';
import {
  configureStorageRoot,
  cmdAnalyze,
  cmdList,
  cmdClean,
} from '../src/cli/commands.js';

program
  .name('sysmledgraph')
  .description('Path-only SysML indexer: knowledge graph, MCP server, CLI')
  .option('--storage <path>', 'Storage root (default: ~/.sysmledgraph)', process.env.SYSMEDGRAPH_STORAGE_ROOT);

program
  .command('analyze <paths...>', { isDefault: true })
  .description('Index path(s): discover .sysml, build graph')
  .action(async (paths: string[]) => {
    const opts = program.opts();
    configureStorageRoot(opts.storage);
    const result = await cmdAnalyze(paths);
    if (!result.ok) {
      process.stderr.write((result.error ?? 'Unknown error') + '\n');
      process.exit(1);
    }
  });

program
  .command('list')
  .description('List indexed path(s)')
  .action(async () => {
    configureStorageRoot(program.opts().storage);
    const result = await cmdList();
    if (!result.ok) {
      process.stderr.write((result.error ?? 'Unknown error') + '\n');
      process.exit(1);
    }
    for (const p of result.paths) {
      console.log(p);
    }
  });

program
  .command('clean [path]')
  .description('Remove index for path or all')
  .action(async (path?: string) => {
    configureStorageRoot(program.opts().storage);
    const result = await cmdClean(path);
    if (!result.ok) {
      process.stderr.write((result.error ?? 'Unknown error') + '\n');
      process.exit(1);
    }
  });

// The action handlers are async, so use parseAsync() (as Commander recommends)
// and report any rejection instead of leaving it unhandled.
program.parseAsync().catch((err) => {
  process.stderr.write(String(err) + '\n');
  process.exit(1);
});

docs/.gitkeep

Whitespace-only changes.

docs/grammar-and-mapping.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
# Grammar and symbol mapping

## Two different things

1. **SysML v2 language grammar** – The syntax of `.sysml` / `.kerml` (what is valid SysML). **Not in this repo.** It is defined and updated in the **sysml-v2-lsp** project (the LSP that does the actual parsing).
2. **Symbol → graph mapping** – How LSP *output* (metaclass names, relation names) is turned into graph node labels and edge types in sysmledgraph. **This is in this repo** and is what you update when you want to index new metaclasses or relations.

---

## 1. Updating the SysML v2 language grammar

Parsing is done by **sysml-v2-lsp** (dependency: `github:daltskin/sysml-v2-lsp`). To change what is *parsed* as valid SysML (new syntax, grammar fixes):

- Work in the **sysml-v2-lsp** repository.
- That project typically has a grammar (e.g. ANTLR `.g4` or similar) and a parser generated from it. Update the grammar there, regenerate the parser, and release or point `package.json` at your fork.
- After updating the LSP, rebuild it (`npm run build` in the LSP repo or in `node_modules/sysml-v2-lsp`) and ensure **SYSMLLSP_SERVER_PATH** (if you use it) points to the newly built server script.

sysmledgraph does not contain or generate the SysML grammar; it only consumes the LSP’s document symbols.

---

## 2. Updating the symbol → graph mapping (in this repo)

When the LSP returns **new metaclass names** or **new relation semantics**, you map them to the graph here.

### Node labels (LSP metaclass → graph label)

**File:** `src/symbol-to-graph/mapping.ts`

- **`symbolKindToNodeLabel(kind)`** – `kind` is the LSP `DocumentSymbol.detail` (metaclass name from sysml-v2-lsp, e.g. `PartDefinition`, `RequirementUsage`). Add or change entries in the `map` object to support new metaclasses or rename labels.

If you introduce a **new** graph label:

1. Add it to **`NODE_LABELS`** in `src/types.ts`.
2. Add the corresponding Kuzu schema in `src/graph/schema.ts` if you ever split into multiple node tables (currently there is a single `Node` table with a `label` property, so usually no schema change).
3. Add the metaclass → label mapping in **`symbolKindToNodeLabel`** in `src/symbol-to-graph/mapping.ts`.

### Edge types (LSP relation → graph edge type)

**File:** `src/symbol-to-graph/mapping.ts`

- **`relationToEdgeType(relation)`** – `relation` is the relation name from the LSP (e.g. `inDocument`, `inPackage`). Add or change entries to support new relations.

If you introduce a **new** edge type:

1. Add it to **`EDGE_TYPES`** in `src/types.ts`.
2. The Kuzu schema in `src/graph/schema.ts` creates one rel table per `EDGE_TYPES` entry, so a new type will get a new table when the DB is (re)created.
3. Add the relation name → edge type mapping in **`relationToEdgeType`** in `src/symbol-to-graph/mapping.ts`.

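The two mapping functions described above might look roughly like the following sketch. The function names match the ones this document mentions, but the map contents, the narrowed label and edge types, and the `undefined` fallback for unknown names are assumptions for illustration; check `src/symbol-to-graph/mapping.ts` and `src/types.ts` for the real entries.

```typescript
// Hypothetical subsets; the full lists are NODE_LABELS / EDGE_TYPES in src/types.ts.
type NodeLabel = "Package" | "PartDef" | "PartUsage";
type EdgeType = "IN_DOCUMENT" | "IN_PACKAGE";

// Assumed entries: LSP metaclass name -> graph label.
const labelMap: Record<string, NodeLabel> = {
  Package: "Package",
  PartDefinition: "PartDef",
  PartUsage: "PartUsage",
};

// Assumed entries: LSP relation name -> graph edge type.
const edgeMap: Record<string, EdgeType> = {
  inDocument: "IN_DOCUMENT",
  inPackage: "IN_PACKAGE",
};

/** Look up only own properties, so names like "toString" do not match. */
function lookup<T>(map: Record<string, T>, key: string): T | undefined {
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : undefined;
}

/** Map an LSP metaclass name (DocumentSymbol.detail) to a graph label, if known. */
export function symbolKindToNodeLabel(kind: string): NodeLabel | undefined {
  return lookup(labelMap, kind);
}

/** Map an LSP relation name to a graph edge type, if known. */
export function relationToEdgeType(relation: string): EdgeType | undefined {
  return lookup(edgeMap, relation);
}
```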
### Where LSP output is used

- **`src/parser/symbols.ts`** – Builds normalized symbols from LSP response; uses `symbolKindToNodeLabel(item.sym.detail ?? '')` and emits relations that are later mapped with `relationToEdgeType`.

After editing mapping or types, run `npm run build` and re-index to see new nodes/edges in the graph.

mcp/index.ts

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
#!/usr/bin/env node
/**
 * MCP server entrypoint (stdio). Server name: sysmledgraph.
 * Tools: indexDbGraph, query, context, impact, rename, cypher, list_indexed, clean_index.
 * Resources: sysmledgraph://context, sysmledgraph://schema.
 */

import { runStdio } from '../src/mcp/server.js';

runStdio().catch((err) => {
  process.stderr.write(String(err) + '\n');
  process.exit(1);
});
