Plan version / iteration: Iteration 2 (GenerateMap).
Change note: Start iteration 2; promote GenerateMap from §6. Scope: implement GenerateMap (MCP tool to generate .md from graph, e.g. interconnection view). §6 backlog: DetectChanges, per-path resource, multi-path doc remain.
Source of truth: SysML v2 model in this project (deploy-sysmledgraph.sysml, requirements-sysmledgraph.sysml, behaviour-sysmledgraph.sysml). Requirements R1–R8 and part/action/state definitions in the model are authoritative; this plan is the implementation guide.
Design: docs/mcp/SYSMLEDGRAPH_MCP.md (parent SystemDesign repo).
Contents: 0 Workflow · 1 Tech stack · 2 Scope · 3 Phases · 4 Steps · 5 Verification · 6 After v1 · 7 References.
Related: sysmledgraph-CODEBASE_STRUCTURE.md · sysmledgraph-INTERCONNECTION_VIEW.md · sysmledgraph-MODEL_EXAMINATION_ERROR_CONDITIONS.md · sysmledgraph-GITNEXUS_ANALYSIS.md. Gaps: maintained in the codebase (implement side) at docs/GAPS.md.
Workflow (step-by-step, iteration): modelbase-development-WORKFLOW.md.
This repo is the modelbase for sysmledgraph: the SysML model, this plan, and related outputs live here (sysml-v2-models/projects/sysmledgraph/). The implement side (codebase) is the implementation repo: chouswei/codebase-sysmledgraph.
- Modelbase (this repo) generates the plan (to do): this development plan, requirements R1–R8, deploy/behaviour models, CODEBASE_STRUCTURE, and checkable steps. The plan is the implementation guide for the current iteration.
- Implement side (codebase) implements against the plan and model, then produces an implement report and delivers it to the modelbase. The report includes:
  - Alignment: whether the implementation matches the plan and model (e.g. `docs/MODELBASE_ALIGNMENT.md` in the codebase).
  - Gaps: what the model has but the code does not; what the code has but the model does not; pipeline alignment; limitations (e.g. `docs/GAPS.md` in the codebase).
  - Traceability: plan step → implementation (e.g. `docs/IMPLEMENTATION_PLAN.md` in the codebase).
- Modelbase receives the implement report, reviews it, and decides whether the iteration is done. If done, the modelbase updates this plan for the next iteration (scope, §6 After v1, or close) and updates Plan version and change note (see PLAN_TEMPLATE). Gaps are maintained in the codebase.
sysmledgraph is a path-only indexer that builds a knowledge graph from .sysml files and exposes it via MCP (query, context, impact, rename, cypher, list, clean) and CLI (analyze, list, clean). It uses a Kuzu graph and MCP/CLI patterns similar to common code indexers but indexes SysML only; grouping is SysML-native (packages, containment).
Model alignment: The SysML model defines four parts (Indexer, GraphStore, McpServer, Cli), ports and connection items, eight actions (index/query/context/impact/rename/cypher/list/clean), a lifecycle state machine (idle, indexing, ready, cleaning; error states), and a pipeline state machine IndexPipelineStates (discovering → loadOrdering → parsing → mapping → writing). Implementation should align with deploy-sysmledgraph.sysml and behaviour-sysmledgraph.sysml.
Terms: Indexer — discovers .sysml under path(s), parses them, writes Document + Symbol nodes and edges to the graph. GraphStore — abstraction over Kuzu (open DB, add nodes/edges, get connection for Cypher). Symbol→graph mapping — LSP symbol kinds → graph node labels and edge types (PartDef, IN_PACKAGE, TYPES, etc.).
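The pipeline order named above (discovering → loadOrdering → parsing → mapping → writing) can be sketched as a small stage runner. The stage names come from IndexPipelineStates; `runPipeline`, `PipelineOutcome` and the handler table are hypothetical names used only for illustration, not the codebase's API.

```typescript
// Stage names are taken from IndexPipelineStates in the plan.
// The runner and its types are hypothetical (illustration only).
const PIPELINE_STAGES = ["discovering", "loadOrdering", "parsing", "mapping", "writing"] as const;
type PipelineStage = (typeof PIPELINE_STAGES)[number];

type PipelineOutcome =
  | { ok: true; visited: PipelineStage[] }
  | { ok: false; failedAt: PipelineStage; error: string; visited: PipelineStage[] };

// Runs one handler per stage, in model order, and stops at the first failure
// so the caller can report which stage failed.
function runPipeline(handlers: Record<PipelineStage, () => void>): PipelineOutcome {
  const visited: PipelineStage[] = [];
  for (const stage of PIPELINE_STAGES) {
    visited.push(stage);
    try {
      handlers[stage]();
    } catch (e) {
      const error = e instanceof Error ? e.message : String(e);
      return { ok: false, failedAt: stage, error, visited };
    }
  }
  return { ok: true, visited };
}
```

Keeping the stage list in one constant means the order cannot drift between the runner and any status reporting that reuses it.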
| Layer | Choice | What it does |
|---|---|---|
| Runtime | Node.js 20+ | Same as sysml-v2-lsp; MCP/CLI patterns. |
| Language | TypeScript | Type safety; same ecosystem as sysml-v2-lsp. |
| Graph store | Kuzu (kuzudb.com) | Embedded graph DB; Cypher. Node bindings. Persistence: global registry (~/.sysmledgraph/registry.json + DB per indexed root). |
| SysML parsing | sysml-v2-lsp (stdio in v1) | ANTLR4 parsing, document symbols, go-to-def, references, rename. v1: stdio client (spawn LSP, documentSymbol/definition/references)—accepted; no stable programmatic API in sysml-v2-lsp. Library mode if API becomes available later. |
| File discovery | fast-glob or Node fs + glob | Find .sysml (and optionally .kerml) under path(s). Respect config.yaml (e.g. model_files) for load order. |
| MCP server | @modelcontextprotocol/sdk (Node) | Tools and resources; server name sysmledgraph. |
| CLI | commander (Node) | Subcommands: analyze, list, clean; env or config for storage. |
| Load order | Config-driven when present | config.yaml model_files at path; else deterministic (e.g. breadth-first by path). |
Not in v1: No Tree-sitter (SysML only); no clustering, embeddings, or web UI. Same Kuzu + MCP pattern and list/clean lifecycle.
sysml-v2-lsp: daltskin/sysml-v2-lsp. v1: stdio client (spawn LSP, documentSymbol/definition/references)—accepted. Library mode when/if stable API is exported. Grammar: daltskin/sysml-v2-grammar; update in LSP with make update-grammar.
| Item | Description |
|---|---|
| Objective | Path-only indexer of .sysml into a knowledge graph; MCP server (query, context, impact, rename, cypher); CLI (analyze, list, clean). |
| In scope | Index paths → Document + Symbol nodes and SysML edges (R1–R7); Kuzu storage; load order when config present (R7); list/clean; MCP tools and resources; error reporting and graph consistency on failure (R8). |
| Out of scope | Code-oriented features not needed for SysML (execution flows, language-specific parsers). |
| Success | One command indexes path(s); MCP answers query/context/impact/rename/cypher; list/clean work; schema exposed; errors reported; graph not left inconsistent on index/clean failure. |
Phases are sequential: 2 depends on 1; 3 and 4 depend on 2. Phase 3 (MCP) and 4 (CLI) can be parallelised once the core index + graph API exists.
| Phase | Focus | Done when |
|---|---|---|
| 1. Repo and pipeline | Repo, build, SysML parser, file discovery, load order. | Repo builds; parsing a folder of .sysml yields symbol list and relations (e.g. IN_PACKAGE, TYPES). |
| 2. Graph and index | Kuzu schema, GraphStore, Indexer, list/clean; R8 for index/clean. | indexDbGraph(paths) populates graph; list/clean work; on failure report and leave graph unchanged; Cypher returns expected nodes/edges. |
| 3. MCP server | MCP server sysmledgraph; tools + resources; R8 tool errors. | From Cursor: index, query, context, impact, rename (dry_run), cypher, list_indexed, clean_index; schema resource; errors in tool result. |
| 4. CLI and docs | CLI (analyze/list/clean), README, schema doc; R8 CLI exit/stderr. | User can index from terminal; failures exit non-zero and stderr; docs match behaviour; MCP setup documented. |
| 5. GenerateMap (iteration 2) | MCP tool generate_map: read from graph (first indexed path), produce Markdown (e.g. interconnection view: documents, nodes by label, edges). Output to caller (tool result) or optional file. R8: empty graph / no index → error in result. | From Cursor: call generate_map; get markdown in result; optionally save to file. Errors returned in tool result. |
Test strategy: Phase 1: unit tests for discovery and parser output shape. Phase 2: GraphStore unit tests and Indexer integration tests; list/clean; error cases (invalid path, clean of a non-indexed path). Phases 3–4: smoke tests for MCP and CLI; error cases (empty graph, invalid path). No full E2E in v1.
Decisions (made)
- Storage: Global storage root (default `~/.sysmledgraph`, or env `SYSMEDGRAPH_STORAGE_ROOT`); `registry.json` with `paths: string[]`; one DB per indexed path at `db/{sanitized}.kuzu`. See sysmledgraph-GITNEXUS_ANALYSIS.md.
- Parser: v1 uses stdio client (accepted); library mode if stable API available later.
- Symbol→graph mapping: Document the mapping from LSP kinds to node labels and edge types, both in code and in the README/schema doc.
- R8: MCP = structured error in tool result; CLI = non-zero exit and stderr.
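A minimal sketch of the storage decision above, assuming a simple sanitisation rule. The plan only states the `db/{sanitized}.kuzu` layout and the `paths: string[]` registry shape; the rule in `sanitizePath` and the helper names (`dbPathFor`, `registerPath`, `unregisterPath`) are assumptions for illustration.

```typescript
// Registry shape from the plan: registry.json holds `paths: string[]`.
interface Registry {
  paths: string[];
}

// ASSUMPTION: the plan does not give the sanitisation rule.
// Here every character outside [A-Za-z0-9] becomes "_".
function sanitizePath(indexedPath: string): string {
  return indexedPath.replace(/[^A-Za-z0-9]/g, "_");
}

// One DB per indexed path at db/{sanitized}.kuzu under the storage root.
function dbPathFor(storageRoot: string, indexedPath: string): string {
  const root = storageRoot.replace(/\/+$/, ""); // drop trailing slashes
  return `${root}/db/${sanitizePath(indexedPath)}.kuzu`;
}

// Adding an already-registered path is a no-op, so re-indexing stays idempotent.
function registerPath(registry: Registry, indexedPath: string): Registry {
  if (registry.paths.includes(indexedPath)) return registry;
  return { paths: [...registry.paths, indexedPath] };
}

// Used by clean: removes the path and returns a new registry object.
function unregisterPath(registry: Registry, indexedPath: string): Registry {
  return { paths: registry.paths.filter((p) => p !== indexedPath) };
}
```

Note that this particular rule can map two different paths to the same file name (for example `a/b` and `a_b`); a real implementation would need a collision-free scheme.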
Risks and mitigations
| Risk | Mitigation |
|---|---|
| sysml-v2-lsp no programmatic API | v1: stdio client (spawn LSP, documentSymbol/definition/references)—accepted. |
| Load order wrong → broken refs | Use config.yaml model_files when present; else deterministic; document rule. |
| Schema drift | Define schema in one place (code); generate or copy for resource and README. |
| Index/clean fails mid-way | Report and do not commit partial state; leave previous graph unchanged (R8). |
| Kuzu DB lock | One process per DB. Per-process cache; document that CLI and MCP should not use same storage concurrently. Report lock errors (R8). See sysmledgraph-MODEL_EXAMINATION_ERROR_CONDITIONS.md §6. |
| Worker threads | v1: single-threaded (event loop); indexer sequential; one GraphStore connection per process. If worker threads added later, writes must stay serialized; see §7 (OOP and threads) in prior plan if needed. |
OOP: Indexer, GraphStore, McpServer, Cli map to modules/classes; ports to interfaces. No worker threads in v1; keeps Kuzu single-threaded.
Phase 1 – Repo and pipeline
- Create implementation location (codebase repo). Layout per sysmledgraph-CODEBASE_STRUCTURE.md. Node/TypeScript: `src/`, `bin/`, `mcp/`, `test/`, `docs/`. Schema in code (e.g. `src/graph/schema.ts`) is an accepted layout variant (no top-level `schema/` required). Dependencies: Kuzu, @modelcontextprotocol/sdk, commander, fast-glob. Pin Node 20+; document sysml-v2-lsp version.
- Build and test (e.g. `tsc` + vitest). Scripts: `build`, `test`, optional `analyze` stub.
- SysML parsing: sysml-v2-lsp via stdio client (v1 accepted). Parse files → symbol tables → map to graph node/edge types.
- File discovery: Given path(s), list `.sysml` (and optionally `.kerml`). If `config.yaml` at path, read `model_files` and `model_dir` for load order.
- Verify: Parse sample `.sysml`; output symbol list and relations. Document symbol→graph mapping (LSP kinds → node labels and edge types).
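The load-order rule in the file discovery step can be sketched as a pure function. This is one possible reading, not the codebase's implementation: it assumes `model_files` entries are loaded first in their listed order, that remaining discovered files follow breadth-first by path, and that YAML parsing happens elsewhere. `resolveLoadOrder` and `breadthFirst` are hypothetical names.

```typescript
// Depth of a path = number of "/" separators; used for breadth-first ordering.
function depth(p: string): number {
  return p.split("/").length - 1;
}

// Deterministic fallback: shallower files first, then lexicographic by path.
function breadthFirst(files: string[]): string[] {
  return [...files].sort((a, b) => depth(a) - depth(b) || (a < b ? -1 : a > b ? 1 : 0));
}

// modelFiles: entries already read from config.yaml `model_files`
// (ASSUMPTION: relative paths in the same form as `discovered`).
function resolveLoadOrder(discovered: string[], modelFiles?: string[]): string[] {
  if (!modelFiles || modelFiles.length === 0) return breadthFirst(discovered);
  const found = new Set(discovered);
  // Keep config order, drop duplicates and entries that were not discovered.
  const listed = [...new Set(modelFiles)].filter((f) => found.has(f));
  const listedSet = new Set(listed);
  const rest = breadthFirst(discovered.filter((f) => !listedSet.has(f)));
  return [...listed, ...rest];
}
```

Sorting a copy rather than the input keeps discovery output reusable, and the same input always yields the same order, which is what the "deterministic" fallback requires.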
Phase 2 – Graph and index
- Graph schema in Kuzu: node labels (Document, Package, PartDef, PartUsage, …); edge types (IN_DOCUMENT, IN_PACKAGE, PARENT, TYPES, …) per design in SYSMLEDGRAPH_MCP.md and codebase docs. Create in code or migration.
- GraphStore: open/create Kuzu DB. API: `addDocument(path, indexedAt?)`, `addSymbol(label, props)`, `addEdge(from, to, type)`, `getConnection()`. Storage: global root + registry + DB per path.
- Indexer: discovery → load order → parse → map → write. Multiple roots; define re-index behaviour (replace or merge; document). R8: On failure report and leave graph unchanged.
- list (registry); clean (delete DB, update registry). R8: On failure report; graph unchanged.
- Verify: Index a project path; list; run Cypher; confirm nodes/edges.
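The GraphStore API and the R8 rule (report and leave the graph unchanged) can be illustrated with an in-memory stand-in. This sketch does not use Kuzu: method names follow the plan, `getConnection()` is omitted because there is no database connection to return, and `MemoryGraphStore`, `reindex` and the "replace" re-index behaviour are assumptions chosen for the example.

```typescript
type SymbolProps = { id: string; name: string; path: string };
type Edge = { from: string; to: string; type: string };

// Method names follow the GraphStore API listed in the plan.
interface GraphStore {
  addDocument(path: string, indexedAt?: string): void;
  addSymbol(label: string, props: SymbolProps): void;
  addEdge(from: string, to: string, type: string): void;
}

// In-memory stand-in (NOT the Kuzu-backed store); documents are keyed by path.
class MemoryGraphStore implements GraphStore {
  readonly documents = new Map<string, string>();
  readonly symbols = new Map<string, { label: string; props: SymbolProps }>();
  readonly edges: Edge[] = [];

  addDocument(path: string, indexedAt: string = new Date().toISOString()): void {
    this.documents.set(path, indexedAt);
  }
  addSymbol(label: string, props: SymbolProps): void {
    this.symbols.set(props.id, { label, props });
  }
  addEdge(from: string, to: string, type: string): void {
    const known = (id: string) => this.symbols.has(id) || this.documents.has(id);
    if (!known(from) || !known(to)) {
      throw new Error(`edge ${type} references unknown node: ${from} -> ${to}`);
    }
    this.edges.push({ from, to, type });
  }
}

type ReindexResult =
  | { ok: true; store: MemoryGraphStore }
  | { ok: false; error: string; store: MemoryGraphStore };

// "Replace" re-index: write into a fresh staging store and swap only on success,
// so a failure mid-way leaves the previous graph untouched (R8).
function reindex(previous: MemoryGraphStore, write: (staging: GraphStore) => void): ReindexResult {
  const staging = new MemoryGraphStore();
  try {
    write(staging);
    return { ok: true, store: staging };
  } catch (e) {
    const error = e instanceof Error ? e.message : String(e);
    return { ok: false, error, store: previous };
  }
}
```

With an on-disk database the same idea would likely map to a transaction or to writing a new DB file and renaming it into place; which of these fits Kuzu best is left open here.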
Phase 3 – MCP server
- MCP server (stdio); server name sysmledgraph.
- Tools: indexDbGraph (path/paths), query, context, impact, rename (dry_run), cypher, list_indexed, clean_index; generate_map (iteration 2). R8: structured error in tool result.
- Resources: `sysmledgraph://context` (stats, staleness), `sysmledgraph://schema`. Optional: per-path resource.
- Verify: From Cursor: index, query, context, impact, rename dry_run, schema resource; errors returned.
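For the R8 requirement on tools, a structured error in the tool result might look like the sketch below. The `content` plus `isError` shape mirrors an MCP tool call result; `ToolOutcome`, `toToolResult`, `requireIndexed` and the error wording are hypothetical, and the sketch deliberately avoids importing the SDK so it stays self-contained.

```typescript
// Internal outcome type (hypothetical): every tool handler returns one of these.
type ToolOutcome<T> = { ok: true; data: T } | { ok: false; error: string };

// Minimal local copy of the MCP tool-result shape: text content plus an error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError: boolean;
}

// Serialises the outcome into the tool result so errors reach the caller
// as data instead of being thrown across the protocol boundary.
function toToolResult<T>(outcome: ToolOutcome<T>): ToolResult {
  return {
    content: [{ type: "text", text: JSON.stringify(outcome) }],
    isError: !outcome.ok,
  };
}

// Hypothetical guard for read tools: they need at least one indexed path.
function requireIndexed(paths: string[]): ToolOutcome<string> {
  const [first] = paths;
  return first !== undefined
    ? { ok: true, data: first }
    : { ok: false, error: "No indexed path; run indexDbGraph first." };
}
```

A read tool such as query could call `requireIndexed(registry.paths)` first and return `toToolResult(...)` of the failure immediately, which keeps the error path identical across tools.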
Phase 4 – CLI and docs
- CLI: analyze (index path(s)), list, clean (path optional). R8: non-zero exit, stderr.
- README: install, usage, env/config, storage, schema summary, MCP setup.
- Verify: CLI analyze/list/clean; docs match behaviour.
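On the CLI side, R8 asks for a non-zero exit and a message on stderr. A small sketch of that mapping follows, with the writers injected so it can be tested without a real process. `CliOutcome`, `CliIo` and `report` are hypothetical names, and exit code 1 is an assumption (the plan only requires non-zero).

```typescript
type CliOutcome = { ok: true; message: string } | { ok: false; error: string };

// Injected writers so the function can be exercised without touching the process.
interface CliIo {
  stdout(line: string): void;
  stderr(line: string): void;
}

// Returns the exit code: 0 on success, 1 on failure (ASSUMPTION: any non-zero
// value would satisfy R8). Failures are written to stderr only.
function report(outcome: CliOutcome, io: CliIo): number {
  if (outcome.ok) {
    io.stdout(outcome.message);
    return 0;
  }
  io.stderr(`sysmledgraph: ${outcome.error}`);
  return 1;
}
```

In a Node entry point this could be wired as `process.exitCode = report(result, { stdout: console.log, stderr: console.error })`, which lets pending output flush before the process ends.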
Phase 5 – GenerateMap (iteration 2)
- generate_map MCP tool: no required params; optional `output_path` (if server can write). Reads graph from first indexed path; produces Markdown: sections for Documents (path, id), Nodes by label (name, path, id), Edges (from, type, to). Returns `{ ok: true, markdown }` or `{ ok: false, error }`. R8: no indexed path or empty graph → error in result.
- Verify: Index a path; call generate_map; confirm markdown in result; optionally save to .md file.
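The generate_map behaviour described above can be sketched as a pure renderer over a graph snapshot. The section names and the `{ ok, markdown }` / `{ ok, error }` result come from the plan; the exact Markdown layout, the `GraphSnapshot` type, the `load` callback and the error messages are assumptions, and writing to `output_path` is left out.

```typescript
type DocumentRow = { id: string; path: string };
type NodeRow = { id: string; label: string; name: string; path: string };
type EdgeRow = { from: string; type: string; to: string };
type GraphSnapshot = { documents: DocumentRow[]; nodes: NodeRow[]; edges: EdgeRow[] };
type MapResult = { ok: true; markdown: string } | { ok: false; error: string };

// `load` is a hypothetical callback that reads the graph for one indexed path.
function generateMap(indexedPaths: string[], load: (path: string) => GraphSnapshot): MapResult {
  const [first] = indexedPaths; // tools use the first indexed path
  if (first === undefined) return { ok: false, error: "No indexed path." };

  const graph = load(first);
  if (graph.documents.length === 0 && graph.nodes.length === 0) {
    return { ok: false, error: `Graph for ${first} is empty.` };
  }

  const lines: string[] = [`# Interconnection view: ${first}`, "", "## Documents", ""];
  for (const d of graph.documents) lines.push(`- ${d.path} (${d.id})`);

  // Group nodes by label; labels are sorted so the output is stable.
  const byLabel = new Map<string, NodeRow[]>();
  for (const n of graph.nodes) byLabel.set(n.label, [...(byLabel.get(n.label) ?? []), n]);
  lines.push("", "## Nodes by label");
  for (const label of [...byLabel.keys()].sort()) {
    lines.push("", `### ${label}`, "");
    for (const n of byLabel.get(label) ?? []) lines.push(`- ${n.name} (${n.path}, ${n.id})`);
  }

  lines.push("", "## Edges", "");
  for (const e of graph.edges) lines.push(`- ${e.from} -[${e.type}]-> ${e.to}`);
  return { ok: true, markdown: lines.join("\n") };
}
```

Keeping rendering separate from graph access means the same function can be unit-tested with a literal snapshot and later fed from Cypher query results.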
- Index a real project path (e.g. `sysml-v2-models/projects/sysmledgraph`).
- MCP: query, context, impact; rename dry_run; schema resource; results match expected symbols/relations.
- CLI: analyze, list, clean; docs match behaviour.
- R8: Invalid path index, clean non-indexed path → error to caller (MCP result or CLI stderr/exit); graph unchanged. Read tools return clear error when graph empty or symbol missing.
- GenerateMap (iteration 2): Call generate_map after indexing; markdown in result; no indexed path → error in result.
- DetectChanges (model action): Git diff → affected symbols. Not in codebase yet.
- GenerateMap: In scope for iteration 2 (Phase 5, steps 18–19).
- Multi-path semantics (documented): One DB per indexed path. Tools that need a graph use the first indexed path; no merged view. Document and test when multiple roots are indexed.
- Per-path MCP resource: Optional; deferred.
- Model: deploy-sysmledgraph.sysml, requirements-sysmledgraph.sysml, behaviour-sysmledgraph.sysml. Outputs: sysmledgraph-CODEBASE_STRUCTURE.md, sysmledgraph-INTERCONNECTION_VIEW.md, sysmledgraph-MODEL_EXAMINATION_ERROR_CONDITIONS.md, sysmledgraph-GITNEXUS_ANALYSIS.md.
- Design: SYSMLEDGRAPH_MCP.md (parent SystemDesign repo). Workflow: modelbase-development-WORKFLOW.md, PLAN_TEMPLATE.md.
- Implement report: Codebase delivers combined report at `docs/IMPLEMENT_REPORT.md` (iteration block, alignment, gaps, traceability). Separate docs: `docs/MODELBASE_ALIGNMENT.md`, `docs/GAPS.md`, `docs/IMPLEMENTATION_PLAN.md`. Template: REPORT_TEMPLATE.md. v1 report received: Phases 1–4 done; plan updated accordingly.
- Parser: daltskin/sysml-v2-lsp; grammar daltskin/sysml-v2-grammar.
Context7 (MCP): /modelcontextprotocol/typescript-sdk (MCP server, tools, resources); /tj/commander.js (CLI). Kuzu: kuzudb.com/docs.