
Commit f167da2

feat: Add comprehensive documentation for CodeGraph crates including architecture, features, and usage
1 parent af1310c commit f167da2

16 files changed

Lines changed: 248 additions & 0 deletions


docs/README.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
1+
# CodeGraph Documentation
2+
3+
Welcome to the comprehensive documentation for the CodeGraph project.
4+
5+
## Architecture Visualization
6+
Explore the interactive [Architecture Diagram](architecture-visualization.html) to understand the system components and data flow.
7+
8+
## Crate Documentation
9+
10+
### Core & Data
11+
- [codegraph-core](crates/codegraph-core/README.md): Fundamental types and models (`CodeNode`, `ExtractionResult`).
12+
- [codegraph-graph](crates/codegraph-graph/README.md): Database Access Layer (SurrealDB).
13+
- [codegraph-zerocopy](crates/codegraph-zerocopy/README.md): Zero-copy serialization utilities.
14+
15+
### Processing & AI
16+
- [codegraph-parser](crates/codegraph-parser/README.md): Tree-sitter parsing and unified extraction.
17+
- [codegraph-vector](crates/codegraph-vector/README.md): Embeddings and chunking.
18+
- [codegraph-ai](crates/codegraph-ai/README.md): LLM provider abstractions.
19+
- [codegraph-cache](crates/codegraph-cache/README.md): Caching and read-ahead mechanisms.
20+
- [codegraph-concurrent](crates/codegraph-concurrent/README.md): Concurrency primitives.
21+
22+
### MCP Ecosystem
23+
- [codegraph-mcp](crates/codegraph-mcp/README.md): Main Orchestrator and Indexer.
24+
- [codegraph-mcp-server](crates/codegraph-mcp-server/README.md): Server binary and setup.
25+
- [codegraph-mcp-daemon](crates/codegraph-mcp-daemon/README.md): Background service and file watching.
26+
- [codegraph-mcp-tools](crates/codegraph-mcp-tools/README.md): Specific tool implementations.
27+
- [codegraph-mcp-core](crates/codegraph-mcp-core/README.md): Shared MCP traits.
28+
- [codegraph-mcp-autoagents](crates/codegraph-mcp-autoagents/README.md): Experimental agents.
29+
- [codegraph-mcp-rig](crates/codegraph-mcp-rig/README.md): Agent backend built on the Rig framework (not a testing rig).
30+
31+
## Specifications
32+
Detailed specs can be found in the [specifications](specifications/) directory.

docs/crates/codegraph-ai/README.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
# CodeGraph AI (`codegraph-ai`)
2+
3+
## Overview
4+
`codegraph-ai` provides high-level LLM integration capabilities, distinct from simple vector embeddings.
5+
6+
## Features
7+
- **LLM Providers**: Abstractions for Anthropic, OpenAI, and local LLMs.
8+
- **Completion & Chat**: Unified traits for text generation.
9+
- **Usage**: Handles complex tasks such as "Semantic Edge Resolution", where vector similarity alone isn't enough (e.g., determining whether a generic `handle_request` call maps to a specific controller).
10+
11+
## Configuration
12+
Supports switching providers via `codegraph.toml` or environment variables.
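
As an illustration of provider switching, here is a minimal sketch. It is not the crate's actual API: `ProviderKind`, `LlmProvider`, `EchoProvider`, the `CODEGRAPH_LLM_PROVIDER` variable name, and the "environment overrides file, default to local" precedence are all assumptions made for the example.

```rust
use std::env;

/// Hypothetical provider kinds; the real crate's types may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProviderKind {
    Anthropic,
    OpenAi,
    Local,
}

/// Hypothetical unified trait for text generation.
trait LlmProvider {
    fn name(&self) -> &'static str;
    fn complete(&self, prompt: &str) -> String;
}

/// Stand-in provider that echoes the prompt instead of calling a model.
struct EchoProvider(ProviderKind);

impl LlmProvider for EchoProvider {
    fn name(&self) -> &'static str {
        match self.0 {
            ProviderKind::Anthropic => "anthropic",
            ProviderKind::OpenAi => "openai",
            ProviderKind::Local => "local",
        }
    }

    fn complete(&self, prompt: &str) -> String {
        format!("[{}] {}", self.name(), prompt)
    }
}

/// Parse a provider name, ignoring case and surrounding whitespace.
fn parse_provider(value: &str) -> Option<ProviderKind> {
    match value.trim().to_ascii_lowercase().as_str() {
        "anthropic" => Some(ProviderKind::Anthropic),
        "openai" => Some(ProviderKind::OpenAi),
        "local" => Some(ProviderKind::Local),
        _ => None,
    }
}

/// Assumed precedence: environment value, then config-file value, then Local.
fn select_provider(env_value: Option<&str>, file_value: Option<&str>) -> ProviderKind {
    env_value
        .and_then(parse_provider)
        .or_else(|| file_value.and_then(parse_provider))
        .unwrap_or(ProviderKind::Local)
}

fn main() {
    // The variable name is hypothetical; "anthropic" stands in for a value read from codegraph.toml.
    let env_value = env::var("CODEGRAPH_LLM_PROVIDER").ok();
    let kind = select_provider(env_value.as_deref(), Some("anthropic"));
    let provider = EchoProvider(kind);
    println!("{}", provider.complete("hello"));
}
```

Keeping selection in a pure function (`select_provider`) separate from the environment lookup makes the precedence rule easy to test without touching process state.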
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# CodeGraph Cache (`codegraph-cache`)
2+
3+
## Overview
4+
Implements caching strategies to speed up incremental indexing and queries.
5+
6+
## Features
7+
- **Query Cache**: Caches results of expensive graph traversals or vector searches.
8+
- **Invalidation**: Logic to invalidate cache entries when files change (hooked into file watchers in the daemon).
9+
- **Read-ahead**: Experimental features for pre-fetching related graph nodes.
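
The query cache and file-based invalidation described above could be sketched roughly as follows. `QueryCache`, `CachedResult`, and their methods are hypothetical names for illustration; the sketch assumes each cached result records which source files it was derived from.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical cached query result plus the files it depends on.
struct CachedResult {
    node_ids: Vec<String>,
    source_files: HashSet<String>,
}

/// Hypothetical query cache keyed by a query string.
struct QueryCache {
    entries: HashMap<String, CachedResult>,
}

impl QueryCache {
    fn new() -> Self {
        QueryCache { entries: HashMap::new() }
    }

    /// Store a result together with the files that contributed to it.
    fn insert(&mut self, query: &str, node_ids: Vec<String>, source_files: &[&str]) {
        let source_files = source_files.iter().map(|f| f.to_string()).collect();
        self.entries
            .insert(query.to_string(), CachedResult { node_ids, source_files });
    }

    /// Return the cached node IDs for a query, if present.
    fn get(&self, query: &str) -> Option<&[String]> {
        self.entries.get(query).map(|e| e.node_ids.as_slice())
    }

    /// Evict every entry that depends on `path`; returns how many were dropped.
    fn invalidate_file(&mut self, path: &str) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, e| !e.source_files.contains(path));
        before - self.entries.len()
    }
}

fn main() {
    let mut cache = QueryCache::new();
    cache.insert("callers:handle_request", vec!["n1".to_string()], &["src/api.rs"]);
    cache.insert("callers:parse", vec!["n2".to_string()], &["src/parser.rs"]);

    // A file watcher reporting a change to src/api.rs would trigger this call.
    let evicted = cache.invalidate_file("src/api.rs");
    println!("evicted {} entries", evicted);
    println!("parse still cached: {}", cache.get("callers:parse").is_some());
}
```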
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# CodeGraph Concurrent (`codegraph-concurrent`)
2+
3+
## Overview
4+
Provides concurrency primitives and structures to ensure safe, high-performance parallel processing.
5+
6+
## Features
7+
- **Graph Primitives**: Thread-safe graph structures for in-memory operations.
8+
- **Queues**: MPMC (Multi-Producer Multi-Consumer) and SPSC (Single-Producer Single-Consumer) channel implementations optimized for the indexing workload.
9+
- **Usage**: Heavily used by `codegraph-mcp` during the parallel indexing phase.
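
To make the MPMC pattern concrete, here is a minimal standard-library sketch. It is not the crate's implementation, which is described as optimized for the indexing workload; `WorkQueue` and `index_in_parallel` are hypothetical names, and a `Mutex` plus `Condvar` stands in for whatever queue the crate actually uses.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical blocking MPMC queue: a deque plus a "closed" flag behind one lock.
struct WorkQueue<T> {
    state: Mutex<(VecDeque<T>, bool)>,
    ready: Condvar,
}

impl<T> WorkQueue<T> {
    fn new() -> Self {
        WorkQueue { state: Mutex::new((VecDeque::new(), false)), ready: Condvar::new() }
    }

    /// Enqueue one item and wake a waiting consumer.
    fn push(&self, item: T) {
        let mut state = self.state.lock().unwrap();
        state.0.push_back(item);
        self.ready.notify_one();
    }

    /// Signal that no more items will arrive and wake all consumers.
    fn close(&self) {
        self.state.lock().unwrap().1 = true;
        self.ready.notify_all();
    }

    /// Block until an item is available; returns None once closed and drained.
    fn pop(&self) -> Option<T> {
        let mut state = self.state.lock().unwrap();
        loop {
            if let Some(item) = state.0.pop_front() {
                return Some(item);
            }
            if state.1 {
                return None;
            }
            state = self.ready.wait(state).unwrap();
        }
    }
}

/// Fan paths out from several producers to several consumers; returns the
/// processed paths in sorted order.
fn index_in_parallel(paths: Vec<String>, producers: usize, consumers: usize) -> Vec<String> {
    let queue: Arc<WorkQueue<String>> = Arc::new(WorkQueue::new());
    let done: Arc<Mutex<Vec<String>>> = Arc::new(Mutex::new(Vec::new()));

    let consumer_handles: Vec<_> = (0..consumers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let done = Arc::clone(&done);
            thread::spawn(move || {
                // A real worker would parse and index the file here.
                while let Some(path) = queue.pop() {
                    done.lock().unwrap().push(path);
                }
            })
        })
        .collect();

    let p = producers.max(1);
    let chunk = ((paths.len() + p - 1) / p).max(1);
    let producer_handles: Vec<_> = paths
        .chunks(chunk)
        .map(|batch| {
            let queue = Arc::clone(&queue);
            let batch = batch.to_vec();
            thread::spawn(move || {
                for path in batch {
                    queue.push(path);
                }
            })
        })
        .collect();

    for handle in producer_handles {
        handle.join().unwrap();
    }
    queue.close();
    for handle in consumer_handles {
        handle.join().unwrap();
    }

    let mut out = done.lock().unwrap().clone();
    out.sort();
    out
}

fn main() {
    let paths: Vec<String> = (1..=8).map(|i| format!("src/file_{i}.rs")).collect();
    let processed = index_in_parallel(paths, 2, 3);
    println!("processed {} files", processed.len());
}
```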
Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
# CodeGraph Core (`codegraph-core`)
2+
3+
## Overview
4+
`codegraph-core` is the foundational crate for the CodeGraph system. It defines the core data models, types, and traits that are shared across the entire ecosystem. It allows for a decoupled architecture where parsing, storage, and indexing can evolve independently.
5+
6+
## Key Components
7+
8+
### `CodeNode`
9+
The `CodeNode` struct is the atomic unit of the graph. It represents a file, a class, a function, or any other significant code entity.
10+
- **Deterministic IDs**: `CodeNode` uses a content-addressable or path-based deterministic ID system (`with_deterministic_id`) to ensure that re-indexing the same content yields the same ID, facilitating incremental updates.
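
A path-based deterministic ID can be sketched as below. This is only an illustration of the idea: the `CodeNode` fields, the signature of `with_deterministic_id`, and the choice of FNV-1a as the hash are assumptions, not the crate's actual definitions.

```rust
// Standard FNV-1a 64-bit parameters.
const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// FNV-1a over bytes; the algorithm is fixed, so output is stable across runs and machines.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET;
    for byte in bytes {
        hash ^= *byte as u64;
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Simplified, hypothetical stand-in for the real `CodeNode`.
#[derive(Debug, Clone, PartialEq)]
struct CodeNode {
    id: String,
    path: String,
    kind: String,
    name: String,
}

impl CodeNode {
    fn new(path: &str, kind: &str, name: &str) -> Self {
        CodeNode {
            id: String::new(),
            path: path.to_string(),
            kind: kind.to_string(),
            name: name.to_string(),
        }
    }

    /// Derive the ID from path, kind, and name so re-indexing yields the same value.
    /// (Hypothetical signature; only the method name appears in the documentation.)
    fn with_deterministic_id(mut self) -> Self {
        // NUL separators prevent ("ab", "c") and ("a", "bc") from producing the same key.
        let key = format!("{}\u{0}{}\u{0}{}", self.path, self.kind, self.name);
        self.id = format!("{:016x}", fnv1a(key.as_bytes()));
        self
    }
}

fn main() {
    let first = CodeNode::new("src/lib.rs", "function", "parse").with_deterministic_id();
    let second = CodeNode::new("src/lib.rs", "function", "parse").with_deterministic_id();
    println!("id = {}", first.id);
    println!("stable across re-index: {}", first.id == second.id);
}
```

A hand-written hash is used here instead of `std::collections::hash_map::DefaultHasher` because the standard library does not guarantee that hasher's algorithm across Rust releases, which would defeat the purpose of a persistent ID.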
11+
12+
### `ExtractionResult`
13+
This struct is the bridge between parsing and indexing.
14+
- **Decoupled Design**: It holds a list of `CodeNode`s and a list of `EdgeRelationship`s.
15+
- **Atomic Unit**: The parser produces a single `ExtractionResult` for a file, which contains everything needed to index that file.
16+
17+
### `EdgeRelationship`
18+
Represents a directed connection between nodes.
19+
- **Unresolved Edges**: Initially, edges might point to a string target (e.g., a function name) rather than a concrete Node ID. The `indexer` resolves these later.
20+
- **Types**: Defines relationship types like `Defines`, `Calls`, `Imports`.
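
The relationship between `ExtractionResult`, unresolved edges, and later resolution might look like the following sketch. All field names, `ResolvedEdge`, and `resolve_edges` are hypothetical, and resolving by exact symbol name is a simplifying assumption; the real indexer may use richer strategies.

```rust
use std::collections::HashMap;

/// Relationship types named in the documentation.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum EdgeType {
    Defines,
    Calls,
    Imports,
}

/// Simplified, hypothetical node: just an ID and a symbol name.
struct CodeNode {
    id: String,
    name: String,
}

/// Edge whose target is still a symbol name rather than a node ID.
struct EdgeRelationship {
    from: String,
    to: String,
    edge_type: EdgeType,
}

/// Everything the parser extracted from one file.
struct ExtractionResult {
    nodes: Vec<CodeNode>,
    edges: Vec<EdgeRelationship>,
}

/// Edge whose target has been mapped to a concrete node ID.
#[derive(Debug, PartialEq)]
struct ResolvedEdge {
    from: String,
    to: String,
    edge_type: EdgeType,
}

/// Resolve edge targets by exact name across all files.
/// Returns the resolved edges and the target names that could not be matched.
fn resolve_edges(results: &[ExtractionResult]) -> (Vec<ResolvedEdge>, Vec<String>) {
    let mut id_by_name: HashMap<&str, &str> = HashMap::new();
    for result in results {
        for node in &result.nodes {
            id_by_name.insert(node.name.as_str(), node.id.as_str());
        }
    }

    let mut resolved = Vec::new();
    let mut unresolved = Vec::new();
    for result in results {
        for edge in &result.edges {
            match id_by_name.get(edge.to.as_str()) {
                Some(id) => resolved.push(ResolvedEdge {
                    from: edge.from.clone(),
                    to: (*id).to_string(),
                    edge_type: edge.edge_type,
                }),
                None => unresolved.push(edge.to.clone()),
            }
        }
    }
    (resolved, unresolved)
}

fn main() {
    let parser_file = ExtractionResult {
        nodes: vec![CodeNode { id: "node-1".into(), name: "parse".into() }],
        edges: vec![],
    };
    let main_file = ExtractionResult {
        nodes: vec![CodeNode { id: "node-2".into(), name: "main".into() }],
        edges: vec![
            EdgeRelationship { from: "node-2".into(), to: "parse".into(), edge_type: EdgeType::Calls },
            EdgeRelationship { from: "node-2".into(), to: "missing_fn".into(), edge_type: EdgeType::Calls },
        ],
    };

    let (resolved, unresolved) = resolve_edges(&[parser_file, main_file]);
    println!("resolved: {:?}", resolved);
    println!("unresolved: {:?}", unresolved);
}
```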
21+
22+
## Coupling & Architecture
23+
This crate has **high afferent coupling** (many crates depend on it) but **low efferent coupling** (it depends on few things). This is by design. Changes here ripple through the system, so strict backward compatibility and careful design of `CodeNode` and `ExtractionResult` are required.
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# CodeGraph Graph (`codegraph-graph`)
2+
3+
## Overview
4+
`codegraph-graph` serves as the Data Access Layer (DAL) for the system, specifically targeting SurrealDB as the persistent store.
5+
6+
## Key Components
7+
8+
### `SurrealDbStorage`
9+
The main client struct for database interactions.
10+
- **Async Operations**: Optimized for high-throughput, concurrent writes.
11+
- **Graph Operations**: Methods to insert nodes, create edges, and query the graph.
12+
13+
### Schema & Models
14+
Defines the storage-optimized versions of core types.
15+
- **`CodeEdge`**: Represents a fully resolved edge in the database.
16+
- **`NodeEmbeddingRecord`**: Storage record for vector embeddings.
17+
18+
## Migration
19+
Includes utilities for applying SurrealDB schema files (`.surql`) to ensure the database structure matches the code expectations.
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
# CodeGraph MCP AutoAgents
2+
3+
Experimental crate for autonomous agents.
4+
Implements the "Observe-Thought-Action" loop using graph tools.
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
# CodeGraph MCP Core
2+
3+
Shared types, traits, and protocol definitions for the CodeGraph MCP ecosystem.
4+
See [codegraph-mcp](../codegraph-mcp/README.md) for the main server orchestration.
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
# CodeGraph MCP Daemon
2+
3+
Background service for the CodeGraph system.
4+
Handles:
5+
- File watching (using `notify`)
6+
- Incremental indexing triggers
7+
- Long-running state management
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
# CodeGraph MCP Rig Agent (`codegraph-mcp-rig`)
2+
3+
## Overview
4+
`codegraph-mcp-rig` provides an alternative agent backend for the CodeGraph MCP server, built on top of the [Rig](https://github.com/0xPlaygrounds/rig) framework. It is **not** a testing rig, but a fully functional agent implementation that orchestrates LLMs to solve complex tasks using the code graph.
5+
6+
## Purpose
7+
This crate serves as a robust, production-ready alternative to the experimental `codegraph-mcp-autoagents`. It leverages the `rig` library's abstractions for:
8+
- **Provider Abstraction**: Unified interface for OpenAI, Anthropic, Ollama, xAI, and generic OpenAI-compatible providers (like LM Studio).
9+
- **Agent Construction**: Builder pattern (`RigAgentBuilder`) to configure agents with specific system prompts, context tiers, and tool sets.
10+
- **Tool Integration**: Automatically exposes CodeGraph tools (dependency analysis, semantic search, etc.) to the LLM agent.
11+
12+
## Key Components
13+
14+
### `RigAgentBuilder`
15+
The core entry point for creating agents.
16+
- **Context Awareness**: Automatically configures token limits and "turn" counts based on the detected `ContextTier` (e.g., Small, Medium, Large, XLarge).
17+
- **Provider Support**: Feature-gated support for different LLM providers (`openai`, `anthropic`, `ollama`, `xai`).
18+
- **Tool Factory**: Uses `GraphToolFactory` to inject graph-aware tools into the agent's context.
19+
20+
### `RigExecutor`
21+
Handles the execution of agent queries.
22+
- Manages the conversation history.
23+
- Executes the tool use loop (Agent -> Tool -> Agent).
24+
- Returns the final answer to the MCP client.
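
The tool-use loop (Agent -> Tool -> Agent) can be sketched in plain Rust as follows. This does not use the Rig framework's API; `AgentStep`, `Model`, `Tool`, `ScriptedModel`, `run_agent`, and the `find_callers` tool are hypothetical stand-ins, and the turn limit mirrors the "turn" counts mentioned under `RigAgentBuilder`.

```rust
use std::collections::HashMap;

/// What the model asks for next: a tool call or a final answer.
enum AgentStep {
    CallTool { name: String, input: String },
    Final(String),
}

/// Hypothetical model interface: decide the next step from the conversation so far.
trait Model {
    fn next_step(&self, history: &[String]) -> AgentStep;
}

/// Hypothetical tool signature: text in, text out.
type Tool = Box<dyn Fn(&str) -> String>;

/// Run the Agent -> Tool -> Agent loop until a final answer or the turn limit.
fn run_agent(
    model: &dyn Model,
    tools: &HashMap<String, Tool>,
    query: &str,
    max_turns: usize,
) -> Option<String> {
    let mut history = vec![format!("user: {query}")];
    for _ in 0..max_turns {
        match model.next_step(&history) {
            AgentStep::Final(answer) => return Some(answer),
            AgentStep::CallTool { name, input } => {
                let output = match tools.get(&name) {
                    Some(tool) => tool(&input),
                    None => format!("unknown tool: {name}"),
                };
                // The tool result is fed back so the model sees it on the next turn.
                history.push(format!("tool {name}: {output}"));
            }
        }
    }
    None // Turn budget exhausted without a final answer.
}

/// Scripted stand-in for an LLM: call one tool, then answer from its output.
struct ScriptedModel;

impl Model for ScriptedModel {
    fn next_step(&self, history: &[String]) -> AgentStep {
        match history.last() {
            Some(last) if last.starts_with("tool ") => {
                AgentStep::Final(format!("answer based on {last}"))
            }
            _ => AgentStep::CallTool {
                name: "find_callers".to_string(),
                input: "handle_request".to_string(),
            },
        }
    }
}

/// Build a tool set with one fake graph tool.
fn demo_tools() -> HashMap<String, Tool> {
    let mut tools: HashMap<String, Tool> = HashMap::new();
    tools.insert(
        "find_callers".to_string(),
        Box::new(|symbol: &str| format!("2 callers of {symbol}")),
    );
    tools
}

fn main() {
    let tools = demo_tools();
    match run_agent(&ScriptedModel, &tools, "Who calls handle_request?", 4) {
        Some(answer) => println!("{answer}"),
        None => println!("no answer within the turn limit"),
    }
}
```

Bounding the loop with `max_turns` keeps a misbehaving model from calling tools indefinitely; the caller decides how to report an exhausted budget.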
25+
26+
## Usage
27+
This crate is typically used by `codegraph-mcp-server` when the `rig-experimental` feature is enabled. It allows the server to delegate complex "agentic" requests (like "Refactor this module" or "Explain the data flow") to a sophisticated Rig-based agent.
