
Commit 8957ef2

docs: update CHANGELOG and README with LM Studio embedding improvements
CHANGELOG updates:
- Added new section documenting LM Studio provider fixes
- Documented `with_auto_from_env()` support for LM Studio
- Documented the `embeddings-lmstudio` feature flag addition
- Documented architectural consolidation to a single config path
- Explained impact: LM Studio now works in all code paths

README updates:
- Added LM Studio as an explicit embedding provider option
- Added side-by-side comparison of the Ollama and LM Studio providers
- Updated LM Studio setup with new build commands (Makefile + feature flags)
- Added environment variable configuration option
- Fixed LM Studio URL to include the /v1 endpoint
- Improved clarity on supported embedding models for LM Studio

Both files now accurately reflect the current state of LM Studio support.
1 parent 14fbee6 commit 8957ef2

2 files changed

Lines changed: 42 additions & 6 deletions


CHANGELOG.md

Lines changed: 11 additions & 0 deletions
````diff
@@ -9,6 +9,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+#### **LM Studio Embedding Provider - Full Environment Variable Support**
+- **Fixed `with_auto_from_env()`** to support LM Studio provider (previously only supported Jina and Ollama)
+- **Added `embeddings-lmstudio` feature flag** to MCP crate for explicit LM Studio support
+- **Exposed in build scripts**: Added to `build-mcp-autoagents` and `build-mcp-http` Makefile targets
+- **Environment variable detection**: `CODEGRAPH_EMBEDDING_PROVIDER=lmstudio` now properly initializes provider
+- **Architectural improvement**: Consolidates embedding initialization to single code path
+  - Symbol resolution now uses `with_config()` instead of `with_auto_from_env()`
+  - Eliminates duplicate initialization logic and configuration inconsistencies
+  - Single source of truth for all embedding configuration
+- **Impact**: LM Studio embeddings now work in all code paths (main indexing, symbol resolution, API)
+
 #### **Fast ML Code Enhancement (Always-On)**
 - **Aho-Corasick pattern matching** for sub-microsecond multi-pattern code analysis (50-500ns per file)
   - Detects common patterns: `use`, `impl`, `class`, `extends`, `async fn`, `trait`, `import`, etc.
````
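The `with_auto_from_env()` fix above amounts to recognizing one more provider name when reading the environment. A minimal Rust sketch of that selection logic (the enum and function names here are illustrative, not the actual codegraph API):

```rust
// Illustrative sketch of env-based provider selection; names are
// hypothetical and do not match codegraph's real types.
#[derive(Debug, PartialEq)]
enum EmbeddingProvider {
    Ollama,
    LmStudio,
    Jina,
}

fn provider_from_env(value: &str) -> Option<EmbeddingProvider> {
    match value.to_ascii_lowercase().as_str() {
        "ollama" => Some(EmbeddingProvider::Ollama),
        "jina" => Some(EmbeddingProvider::Jina),
        // Before this fix, "lmstudio" fell through to the catch-all arm
        // and the provider was never initialized.
        "lmstudio" => Some(EmbeddingProvider::LmStudio),
        _ => None, // unknown values fall back to the default path
    }
}

fn main() {
    assert_eq!(provider_from_env("lmstudio"), Some(EmbeddingProvider::LmStudio));
    assert_eq!(provider_from_env("Ollama"), Some(EmbeddingProvider::Ollama));
    assert_eq!(provider_from_env("unknown"), None);
    println!("provider selection ok");
}
```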

README.md

Lines changed: 31 additions & 6 deletions
````diff
@@ -23,18 +23,30 @@ CodeGraph indexes your source code to a graph database, creates semantic embeddi
 
 ### Local Embeddings & Reranking (SurrealDB)
 
-CodeGraph now writes Ollama/LM Studio embeddings directly into SurrealDB's dedicated HNSW columns. Pick the model you want and set the matching env vars before running `codegraph index`:
+CodeGraph supports multiple local embedding providers (Ollama, LM Studio, ONNX) and writes embeddings directly into SurrealDB's dedicated HNSW columns. Pick the provider you want and set the matching env vars before running `codegraph index`:
 
+**Option 1: Ollama**
 ```bash
 export CODEGRAPH_EMBEDDING_PROVIDER=ollama
 export CODEGRAPH_EMBEDDING_MODEL=qwen3-embedding:0.6b  # or all-mini-llm, qwen3-embedding:4b, embeddinggemma, etc.
 export CODEGRAPH_EMBEDDING_DIMENSION=1024  # 384, 768, 1024, 1536, 2048, 2560, 3072 or 4096 dimensions supported
+```
+
+**Option 2: LM Studio (OpenAI-compatible)**
+```bash
+export CODEGRAPH_EMBEDDING_PROVIDER=lmstudio
+export CODEGRAPH_LMSTUDIO_MODEL=jina-embeddings-v3  # or jina-embeddings-v4, qwen3-embedding-0.6b, nomic-embed-text-v1.5, etc.
+export CODEGRAPH_LMSTUDIO_URL=http://localhost:1234/v1  # Default LM Studio endpoint
+export CODEGRAPH_EMBEDDING_DIMENSION=1024  # Auto-detected for 20+ models, or set manually
+```
 
-# Optional local reranking (LM Studio exposes an OpenAI-compatible reranker endpoint)
+**Optional local reranking:**
+```bash
+# LM Studio exposes an OpenAI-compatible reranker endpoint
 export CODEGRAPH_RERANKING_PROVIDER=lmstudio
 ```
 
-We automatically route embeddings to `embedding_384`, `embedding_768`, `embedding_1024`, `embedding_2048`, `embedding_2056`, or `embedding_4096` and keep reranking disabled unless a provider is configured.
+We automatically route embeddings to `embedding_384`, `embedding_768`, `embedding_1024`, `embedding_2048`, `embedding_2560`, or `embedding_4096` columns and keep reranking disabled unless a provider is configured.
 
 ---
 
````
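The column routing described above is essentially a dimension-to-column lookup. A hypothetical Rust sketch (the real mapping lives in codegraph's SurrealDB layer; this function name is made up for illustration):

```rust
// Illustrative sketch: map an embedding dimension to its dedicated
// SurrealDB HNSW column, mirroring the routing sentence in the README.
fn embedding_column(dimension: usize) -> Option<&'static str> {
    match dimension {
        384 => Some("embedding_384"),
        768 => Some("embedding_768"),
        1024 => Some("embedding_1024"),
        2048 => Some("embedding_2048"),
        2560 => Some("embedding_2560"),
        4096 => Some("embedding_4096"),
        _ => None, // no dedicated column for other dimensions
    }
}

fn main() {
    assert_eq!(embedding_column(1024), Some("embedding_1024"));
    assert_eq!(embedding_column(2560), Some("embedding_2560"));
    assert_eq!(embedding_column(999), None);
    println!("routing ok");
}
```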

````diff
@@ -282,18 +294,23 @@ ollama_url = "http://localhost:11434"
 ```bash
 cd codegraph-rust
 
-# Build with OpenAI-compatible support (for LM Studio)
-cargo build --release --features "openai-compatible"
+# Build MCP server with LM Studio support (recommended)
+make build-mcp-autoagents
+
+# Or build manually with feature flags
+cargo build --release -p codegraph-mcp --features "ai-enhanced,autoagents-experimental,embeddings-lmstudio,codegraph-ai/openai-compatible"
 ```
 
 **Step 5: Configure**
 
+**Option A: Config file (recommended)**
+
 Create `~/.codegraph/config.toml`:
 ```toml
 [embedding]
 provider = "lmstudio"
 model = "jinaai/jina-embeddings-v4"
-lmstudio_url = "http://localhost:1234"
+lmstudio_url = "http://localhost:1234/v1"
 dimension = 2048
 
 [llm]
@@ -303,6 +320,14 @@ model = "lmstudio-community/DeepSeek-Coder-V2-Lite-Instruct-GGUF"
 lmstudio_url = "http://localhost:1234"
 ```
 
+**Option B: Environment variables**
+```bash
+export CODEGRAPH_EMBEDDING_PROVIDER=lmstudio
+export CODEGRAPH_LMSTUDIO_MODEL=jinaai/jina-embeddings-v4
+export CODEGRAPH_LMSTUDIO_URL=http://localhost:1234/v1
+export CODEGRAPH_EMBEDDING_DIMENSION=2048
+```
+
 **Step 6: Index and run**
 ```bash
 # Index your project
````
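The commit fixes the embedding URL to include the `/v1` suffix. A plausible reason, sketched in Rust under the assumption that the client appends the OpenAI-style resource path (`/embeddings`) to the configured base URL; the helper name here is hypothetical:

```rust
// Illustrative sketch: if the client joins "{base_url}/embeddings",
// the base must already end in "/v1" to hit LM Studio's
// OpenAI-compatible endpoint at /v1/embeddings.
fn embeddings_endpoint(base_url: &str) -> String {
    format!("{}/embeddings", base_url.trim_end_matches('/'))
}

fn main() {
    // With the corrected config value:
    assert_eq!(
        embeddings_endpoint("http://localhost:1234/v1"),
        "http://localhost:1234/v1/embeddings"
    );
    // The old value produced a path LM Studio does not serve:
    assert_eq!(
        embeddings_endpoint("http://localhost:1234"),
        "http://localhost:1234/embeddings"
    );
    println!("endpoint ok");
}
```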
