From fde27dd350061197bc5dc4e8281fb46d6fc4bf8d Mon Sep 17 00:00:00 2001
From: Jordan Partridge
Date: Thu, 2 Jul 2026 21:49:58 -0700
Subject: [PATCH 1/2] docs: add README

---
 README.md | 339 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 213 insertions(+), 126 deletions(-)

diff --git a/README.md b/README.md
index e0ba52a..b240d6c 100644
--- a/README.md
+++ b/README.md
@@ -1,200 +1,287 @@
-# Knowledge CLI
+# Knowledge

 [![Sentinel Gate](https://github.com/conduit-ui/knowledge/actions/workflows/gate.yml/badge.svg)](https://github.com/conduit-ui/knowledge/actions/workflows/gate.yml)

-AI-powered knowledge base with semantic search, Qdrant vector storage, and Ollama intelligence.
+A semantic knowledge base that lives in your terminal — and talks directly to your AI tools.

-## What It Does
+Knowledge captures the technical decisions, gotchas, and context you accumulate while working, then hands the right pieces back to you (or to Claude Code) via semantic vector search. It's a [Laravel Zero](https://laravel-zero.com) CLI backed entirely by [Qdrant](https://qdrant.tech) — no relational database, no schema migrations, just vectors and JSON payloads. Every entry is namespaced to a project automatically, detected from your git repository.

-Captures technical decisions, learnings, and context from your work. Retrieves exactly what you need via semantic search — especially for AI pair programming with Claude Code.
+There are two ways in: the `know` command-line tool, and a local **MCP server** that exposes the same knowledge base as tools your AI agent can call directly.

 ```bash
-# Add knowledge
-./know add "Database Connection Fix" --content="Check .env before debugging migrations" --tags=debugging,database
+# Capture something worth remembering
+./know add "Database Connection Fix" \
+  --content="Check .env before debugging migrations" \
+  --tags=debugging,database

-# Semantic search
-./know search "how to fix database issues"
+# Find it again, by meaning rather than keyword
+./know search "how do I fix database issues"

-# Show entry details
-./know show
+# Load project context at the start of an AI session
+./know context
 ```

-## Architecture
+## Requirements

+- **PHP 8.2+**
+- **Composer**
+- **Docker** — for the Qdrant vector database (`make up` starts it for you)
+- **Redis** *(optional)* — query/embedding cache for sub-200ms reads
+- **Ollama** *(optional)* — background auto-tagging and categorization
+- An **embedding server** reachable at `QDRANT_EMBEDDING_SERVER` (default `http://localhost:8001`) for producing vectors
+
+## Installation
+
+Knowledge is a standalone CLI app rather than a library you pull into another project.
+
+```bash
+git clone https://github.com/conduit-ui/knowledge.git
+cd knowledge
+composer install
+cp .env.example .env
+```
+
+The entry point is the `know` binary in the project root:
+
+```bash
+./know list # show every available command
+```
+
+Optionally, build a single self-contained PHAR with [Box](https://github.com/box-project/box) (see `box.json`):
+
+```bash
+box compile # bundle app + vendor into a single PHAR
+```
+
+## Quick Start
+
+### 1. Start the vector database
+
+```bash
+make up # docker compose up -d — starts Qdrant
+```
+
+This brings up Qdrant on `http://localhost:6333` (HTTP) and `6334` (gRPC). Embeddings are generated through the [`the-shit/vector`](https://packagist.org/packages/the-shit/vector) client, which talks to the embedding server configured via `QDRANT_EMBEDDING_SERVER` — run that separately or point it at an existing one.
+
+`make status` health-checks the services; `make down` stops them; `make clean` also removes the data volume.
+
+### 2. Initialize the collection
+
+```bash
+./know install
 ```
-CLI (Laravel Zero)
-    ↓
-Tiered Search (narrow-to-wide retrieval across 4 tiers)
-    ↓
-Qdrant (Vector DB - all storage)
-    ├── Per-project collections (auto-detected from git)
-    └── Payload-based metadata (JSON)
-    ↓
-Redis (Cache layer - sub-200ms queries)
-    ↓
-Embedding Server (sentence-transformers)
-    ↓
-Ollama (optional - async auto-tagging via background queue)
-    ↓
-Remote Sync (optional - background sync to centralized server)
+
+### 3. Capture and retrieve
+
+```bash
+# Git context (repo, branch, commit, author) is detected automatically
+./know add "Fix Auth Timeout" \
+  --content="Increase token TTL in config/auth.php" \
+  --category=debugging --tags=auth,timeout
+
+# Skip git detection
+./know add "API Key Policy" --content="Store in vault, never in .env" --no-git
+
+# Semantic search, with optional metadata filters
+./know search "authentication timeout issues"
+./know search "flaky tests" --category=testing --priority=high --limit=5
+```
+
+## Core Concepts
+
+- **Pure vector storage.** There is no SQLite and there are no Eloquent models. Every entry is a point in a Qdrant collection: the vector drives search, and everything else (title, content, category, tags, confidence, git context) rides along in the JSON payload.
+- **Per-project namespaces.** `ProjectDetectorService` reads your git repo and routes entries into a project-specific collection. Most commands accept `--project=` to target another namespace and `--global` to search across all of them.
+- **Tiered retrieval.** `TieredSearchService` searches narrow-to-wide across four tiers (working context → recent → structured → archive) and returns early on confident matches, keeping latency low.
+- **Write gate.** `WriteGateService` filters low-quality and duplicate entries before they land, so the base stays signal-heavy. Use `--force` on `add` to bypass it.
+- **Confidence & staleness.** `EntryMetadataService` degrades confidence over time; search results flag entries as `[STALE]` when they haven't been verified recently. Run `validate ` to reaffirm one.
+- **Corrections, not overwrites.** `correct` supersedes an entry with a corrected version and propagates the fix to related, conflicting entries rather than destroying history.
+- **Background enhancement.** `add` stays fast; Ollama auto-tagging is queued to a file-based queue and processed later by `enhance:worker`.
+
+## AI Integration (MCP)
+
+Knowledge registers a local [MCP](https://modelcontextprotocol.io) server named `knowledge` in `routes/ai.php`:
+
+```php
+Mcp::local('knowledge', KnowledgeServer::class);
+```
+
+Start it as a local server for your MCP client (e.g. Claude Code):
+
+```bash
+./know mcp:start knowledge
 ```

-No SQLite. No schema migrations. Pure vector storage. Per-project isolation via auto-detected git namespaces.
+The server (`app/Mcp/Servers/KnowledgeServer.php`) exposes eight tools, all of which auto-detect the current project from git:
+
+| Tool | What it does |
+|------|--------------|
+| `recall` | Semantic vector search with tiered retrieval, ranked by relevance, confidence, and freshness |
+| `remember` | Capture a discovery — auto-detects git context, runs the write gate, checks for duplicates |
+| `correct` | Supersede wrong knowledge with a corrected version and propagate the fix |
+| `context` | Load project-relevant entries grouped by category; ideal at session start |
+| `stats` | Entry counts, project namespaces, and system health |
+| `search-code` | Semantic code search across indexed repositories |
+| `file-outline` | Symbol outline of a file — classes, methods, functions with hierarchy |
+| `symbol-lookup` | Look up a symbol by ID, optionally with its source code |
+
+Point your MCP client at the local server with a command such as `./know mcp:start knowledge`. Use `./know mcp:inspector knowledge` to test the connection interactively.

 ## Commands

-All commands support `--project=` to target a specific project namespace and `--global` to search across all projects. Project is auto-detected from the current git repository.
+Most knowledge commands accept `--project=` (target a namespace) and `--global` (span all namespaces); the project is auto-detected from git when omitted.

-### Core Knowledge
+### Knowledge

 | Command | Description |
 |---------|-------------|
-| `add` | Add a knowledge entry (auto-detects git context, async Ollama tagging) |
-| `search` | Semantic vector search with tiered narrow-to-wide retrieval |
-| `show ` | Display entry details |
-| `entries` | List entries with filters |
-| `update ` | Update an existing entry |
-| `validate ` | Mark entry as validated (boosts confidence) |
-| `archive ` | Soft-delete an entry |
-| `export ` | Export a single entry |
-| `export:all` | Bulk export all entries |
-| `correct` | Correct/update knowledge with multi-tier propagation |
+| `add {title}` | Add an entry (git-aware, queues async Ollama enhancement) |
+| `search {query?}` | Semantic vector search with metadata filters |
+| `show {id}` | Display an entry's full details |
+| `entries` | List entries with `--category` / `--priority` / `--status` / `--module` filters |
+| `update {id}` | Update an existing entry (`--add-tags` appends, `--tags` replaces) |
+| `validate {id}` | Reaffirm an entry, boosting effective confidence |
+| `archive {id}` | Soft-delete an entry (`--restore` to bring it back) |
+| `correct {id}` | Correct an entry, superseding the original and propagating the fix |
+| `export {id}` | Export one entry (`--format=markdown\|json`) |
+| `export:all` | Bulk-export all entries to a directory |

 ### Intelligence

 | Command | Description |
 |---------|-------------|
-| `context` | Load semantic session context for AI tools |
-| `insights` | AI-generated insights from your knowledge base |
-| `synthesize` | Generate daily synthesis of knowledge themes |
-| `stage` | Stage entries in daily log before permanent storage |
-| `promote` | Promote staged entries to permanent knowledge |
-| `enhance:worker` | Process the background Ollama enhancement queue |
+| `context` | Load semantic session context for AI tools, capped by `--max-tokens` |
+| `insights` | AI-generated insights — `--themes`, `--patterns`, or classify a single entry |
+| `synthesize` | Deduplicate, digest, and archive stale entries (all `--dry-run`-able) |
+| `stage` | Stage an entry in the daily log before permanent storage |
+| `promote` | Promote staged entries to permanent knowledge (`--auto`, `--date`, `--all`) |
+| `enhance:worker` | Process the background Ollama enhancement queue (`--once`, `--status`) |
+
+### Code Intelligence
+
+| Command | Description |
+|---------|-------------|
+| `index-code {path?}` | Index a codebase for semantic code search (`--incremental`, `--list`) |
+| `search-code {query}` | Semantic search over indexed symbols (`--show-source`, `--kind`, `--file`) |
+| `vectorize-code {repo}` | Vectorize tree-sitter symbols into Qdrant |
+| `reindex:all` | Incrementally re-index and vectorize all git repos under a base path |
+| `git:context` | Display the current git context |

 ### Infrastructure

 | Command | Description |
 |---------|-------------|
-| `install` | Initialize Qdrant collection |
-| `config` | Manage configuration |
-| `stats` | Analytics dashboard |
+| `install` | Initialize the Qdrant collection |
+| `config` | Manage configuration (`config get\|set\|list`) |
+| `stats` | Analytics dashboard for the knowledge base |
 | `search:status` | Search infrastructure health check |
-| `agent:status` | Dependency health checks (Qdrant, Redis, Ollama, Embeddings) |
-| `maintain` | Run maintenance tasks |
+| `agent:status` | Dependency health checks (Qdrant, Redis, Ollama, embeddings) |
+| `maintain` | Run maintenance passes over entries |
 | `projects` | List all project knowledge bases |
+| `daemon:install` | Install/manage systemd timers for background daemons |

 ### Services (Docker)

 | Command | Description |
 |---------|-------------|
-| `service:up` | Start Qdrant, Redis, embedding server |
-| `service:down` | Stop services |
-| `service:status` | Health check all services |
-| `service:logs` | View service logs |
+| `service:up` | Start the backing services |
+| `service:down` | Stop them |
+| `service:status` | Health-check all services |
+| `service:logs` | Tail service logs |

 ### Sync

 | Command | Description |
 |---------|-------------|
-| `sync` | Bidirectional sync (--push / --pull) |
-| `sync:remote` | Background sync to centralized remote server |
-| `sync:purge` | Purge sync queue |
-
-### Code Intelligence
+| `sync` | Bidirectional cloud sync (`--push` / `--pull` / `--full-sync`) |
+| `sync:remote` | Background sync to a centralized remote server |
+| `sync:purge` | Purge the local deletion/sync queue |

-| Command | Description |
-|---------|-------------|
-| `index-code` | Index codebase for semantic code search |
-| `search-code` | Semantic search across indexed code |
-| `git:context` | Display current git context |
-
-## Quick Start
+## Configuration

-### 1. Start Services
+Configuration comes from `.env` (see `.env.example`), the config files in `config/`, and an optional per-user override at `~/.knowledge/config.json` (merged in by `AppServiceProvider`).

-```bash
-# Docker compose (Qdrant + embedding server)
-make up
+```env
+# Vector database (Qdrant)
+QDRANT_ENABLED=true
+QDRANT_HOST=localhost
+QDRANT_PORT=6333
+EMBEDDING_PROVIDER=qdrant
+QDRANT_EMBEDDING_SERVER=http://localhost:8001

-# Or manually
-docker compose up -d
-```
+# Redis cache (optional — improves query speed)
+REDIS_HOST=localhost
+REDIS_PORT=6379

-This starts:
-- **Qdrant** on `http://localhost:6333` — Vector database
-- **Embedding Server** on `http://localhost:8001` — sentence-transformers (all-MiniLM-L6-v2)
+# Ollama LLM (optional — auto-tagging and query expansion)
+OLLAMA_ENABLED=false
+OLLAMA_HOST=localhost
+OLLAMA_PORT=11434
+OLLAMA_MODEL=llama3.2:3b

-### 2. Initialize
+# Centralized sync (optional — multi-machine sharing)
+REMOTE_SYNC_ENABLED=false
+# REMOTE_SYNC_URL=http://your-server:8080
+# REMOTE_SYNC_TOKEN=your-token

-```bash
-./know install
+# Cloud API sync (optional)
+# PREFRONTAL_API_URL=http://your-api:8080
+# PREFRONTAL_API_TOKEN=your-token
 ```

-### 3. Add Knowledge
+Other notable keys (with defaults, from `config/search.php`):

-```bash
-# With automatic git context detection
-./know add "Fix Auth Timeout" --content="Increase token TTL in config/auth.php" --tags=auth,debugging
+| Key | Default | Purpose |
+|-----|---------|---------|
+| `EMBEDDING_DIMENSION` | `1024` | Vector size (`1024` for bge-large, `384` for all-MiniLM-L6-v2) |
+| `SEARCH_MIN_SIMILARITY` | `0.3` | Minimum similarity score for a match |
+| `SEARCH_MAX_RESULTS` | `20` | Default result cap |
+| `QDRANT_COLLECTION` | `knowledge` | Base collection name |
+| `QDRANT_CACHE_TTL` | `604800` | Embedding cache lifetime (7 days) |
+| `HYBRID_SEARCH_ENABLED` | `false` | Combine dense + sparse (BM25) vectors via Reciprocal Rank Fusion |

-# Skip git detection
-./know add "API Keys" --content="Store in vault, never in .env" --no-git
-```
+### Remote server (production)

-### 4. Search
+`docker-compose.remote.yml` binds services to a specific network interface (Tailscale, VPN, or LAN) so several machines can sync against one centralized knowledge base. Last-write-wins conflict resolution is based on `updated_at`.

-```bash
-# Semantic search
-./know search "authentication timeout issues"
+## Testing

-# With filters
-./know search --category=debugging --tag=auth --limit=5
+```bash
+composer test # Pest, run in parallel
+composer test-coverage # with a coverage report
 ```

-## Configuration
+Run a single file or filter:

-`.env` file:
-
-```env
-QDRANT_HOST=localhost
-QDRANT_PORT=6333
-EMBEDDING_SERVER_URL=http://localhost:8001
-REDIS_HOST=localhost
-REDIS_PORT=6379
-OLLAMA_HOST=http://localhost:11434
+```bash
+vendor/bin/pest tests/Feature/Commands/KnowledgeSearchCommandTest.php
 ```

-### Remote Server (Production)
-
-Uses `docker-compose.remote.yml` to bind services to a specific network interface (e.g. Tailscale, VPN, LAN) for centralized knowledge sync across multiple machines.
-
 ## Development

 ```bash
-composer install # Install dependencies
-composer test # Run tests (Pest, parallel)
-composer test-coverage # Run with coverage report
-composer format # Format code (Laravel Pint)
-composer analyse # Static analysis (PHPStan level 8)
+composer install # install dependencies
+composer format # Laravel Pint (Laravel preset)
+composer analyse # PHPStan level 8, strict rules
+composer test # Pest, parallel
 ```

-## Quality Standards
-
-- **Test Coverage**: 95% minimum (enforced by Sentinel Gate CI)
-- **Static Analysis**: PHPStan Level 8 with strict rules
-- **Code Style**: Laravel Pint (Laravel preset)
-- **CI/CD**: Sentinel Gate auto-merges PRs after certification
+Quality gates are enforced in CI by the **Sentinel Gate** workflow: **95% coverage minimum**, **PHPStan level 8**, and Pint formatting. Keep the suite green — the gate certifies PRs before merge.

 ## Stack

-- **Runtime**: PHP 8.2+, Laravel Zero
-- **Vector DB**: Qdrant (Rust)
-- **Cache**: Redis
-- **Embeddings**: sentence-transformers (Python/FastAPI)
-- **LLM**: Ollama (optional, for auto-tagging and query expansion)
-- **HTTP Client**: Saloon
-- **Testing**: Pest
-- **CI**: GitHub Actions (Sentinel Gate)
+- **Runtime** — PHP 8.2+, Laravel Zero 12
+- **Vector DB** — Qdrant, via the [`the-shit/vector`](https://packagist.org/packages/the-shit/vector) connector and embedding client
+- **Cache** — Redis
+- **LLM** — Ollama (optional), for auto-tagging and query expansion
+- **AI protocol** — [`laravel/mcp`](https://github.com/laravel/mcp) local server
+- **HTTP** — Saloon (used internally by the Qdrant and code-indexing services)
+- **Testing** — Pest 4
+- **CI** — GitHub Actions (Sentinel Gate)
+
+## Contributing
+
+The workflow is test-first: write a failing test, make it pass, then run `composer format` and `composer analyse` before pushing. PRs must clear the Sentinel Gate (95% coverage, PHPStan level 8, Pint) to merge.

 ## License

-MIT License. See [LICENSE](LICENSE) for details.
+Released under the MIT License.

From 624159bbcf69013e13002463adb253eb813d3fa3 Mon Sep 17 00:00:00 2001
From: Jordan Partridge
Date: Thu, 2 Jul 2026 21:54:42 -0700
Subject: [PATCH 2/2] docs: remove Sentinel Gate references from README

---
 README.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index b240d6c..6c93f3c 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,5 @@
 # Knowledge

-[![Sentinel Gate](https://github.com/conduit-ui/knowledge/actions/workflows/gate.yml/badge.svg)](https://github.com/conduit-ui/knowledge/actions/workflows/gate.yml)
-
 A semantic knowledge base that lives in your terminal — and talks directly to your AI tools.

 Knowledge captures the technical decisions, gotchas, and context you accumulate while working, then hands the right pieces back to you (or to Claude Code) via semantic vector search. It's a [Laravel Zero](https://laravel-zero.com) CLI backed entirely by [Qdrant](https://qdrant.tech) — no relational database, no schema migrations, just vectors and JSON payloads. Every entry is namespaced to a project automatically, detected from your git repository.
@@ -265,7 +263,7 @@ composer analyse # PHPStan level 8, strict rules
 composer test # Pest, parallel
 ```

-Quality gates are enforced in CI by the **Sentinel Gate** workflow: **95% coverage minimum**, **PHPStan level 8**, and Pint formatting. Keep the suite green — the gate certifies PRs before merge.
+Quality gates are enforced in CI: **95% coverage minimum**, **PHPStan level 8**, and Pint formatting. Keep the suite green — automated PR review runs via [Epic Werkflow](https://github.com/the-shit/epic-werkflow).

 ## Stack

@@ -276,11 +274,11 @@ Quality gates are enforced in CI by the **Sentinel Gate** workflow: **95% covera
 - **AI protocol** — [`laravel/mcp`](https://github.com/laravel/mcp) local server
 - **HTTP** — Saloon (used internally by the Qdrant and code-indexing services)
 - **Testing** — Pest 4
-- **CI** — GitHub Actions (Sentinel Gate)
+- **CI** — GitHub Actions

 ## Contributing

-The workflow is test-first: write a failing test, make it pass, then run `composer format` and `composer analyse` before pushing. PRs must clear the Sentinel Gate (95% coverage, PHPStan level 8, Pint) to merge.
+The workflow is test-first: write a failing test, make it pass, then run `composer format` and `composer analyse` before pushing. PRs must maintain 95% coverage, PHPStan level 8, and Pint formatting to merge.

 ## License