Source-grounded knowledge map for humans and AI agents.
GroundMap is a local-first knowledge map built on Markdown, Git, stable block anchors, and full-page agent reading. It is designed for teams and solo builders who want auditable, source-grounded knowledge without vector databases, document chunking, or hidden LLM runtime inside the core repository.
Most RAG systems optimize for recall first: split documents into chunks, embed them, retrieve fragments, and ask an LLM to reconstruct context.
GroundMap starts from a different premise:
- Knowledge should stay human-readable.
- Markdown + Git should remain the source of truth.
- Agents should read complete pages or complete sections, not arbitrary chunks.
- Every important claim should point back to a stable source anchor.
- The knowledge base itself should not call an LLM.
That gives you an AI-ready wiki that is easier to audit, diff, review, and maintain over time.
- No embeddings by default: search uses BM25-style text search, metadata, backlinks, outlinks, and full-page reading.
- Stable anchors: converted raw documents get block anchors such as ^h-*, ^p-*, and ^t-* so claims can cite exact source blocks.
- Markdown is truth: SQLite/cache layers are optional derived indexes and can be rebuilt.
- Agent outside, KB inside: the repository exposes scripts, templates, and Web UI. LLM reasoning happens in external agents.
- Git-native governance: all meaningful changes can be reviewed, reverted, audited, and discussed as normal commits.
- Human-only zones: raw/**, my_thoughts/**, #human-only, and locked: true files are protected by policy and hooks.
- Typed relation graph: wikilinks can carry semantic relation types (SUPPORTS, REFUTES, EXTENDS, …), linted against a whitelist and rendered as an interactive graph at /graph in the Web console.
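To make the stable-anchor idea concrete, here is a minimal Python sketch of how block anchors could be collected from a converted raw document and how a citation could be checked against them. It is not GroundMap's actual code. Only the ^h-, ^p-, and ^t- prefixes come from the list above; the id format after the prefix, the end-of-line placement, the [[path#^anchor]] citation shape, and the names collect_anchors and unresolved_citations are all assumptions made for illustration.

```python
import re

# Assumed anchor shape: caret, kind letter (h, p, or t), hyphen, id, at end of line.
# Only the ^h- / ^p- / ^t- prefixes are taken from the README.
ANCHOR_RE = re.compile(r"\^([hpt])-([A-Za-z0-9]+)\s*$")

# Assumed citation shape: [[path#^anchor]] (hypothetical, for illustration only).
CITATION_RE = re.compile(r"\[\[([^\]#|]+)#\^([hpt]-[A-Za-z0-9]+)\]\]")


def collect_anchors(markdown: str) -> dict[str, int]:
    """Map each block anchor id (e.g. 'p-001') to its 1-based line number."""
    anchors: dict[str, int] = {}
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = ANCHOR_RE.search(line)
        if match:
            anchors[f"{match.group(1)}-{match.group(2)}"] = lineno
    return anchors


def unresolved_citations(page: str, raw_docs: dict[str, str]) -> list[str]:
    """Return citations whose target file or anchor is missing from raw_docs."""
    missing = []
    for path, anchor in CITATION_RE.findall(page):
        doc = raw_docs.get(path.strip())
        if doc is None or anchor not in collect_anchors(doc):
            missing.append(f"{path.strip()}#^{anchor}")
    return missing
```

Because anchors are plain text inside the Markdown file, a check like this needs no index or database; it can run on every commit and its result can be reviewed in a normal diff.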
GroundMap is designed to be managed with coding agents such as Claude Code, Codex, Cursor-style agents, or any tool that can read files, edit Markdown, and run shell commands. The agent handles LLM reasoning outside the knowledge base: it reads full wiki pages, calls scripts/k.py for search/outline/backlinks/health checks, updates wiki/** Markdown, and commits changes through Git.
For best results, ask your agent to read CLAUDE.md or AGENTS.md before working in the repository. Those files define the operating rules for ingesting sources, answering queries, resolving conflicts, protecting human-only areas, and keeping the Markdown knowledge base auditable.
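As an illustration of the human-only rule those files describe, the following Python sketch shows one way a pre-write guard could decide whether an agent may modify a file. It is a hypothetical helper, not the repository's hook. The raw/** and my_thoughts/** patterns, the #human-only tag, and the locked: true flag are taken from this README; the matching semantics, the frontmatter parsing, and the name is_agent_writable are assumptions.

```python
from fnmatch import fnmatchcase

# Path patterns named in the README; how they are matched is an assumption.
HUMAN_ONLY_GLOBS = ("raw/**", "my_thoughts/**")


def is_agent_writable(rel_path: str, text: str = "") -> bool:
    """Return False when a workspace-relative path falls in a human-only zone."""
    posix_path = rel_path.replace("\\", "/")
    if any(fnmatchcase(posix_path, pattern) for pattern in HUMAN_ONLY_GLOBS):
        return False
    # Assumed convention: the tag may appear anywhere in the file body.
    if "#human-only" in text:
        return False
    # Assumed convention: 'locked: true' sits in a leading '---' frontmatter block.
    if text.startswith("---"):
        frontmatter = text.split("---", 2)[1]
        for line in frontmatter.splitlines():
            key, _, value = line.partition(":")
            if key.strip() == "locked" and value.strip().lower() == "true":
                return False
    return True
```

A guard of this kind would typically run before any write, so a refused edit fails loudly instead of silently changing protected material.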
Browse complete wiki pages with frontmatter, source citations, and block previews.
Explore typed wiki relations as an interactive knowledge graph.
Use the optional debug console to inspect agent reasoning, tool calls, and grounded answers.
Requirements:
- Python 3.10+
- Node.js 22+
- npm
git clone https://github.com/Qinbf/groundmap.git
cd groundmap
make setup
make test
make web
Then open http://localhost:3006.
📦 Example raw/ sources are not distributed with this repository (copyright reasons; workspaces/*/raw/ is excluded by .gitignore). The example workspaces ship their full wiki/ pages, which remain completely browsable. After a fresh clone, k.py health reports nonzero broken references across the example workspaces ("raw 文件不存在", i.e. raw file missing) and source issues (broken-source-link: source_summary pages cite [[raw/...]] blocks that aren't present). Both are expected and do not mean your installation failed: they are the same raw-absent artifact, and only the deep links into missing raw blocks are unresolved. To exercise the full convert → cite loop, ingest your own documents into a workspace's raw/.
The recommended real-world workflow is: clone GroundMap as the engine, prepare your source documents yourself, create a new workspace, then ask Claude Code, Codex, or another coding agent to ingest those files into the knowledge base.
For private or copyrighted documents, keep your data outside the public engine repo and point GroundMap at it with KB_ROOT:
mkdir -p ~/work/my-kb-data/workspaces
KB_ROOT=~/work/my-kb-data python scripts/k.py new-workspace my-research
mkdir -p ~/work/my-kb-data/workspaces/my-research/raw/papers
# Put your PDFs, HTML files, Word docs, or Markdown files into raw/papers/ or raw/articles/ yourself.
Then start your agent in this repository and give it a concrete instruction, for example:
Read AGENTS.md first. Use KB_ROOT=~/work/my-kb-data and workspace my-research.
I put source documents under raw/papers/. Please ingest them into the knowledge base,
update the relevant wiki pages and indexes, run the health/lint checks, and summarize what changed.
To browse the result:
cd web
KB_ROOT=~/work/my-kb-data KB_WORKSPACE=my-research npm run dev
Manual setup:
python -m pip install -r requirements-dev.txt
cd web && npm install && cd ..
bash scripts/install_hooks.sh
python -m pytest scripts/tests
python scripts/k.py health --json
cd web && npm run lint && npm run build
⚠️ Stop your dev server before running npm run build. The Web console (npm run dev) and next build share the same web/.next/ directory. Running a production build while a dev server is live can leave the dev server serving 404s. To validate types only without building, run cd web && npx tsc --noEmit instead. (CI runs the full build in a clean environment, which is fine.)
Local servers listen on localhost (Web console :3006, debug console :3100). With a system/terminal proxy active:
- One command (recommended): make dev starts the Web console (:3006) and the debug console (:3100) together (Ctrl-C stops both); make web starts just the Web console. Both set no_proxy=localhost,127.0.0.1,::1, so the local servers and their child processes are never routed through the proxy; they start whether or not a proxy is on.
- Using npm directly: if you bypass the Makefile with cd web && npm run dev and have no global no_proxy, command-line tools may send loopback requests to the proxy. Use make instead, or export no_proxy=localhost,127.0.0.1,::1 first (persist it in ~/.zshenv).
- Browser: Chrome / Safari / recent Firefox bypass localhost by default. If a proxy extension (e.g. SwitchyOmega) breaks access, add localhost, 127.0.0.1 to its bypass list.
- Blank page: usually a corrupted .next cache (after switching branches / large edits), unrelated to the proxy. Run make clean and restart.
Engine code (scripts/, web/) is shared; data is isolated per topic under workspaces/<name>/. Each workspace has the same internal layout: wiki/, raw/, exports/, my_thoughts/, .cache/, and log.md. When no workspace is specified, the CLI auto-selects one (and prints a hint when several exist); pass --workspace to choose.
# No --workspace: auto-selects a workspace (prints a hint when several exist)
python scripts/k.py health --json
# Target a specific workspace
python scripts/k.py --workspace ai-ml-demo search "transformer"
cd web && KB_WORKSPACE=rag-evolution npm run dev
This repository ships three example workspaces: smb-ecommerce, rag-evolution, and ai-ml-demo. The first two are living demos; ai-ml-demo is an archived v0 library kept on purpose. Most of its pages carry status: deprecated, demonstrating the "mark, never delete" archival workflow.
The web top bar includes a workspace switcher (writes a kb_workspace cookie and reloads), so you can switch libraries live in the UI without restarting; KB_WORKSPACE sets the initial default. The cookie value is validated against the real workspace list (resolveWorkspace()), so a tampered cookie can't escape the workspaces directory.
The engine (scripts/, web/) is a shared tool. Each independent project keeps its own knowledge base in that project's own folder and points the engine at it via the KB_ROOT environment variable:
# Engine installed once; data lives in each project's own directory
KB_ROOT=~/work/project-a/kb-data python ~/tools/groundmap/scripts/k.py --workspace main health
cd ~/tools/groundmap/web && KB_ROOT=~/work/project-a/kb-data KB_WORKSPACE=main npm run dev
KB_ROOT must point to the data root that contains workspaces/ (e.g. <project>/kb-data), not a specific workspace; --workspace / KB_WORKSPACE then picks the library inside it. When KB_ROOT is unset it defaults to the engine repo itself (data-in-repo, the multi-topic mode above). Keeping each project's data in its own folder lets the engine stay pure code: shared, upgraded, and open-sourced without leaking any project's data. See GroundMap-设计文档.md §2.4 for the full deployment model.
All of the following work on a fresh clone (they only read the bundled wiki/ pages):
python scripts/k.py health --json
python scripts/k.py --workspace rag-evolution search "retrieval"
python scripts/k.py --workspace rag-evolution outline wiki/sources/bge.md
python scripts/k.py list-conflicts
python scripts/k.py list-to-update
Web console (defaults to http://127.0.0.1:3006, local single-user; it does not bind to 0.0.0.0 unless you pass -H explicitly):
cd web
npm run dev
.
├── CLAUDE.md                  # Schema / behavior spec (single source of truth)
├── AGENTS.md                  # Codex mirror of CLAUDE.md (kept byte-aligned)
├── GroundMap-设计文档.md       # System design document
├── scripts/                   # CLI (k.py), conversion (convert.py), parsing, tests, hooks
├── web/                       # Next.js reading/editing console (+ REST/server actions)
├── .claude/skills/            # Claude Code workflow skills (kb-ingest / query / lint / export / conflict-resolve)
├── .agents/skills/            # Codex mirror of the skills above
├── wiki/_templates/           # Shared page templates (used by all workspaces)
├── workspaces/                # Per-topic data, switchable; ships smb-ecommerce / rag-evolution / ai-ml-demo examples
│   └── <name>/
│       ├── wiki/              # Markdown wiki pages (root_index, indexes, concepts, entities, sources, analyses)
│       ├── raw/               # Source documents and converted markdown (articles, papers, assets)
│       ├── exports/           # Generated outputs
│       ├── my_thoughts/       # Human-only zone (agent read-only)
│       ├── .cache/            # Derived SQLite index (gitignored, rebuildable)
│       └── log.md             # Operation log
├── docs/                      # Public documentation
├── tools/debug-console/       # Optional standalone debug console (external KB client; see its README)
├── .github/                   # CI, issue templates, PR template
└── requirements*.txt          # Python dependencies
There is no backend/ directory: the original MCP + REST backend was deprecated (see GroundMap-设计文档.md §10.5). REST and write actions are served by web/ instead.
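As a small companion to the layout above, this Python sketch reports which of the documented per-workspace entries are absent from a directory. It is a hypothetical helper, not part of scripts/k.py; the entry names come from the layout, while the function name and the plain list-of-missing-names result are assumptions.

```python
from pathlib import Path

# Entry names taken from the per-workspace layout in the README.
EXPECTED_DIRS = ("wiki", "raw", "exports", "my_thoughts", ".cache")
EXPECTED_FILES = ("log.md",)


def missing_entries(workspace: Path) -> list[str]:
    """List expected workspace entries that do not exist on disk."""
    missing = [f"{name}/" for name in EXPECTED_DIRS if not (workspace / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (workspace / name).is_file()]
    return missing
```

On a fresh clone of the example workspaces, raw/ and .cache/ would be expected to show up as missing, since both are gitignored; that is consistent with the health-check note earlier in this README rather than a sign of a broken install.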
GroundMap intentionally does not include:
- embedded LLM SDK calls in the core knowledge base,
- embedding models or vector stores for default retrieval,
- hidden chunking pipelines,
- a hosted multi-tenant SaaS layer,
- private industry playbooks.
Those boundaries are deliberate. The open-source core focuses on the durable knowledge substrate; product-specific agents, workflows, and enterprise integrations can live outside it.
- Step-by-step beginner tutorial (in Chinese, with screenshots): a zero-to-running walkthrough with a full worked example; the best place to start (also available as a standalone HTML page with a sidebar TOC for offline reading)
- Quickstart
- Why No Embeddings
- Demo Plan
- Web Console
For Chinese-language notes on frontier AI technology, practical AI applications, and research-oriented AI agent workflows, follow the author's WeChat Official Account: ่ฆ็งไธฐAI. It also shares GroundMap tutorials, case studies, and related learning resources.
- Public demo workspace with redistributable sources.
- Packaged CLI command, for example groundmap health.
- Better onboarding walkthrough in the Web console.
- Optional derived SQLite/FTS index for large repositories.
See AGENTS.md for the design contracts and future evolution notes.
Contributions are welcome, especially around documentation, tests, onboarding, CLI ergonomics, and Web UI polish. Please read CONTRIBUTING.md first.
This project is licensed under the Apache License 2.0. See LICENSE.
The Apache-2.0 license applies to the open-source core. Private industry playbooks, customer-specific workflows, and hosted product layers can be maintained separately.



