VectifyAI · KylinMountain · May 30, 2026
diff --git a/README.md b/README.md
@@ -109,6 +109,7 @@ wiki/                                  │            ← the foundation
  ├── sources/            Full-text conversions
  ├── summaries/          Per-document summaries
  ├── concepts/           Cross-document synthesis ← the good stuff
+ ├── entities/           Specific named things (people, orgs, places, products)
  ├── explorations/       Saved query results
  └── reports/            Lint reports
                                        │
@@ -136,9 +137,10 @@ Short docs are read in full by the LLM. Long PDFs are indexed by PageIndex into
 When you add a document, the LLM:
 
 1. Generates a **summary** page
-2. Reads existing **concept** pages
+2. Reads existing **concept** and **entity** pages
 3. Creates or updates concepts with cross-document synthesis
-4. Updates the **index** and **log**
+4. Creates or updates **entity** pages (people, orgs, places, products)
+5. Updates the **index** and **log**
 
 A single source might touch 10-15 wiki pages. Knowledge accumulates: each document enriches the existing wiki rather than sitting in isolation.
 

diff --git a/openkb/agent/compiler.py b/openkb/agent/compiler.py
diff --git a/openkb/agent/linter.py b/openkb/agent/linter.py
@@ -24,12 +24,16 @@
 4. **Redundancy** — Are there multiple pages that cover the same content and
    could be merged?
 5. **Concept coverage** — Are important themes in the summaries missing concept pages?
+6. **Entity coverage** — Are important named things (people, organizations, places,
+   products, works, events) in the summaries missing entity pages, or are existing
+   entity pages contradictory, redundant, or orphaned (unlinked from any source)?
 
 ## Process
 1. Start with index.md to understand scope.
 2. Read summary pages to understand document content.
 3. Read concept pages to check for contradictions and gaps.
-4. Produce a structured Markdown report listing issues found with references
+4. Read entity pages to check for contradictions, redundancy, coverage, and orphans.
+5. Produce a structured Markdown report listing issues found with references
    to the specific pages where each issue occurs.
 
 Be thorough but concise. If the wiki is small or sparse, say so.
@@ -99,9 +103,9 @@ async def run_knowledge_lint(kb_dir: Path, model: str) -> str:
 
     prompt = (
         "Please audit this knowledge base wiki for semantic quality issues: "
-        "contradictions, gaps, staleness, redundancy, and missing concept pages. "
-        "Start with index.md, then read summaries and concepts as needed. "
-        "Produce a structured Markdown report."
+        "contradictions, gaps, staleness, redundancy, and missing concept and "
+        "entity pages. Start with index.md, then read summaries, concepts, and "
+        "entities as needed. Produce a structured Markdown report."
     )
 
     result = await Runner.run(agent, prompt, max_turns=MAX_TURNS)

diff --git a/openkb/agent/query.py b/openkb/agent/query.py
@@ -28,15 +28,17 @@
    Summaries may omit details — if you need more, follow the summary's
    `full_text` frontmatter field to the source (see step 4).
 3. Read concept pages (concepts/) for cross-document synthesis.
-4. When you need detailed source document content, each summary page has a
+4. For "who/what is X" questions about a specific named person, organization,
+   place, or product, read the matching page in entities/ first.
+5. When you need detailed source document content, each summary page has a
    `full_text` frontmatter field with the path to the original document content:
    - Short documents (doc_type: short): read_file with that path.
    - PageIndex documents (doc_type: pageindex): use get_page_content(doc_name, pages)
      with tight page ranges. The summary shows document tree structure with page
      ranges to help you target. Never fetch the whole document.
-5. Source content may reference images (e.g. ![image](sources/images/doc/file.png)).
+6. Source content may reference images (e.g. ![image](sources/images/doc/file.png)).
    Use the get_image tool to view them when needed.
-6. Synthesize a clear, concise, well-cited answer grounded in wiki content.
+7. Synthesize a clear, concise, well-cited answer grounded in wiki content.
 
 Answer based only on wiki content. Be concise.
 Before each tool call, output one short sentence explaining the reason.

diff --git a/openkb/cli.py b/openkb/cli.py
@@ -43,7 +43,7 @@ def filter(self, record: logging.LogRecord) -> bool:
 from openkb.config import DEFAULT_CONFIG, load_config, save_config, load_global_config, register_kb
 from openkb.converter import convert_document
 from openkb.log import append_log
-from openkb.schema import AGENTS_MD
+from openkb.schema import AGENTS_MD, INDEX_SEED, PAGE_CONTENT_DIRS
 
 # Suppress warnings after all imports — markitdown overrides filters at import time
 import warnings
@@ -217,7 +217,7 @@ def _preflight_skill_new(kb_dir: Path, name: str) -> str | None:
     Checks (in order):
       * skill name is a valid kebab-case slug
       * ``<kb>/wiki`` exists
-      * ``<kb>/wiki/concepts`` or ``<kb>/wiki/summaries`` has at least
+      * any of ``<kb>/wiki/{summaries,concepts,entities}`` has at least
         one file (i.e. some document has been ingested + compiled)
 
     Returns ``None`` if all gates pass, else a single-line error message
@@ -239,7 +239,7 @@ def _preflight_skill_new(kb_dir: Path, name: str) -> str | None:
 
     has_content = any(
         (wiki / sub).is_dir() and any((wiki / sub).iterdir())
-        for sub in ("concepts", "summaries")
+        for sub in PAGE_CONTENT_DIRS
     )
     if not has_content:
         return (
@@ -538,13 +538,11 @@ def init(model, language):
     Path("wiki/sources/images").mkdir(parents=True, exist_ok=True)
     Path("wiki/summaries").mkdir(parents=True, exist_ok=True)
     Path("wiki/concepts").mkdir(parents=True, exist_ok=True)
+    Path("wiki/entities").mkdir(parents=True, exist_ok=True)
 
     # Write wiki files
     Path("wiki/AGENTS.md").write_text(AGENTS_MD, encoding="utf-8")
-    Path("wiki/index.md").write_text(
-        "# Knowledge Base Index\n\n## Documents\n\n## Concepts\n\n## Explorations\n",
-        encoding="utf-8",
-    )
+    Path("wiki/index.md").write_text(INDEX_SEED, encoding="utf-8")
     Path("wiki/log.md").write_text("# Operations Log\n\n", encoding="utf-8")
 
     # Create .openkb/ state directory
@@ -800,6 +798,7 @@ def remove(ctx, identifier, keep_raw, keep_empty_concepts, dry_run, yes):
     """
     from openkb.agent.compiler import (
         remove_doc_from_concept_pages,
+        remove_doc_from_entity_pages,
         remove_doc_from_index,
     )
     from openkb.lint import fix_broken_links
@@ -895,6 +894,42 @@ def remove(ctx, identifier, keep_raw, keep_empty_concepts, dry_run, yes):
     for slug in concept_edits:
         actions.append(("MODIFY", f"wiki/concepts/{slug}.md  (drop this doc from sources)"))
 
+    # Scan entity pages with the same frontmatter logic as concepts. The
+    # executor calls ``remove_doc_from_entity_pages``; this only makes the
+    # preview/summary truthful about what it will delete vs. edit.
+    affected_entities: list[tuple[str, int]] = []  # (slug, remaining_sources)
+    entities_dir = wiki_dir / "entities"
+    if entities_dir.is_dir():
+        for path in sorted(entities_dir.glob("*.md")):
+            text = path.read_text(encoding="utf-8")
+            if not text.startswith("---"):
+                continue
+            fm_end = text.find("---", 3)
+            if fm_end == -1:
+                continue
+            sources_count = 0
+            source_in_frontmatter = False
+            for line in text[:fm_end].split("\n"):
+                if line.lstrip().startswith("sources:"):
+                    lb = line.find("[")
+                    rb = line.rfind("]")
+                    if lb != -1 and rb != -1 and rb > lb:
+                        items = [s.strip() for s in line[lb + 1:rb].split(",") if s.strip()]
+                        sources_count = len(items)
+                        source_in_frontmatter = source_file_marker in items
+                    break
+            if not source_in_frontmatter:
+                continue
+            remaining = max(sources_count - 1, 0)
+            affected_entities.append((path.stem, remaining))
+
+    entity_deletes = [s for s, r in affected_entities if r == 0 and not keep_empty_concepts]
+    entity_edits = [s for s, r in affected_entities if r > 0 or keep_empty_concepts]
+    for slug in entity_deletes:
+        actions.append(("DELETE", f"wiki/entities/{slug}.md  (only source: this doc)"))
+    for slug in entity_edits:
+        actions.append(("MODIFY", f"wiki/entities/{slug}.md  (drop this doc from sources)"))
+
     if (wiki_dir / "index.md").exists():
         actions.append(("MODIFY", "wiki/index.md  (remove Documents entry)"))
 
@@ -936,6 +971,12 @@ def remove(ctx, identifier, keep_raw, keep_empty_concepts, dry_run, yes):
             f"  {len(concept_deletes)} concept(s) will be DELETED because this is their only source."
         )
         click.echo("  Pass --keep-empty-concepts to retain them instead.")
+    if entity_deletes:
+        click.echo("")
+        click.echo(
+            f"  {len(entity_deletes)} entity page(s) will be DELETED because this is their only source."
+        )
+        click.echo("  Pass --keep-empty-concepts to retain them instead.")
     click.echo("")
 
     if dry_run:
@@ -967,22 +1008,31 @@ def remove(ctx, identifier, keep_raw, keep_empty_concepts, dry_run, yes):
         wiki_dir, doc_name, keep_empty=keep_empty_concepts,
     )
 
-    remove_doc_from_index(wiki_dir, doc_name, concept_result["deleted"])
+    entity_result = remove_doc_from_entity_pages(
+        wiki_dir, doc_name, keep_empty=keep_empty_concepts,
+    )
+
+    remove_doc_from_index(wiki_dir, doc_name, concept_result["deleted"],
+                          entity_slugs_deleted=entity_result["deleted"])
 
     # Strip dangling wikilinks now so a retry (after a PageIndex
     # failure below) finds a clean wiki — no point in re-running this
     # on every attempt.
     #
     # Scope: only the pages this remove actually touched (modified
-    # concept pages ∪ index.md). Previously this swept the whole wiki
-    # via ``fix_broken_links(wiki_dir)``, which silently stripped
+    # concept + entity pages ∪ index.md). Previously this swept the whole
+    # wiki via ``fix_broken_links(wiki_dir)``, which silently stripped
     # pre-existing dangling links in unrelated pages — see issue #58
     # (Bug 2). Users who want a wiki-wide sweep can still run
     # ``openkb lint --fix`` explicitly.
     lint_scope: list[Path] = [
         wiki_dir / "concepts" / f"{slug}.md"
         for slug in concept_result["modified"]
     ]
+    lint_scope += [
+        wiki_dir / "entities" / f"{slug}.md"
+        for slug in entity_result["modified"]
+    ]
     index_md = wiki_dir / "index.md"
     if index_md.exists():
         lint_scope.append(index_md)
@@ -1277,6 +1327,15 @@ def print_list(kb_dir: Path) -> None:
             for c in concepts:
                 click.echo(f"  - {c}")
 
+    # Display entities
+    entities_dir = kb_dir / "wiki" / "entities"
+    if entities_dir.exists():
+        entities = sorted(p.stem for p in entities_dir.glob("*.md"))
+        if entities:
+            click.echo(f"\nEntities ({len(entities)}):")
+            for e in entities:
+                click.echo(f"  - {e}")
+
     # Display reports
     reports_dir = kb_dir / "wiki" / "reports"
     if reports_dir.exists():
@@ -1301,7 +1360,7 @@ def list_cmd(ctx):
 def print_status(kb_dir: Path) -> None:
     """Print knowledge base status. Usable from CLI and chat REPL."""
     wiki_dir = kb_dir / "wiki"
-    subdirs = ["sources", "summaries", "concepts", "reports"]
+    subdirs = ["sources", "summaries", "concepts", "entities", "reports"]
 
     # Print the active KB path as the first line. Agents and scripts
     # parse this to locate the wiki without assuming cwd == KB root.
@@ -1332,15 +1391,19 @@ def print_status(kb_dir: Path) -> None:
         hashes = json.loads(hashes_file.read_text(encoding="utf-8"))
         click.echo(f"\n  Total indexed: {len(hashes)} document(s)")
 
-    # Last compile time: newest file in wiki/summaries/
-    summaries_dir = wiki_dir / "summaries"
-    if summaries_dir.exists():
-        summaries = list(summaries_dir.glob("*.md"))
-        if summaries:
-            newest_summary = max(summaries, key=lambda p: p.stat().st_mtime)
-            import datetime
-            mtime = datetime.datetime.fromtimestamp(newest_summary.stat().st_mtime)
-            click.echo(f"  Last compile:  {mtime.strftime('%Y-%m-%d %H:%M:%S')}")
+    # Last compile time: newest compiled page across summaries/, concepts/,
+    # and entities/ (an entity-only compile must still bump the shown time).
+    compiled_pages = [
+        p
+        for sub in PAGE_CONTENT_DIRS
+        if (wiki_dir / sub).is_dir()
+        for p in (wiki_dir / sub).glob("*.md")
+    ]
+    if compiled_pages:
+        newest_page = max(compiled_pages, key=lambda p: p.stat().st_mtime)
+        import datetime
+        mtime = datetime.datetime.fromtimestamp(newest_page.stat().st_mtime)
+        click.echo(f"  Last compile:  {mtime.strftime('%Y-%m-%d %H:%M:%S')}")
 
     # Last lint time: newest file in wiki/reports/
     reports_dir = wiki_dir / "reports"
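
The entity scan added to `remove` above decides, per page, whether the preview should say DELETE or MODIFY. A minimal standalone sketch of that decision follows. It assumes the same inline `sources: [a, b]` frontmatter form the hunk parses; the helper names `parse_inline_sources` and `plan_entity_action` and the marker strings in the usage note are hypothetical and do not appear in the PR.

```python
from __future__ import annotations


def parse_inline_sources(page_text: str) -> list[str] | None:
    """Return the items of an inline ``sources: [a, b]`` frontmatter line.

    Returns None when the page has no frontmatter or no inline sources list.
    """
    if not page_text.startswith("---"):
        return None
    fm_end = page_text.find("---", 3)
    if fm_end == -1:
        return None
    for line in page_text[:fm_end].split("\n"):
        if line.lstrip().startswith("sources:"):
            lb, rb = line.find("["), line.rfind("]")
            if lb != -1 and rb > lb:
                return [s.strip() for s in line[lb + 1:rb].split(",") if s.strip()]
            return None
    return None


def plan_entity_action(page_text: str, marker: str, keep_empty: bool = False) -> str | None:
    """Classify a page as "DELETE", "MODIFY", or None (page not affected)."""
    sources = parse_inline_sources(page_text)
    if sources is None or marker not in sources:
        return None
    remaining = len(sources) - 1
    # A page whose only source is the removed doc is deleted unless the
    # caller asked to keep empty pages (the --keep-empty-concepts flag).
    if remaining == 0 and not keep_empty:
        return "DELETE"
    return "MODIFY"
```

For example, a page with `sources: [summaries/a.md]` is classified as `"DELETE"` when `summaries/a.md` is removed, and as `"MODIFY"` when `keep_empty=True` or when a second source remains.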

diff --git a/openkb/lint.py b/openkb/lint.py
@@ -15,6 +15,8 @@
 
 import yaml
 
+from openkb.schema import PAGE_CONTENT_DIRS
+
 # Matches [[wikilink]] or [[subdir/link]]
 _WIKILINK_RE = re.compile(r"\[\[([^\]]+)\]\]")
 
@@ -171,6 +173,9 @@ def list_existing_wiki_targets(wiki_dir: Path) -> set[str]:
         targets.update(f"concepts/{p.stem}" for p in concepts_dir.glob("*.md"))
     if summaries_dir.is_dir():
         targets.update(f"summaries/{p.stem}" for p in summaries_dir.glob("*.md"))
+    entities_dir = wiki_dir / "entities"
+    if entities_dir.is_dir():
+        targets.update(f"entities/{p.stem}" for p in entities_dir.glob("*.md"))
     if (wiki_dir / "index.md").exists():
         targets.add("index")
     return targets
@@ -365,7 +370,7 @@ def check_index_sync(wiki: Path) -> list[str]:
 
     Returns issues for:
     - Links in index.md pointing to non-existent pages
-    - Pages in summaries/ or concepts/ not mentioned in index.md
+    - Pages in summaries/, concepts/, or entities/ not mentioned in index.md
 
     Args:
         wiki: Path to the wiki root directory.
@@ -389,11 +394,11 @@ def check_index_sync(wiki: Path) -> list[str]:
         if lnk_norm not in pages:
             issues.append(f"index.md links to missing page: [[{lnk}]]")
 
-    # Check that summaries and concepts pages are mentioned in index
+    # Check that summaries, concepts, and entities pages are mentioned in index
     index_stems = {Path(lnk.strip()).stem for lnk in index_links}
     index_text_lower = index_text.lower()
 
-    for subdir in ("summaries", "concepts"):
+    for subdir in PAGE_CONTENT_DIRS:
         subdir_path = wiki / subdir
         if not subdir_path.exists():
             continue
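
With `PAGE_CONTENT_DIRS` driving the loop, entity pages are now subject to the same index-sync check as summaries and concepts. The sketch below illustrates the idea only: it is not the body of `check_index_sync`, the function name `pages_missing_from_index` is hypothetical, and it matches on wikilink stems alone (it ignores aliases and the plain-text fallback the real function appears to use).

```python
import re
from pathlib import Path

# Mirrors the tuple added to openkb/schema.py in this PR.
PAGE_CONTENT_DIRS = ("summaries", "concepts", "entities")
WIKILINK_RE = re.compile(r"\[\[([^\]]+)\]\]")


def pages_missing_from_index(wiki: Path) -> list[str]:
    """List ``subdir/stem`` for every compiled page that index.md never links to."""
    index_text = (wiki / "index.md").read_text(encoding="utf-8")
    index_stems = {Path(link.strip()).stem for link in WIKILINK_RE.findall(index_text)}

    missing: list[str] = []
    for subdir in PAGE_CONTENT_DIRS:
        subdir_path = wiki / subdir
        if not subdir_path.is_dir():
            continue  # a wiki created before this PR may lack entities/
        for page in sorted(subdir_path.glob("*.md")):
            if page.stem not in index_stems:
                missing.append(f"{subdir}/{page.stem}")
    return missing
```

Because the directory list lives in one tuple, adding a future page type would extend this check without touching the loop.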

diff --git a/openkb/schema.py b/openkb/schema.py
@@ -2,6 +2,14 @@
 
 from pathlib import Path
 
+# The compiled page-type subdirectories under wiki/. Shared source of truth
+# for surfaces that enumerate page content (list, lint, status, skill gate).
+PAGE_CONTENT_DIRS = ("summaries", "concepts", "entities")
+
+# Canonical empty index.md seed. Used by `openkb init` and the compiler's
+# lazy-create path so they never drift.
+INDEX_SEED = "# Knowledge Base Index\n\n## Documents\n\n## Concepts\n\n## Entities\n\n## Explorations\n"
+
 AGENTS_MD = """\
 # Wiki Schema
 
@@ -10,6 +18,7 @@
 - sources/images/ — Extracted images from documents, referenced by sources.
 - summaries/ — One per source document. Summary of key content.
 - concepts/ — Cross-document topic synthesis. Created when a theme spans multiple documents.
+- entities/ — Specific named things: people, organizations, places, products, named works, events. One page per entity, accumulated across documents.
 - explorations/ — Saved query results, analyses, and comparisons worth keeping.
 - reports/ — Lint health check reports. Auto-generated.
 
@@ -20,13 +29,15 @@
 ## Page Types
 - **Summary Page** (summaries/): Key content of a single source document.
 - **Concept Page** (concepts/): Cross-document topic synthesis with [[wikilinks]].
+- **Entity Page** (entities/): A specific named thing (proper noun). Frontmatter `type:` is one of: person, organization, place, product, work, event, other. An entity differs from a concept: a concept is an abstract recurring idea; an entity is a specific named thing. Create an entity page only when the entity is central to a document or recurs across sources — do not page passing mentions.
 - **Exploration Page** (explorations/): Saved query results — analyses, comparisons, syntheses.
 - **Index Page** (index.md): One-liner summary of every page in the wiki. Auto-maintained.
 
 ## Index Page Format
-index.md lists all documents, concepts, and explorations with metadata:
+index.md lists all documents, concepts, entities, and explorations with metadata:
 - Documents: name, one-liner description, type (short|pageindex), detail access path
 - Concepts: name, one-liner description
+- Entities: name, type, one-liner description
 - Explorations: name, one-liner description
 
 ## Log Format
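
The schema above restricts an entity page's `type:` frontmatter field to seven values. A small check of that rule might look like the following sketch. The names `ENTITY_TYPES` and `entity_type_of` are hypothetical, and the parser is a dependency-free simplification that assumes a single-line `type: value` entry; a real implementation could use a YAML parser instead.

```python
from __future__ import annotations

# Allowed values, as listed in the Entity Page description in AGENTS_MD.
ENTITY_TYPES = ("person", "organization", "place", "product", "work", "event", "other")


def entity_type_of(page_text: str) -> str | None:
    """Return the frontmatter ``type:`` value if it is an allowed entity type."""
    if not page_text.startswith("---"):
        return None
    fm_end = page_text.find("---", 3)
    if fm_end == -1:
        return None
    for line in page_text[3:fm_end].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "type":
            value = value.strip().strip("'\"")
            return value if value in ENTITY_TYPES else None
    return None
```

A page starting with `---\ntype: person\n---` yields `"person"`, while an unknown value such as `type: concept` yields `None`, which a linter could report as an invalid entity type.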

diff --git a/skills/openkb/SKILL.md b/skills/openkb/SKILL.md
@@ -14,12 +14,17 @@ description: |
 
 The user has compiled their documents into a Markdown wiki at `wiki/`.
 
-The wiki holds three kinds of pages:
+The wiki holds these kinds of pages:
 
 - **Concept pages** at `wiki/concepts/*.md` — cross-document synthesis
   on specific topics. This is where OpenKB's value compounds: a
   concept with multiple sources represents knowledge merged across
   documents the user has ingested.
+- **Entity pages** at `wiki/entities/*.md` — one per specific named
+  thing (people, organizations, places, products, named works,
+  events), accumulated across documents. Each has a `type:`
+  frontmatter field. For "who is X" / "what is X" questions about a
+  named thing, read the matching `entities/` page first.
 - **Summary pages** at `wiki/summaries/*.md` — one per ingested
   document, linking to the concepts that document touches.
 - **Source files** at `wiki/sources/*.{md,json}` — full text for short
@@ -76,8 +81,9 @@ After capturing the KB path from `openkb status`, drill in via:
 
 - `openkb list` — table of ingested documents (name, type, page count)
   plus the concept list.
-- Read `<kb>/wiki/index.md` — the compiled table of contents. Every
-  document and concept has a one-line `brief`. Scan this and pick the
+- Read `<kb>/wiki/index.md` — the compiled table of contents. It has
+  `## Documents`, `## Concepts`, `## Entities`, and `## Explorations`
+  sections; every entry has a one-line `brief`. Scan this and pick the
   slugs that semantically match the user's question.
 
 ## Read content
@@ -90,6 +96,7 @@ calls these `Read` / `Grep` / `Bash`; Gemini CLI uses `read_file` /
 | Goal | Action |
 |---|---|
 | Read a concept page | read the file at `<kb>/wiki/concepts/<slug>.md` |
+| Answer "who/what is X" about a named thing | read `<kb>/wiki/entities/<slug>.md` |
 | Read a document's summary | read `<kb>/wiki/summaries/<doc>.md` |
 | Read a short doc's full text | read `<kb>/wiki/sources/<doc>.md` |
 | Read a long doc's specific page | shell: `jq '.[N-1]' <kb>/wiki/sources/<doc>.json` (N = 1-indexed PDF page; `.[0]` is page 1) |
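
The `jq '.[N-1]'` row relies on the sources JSON being an array in which index 0 holds PDF page 1. For agents that read the file from Python rather than the shell, the same off-by-one lookup can be sketched as below. The function name `read_pdf_page` is hypothetical, and the shape of each array element is not specified here, so the sketch returns the element unchanged.

```python
import json
from pathlib import Path


def read_pdf_page(sources_json: Path, page_number: int):
    """Return the entry for a 1-indexed PDF page from a sources/<doc>.json array."""
    pages = json.loads(sources_json.read_text(encoding="utf-8"))
    if not 1 <= page_number <= len(pages):
        raise IndexError(f"page {page_number} is outside 1..{len(pages)}")
    # Page N lives at array index N-1, matching `jq '.[N-1]'`.
    return pages[page_number - 1]
```

The explicit range check matters because a plain `pages[page_number - 1]` would silently return the last page for `page_number = 0`.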