Graph revision #130
Open
wangyu-ustc wants to merge 8 commits into
- New: temporal knowledge graph (entity_nodes, entity_edges, episode_nodes, involves_edges)
- New: GraphMemoryManager with write path (W1-W5) and read path (R1-R4)
- Toggle: MIRIX_ENABLE_GRAPH_MEMORY=true/false (default off)
- LoCoMo benchmark: +3.05% LLM Judge (0.5429 → 0.5734) on 1540 questions
- Zero changes to original logic when disabled
Graph memory returns {"context": "<pre-formatted str>"} instead of the
{"total_count": N, "items": [...]} shape used by other memory types, so
the existing total_count==0 short-circuit dropped graph context entirely.
Split the empty-data check from the count check and add a graph-specific
branch that reads the context string directly.
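The split described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the function name `render_memory_block` and the `"graph"` type label are hypothetical, and only the two payload shapes come from the commit message.

```python
from typing import Any, Optional


def render_memory_block(memory_type: str, data: Optional[dict[str, Any]]) -> Optional[str]:
    """Return prompt text for one memory type, or None when there is nothing to show."""
    # Empty-data check: applies to every memory type, independent of any count field.
    if not data:
        return None

    # Graph-specific branch: the payload is {"context": "<pre-formatted str>"},
    # so read the string directly instead of looking for total_count.
    if memory_type == "graph":
        context = data.get("context") or ""
        return context if context.strip() else None

    # Count check: kept for the {"total_count": N, "items": [...]} shape.
    if data.get("total_count", 0) == 0:
        return None
    return "\n".join(str(item) for item in data.get("items", []))
```

With a single combined check on `total_count`, the graph payload would always look empty because it has no such key; separating the two checks avoids that.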
Introduces a deterministic conflict-resolution path for semantic memory
inserts, with source provenance (turn_id / chunk_id / serial / occurred_at)
flowing from /memory/add through to stored records. Enabled per meta-agent
via the new `enable_conflict_resolution` flag; legacy free-form inserts
remain the default.
Schema:
- `users.turn_counter`, `users.chunk_counter` — per-user monotonic counters
used by `/memory/add` to fill in fallback provenance when the client does
not provide source_meta.
- `episodic_memory.source_refs`, `semantic_memory.source_refs` —
provenance pointers from stored memories back to their source units.
- `semantic_memory.prior_values` — history of values that have been
superseded under the conflict-resolution path.
Services:
- `UserManager.reserve_source_ids` — atomic counter bump used by the
/memory/add fallback.
- New `semantic_memory_upsert_fact` tool gated by the agent flag.
- `MetaAgent` system prompt augmentation when the flag is on.
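A rough in-memory sketch of how the counters, `source_refs`, and `prior_values` could fit together is shown below. Everything except the field names taken from the commit message is an assumption: `CounterStore`, `Fact`, and `upsert_fact` are hypothetical stand-ins, and the "higher serial wins" rule is only one possible deterministic policy, not necessarily the one implemented here.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SourceRef:
    turn_id: int
    chunk_id: int
    serial: int
    occurred_at: Optional[str] = None


class CounterStore:
    """In-memory stand-in for users.turn_counter / users.chunk_counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters: dict[str, list[int]] = {}

    def reserve_source_ids(self, user_id: str, n_chunks: int = 1) -> tuple[int, list[int]]:
        # Bump both counters under one lock so concurrent callers never share ids.
        with self._lock:
            turn, chunk = self._counters.get(user_id, [0, 0])
            turn += 1
            chunk_ids = list(range(chunk + 1, chunk + 1 + n_chunks))
            self._counters[user_id] = [turn, chunk + n_chunks]
            return turn, chunk_ids


@dataclass
class Fact:
    key: str
    value: str
    source_refs: list[SourceRef] = field(default_factory=list)
    prior_values: list[dict] = field(default_factory=list)


def upsert_fact(store: dict[str, Fact], key: str, value: str, ref: SourceRef) -> Fact:
    """Assumed rule: the value with the higher serial wins; the loser is kept as history."""
    fact = store.get(key)
    if fact is None:
        fact = store[key] = Fact(key, value, [ref])
        return fact
    if value == fact.value:
        # Same value seen again: only record the extra provenance pointer.
        fact.source_refs.append(ref)
        return fact
    latest = max(r.serial for r in fact.source_refs)
    if ref.serial > latest:
        fact.prior_values.append({"value": fact.value, "serial": latest})
        fact.value, fact.source_refs = value, [ref]
    else:
        fact.prior_values.append({"value": value, "serial": ref.serial})
    return fact
```

In a real database the counter bump would be a single atomic update on the user row rather than a Python lock; the lock only mimics that guarantee for the sketch.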
Docs: `docs/mab_conflict_resolution_and_provenance.md`,
`docs/mab_raw_chunk_side_channel.md`,
`docs/mab_user_id_isolation_fix.md`.
Replaces v2 single-graph memory with two independent Neo4j graphs — one
per existing MIRIX memory layer:
- G_episodic: (:Episode) + (:EpisodicEntity), with [:NEXT] temporal edges,
[:EP_RELATES] entity edges (with keywords + embedding), and [:MENTIONS]
episode→entity links. Driven by EpisodicMemoryManager.insert_event.
- G_semantic: (:Concept) + (:SemanticEntity), with [:CONCEPT_RELATES]
concept-concept edges (LLM-judged at insert time), [:SEM_RELATES] entity
edges, and [:MENTIONS]. Driven by SemanticMemoryManager.insert_semantic_item.
Retrieval (GraphRetrieverDispatcher):
- 1 LLM call to split the query into ll/hl keywords (cached in Redis)
- 1 batch embed call for both keyword sets
- Parallel asyncio.gather over EpisodicRetriever + SemanticRetriever
- Each retriever runs LightRAG dual-level vector search (ll → entity name
vector index, hl → relation keyword vector index), round-robin merges,
reverses MENTIONS to fetch items, then one-hop expands (NEXT for
episodes, CONCEPT_RELATES for concepts).
- 50/50 token budget split across the two graphs, formatted as a combined
"## Episodic KG / ## Semantic KG" markdown payload.
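The merge and budget steps in the list above can be illustrated with a small self-contained sketch. It uses no Neo4j or LLM calls: `fake_retriever` is a hypothetical placeholder for the two retrievers, the hit strings are made up, and the whitespace-based token count is a crude assumption standing in for whatever tokenizer the project uses.

```python
import asyncio
from itertools import zip_longest


def round_robin_merge(ll_hits: list[str], hl_hits: list[str]) -> list[str]:
    """Interleave two ranked lists, dropping duplicates while preserving order."""
    merged, seen = [], set()
    for pair in zip_longest(ll_hits, hl_hits):
        for hit in pair:
            if hit is not None and hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged


def fit_budget(lines: list[str], max_tokens: int) -> list[str]:
    """Keep leading lines while a whitespace token count stays within the budget."""
    kept, used = [], 0
    for line in lines:
        cost = len(line.split())
        if used + cost > max_tokens:
            break
        kept.append(line)
        used += cost
    return kept


async def fake_retriever(ll_hits: list[str], hl_hits: list[str]) -> list[str]:
    # Placeholder for a per-graph retriever; yields control once, then merges.
    await asyncio.sleep(0)
    return round_robin_merge(ll_hits, hl_hits)


async def dispatch(total_budget: int) -> str:
    # Both graphs are queried concurrently, then each gets half of the budget.
    episodic, semantic = await asyncio.gather(
        fake_retriever(["met Ann", "went camping"], ["went camping", "painted mural"]),
        fake_retriever(["Ann likes art"], ["camping is a family habit"]),
    )
    half = total_budget // 2
    sections = [
        ("## Episodic KG", fit_budget(episodic, half)),
        ("## Semantic KG", fit_budget(semantic, half)),
    ]
    return "\n".join(
        f"{title}\n" + "\n".join(f"- {line}" for line in lines)
        for title, lines in sections
    )
```

Running `asyncio.run(dispatch(8))` gives each section a budget of four whitespace tokens, so lower-ranked hits are cut first.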
Zero-overhead default:
- All hooks gated on settings.enable_graph_memory (default False).
- Neo4j compose service is profile-gated ("graph"); mirix_api's depends_on
is required: false, so plain `docker compose up` skips Neo4j.
- Token tracker (mirix/database/token_tracker.py) is disabled by default;
record() is a no-op until enable() is called by the eval harness via
POST /debug/token_stats/reset.
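The "no-op until enabled" behaviour of the token tracker might look something like the sketch below. Only `record()` and `enable()` are named in the commit message; the class name, the argument list, and the `snapshot()` helper are assumptions made for illustration.

```python
class TokenTracker:
    """Accumulates token counts per label, but only after enable() has been called."""

    def __init__(self) -> None:
        self._enabled = False
        self._totals: dict[str, int] = {}

    def enable(self) -> None:
        self._enabled = True

    def record(self, label: str, prompt_tokens: int, completion_tokens: int) -> None:
        # Disabled by default: return immediately so normal requests pay almost nothing.
        if not self._enabled:
            return
        self._totals[label] = self._totals.get(label, 0) + prompt_tokens + completion_tokens

    def snapshot(self) -> dict[str, int]:
        # Return a copy so callers cannot mutate the internal totals.
        return dict(self._totals)
```

Calling `record()` before `enable()` leaves the snapshot empty, which is the property the zero-overhead claim relies on.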
Schema bootstrap (mirix/database/neo4j_client.py):
- 6 unique constraints, 2 btree indexes, 5 vector indexes (Neo4j 5.13+)
- v3 (:Entity / :Event) cleanup runs first; safe on fresh DBs
- Idempotent: re-running on existing DBs is a no-op
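One common way to get idempotent bootstrap is to emit only `IF NOT EXISTS` statements. The sketch below builds a small subset of such Cypher strings without connecting to a database. The node labels come from the commit message, but the property names (`id`, `embedding`), the index names, and the cosine similarity choice are hypothetical, and the exact `CREATE VECTOR INDEX` form should be checked against the Cypher manual for the Neo4j version in use.

```python
VECTOR_DIM = 1536  # mirrors the MIRIX_NEO4J_VECTOR_DIM default

# Labels come from the commit message; property and index names are hypothetical.
NODE_LABELS = ["Episode", "EpisodicEntity", "Concept", "SemanticEntity"]


def constraint_statements(labels: list[str]) -> list[str]:
    """One uniqueness constraint per label, safe to re-run."""
    return [
        f"CREATE CONSTRAINT {label.lower()}_id_unique IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.id IS UNIQUE"
        for label in labels
    ]


def vector_index_statement(label: str, dim: int) -> str:
    """Vector index on an assumed `embedding` property, safe to re-run."""
    return (
        f"CREATE VECTOR INDEX {label.lower()}_embedding IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.embedding) "
        "OPTIONS {indexConfig: {`vector.dimensions`: " + str(dim) + ", "
        "`vector.similarity_function`: 'cosine'}}"
    )


def bootstrap_statements(dim: int = VECTOR_DIM) -> list[str]:
    stmts = constraint_statements(NODE_LABELS)
    stmts += [vector_index_statement(label, dim) for label in ("EpisodicEntity", "SemanticEntity")]
    return stmts
```

Because every statement carries `IF NOT EXISTS`, executing the list a second time against the same database should change nothing.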
Removed:
- mirix/orm/graph_memory.py (v2 single-graph ORM)
- mirix/services/graph_memory_manager.py (v2 manager)
Docs:
- docs/graph_memory_v4/README.md: design overview + zero-overhead notes
- docs/graph_memory_v4/v4_graph_memory.md: per-file source + diffs
- docs/graph_memory_v4/kg_overview_{episodic,semantic}.png: top-N visualizations
- docs/graph_memory_v4/kg_subgraph_{identity,family_camping,art_creativity}.png:
paired episodic-vs-semantic zoom-ins on shared themes (conv-26)
Configuration:
- MIRIX_ENABLE_GRAPH_MEMORY=true
- MIRIX_NEO4J_URI=bolt://neo4j:7687
- MIRIX_NEO4J_USER, MIRIX_NEO4J_PASSWORD, MIRIX_NEO4J_DATABASE
- MIRIX_NEO4J_VECTOR_DIM (default 1536, match the embedding model in use)
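As an illustration of how these variables could be read, here is a small settings sketch. The environment variable names and the two stated defaults (graph memory off, vector dimension 1536) come from the list above; the `GraphSettings` class, the accepted truthy strings, and the use of the example URI as a fallback are assumptions.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class GraphSettings:
    enable_graph_memory: bool = False
    neo4j_uri: str = "bolt://neo4j:7687"  # assumed fallback, taken from the example value
    neo4j_user: Optional[str] = None
    neo4j_password: Optional[str] = None
    neo4j_database: Optional[str] = None
    neo4j_vector_dim: int = 1536

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "GraphSettings":
        # Unset variables fall back to the defaults declared on the dataclass.
        return cls(
            enable_graph_memory=env.get("MIRIX_ENABLE_GRAPH_MEMORY", "false").strip().lower() in _TRUE,
            neo4j_uri=env.get("MIRIX_NEO4J_URI", cls.neo4j_uri),
            neo4j_user=env.get("MIRIX_NEO4J_USER"),
            neo4j_password=env.get("MIRIX_NEO4J_PASSWORD"),
            neo4j_database=env.get("MIRIX_NEO4J_DATABASE"),
            neo4j_vector_dim=int(env.get("MIRIX_NEO4J_VECTOR_DIM", cls.neo4j_vector_dim)),
        )
```

Passing `os.environ` to `GraphSettings.from_env` would read the live environment; passing a plain dict keeps the parsing easy to test.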
Tested with gpt-4.1-mini + text-embedding-3-small + Neo4j 5.20-community
on LoCoMo conv-26 (154 non-adversarial QA questions). See docs/graph_memory_v4/.
One-shot script: runs main_eval.py (LoCoMo sample 0) followed by organize_results.py, then prints overall accuracy + per-category breakdown. Pre-flight checks that server is up on :8531 and that locomo10.json exists. Output goes to evals/results/locomo/v4_<timestamp>/.
No description provided.