One config line. Works with any MCP agent. Zero setup.
Every coding agent forgets everything when the session ends.
Session 1: You explain your stack, conventions, architecture. Agent writes great code.
Session 2: Agent has forgotten everything. You start over.
Session 50: You've wasted hours re-teaching the same context.
Yaad is the memory engine behind the hawk
coding agent. Run hawk and your agent gets persistent, graph-native memory — across
sessions, across models, across projects — with no separate install or daemon to manage.
Yaad is a Go library, not a standalone binary. It ships no `yaad` command of its own; its memory features are surfaced through the host agent (hawk). To embed Yaad in your own Go program, import it directly:
import (
	yaad "github.com/GrayCodeAI/yaad/engine"
	"github.com/GrayCodeAI/yaad/storage"
)
Your agent starts a session → Yaad injects context from previous sessions:
## Project Memory (Yaad)
### Conventions (always follow)
- Use `jose` library, not `jsonwebtoken` (Edge compatibility)
- Named exports only, no default exports
- Run `pnpm test --coverage` before committing
### Active Tasks
- ✓ JWT token issuance endpoint
- → Rate limiting on /auth/token (in progress)
### ⚠ Stale Warnings
- Auth subgraph outdated: src/middleware/auth.ts modified 2h ago
### Previous Session
- Implemented rate limiting skeleton, hit NATS backpressure issue
Your agent works → stores decisions, bugs, conventions automatically.
Session ends → Yaad compresses and links everything in a memory graph.
Next session → picks up exactly where you left off. Zero re-explaining.
Yaad is a memory layer — it doesn't call LLMs. Your agent handles the LLM. Yaad handles memory.
Your Agent Yaad
│ │
├─ starts session ──────────────▶ │ returns hot-tier context (~2K tokens)
│ │
├─ needs context ───────────────▶ │ graph-aware search (BM25 + vector + graph + temporal)
│ "auth middleware" │ returns: decisions + conventions + bugs + specs
│ │
├─ learns something ────────────▶ │ stores node, extracts entities, links edges
│ "Use RS256 for JWT" │ auto-detects: file refs, libraries, functions
│ │
├─ ends session ────────────────▶ │ compresses → summary node → links to graph
│ │
└─ next session ────────────────▶ │ picks up from summary. zero re-explaining.
Relaxed DAG — memories are nodes, relationships are edges:
[decision: "Use RS256"] ──led_to──▶ [convention: "Always RS256"]
│ │
│ led_to │ touches
▼ ▼
[spec: "Auth subsystem"] ◀──part_of── [file: src/middleware/auth.ts]
│
│ relates_to
▼
[bug: "Token refresh race"] ──supersedes──▶ [bug: "Token expiry (FIXED)"]
Intent-aware retrieval — "why" queries traverse causal edges, "when" queries traverse temporal edges:
"why did we choose NATS?" → Intent: Why → boost caused_by, led_to edges
"when did we fix auth?" → Intent: When → boost temporal backbone
"what is the auth spec?" → Intent: What → boost spec, part_of edges
4-path search — BM25 + vector + graph (intent-aware) + temporal recency, fused with RRF.
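To make the intent examples above concrete, here is a minimal sketch of how a query could be mapped to an intent and then to edge boosts. It is an illustration, not Yaad's classifier: the `Intent` type, the `ClassifyIntent` and `EdgeWeight` functions, the leading-word heuristic, the `temporal` edge label, and the 2.0 weights are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Intent is a hypothetical label for the kind of question being asked.
type Intent string

const (
	IntentWhy  Intent = "why"
	IntentWhen Intent = "when"
	IntentWhat Intent = "what"
)

// edgeBoost maps each intent to the edge types it favours, following the
// examples in the text. The numeric weights are illustrative assumptions.
var edgeBoost = map[Intent]map[string]float64{
	IntentWhy:  {"caused_by": 2.0, "led_to": 2.0},
	IntentWhen: {"temporal": 2.0},
	IntentWhat: {"part_of": 2.0},
}

// ClassifyIntent picks an intent from the leading question word.
// Anything that is not a "why" or "when" question falls back to IntentWhat.
func ClassifyIntent(query string) Intent {
	q := strings.ToLower(strings.TrimSpace(query))
	switch {
	case strings.HasPrefix(q, "why"):
		return IntentWhy
	case strings.HasPrefix(q, "when"):
		return IntentWhen
	default:
		return IntentWhat
	}
}

// EdgeWeight returns the traversal weight for an edge type under an intent.
// Edge types that are not boosted keep a neutral weight of 1.0.
func EdgeWeight(intent Intent, edgeType string) float64 {
	if w, ok := edgeBoost[intent][edgeType]; ok {
		return w
	}
	return 1.0
}

func main() {
	intent := ClassifyIntent("why did we choose NATS?")
	fmt.Println(intent, EdgeWeight(intent, "led_to"), EdgeWeight(intent, "touches"))
}
```

A graph search would multiply each edge's base score by `EdgeWeight` while traversing, so causal edges dominate for "why" questions without being excluded for others.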
| Type | What it stores | Example |
|---|---|---|
| `convention` | Coding rules & patterns | "Use jose not jsonwebtoken" |
| `decision` | Architecture choices + why | "Chose NATS for backpressure" |
| `bug` | Symptom → Cause → Fix | "Token race → use mutex" |
| `spec` | How a subsystem works | "Auth: RS256 JWT with jose" |
| `task` | Done / in-progress / blocked | "✓ auth, → rate limiting" |
| `skill` | Reusable step sequences | "Deploy: test → build → fly" |
| `preference` | User coding style | "Functional style, tabs" |
| `file` | File/module anchor | "src/middleware/auth.ts" |
| `entity` | Auto-extracted entity | "jose", "PostgreSQL" |
Graph-Native Memory (Relaxed DAG)
Not a flat list of memories. A directed graph with 8 edge types:
- Causal (acyclic): `led_to`, `supersedes`, `caused_by`, `learned_in`, `part_of`
- Relational (cycles OK): `relates_to`, `depends_on`, `touches`
Enables: subgraph extraction, impact analysis, causal chain traversal.
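The "relaxed" part of the DAG can be sketched as a single rule: causal edges must never close a cycle, while relational edges may. The code below is one possible way to enforce that, assuming a hypothetical in-memory `Graph` type; Yaad's real storage and checks may look quite different.

```go
package main

import "fmt"

// causal lists the edge types the text marks as acyclic.
var causal = map[string]bool{
	"led_to": true, "supersedes": true, "caused_by": true,
	"learned_in": true, "part_of": true,
}

// Graph tracks causal adjacency only, which is all the cycle check needs.
// This layout is a hypothetical simplification.
type Graph struct {
	causalOut map[string][]string
	edges     int
}

func NewGraph() *Graph { return &Graph{causalOut: map[string][]string{}} }

// reachable reports whether dst can be reached from src via causal edges.
func (g *Graph) reachable(src, dst string) bool {
	seen := map[string]bool{src: true}
	stack := []string{src}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if n == dst {
			return true
		}
		for _, next := range g.causalOut[n] {
			if !seen[next] {
				seen[next] = true
				stack = append(stack, next)
			}
		}
	}
	return false
}

// AddEdge inserts from -> to. Causal edges are rejected when they would
// close a cycle; relational edges are always accepted.
func (g *Graph) AddEdge(from, to, edgeType string) error {
	if causal[edgeType] {
		if g.reachable(to, from) {
			return fmt.Errorf("%s edge %s -> %s would create a causal cycle", edgeType, from, to)
		}
		g.causalOut[from] = append(g.causalOut[from], to)
	}
	g.edges++
	return nil
}

func main() {
	g := NewGraph()
	fmt.Println(g.AddEdge("decision", "convention", "led_to"))     // accepted
	fmt.Println(g.AddEdge("convention", "decision", "led_to"))     // rejected: cycle
	fmt.Println(g.AddEdge("convention", "decision", "relates_to")) // accepted: relational
}
```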
Intent-Aware 4-Path Search
Based on MAGMA (arxiv:2601.03236):
- BM25 (FTS5) — keyword matching
- Vector (optional) — semantic similarity
- Graph (intent-aware BFS) — edge weights boosted by query intent
- Temporal — recency-aware for "when" queries
Fused with Reciprocal Rank Fusion (RRF).
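Reciprocal Rank Fusion itself is small enough to show in full. The sketch below scores each result as the sum of `1 / (k + rank)` over the ranked lists it appears in. The function name `FuseRRF` and the node IDs are made up for illustration, and `k = 60` is a commonly used constant, not a value taken from Yaad.

```go
package main

import (
	"fmt"
	"sort"
)

// FuseRRF merges ranked ID lists with Reciprocal Rank Fusion:
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
func FuseRRF(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest score first; ties broken by ID so the order is deterministic.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	bm25 := []string{"n1", "n2", "n3"}
	vector := []string{"n2", "n4"}
	graph := []string{"n2", "n1"}
	temporal := []string{"n3", "n1"}
	fmt.Println(FuseRRF(60, bm25, vector, graph, temporal))
}
```

Because RRF only looks at ranks, the four paths do not need comparable score scales, which is why it suits mixing BM25 with vector and graph results.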
Dual-Stream Ingestion
Based on MAGMA + GAM research:
- Fast path (sync): store node + temporal edge, return in <1ms
- Slow path (async goroutine): infer causal edges, link entities
Agent is never blocked waiting for memory processing.
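The fast/slow split can be sketched with a buffered channel and one background goroutine. This is a simplified illustration under stated assumptions: the `Ingestor` type and its methods are hypothetical, and the slow path only marks a node as processed where the real one would infer causal edges and link entities.

```go
package main

import (
	"fmt"
	"sync"
)

type Node struct {
	ID   int
	Text string
}

// Ingestor keeps a fast synchronous store and a background linker.
type Ingestor struct {
	mu     sync.Mutex
	nodes  []Node
	linked map[int]bool
	queue  chan Node
	wg     sync.WaitGroup
}

func NewIngestor(buffer int) *Ingestor {
	in := &Ingestor{linked: map[int]bool{}, queue: make(chan Node, buffer)}
	in.wg.Add(1)
	go in.slowPath()
	return in
}

// Remember is the fast path: append the node, hand it off, and return.
func (in *Ingestor) Remember(text string) int {
	in.mu.Lock()
	n := Node{ID: len(in.nodes) + 1, Text: text}
	in.nodes = append(in.nodes, n)
	in.mu.Unlock()
	in.queue <- n // blocks only if the buffer is full
	return n.ID
}

// slowPath stands in for causal-edge inference and entity linking.
func (in *Ingestor) slowPath() {
	defer in.wg.Done()
	for n := range in.queue {
		in.mu.Lock()
		in.linked[n.ID] = true
		in.mu.Unlock()
	}
}

// Close drains the queue and waits for the background worker to finish.
func (in *Ingestor) Close() {
	close(in.queue)
	in.wg.Wait()
}

// Linked reports whether the slow path has processed a node.
func (in *Ingestor) Linked(id int) bool {
	in.mu.Lock()
	defer in.mu.Unlock()
	return in.linked[id]
}

func main() {
	in := NewIngestor(16)
	id := in.Remember("Use RS256 for JWT")
	in.Close()
	fmt.Println(id, in.Linked(id))
}
```

One design caveat of this sketch: a full buffer makes `Remember` block, so a real implementation would need to size the queue or drop to a spill strategy to keep the fast path fast.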
Git-Aware Staleness
When source files change, Yaad walks the graph backwards to flag stale subgraphs:
"Auth subgraph may be stale: src/auth.ts modified 2h ago. Affected: [decision: RS256], [convention: jose], [bug: token refresh]"
Impact Analysis
"What memories break if I change schema.sql?" → reverse graph traversal → "3 decisions + 2 specs + 1 convention affected"
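Both staleness detection and impact analysis come down to walking edges backwards from a changed file. A minimal sketch of that reverse traversal follows; the `Edge` struct, the `Impacted` function, and the sample node labels are assumptions for illustration, not Yaad's data model.

```go
package main

import "fmt"

// Edge points from a memory to the thing it refers to or builds on.
type Edge struct{ From, To string }

// Impacted walks edges backwards from target and returns every node that
// can reach it, in breadth-first order.
func Impacted(edges []Edge, target string) []string {
	reverse := map[string][]string{}
	for _, e := range edges {
		reverse[e.To] = append(reverse[e.To], e.From)
	}
	seen := map[string]bool{target: true}
	queue := []string{target}
	var out []string
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		for _, prev := range reverse[n] {
			if !seen[prev] {
				seen[prev] = true
				out = append(out, prev)
				queue = append(queue, prev)
			}
		}
	}
	return out
}

func main() {
	edges := []Edge{
		{"convention: Always RS256", "src/middleware/auth.ts"},
		{"decision: Use RS256", "convention: Always RS256"},
		{"spec: Deploy", "fly.toml"},
	}
	fmt.Println(Impacted(edges, "src/middleware/auth.ts"))
}
```

The `seen` set keeps the walk finite even when relational edges form cycles.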
Auto-Decay & Compaction
- Half-life decay: unused memories fade automatically
- Compaction: low-confidence memories merge into summaries
- Pinned memories never decay (core architecture decisions, deploy process)
- Auto-decay runs on every session start — zero maintenance
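Half-life decay is usually written as `confidence × 0.5^(idle / half_life)`. The sketch below assumes that exponential form, which the text does not spell out, and uses hypothetical function names; the compaction threshold mirrors the idea of a minimum confidence.

```go
package main

import (
	"fmt"
	"math"
)

// Decay applies exponential half-life decay to a confidence score.
// Pinned memories are returned unchanged.
func Decay(confidence, idleDays, halfLifeDays float64, pinned bool) float64 {
	if pinned || halfLifeDays <= 0 {
		return confidence
	}
	return confidence * math.Pow(0.5, idleDays/halfLifeDays)
}

// NeedsCompaction reports whether a memory has faded below the threshold
// and is a candidate for merging into a summary.
func NeedsCompaction(confidence, minConfidence float64) bool {
	return confidence < minConfidence
}

func main() {
	for _, days := range []float64{0, 30, 60, 120} {
		c := Decay(0.8, days, 30, false)
		fmt.Printf("idle %3.0f days: confidence %.3f, compact: %v\n",
			days, c, NeedsCompaction(c, 0.1))
	}
	fmt.Printf("pinned after 120 days: %.3f\n", Decay(0.8, 120, 30, true))
}
```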
Privacy & Security
- API keys, tokens, secrets auto-stripped on ingest (regex + entropy detection)
- Localhost-only binding (127.0.0.1)
- HTTPS with auto self-signed cert generation
- All data stays local (SQLite, your machine)
- No LLM API calls — Yaad never sends your code anywhere
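As an illustration of "regex + entropy" secret stripping, the sketch below first replaces a few well-known key shapes and then redacts any long token whose Shannon entropy is high. The patterns, the 24-character minimum, and the 4.0 bits-per-character threshold are assumptions chosen for the example, not Yaad's actual rules.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strings"
)

// keyPattern matches a few well-known key shapes. Illustrative only.
var keyPattern = regexp.MustCompile(`sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}`)

// entropy returns the Shannon entropy of s in bits per character.
func entropy(s string) float64 {
	counts := map[rune]int{}
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	if total == 0 {
		return 0
	}
	h := 0.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// Redact replaces known key patterns, then any long whitespace-separated
// token whose entropy exceeds the threshold. Runs of whitespace collapse
// to single spaces as a side effect of splitting on fields.
func Redact(text string) string {
	text = keyPattern.ReplaceAllString(text, "[REDACTED]")
	words := strings.Fields(text)
	for i, w := range words {
		if len(w) >= 24 && entropy(w) > 4.0 {
			words[i] = "[REDACTED]"
		}
	}
	return strings.Join(words, " ")
}

func main() {
	fmt.Println(Redact("aws key AKIAABCDEFGHIJKLMNOP in .env"))
	fmt.Println(Redact("Use jose not jsonwebtoken"))
}
```

Entropy checks catch unknown key formats but can also flag long hashes or minified identifiers, so a real filter would likely need an allowlist.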
Worktree-Aware Memory Sharing
Memory is scoped to the git repository root (git rev-parse --show-toplevel), so every worktree, subdirectory, or linked checkout of the same repo shares one .yaad/ memory store. Falls back to the working directory when not in a git repo.
LLM Entity Extraction (opt-in)
Optionally extract entities with an OpenAI-compatible LLM, merged and deduplicated with the regex extractor. Default off — enable with YAAD_LLM_EXTRACT=1 + YAAD_LLM_API_KEY. Falls back to regex-only on any failure, so behavior is unchanged when unset.
Temporal Fact-Validity Windows
Zep-style valid_at / invalid_at intervals on edges let recall filter to facts that were active at a given point in time, distinguishing currently-valid from superseded knowledge.
Live Memory Streaming (gRPC / SSE)
WatchMemories (and WatchStale) stream Remember/Forget mutations in real time over gRPC, with identical semantics exposed over HTTP/SSE at /yaad/events for broad client compatibility.
Versions & Rollback
Every edit is versioned. The engine can list a node's version history and restore a prior state — the rollback itself is recorded as a new version.
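The "rollback is a new version" rule can be sketched with an append-only history. The `History` type and its methods are hypothetical and only model the behaviour described here, not the engine's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// History is a hypothetical append-only version list for one node.
type History struct{ versions []string }

// Edit records new text and returns its 1-based version number.
func (h *History) Edit(text string) int {
	h.versions = append(h.versions, text)
	return len(h.versions)
}

// Rollback restores an earlier version by appending a copy of it, so the
// rollback shows up in the history instead of rewriting it.
func (h *History) Rollback(version int) (int, error) {
	if version < 1 || version > len(h.versions) {
		return 0, errors.New("unknown version")
	}
	return h.Edit(h.versions[version-1]), nil
}

// Current returns the latest text, or "" for an empty history.
func (h *History) Current() string {
	if len(h.versions) == 0 {
		return ""
	}
	return h.versions[len(h.versions)-1]
}

func main() {
	var h History
	h.Edit("Use HS256 for JWT")
	h.Edit("Use RS256 for JWT")
	v, err := h.Rollback(1)
	fmt.Println(v, err, h.Current())
}
```

Keeping the history append-only means a mistaken rollback can itself be rolled back.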
ADD-Only Single-Pass Mode (opt-in)
Mem0-v3-style single-pass ingestion that appends memories without the conflict-resolution pass. Default off — enable with YAAD_ADD_ONLY=1.
Subagent-Scoped Memory
A subagent can request its own isolated memory store (<base>/agent-memory/<id>/) via an agent-id env var, keeping parallel agents' memories separate.
Sleep-Time Consolidation (opt-in)
A background loop consolidates memories on the long-running serve / mcp paths. Default off — enable by setting YAAD_CONSOLIDATE_INTERVAL to a positive duration (e.g. 30m).
Semantic Boundary Detection
Detects topic boundaries in a session to segment memories into coherent units before consolidation.
When Yaad is embedded in a host agent (such as hawk), it exposes these operations to the agent. Yaad implements the logic; the host surfaces them (e.g. as MCP tools):
| Tool | What it does |
|---|---|
| `yaad_remember` | Store a memory (convention, decision, bug, spec, task, skill, preference) |
| `yaad_recall` | Graph-aware search with intent classification |
| `yaad_hybrid_recall` | 4-path search: BM25 + vector + graph + temporal |
| `yaad_context` | Get hot-tier context for session injection |
| `yaad_link` | Create typed edge between memories |
| `yaad_forget` | Archive a memory (sets confidence to 0) |
| `yaad_feedback` | Approve / edit / discard a memory |
| `yaad_pin` | Pin/unpin a memory (pinned = always in context) |
| `yaad_stale` | Find memories invalidated by git changes |
| `yaad_proactive` | Predict what context the agent needs next |
| `yaad_compact` | Merge low-confidence memories into summaries |
| `yaad_mental_model` | Auto-generated project summary |
| `yaad_skill_store` | Save a reusable step sequence |
| `yaad_skill_get` | Retrieve and replay a skill |
| `yaad_session_recap` | Summary of the previous session |
| `yaad_subgraph` | Extract neighborhood around a memory |
| `yaad_impact` | What memories are affected by a file change? |
| `yaad_status` | Graph stats (nodes, edges, sessions) |
| `yaad_decay` | Manually trigger confidence decay |
| `yaad_gc` | Garbage collect archived memories |
| `yaad_embed` | Generate vector embedding for a node |
| `yaad_export` | Export graph as JSON/Markdown/Obsidian |
| `yaad_import` | Import graph from JSON |
┌─────────────────────────────────────────────────────────────────┐
│ YOUR CODING AGENT │
│ Hawk · Claude Code · Cursor · Gemini CLI · Any MCP Agent │
└──────┬───────────────┬──────────────────────────────────────────┘
│ MCP (stdio) │ REST/HTTPS (127.0.0.1:3456)
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ YAAD │
│ Memory Engine · Graph Engine · 4-Path Search · Dual-Stream │
├─────────────────────────────────────────────────────────────────┤
│ SQLite (WAL mode) · FTS5 · Embeddings (optional) │
└─────────────────────────────────────────────────────────────────┘
Zero dependencies. Pure Go. No CGO. No Docker. No cloud. Embedded by the host agent; all data stays local (SQLite, WAL mode).
Yaad is consumed as a Go library. The core entry points:
import (
	"context"
	"log"

	yaad "github.com/GrayCodeAI/yaad/engine"
	"github.com/GrayCodeAI/yaad/storage"
)
// Open (or create) a memory store rooted at the repo / working dir.
store, err := storage.Open(".yaad")
if err != nil {
	log.Fatal(err)
}
eng := yaad.New(store)
ctx := context.Background()
// Store, recall, link, and inspect impact: the same operations the
// host agent exposes as tools (see the table above).
eng.Remember(ctx, yaad.Note{Type: "decision", Text: "Chose NATS for backpressure"})
hits, _ := eng.Recall(ctx, "why did we choose NATS?")
log.Println(hits)
See ARCHITECTURE.md and api/openapi.yaml for the full surface. Versioning/rollback, export/import, decay, and GC are all available as engine methods (the host agent decides which to expose).
Generated at .yaad/config.toml:
[server]
port = 3456
host = "127.0.0.1"
[memory]
hot_token_budget = 800
warm_token_budget = 800
max_memories = 10000
[search]
bm25_weight = 0.5
vector_weight = 0.5
default_limit = 10
[decay]
enabled = true
half_life_days = 30
min_confidence = 0.1
boost_on_access = 0.2
[git]
watch = true
auto_stale = true
git clone https://github.com/GrayCodeAI/yaad.git
cd yaad
make build # go build ./... (library packages; no standalone binary)
make test # Run all tests
make cover # Coverage report (coverage.out + coverage.html)
make ci # Everything CI runs (tidy, fmt, vet, lint, test-race, security)
| Doc | What |
|---|---|
| ARCHITECTURE.md | Technical architecture |
| COMPARISON.md | vs Mem0, Letta, Engram, agentmemory |
| CONTRIBUTING.md | How to contribute |
| CHANGELOG.md | Release notes |
| api/openapi.yaml | OpenAPI spec |
- Discord: GrayCodeAI
- Issues: GitHub Issues
- Contributing: CONTRIBUTING.md
MIT © 2026 GrayCodeAI
yaad (याद) — Hindi/Urdu for memory, remembrance