9 changes: 7 additions & 2 deletions README.md
@@ -128,8 +128,13 @@ Agents Schema is the shared, queryable metadata surface for consumers that start
from the warehouse and need context about data that already exists there.

It is closest in spirit to `information_schema`, but extensible across many
providers. In fact `AGENTS.TABLES` and `AGENTS.COLUMNS` are drop-in enriched
versions of `INFORMATION_SCHEMA.TABLES`/`COLUMNS`: the native columns plus
provider-prefixed context (`dbt_description`, `lookml_ai_context`, …). Because
`INFORMATION_SCHEMA` is per-database, these views cover the database that holds
the `AGENTS` schema — point the workflows at the database your data lives in.
Compared with MCP servers, Agents Schema is narrower: it publishes context
inside the warehouse, while MCP servers can expose tools, actions, and
source-specific workflows.

### How it works
65 changes: 65 additions & 0 deletions SPEC.md
@@ -82,6 +82,63 @@ The current package delivers one table family per metadata source:

Each ingestion replaces its own table family with `CREATE OR REPLACE TABLE` and then inserts the rows parsed from the source metadata.

Each ingestion also refreshes provider-normalized views and generic context views over whichever provider tables currently exist. These views are intended to be familiar drop-in starting points for agents that would otherwise reach for `INFORMATION_SCHEMA.TABLES` or `INFORMATION_SCHEMA.COLUMNS`, while preserving source-provider references for deeper inspection.

The generic views are documented in `AGENTS.ROOT` under the `core` provider.

| View | Purpose |
|---|---|
| `AGENTS.TABLES` | `INFORMATION_SCHEMA.TABLES` enriched with matching provider table context. |
| `AGENTS.COLUMNS` | `INFORMATION_SCHEMA.COLUMNS` enriched with matching provider column context. |

---

## Generic Context Views

### Scope

v1 extends the surfaces `INFORMATION_SCHEMA` already has — `TABLES` and `COLUMNS` — rather than inventing new object types. Relationships, metrics, and entities are intentionally out of scope: the information-schema-faithful home for relationships is the `REFERENTIAL_CONSTRAINTS` / `KEY_COLUMN_USAGE` family (a future extension), and metrics/entities are object types that semantic providers such as OSI already model in their own `AGENTS.OSI_*` tables. The generic views enrich; they do not become a competing semantic model.

### Merge model

Each provider publishes a normalized `AGENTS.<PROVIDER>_TABLES` / `AGENTS.<PROVIDER>_COLUMNS` view with a shared shape. The generic views then take the native `INFORMATION_SCHEMA` view as the row spine via `SELECT t.*` (so they inherit whatever native columns the account exposes — nothing is hardcoded) and **left join every provider view that exists** by object identity:

- `AGENTS.TABLES`: `INFORMATION_SCHEMA.TABLES` joined to each `*_TABLES` view on `table_catalog` / `table_schema` / `table_name`.
- `AGENTS.COLUMNS`: `INFORMATION_SCHEMA.COLUMNS` joined to each `*_COLUMNS` view on `table_catalog` / `table_schema` / `table_name` / `column_name`.
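
The join rules above could be turned into view DDL roughly as follows. This is a minimal sketch, not the package's actual generator: `build_generic_view_sql` is a hypothetical helper, and it assumes each provider view exposes unprefixed shared-shape columns (such as `description`) and already holds one row per object identity. Only the join keys and the `AGENTS.<PROVIDER>_TABLES` / `_COLUMNS` naming come from the spec.

```python
from typing import Sequence

# Join keys as listed in the spec for each generic view.
TABLE_KEYS = ("table_catalog", "table_schema", "table_name")
COLUMN_KEYS = TABLE_KEYS + ("column_name",)


def build_generic_view_sql(
    kind: str, providers: Sequence[str], enrich_cols: Sequence[str]
) -> str:
    """Build illustrative DDL for AGENTS.TABLES or AGENTS.COLUMNS.

    kind is "TABLES" or "COLUMNS"; providers are names such as "dbt";
    enrich_cols are the (assumed unprefixed) shared-shape columns.
    """
    if kind not in ("TABLES", "COLUMNS"):
        raise ValueError(f"unsupported view kind: {kind!r}")
    keys = TABLE_KEYS if kind == "TABLES" else COLUMN_KEYS

    # The native view is the row spine; t.* keeps every native column.
    select_parts = ["t.*"]
    join_parts = []
    for provider in providers:
        alias = provider.lower()
        # Append enrichment columns under the <provider>_ prefix.
        for col in enrich_cols:
            select_parts.append(f"{alias}.{col} AS {alias}_{col}")
        on_clause = " AND ".join(f"{alias}.{key} = t.{key}" for key in keys)
        join_parts.append(
            f"LEFT JOIN AGENTS.{provider.upper()}_{kind} AS {alias} ON {on_clause}"
        )

    lines = [
        f"CREATE OR REPLACE VIEW AGENTS.{kind} AS",
        "SELECT " + ",\n       ".join(select_parts),
        f"FROM INFORMATION_SCHEMA.{kind} AS t",
    ]
    return "\n".join(lines + join_parts)


sql = build_generic_view_sql("TABLES", ["dbt", "lookml"], ["description", "ai_context"])
print(sql)
```

Because every provider is attached with a `LEFT JOIN`, a native row with no provider match still appears, with its prefixed columns left null.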

The merge is generic and provider-agnostic. Each provider's enrichment columns are appended under a `<provider>_` prefix (`dbt_description`, `lookml_ai_context`, `osi_description`, …), so providers never collide and no native column is overwritten. Within a single provider, rows are aggregated to one row per object identity before the join, so duplicate provider rows cannot multiply native rows. A provider that ships a new `*_TABLES`/`*_COLUMNS` view later — for example a memory provider contributing `memory_*` counts — is picked up automatically with no change to the core views.
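
To see why aggregating before the join matters, the following sketch reproduces the pattern in SQLite, which has no `INFORMATION_SCHEMA`. The tables `native_tables` and `dbt_tables` are hypothetical stand-ins, and the choice of `MAX` and `GROUP_CONCAT` as aggregates is an assumption for illustration, not something the spec prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for INFORMATION_SCHEMA.TABLES (hypothetical, two native rows).
    CREATE TABLE native_tables (
        table_catalog TEXT, table_schema TEXT, table_name TEXT, table_type TEXT
    );
    INSERT INTO native_tables VALUES
        ('DB', 'SALES', 'ORDERS', 'BASE TABLE'),
        ('DB', 'SALES', 'CUSTOMERS', 'BASE TABLE');

    -- Stand-in for a provider view that carries two rows for the same table.
    CREATE TABLE dbt_tables (
        table_catalog TEXT, table_schema TEXT, table_name TEXT,
        description TEXT, source_object_id TEXT
    );
    INSERT INTO dbt_tables VALUES
        ('DB', 'SALES', 'ORDERS', 'One row per order', 'model.shop.orders'),
        ('DB', 'SALES', 'ORDERS', 'One row per order', 'model.shop.orders_v2');
    """
)

rows = conn.execute(
    """
    SELECT t.*,
           p.description      AS dbt_description,
           p.source_object_id AS dbt_source_object_id
    FROM native_tables AS t
    LEFT JOIN (
        -- Collapse the provider to one row per object identity first.
        SELECT table_catalog, table_schema, table_name,
               MAX(description) AS description,
               GROUP_CONCAT(source_object_id, ',') AS source_object_id
        FROM dbt_tables
        GROUP BY table_catalog, table_schema, table_name
    ) AS p
      ON  p.table_catalog = t.table_catalog
      AND p.table_schema  = t.table_schema
      AND p.table_name    = t.table_name
    ORDER BY t.table_name
    """
).fetchall()

for row in rows:
    print(row)
```

The result has one row per native table even though the provider holds two rows for `ORDERS`; joining the provider table directly would have returned `ORDERS` twice.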

`SELECT t.*` resolves against the `INFORMATION_SCHEMA` of the database that holds the `AGENTS` schema, so `AGENTS.TABLES`/`COLUMNS` cover objects in that database. Provider-specific detail not promoted into the shared shape stays in the source tables (for example `AGENTS.LOOKML_DIMENSION`) and is reachable through the `<provider>_source_object_id` columns.

### `AGENTS.TABLES`

`SELECT t.*` from `INFORMATION_SCHEMA.TABLES` plus, for each participating provider, the following prefixed columns:

| Column | Description |
|---|---|
| `<provider>_table_type` | Provider object kind, such as `DBT_MODEL` or `OSI_DATASET`. |
| `<provider>_display_name` | Provider label for the matched table. |
| `<provider>_description` | Provider description for the matched table. |
| `<provider>_ai_context` | Provider AI context for the matched table. |
| `<provider>_source_object_id` | Provider-specific object identifier(s). |
| `<provider>_source_path` | Source file path when available. |
| `<provider>_materialization` | Provider materialization when available. |
| `<provider>_tags` | Provider tags when available. |

### `AGENTS.COLUMNS`

`SELECT t.*` from `INFORMATION_SCHEMA.COLUMNS` plus, for each participating provider, the following prefixed columns:

| Column | Description |
|---|---|
| `<provider>_display_name` | Provider label for the matched column. |
| `<provider>_description` | Provider description. |
| `<provider>_ai_context` | Provider AI context when available. |
| `<provider>_semantic_type` | Provider semantic field kind when available. |
| `<provider>_is_time_dimension` | Whether the field is marked time-like. |
| `<provider>_expression` | Provider expression or SQL when available. |
| `<provider>_source_object_id` | Provider-specific object identifier. |
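
Since the set of providers is open-ended, a consumer may want to discover which providers contributed columns instead of hardcoding a list. The sketch below groups the column names of `AGENTS.COLUMNS` by provider prefix. `group_enrichment_columns` is a hypothetical helper, and it assumes no native column happens to end in one of the shared suffixes from the table above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Shared-shape suffixes from the AGENTS.COLUMNS table in the spec.
COLUMN_SUFFIXES = (
    "display_name",
    "description",
    "ai_context",
    "semantic_type",
    "is_time_dimension",
    "expression",
    "source_object_id",
)


def group_enrichment_columns(column_names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each provider prefix to the shared-shape suffixes it contributes."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name in column_names:
        lowered = name.lower()  # warehouses often report upper-case names
        for suffix in COLUMN_SUFFIXES:
            marker = "_" + suffix
            # Require a non-empty provider prefix in front of the suffix.
            if lowered.endswith(marker) and len(lowered) > len(marker):
                grouped[lowered[: -len(marker)]].append(suffix)
                break
    return dict(grouped)


columns = [
    "TABLE_NAME",
    "COLUMN_NAME",
    "DATA_TYPE",
    "dbt_description",
    "lookml_ai_context",
    "OSI_IS_TIME_DIMENSION",
]
providers = group_enrichment_columns(columns)
print(providers)
```

Native columns such as `TABLE_NAME` and `DATA_TYPE` are ignored because they do not end in a shared suffix.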

---

## Source: dbt
@@ -446,13 +503,21 @@ The current source provider names are:
| `lookml` | `AGENTS.LOOKML_*` |
| `osi` | `AGENTS.OSI_*` |

The current core provider name is:

| Provider | Objects |
|---|---|
| `core` | `AGENTS.ROOT`, `AGENTS.TABLES`, `AGENTS.COLUMNS` |

---

## Summary of Current Tables

| Table | Source | Purpose |
|---|---|---|
| `AGENTS.ROOT` | core | Provider registry upserted by dbt, LookML, and OSI workflows |
| `AGENTS.TABLES` | core | `INFORMATION_SCHEMA.TABLES` enriched from provider `*_TABLES` views |
| `AGENTS.COLUMNS` | core | `INFORMATION_SCHEMA.COLUMNS` enriched from provider `*_COLUMNS` views |
| `AGENTS.DBT_MODEL` | dbt | dbt models with schema, materialization, documentation, path, and tags |
| `AGENTS.DBT_COLUMN` | dbt | Documented dbt model columns |
| `AGENTS.DBT_DEPENDENCY` | dbt | Direct dbt dependency edges |