feat: initial scaffold and core implementation of agent-kernel by Copilot · Pull Request #1 · dgenio/agent-kernel

Copilot · 2026-03-02T19:40:15Z

Implements agent-kernel from scratch — a capability-based security kernel for AI agents operating in large tool ecosystems (1000+ tools via MCP, A2A, internal APIs). Provides the authorization, execution, and audit layer sitting above raw tool execution and below the LLM context window.

Package structure (`src/` layout, Python ≥ 3.10, Apache-2.0)

enums.py — SafetyClass (READ/WRITE/DESTRUCTIVE), SensitivityTag (PII/PCI/SECRETS/NONE)
errors.py — 10-class exception hierarchy; no bare ValueError/KeyError anywhere
models.py — Core dataclasses: Capability, Principal, Frame, Handle, ActionTrace, Budgets, etc.
registry.py — CapabilityRegistry with deterministic keyword-overlap search (no LLM, no vector DB)
tokens.py — CapabilityToken + HMACTokenProvider (HMAC-SHA256); tokens bind principal_id + capability_id + constraints for confused-deputy prevention
policy.py — DefaultPolicyEngine: READ always allowed; WRITE requires justification ≥ 15 chars + writer|admin role; DESTRUCTIVE requires admin; PII/PCI enforces tenant attribute + allowed_fields; max_rows capped at 50 (user) / 500 (service)
router.py — StaticRouter with ordered fallback driver chains
drivers/ — InMemoryDriver (Python callables + 200-record deterministic billing dataset), HTTPDriver (httpx async)
firewall/ — Firewall transforms RawResult → Frame; four response modes (summary/table/handle_only/raw); enforces Budgets; regex-based PII/PCI redaction; deterministic summarisation
handles.py — HandleStore with TTL, lazy eviction, pagination (offset/limit), field selection, equality filtering
trace.py / kernel.py — TraceStore + Kernel main entry point wiring all components
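The safety-class rules listed for `policy.py` can be sketched roughly as follows. This is a minimal illustration of the rules as stated in this description, not the actual `DefaultPolicyEngine`; the `Principal` fields, the `kind` attribute, and the function names `evaluate` and `cap_max_rows` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class SafetyClass(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class Principal:
    # Hypothetical shape; the real Principal dataclass may differ.
    principal_id: str
    roles: frozenset = field(default_factory=frozenset)
    kind: str = "user"  # assumed field: "user" or "service"


MIN_JUSTIFICATION = 15                     # characters, per the WRITE rule
MAX_ROWS = {"user": 50, "service": 500}    # caps quoted in the description


def evaluate(safety, principal, justification=""):
    """Return (allowed, reason) for one request."""
    if safety is SafetyClass.READ:
        return True, "read is always allowed"
    if safety is SafetyClass.WRITE:
        if not principal.roles & {"writer", "admin"}:
            return False, "write requires the writer or admin role"
        if len(justification.strip()) < MIN_JUSTIFICATION:
            return False, "write requires a justification of at least 15 characters"
        return True, "write allowed"
    # Remaining case: DESTRUCTIVE
    if "admin" not in principal.roles:
        return False, "destructive requires the admin role"
    return True, "destructive allowed"


def cap_max_rows(requested, principal):
    """Clamp a requested row count to the cap for the principal kind."""
    return min(requested, MAX_ROWS[principal.kind])
```

Returning a reason string alongside the boolean keeps each denial explainable, which fits the audit-trail goal described here.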

Quickstart

kernel = Kernel(registry, router=StaticRouter(routes={"tasks.list": ["memory"]}))
kernel.register_driver(driver)

token = kernel.get_token(CapabilityRequest("tasks.list", goal="list tasks"), principal, justification="")
frame = await kernel.invoke(token, principal=principal, args={})
# frame.facts  →  ['Total rows: 20', 'Top keys: id, title, done', ...]
# frame.handle →  Handle(handle_id='...', total_rows=20, ...)

expanded = kernel.expand(frame.handle, query={"limit": 3, "fields": ["id", "title"]})
trace = kernel.explain(frame.action_id)   # full audit record
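The `kernel.expand(...)` call above pages through rows stored behind a handle. A rough sketch of that kind of query (equality filtering, offset/limit pagination, field selection) is shown below. It works on a plain list of dicts; the standalone `expand` function and the `"filter"` and `"offset"` query keys are assumptions for illustration, not the `HandleStore` API.

```python
from typing import Any, Dict, List


def expand(rows: List[Dict[str, Any]], query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Apply equality filters, then offset/limit, then field selection."""
    filters = query.get("filter", {})   # hypothetical key: {"field": value}
    offset = query.get("offset", 0)
    limit = query.get("limit")          # None means "no limit"
    fields = query.get("fields")        # None means "all fields"

    # Keep only rows where every filtered field equals the requested value.
    matched = [r for r in rows if all(r.get(k) == v for k, v in filters.items())]

    # Slice one page; slicing an empty list is safe and returns [].
    end = None if limit is None else offset + limit
    page = matched[offset:end]

    if fields is None:
        return page
    # Project each row onto the requested fields, skipping missing keys.
    return [{k: r[k] for k in fields if k in r} for r in page]


rows = [{"id": i, "title": f"task {i}", "done": i % 2 == 0} for i in range(20)]
first_page = expand(rows, {"limit": 3, "fields": ["id", "title"]})
```

Filtering before paging means `offset` and `limit` count matching rows only, which is usually what a caller paging through filtered results expects.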

Testing & tooling

107 pytest tests, 94% coverage across all modules
pyproject.toml (PEP 621, hatchling), Makefile (fmt/lint/type/test/example/ci)
GitHub Actions CI matrix: Python 3.10 / 3.11 / 3.12 with explicit permissions: contents: read
Three self-contained examples (no internet): basic_cli.py, billing_demo.py, http_driver_demo.py
Docs: architecture.md, security.md, integrations.md, capabilities.md, context_firewall.md

Original prompt

Create the initial scaffold and core implementation for agent-kernel, a Python library that implements a capability-based security kernel for AI agents operating in large tool ecosystems (1000+ tools via MCP, A2A, internal APIs).

This library sits ABOVE contextweaver (a context compilation library, available as a dependency) and provides the authorization, execution, and audit layer.

What this library does

Capability Registry: register task-shaped capabilities (not raw tools) with safety classes and sensitivity tags.
Capability Tokens: HMAC-signed, time-bounded, principal-scoped tokens that authorize specific actions.
Policy Engine: role-based access control with confused-deputy prevention. READ/WRITE/DESTRUCTIVE safety classes, PII/PCI sensitivity handling.
Drivers: pluggable execution layer (InMemoryDriver for testing, HTTPDriver for real APIs, protocol-agnostic MCP adapter interface).
Context Firewall: transforms raw tool output into budgeted Frames (facts + table preview + handles). Never exposes raw output to the LLM by default.
Audit Trail: every action is traced and explainable via kernel.explain(action_id).
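To make the "HMAC-signed, time-bounded, principal-scoped" wording concrete, here is a small sketch using only the standard library. It is an illustration under stated assumptions, not the `HMACTokenProvider` implementation: the token is a plain dict, and the `issue`/`verify` names, the claim layout, and the `now` parameter are invented for the example.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def _payload(principal_id: str, capability_id: str, constraints: dict, expires_at: int) -> bytes:
    # Canonical JSON so identical claims always serialise to identical bytes.
    claims = {
        "principal_id": principal_id,
        "capability_id": capability_id,
        "constraints": constraints,
        "expires_at": expires_at,
    }
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue(secret: bytes, principal_id: str, capability_id: str, constraints: dict,
          ttl_seconds: int = 300, now: Optional[float] = None) -> dict:
    """Sign the claims with HMAC-SHA256 and return them with the signature."""
    issued = time.time() if now is None else now
    expires_at = int(issued) + ttl_seconds
    payload = _payload(principal_id, capability_id, constraints, expires_at)
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {
        "principal_id": principal_id,
        "capability_id": capability_id,
        "constraints": constraints,
        "expires_at": expires_at,
        "signature": signature,
    }


def verify(secret: bytes, token: dict, principal_id: str, capability_id: str,
           now: Optional[float] = None) -> bool:
    """Check signature, expiry, and that the token is bound to this caller and capability."""
    current = time.time() if now is None else now
    payload = _payload(token["principal_id"], token["capability_id"],
                       token["constraints"], token["expires_at"])
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["signature"]):
        return False  # claims were altered or signed with a different secret
    if current >= token["expires_at"]:
        return False  # time-bounded: token has expired
    # Principal and capability binding: a token cannot be replayed by
    # another principal or against another capability.
    return token["principal_id"] == principal_id and token["capability_id"] == capability_id
```

Because the constraints are part of the signed payload, widening them after issuance invalidates the signature. This is one way to get the confused-deputy protection the description refers to.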

Package details

Package name: agent_kernel
Python >= 3.10
pyproject.toml with PEP 621, src/ layout
License: Apache-2.0
Runtime deps: httpx (for HTTPDriver)
Dev deps: pytest, pytest-cov, pytest-asyncio, ruff, mypy
[tool.pytest.ini_options] asyncio_mode = "auto"

Repository structure

agent-kernel/
├── pyproject.toml
├── Makefile                    # fmt, lint, type, test, example, ci
├── LICENSE                     # Apache-2.0
├── README.md
├── CHANGELOG.md
├── CONTRIBUTING.md
├── AGENTS.md                   # AI agent instructions for working in this repo
├── .gitignore
├── .github/workflows/ci.yml   # Python 3.10, 3.11, 3.12: ruff + mypy + pytest
├── docs/
│   ├── architecture.md         # Component deep-dive + Mermaid diagram
│   ├── security.md             # Threat model, confused deputy, token scopes
│   ├── integrations.md         # MCP integration, custom drivers, capability mapping
│   ├── capabilities.md         # Designing good capabilities, naming conventions
│   └── context_firewall.md     # Budgets, frames, handles, redaction, expand
├── examples/
│   ├── basic_cli.py            # Full flow: request → grant → invoke → expand
│   ├── billing_demo.py         # InMemoryDriver with dataset, budgets, handles, pagination
│   └── http_driver_demo.py     # Local mini HTTP server + HTTPDriver (no internet needed)
├── src/
│   └── agent_kernel/
│       ├── __init__.py         # Public API exports + __version__
│       ├── py.typed            # PEP 561
│       ├── models.py           # Core dataclasses: Capability, CapabilityRequest, CapabilityGrant,
│       │                       #   Principal, PolicyDecision, RoutePlan, ImplementationRef,
│       │                       #   RawResult, Frame, Handle, Provenance, ActionTrace,
│       │                       #   Budgets, FieldSpec, ResponseMode
│       ├── enums.py            # SafetyClass (READ/WRITE/DESTRUCTIVE),
│       │                       #   SensitivityTag (PII/PCI/SECRETS/NONE)
│       ├── errors.py           # AgentKernelError, TokenExpired, TokenInvalid, TokenScopeError,
│       │                       #   PolicyDenied, DriverError, FirewallError, CapabilityNotFound,
│       │                       #   HandleNotFound, HandleExpired
│       ├── registry.py         # CapabilityRegistry: register, lookup, keyword-based request matching
│       ├── policy.py           # PolicyEngine protocol + DefaultPolicyEngine (rule-based):
│       │                       #   READ allowed, WRITE needs justification+role, DESTRUCTIVE needs admin,
│       │                       #   PII/PCI requires tenant attribute, max_rows enforcement
│       ├── tokens.py           # CapabilityToken dataclass, TokenProvider protocol,
│       │                       #   HMACTokenProvider (SHA-256, env secret, expiry, signature verify)
│       ├── router.py           # Router protocol + StaticRouter (first match + fallback)
│       ├── drivers/
│       │   ├── __init__.py
│       │   ├── base.py         # Driver protocol, ExecutionContext, RawResult
│       │   ├── memory.py       # InMemoryDriver (simulated capabilities with Python functions)
│       │   └── http.py         # HTTPDriver (httpx-based, timeouts, error mapping)
│       ├── firewall/
│       │   ├── __init__.py
│       │   ├── budgets.py      # Budgets dataclass (max_rows, max_fields, max_chars, max_depth)
│       │   ├── transform.py    # Firewall class: RawResult → Frame with budget enforcement
│       │   ├── redaction.py    # PII/PCI field redaction (email, phone, card_number, ssn)
│       │   └── summarize.py    # Deterministic summarization heuristics (no LLM):
│       │                       #   list-of-dicts → count+stats+top_keys,
│       │                       #   dict → keys+aggregates, str...

</details>




Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>

Copilot

Pull request overview

Initial implementation of the agent-kernel library: a capability-based authorization + execution kernel for agents, including policy gating, HMAC-signed capability tokens, driver routing/execution, a context firewall (Frame/Handle), and an audit trail.

Changes:

Added core runtime modules (models, registry, policy, tokens, router, kernel, handle/trace stores, drivers, firewall).
Added a full pytest suite plus fixtures to validate end-to-end flows and security properties.
Added packaging/tooling, CI workflow, docs, and runnable examples.

Reviewed changes

Copilot reviewed 46 out of 47 changed files in this pull request and generated 10 comments.


File	Description
.github/workflows/ci.yml	CI matrix for lint/format/type/test/examples.
AGENTS.md	Repo conventions and security/quality guidelines for agents.
CHANGELOG.md	Project changelog scaffold.
CONTRIBUTING.md	Contributor workflow and quality bar.
Makefile	Local developer commands aligned with CI.
README.md	Project overview, architecture, and quickstart.
docs/architecture.md	High-level architecture and component diagram.
docs/capabilities.md	Guidance for capability naming and design.
docs/context_firewall.md	Firewall response modes, budgets, handles, redaction.
docs/integrations.md	Driver integration guidance (MCP/HTTP/custom).
docs/security.md	Threat model and security properties.
examples/basic_cli.py	End-to-end demo of request → token → invoke → expand → explain.
examples/billing_demo.py	Demo using deterministic billing dataset + budgets + expansion.
examples/http_driver_demo.py	Demo running a local HTTP server with HTTPDriver.
pyproject.toml	Packaging metadata + dependencies + ruff/mypy/pytest config.
src/agent_kernel/__init__.py	Public API exports and version.
src/agent_kernel/drivers/__init__.py	Driver subpackage exports.
src/agent_kernel/drivers/base.py	Driver protocol + execution context.
src/agent_kernel/drivers/http.py	Async HTTP execution driver based on httpx.
src/agent_kernel/drivers/memory.py	In-memory driver + deterministic billing dataset factory.
src/agent_kernel/enums.py	SafetyClass and SensitivityTag enums.
src/agent_kernel/errors.py	Custom exception hierarchy.
src/agent_kernel/firewall/__init__.py	Firewall subpackage exports.
src/agent_kernel/firewall/budgets.py	Firewall budgets dataclass.
src/agent_kernel/firewall/redaction.py	Regex + field-name based redaction utilities.
src/agent_kernel/firewall/summarize.py	Deterministic summarization heuristics.
src/agent_kernel/firewall/transform.py	Core RawResult → Frame transformer enforcing budgets/modes.
src/agent_kernel/handles.py	HandleStore with TTL + expand (pagination/filters/fields).
src/agent_kernel/kernel.py	Main orchestration: token verify → route → execute → firewall → trace.
src/agent_kernel/models.py	Core dataclasses: Capability, Principal, Frame, Handle, ActionTrace, etc.
src/agent_kernel/policy.py	DefaultPolicyEngine rules + constraint enforcement.
src/agent_kernel/py.typed	PEP 561 marker for typed package.
src/agent_kernel/registry.py	Capability registry + keyword-overlap search.
src/agent_kernel/router.py	Static routing from capability_id → ordered driver chain.
src/agent_kernel/tokens.py	CapabilityToken serialization + HMACTokenProvider signing/verify.
src/agent_kernel/trace.py	TraceStore for in-memory audit traces.
tests/conftest.py	Shared fixtures for kernel, principals, registry, drivers.
tests/test_drivers.py	Driver unit tests (InMemoryDriver + HTTPDriver).
tests/test_firewall.py	Firewall mode/budget/redaction behavior tests.
tests/test_handles.py	HandleStore TTL/eviction/expand behavior tests.
tests/test_kernel.py	Integration tests for full kernel flows + fallback + token scope.
tests/test_models.py	Dataclass construction and serialization tests.
tests/test_policy.py	DefaultPolicyEngine rule tests.
tests/test_registry.py	CapabilityRegistry registration/search tests.
tests/test_router.py	StaticRouter routing semantics tests.
tests/test_tokens.py	HMACTokenProvider issuance/verify/tamper/expiry tests.
tests/test_trace.py	TraceStore record/get/list tests.



…ackaging

Apply the 19 Copilot inline review findings on PR #67, grouped:

Packaging / optional deps (#1, #2, #12)
- Defer `yaml` and `tomllib`/`tomli` imports into the `DeclarativePolicyEngine.from_yaml` / `from_toml` loaders so `import agent_kernel` works without the `policy` extra installed. Missing parser → `PolicyConfigError` with an install hint.

Policy DSL parsing (#3, #4)
- Validate types of `roles` (list[str]), `attributes` (dict[str, str]), `min_justification` (int; bool rejected), and `constraints` (mapping) in `_parse_rule()`; raise `PolicyConfigError` with precise messages instead of silently producing misbehaving rules or crashing at evaluation time.

Policy DSL explain() (#5)
- Correctly report explicit deny rules that fully match (previously fell through to a misleading `no_matching_rule` fallback and dropped the rule's `reason`). Skip partial-match deny rules so the explanation focuses on the actionable allow rule rather than suggesting changes that would only trigger the deny.

Example policy files (#6, #7, #8, #9, #10, #11)
- Rename `default_action` → `default` (the parser reads `default`, the previous key was silently ignored).
- Express PII-with-tenant as an allow rule paired with default-deny; the prior `deny-pii-no-tenant` was inverted under first-match-wins.
- Move `allow-secrets-service` before `deny-secrets-non-service`; the deny was previously unreachable.
- Tighten `allow-read-*` / `allow-write-*` to `sensitivity: [NONE]` so PII reads route through the dedicated allow-pii rule.

Kernel dry-run (#13, #14, #17)
- Resolve `DryRunResult.operation` the same way drivers do (`args.get("operation", capability_id)`) so it matches what a driver would actually receive, instead of `capability.impl.operation`, which can diverge.
- Mirror the Firewall's admin-only gate for `raw` mode: non-admin principals see their requested `raw` downgraded to `summary` in `DryRunResult`, matching real-invoke behaviour. Prevents probing for raw availability via dry-run.

Docs / annotations (#15, #16, #18)
- `Kernel.explain_denial()` docstring no longer contradicts itself ("never raises" vs. `CapabilityNotFound`).
- `drivers/mcp.py` adds an explicit `_McpError: type[Exception] | None` annotation so mypy --strict is happy across the try/except branches.
- `DryRunResult.budget_remaining` docstring no longer references the unimplemented `BudgetManager`; documented as reserved for a future cross-invocation budget mechanism.

Protocol softening (#19)
- Split `explain()` out of `PolicyEngine` into a new `ExplainingPolicyEngine` protocol so downstream engines that implement only `evaluate()` keep satisfying `PolicyEngine`. `Kernel.explain_denial()` uses `getattr` and raises a clear `AgentKernelError` when the configured engine cannot explain. Both built-in engines satisfy the richer protocol.

Tests
- Add tests for: explicit-deny fully-matched explanation, partial-match deny skipping, every `_parse_rule` validation error, install-hint paths for `from_yaml` / `from_toml`, dry-run operation resolution, dry-run raw-mode downgrade for non-admin, raw preserved for admin, and explain_denial against an engine without `explain()`.

Docs
- `docs/agent-context/invariants.md` adds a "Dry-run response-mode parity" trap entry so future contributors keep dry-run in sync with the Firewall's admin gate and the driver's operation resolution.

CHANGELOG
- Documents all of the above under [Unreleased].

`make ci` equivalents: ruff format/check, mypy --strict, 306 passing tests at 95% coverage, all three example scripts complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
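Several of the example-policy fixes above hinge on first-match-wins evaluation, where rule order decides the outcome. The sketch below is a generic illustration of that ordering effect, not the project's policy DSL: the `Rule` fields, the catch-all rule name `deny-secrets`, and the `evaluate` signature are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    action: str                 # "allow" or "deny"
    sensitivity: str            # e.g. "SECRETS", "PII", "NONE"
    role: Optional[str] = None  # None matches any role


def evaluate(rules: List[Rule], sensitivity: str, role: str,
             default: str = "deny") -> Tuple[str, str]:
    """Return (action, rule_name) for the first rule that matches."""
    for rule in rules:
        if rule.sensitivity == sensitivity and rule.role in (None, role):
            return rule.action, rule.name
    # Nothing matched: fall back to the configured default action.
    return default, "no_matching_rule"


# Specific allow first, broad deny second: both rules can fire.
ordered = [
    Rule("allow-secrets-service", "allow", "SECRETS", role="service"),
    Rule("deny-secrets", "deny", "SECRETS"),
]

# Broad deny first: the allow rule below it is never reached.
shadowed = list(reversed(ordered))
```

With `ordered`, a `service` principal is allowed and everyone else is denied. With `shadowed`, the catch-all deny matches first for every role, so the more specific rule is dead configuration.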

Initial plan

81311d9

Copilot AI assigned Copilot and dgenio Mar 2, 2026

Copilot started work on behalf of dgenio March 2, 2026 19:40 View session

Copilot AI and others added 2 commits March 2, 2026 20:01

feat: initial scaffold and core implementation of agent-kernel

cf4f9ff

Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>

fix: add explicit permissions to CI workflow (CodeQL alert)

b8cfb22

Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Create initial scaffold for agent-kernel library~~ feat: initial scaffold and core implementation of agent-kernel Mar 2, 2026

Copilot finished work on behalf of dgenio March 2, 2026 20:06

dgenio requested a review from Copilot March 3, 2026 23:25

Copilot started reviewing on behalf of dgenio March 3, 2026 23:25 View session

Copilot AI reviewed Mar 3, 2026

View reviewed changes

dgenio added 16 commits March 4, 2026 05:31

fix: replace bare ValueError with CapabilityAlreadyRegistered in registry

68760bc

fix: wire default_timeout as fallback in HTTPDriver.execute()

aa690f1

fix: prevent IndexError on empty rows in HandleStore.expand()

7fedd66

fix: include full token and audit_id in CapabilityGrant

32da6bc

fix: validate max_rows constraint and raise PolicyDenied on invalid input

b27b1a9

chore: remove unused _deep_copy_truncated and import copy

5dc0b01

fix: avoid json.loads on truncated JSON in raw mode transform

8b7fa79

fix: record effective response_mode in ActionTrace instead of requested

994260f

docs: fix Kernel docstring example to include principal arg

1cc5929

fix: require justification for DESTRUCTIVE operations

610a18f

fix: remove duplicate Budgets class, consolidate to firewall.budgets

bc75e42

fix: tighten _PHONE_RE to require phone-like structure, add regex tests

ccb5dd6

fix: bound HandleStore with max_entries cap and periodic auto-eviction

c325985

refactor: deduplicate get_token via grant_capability, bring kernel.py under 300 lines

e6f8c38

fix: add threading.Lock to _get_secret, fix test_dev_secret state leakage

22f61a8

style: fix ruff format for get_token return statement

692e06f

dgenio marked this pull request as ready for review March 4, 2026 07:08

dgenio merged commit f50b245 into main Mar 4, 2026
6 checks passed

dgenio mentioned this pull request Mar 6, 2026

Rate limiting in PolicyEngine #39

Closed

8 tasks

dgenio mentioned this pull request Mar 6, 2026

MCP Driver: connect kernel to any MCP server (stdio + Streamable HTTP) #41

Closed

8 tasks

dgenio mentioned this pull request Mar 14, 2026

feat: add sliding-window rate limiting to DefaultPolicyEngine #64

Merged

dgenio merged 19 commits into main from copilot/init-agent-kernel-implementation

3 participants

Package structure (`src/` layout, Python ≥ 3.10, Apache-2.0)