A minimal runnable demo that shows how an AI / Agent workflow goes from an intent to a trace, an evidence bundle, a replay verdict, and an audit receipt.
A minimal runnable demo for auditable AI agent workflows.
This repository is the walkthrough demo for the execution-evidence path. It is the guided walkthrough surface across the stack, not the canonical architecture hub and not the canonical evidence-profile spec.
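This repository is not a theory entry point; it is a trusted-AI workflow demo that actually runs. It breaks a single Agent execution into a reviewable artifact chain:
- intent input: states what the Agent is asked to do
- policy / rule reference: states which rules or governance constraints were consulted before execution
- execution trace: records the execution process and the event chain
- evidence bundle: packages the execution evidence as a deliverable object
- replay / verification result: gives the outcome of a review or replay
- audit receipt: condenses the key evidence of this execution into an audit receipt
The core goal is not to make the AI answer more, but to make a single AI / Agent execution traceable, reviewable, and auditable.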
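One way to picture an artifact chain like the one above is as a list of records in which each stage stores the digest of the previous one, so that a later edit to any stage becomes detectable. The sketch below is an illustration only: the stage names follow the list above, but the record layout (`payload`, `prev_digest`, `digest`) and the helper functions are hypothetical and are not taken from this repository's schemas.

```python
import hashlib
import json

# Stage names mirror the artifact list above; the record layout is hypothetical.
STAGES = [
    "intent",
    "policy",
    "trace",
    "evidence_bundle",
    "replay_verdict",
    "audit_receipt",
]
LINKED_FIELDS = ("stage", "payload", "prev_digest")


def digest(payload: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON encoding."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def build_chain(payloads: dict) -> list:
    """Create one record per stage, each linked to the previous digest."""
    chain = []
    prev = None
    for stage in STAGES:
        record = {"stage": stage, "payload": payloads[stage], "prev_digest": prev}
        record["digest"] = digest({k: record[k] for k in LINKED_FIELDS})
        chain.append(record)
        prev = record["digest"]
    return chain


def verify_chain(chain: list) -> bool:
    """Recompute every digest and check that the back-links are intact."""
    prev = None
    for record in chain:
        body = {k: record[k] for k in LINKED_FIELDS}
        if record["prev_digest"] != prev or record["digest"] != digest(body):
            return False
        prev = record["digest"]
    return True
```

With this layout, changing the payload of any stage after the chain is built makes `verify_chain` return `False`, which is the property a reviewer would rely on.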
- Evidence -> agent-evidence
- Architecture -> digital-biosphere-architecture
- Audit -> aro-audit
- Governance -> token-governor
python3 -m demo.agent
By default, output is written to artifacts/demo_output/, including:
- interaction/intent.json
- interaction/action.json
- interaction/result.json
- evidence/example_audit.json
- evidence/result.json
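After a run, a quick way to confirm that the expected files were produced is to check for them on disk. The following is a minimal sketch: the relative paths are the ones listed above, while the helper name `missing_outputs` is an assumption made for this example and is not part of the repository.

```python
from pathlib import Path

# Relative paths as listed above, under artifacts/demo_output/.
EXPECTED = [
    "interaction/intent.json",
    "interaction/action.json",
    "interaction/result.json",
    "evidence/example_audit.json",
    "evidence/result.json",
]


def missing_outputs(root: Path, expected=EXPECTED) -> list:
    """Return the expected relative paths that are not present under root."""
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs(Path("artifacts/demo_output"))
    print("all outputs present" if not missing else f"missing: {missing}")
```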
bash scripts/run_demo.sh
This wrapper refreshes the local demo output under artifacts/demo_output/.
python3 examples/enterprise_sandbox_demo/run.py
This path shows a more complete loop from intent to audit receipt; its output directory is artifacts/enterprise_sandbox_demo/.
The receipt for the enterprise sandbox chain is checked through the canonical ARO surface aro_audit.receipt_validation with the minimal profile.
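The enterprise sandbox demo generates:
- intent.json
- policy.json
- trace.jsonl
- sep.bundle.json
- replay_verdict.json
- audit_receipt.json
These artifacts correspond to a minimal audit chain: intent input, policy constraints, execution trace, evidence bundle, replay verdict, and audit receipt.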
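To hand these artifacts to a reviewer, it can help to record a digest for each file so that later copies can be compared byte for byte. The sketch below assumes only the file names listed above and that each file is valid JSON (or JSON Lines for trace.jsonl). It makes no assumption about the fields inside them, and it is not the repository's own validator; `parse_artifact` and `build_manifest` are hypothetical helpers.

```python
import hashlib
import json
from pathlib import Path

# File names as listed above, in chain order.
CHAIN_FILES = [
    "intent.json",
    "policy.json",
    "trace.jsonl",
    "sep.bundle.json",
    "replay_verdict.json",
    "audit_receipt.json",
]


def parse_artifact(path: Path):
    """Parse .jsonl files line by line and everything else as one JSON document."""
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return json.loads(text)


def build_manifest(root: Path) -> dict:
    """Map each artifact name to the SHA-256 digest of its bytes.

    Raises if a file is missing or does not parse.
    """
    manifest = {}
    for name in CHAIN_FILES:
        path = root / name
        parse_artifact(path)  # raises on a missing file or invalid JSON
        manifest[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


if __name__ == "__main__":
    root = Path("artifacts/enterprise_sandbox_demo")
    if root.is_dir():
        print(json.dumps(build_manifest(root), indent=2))
```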
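Many Agent demos only show that "the model can complete the task". verifiable-agent-demo shows something else: once the task is done, whether you can still explain why it was allowed to run, what happened during execution, whether the output can be reviewed, and what evidence an auditor receives.
This is exactly the part that trusted AI / Agent Evidence / LangChain workflows need to fill in when they move into production.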
Audit evidence demo output:
Planned follow-up captures:
- assets/demo-run.gif
- assets/artifact-chain.png
See assets/README.md for the capture checklist.
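This repository demonstrates that I can take an AI Agent workflow from PoC to a deliverable, reviewable, and auditable minimal closed loop: it has an intent, rules, a trace, an evidence bundle, a replay verdict, and an audit receipt.
verifiable-agent-demo is the walkthrough demo for the execution-evidence path. It is not the canonical architecture hub, not the canonical evidence-profile spec, and not an audit control plane.
If you only want to see a loop that runs, start from the Quick Start in this README. If you want the evidence profile and validator, go to agent-evidence. If you want a more complete architecture map, go to digital-biosphere-architecture.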
Shared doctrine:
Sandbox controls execution; portable evidence verifies execution.
- Governance decides what should be allowed.
- Execution integrity proves what actually happened.
- Audit evidence exports artifacts for independent review.
flowchart LR
Persona["Persona (POP)"] --> Intent["Intent Object (AIP)"]
Intent --> Governance["Governance Check"]
Governance --> Trace["Execution Trace"]
Trace --> Audit["Audit Evidence (ARO)"]
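The flow in the diagram can be sketched in a few lines of Python to show how the three doctrine points relate: a governance check decides whether the action may run, the trace records what actually happened, and the audit export packages both for review. Everything here is a hypothetical illustration: the class and function names, the allow-list policy, and the event fields are assumptions, not the repository's POP, AIP, or ARO implementations.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    persona: str  # who is asking (hypothetical field)
    action: str   # what they want done (hypothetical field)


@dataclass
class Run:
    intent: Intent
    allowed: bool
    trace: list = field(default_factory=list)


def governance_check(intent: Intent, allowed_actions: set) -> bool:
    """Governance decides what should be allowed (here: a simple allow-list)."""
    return intent.action in allowed_actions


def execute(intent: Intent, allowed_actions: set) -> Run:
    """Record what actually happened, including a denied attempt."""
    run = Run(intent=intent, allowed=governance_check(intent, allowed_actions))
    run.trace.append({"event": "governance_check", "allowed": run.allowed})
    if run.allowed:
        # A real agent would act here; this sketch only records the event.
        run.trace.append({"event": "action_executed", "action": intent.action})
    return run


def export_audit_evidence(run: Run) -> dict:
    """Export a plain dict that a reviewer could inspect independently."""
    return {
        "persona": run.intent.persona,
        "action": run.intent.action,
        "allowed": run.allowed,
        "trace": list(run.trace),
    }
```

Note that a denied intent still produces a trace and an exportable record, so the decision itself remains reviewable.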
Fastest external demo path:
bash scripts/run_demo.sh
make killer-demo
python3 -m http.server --directory docs 8000
Existing CrewAI demo path:
bash scripts/setup_framework_venv.sh
.venv/bin/python crew/crew_demo.py
Environment notes:
- Python 3 is sufficient for the minimal local path.
- Refresh the tracked deterministic sample bundle with python3 scripts/refresh_demo_samples.py.
- The optional CrewAI and LangChain paths should run from a git-ignored local .venv/ created by scripts/setup_framework_venv.sh.
- The pinned framework helper environment currently uses crewai 1.10.1, langchain 1.2.12, and langchain-core 1.2.18.
- CrewAI currently requires Python <3.14.
- Both demo paths use deterministic local mock data and do not require external API calls.
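Before setting up the optional framework environment, the interpreter version can be checked against the Python <3.14 constraint noted above. This is a small sketch; the helper name `supports_crewai_path` is hypothetical, and the upper bound is simply the one stated in the notes.

```python
import sys

# Upper bound taken from the environment notes: CrewAI requires Python <3.14.
CREWAI_MAX_EXCLUSIVE = (3, 14)


def supports_crewai_path(version=None) -> bool:
    """Return True if the (major, minor) version is Python 3 and below 3.14."""
    if version is None:
        version = sys.version_info[:2]
    return (3, 0) <= tuple(version[:2]) < CREWAI_MAX_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    status = "ok" if supports_crewai_path() else "unsupported"
    print(f"Python {current}: {status} for the CrewAI path")
```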
- Quick walkthrough
- Interaction flow
- Shortest validation loop
- Execution evidence demo note
- Demo artifacts
- Independent verification
The repository also includes a paper-ready evaluation harness for Execution Evidence Architecture for Agentic Software Systems: From Intent Objects to Verifiable Audit Receipts.
Primary entry points:
- make eval-baseline
- make eval-evidence
- make eval-external-baseline
- make eval-framework-pair
- make eval-langchain-pair
- make eval-ablation
- make falsification-checks
- make human-review-kit
- make review-sample
- make compare
- make paper-eval
- make top-journal-pack
The evaluation material is useful for deeper technical review, but it is secondary to the runnable demo path above.
- interaction/ for explicit interaction objects
- evidence/ for audit and result artifacts
- demo/ and crew/ for runnable entry points
- integration/ for persona, intent, and ARO adapters
- examples/enterprise_sandbox_demo/ for the intent-to-receipt artifact chain
- docs/spec/ for schema notes and example payloads
