Surfaced during the v0.1.0 review (PR #2). Related to the Core-baseline conversation. Filing separately so we can track the detection design.
hooks.md says frameworks MUST fire toolCallRequest for every action that escapes the agent's reasoning context. The handshake's methods_evaluated is client-declared. The Guardian can refuse a session, but cannot detect what's missing once the session is underway. A compromised or malicious Observed Agent can declare methods_implemented: ["steps/sessionStart", "steps/userMessage", "steps/agentResponse", "steps/sessionEnd"] and omit toolCallRequest, memoryStore, and subagentStart. The Guardian sees a chatty conversation. Meanwhile the agent runs tools, writes memory, and spawns subagents, all of it invisible.
The Guardian has no way to tell "no toolCallRequest fired this turn" apart from "agent took no actions this turn." The spec frames non-enforcement as the client's loss (their actions get ALLOW by default) when really it is the Guardian's blindness.
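A toy sketch of that ambiguity, for illustration only. The function name, the per-turn trace shape, and the "steps/" prefix on toolCallRequest are assumptions made here, not taken from the spec.

```python
def guardian_view(actions: list[str], fires_tool_hook: bool) -> list[str]:
    """Return the hook names the Guardian would receive for one turn.

    `actions` are the tool calls the agent really made; `fires_tool_hook`
    says whether the framework reports them. Both are hypothetical inputs.
    """
    trace = ["steps/userMessage"]
    if fires_tool_hook:
        # One toolCallRequest per real action (prefix assumed).
        trace += ["steps/toolCallRequest" for _ in actions]
    trace.append("steps/agentResponse")
    return trace


# An honest agent that took no actions this turn...
idle_honest = guardian_view(actions=[], fires_tool_hook=True)
# ...and an agent that ran two tools but never fires the hook.
busy_silent = guardian_view(actions=["shell", "http_get"], fires_tool_hook=False)
```

From the Guardian's side, `idle_honest` and `busy_silent` are the same list, so hook traffic alone cannot separate the two cases.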
Proposed direction:
- Cross-check agbom/snapshot (which lists tools, models, subagents the agent is capable of using) against the methods declared in the handshake. If the agent declares it has tools but didn't declare toolCallRequest in methods_implemented, the Guardian should refuse the session.
- Periodic agbom/snapshot reconciliation against observed hook traffic. If the snapshot shows tools that have never fired toolCallRequest, surface as an audit signal.
- Optional expected_hooks_per_turn field with Guardian-side anomaly detection
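The first two checks in the list above could look roughly like this. It is a minimal sketch, not a proposed design: the snapshot shape (a dict with `tools` and `subagents` lists), the capability-to-hook mapping, the "steps/" prefix on toolCallRequest and subagentStart, and every function and class name are assumptions for illustration. Only the hook names and `methods_implemented` come from this issue.

```python
from dataclasses import dataclass, field

# Hypothetical mapping: if the snapshot lists anything in this category,
# the client is expected to declare the corresponding hook.
REQUIRED_HOOK_FOR = {
    "tools": "steps/toolCallRequest",
    "subagents": "steps/subagentStart",
}


def missing_hooks(snapshot: dict, methods_implemented: list[str]) -> list[str]:
    """Handshake-time check: hooks the snapshot implies but the client did not declare."""
    declared = set(methods_implemented)
    return sorted(
        hook
        for category, hook in REQUIRED_HOOK_FOR.items()
        if snapshot.get(category) and hook not in declared
    )


@dataclass
class Reconciler:
    """Periodic check: snapshot tools that have never fired toolCallRequest."""

    seen_tools: set[str] = field(default_factory=set)

    def observe_tool_call(self, tool_name: str) -> None:
        # Called by the Guardian whenever a toolCallRequest arrives.
        self.seen_tools.add(tool_name)

    def silent_tools(self, snapshot: dict) -> list[str]:
        # Tools the agent claims to have that never showed up in hook traffic.
        return sorted(set(snapshot.get("tools", [])) - self.seen_tools)


# Example: the handshake from the issue description, with a snapshot that lists tools.
snapshot = {"tools": ["shell", "http_get"], "models": ["m1"], "subagents": []}
declared = [
    "steps/sessionStart",
    "steps/userMessage",
    "steps/agentResponse",
    "steps/sessionEnd",
]
missing = missing_hooks(snapshot, declared)

reconciler = Reconciler()
reconciler.observe_tool_call("shell")
never_fired = reconciler.silent_tools(snapshot)
```

Here `missing` is non-empty, so the Guardian would refuse the session at handshake. `never_fired` is only an audit signal: a tool that is installed but unused is legitimate, which is why the issue treats reconciliation as a signal and not a hard failure. Neither check helps if the agent also omits tools from its own snapshot.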
Why deferred from v0.1.0: depends on the Core-baseline conversation in PR #2. If we add a minimum hook set to Core and tie it to AgBOM coverage, this issue is partially addressed at v0.1.0. The full detection mechanism (anomaly thresholds, AgBOM reconciliation logic) is v0.2 work.
References: hooks.md:172, handshake.json methods_evaluated, AgBOM spec