MorphSAT: Structured Commit Control for LLM Agent Loops

Technical Note v1 — May 2026 echo313unfolding

1. The Problem

An LLM agent with tool access can loop indefinitely (calling tools, reading results, reasoning, calling more tools) without ever committing to a decision. It can also commit prematurely on insufficient evidence. Both failure modes are documented in the literature on overthinking (Wang et al. 2025, arXiv:2508.17627), tool overuse (Chen et al. 2025, ACL Findings), and abstention (Wen et al. 2025, TACL).

The standard approach is prompt engineering: instruct the model to "be decisive" or "stop after N tools." This treats the model as the decision procedure. MorphSAT takes the opposite position: the model is embedded inside a structured cognitive control loop, and that loop, not the model, holds decision authority.

2. Architecture

MorphSAT wraps an LLM's tool-use loop in a five-layer control stack:

Layer	Component	Role
1	FSA lifecycle gate	Legal state transitions only. Blocks impossible sequences.
2	Evidence sensors	Bidirectional classification: each tool result produces (threat_delta, safety_delta). Coincidence detection boosts on multi-signal convergence.
3	Shadow monitor	Hidden posture state machine. Controls what happens around the model. The model never sees these states.
4	Split memory	Separate threat and tolerance pattern stores. Familiarity modulates future posture.
5	Receipts	Turn-by-turn audit trace: every state transition, evidence classification, and posture change is logged in a JSON receipt.
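Layer 1 can be pictured as a lookup in a table of legal transitions. The following is a minimal sketch only: the lifecycle state names (IDLE, GATHERING, DECIDED), the event names, and the IllegalTransition exception are hypothetical and are not taken from the MorphSAT package.

```python
# Hypothetical lifecycle states and events; MorphSAT's real FSA spec may differ.
LEGAL_TRANSITIONS = {
    ("IDLE", "tool_call"): "GATHERING",
    ("GATHERING", "tool_call"): "GATHERING",
    ("GATHERING", "verdict"): "DECIDED",
}


class IllegalTransition(Exception):
    """Raised when an event is not allowed in the current lifecycle state."""


def step(state: str, event: str) -> str:
    """Return the next state, or raise if the (state, event) pair is not legal."""
    try:
        return LEGAL_TRANSITIONS[(state, event)]
    except KeyError:
        raise IllegalTransition(f"{event!r} not allowed in state {state!r}") from None
```

Because anything absent from the table is rejected, an impossible sequence such as a verdict issued from IDLE fails closed rather than being silently accepted.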

The shadow monitor (Layer 3) is the key contribution. It maintains eight posture states — NORMAL, ORIENTING, SAFE_DISTANCE, INVESTIGATING, COMMIT_READY, ESCALATE_READY, ABSTAIN_READY, SWARM_CALL — and transitions between them based on accumulated evidence, not scalar thresholds. When the monitor commits to a direction (escalate, suspicious, benign, or abstain), it communicates that direction to the model via a structured prompt. The model can comply or, rarely, override with specific justification. In the benchmarked gate_assists condition, the model complied with the monitor's direction in 100% of cases.
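The separation between hidden posture and model-visible direction can be sketched as follows. The eight posture names come from the text above. Everything else is an assumption made for illustration: the COMMITTED set, the model_visible_message helper, and the message wording.

```python
from enum import Enum, auto
from typing import Optional


class Posture(Enum):
    # The eight posture names listed in the text.
    NORMAL = auto()
    ORIENTING = auto()
    SAFE_DISTANCE = auto()
    INVESTIGATING = auto()
    COMMIT_READY = auto()
    ESCALATE_READY = auto()
    ABSTAIN_READY = auto()
    SWARM_CALL = auto()


# Assumption: only these postures count as "committed to a direction".
COMMITTED = {Posture.COMMIT_READY, Posture.ESCALATE_READY, Posture.ABSTAIN_READY}


def model_visible_message(posture: Posture, direction: Optional[str]) -> Optional[str]:
    """Build what the model is shown: a direction, never the posture name."""
    if posture not in COMMITTED or direction is None:
        return None  # monitor stays silent; the posture remains hidden
    return f"The investigation controller recommends the verdict: {direction}."
```

The point of the sketch is the interface shape: the posture is an input to the monitor's decision about whether to speak, but it never appears in the text handed to the model.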

Novelty is a posture problem, not a threshold problem. MorphSAT v6 treated novelty as a scalar penalty (raise the commit threshold for unfamiliar inputs). This failed: the agent over-investigated benign scenarios and never learned tolerance. v7 replaced novelty-as-penalty with novelty-as-reflex: unfamiliar input triggers an ORIENTING state with a bounded investigation budget. Safe evidence decays the orienting pressure (a tolerance response). This single change improved benign accuracy from 35.7% to 78.6%.
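The novelty-as-reflex idea can be illustrated with a small sketch: unfamiliar input starts an orienting episode with a fixed investigation budget, and each safe result decays the orienting pressure. The class name and every numeric value here (budget, decay factor, floor) are assumptions chosen for illustration, not MorphSAT defaults.

```python
from dataclasses import dataclass


@dataclass
class OrientingReflex:
    # All numeric values are illustrative assumptions, not MorphSAT defaults.
    budget: int = 3          # max tool calls allowed while orienting
    pressure: float = 1.0    # orienting pressure set by unfamiliar input
    decay: float = 0.5       # multiplicative decay applied per safe result
    floor: float = 0.2       # below this, the reflex is considered settled

    def observe(self, safety_delta: float) -> None:
        """Consume one tool result: spend budget, decay pressure on safe evidence."""
        self.budget -= 1
        if safety_delta > 0:
            self.pressure *= self.decay

    @property
    def active(self) -> bool:
        """Orienting continues only while budget remains and pressure is high."""
        return self.budget > 0 and self.pressure >= self.floor
```

Unlike a raised commit threshold, this episode always terminates: either the evidence calms it or the budget runs out, so a benign but unfamiliar input cannot trigger unbounded investigation.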

3. Experimental Setup

Task domain: Security alert triage. 20 scenarios across three categories (7 benign, 4 suspicious, 9 escalate). Each scenario presents an alert and provides 5 simulated security tools (check_hash, check_process, scan_file, check_network, check_cve). Tool responses are deterministic per scenario.

Model: Qwen2.5-Coder-7B (Q4_K_M quantization) via llama-server. Temperature 0 (greedy decoding). The model is a 7B coding-focused model, not a security specialist. This is deliberate: we test the control structure's ability to compensate for a model operating outside its primary domain.

Three conditions:

model_decides — The model receives neutral prompts and makes its own verdict. The shadow monitor runs but does not influence the model's output.
gate_overrides — The model outputs a verdict, but the gate replaces it with the monitor's direction. The model's verdict is recorded but discarded.
gate_assists — The monitor communicates its direction to the model via a strong directional prompt ("The investigation controller has concluded this is a THREAT requiring ESCALATION"). The model makes the final call.
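The difference between the three conditions comes down to which verdict is recorded as final. A minimal sketch, assuming a hypothetical final_verdict helper (the function name and signature are not from the MorphSAT package):

```python
def final_verdict(condition: str, model_verdict: str, monitor_direction: str) -> str:
    """Pick which verdict is recorded as final under each benchmark condition."""
    if condition == "model_decides":
        return model_verdict          # monitor runs but has no influence
    if condition == "gate_overrides":
        return monitor_direction      # model verdict is logged, then discarded
    if condition == "gate_assists":
        return model_verdict          # model saw the direction; it makes the call
    raise ValueError(f"unknown condition: {condition!r}")
```

In gate_assists the returned value is still the model's verdict, but that verdict was produced after the model read the directional prompt, which is where the monitor's influence enters.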

v8.3 early-verdict guard: An additional structural control that blocks the model from issuing a verdict before gathering at least 2 tool results. This was added after diagnosing a specific failure (supply_01) where the model issued a premature "suspicious" verdict after a single tool call, before the monitor had accumulated enough evidence to commit.
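The guard described above amounts to a count check before a verdict is accepted. In this sketch the minimum of 2 tool results comes from the text. The guard_verdict function and its convention of returning None for a blocked verdict are assumptions.

```python
from typing import Optional, Sequence

MIN_TOOL_RESULTS = 2  # from the text: at least 2 tool results before a verdict


def guard_verdict(tool_results: Sequence[object], proposed: Optional[str]) -> Optional[str]:
    """Pass the verdict through only if enough tool results were gathered."""
    if proposed is not None and len(tool_results) < MIN_TOOL_RESULTS:
        return None  # blocked: the loop should request another tool call
    return proposed
```

Applied to the supply_01 failure, a "suspicious" verdict proposed after one tool result would be returned as None, and the loop would continue gathering evidence.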

4. Results

v8.3 receipt: morphsat_v83_gate_authority_20260508T125045Z.json (SHA256-stamped; 60 scenario runs, i.e. 20 scenarios × 3 conditions).
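The note does not describe how the receipt is stamped. One common approach, sketched here as an assumption, is to hash a canonical JSON serialization and store the digest alongside the payload. The function names, the "sha256" key, and the example fields are hypothetical.

```python
import hashlib
import json


def stamp_receipt(receipt: dict) -> dict:
    """Return a copy of the receipt with a SHA-256 digest of its canonical JSON."""
    # sort_keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {**receipt, "sha256": digest}


def verify_receipt(stamped: dict) -> bool:
    """Recompute the digest over everything except the stamp itself."""
    body = {k: v for k, v in stamped.items() if k != "sha256"}
    return stamp_receipt(body)["sha256"] == stamped.get("sha256")
```

A stamp of this kind lets a reader detect whether a published trace was edited after the run, which is what gives an audit receipt its value.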

Condition	Accuracy	Benign	Suspicious	Escalate	Tool loops
model_decides	85.0%	100%	75.0%	77.8%	0%
gate_overrides	100%	100%	100%	100%	0%
gate_assists	100%	100%	100%	100%	0%

Key findings:

The monitor's evidence accumulation produces the correct direction in 20/20 scenarios. The model is the bottleneck, not the monitor.
gate_overrides corrected 6 model errors (6 helped, 0 hurt).
gate_assists achieved 100% model agreement — the model followed the monitor's direction in every case when given a structured directional prompt.
Escalation accuracy improved +22.2pp from model_decides to gate_assists.
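The model_decides figures can be cross-checked with a few lines of arithmetic. Category sizes come from the experimental setup. The per-category correct counts are inferred from the reported percentages (75.0% of 4 is 3, 77.8% of 9 is 7) and are not read from the receipt.

```python
# Category sizes come from the setup (7 benign, 4 suspicious, 9 escalate);
# correct counts are inferred from the model_decides percentages.
sizes = {"benign": 7, "suspicious": 4, "escalate": 9}
correct = {"benign": 7, "suspicious": 3, "escalate": 7}

per_category = {k: round(100 * correct[k] / sizes[k], 1) for k in sizes}
overall = round(100 * sum(correct.values()) / sum(sizes.values()), 1)
escalate_gain_pp = round(100.0 - per_category["escalate"], 1)

print(per_category)      # {'benign': 100.0, 'suspicious': 75.0, 'escalate': 77.8}
print(overall)           # 85.0
print(escalate_gain_pp)  # 22.2
```

The inferred counts give 17 of 20 correct, which matches the reported 85.0% overall accuracy and the +22.2pp escalation gain.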

Progression across versions (same 20 scenarios, varying control structures):

Version	Mechanism	Accuracy
v4	Evidence-pressure threshold	62.5%
v7	Shadow posture control	70.0%
v8.0	+ gate_assists prompting	90.0%
v8.2	+ classifier and threshold fixes	97.5%
v8.3	+ early-verdict guard	100%

5. What This Demonstrates

The control structure, not the model, is the decision authority. The same 7B model achieves 85% accuracy when deciding alone and 100% when embedded in the MorphSAT control loop. The gap is not prompt engineering — it is structural: an external state machine accumulates evidence, tracks posture, and communicates direction through a typed interface.

Mapping to cognitive architecture concepts (Laird 2012; Jones & Laird 2019):

Soar concept	MorphSAT equivalent
Working memory	Evidence state (threat/safety scores, evidence vector, shadow state)
Operator proposal	Model proposes tool call
Operator evaluation	Shadow monitor scores evidence (bidirectional + coincidence)
Operator selection	Gate direction (escalate / suspicious / benign / abstain)
Impasse detection	Contradiction gate (both scores high) or swarm trigger (multi-axis)
Chunking / episodic memory	Split memory store: receipt closes the loop for future posture
Metacognition	Shadow states: NORMAL, ORIENTING, SAFE_DISTANCE, etc. (hidden from model)
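Two rows of the mapping, working memory and impasse detection, can be made concrete with a short sketch. The EvidenceState class, the is_contradiction function, and the cutoff value of 2.0 are all assumptions for illustration. The note does not specify how "both scores high" is determined.

```python
from dataclasses import dataclass


@dataclass
class EvidenceState:
    """Working-memory analogue: running threat and safety totals."""
    threat: float = 0.0
    safety: float = 0.0

    def update(self, threat_delta: float, safety_delta: float) -> None:
        """Fold one tool result's (threat_delta, safety_delta) into the totals."""
        self.threat += threat_delta
        self.safety += safety_delta


def is_contradiction(state: EvidenceState, high: float = 2.0) -> bool:
    """Impasse analogue: both scores are high, so no direction is defensible."""
    # The cutoff is an illustrative assumption, not a MorphSAT parameter.
    return state.threat >= high and state.safety >= high
```

Keeping threat and safety as separate totals, rather than one net score, is what makes a contradiction detectable: strong evidence on both sides would otherwise cancel out and look like no evidence at all.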

The key structural difference: Soar's metacognition monitors the agent's own reasoning from the inside. MorphSAT's shadow monitor controls the agent from the outside. This is a deliberate design choice: LLM internals are opaque, so metacognitive control must be extrinsic and structural rather than introspective.

6. Caveats

N=20 scenario benchmark, single model (7B), single domain (security triage). The control structure is designed to generalize (domain-configurable FSA, JSON-loadable specs), but that generality has not yet been tested in multi-domain experiments.
Simulated tools, not live environment. Tool responses are deterministic per scenario. Real-world tool outputs are noisy and variable.
gate_assists 100% is an upper bound on this benchmark, not a claim about arbitrary inputs. The 20 scenarios were designed for discriminability; adversarial or ambiguous scenarios may reveal weaker performance.
The evidence classifier is keyword-based, not learned. Deploying to a real domain would require a domain-specific classifier or a learned evidence encoder.
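To make the last caveat concrete, a keyword-based evidence sensor of the kind described for Layer 2 might look like the sketch below. The keyword lists, the boost multiplier, and the classify function are hypothetical. Convergence is approximated here within a single tool result, which may differ from how MorphSAT detects coincidence.

```python
# Hypothetical keyword lists; a real deployment would need domain-specific ones.
THREAT_KEYWORDS = ("malicious", "known bad", "exploit")
SAFETY_KEYWORDS = ("signed", "whitelisted", "no detections")
COINCIDENCE_BOOST = 1.5  # assumed multiplier when several threat cues co-occur


def classify(tool_output: str) -> tuple[float, float]:
    """Map one tool result to (threat_delta, safety_delta) by keyword hits."""
    text = tool_output.lower()
    threat = float(sum(kw in text for kw in THREAT_KEYWORDS))
    safety = float(sum(kw in text for kw in SAFETY_KEYWORDS))
    if threat >= 2:  # multiple threat signals converge in one result
        threat *= COINCIDENCE_BOOST
    return threat, safety
```

Substring matching of this kind is brittle. For example, "signed" also matches inside "unsigned", which illustrates why noisy real-world tool output would call for a domain-specific or learned classifier.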

7. Availability

MorphSAT is open source (MIT license).

PyPI: pip install morphsat (v0.4.0)
Source: github.com/echo313unfolding/MorphSAT
Receipt: Full JSON trace at receipts/morphsat_v83_early_verdict_guard/

References

Chen et al. (2025). SMART: Self-Aware Agent for Tool Overuse Mitigation. ACL Findings.
Jones & Laird (2019). Anticipatory Thinking via Event Cognition. CEUR-WS.
Kozlowska et al. (2015). Fear and the Defense Cascade. PMC4495877.
Laird (2012). The Soar Cognitive Architecture. MIT Press.
Wang et al. (2025). Stop Spinning Wheels: Mitigating LLM Overthinking. arXiv:2508.17627.
Wen et al. (2025). Know Your Limits: Abstention in LLMs. TACL.
Wray, Jones, & Laird (2023). Constraint Compliance. arXiv:2303.04352.
