EXPLAINME.adoc (42 additions & 15 deletions)
@@ -7,37 +7,64 @@ The README makes claims. This file backs them up.

 [quote, README]
 ____
-Jonathan D.A. Jewell <jonathan@hyperpolymath.org>
+Conative Gating introduces a biologically-inspired system where a Small Language Model acts as an inhibitory antagonist to Large Language Models, preventing policy violations through three-tier evaluation: Policy Oracle (deterministic), SLM Evaluator (adversarial), and Consensus Arbiter (weighted voting).
 ____

-== Technology Choices
+This solves the problem that LLMs trained to be helpful systematically violate explicit project constraints. The solution uses asymmetric weighting (SLM votes count 1.5x) to create a natural bias toward inhibition.

-[cols="1,2"]
-|===
-| Technology | Learn More
+== Two Verifiable Claims from How-It-Works

-| **Rust** | https://www.rust-lang.org
-| **Zig** | https://ziglang.org
-| **Idris2 ABI** | https://www.idris-lang.org
-|===
+=== Claim 1: Policy Oracle Deterministically Rejects Forbidden Languages
+**How verified**: The oracle module implements a state machine that parses Nickel policy files (from `config/policy.ncl`) and builds a decision tree. For each input (file path, content), it applies rules in order: (1) language detection (regex on file extension + content), (2) forbidden-language check (hardcoded list: TypeScript, Python, Go), (3) exception matching (path-based allowlists). README (§Default Policy) documents the tier system. The oracle returns a deterministic three-state result: `Allow`, `SoftConcern`, or `HardViolation`. No ML is involved at this tier, so the same input always yields the same result and evaluation avoids model-inference latency.
+
+**Caveat**: Language detection is heuristic (file extension + regex); a file with disguised or polyglot code may evade detection.
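The rule order described for the Policy Oracle can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not the project's oracle module: the `Policy` struct, `Policy::example()`, `detect_extension`, and `evaluate` are hypothetical names, the extension-to-language mapping and the `vendor/` allowlist are invented for the example, only the file extension is inspected (the content regex is omitted), and returning `SoftConcern` when no extension is found is a guess, since the source does not say when that state is produced.

```rust
use std::path::Path;

/// The three-state result named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    SoftConcern,
    HardViolation,
}

/// Hypothetical stand-in for a parsed policy (the real one is loaded from Nickel).
pub struct Policy {
    /// Extensions treated as forbidden languages (illustrative mapping).
    pub forbidden_extensions: Vec<&'static str>,
    /// Path prefixes exempt from the forbidden-language check (illustrative).
    pub allowed_path_prefixes: Vec<&'static str>,
}

impl Policy {
    /// Example policy: TypeScript, Python, and Go forbidden; `vendor/` exempt.
    pub fn example() -> Self {
        Policy {
            forbidden_extensions: vec!["ts", "tsx", "py", "go"],
            allowed_path_prefixes: vec!["vendor/"],
        }
    }
}

/// Step 1: language detection, reduced here to the file extension only.
fn detect_extension(path: &str) -> Option<&str> {
    Path::new(path).extension().and_then(|ext| ext.to_str())
}

/// Applies the rules in the order given in the text. No randomness or model
/// call is involved, so the same input always produces the same verdict.
pub fn evaluate(policy: &Policy, path: &str) -> Verdict {
    let ext = match detect_extension(path) {
        Some(ext) => ext,
        // Assumption: an undetectable language is flagged, not rejected.
        None => return Verdict::SoftConcern,
    };

    // Step 2: forbidden-language check.
    let forbidden = policy
        .forbidden_extensions
        .iter()
        .any(|f| f.eq_ignore_ascii_case(ext));
    if !forbidden {
        return Verdict::Allow;
    }

    // Step 3: exception matching against path-based allowlists.
    let exempt = policy
        .allowed_path_prefixes
        .iter()
        .any(|prefix| path.starts_with(*prefix));
    if exempt {
        Verdict::Allow
    } else {
        Verdict::HardViolation
    }
}
```

With the example policy, `evaluate(&Policy::example(), "src/app.ts")` yields `HardViolation`, while the same file under `vendor/` is allowed. The caveat above applies directly: a forbidden-language file saved with an unlisted extension passes this check.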
+**How verified**: The consensus arbiter runs a modified Byzantine Fault Tolerant protocol in which the LLM and SLM each cast a vote (Allow/Escalate/Block), and the tally weights SLM votes 1.5x. README (§Decision Matrix) shows the truth table: SLM high violation → Block (regardless of LLM confidence). The arbiter implements this in `consensus_arbiter()`, which computes `(slm_weight * slm_vote) + (llm_weight * llm_vote)` and selects the outcome. Tests verify that the weighting favors inhibition.
+
+**Caveat**: Modified PBFT is heuristic, not formally proven correct. Classical PBFT tolerates f faulty nodes only with at least 3f+1 nodes in total (so at least 2f+1 honest); here there are just 2 nodes (LLM + SLM) with asymmetric weighting, which is not equivalent to traditional consensus.
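The weighted tally can be illustrated with a short Rust sketch. It is not the project's `consensus_arbiter()` (which the text below places in Elixir). The name `arbitrate_sketch`, the numeric encoding of votes (Allow = 0, Escalate = 1, Block = 2), the LLM weight of 1.0, and the two thresholds are all assumptions chosen so that the documented property holds: an SLM Block vote wins regardless of the LLM vote. Only the 1.5x SLM weight and the formula come from the text.

```rust
/// The three votes named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Vote {
    Allow,
    Escalate,
    Block,
}

impl Vote {
    /// Assumed numeric encoding; the source does not specify one.
    fn severity(self) -> f64 {
        match self {
            Vote::Allow => 0.0,
            Vote::Escalate => 1.0,
            Vote::Block => 2.0,
        }
    }
}

/// SLM votes count 1.5x (from the text).
pub const SLM_WEIGHT: f64 = 1.5;
/// Assumed baseline weight for the LLM.
pub const LLM_WEIGHT: f64 = 1.0;

/// Computes (slm_weight * slm_vote) + (llm_weight * llm_vote), normalizes by
/// the total weight, and maps the result to an outcome using assumed thresholds.
pub fn arbitrate_sketch(slm_vote: Vote, llm_vote: Vote) -> Vote {
    let score = SLM_WEIGHT * slm_vote.severity() + LLM_WEIGHT * llm_vote.severity();
    let normalized = score / (SLM_WEIGHT + LLM_WEIGHT);

    if normalized > 1.0 {
        Vote::Block
    } else if normalized >= 0.5 {
        Vote::Escalate
    } else {
        Vote::Allow
    }
}
```

Under these assumptions the asymmetry is visible in the mirrored cases: `arbitrate_sketch(Vote::Block, Vote::Allow)` returns `Block`, whereas `arbitrate_sketch(Vote::Allow, Vote::Block)` only returns `Escalate`. With two voters this is a weighted threshold rule, which is consistent with the caveat that it is not equivalent to classical consensus.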

 == Dogfooded Across The Account

-Uses the hyperpolymath ABI/FFI standard (Idris2 + Zig). Same pattern used across
-https://github.com/hyperpolymath/proven[proven],
-https://github.com/hyperpolymath/burble[burble], and
+The policy oracle uses Nickel (same language as civic-connect policies). The consensus arbiter runs in Elixir (same as feedback-o-tron, boj-server). Same pattern: deterministic rule engine + ML gate + consensus.
+
+Also used by hypatia (CI/CD rules) and NeuroPhone (AI phone gate).