Invar enables humans to effectively direct AI agents in producing high-quality code.
Human
  │
  │ directs (goals, review, decisions)
  ▼
Agent ────────────────────────────────┐
  │              │                    │
  │ follows      │ uses               │
  ▼              ▼                    │
Protocol       Tools                  │
(INVAR.md)     (invar guard, etc.)    │
  │              │                    │
  └───────┬──────┘                    │
          │                           │
          ▼                           │
  High-Quality Code ◄─────────────────┘
          │
          │ delivers
          ▼
        Human
Human's role: directs the Agent, reviews output, makes final decisions.
What humans do:
- Define what needs to be built
- Review Agent's work
- Approve or reject changes
- Make architectural decisions when asked
What humans DON'T need to do:
- Memorize Invar Protocol details
- Run `invar guard` manually (Agent does this)
- Understand every rule (Agent handles compliance)
Human's relationship with Invar: Indirect. Humans benefit from Invar through better Agent output.
Agent's role: follows the Protocol, uses the Tools, produces code that meets the human's goals.
What agents do:
- Read and internalize INVAR.md
- Follow USBV workflow → write contracts before code (per DX-91)
- Write contracts, separate Core/Shell
- Run `invar guard` to verify compliance
- Fix violations before presenting to human
Agent's relationship with Invar: Direct and primary. Agent is Invar's first-class user.
Invar's role: provides structure and verification for the Agent's work.
Components:
- Protocol (INVAR.md): Rules and patterns for Agent to follow
- Tools (invar CLI): Verification and assistance for Agent
Invar's design principle: Agent-Native.
- Optimized for Agent consumption (not human reading)
- Automatic over opt-in (Agent won't use unknown features)
- Machine-parseable output (--agent mode)
- Embedded hints (Agent sees them automatically)
An earlier framing was: "Human-AI Collaboration: Design for both humans and agents." This implied Invar serves both equally. It doesn't.
"Agent-Native Execution, Human-Directed Purpose"
- Purpose: Help humans achieve goals through Agents
- Execution: Protocol and Tools are Agent-Native (Agent is primary user)
- Success metric: Human can effectively direct Agent to produce quality code
| Design Decision | Agent-Native Approach |
|---|---|
| Protocol length | Short (save Agent tokens) |
| Error messages | Include fix code (Agent needs exact instructions) |
| Output format | JSON available (Agent parses easily) |
| Feature discovery | Auto-embed in output (Agent won't read docs) |
| Defaults | Strict ON (more checking helps Agent) |
| Hints | Always shown (Agent sees them automatically) |
What Agent-Native does NOT mean:
- ❌ Humans can't use Invar directly (they can, it's just not the primary use case)
- ❌ Human experience doesn't matter (human reviews Agent output)
- ❌ Documentation is only for Agents (humans may want to understand the system)
| Criterion | Measure |
|---|---|
| Effective direction | Human can describe task, Agent delivers |
| Trust in output | Human confident in Agent's code quality |
| Reduced review burden | Fewer bugs to catch in review |
| Clear escalation | Agent asks human when genuinely uncertain |
| Criterion | Measure |
|---|---|
| Clear guidance | Protocol provides unambiguous rules |
| Actionable feedback | Guard output includes exact fixes |
| Efficient verification | Fast feedback loop (--changed mode) |
| Fail-safe defaults | Can't accidentally skip important checks |
| Criterion | Measure |
|---|---|
| Code quality | Fewer bugs, better structure |
| Maintainability | Clear Core/Shell separation |
| Verifiability | Contracts document assumptions |
| Consistency | Same patterns across codebase |
Human Success = Agent Effectiveness × Invar Support
Where:
Agent Effectiveness = Protocol Clarity × Tool Utility
Invar Support = Agent-Native Design × Verification Coverage
Translation:
- Humans succeed when Agents are effective
- Agents are effective when Invar provides clear protocols and useful tools
- Invar provides this through Agent-Native design and comprehensive verification
| Aspect | Description |
|---|---|
| Ultimate Goal | Humans effectively direct Agents |
| Primary User | Agent (for Protocol and Tools) |
| Design Philosophy | Agent-Native |
| Human's Role | Commander, not operator |
| Success Metric | Human achieves goals through Agent |
"Invar is Agent infrastructure that serves Human goals."
Invar's Core/Shell architecture builds on two foundational software design patterns:
Alistair Cockburn, 2005 — https://alistair.cockburn.us/hexagonal-architecture
The application's business logic (Core) is isolated from external concerns (Shell) through ports and adapters. This enables testing without I/O and makes the system adaptable to different external systems.
Gary Bernhardt, 2012 — https://github.com/kbilsted/Functional-core-imperative-shell
Pure, functional code (Core) handles all logic and decisions. Imperative code (Shell) handles I/O and side effects. The Shell is a thin wrapper that calls Core functions and performs I/O.
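Bernhardt's pattern fits in a few lines of Python. This is a minimal sketch with invented names, not code from Invar itself: all logic lives in pure functions, and the shell only moves bytes in and out.

```python
def parse_totals(lines):
    """Core: pure logic, no I/O. Deterministic, testable without files."""
    return sum(int(line) for line in lines if line.strip().isdigit())

def report(total):
    """Core: pure formatting decision."""
    return f"total={total}"

def main(path):
    """Shell: thin imperative wrapper that performs all I/O."""
    with open(path) as f:                    # side effect: file read
        lines = f.readlines()
    print(report(parse_totals(lines)))       # side effect: output
```

Because the Core never touches a file handle, its tests need no fixtures or mocks; the Shell stays too thin to hide bugs.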
Invar extends these patterns for Agent-Native development:
| Classic Pattern | Invar Extension |
|---|---|
| Separate Core/Shell | + Design-by-Contract (@pre/@post) |
| Pure functions | + Contract completeness (uniquely determines impl) |
| Testable logic | + Reflective verification (understand before fix) |
| Ports/Adapters | + Hierarchical decomposition (leaves first) |
Invar's methodology is validated by recent AI code generation research (6 papers).
| Law | Principle | Supporting Evidence |
|---|---|---|
| 1. Separation | Pure logic (Core) and I/O (Shell) physically separate | Determinism enables testing |
| 2. Contract Complete | Define COMPLETE, RECOVERABLE boundaries | Clover (2024): 87% accept, 0% false positive |
| 3. Context Economy | map → signatures → implementation | Token efficiency |
| 4. Decompose First | Break into sub-functions before implementing | Parsel (2023): +75% pass rate |
| 5. Verify Reflectively | Reflect (why?) → Fix → Verify | Reflexion (2023): +11% success |
| 6. Integrate Fully | Verify all feature paths connect correctly | DX-07 post-mortem: local ≠ global |
Human Success = Agent Effectiveness × Invar Support × Problem Solvability
Where:
Problem Solvability = Contract Completeness × Decomposition Quality
Test-first approach improved GPT-4 accuracy from 19% to 44%.
"Generating tests is easier than generating code."
This validates Invar's Contract-First principle.
Contracts/docstrings serve as recovery context for auto-fixing errors.
"When an error occurs, the Agent analyzes the discrepancy between code and expected usage (based on docstring) to propose corrections."
This extends Invar's philosophy: contracts are both verification AND recovery tools.
Hierarchical decomposition improved pass rates by 75%+ on competition problems.
"Like human programmers, start with high-level design, then implement each part gradually."
This validates Invar's Design step: decompose before implement.
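As a minimal sketch of leaves-first decomposition (the task and function names here are invented for illustration): specify the parent, implement and verify each leaf, then compose.

```python
# Task (illustrative): compute the median word length of a sentence.

def word_lengths(text):
    """Leaf 1: implemented and tested first."""
    return [len(w) for w in text.split()]

def median(values):
    """Leaf 2: implemented and tested first."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

def median_word_length(text):
    """Parent: composed only after both leaves pass."""
    return median(word_lengths(text))
```

Each leaf is small enough to verify in isolation, so by the time the parent is written, composition is the only remaining source of error.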
Real-world GitHub issues reveal that underspecified problems are unsolvable. Best model (Claude 2) solved only 1.96% of 2,294 real issues.
"Underspecified issue descriptions led to ambiguity on what the problem was and how it should be solved."
This validates Invar's core insight: Contracts eliminate ambiguity.
- @pre/@post make expectations explicit
- Doctests provide concrete examples
- Clear specs → Solvable problems
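A hedged sketch of what such a contract might look like in Python. The `pre`/`post` decorators below are hand-rolled stand-ins (Invar's actual @pre/@post API is not shown in this document); the point is that expectations and guarantees become executable, and the doctest pins down a concrete example.

```python
import functools

def pre(check):
    """Illustrative precondition decorator (stand-in, not Invar's API)."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            assert check(*args, **kwargs), f"precondition failed for {fn.__name__}"
            return fn(*args, **kwargs)
        return wrapped
    return deco

def post(check):
    """Illustrative postcondition decorator (stand-in, not Invar's API)."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            result = fn(*args, **kwargs)
            assert check(result), f"postcondition failed for {fn.__name__}"
            return result
        return wrapped
    return deco

@pre(lambda xs: len(xs) > 0)     # explicit expectation: non-empty input
@post(lambda r: r >= 0)          # explicit guarantee: non-negative result
def max_gap(xs):
    """Largest difference between consecutive sorted values.

    >>> max_gap([1, 9, 4])
    5
    """
    s = sorted(xs)
    return max((b - a for a, b in zip(s, s[1:])), default=0)
```

An Agent reading this function cannot misread the spec: the empty-input case is ruled out up front rather than discovered in review.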
Verbal reflection on failures improved HumanEval pass@1 from 80% to 91%.
"Reflexion agents verbally reflect on task feedback signals, then maintain their reflective text in episodic memory to induce better decision-making."
This enhances Invar's Verify step: Don't just fix—understand why it failed first.
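A toy sketch of the idea (this is not Reflexion's actual API): instead of blindly retrying, record a verbal note about *why* each attempt failed and carry it into the next attempt.

```python
def verify_with_reflection(candidates, check):
    """Try candidate implementations; keep reflections across attempts."""
    memory = []                                   # episodic memory of reflections
    for impl in candidates:
        ok, feedback = check(impl)
        if ok:
            return impl, memory
        memory.append(f"{impl.__name__} failed: {feedback}")  # reflect first
    return None, memory

# Illustrative candidates and check:
def bad_abs(x):
    return x

def good_abs(x):
    return -x if x < 0 else x

def check(fn):
    return (fn(-2) == 2, f"fn(-2) returned {fn(-2)}")
```

The accumulated `memory` is what distinguishes reflective verification from brute-force retry: later attempts can be conditioned on an explicit diagnosis.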
Three-way consistency checking achieves 87% acceptance while maintaining zero false positives.
"To prevent annotations that are too trivial from being accepted, they test whether the annotations contain enough information to reconstruct functionally equivalent code."
This introduces contract completeness: A complete contract uniquely determines the implementation.
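A minimal sketch of completeness, using sorting as the classic example (the helper name is invented for illustration). The two clauses together uniquely determine the implementation's behavior; either clause alone admits trivially wrong outputs.

```python
from collections import Counter

def is_complete_sort_contract(inp, out):
    """Postcondition for sort(inp) -> out."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))   # output is ordered
    permutation = Counter(out) == Counter(inp)            # same multiset as input
    return ordered and permutation

# An incomplete contract (ordered only) accepts e.g. out=[] for any input;
# adding the permutation clause closes that gap.
```

This is the Clover criterion in miniature: a contract is complete when any code that satisfies it is functionally equivalent to the intended implementation.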
The research converges on a single principle:
Understand before specify. Structure before code. Reflection before fix.
- UNDERSTAND - Inspect context, grasp intent, identify constraints
- SPECIFY - Write COMPLETE contracts and decompose into sub-functions
- BUILD - Write code to pass the tests (leaves first)
- VALIDATE - If failure, reflect → return to appropriate phase → fix
Time spent on understanding and structure pays dividends at every stage:
- During specification → Better contracts from context awareness
- During implementation → Clearer targets (complete contracts = one valid solution)
- During verification → Catch bugs early (three-way consistency)
- During recovery → Explicit iteration to SPECIFY or UNDERSTAND
- For solvability → Clear, complete specs make problems tractable