Skip to content

Latest commit

Β 

History

History
297 lines (241 loc) Β· 10 KB

File metadata and controls

297 lines (241 loc) Β· 10 KB

CLAUDE.md - AI Assistant Context for ICR Plugin

This file provides context for Claude (or other AI assistants) when working with this codebase.

Acknowledgement

This project is inspired by the YouTube video "The AI Failure Mode Nobody Warned You About (And how to prevent it from happening)". The video brilliantly articulates the "intent problem" in AI agents - the gap between what users mean and what AI interprets - which this plugin aims to solve.

Project Overview

ICR (Intent-Check-Receipt) is a Claude Code plugin that implements a safety layer for AI agent operations. It addresses the "intent problem" - the gap between what users mean and what AI interprets.

The Three Phases

  1. Intent: Before executing an action, generate a structured document showing what the AI believes the user wants
  2. Check: Pause for verification (human and/or AI reviewer) based on action severity and confidence
  3. Receipt: Log the full audit trail including interpretation, approval, and outcome

Repository Structure

icr/
β”œβ”€β”€ .claude-plugin/
β”‚   └── plugin.json              # Plugin manifest - defines hooks, commands, skills
β”‚
β”œβ”€β”€ commands/                     # Slash commands (user-invocable)
β”‚   β”œβ”€β”€ receipts.md              # /icr:receipts - View receipt history
β”‚   β”œβ”€β”€ config.md                # /icr:config - View/edit configuration
β”‚   β”œβ”€β”€ check.md                 # /icr:check - Manual check trigger
β”‚   β”œβ”€β”€ trust.md                 # /icr:trust - Trust mode toggle
β”‚   β”œβ”€β”€ audit.md                 # /icr:audit - Deep receipt analysis
β”‚   β”œβ”€β”€ export.md                # /icr:export - Export receipts
β”‚   β”œβ”€β”€ stats.md                 # /icr:stats - Session statistics
β”‚   β”œβ”€β”€ debug.md                 # /icr:debug - Confidence debugging
β”‚   └── simulate.md              # /icr:simulate - Dry-run intent check
β”‚
β”œβ”€β”€ skills/
β”‚   └── intent-analysis/
β”‚       └── SKILL.md             # AI-invocable intent analysis skill
β”‚
β”œβ”€β”€ hooks/
β”‚   └── hooks.json               # Hook event configuration
β”‚
β”œβ”€β”€ scripts/                      # Bash scripts (hook implementations)
β”‚   β”œβ”€β”€ pre-tool-check.sh        # Main PreToolUse hook handler
β”‚   β”œβ”€β”€ permission-check.sh      # PermissionRequest hook
β”‚   β”œβ”€β”€ subagent-check.sh        # SubagentStop hook
β”‚   β”œβ”€β”€ statusline.sh            # Status line display
β”‚   └── lib/                     # Core library scripts
β”‚       β”œβ”€β”€ common.sh            # Shared utilities (logging, JSON helpers)
β”‚       β”œβ”€β”€ severity.sh          # Severity classification (LOW/MEDIUM/HIGH/CRITICAL)
β”‚       β”œβ”€β”€ confidence.sh        # Confidence scoring (0.0-1.0)
β”‚       β”œβ”€β”€ decision.sh          # Decision tree (AUTO_APPROVE/AI_REVIEW/HUMAN_REVIEW)
β”‚       β”œβ”€β”€ intent.sh            # Intent document generation
β”‚       β”œβ”€β”€ receipt.sh           # Receipt logging and rotation
β”‚       └── checkpoint.sh        # Checkpoint management for rollback
β”‚
β”œβ”€β”€ config/
β”‚   └── defaults.json            # Default configuration (Conservative preset)
β”‚
β”œβ”€β”€ schemas/                      # JSON Schema validation
β”‚   β”œβ”€β”€ config.json              # Configuration schema
β”‚   β”œβ”€β”€ receipt.json             # Receipt format schema
β”‚   └── intent-doc.json          # Intent document schema
β”‚
β”œβ”€β”€ prompts/                      # LLM prompt templates
β”‚   β”œβ”€β”€ intent-generation.md     # Template for generating intent documents
β”‚   β”œβ”€β”€ self-critique.md         # AI self-review template
β”‚   └── devils-advocate.md       # Adversarial safety review template
β”‚
β”œβ”€β”€ docs/                         # Documentation
β”‚   β”œβ”€β”€ ICR_PLUGIN_SPEC.md       # Full specification (4700+ lines)
β”‚   β”œβ”€β”€ SUMMARY.md               # Original intent problem research
β”‚   β”œβ”€β”€ QUICK_START.md           # Quick start with 10 use cases
β”‚   β”œβ”€β”€ USER_GUIDE.md            # Comprehensive user documentation
β”‚   β”œβ”€β”€ DEVELOPER_GUIDE.md       # Comprehensive developer documentation
β”‚   β”œβ”€β”€ ARCHITECTURE.md          # System architecture overview
β”‚   β”œβ”€β”€ CONTRIBUTING.md          # Contribution guidelines
β”‚   └── TROUBLESHOOTING.md       # Common issues and solutions
β”‚
β”œβ”€β”€ README.md                     # Project overview and quick reference
β”œβ”€β”€ CHANGELOG.md                  # Version history
β”œβ”€β”€ LICENSE                       # MIT License
└── CLAUDE.md                     # This file

Key Technologies

Technology Purpose Documentation
Bash All scripts are written in Bash Scripts use set -euo pipefail for safety
jq JSON processing in shell scripts Used for parsing/generating JSON
JSON Configuration, receipts, schemas Standard JSON format throughout
JSON Schema Validation Draft-07 JSON Schema
Markdown Commands, skills, prompts Frontmatter YAML + Markdown content

Common Development Tasks

Adding a New Slash Command

  1. Create commands/<name>.md with frontmatter:

    ---
    description: Brief description for help text
    user_invocable: true
    ---
    
    # Command Name
    
    [Command documentation...]
  2. Add command to plugin.json:

    {
      "commands": [
        {"name": "icr:<name>", "path": "commands/<name>.md"}
      ]
    }

Adding a New Severity Rule

Edit config/defaults.json:

{
  "severity": {
    "userRules": [
      {
        "pattern": "ToolName",
        "condition": "args.property === 'value'",
        "severity": "CRITICAL",
        "reason": "Why this is critical"
      }
    ]
  }
}

Modifying Confidence Weights

Edit config/defaults.json:

{
  "confidence": {
    "weights": {
      "ambiguityAnalysis": 0.30,
      "intentToActionDistance": 0.25,
      "historicalPatterns": 0.20,
      "uncertaintyMarkers": 0.25
    }
  }
}

Adding a New Hook

  1. Create script in scripts/<hook-name>.sh
  2. Register in hooks/hooks.json:
    {
      "hooks": {
        "HookEventName": [{
          "hooks": [{
            "type": "command",
            "command": "${CLAUDE_PLUGIN_ROOT}/scripts/<hook-name>.sh"
          }]
        }]
      }
    }

Code Patterns

Script Structure

All scripts follow this pattern:

#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
LIB_DIR="${SCRIPT_DIR}/lib"

source "${LIB_DIR}/common.sh"

main() {
    require_jq
    local input
    input=$(read_json_input)
    # ... processing ...
    echo "$result"
}

if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
    main "$@"
fi

JSON I/O Pattern

  • Hooks receive JSON via stdin
  • Hooks output JSON to stdout
  • Use jq for all JSON operations:
    # Reading
    local value=$(echo "$json" | jq -r '.field')
    
    # Writing
    jq -n --arg f "$field" '{"field": $f}'

Error Handling

  • Scripts use set -euo pipefail for strict error handling
  • Graceful degradation: if component fails, escalate to HUMAN_REVIEW
  • All errors are logged to .claude/icr/errors.log

Testing

Manual Testing

# Test severity classification
echo '{"tool_name": "Bash", "tool_input": {"command": "rm -rf /"}}' | \
  bash scripts/lib/severity.sh

# Test confidence calculation
bash scripts/lib/confidence.sh "delete old files" "Bash" '{"command": "rm"}' '{}' ".claude/icr/receipts"

Integration Testing

Use Claude Code with --plugin-dir ./icr to test the full plugin.

Runtime Data

ICR stores runtime data in .claude/icr/:

.claude/icr/
β”œβ”€β”€ config.json           # User overrides (merged with defaults)
β”œβ”€β”€ session-state.json    # Current session state
β”œβ”€β”€ errors.log            # Error log
β”œβ”€β”€ receipts/             # Audit trail
β”‚   └── YYYY-MM-DD/
β”‚       └── session-xxx/
β”‚           β”œβ”€β”€ session-meta.json
β”‚           └── 001-receipt.json
β”œβ”€β”€ exports/              # Exported receipts
└── checkpoints/          # Checkpoint metadata

Key Concepts

Severity Levels

Level Risk Examples Default Behavior
LOW Minimal Read, Glob, Grep Auto-approve
MEDIUM Moderate Write, Edit Confidence-based
HIGH Significant Bash Usually review
CRITICAL Dangerous Delete, rm -rf Always human review

Confidence Score (0.0 - 1.0)

Calculated from 4 weighted signals:

  • Ambiguity Analysis (30%): How many valid interpretations exist
  • Intent-to-Action Distance (25%): Semantic gap between request and action
  • Historical Patterns (20%): Similarity to past successful actions
  • Uncertainty Markers (25%): Hedging language in the request

Decision Routes

  • AUTO_APPROVE: Confidence β‰₯ auto threshold β†’ proceed immediately
  • AI_REVIEW: Confidence between AI and auto thresholds β†’ AI validates
  • HUMAN_REVIEW: Confidence < human threshold β†’ user must approve

Important Files to Understand

  1. scripts/pre-tool-check.sh: Main entry point, orchestrates the full ICR flow
  2. scripts/lib/decision.sh: Decision tree logic for routing
  3. config/defaults.json: All configurable options with defaults
  4. docs/ICR_PLUGIN_SPEC.md: Complete specification (reference for any questions)

Common Pitfalls

  1. JSON escaping: Always use jq for JSON generation, never string concatenation
  2. Variable quoting: Always quote variables in Bash: "$variable"
  3. Exit codes: Hook scripts must exit 0 for success
  4. Stdin consumption: Read stdin once and store in variable

Getting Help

  • Full Spec: docs/ICR_PLUGIN_SPEC.md has exhaustive details
  • User Guide: docs/USER_GUIDE.md for end-user documentation
  • Developer Guide: docs/DEVELOPER_GUIDE.md for implementation details
  • Troubleshooting: docs/TROUBLESHOOTING.md for common issues