Configuration Reference

← Back to README

This document provides comprehensive configuration documentation for the Inference Gateway CLI, including all configuration options, environment variables, and best practices.

Configuration System Overview
Configuration Layers
Configuration Precedence
Default Configuration
Configuration Options
Environment Variables
Environment Variable Substitution
Configuration Best Practices
Configuration Validation and Troubleshooting

Configuration System Overview

The CLI uses a powerful 2-layer configuration system built on Viper, supporting multiple configuration sources with proper precedence handling.

Configuration Layers

Userspace Configuration (~/.infer/config.yaml)
- Global configuration for the user across all projects
- Used as a fallback when no project-level configuration exists
- Can be created with: infer init --userspace or infer config init --userspace
Project Configuration (.infer/config.yaml in current directory)
- Project-specific configuration that takes precedence over userspace config
- Default location for most commands
- Can be created with: infer init or infer config init

Configuration Precedence

Configuration values are resolved in the following order (highest to lowest priority):

Environment Variables (INFER_* prefix) - Highest Priority
Command Line Flags
Project Config (.infer/config.yaml)
Userspace Config (~/.infer/config.yaml)
Built-in Defaults - Lowest Priority

Example: If your userspace config sets agent.model: "anthropic/claude-4" and your project config sets agent.model: "deepseek/deepseek-chat", the project config wins. However, if you also set INFER_AGENT_MODEL="openai/gpt-4", the environment variable takes precedence over both config files.

Usage Examples

# Create userspace configuration (global fallback)
infer init --userspace

# Create project configuration (takes precedence)
infer init

# Both configurations will be automatically merged when commands are run

You can also specify a custom config file using the --config flag which will override the automatic 2-layer loading.

Default Configuration

Below is the complete default configuration with all available options:

gateway:
  url: http://localhost:8080
  api_key: ""
  timeout: 200
  oci: ghcr.io/inference-gateway/inference-gateway:latest  # OCI image for Docker mode
  run: true    # Automatically run the gateway (enabled by default)
  docker: true  # Use Docker mode by default (set to false for binary mode)
  include_models: []  # Optional: only allow specific models (allowlist)
  exclude_models:
    - ollama_cloud/cogito-2.1:671b
    - ollama_cloud/kimi-k2:1t
    - ollama_cloud/kimi-k2-thinking
    - ollama_cloud/deepseek-v3.1:671b # Block specific models by default
client:
  timeout: 200
  retry:
    enabled: true
    max_attempts: 3
    initial_backoff_sec: 5
    max_backoff_sec: 60
    backoff_multiplier: 2
    retryable_status_codes: [400, 408, 429, 500, 502, 503, 504]
logging:
  debug: false
tools:
  enabled: true # Tools are enabled by default with safe read-only commands
  sandbox:
    directories: [".", "/tmp"] # Allowed directories for tool operations
    protected_paths: # Paths excluded from tool access for security
      - .infer/
      - .git/
      - *.env
  bash:
    enabled: true
    whitelist:
      commands: # Exact command matches
        - ls
        - pwd
        - echo
        - wc
        - sort
        - uniq
        - gh
        - task
        - docker ps
        - kubectl get pods
      patterns: # Regex patterns for more complex commands
        - ^git branch( --show-current)?$
        - ^git checkout -b [a-zA-Z0-9/_-]+( [a-zA-Z0-9/_-]+)?$
        - ^git checkout [a-zA-Z0-9/_-]+
        - ^git add [a-zA-Z0-9/_.-]+
        - ^git diff+
        - ^git remote -v$
        - ^git status$
        - ^git log --oneline -n [0-9]+$
        - ^git commit -m ".+"$
        - ^git push( --set-upstream)?( origin)?( [a-zA-Z0-9/_-]+)?$
  read:
    enabled: true
    require_approval: false
  write:
    enabled: true
    require_approval: true # Write operations require approval by default for security
  edit:
    enabled: true
    require_approval: true # Edit operations require approval by default for security
  delete:
    enabled: true
    require_approval: true # Delete operations require approval by default for security
  grep:
    enabled: true
    backend: auto # "auto", "ripgrep", or "go"
    require_approval: false
  tree:
    enabled: true
    require_approval: false
  web_fetch:
    enabled: true
    whitelisted_domains:
      - golang.org
    safety:
      max_size: 8192 # 8KB
      timeout: 30 # 30 seconds
      allow_redirect: true
    cache:
      enabled: true
      ttl: 3600 # 1 hour
      max_size: 52428800 # 50MB
  web_search:
    enabled: true
    default_engine: duckduckgo
    max_results: 10
    engines:
      - duckduckgo
      - google
    timeout: 10
  todo_write:
    enabled: true
    require_approval: false
  github:
    enabled: true
    token: "%GITHUB_TOKEN%"
    base_url: "https://api.github.com"
    owner: ""
    safety:
      max_size: 1048576  # 1MB
      timeout: 30        # 30 seconds
    require_approval: false
  safety:
    require_approval: true
agent:
  model: "" # Default model for agent operations
  system_prompt: | # System prompt for agent sessions
    Autonomous software engineering agent. Execute tasks iteratively until completion.

    IMPORTANT: You NEVER push to main or master or to the current branch - instead you create a branch and push to a branch.
    IMPORTANT: You NEVER read all the README.md - start by reading 300 lines

    RULES:
    - Security: Defensive only (analysis, detection, docs)
    - Style: no emojis/comments unless asked, use conventional commits
    - Code: Follow existing patterns, check deps, no secrets
    - Tasks: Use TodoWrite, mark progress immediately
    - Chat exports: Read only "## Summary" to "---" section
    - Tools: Batch calls, prefer Grep for search

    WORKFLOW:
    When asked to implement features or fix issues:
    1. Plan with TodoWrite
    2. Search codebase to understand context
    3. Implement solution
    4. Run tests with: task test
    5. Run lint/format with: task fmt and task lint
    6. Commit changes (only if explicitly asked)
    7. Create a pull request (only if explicitly asked)
  system_reminders:
    enabled: true
    interval: 4
    reminder_text: |
      System reminder text for maintaining context
  verbose_tools: false
  max_turns: 50 # Maximum number of turns for agent sessions
  max_tokens: 4096 # The maximum number of tokens that can be generated per request
  max_concurrent_tools: 5 # Maximum concurrent tool executions
chat:
  theme: tokyo-night
  status_bar:
    enabled: true
    indicators:
      model: true
      theme: true
      max_output: false
      a2a_agents: true
      tools: true
      background_shells: true
      mcp: true
      context_usage: true
      session_tokens: true
      git_branch: true
compact:
  enabled: false # Enable automatic conversation compaction
  auto_at: 80 # Compact when context reaches this percentage (20-100)

Configuration Options

Gateway Settings

gateway.url: The URL of the inference gateway (default: http://localhost:8080)
gateway.api_key: API key for authentication (if required)
gateway.timeout: Request timeout in seconds (default: 200)
gateway.run: Automatically run the gateway on startup (default: true)
- When enabled, the CLI automatically starts the gateway before running commands
- The gateway runs in the background and shuts down when the CLI exits
gateway.docker: Use Docker instead of binary mode (default: true)
- true (default): Uses Docker to run the gateway container (requires Docker installed)
- false: Downloads and runs the gateway as a binary (no Docker required)
gateway.oci: OCI image to use for Docker mode (default: ghcr.io/inference-gateway/inference-gateway:latest)
gateway.include_models: Only allow specific models (allowlist approach, default: [], allows all models)
- When set, only the specified models will be allowed by the gateway
- Example: ["deepseek/deepseek-reasoner", "deepseek/deepseek-chat"]
- This is passed to the gateway as the ALLOWED_MODELS environment variable
gateway.exclude_models: Block specific models (blocklist approach, default: [], blocks none)
- When set, all models are allowed except those in the list
- Example: ["openai/gpt-4", "anthropic/claude-4-opus"]
- This is passed to the gateway as the DISALLOWED_MODELS environment variable
- Note: include_models and exclude_models can be used together - the gateway will apply both filters

Client Settings

client.timeout: HTTP client timeout in seconds
client.retry.enabled: Enable automatic retries for failed requests
client.retry.max_attempts: Maximum number of retry attempts
client.retry.initial_backoff_sec: Initial delay between retries in seconds
client.retry.max_backoff_sec: Maximum delay between retries in seconds
client.retry.backoff_multiplier: Backoff multiplier for exponential delay
client.retry.retryable_status_codes: HTTP status codes that trigger retries (e.g., [400, 408, 429, 500, 502, 503, 504])

Logging Settings

logging.debug: Enable debug logging for verbose output

Tool Settings

tools.enabled: Enable/disable tool execution for LLMs (default: true)
tools.sandbox.directories: Allowed directories for tool operations (default: [".", "/tmp"])
tools.sandbox.protected_paths: Paths excluded from tool access for security (default: [".infer/", ".git/", "*.env"])
tools.whitelist.commands: List of allowed commands (supports arguments)
tools.whitelist.patterns: Regex patterns for complex command validation
tools.safety.require_approval: Prompt user before executing any command (default: true)
Individual tool settings: Each tool (Bash, Read, Write, Edit, Delete, Grep, Tree, WebFetch, WebSearch, TodoWrite) has:
- enabled: Enable/disable the specific tool
- require_approval: Override global safety setting for this tool (optional)

Compact Settings

compact.enabled: Enable automatic conversation compaction to reduce token usage (default: false)
compact.auto_at: Percentage of context window (20-100) at which to automatically trigger compaction (default: 80)

Agent Settings

agent.model: Default model for agent operations
agent.system_prompt: System prompt included with every agent session
agent.system_reminders.enabled: Enable/disable system reminders (default: true)
agent.system_reminders.interval: Number of messages between reminders (default: 10)
agent.system_reminders.text: Custom reminder text to provide contextual guidance
agent.verbose_tools: Enable verbose tool output (default: false)
agent.max_turns: Maximum number of turns for agent sessions (default: 50)
agent.max_tokens: Maximum tokens per agent request (default: 8192)
agent.max_concurrent_tools: Maximum number of tools that can execute concurrently (default: 5)

Web Search Settings

web_search.enabled: Enable/disable web search tool for LLMs (default: true)
web_search.default_engine: Default search engine to use ("duckduckgo" or "google", default: "duckduckgo")
web_search.max_results: Maximum number of search results to return (1-50, default: 10)
web_search.engines: List of available search engines
web_search.timeout: Search timeout in seconds (default: 10)

Chat Interface Settings

chat.theme: Chat interface theme name (default: "tokyo-night")
- Available themes: tokyo-night, github-light, dracula
- Can be changed during chat using /theme [theme-name] shortcut
- Affects colors and styling of the chat interface
chat.status_bar.enabled: Enable/disable the entire status bar (default: true)
- When disabled, no status indicators will be shown
- When enabled, individual indicators can be configured
chat.status_bar.indicators: Configuration for individual status bar indicators
- All indicators are enabled by default except max_output to maintain current behavior
- Available indicators:
  - model: Current AI model name (default: true)
  - theme: Current theme name (default: true)
  - max_output: Maximum output tokens (default: false)
  - a2a_agents: A2A agent readiness (ready/total) (default: true)
  - tools: Tool count and token usage (default: true)
  - background_shells: Running background shell count (default: true)
  - mcp: MCP server status and tool count (default: true)
  - context_usage: Token consumption percentage (default: true)
  - session_tokens: Session token usage statistics (default: true)
  - git_branch: Current Git branch name (default: true)
    - Only displays when in a Git repository
    - Uses 5-second cache for performance
    - Automatically updates after Git operations in bash mode
    - Long branch names are truncated with "..." indicator

Example Configuration:

chat:
  theme: tokyo-night
  status_bar:
    enabled: true
    indicators:
      model: true
      theme: false           # Hide theme indicator
      max_output: false
      a2a_agents: true
      tools: true
      background_shells: false # Hide background shells indicator
      mcp: true
      context_usage: true
      session_tokens: true
      git_branch: true       # Show current Git branch

Keybinding Configuration

The CLI supports customizable keybindings for the chat interface. Keybindings are disabled by default and must be explicitly enabled.

chat.keybindings.enabled: Enable/disable custom keybindings (default: false)
chat.keybindings.bindings: Map of keybinding configurations

Features:

Namespace-Based Organization: Action IDs use format namespace_action (e.g., global_quit, mode_cycle_agent_mode)
Context-Aware Conflict Detection: Validates conflicts only within the same namespace
Self-Documenting: All keybindings are visible in config with descriptions
No Runtime Validation: Config loaded once at startup for performance
Explicit Validation: Run infer keybindings validate to check config
Environment Variable Support: Configure keybindings via comma-separated env vars

Example Configuration:

chat:
  theme: tokyo-night
  keybindings:
    enabled: false  # Set to true to enable
    bindings:
      global_quit:  # Namespace: global, Action: quit
        keys:
          - ctrl+c
        description: "exit application"
        category: "global"
        enabled: true
      mode_cycle_agent_mode:  # Namespace: mode, Action: cycle_agent_mode
        keys:
          - shift+tab
        description: "cycle agent mode"
        category: "mode"
        enabled: true

Available Commands:

# List all keybindings
infer keybindings list

# Set custom key for an action (use namespaced action ID)
infer keybindings set mode_cycle_agent_mode ctrl+m

# Disable/enable specific actions
infer keybindings disable display_toggle_raw_format
infer keybindings enable display_toggle_raw_format

# Reset to defaults
infer keybindings reset

# Validate configuration (checks for conflicts within namespaces)
infer keybindings validate

Key Action Namespaces:

Actions are organized by namespace to distinguish between different contexts. The same key can be used in different namespaces without conflict.

global: Application-level actions (e.g., global_quit, global_cancel)
chat: Chat-specific actions (e.g., chat_enter_key_handler)
mode: Agent mode controls (e.g., mode_cycle_agent_mode)
tools: Tool-related actions (e.g., tools_toggle_tool_expansion)
display: Display toggles (e.g., display_toggle_raw_format, display_toggle_todo_box, display_toggle_thinking)
text_editing: Text manipulation (e.g., text_editing_move_cursor_left, text_editing_history_up)
navigation: Viewport navigation (e.g., navigation_scroll_to_top, navigation_page_down)
clipboard: Copy/paste operations (e.g., clipboard_copy_text, clipboard_paste_text)
selection: Selection mode controls (e.g., selection_toggle_mouse_mode)
plan_approval: Plan approval navigation (e.g., plan_approval_plan_approval_accept)
help: Help system (e.g., help_toggle_help)

Web Search API Setup (Optional)

Both search engines work out of the box, but for better reliability and performance in production, you can configure API keys:

Google Custom Search Engine:

Create a Custom Search Engine:
- Go to Google Programmable Search Engine
- Click "Add" to create a new search engine
- Enter a name for your search engine
- In "Sites to search", enter * to search the entire web
- Click "Create"
Get your Search Engine ID:
- In your search engine settings, note the "Search engine ID" (cx parameter)
Get a Google API Key:
- Go to the Google Cloud Console
- Create a new project or select an existing one
- Enable the "Custom Search JSON API"
- Go to "Credentials" and create an API key
- Restrict the API key to the Custom Search JSON API for security

Configure Environment Variables:

export GOOGLE_SEARCH_API_KEY="your_api_key_here"
export GOOGLE_SEARCH_ENGINE_ID="your_search_engine_id_here"

DuckDuckGo API (Optional):

export DUCKDUCKGO_SEARCH_API_KEY="your_api_key_here"

Note: Both engines have built-in fallback methods that work without API configuration. However, using official APIs provides better reliability and performance for production use.

Environment Variables

The CLI supports environment variable configuration with the INFER_ prefix. Environment variables override configuration file settings and are particularly useful for containerized deployments and CI/CD environments.

All configuration fields can be set via environment variables by converting the YAML path to uppercase and replacing dots (.) with underscores (_), then prefixing with INFER_.

Example: gateway.url → INFER_GATEWAY_URL, tools.bash.enabled → INFER_TOOLS_BASH_ENABLED

Gateway Configuration

INFER_GATEWAY_URL: Gateway URL (default: http://localhost:8080)
INFER_GATEWAY_API_KEY: Gateway API key for authentication
INFER_GATEWAY_TIMEOUT: Gateway request timeout in seconds (default: 200)
INFER_GATEWAY_OCI: OCI image for gateway (default: ghcr.io/inference-gateway/inference-gateway:latest)
INFER_GATEWAY_RUN: Auto-run gateway if not running (default: true)
INFER_GATEWAY_DOCKER: Use Docker to run gateway (default: true)

Client Configuration

INFER_CLIENT_TIMEOUT: HTTP client timeout in seconds (default: 200)
INFER_CLIENT_RETRY_ENABLED: Enable retry logic (default: true)
INFER_CLIENT_RETRY_MAX_ATTEMPTS: Maximum retry attempts (default: 3)
INFER_CLIENT_RETRY_INITIAL_BACKOFF_SEC: Initial backoff delay in seconds (default: 5)
INFER_CLIENT_RETRY_MAX_BACKOFF_SEC: Maximum backoff delay in seconds (default: 60)
INFER_CLIENT_RETRY_BACKOFF_MULTIPLIER: Backoff multiplier (default: 2)

Logging Configuration

INFER_LOGGING_DEBUG: Enable debug logging (default: false)
INFER_LOGGING_DIR: Log directory path (default: .infer/logs)

Agent Configuration

INFER_AGENT_MODEL: Default model for agent operations (e.g., deepseek/deepseek-chat)
INFER_AGENT_SYSTEM_PROMPT: Custom system prompt for agent
INFER_AGENT_SYSTEM_PROMPT_PLAN: Custom system prompt for plan mode
INFER_AGENT_VERBOSE_TOOLS: Enable verbose tool output (default: false)
INFER_AGENT_MAX_TURNS: Maximum agent turns (default: 100)
INFER_AGENT_MAX_TOKENS: Maximum tokens per response (default: 8192)
INFER_AGENT_MAX_CONCURRENT_TOOLS: Maximum concurrent tool executions (default: 5)

Chat Configuration

INFER_CHAT_THEME: Chat UI theme (light, dark, dracula, nord, solarized, default: dark)

Tools Configuration

INFER_TOOLS_ENABLED: Enable/disable all local tools (default: true)

Individual Tool Enablement:

INFER_TOOLS_BASH_ENABLED: Enable/disable Bash tool (default: true)
INFER_TOOLS_READ_ENABLED: Enable/disable Read tool (default: true)
INFER_TOOLS_WRITE_ENABLED: Enable/disable Write tool (default: true)
INFER_TOOLS_EDIT_ENABLED: Enable/disable Edit tool (default: true)
INFER_TOOLS_DELETE_ENABLED: Enable/disable Delete tool (default: true)
INFER_TOOLS_GREP_ENABLED: Enable/disable Grep tool (default: true)
INFER_TOOLS_TREE_ENABLED: Enable/disable Tree tool (default: true)
INFER_TOOLS_WEB_FETCH_ENABLED: Enable/disable WebFetch tool (default: true)
INFER_TOOLS_WEB_SEARCH_ENABLED: Enable/disable WebSearch tool (default: true)
INFER_TOOLS_GITHUB_ENABLED: Enable/disable Github tool (default: true)
INFER_TOOLS_TODO_WRITE_ENABLED: Enable/disable TodoWrite tool (default: true)

Tool Approval Configuration:

INFER_TOOLS_BASH_REQUIRE_APPROVAL: Require approval for Bash tool (default: false)
INFER_TOOLS_WRITE_REQUIRE_APPROVAL: Require approval for Write tool (default: true)
INFER_TOOLS_EDIT_REQUIRE_APPROVAL: Require approval for Edit tool (default: true)
INFER_TOOLS_DELETE_REQUIRE_APPROVAL: Require approval for Delete tool (default: true)

Bash Tool Whitelist Configuration:

The Bash tool supports whitelisting commands and patterns for security. These environment variables accept comma-separated or newline-separated values:

INFER_TOOLS_BASH_WHITELIST_COMMANDS: Comma-separated list of whitelisted commands
INFER_TOOLS_BASH_WHITELIST_PATTERNS: Comma-separated list of regex patterns for whitelisted commands

Examples:

# Whitelist specific commands
export INFER_TOOLS_BASH_WHITELIST_COMMANDS="gh,git,npm,task,make"

# Whitelist command patterns (regex)
export INFER_TOOLS_BASH_WHITELIST_PATTERNS="^gh .*,^git .*,^npm .*,^task .*"

# Combined example for GitHub Actions
export INFER_TOOLS_BASH_WHITELIST_COMMANDS="gh,git,npm"
export INFER_TOOLS_BASH_WHITELIST_PATTERNS="^gh .*,^git .*,^npm (install|test|run).*"

Grep Tool Configuration:

INFER_TOOLS_GREP_BACKEND: Grep backend to use (ripgrep or grep, default: ripgrep)

WebSearch Tool Configuration:

INFER_TOOLS_WEB_SEARCH_DEFAULT_ENGINE: Default search engine (duckduckgo or google, default: duckduckgo)
INFER_TOOLS_WEB_SEARCH_MAX_RESULTS: Maximum search results (default: 10)
INFER_TOOLS_WEB_SEARCH_TIMEOUT: Search timeout in seconds (default: 30)

WebFetch Tool Configuration:

INFER_TOOLS_WEB_FETCH_SAFETY_MAX_SIZE: Maximum fetch size in bytes (default: 10485760)
INFER_TOOLS_WEB_FETCH_SAFETY_TIMEOUT: Fetch timeout in seconds (default: 30)
INFER_TOOLS_WEB_FETCH_SAFETY_ALLOW_REDIRECT: Allow HTTP redirects (default: true)
INFER_TOOLS_WEB_FETCH_CACHE_ENABLED: Enable fetch caching (default: true)
INFER_TOOLS_WEB_FETCH_CACHE_TTL: Cache TTL in seconds (default: 900)
INFER_TOOLS_WEB_FETCH_CACHE_MAX_SIZE: Maximum cache size in bytes (default: 104857600)

GitHub Tool Configuration:

INFER_TOOLS_GITHUB_TOKEN: GitHub personal access token
INFER_TOOLS_GITHUB_BASE_URL: GitHub API base URL (default: https://api.github.com)
INFER_TOOLS_GITHUB_OWNER: Default GitHub owner/organization
INFER_TOOLS_GITHUB_REPO: Default GitHub repository
INFER_TOOLS_GITHUB_SAFETY_MAX_SIZE: Maximum GitHub file size in bytes (default: 10485760)
INFER_TOOLS_GITHUB_SAFETY_TIMEOUT: GitHub API timeout in seconds (default: 30)

Sandbox Configuration:

INFER_TOOLS_SANDBOX_DIRECTORIES: Comma-separated list of allowed directories (default: .,/tmp)

Storage Configuration

INFER_STORAGE_ENABLED: Enable conversation storage (default: true)
INFER_STORAGE_TYPE: Storage backend type (memory, sqlite, postgres, redis, default: sqlite)

SQLite Storage:

INFER_STORAGE_SQLITE_PATH: SQLite database path (default: .infer/conversations.db)

PostgreSQL Storage:

INFER_STORAGE_POSTGRES_HOST: PostgreSQL host
INFER_STORAGE_POSTGRES_PORT: PostgreSQL port (default: 5432)
INFER_STORAGE_POSTGRES_DATABASE: PostgreSQL database name
INFER_STORAGE_POSTGRES_USERNAME: PostgreSQL username
INFER_STORAGE_POSTGRES_PASSWORD: PostgreSQL password
INFER_STORAGE_POSTGRES_SSL_MODE: PostgreSQL SSL mode (default: disable)

Redis Storage:

INFER_STORAGE_REDIS_HOST: Redis host
INFER_STORAGE_REDIS_PORT: Redis port (default: 6379)
INFER_STORAGE_REDIS_PASSWORD: Redis password
INFER_STORAGE_REDIS_DB: Redis database number (default: 0)

Conversation Configuration

INFER_CONVERSATION_TITLE_GENERATION_ENABLED: Enable AI-powered title generation (default: true)
INFER_CONVERSATION_TITLE_GENERATION_MODEL: Model for title generation (default: anthropic/claude-4.1-haiku)
INFER_CONVERSATION_TITLE_GENERATION_BATCH_SIZE: Batch size for title generation (default: 5)
INFER_CONVERSATION_TITLE_GENERATION_INTERVAL: Interval in seconds between title generation attempts (default: 30)

A2A (Agent-to-Agent) Configuration

INFER_A2A_ENABLED: Enable/disable A2A tools (default: true)
INFER_A2A_AGENTS: Configure A2A agent endpoints (supports comma-separated or newline-separated format)

A2A Agents Configuration Examples:

# Comma-separated format
export INFER_A2A_AGENTS="http://agent1:8080,http://agent2:8080,http://agent3:8080"

# Newline-separated format (useful in docker-compose)
export INFER_A2A_AGENTS="
http://google-calendar-agent:8080
http://n8n-agent:8080
http://documentation-agent:8080
http://browser-agent:8080
"

A2A Cache Configuration:

INFER_A2A_CACHE_ENABLED: Enable/disable A2A agent card caching (default: true)
INFER_A2A_CACHE_TTL: Cache TTL in seconds for A2A agent cards (default: 300)

A2A Task Configuration:

INFER_A2A_TASK_STATUS_POLL_SECONDS: Status polling interval in seconds (default: 10)
INFER_A2A_TASK_POLLING_STRATEGY: Polling strategy (fixed or exponential, default: exponential)
INFER_A2A_TASK_INITIAL_POLL_INTERVAL_SEC: Initial polling interval for exponential strategy (default: 2)
INFER_A2A_TASK_MAX_POLL_INTERVAL_SEC: Maximum polling interval for exponential strategy (default: 30)
INFER_A2A_TASK_BACKOFF_MULTIPLIER: Backoff multiplier for exponential strategy (default: 1.5)
INFER_A2A_TASK_BACKGROUND_MONITORING: Enable background task monitoring (default: true)
INFER_A2A_TASK_COMPLETED_TASK_RETENTION: Completed task retention in seconds (default: 3600)

A2A Individual Tool Configuration:

INFER_A2A_TOOLS_SUBMIT_TASK_ENABLED: Enable/disable A2A SubmitTask tool (default: true)
INFER_A2A_TOOLS_SUBMIT_TASK_REQUIRE_APPROVAL: Require approval for SubmitTask (default: false)
INFER_A2A_TOOLS_QUERY_AGENT_ENABLED: Enable/disable A2A QueryAgent tool (default: true)
INFER_A2A_TOOLS_QUERY_AGENT_REQUIRE_APPROVAL: Require approval for QueryAgent (default: false)
INFER_A2A_TOOLS_QUERY_TASK_ENABLED: Enable/disable A2A QueryTask tool (default: true)
INFER_A2A_TOOLS_QUERY_TASK_REQUIRE_APPROVAL: Require approval for QueryTask (default: false)

Export Configuration

INFER_EXPORT_OUTPUT_DIR: Output directory for exported conversations (default: ./exports)
INFER_EXPORT_SUMMARY_MODEL: Model for generating export summaries (default: anthropic/claude-4.1-haiku)

Compact Configuration

INFER_COMPACT_ENABLED: Enable automatic conversation compaction (default: false)
INFER_COMPACT_AUTO_AT: Auto-compact after N messages (default: 100)

Git Configuration

INFER_GIT_COMMIT_MESSAGE_MODEL: Model for AI-generated commit messages (default: deepseek/deepseek-chat)

SCM Configuration

INFER_SCM_PR_CREATE_BASE_BRANCH: Base branch for PR creation (default: main)
INFER_SCM_PR_CREATE_BRANCH_PREFIX: Branch prefix for PR creation (default: feature/)
INFER_SCM_PR_CREATE_MODEL: Model for PR creation (default: deepseek/deepseek-chat)
INFER_SCM_CLEANUP_RETURN_TO_BASE: Return to base branch after PR creation (default: true)
INFER_SCM_CLEANUP_DELETE_LOCAL_BRANCH: Delete local branch after PR creation (default: false)

Keybinding Environment Variables

Keybindings can be configured via environment variables (supports comma-separated or newline-separated lists):

# Enable keybindings
export INFER_CHAT_KEYBINDINGS_ENABLED=true

# Set keys for an action (comma-separated or newline-separated)
export INFER_CHAT_KEYBINDINGS_BINDINGS_GLOBAL_QUIT_KEYS="ctrl+q,ctrl+x"

# Multiline format
export INFER_CHAT_KEYBINDINGS_BINDINGS_MODE_CYCLE_AGENT_MODE_KEYS="shift+tab
ctrl+m"

# Enable/disable specific actions
export INFER_CHAT_KEYBINDINGS_BINDINGS_DISPLAY_TOGGLE_RAW_FORMAT_ENABLED=false

Format: INFER_CHAT_KEYBINDINGS_BINDINGS_<ACTION_ID>_<FIELD>

<ACTION_ID>: Uppercase namespaced action ID (e.g., GLOBAL_QUIT, MODE_CYCLE_AGENT_MODE)
<FIELD>: Either KEYS (comma/newline-separated) or ENABLED (true/false)

Environment Variable Substitution

Configuration values support environment variable substitution using the %VAR_NAME% syntax:

gateway:
  api_key: "%INFER_API_KEY%"

tools:
  github:
    token: "%GITHUB_TOKEN%"

This allows sensitive values to be stored as environment variables while keeping them out of configuration files.

Configuration Best Practices

Security

Never commit sensitive data (API keys, tokens) to configuration files
Use environment variable substitution (%VAR_NAME%) for sensitive values
Use environment variables (INFER_*) for CI/CD environments

Organization

Use project config (.infer/config.yaml) for project-specific settings
Use userspace config (~/.infer/config.yaml) for personal preferences
Commit project configs to version control, exclude userspace configs

Example Workflow

# 1. Setup userspace defaults
infer config --userspace agent set-model "deepseek/deepseek-chat"
infer config --userspace agent set-system "You are a helpful assistant"

# 2. Project-specific overrides
infer config agent set-model "deepseek/deepseek-chat"  # Project-specific model
infer config tools bash enable  # Enable bash tools for this project

# 3. Runtime overrides
INFER_AGENT_VERBOSE_TOOLS=true infer chat  # Temporary verbose mode

Configuration Validation and Troubleshooting

The CLI validates configuration on startup and provides helpful error messages for:

Invalid YAML syntax
Unknown configuration keys
Invalid value types (string vs boolean vs integer)
Missing required values

Common Issues

Configuration not found: Check that the config file exists and has correct YAML syntax
Environment variables not working: Ensure proper INFER_ prefix and underscore conversion
Precedence confusion: Remember that environment variables override config files

Debugging

# Enable verbose logging
infer -v config show

# Enable debug logging
INFER_LOGGING_DEBUG=true infer config show

# Check which config file is being used
infer config show | grep "Configuration file"