GrayCodeAI/eyrie

eyrie

Universal LLM Provider Runtime

One interface for every model. Authentication, routing, streaming, retries, caching — handled.

Quick Start · Features · Docs · Examples · Providers · Architecture · Contributing


What is eyrie

eyrie is the LLM provider runtime that powers the hawk coding agent. It handles everything between your application and LLM APIs — authentication, model resolution, streaming, retries, rate limiting, and caching.

When your app calls a model, eyrie figures out which provider to use, how to talk to it, and how to stream the response back. Switch from Anthropic to Ollama? eyrie handles the translation. API returns 529? eyrie retries with backoff. Response hits max_tokens? eyrie continues automatically.

Your app never talks to an LLM API directly. eyrie does.

Quick Start

go get github.com/GrayCodeAI/eyrie

Requires Go 1.26+. Minimal dependencies (UUID, OpenTelemetry, SQLite, keyring).

import "github.com/GrayCodeAI/eyrie/client"

// Create a client — provider auto-detected from environment
c := client.NewEyrieClient(&client.EyrieConfig{
    Provider: client.DetectProvider(),
})

// Stream a response
sr, err := c.StreamChat(ctx, messages, client.ChatOptions{
    Model: "claude-sonnet-4-6",
})
if err != nil {
    return err // handle the error before touching sr
}
defer sr.Close()

for evt := range sr.Events {
    switch evt.Type {
    case "content":   // stream text
    case "tool_call": // execute tool
    case "done":      // response complete
    }
}

Features

Provider Routing

Automatically detects and routes to the right provider based on environment variables, config files, or explicit selection.
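As a rough illustration of environment-based detection, the sketch below walks a fixed list of providers and returns the first one whose variable is set. The function name, the struct, and the priority order are assumptions made for this example; they are not eyrie's actual detection logic, which uses its own priority order.

```go
package main

import (
	"fmt"
	"os"
)

// providerEnv pairs a provider ID with the environment variable that enables it.
type providerEnv struct {
	ID  string
	Env string
}

// detectionOrder is an illustrative priority order (an assumption, not eyrie's).
var detectionOrder = []providerEnv{
	{"anthropic", "ANTHROPIC_API_KEY"},
	{"openai", "OPENAI_API_KEY"},
	{"gemini", "GEMINI_API_KEY"},
	{"ollama", "OLLAMA_BASE_URL"},
}

// detectProvider returns the first provider whose variable is non-empty.
// getenv is injected so the lookup can be tested without the real environment.
func detectProvider(getenv func(string) string) (string, bool) {
	for _, p := range detectionOrder {
		if getenv(p.Env) != "" {
			return p.ID, true
		}
	}
	return "", false
}

func main() {
	if id, ok := detectProvider(os.Getenv); ok {
		fmt.Println("detected provider:", id)
	} else {
		fmt.Println("no provider configured")
	}
}
```

Injecting the lookup function keeps the sketch testable; an explicit selection would simply bypass this step.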

Model Resolution

Maps abstract tiers (opus/sonnet/haiku) to concrete model IDs per provider. Ships with an embedded catalog of pricing, context windows, and capabilities.
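Conceptually, tier resolution is a two-level lookup from provider and tier to a model ID. The sketch below is a hypothetical stand-in for the embedded catalog: the type and function names are invented for illustration, and only the anthropic/sonnet entry is taken from this README (the other entry is a placeholder).

```go
package main

import "fmt"

type tier string

const (
	tierOpus   tier = "opus"
	tierSonnet tier = "sonnet"
	tierHaiku  tier = "haiku"
)

// tierModels maps provider -> tier -> model ID.
// Only anthropic/sonnet comes from the README; the ollama entry is a placeholder.
var tierModels = map[string]map[tier]string{
	"anthropic": {tierSonnet: "claude-sonnet-4-6"},
	"ollama":    {tierSonnet: "placeholder-local-model"},
}

// resolveModel returns the concrete model ID for a provider and tier.
func resolveModel(provider string, t tier) (string, error) {
	models, ok := tierModels[provider]
	if !ok {
		return "", fmt.Errorf("unknown provider %q", provider)
	}
	id, ok := models[t]
	if !ok {
		return "", fmt.Errorf("provider %q has no model for tier %q", provider, t)
	}
	return id, nil
}

func main() {
	id, err := resolveModel("anthropic", tierSonnet)
	if err != nil {
		fmt.Println("resolve failed:", err)
		return
	}
	fmt.Println(id)
}
```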

Streaming

Parses SSE for Anthropic and OpenAI formats — text, tool calls, and thinking blocks.

Reliability

  • Retries on 429/500/529 with exponential backoff and Retry-After support
  • Auto-continuation when stop_reason == max_tokens
  • Provider fallback chains for high availability

Rate Limiting

A per-provider token bucket rate limiter throttles outgoing requests so that calls stay under provider API limits instead of being rejected by them.

Caching

  • Response caching with configurable TTL
  • Semantic similarity caching for repeated prompts
  • Anthropic prompt caching breakpoints on system prompt and conversation prefix

Cost Tracking

Built-in cost estimation per call, with per-provider pricing from the embedded model catalog.
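Per-call cost estimation reduces to multiplying token counts by per-token prices. The sketch below assumes prices are quoted per million tokens and uses made-up numbers; the struct and function are hypothetical and do not reflect the catalog's real schema or prices.

```go
package main

import "fmt"

// pricing holds prices in USD per million tokens (hypothetical schema).
type pricing struct {
	InputPerMTok  float64
	OutputPerMTok float64
}

// estimateCost returns the estimated USD cost of one call.
func estimateCost(p pricing, inputTokens, outputTokens int) float64 {
	return float64(inputTokens)/1e6*p.InputPerMTok +
		float64(outputTokens)/1e6*p.OutputPerMTok
}

func main() {
	example := pricing{InputPerMTok: 2.0, OutputPerMTok: 10.0} // invented prices
	fmt.Printf("$%.4f\n", estimateCost(example, 12000, 800))
}
```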

Reasoning Controls

Passes reasoning_effort and Anthropic extended-thinking thinking_budget_tokens through to capable models; both fields are omitted from the request when unset.
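In Go, "omitted when unset" is typically achieved with omitempty tags and pointer fields. The struct below is a hypothetical options shape for illustration only: the two field names come from this README, while the struct itself is not eyrie's type and is not any provider's wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatRequest is a hypothetical request shape. The pointer lets an unset
// budget be distinguished from an explicit zero.
type chatRequest struct {
	Model                string `json:"model"`
	ReasoningEffort      string `json:"reasoning_effort,omitempty"`
	ThinkingBudgetTokens *int   `json:"thinking_budget_tokens,omitempty"`
}

// encode marshals the request, returning "" if marshalling fails.
func encode(r chatRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	budget := 2048
	fmt.Println(encode(chatRequest{Model: "claude-sonnet-4-6"}))
	fmt.Println(encode(chatRequest{Model: "claude-sonnet-4-6", ThinkingBudgetTokens: &budget}))
}
```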

Keyless CI Auth

GitHub OIDC keyless authentication for cloud deployments: eyrie mints a short-lived token in GitHub Actions and exchanges it for AWS Bedrock (STS AssumeRoleWithWebIdentity) or GCP Vertex (Workload Identity Federation) credentials, so no long-lived secrets need to be stored.

OpenAI-Compatible Proxy

Serves POST /v1/chat/completions so existing OpenAI SDK clients can talk to eyrie unchanged.

Load-Balancing Strategies

Named routing strategies beyond weighted random: simple-shuffle, least-busy, latency-based, cost-based, and usage-based.
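To make one of these strategies concrete, the sketch below shows a least-busy pick: choose the deployment with the fewest in-flight requests. The types and the tie-breaking rule (first wins) are assumptions for illustration, not eyrie's router.

```go
package main

import "fmt"

// deployment is a hypothetical routing target with a live request count.
type deployment struct {
	Name     string
	InFlight int
}

// leastBusy returns the deployment with the fewest in-flight requests.
// On ties, the earliest entry wins. It reports false for an empty slice.
func leastBusy(ds []deployment) (deployment, bool) {
	if len(ds) == 0 {
		return deployment{}, false
	}
	best := ds[0]
	for _, d := range ds[1:] {
		if d.InFlight < best.InFlight {
			best = d
		}
	}
	return best, true
}

func main() {
	pool := []deployment{
		{Name: "anthropic", InFlight: 4},
		{Name: "openrouter", InFlight: 1},
		{Name: "ollama", InFlight: 3},
	}
	if d, ok := leastBusy(pool); ok {
		fmt.Println("route to:", d.Name)
	}
}
```

The latency-, cost-, and usage-based strategies follow the same shape with a different comparison key.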

Pluggable Cache & Audit Sinks

Distributed CacheBackend interface (in-memory default, RESP/Redis-capable) and an AuditSink interface (no-op default, JSONL file sink) for privacy-preserving call metadata.

Model Role Slots

Named primary / weak / editor model slots with fallback to primary, plus an LLM summarizing condenser that shrinks long conversation histories using the weak model.
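The fallback rule for role slots is simple enough to show directly: an unset weak or editor slot resolves to the primary model. The struct, method, and the placeholder model name below are assumptions for illustration, not eyrie's types.

```go
package main

import "fmt"

// modelSlots is a hypothetical holder for the three named roles.
type modelSlots struct {
	Primary string
	Weak    string
	Editor  string
}

// resolve returns the model for a role, falling back to Primary when the
// requested slot is empty or the role is unknown.
func (s modelSlots) resolve(role string) string {
	switch role {
	case "weak":
		if s.Weak != "" {
			return s.Weak
		}
	case "editor":
		if s.Editor != "" {
			return s.Editor
		}
	}
	return s.Primary
}

func main() {
	slots := modelSlots{Primary: "claude-sonnet-4-6", Weak: "placeholder-small-model"}
	fmt.Println("summaries use:", slots.resolve("weak"))
	fmt.Println("edits use:", slots.resolve("editor")) // falls back to primary
}
```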

Rerank & Readiness

POST /rerank endpoint (provider-backed with lexical fallback) and a GET /ready readiness probe alongside the existing health check.
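A lexical fallback for reranking can be as simple as scoring each document by the share of query terms it contains. The sketch below uses that term-overlap heuristic, which is an assumption chosen for illustration; the README does not say which lexical method eyrie uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type scored struct {
	Doc   string
	Score float64
}

// terms lowercases s and returns its whitespace-separated words as a set.
func terms(s string) map[string]bool {
	set := make(map[string]bool)
	for _, w := range strings.Fields(strings.ToLower(s)) {
		set[w] = true
	}
	return set
}

// lexicalRerank scores each document by the fraction of distinct query terms
// it contains, then sorts by score (descending, stable for ties).
func lexicalRerank(query string, docs []string) []scored {
	q := terms(query)
	out := make([]scored, 0, len(docs))
	for _, d := range docs {
		dt := terms(d)
		hits := 0
		for w := range q {
			if dt[w] {
				hits++
			}
		}
		score := 0.0
		if len(q) > 0 {
			score = float64(hits) / float64(len(q))
		}
		out = append(out, scored{Doc: d, Score: score})
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	docs := []string{"token bucket limiter", "retry with exponential backoff", "retry budget"}
	for _, r := range lexicalRerank("retry backoff", docs) {
		fmt.Printf("%.2f  %s\n", r.Score, r.Doc)
	}
}
```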

gRPC Skeleton

Dependency-free gRPC API skeleton behind the grpc build tag — wired when generated stubs are available.

Documentation

Detailed documentation is available in the docs/ directory.

Examples

Runnable examples are in the examples/ directory.

Run any example with:

ANTHROPIC_API_KEY=sk-... go run ./examples/basic/

Supported Providers

12 setup gateways in catalog/registry/providers.go (hawk /config uses the same list):

| Provider | ID | Env variable |
| --- | --- | --- |
| Anthropic | anthropic | ANTHROPIC_API_KEY |
| OpenAI | openai | OPENAI_API_KEY |
| Google Gemini | gemini | GEMINI_API_KEY |
| OpenRouter | openrouter | OPENROUTER_API_KEY |
| xAI (Grok) | grok | XAI_API_KEY |
| Z.AI | z-ai | ZAI_API_KEY |
| CanopyWave | canopywave | CANOPYWAVE_API_KEY |
| OpenCode Go | opencodego | OPENCODEGO_API_KEY |
| Kimi (Moonshot) | kimi | MOONSHOT_API_KEY |
| Xiaomi (MiMo) Pay-as-you-go | xiaomi_mimo_payg | XIAOMI_MIMO_PAYG_API_KEY |
| Xiaomi (MiMo) Token Plan | xiaomi_mimo_token_plan | XIAOMI_MIMO_TOKEN_PLAN_API_KEY (+ region cn / sgp / ams) |
| Ollama | ollama | OLLAMA_BASE_URL (local; no API key) |

Runtime auto-detection uses a separate priority order for chat when no deployment is pinned; see config profiles.

Usage

Basic Chat

resp, err := c.Chat(ctx, messages, client.ChatOptions{
    Model: "gpt-4o",
})

Streaming with Continuation

// Auto-continues when max_tokens is hit
resp, err := client.ChatWithContinuation(ctx, provider, messages,
    client.ChatOptions{Model: model},
    client.DefaultContinuationConfig(),
)

Mock Provider for Testing

mock := client.NewMockProvider(client.MockModeFixed)
mock.Response = "Here is the code you asked for..."

resp, _ := mock.Chat(ctx, messages, opts)
// No real API calls — perfect for tests

Model Catalog

cat := catalog.DefaultModelCatalog()

// Get the best model for a tier
model := catalog.GetPreferredProviderModel("anthropic", catalog.TierSonnet, &cat)
// → "claude-sonnet-4-6"

// Check deprecation warnings
warn := catalog.GetModelDeprecationWarning("claude-3-7-sonnet", "anthropic")

Provider Configuration

cfg := config.LoadProviderConfig("")             // load from disk
config.ApplyProviderConfigToEnv(cfg, false, nil) // apply to environment
config.SaveProviderConfig(cfg, "")               // save changes

Architecture

eyrie/
├── cmd/eyrie/              # CLI binary
├── client/                 # Provider client & streaming interface
├── config/                 # Provider configuration & routing
│   └── credential/         # Credential file management
├── catalog/                # Model catalog & tier system
│   ├── discover/           # Model discovery
│   ├── legacy/             # Legacy model support
│   ├── live/               # Live model data
│   └── registry/           # Model registry
├── codeagent/              # Code agent retry & fallback strategies
├── conversation/           # Conversation engine with branching
├── credentials/            # Credential management
├── docs/                   # Documentation & guides
├── examples/               # Runnable code examples
├── router/                 # Weighted provider router
├── runtime/                # Runtime manifest & routing policies
├── storage/                # SQLite conversation DAG store
├── types/                  # Branded types & API errors
├── errors/                 # Error message constants
├── constants/              # API limits
├── utils/                  # Error utilities
├── internal/
│   ├── api/                # HTTP API handlers
│   ├── cache/              # Response cache warmer
│   ├── health/             # Provider health checker
│   ├── observability/      # OpenTelemetry spans & metrics
│   ├── sdk/                # Go, Python, TypeScript client SDKs
│   └── version/            # Version information
└── assets/                 # Logo and branding

See docs/ARCHITECTURE.md for detailed system design and data flows.

Ecosystem

eyrie is part of the hawk-eco:

| Component | Repository | Purpose |
| --- | --- | --- |
| hawk | GrayCodeAI/hawk | AI coding agent |
| eyrie | This repo | LLM provider runtime |
| tok | GrayCodeAI/tok | Tokenizer & compression |
| yaad | GrayCodeAI/yaad | Graph-based memory |
| trace | GrayCodeAI/trace | Session capture |

Development

Prerequisites

  • Go 1.26+

Build & Test

go build ./...               # Verify the library compiles
go test -race ./...           # Run all tests with race detector
make ci                       # Run full CI suite (lint, test, security)
make cover                    # Generate coverage report

Contributing

We welcome contributions! Please see CONTRIBUTING.md for development setup, commit conventions, and the PR process.

Quick start:

  1. Fork and create a branch: git checkout -b feat/short-description
  2. Make changes in small, focused commits
  3. Run make ci locally
  4. Open a pull request

Use Conventional Commits for commit messages — release-please uses them for versioning.

License

MIT — see LICENSE for details.

© 2026 GrayCode AI
