HelixAgent API Documentation

Overview

HelixAgent provides a unified OpenAI-compatible API that aggregates responses from multiple LLM providers (DeepSeek, Qwen, OpenRouter, Claude, Gemini) and offers intelligent ensemble capabilities. The system includes advanced features like Model Context Protocol (MCP) support, Language Server Protocol (LSP) integration, intelligent tool orchestration, context management, and security sandboxing.

Base URL

Development: http://localhost:7061
Production: https://api.yourdomain.com

Authentication

HelixAgent uses JWT-based authentication for secure access:

# Get JWT token
curl -X POST http://localhost:7061/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "username": "your-username",
    "password": "your-password"
  }'

# Use token in requests
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  http://localhost:7061/v1/models

OAuth2 Limitations

Important: OAuth tokens from CLI tools are product-restricted and cannot be used for general API calls.

HelixAgent supports OAuth2 authentication for certain providers, but there are important limitations to understand:

Claude OAuth Tokens

Token Source	API Access
`~/.claude/.credentials.json` (from `claude auth login`)	Restricted to Claude Code only

What works:

HelixAgent can read OAuth tokens from the credentials file
Tokens are valid and non-expired

What doesn't work:

Using Claude OAuth tokens for general API requests returns: "This credential is only authorized for use with Claude Code and cannot be used for other API requests."

Solution: Get an API key from https://console.anthropic.com/

Qwen OAuth Tokens

Token Source	API Access
`~/.qwen/oauth_creds.json` (from Qwen CLI login)	For Qwen Portal only

What works:

HelixAgent can read OAuth tokens from the credentials file
Tokens are valid for portal.qwen.ai

What doesn't work:

Using Qwen OAuth tokens for DashScope API returns: "invalid_api_key" (tokens are for portal use only)

Solution: Get a DashScope API key from https://dashscope.aliyuncs.com/

Configuration

To use OAuth2 credentials (with above limitations), set:

# Enable OAuth credential reading for Claude
CLAUDE_CODE_USE_OAUTH_CREDENTIALS=true

# Enable OAuth credential reading for Qwen
QWEN_CODE_USE_OAUTH_CREDENTIALS=true

For production deployments, we recommend using API keys from provider consoles instead of OAuth tokens.

API Endpoint Summary

Endpoint	Method	Authentication	Description
`/health`	GET	None	Basic health check
`/v1/health`	GET	None	Enhanced health with provider status
`/metrics`	GET	None	Prometheus metrics
`/v1/models`	GET	None	List available models
`/v1/providers`	GET	None	Public provider listing
`/v1/auth/register`	POST	None	Register new user
`/v1/auth/login`	POST	None	Authenticate user
`/v1/auth/refresh`	POST	Bearer Token	Refresh JWT token
`/v1/auth/logout`	POST	Bearer Token	Invalidate token
`/v1/auth/me`	GET	Bearer Token	Get current user info
`/v1/completions`	POST	Bearer Token	Legacy text completion
`/v1/completions/stream`	POST	Bearer Token	Streaming text completion
`/v1/chat/completions`	POST	Bearer Token	Chat completion
`/v1/chat/completions/stream`	POST	Bearer Token	Streaming chat completion
`/v1/ensemble/completions`	POST	Bearer Token	Direct ensemble completion
`/v1/providers`	GET	Bearer Token	Detailed provider info
`/v1/providers/:name/health`	GET	Bearer Token	Provider-specific health
`/v1/admin/health/all`	GET	Admin Token	Comprehensive system health
`/v1/mcp/capabilities`	GET	Bearer Token	MCP server capabilities
`/v1/mcp/tools`	GET	Bearer Token	Available MCP tools
`/v1/mcp/tools/call`	POST	Bearer Token	Execute MCP tool
`/v1/mcp/tools/search`	GET/POST	Bearer Token	Search MCP tools
`/v1/mcp/tools/suggestions`	GET	Bearer Token	Tool suggestions
`/v1/mcp/prompts`	GET	Bearer Token	Available MCP prompts
`/v1/mcp/resources`	GET	Bearer Token	Available MCP resources
`/v1/mcp/adapters/search`	GET/POST	Bearer Token	Search MCP adapters
`/v1/mcp/categories`	GET	Bearer Token	Tool categories
`/v1/mcp/stats`	GET	Bearer Token	MCP usage statistics
`/v1/debates`	POST	Bearer Token	Create AI debate
`/v1/debates/{id}`	GET	Bearer Token	Get debate information
`/v1/debates/{id}/status`	GET	Bearer Token	Get debate status
`/v1/debates/{id}/results`	GET	Bearer Token	Get debate results
`/v1/debates/{id}/approve`	POST	Bearer Token	Approve debate gate
`/v1/debates/{id}/reject`	POST	Bearer Token	Reject debate gate
`/v1/debates/{id}/gates`	GET	Bearer Token	Get approval gates
`/v1/debates/{id}/audit`	GET	Bearer Token	Get audit trail
`/v1/debates/team`	GET	None	Get debate team config
`/v1/debates/orchestrator/status`	GET	Bearer Token	Orchestrator status
`/v1/acp/health`	GET	None	ACP service health
`/v1/acp/agents`	GET	None	List ACP agents
`/v1/acp/execute`	POST	None	Execute agent task
`/v1/acp/rpc`	POST	None	JSON-RPC 2.0 endpoint
`/v1/lsp/servers`	GET	Bearer Token	List LSP servers
`/v1/lsp/execute`	POST	Bearer Token	Execute LSP request
`/v1/lsp/sync`	POST	Bearer Token	Sync LSP servers
`/v1/lsp/stats`	GET	Bearer Token	LSP statistics
`/v1/vision/health`	GET	None	Vision service health
`/v1/vision/capabilities`	GET	None	Vision capabilities
`/v1/vision/analyze`	POST	None	Analyze image
`/v1/vision/ocr`	POST	None	OCR extraction
`/v1/vision/detect`	POST	None	Object detection
`/v1/cognee/health`	GET	None	Cognee service health
`/v1/cognee/memory`	POST	None	Add to knowledge graph
`/v1/cognee/search`	POST	None	Search knowledge graph
`/v1/cognee/cognify`	POST	None	Cognify content
`/v1/rag/health`	GET	Bearer Token	RAG system health
`/v1/rag/documents`	POST	Bearer Token	Ingest document
`/v1/rag/search`	POST	Bearer Token	Search documents
`/v1/rag/search/hybrid`	POST	Bearer Token	Hybrid search
`/v1/embeddings/generate`	POST	Bearer Token	Generate embeddings
`/v1/embeddings/search`	POST	Bearer Token	Vector search
`/v1/protocols/execute`	POST	Bearer Token	Execute protocol request
`/v1/protocols/servers`	GET	Bearer Token	List protocol servers
`/v1/sessions`	POST/GET	Bearer Token	Session management
`/v1/agents`	GET	Bearer Token	CLI agent registry
`/v1/features`	GET	None	Feature flags
`/v1/format`	POST	None	Format code
`/v1/formatters`	GET	None	List formatters
`/v1/agentic/workflows`	POST/GET	Bearer Token	Agentic workflows
`/v1/planning/hiplan`	POST	Bearer Token	Hierarchical planning
`/v1/planning/mcts`	POST	Bearer Token	Monte Carlo Tree Search
`/v1/planning/tot`	POST	Bearer Token	Tree of Thoughts
`/v1/llmops/experiments`	POST/GET	Bearer Token	A/B experiments
`/v1/llmops/evaluate`	POST	Bearer Token	Continuous evaluation
`/v1/llmops/prompts`	POST/GET	Bearer Token	Prompt versioning
`/v1/benchmark/run`	POST	Bearer Token	Start benchmark
`/v1/benchmark/results`	GET	Bearer Token	Benchmark results
`/v1/discovery/models`	GET	Bearer Token	Discovered models
`/v1/scoring/model/{id}`	GET	Bearer Token	Model score
`/v1/scoring/top`	GET	Bearer Token	Top models
`/v1/verification/model`	POST	Bearer Token	Verify model
`/v1/verification/status`	GET	Bearer Token	Verification status
`/v1/tasks`	POST/GET	None	Background tasks
`/v1/search/semantic`	POST	Bearer Token	Semantic code search
`/v1/templates`	GET	Bearer Token	Prompt templates
`/v1/checkpoints`	POST/GET	Bearer Token	Workspace snapshots
`/v1/browser/navigate`	POST	Bearer Token	Browser automation
`/v1/skills`	GET	Bearer Token	Skill registry
`/v1/qa/sessions`	POST	Bearer Token	Start QA session
`/v1/graphql`	POST	Bearer Token	GraphQL (feature-flagged)
`/v1/startup/verification`	GET	None	Startup status

Available Models

HelixAgent supports 22+ models from multiple providers:

Ensemble Models

helixagent-ensemble - Intelligent multi-provider aggregation with confidence-weighted voting

DeepSeek Models

deepseek-chat - General purpose chat model
deepseek-coder - Code generation and analysis model

Qwen Models

qwen-turbo - Fast, cost-effective model
qwen-plus - Balanced performance and cost
qwen-max - Highest quality responses

OpenRouter Models

openrouter/grok-4 - Latest Grok model
openrouter/gemini-2.5 - Google's Gemini 2.5
openrouter/anthropic/claude-3.5-sonnet - Advanced reasoning
openrouter/openai/gpt-4o - OpenAI's GPT-4 Omni
openrouter/meta-llama/llama-3.1-405b - Meta's LLaMA 3.1
openrouter/mistralai/mistral-large - Mistral's large model
openrouter/meta-llama/llama-3.1-70b - LLaMA 3.1 70B
openrouter/google/gemma-2-27b - Google's Gemma 2
openrouter/openai/gpt-4-turbo - GPT-4 Turbo
openrouter/microsoft/wizardlm-2-8x22b - Microsoft's WizardLM
openrouter/anthropic/claude-3.5-haiku - Claude 3.5 Haiku
openrouter/meta-llama/llama-3.1-8b - LLaMA 3.1 8B
openrouter/microsoft/wizardlm-2-7b - WizardLM 7B
openrouter/qwen/qwen-2-72b - Qwen 2 72B
openrouter/openai/gpt-4o-mini - GPT-4 Mini
openrouter/google/gemini-flash-1.5 - Gemini Flash

API Endpoints

1. Health Check

GET /health

Basic health check endpoint.

Response:

{
  "status": "ok",
  "timestamp": 1703123456,
  "version": "1.0.0"
}

GET /v1/health

Enhanced health check with provider status.

Response:

{
  "status": "healthy",
  "timestamp": 1703123456,
  "providers": {
    "deepseek": {"status": "healthy", "response_time": 0.8},
    "qwen": {"status": "healthy", "response_time": 0.6},
    "openrouter": {"status": "healthy", "response_time": 1.2}
  },
  "ensemble": {"available": true, "providers": 3}
}

2. Available Models

GET /v1/models

List all available models.

Response:

{
  "object": "list",
  "data": [
    {
      "id": "helixagent-ensemble",
      "object": "model",
      "created": 1703123456,
      "owned_by": "helixagent"
    },
    {
      "id": "deepseek-chat",
      "object": "model", 
      "created": 1703123456,
      "owned_by": "deepseek"
    }
  ]
}

3. Chat Completions

POST /v1/chat/completions

Create chat completion with streaming support.

Request Body:

{
  "model": "helixagent-ensemble",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful AI assistant."
    },
    {
      "role": "user", 
      "content": "Explain quantum computing in simple terms."
    }
  ],
  "max_tokens": 1000,
  "temperature": 0.7,
  "stream": false
}

Response:

{
  "id": "chatcmpl-helixagent-123",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "helixagent-ensemble",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is like..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 75,
    "total_tokens": 100
  },
  "ensemble": {
    "providers_used": ["deepseek", "qwen", "openrouter"],
    "confidence_score": 0.92,
    "voting_strategy": "confidence_weighted"
  }
}

Streaming Response:

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", ...}
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", ...}
data: [DONE]

4. Text Completions

POST /v1/completions

Legacy text completion endpoint.

Request Body:

{
  "model": "deepseek-chat",
  "prompt": "The future of artificial intelligence is",
  "max_tokens": 100,
  "temperature": 0.5,
  "stream": false
}

Response:

{
  "id": "cmpl-deepseek-456",
  "object": "text_completion",
  "created": 1703123456,
  "model": "deepseek-chat",
  "choices": [
    {
      "text": "...full of exciting possibilities.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

5. Text Embeddings

POST /v1/embeddings

Generate text embeddings for semantic search.

Request Body:

{
  "model": "deepseek-chat",
  "input": "The quick brown fox jumps over the lazy dog",
  "encoding_format": "float"
}

Response:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.1, 0.2, 0.3, ...],
      "index": 0
    }
  ],
  "model": "deepseek-chat",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}

6. Provider Information

GET /v1/providers

List all configured LLM providers.

Response:

{
  "providers": [
    {
      "id": "deepseek",
      "name": "DeepSeek",
      "type": "openai_compatible",
      "status": "healthy",
      "models": ["deepseek-chat", "deepseek-coder"],
      "capabilities": ["text_completion", "chat", "embeddings"],
      "supports_streaming": true
    },
    {
      "id": "qwen", 
      "name": "Qwen",
      "type": "openai_compatible",
      "status": "healthy",
      "models": ["qwen-turbo", "qwen-plus", "qwen-max"],
      "capabilities": ["text_completion", "chat", "embeddings"],
      "supports_streaming": true
    }
  ]
}

GET /v1/providers/health

Check health status of all providers.

Response:

{
  "status": "healthy",
  "providers": {
    "deepseek": {
      "status": "healthy",
      "response_time": 0.8,
      "last_check": "2024-01-15T10:30:00Z",
      "error_rate": 0.01
    },
    "qwen": {
      "status": "healthy", 
      "response_time": 0.6,
      "last_check": "2024-01-15T10:30:00Z",
      "error_rate": 0.02
    }
  }

7. Model Context Protocol (MCP) Endpoints

GET /mcp/capabilities

Get MCP server capabilities and supported features.

Response:

{
  "version": "1.0.0",
  "capabilities": {
    "tools": {
      "listChanged": true
    },
    "prompts": {
      "listChanged": true
    },
    "resources": {
      "listChanged": true
    }
  },
  "providers": ["deepseek", "qwen", "openrouter"],
  "mcp_servers": ["filesystem-mcp", "database-mcp"]
}

GET /mcp/tools

List all available MCP tools across configured servers.

Response:

{
  "tools": [
    {
      "name": "read_file",
      "description": "Read contents of a file",
      "inputSchema": {
        "type": "object",
        "properties": {
          "path": {
            "type": "string",
            "description": "File path to read"
          }
        },
        "required": ["path"]
      }
    }
  ]
}

POST /mcp/tools/call

Execute an MCP tool with specified parameters.

Request Body:

{
  "name": "read_file",
  "arguments": {
    "path": "/etc/hostname"
  }
}

Response:

{
  "result": "helixagent-server\n",
  "success": true,
  "execution_time": 0.045
}

GET /mcp/prompts

List available MCP prompts for enhanced interactions.

Response:

{
  "prompts": [
    {
      "name": "summarize",
      "description": "Summarize text content",
      "arguments": [
        {
          "name": "text",
          "description": "Text to summarize",
          "required": true
        }
      ]
    },
    {
      "name": "analyze",
      "description": "Analyze content for insights",
      "arguments": [
        {
          "name": "content",
          "description": "Content to analyze",
          "required": true
        }
      ]
    }
  ]
}

GET /mcp/resources

List available MCP resources and their metadata.

Response:

{
  "resources": [
    {
      "uri": "helixagent://providers",
      "name": "Provider Information",
      "description": "Information about configured LLM providers",
      "mimeType": "application/json"
    },
    {
      "uri": "helixagent://models",
      "name": "Model Metadata",
      "description": "Metadata about available LLM models",
      "mimeType": "application/json"
    }
  ]

8. AI Debate Endpoints

POST /v1/debates

Create and start a new AI debate with multiple participants.

Request Body:

{
  "debateId": "climate-debate-001",
  "topic": "What are the most effective strategies for combating climate change?",
  "maximal_repeat_rounds": 5,
  "consensus_threshold": 0.75,
  "participants": [
    {
      "name": "EnvironmentalEconomist",
      "role": "Economic Analyst",
      "llms": [
        {
          "provider": "claude",
          "model": "claude-3-5-sonnet-20241022",
          "api_key": "${CLAUDE_API_KEY}"
        }
      ]
    },
    {
      "name": "ClimateScientist",
      "role": "Scientific Expert",
      "llms": [
        {
          "provider": "deepseek",
          "model": "deepseek-coder"
        }
      ]
    }
  ],
  "enable_cognee": true,
  "cognee_config": {
    "dataset_name": "climate_debate_analysis"
  }
}

Response:

{
  "debateId": "climate-debate-001",
  "status": "started",
  "estimated_duration": 180,
  "participants_count": 2,
  "created_at": "2024-01-15T10:30:00Z"
}

GET /v1/debates/{debateId}

Get comprehensive information about a specific debate.

Response:

{
  "debateId": "climate-debate-001",
  "topic": "What are the most effective strategies for combating climate change?",
  "status": "completed",
  "progress": {
    "current_round": 3,
    "total_rounds": 5,
    "completed_responses": 6,
    "total_expected_responses": 6
  },
  "participants": [
    {
      "name": "EnvironmentalEconomist",
      "responses_count": 3,
      "avg_quality_score": 0.85
    }
  ],
  "quality_metrics": {
    "overall_score": 0.82,
    "consensus_achieved": true,
    "consensus_confidence": 0.78
  }
}

GET /v1/debates/{debateId}/status

Get real-time debate progress and status.

Response:

{
  "debateId": "climate-debate-001",
  "status": "in_progress",
  "current_round": 2,
  "current_participant": "ClimateScientist",
  "time_elapsed": 45,
  "estimated_remaining": 135,
  "active_participants": 2,
  "errors": []
}

GET /v1/debates/{debateId}/results

Get complete debate results and analysis.

Response:

{
  "debateId": "climate-debate-001",
  "topic": "What are the most effective strategies for combating climate change?",
  "status": "completed",
  "duration": 156,
  "rounds_completed": 5,
  "consensus": {
    "achieved": true,
    "confidence": 0.82,
    "final_position": "A combination of economic incentives, technological innovation, and policy frameworks provides the most effective approach to combating climate change.",
    "key_agreements": [
      "Carbon pricing mechanisms are essential",
      "Technological innovation must be accelerated",
      "International cooperation is crucial"
    ]
  },
  "participants": [
    {
      "name": "EnvironmentalEconomist",
      "total_responses": 5,
      "avg_quality_score": 0.88,
      "contribution_score": 0.85,
      "persuasion_effectiveness": 0.75
    }
  ],
  "quality_metrics": {
    "overall_debate_quality": 0.86,
    "argument_diversity": 0.92,
    "evidence_quality": 0.81,
    "reasoning_depth": 0.89
  },
  "cognee_insights": {
    "key_themes": ["economic_policy", "technological_innovation", "international_cooperation"],
    "sentiment_analysis": "constructive_dialogue",
    "recommendations": [
      "Implement carbon pricing mechanisms",
      "Increase R&D investment in clean technologies",
      "Strengthen international climate agreements"
    ]
  }
}

GET /v1/debates/{debateId}/report

Generate and download a formatted debate report.

Query Parameters:

format: Report format (json, pdf, html) - default: json

Response: (JSON format shown)

{
  "report_title": "Climate Change Debate Analysis",
  "generated_at": "2024-01-15T11:30:00Z",
  "executive_summary": "A comprehensive debate between economic and scientific experts on climate change strategies...",
  "debate_metrics": {
    "duration_minutes": 2.6,
    "participant_count": 2,
    "total_responses": 10,
    "consensus_achieved": true
  },
  "key_findings": [
    "Economic incentives are crucial for adoption of clean technologies",
    "Scientific evidence supports immediate action",
    "Policy frameworks must balance economic and environmental goals"
  ],
  "recommendations": [
    "Implement comprehensive carbon pricing",
    "Accelerate clean technology development",
    "Foster international cooperation"
  ]
}

9. Authentication Endpoints

POST /v1/auth/register

Request Body:

{
  "username": "newuser",
  "password": "securepassword123",
  "email": "user@example.com"
}

Response:

{
  "success": true,
  "message": "User registered successfully",
  "user_id": "user_123456",
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}

POST /v1/auth/login

Authenticate and get JWT token.

Request Body:

{
  "username": "existinguser",
  "password": "securepassword123"
}

Response:

{
  "success": true,
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expires_in": 86400,
  "user": {
    "id": "user_123456",
    "username": "existinguser",
    "email": "user@example.com"
  }
}

POST /v1/auth/refresh

Refresh JWT token.

Headers:

Authorization: Bearer <current_token>

Response:

{
  "success": true,
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expires_in": 86400
}

POST /v1/auth/logout

Invalidate current token.

Headers:

Authorization: Bearer <current_token>

Response:

{
  "success": true,
  "message": "Logged out successfully"
}

GET /v1/auth/me

Get current user information.

Headers:

Authorization: Bearer <current_token>

Response:

{
  "user": {
    "id": "user_123456",
    "username": "existinguser",
    "email": "user@example.com",
    "created_at": "2024-01-15T10:30:00Z"
  }
}

8. Ensemble Configuration

POST /v1/ensemble/completions

Direct ensemble completion endpoint with advanced configuration.

Request Body:

{
  "prompt": "Explain quantum computing",
  "model": "helixagent-ensemble",
  "temperature": 0.7,
  "max_tokens": 1000,
  "ensemble_config": {
    "strategy": "confidence_weighted",
    "min_providers": 2,
    "confidence_threshold": 0.8,
    "fallback_to_best": true,
    "timeout": 30,
    "preferred_providers": ["deepseek", "qwen"]
  },
  "memory_enhanced": true
}

Response:

{
  "id": "ensemble-123",
  "object": "ensemble.completion",
  "created": 1703123456,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 75,
    "total_tokens": 100
  },
  "ensemble": {
    "voting_method": "confidence_weighted",
    "responses_count": 3,
    "scores": {
      "deepseek": 0.92,
      "qwen": 0.85,
      "openrouter": 0.78
    },
    "metadata": {
      "selection_reason": "highest_confidence"
    },
    "selected_provider": "deepseek",
    "selection_score": 0.92
  }
}

9. Provider Management

GET /v1/providers (Protected)

Get detailed provider information including capabilities.

Headers:

Authorization: Bearer <token>

Response:

{
  "providers": [
    {
      "name": "deepseek",
      "supported_models": ["deepseek-chat", "deepseek-coder"],
      "supported_features": ["streaming", "function_calling"],
      "supports_streaming": true,
      "supports_function_calling": true,
      "supports_vision": false,
      "metadata": {
        "max_tokens": 4096,
        "rate_limit": "100/min"
      }
    }
  ],
  "count": 3
}

GET /v1/providers/:name/health

Check health status of specific provider.

Example: GET /v1/providers/deepseek/health

Response:

{
  "provider": "deepseek",
  "healthy": true,
  "response_time_ms": 850,
  "last_check": "2024-01-15T10:30:00Z"
}

Error Response (unhealthy):

{
  "provider": "openrouter",
  "healthy": false,
  "error": "API key invalid or expired",
  "last_check": "2024-01-15T10:30:00Z"
}

10. Admin Endpoints

GET /v1/admin/health/all

Get comprehensive health status of all components (admin only).

Headers:

Authorization: Bearer <admin_token>

Response:

{
  "provider_health": {
    "deepseek": null,
    "qwen": "connection timeout",
    "openrouter": null,
    "claude": "rate limit exceeded"
  },
  "timestamp": 1703123456,
  "overall_status": "degraded"
}

11. Metrics Endpoint

GET /metrics

Prometheus metrics endpoint (no authentication required).

Response:

# HELP helixagent_requests_total Total number of requests
# TYPE helixagent_requests_total counter
helixagent_requests_total{endpoint="/v1/chat/completions",method="POST"} 1234

# HELP helixagent_request_duration_seconds Request duration in seconds
# TYPE helixagent_request_duration_seconds histogram
helixagent_request_duration_seconds_bucket{endpoint="/v1/chat/completions",le="0.1"} 1000
helixagent_request_duration_seconds_bucket{endpoint="/v1/chat/completions",le="0.5"} 1200

# HELP helixagent_provider_responses_total Total provider responses
# TYPE helixagent_provider_responses_total counter
helixagent_provider_responses_total{provider="deepseek",status="success"} 1000
helixagent_provider_responses_total{provider="deepseek",status="error"} 50

Error Handling

Error Response Format

{
  "error": {
    "message": "Invalid model specified",
    "type": "invalid_request_error", 
    "code": "invalid_model",
    "param": "model",
    "timestamp": "2024-01-15T10:30:00Z"
  }
}

Common Error Codes

Code	Description
`invalid_model`	Model not found or not supported
`invalid_request`	Request validation failed
`missing_api_key`	API key required but not provided
`invalid_api_key`	API key is invalid or expired
`rate_limit_exceeded`	Rate limit exceeded
`insufficient_quota`	API quota exceeded
`model_overloaded`	Model is currently overloaded
`provider_error`	LLM provider returned an error
`ensemble_failed`	Ensemble processing failed

Rate Limiting

HelixAgent implements intelligent rate limiting:

Anonymous requests: 100 requests/minute
Authenticated users: 1000 requests/minute
Premium users: 10000 requests/minute

Rate limit headers are included in responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1703123456

Streaming

All chat and text completion endpoints support streaming:

Request:

{
  "model": "helixagent-ensemble",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": true
}

Response Format:

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"delta": {"role": "assistant"}}]}

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hello"}}]}

data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"delta": {}, "finish_reason": "stop"}]}

data: [DONE]

SDK Examples

Go SDK

package main

import (
    "context"
    "fmt"
    "dev.helix.agent-go"
)

func main() {
    client := helixagent.NewClient("your-api-key")
    
    resp, err := client.CreateChatCompletion(context.Background(), &helixagent.ChatCompletionRequest{
        Model: "helixagent-ensemble",
        Messages: []helixagent.Message{
            {Role: "user", Content: "Hello, how are you?"},
        },
    })
    
    if err != nil {
        panic(err)
    }
    
    fmt.Println(resp.Choices[0].Message.Content)
}

Python SDK

from helixagent import HelixAgentClient

client = HelixAgentClient(api_key="your-api-key")

response = client.chat.completions.create(
    model="helixagent-ensemble",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)

print(response.choices[0].message.content)

JavaScript SDK

import { HelixAgentClient } from '@helixagent/client';

const client = new HelixAgentClient({ apiKey: 'your-api-key' });

const response = await client.chat.completions.create({
    model: 'helixagent-ensemble',
    messages: [
        { role: 'user', content: 'Hello, how are you?' }
    ]
});

console.log(response.choices[0].message.content);

Monitoring & Metrics

Prometheus Metrics

HelixAgent exposes comprehensive metrics at /metrics:

http_requests_total - Total HTTP requests
http_request_duration_seconds - Request duration histogram
llm_requests_total - LLM provider requests
llm_response_time_seconds - Provider response times
llm_error_rate_total - Error rates by provider
ensemble_requests_total - Ensemble processing requests

Grafana Dashboard

Import the pre-configured dashboard:

curl -X POST http://admin:admin@localhost:3000/api/dashboards/db \
  -H "Content-Type: application/json" \
  -d @monitoring/dashboards/helixagent-dashboard.json

Advanced Services

HelixAgent includes several advanced services that enhance LLM interactions and provide enterprise-grade capabilities. These services are currently available as internal APIs and will be exposed via REST endpoints in future releases.

1. Model Context Protocol (MCP)

The MCP service enables seamless integration with external tools and services through a standardized protocol.

Features:

Auto-discovery of MCP servers
Tool registration and management
Health monitoring and failover
Secure tool execution