HelixAgent provides a unified OpenAI-compatible API that aggregates responses from multiple LLM providers (DeepSeek, Qwen, OpenRouter, Claude, Gemini) and offers intelligent ensemble capabilities. The system includes advanced features like Model Context Protocol (MCP) support, Language Server Protocol (LSP) integration, intelligent tool orchestration, context management, and security sandboxing.
Development: http://localhost:7061
Production: https://api.yourdomain.com
HelixAgent uses JWT-based authentication for secure access:
# Get JWT token
curl -X POST http://localhost:7061/auth/login \
-H "Content-Type: application/json" \
-d '{
"username": "your-username",
"password": "your-password"
}'
# Use token in requests
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
http://localhost:7061/v1/models
Important: OAuth tokens from CLI tools are product-restricted and cannot be used for general API calls.
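The two curl calls above can also be scripted. The following is a minimal sketch using only the Python standard library. The helper names (`build_login_request`, `bearer_headers`, `login`) are illustrative and not part of any HelixAgent SDK. The login path is a parameter because the curl example uses `/auth/login` while the endpoint table lists `/v1/auth/login`; the sketch also assumes the login response carries a top-level `token` field, as in the login response example in this reference.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7061"  # development base URL from this document


def build_login_request(username: str, password: str,
                        login_path: str = "/v1/auth/login") -> urllib.request.Request:
    """Build (but do not send) the POST request that exchanges credentials for a JWT."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + login_path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bearer_headers(token: str) -> dict:
    """Return the Authorization header used by protected endpoints."""
    return {"Authorization": f"Bearer {token}"}


def login(username: str, password: str, login_path: str = "/v1/auth/login") -> str:
    """Send the login request and return the JWT (assumes a top-level "token" field)."""
    request = build_login_request(username, password, login_path)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.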
HelixAgent supports OAuth2 authentication for certain providers, but there are important limitations to understand:
| Token Source | API Access |
|---|---|
| `~/.claude/.credentials.json` (from `claude auth login`) | Restricted to Claude Code only |
What works:
- HelixAgent can read OAuth tokens from the credentials file
- Tokens are valid and non-expired
What doesn't work:
- Using Claude OAuth tokens for general API requests returns: "This credential is only authorized for use with Claude Code and cannot be used for other API requests."
Solution: Get an API key from https://console.anthropic.com/
| Token Source | API Access |
|---|---|
| `~/.qwen/oauth_creds.json` (from Qwen CLI login) | For Qwen Portal only |
What works:
- HelixAgent can read OAuth tokens from the credentials file
- Tokens are valid for `portal.qwen.ai`
What doesn't work:
- Using Qwen OAuth tokens for DashScope API returns: "invalid_api_key" (tokens are for portal use only)
Solution: Get a DashScope API key from https://dashscope.aliyuncs.com/
To use OAuth2 credentials (with above limitations), set:
# Enable OAuth credential reading for Claude
CLAUDE_CODE_USE_OAUTH_CREDENTIALS=true
# Enable OAuth credential reading for Qwen
QWEN_CODE_USE_OAUTH_CREDENTIALS=true
For production deployments, we recommend using API keys from provider consoles instead of OAuth tokens.
| Endpoint | Method | Authentication | Description |
|---|---|---|---|
| `/health` | GET | None | Basic health check |
| `/v1/health` | GET | None | Enhanced health with provider status |
| `/metrics` | GET | None | Prometheus metrics |
| `/v1/models` | GET | None | List available models |
| `/v1/providers` | GET | None | Public provider listing |
| `/v1/auth/register` | POST | None | Register new user |
| `/v1/auth/login` | POST | None | Authenticate user |
| `/v1/auth/refresh` | POST | Bearer Token | Refresh JWT token |
| `/v1/auth/logout` | POST | Bearer Token | Invalidate token |
| `/v1/auth/me` | GET | Bearer Token | Get current user info |
| `/v1/completions` | POST | Bearer Token | Legacy text completion |
| `/v1/completions/stream` | POST | Bearer Token | Streaming text completion |
| `/v1/chat/completions` | POST | Bearer Token | Chat completion |
| `/v1/chat/completions/stream` | POST | Bearer Token | Streaming chat completion |
| `/v1/ensemble/completions` | POST | Bearer Token | Direct ensemble completion |
| `/v1/providers` | GET | Bearer Token | Detailed provider info |
| `/v1/providers/:name/health` | GET | Bearer Token | Provider-specific health |
| `/v1/admin/health/all` | GET | Admin Token | Comprehensive system health |
| `/v1/mcp/capabilities` | GET | Bearer Token | MCP server capabilities |
| `/v1/mcp/tools` | GET | Bearer Token | Available MCP tools |
| `/v1/mcp/tools/call` | POST | Bearer Token | Execute MCP tool |
| `/v1/mcp/tools/search` | GET/POST | Bearer Token | Search MCP tools |
| `/v1/mcp/tools/suggestions` | GET | Bearer Token | Tool suggestions |
| `/v1/mcp/prompts` | GET | Bearer Token | Available MCP prompts |
| `/v1/mcp/resources` | GET | Bearer Token | Available MCP resources |
| `/v1/mcp/adapters/search` | GET/POST | Bearer Token | Search MCP adapters |
| `/v1/mcp/categories` | GET | Bearer Token | Tool categories |
| `/v1/mcp/stats` | GET | Bearer Token | MCP usage statistics |
| `/v1/debates` | POST | Bearer Token | Create AI debate |
| `/v1/debates/{id}` | GET | Bearer Token | Get debate information |
| `/v1/debates/{id}/status` | GET | Bearer Token | Get debate status |
| `/v1/debates/{id}/results` | GET | Bearer Token | Get debate results |
| `/v1/debates/{id}/approve` | POST | Bearer Token | Approve debate gate |
| `/v1/debates/{id}/reject` | POST | Bearer Token | Reject debate gate |
| `/v1/debates/{id}/gates` | GET | Bearer Token | Get approval gates |
| `/v1/debates/{id}/audit` | GET | Bearer Token | Get audit trail |
| `/v1/debates/team` | GET | None | Get debate team config |
| `/v1/debates/orchestrator/status` | GET | Bearer Token | Orchestrator status |
| `/v1/acp/health` | GET | None | ACP service health |
| `/v1/acp/agents` | GET | None | List ACP agents |
| `/v1/acp/execute` | POST | None | Execute agent task |
| `/v1/acp/rpc` | POST | None | JSON-RPC 2.0 endpoint |
| `/v1/lsp/servers` | GET | Bearer Token | List LSP servers |
| `/v1/lsp/execute` | POST | Bearer Token | Execute LSP request |
| `/v1/lsp/sync` | POST | Bearer Token | Sync LSP servers |
| `/v1/lsp/stats` | GET | Bearer Token | LSP statistics |
| `/v1/vision/health` | GET | None | Vision service health |
| `/v1/vision/capabilities` | GET | None | Vision capabilities |
| `/v1/vision/analyze` | POST | None | Analyze image |
| `/v1/vision/ocr` | POST | None | OCR extraction |
| `/v1/vision/detect` | POST | None | Object detection |
| `/v1/cognee/health` | GET | None | Cognee service health |
| `/v1/cognee/memory` | POST | None | Add to knowledge graph |
| `/v1/cognee/search` | POST | None | Search knowledge graph |
| `/v1/cognee/cognify` | POST | None | Cognify content |
| `/v1/rag/health` | GET | Bearer Token | RAG system health |
| `/v1/rag/documents` | POST | Bearer Token | Ingest document |
| `/v1/rag/search` | POST | Bearer Token | Search documents |
| `/v1/rag/search/hybrid` | POST | Bearer Token | Hybrid search |
| `/v1/embeddings/generate` | POST | Bearer Token | Generate embeddings |
| `/v1/embeddings/search` | POST | Bearer Token | Vector search |
| `/v1/protocols/execute` | POST | Bearer Token | Execute protocol request |
| `/v1/protocols/servers` | GET | Bearer Token | List protocol servers |
| `/v1/sessions` | POST/GET | Bearer Token | Session management |
| `/v1/agents` | GET | Bearer Token | CLI agent registry |
| `/v1/features` | GET | None | Feature flags |
| `/v1/format` | POST | None | Format code |
| `/v1/formatters` | GET | None | List formatters |
| `/v1/agentic/workflows` | POST/GET | Bearer Token | Agentic workflows |
| `/v1/planning/hiplan` | POST | Bearer Token | Hierarchical planning |
| `/v1/planning/mcts` | POST | Bearer Token | Monte Carlo Tree Search |
| `/v1/planning/tot` | POST | Bearer Token | Tree of Thoughts |
| `/v1/llmops/experiments` | POST/GET | Bearer Token | A/B experiments |
| `/v1/llmops/evaluate` | POST | Bearer Token | Continuous evaluation |
| `/v1/llmops/prompts` | POST/GET | Bearer Token | Prompt versioning |
| `/v1/benchmark/run` | POST | Bearer Token | Start benchmark |
| `/v1/benchmark/results` | GET | Bearer Token | Benchmark results |
| `/v1/discovery/models` | GET | Bearer Token | Discovered models |
| `/v1/scoring/model/{id}` | GET | Bearer Token | Model score |
| `/v1/scoring/top` | GET | Bearer Token | Top models |
| `/v1/verification/model` | POST | Bearer Token | Verify model |
| `/v1/verification/status` | GET | Bearer Token | Verification status |
| `/v1/tasks` | POST/GET | None | Background tasks |
| `/v1/search/semantic` | POST | Bearer Token | Semantic code search |
| `/v1/templates` | GET | Bearer Token | Prompt templates |
| `/v1/checkpoints` | POST/GET | Bearer Token | Workspace snapshots |
| `/v1/browser/navigate` | POST | Bearer Token | Browser automation |
| `/v1/skills` | GET | Bearer Token | Skill registry |
| `/v1/qa/sessions` | POST | Bearer Token | Start QA session |
| `/v1/graphql` | POST | Bearer Token | GraphQL (feature-flagged) |
| `/v1/startup/verification` | GET | None | Startup status |
HelixAgent supports 22+ models from multiple providers:
- `helixagent-ensemble` - Intelligent multi-provider aggregation with confidence-weighted voting
- `deepseek-chat` - General purpose chat model
- `deepseek-coder` - Code generation and analysis model
- `qwen-turbo` - Fast, cost-effective model
- `qwen-plus` - Balanced performance and cost
- `qwen-max` - Highest quality responses
- `openrouter/grok-4` - Latest Grok model
- `openrouter/gemini-2.5` - Google's Gemini 2.5
- `openrouter/anthropic/claude-3.5-sonnet` - Advanced reasoning
- `openrouter/openai/gpt-4o` - OpenAI's GPT-4 Omni
- `openrouter/meta-llama/llama-3.1-405b` - Meta's LLaMA 3.1
- `openrouter/mistralai/mistral-large` - Mistral's large model
- `openrouter/meta-llama/llama-3.1-70b` - LLaMA 3.1 70B
- `openrouter/google/gemma-2-27b` - Google's Gemma 2
- `openrouter/openai/gpt-4-turbo` - GPT-4 Turbo
- `openrouter/microsoft/wizardlm-2-8x22b` - Microsoft's WizardLM
- `openrouter/anthropic/claude-3.5-haiku` - Claude 3.5 Haiku
- `openrouter/meta-llama/llama-3.1-8b` - LLaMA 3.1 8B
- `openrouter/microsoft/wizardlm-2-7b` - WizardLM 7B
- `openrouter/qwen/qwen-2-72b` - Qwen 2 72B
- `openrouter/openai/gpt-4o-mini` - GPT-4o Mini
- `openrouter/google/gemini-flash-1.5` - Gemini Flash
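This document does not specify how `helixagent-ensemble` performs its confidence-weighted voting. As a rough illustration only, the sketch below picks the candidate with the highest confidence score. The `Candidate` type and `select_by_confidence` function are hypothetical; the parameter names mirror the `ensemble_config` fields shown in the ensemble completion request in this reference, but the semantics given to them here are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One provider's answer plus a confidence score in [0, 1] (hypothetical type)."""
    provider: str
    content: str
    confidence: float


def select_by_confidence(candidates: List[Candidate],
                         min_providers: int = 2,
                         confidence_threshold: float = 0.8,
                         fallback_to_best: bool = True) -> Candidate:
    """Return the highest-confidence candidate.

    Assumed semantics: at least `min_providers` responses are required; if the
    best score is below `confidence_threshold`, the best candidate is still
    returned when `fallback_to_best` is true, otherwise an error is raised.
    """
    if len(candidates) < min_providers:
        raise ValueError("not enough provider responses")
    best = max(candidates, key=lambda c: c.confidence)
    if best.confidence >= confidence_threshold or fallback_to_best:
        return best
    raise ValueError("no response met the confidence threshold")
```

With the scores from the ensemble response example (deepseek 0.92, qwen 0.85, openrouter 0.78), this sketch would select the deepseek candidate.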
Basic health check endpoint.
Response:
{
"status": "ok",
"timestamp": 1703123456,
"version": "1.0.0"
}
Enhanced health check with provider status.
Response:
{
"status": "healthy",
"timestamp": 1703123456,
"providers": {
"deepseek": {"status": "healthy", "response_time": 0.8},
"qwen": {"status": "healthy", "response_time": 0.6},
"openrouter": {"status": "healthy", "response_time": 1.2}
},
"ensemble": {"available": true, "providers": 3}
}
List all available models.
Response:
{
"object": "list",
"data": [
{
"id": "helixagent-ensemble",
"object": "model",
"created": 1703123456,
"owned_by": "helixagent"
},
{
"id": "deepseek-chat",
"object": "model",
"created": 1703123456,
"owned_by": "deepseek"
}
]
}
Create chat completion with streaming support.
Request Body:
{
"model": "helixagent-ensemble",
"messages": [
{
"role": "system",
"content": "You are a helpful AI assistant."
},
{
"role": "user",
"content": "Explain quantum computing in simple terms."
}
],
"max_tokens": 1000,
"temperature": 0.7,
"stream": false
}
Response:
{
"id": "chatcmpl-helixagent-123",
"object": "chat.completion",
"created": 1703123456,
"model": "helixagent-ensemble",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Quantum computing is like..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 75,
"total_tokens": 100
},
"ensemble": {
"providers_used": ["deepseek", "qwen", "openrouter"],
"confidence_score": 0.92,
"voting_strategy": "confidence_weighted"
}
}
Streaming Response:
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", ...}
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", ...}
data: [DONE]
Legacy text completion endpoint.
Request Body:
{
"model": "deepseek-chat",
"prompt": "The future of artificial intelligence is",
"max_tokens": 100,
"temperature": 0.5,
"stream": false
}
Response:
{
"id": "cmpl-deepseek-456",
"object": "text_completion",
"created": 1703123456,
"model": "deepseek-chat",
"choices": [
{
"text": "...full of exciting possibilities.",
"index": 0,
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 15,
"total_tokens": 25
}
}
Generate text embeddings for semantic search.
Request Body:
{
"model": "deepseek-chat",
"input": "The quick brown fox jumps over the lazy dog",
"encoding_format": "float"
}
Response:
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.1, 0.2, 0.3, ...],
"index": 0
}
],
"model": "deepseek-chat",
"usage": {
"prompt_tokens": 9,
"total_tokens": 9
}
}
List all configured LLM providers.
Response:
{
"providers": [
{
"id": "deepseek",
"name": "DeepSeek",
"type": "openai_compatible",
"status": "healthy",
"models": ["deepseek-chat", "deepseek-coder"],
"capabilities": ["text_completion", "chat", "embeddings"],
"supports_streaming": true
},
{
"id": "qwen",
"name": "Qwen",
"type": "openai_compatible",
"status": "healthy",
"models": ["qwen-turbo", "qwen-plus", "qwen-max"],
"capabilities": ["text_completion", "chat", "embeddings"],
"supports_streaming": true
}
]
}
Check health status of all providers.
Response:
{
"status": "healthy",
"providers": {
"deepseek": {
"status": "healthy",
"response_time": 0.8,
"last_check": "2024-01-15T10:30:00Z",
"error_rate": 0.01
},
"qwen": {
"status": "healthy",
"response_time": 0.6,
"last_check": "2024-01-15T10:30:00Z",
"error_rate": 0.02
}
}
}
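The per-provider fields in this response lend themselves to a simple alerting check. The sketch below is one possible way to flag providers from such a payload; the function name `unhealthy_providers` and the `max_error_rate` threshold of 0.05 are assumptions chosen for illustration, not values defined by HelixAgent.

```python
def unhealthy_providers(health: dict, max_error_rate: float = 0.05) -> list:
    """Return provider names whose status is not "healthy" or whose
    error_rate exceeds max_error_rate (threshold is an assumed value)."""
    flagged = []
    for name, info in health.get("providers", {}).items():
        not_healthy = info.get("status") != "healthy"
        too_many_errors = info.get("error_rate", 0.0) > max_error_rate
        if not_healthy or too_many_errors:
            flagged.append(name)
    return sorted(flagged)
```

Applied to the example response above, no provider would be flagged, since both report `"healthy"` with error rates of 0.01 and 0.02.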
Get MCP server capabilities and supported features.
Response:
{
"version": "1.0.0",
"capabilities": {
"tools": {
"listChanged": true
},
"prompts": {
"listChanged": true
},
"resources": {
"listChanged": true
}
},
"providers": ["deepseek", "qwen", "openrouter"],
"mcp_servers": ["filesystem-mcp", "database-mcp"]
}
List all available MCP tools across configured servers.
Response:
{
"tools": [
{
"name": "read_file",
"description": "Read contents of a file",
"inputSchema": {
"type": "object",
"properties": {
"path": {
"type": "string",
"description": "File path to read"
}
},
"required": ["path"]
}
}
]
}
Execute an MCP tool with specified parameters.
Request Body:
{
"name": "read_file",
"arguments": {
"path": "/etc/hostname"
}
}
Response:
{
"result": "helixagent-server\n",
"success": true,
"execution_time": 0.045
}
List available MCP prompts for enhanced interactions.
Response:
{
"prompts": [
{
"name": "summarize",
"description": "Summarize text content",
"arguments": [
{
"name": "text",
"description": "Text to summarize",
"required": true
}
]
},
{
"name": "analyze",
"description": "Analyze content for insights",
"arguments": [
{
"name": "content",
"description": "Content to analyze",
"required": true
}
]
}
]
}
List available MCP resources and their metadata.
Response:
{
"resources": [
{
"uri": "helixagent://providers",
"name": "Provider Information",
"description": "Information about configured LLM providers",
"mimeType": "application/json"
},
{
"uri": "helixagent://models",
"name": "Model Metadata",
"description": "Metadata about available LLM models",
"mimeType": "application/json"
}
]
}
Create and start a new AI debate with multiple participants.
Request Body:
{
"debateId": "climate-debate-001",
"topic": "What are the most effective strategies for combating climate change?",
"maximal_repeat_rounds": 5,
"consensus_threshold": 0.75,
"participants": [
{
"name": "EnvironmentalEconomist",
"role": "Economic Analyst",
"llms": [
{
"provider": "claude",
"model": "claude-3-5-sonnet-20241022",
"api_key": "${CLAUDE_API_KEY}"
}
]
},
{
"name": "ClimateScientist",
"role": "Scientific Expert",
"llms": [
{
"provider": "deepseek",
"model": "deepseek-coder"
}
]
}
],
"enable_cognee": true,
"cognee_config": {
"dataset_name": "climate_debate_analysis"
}
}
Response:
{
"debateId": "climate-debate-001",
"status": "started",
"estimated_duration": 180,
"participants_count": 2,
"created_at": "2024-01-15T10:30:00Z"
}
Get comprehensive information about a specific debate.
Response:
{
"debateId": "climate-debate-001",
"topic": "What are the most effective strategies for combating climate change?",
"status": "completed",
"progress": {
"current_round": 3,
"total_rounds": 5,
"completed_responses": 6,
"total_expected_responses": 6
},
"participants": [
{
"name": "EnvironmentalEconomist",
"responses_count": 3,
"avg_quality_score": 0.85
}
],
"quality_metrics": {
"overall_score": 0.82,
"consensus_achieved": true,
"consensus_confidence": 0.78
}
}
Get real-time debate progress and status.
Response:
{
"debateId": "climate-debate-001",
"status": "in_progress",
"current_round": 2,
"current_participant": "ClimateScientist",
"time_elapsed": 45,
"estimated_remaining": 135,
"active_participants": 2,
"errors": []
}
Get complete debate results and analysis.
Response:
{
"debateId": "climate-debate-001",
"topic": "What are the most effective strategies for combating climate change?",
"status": "completed",
"duration": 156,
"rounds_completed": 5,
"consensus": {
"achieved": true,
"confidence": 0.82,
"final_position": "A combination of economic incentives, technological innovation, and policy frameworks provides the most effective approach to combating climate change.",
"key_agreements": [
"Carbon pricing mechanisms are essential",
"Technological innovation must be accelerated",
"International cooperation is crucial"
]
},
"participants": [
{
"name": "EnvironmentalEconomist",
"total_responses": 5,
"avg_quality_score": 0.88,
"contribution_score": 0.85,
"persuasion_effectiveness": 0.75
}
],
"quality_metrics": {
"overall_debate_quality": 0.86,
"argument_diversity": 0.92,
"evidence_quality": 0.81,
"reasoning_depth": 0.89
},
"cognee_insights": {
"key_themes": ["economic_policy", "technological_innovation", "international_cooperation"],
"sentiment_analysis": "constructive_dialogue",
"recommendations": [
"Implement carbon pricing mechanisms",
"Increase R&D investment in clean technologies",
"Strengthen international climate agreements"
]
}
}
Generate and download a formatted debate report.
Query Parameters:
- `format`: Report format (`json`, `pdf`, `html`); default: `json`
Response: (JSON format shown)
{
"report_title": "Climate Change Debate Analysis",
"generated_at": "2024-01-15T11:30:00Z",
"executive_summary": "A comprehensive debate between economic and scientific experts on climate change strategies...",
"debate_metrics": {
"duration_minutes": 2.6,
"participant_count": 2,
"total_responses": 10,
"consensus_achieved": true
},
"key_findings": [
"Economic incentives are crucial for adoption of clean technologies",
"Scientific evidence supports immediate action",
"Policy frameworks must balance economic and environmental goals"
],
"recommendations": [
"Implement comprehensive carbon pricing",
"Accelerate clean technology development",
"Foster international cooperation"
]
}
Register a new user account.
Request Body:
{
"username": "newuser",
"password": "securepassword123",
"email": "user@example.com"
}
Response:
{
"success": true,
"message": "User registered successfully",
"user_id": "user_123456",
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
Authenticate and get JWT token.
Request Body:
{
"username": "existinguser",
"password": "securepassword123"
}
Response:
{
"success": true,
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_in": 86400,
"user": {
"id": "user_123456",
"username": "existinguser",
"email": "user@example.com"
}
}
Refresh JWT token.
Headers:
Authorization: Bearer <current_token>
Response:
{
"success": true,
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_in": 86400
}
Invalidate current token.
Headers:
Authorization: Bearer <current_token>
Response:
{
"success": true,
"message": "Logged out successfully"
}
Get current user information.
Headers:
Authorization: Bearer <current_token>
Response:
{
"user": {
"id": "user_123456",
"username": "existinguser",
"email": "user@example.com",
"created_at": "2024-01-15T10:30:00Z"
}
}
Direct ensemble completion endpoint with advanced configuration.
Request Body:
{
"prompt": "Explain quantum computing",
"model": "helixagent-ensemble",
"temperature": 0.7,
"max_tokens": 1000,
"ensemble_config": {
"strategy": "confidence_weighted",
"min_providers": 2,
"confidence_threshold": 0.8,
"fallback_to_best": true,
"timeout": 30,
"preferred_providers": ["deepseek", "qwen"]
},
"memory_enhanced": true
}
Response:
{
"id": "ensemble-123",
"object": "ensemble.completion",
"created": 1703123456,
"model": "deepseek-chat",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Quantum computing is..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 75,
"total_tokens": 100
},
"ensemble": {
"voting_method": "confidence_weighted",
"responses_count": 3,
"scores": {
"deepseek": 0.92,
"qwen": 0.85,
"openrouter": 0.78
},
"metadata": {
"selection_reason": "highest_confidence"
},
"selected_provider": "deepseek",
"selection_score": 0.92
}
}
Get detailed provider information including capabilities.
Headers:
Authorization: Bearer <token>
Response:
{
"providers": [
{
"name": "deepseek",
"supported_models": ["deepseek-chat", "deepseek-coder"],
"supported_features": ["streaming", "function_calling"],
"supports_streaming": true,
"supports_function_calling": true,
"supports_vision": false,
"metadata": {
"max_tokens": 4096,
"rate_limit": "100/min"
}
}
],
"count": 3
}
Check health status of a specific provider.
Example: GET /v1/providers/deepseek/health
Response:
{
"provider": "deepseek",
"healthy": true,
"response_time_ms": 850,
"last_check": "2024-01-15T10:30:00Z"
}
Error Response (unhealthy):
{
"provider": "openrouter",
"healthy": false,
"error": "API key invalid or expired",
"last_check": "2024-01-15T10:30:00Z"
}
Get comprehensive health status of all components (admin only).
Headers:
Authorization: Bearer <admin_token>
Response:
{
"provider_health": {
"deepseek": null,
"qwen": "connection timeout",
"openrouter": null,
"claude": "rate limit exceeded"
},
"timestamp": 1703123456,
"overall_status": "degraded"
}
Prometheus metrics endpoint (no authentication required).
Response:
# HELP helixagent_requests_total Total number of requests
# TYPE helixagent_requests_total counter
helixagent_requests_total{endpoint="/v1/chat/completions",method="POST"} 1234
# HELP helixagent_request_duration_seconds Request duration in seconds
# TYPE helixagent_request_duration_seconds histogram
helixagent_request_duration_seconds_bucket{endpoint="/v1/chat/completions",le="0.1"} 1000
helixagent_request_duration_seconds_bucket{endpoint="/v1/chat/completions",le="0.5"} 1200
# HELP helixagent_provider_responses_total Total provider responses
# TYPE helixagent_provider_responses_total counter
helixagent_provider_responses_total{provider="deepseek",status="success"} 1000
helixagent_provider_responses_total{provider="deepseek",status="error"} 50
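The counters in this sample can be turned into a per-provider error ratio. The sketch below is a deliberately simplified parser for the text format shown above: it skips `# HELP` and `# TYPE` lines, and it does not handle escaped quotes in label values or optional trailing timestamps. The function names are illustrative; for production use, an established Prometheus client library would be the safer choice.

```python
import re

# metric_name{label="value",...} numeric_value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str) -> list:
    """Parse metric lines into (name, labels, value) tuples, ignoring comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # line shape not covered by this simplified parser
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def provider_error_ratio(samples: list, provider: str) -> float:
    """Compute error / (success + error) for one provider's response counter."""
    counts = {"success": 0.0, "error": 0.0}
    for name, labels, value in samples:
        if name != "helixagent_provider_responses_total":
            continue
        if labels.get("provider") == provider and labels.get("status") in counts:
            counts[labels["status"]] += value
    total = counts["success"] + counts["error"]
    return counts["error"] / total if total else 0.0
```

For the sample above, the deepseek ratio is 50 / (1000 + 50), roughly 4.8%.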
{
"error": {
"message": "Invalid model specified",
"type": "invalid_request_error",
"code": "invalid_model",
"param": "model",
"timestamp": "2024-01-15T10:30:00Z"
}
}
| Code | Description |
|---|---|
| `invalid_model` | Model not found or not supported |
| `invalid_request` | Request validation failed |
| `missing_api_key` | API key required but not provided |
| `invalid_api_key` | API key is invalid or expired |
| `rate_limit_exceeded` | Rate limit exceeded |
| `insufficient_quota` | API quota exceeded |
| `model_overloaded` | Model is currently overloaded |
| `provider_error` | LLM provider returned an error |
| `ensemble_failed` | Ensemble processing failed |
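A client can map the error envelope above onto an exception type so that callers can branch on `code`. This is a minimal sketch: the `HelixAgentError` class and `parse_error` helper are hypothetical names, and treating only `rate_limit_exceeded` and `model_overloaded` as retryable is an assumption based on the retry advice in the best practices of this document.

```python
import json
from typing import Optional

# Assumption: only these codes are worth retrying automatically.
RETRYABLE_CODES = {"rate_limit_exceeded", "model_overloaded"}


class HelixAgentError(Exception):
    """Client-side representation of the JSON error envelope (hypothetical class)."""

    def __init__(self, code: str, message: str, param: Optional[str] = None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.param = param

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(body: str) -> HelixAgentError:
    """Convert an error response body into a HelixAgentError."""
    error = json.loads(body).get("error", {})
    return HelixAgentError(
        code=error.get("code", "unknown"),
        message=error.get("message", ""),
        param=error.get("param"),
    )
```

A caller would typically raise the result of `parse_error(response_body)` for any non-2xx status and retry only when `retryable` is true.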
HelixAgent applies rate limits per client tier:
- Anonymous requests: 100 requests/minute
- Authenticated users: 1000 requests/minute
- Premium users: 10000 requests/minute
Rate limit headers are included in responses:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1703123456
All chat and text completion endpoints support streaming:
Request:
{
"model": "helixagent-ensemble",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}
Response Format:
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"delta": {"role": "assistant"}}]}
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hello"}}]}
data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", "choices": [{"delta": {}, "finish_reason": "stop"}]}
data: [DONE]
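The chunk format above can be consumed line by line. The sketch below is a minimal illustration that extracts the text deltas from an iterable of `data:` lines and stops at the `[DONE]` sentinel; the function name `iter_stream_text` is an assumption, and the HTTP transport that produces the lines is left out.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from server-sent "data:" lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content
```

Joining the yielded fragments reconstructs the full message; for the three chunks shown above, the result is `Hello`.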
Go:
package main

import (
	"context"
	"fmt"

	// The import path does not end in a valid Go identifier,
	// so the package name is bound explicitly.
	helixagent "dev.helix.agent-go"
)

func main() {
	client := helixagent.NewClient("your-api-key")

	resp, err := client.CreateChatCompletion(context.Background(), &helixagent.ChatCompletionRequest{
		Model: "helixagent-ensemble",
		Messages: []helixagent.Message{
			{Role: "user", Content: "Hello, how are you?"},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}

Python:
from helixagent import HelixAgentClient

client = HelixAgentClient(api_key="your-api-key")
response = client.chat.completions.create(
    model="helixagent-ensemble",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)
print(response.choices[0].message.content)

TypeScript/JavaScript:
import { HelixAgentClient } from '@helixagent/client';

const client = new HelixAgentClient({ apiKey: 'your-api-key' });
const response = await client.chat.completions.create({
  model: 'helixagent-ensemble',
  messages: [
    { role: 'user', content: 'Hello, how are you?' }
  ]
});
console.log(response.choices[0].message.content);

HelixAgent exposes comprehensive metrics at /metrics:
- `http_requests_total` - Total HTTP requests
- `http_request_duration_seconds` - Request duration histogram
- `llm_requests_total` - LLM provider requests
- `llm_response_time_seconds` - Provider response times
- `llm_error_rate_total` - Error rates by provider
- `ensemble_requests_total` - Ensemble processing requests
Import the pre-configured dashboard:
curl -X POST http://admin:admin@localhost:3000/api/dashboards/db \
-H "Content-Type: application/json" \
-d @monitoring/dashboards/helixagent-dashboard.json
HelixAgent includes several advanced services that enhance LLM interactions and provide enterprise-grade capabilities. These services are currently available as internal APIs and will be exposed via REST endpoints in future releases.
The MCP service enables seamless integration with external tools and services through a standardized protocol.
Features:
- Auto-discovery of MCP servers
- Tool registration and management
- Health monitoring and failover
- Secure tool execution
Current Status: Internal service, REST API integration planned
LSP integration provides advanced code intelligence capabilities.
Features:
- Workspace symbols and references
- Code completion with context awareness
- Refactoring support
- Multi-language support (Go, Python, JavaScript, etc.)
Current Status: Internal service, REST API integration planned
Intelligent tool management and execution orchestration.
Features:
- Dynamic tool discovery and registration
- Tool validation and security checks
- Parallel tool execution
- Dependency management and cycle detection
Current Status: Internal service, REST API integration planned
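Dependency management with cycle detection, as listed in the features above, is commonly built on a topological sort. How HelixAgent implements it internally is not described here; the sketch below only illustrates the idea using `graphlib` from the Python standard library (Python 3.9+). The function names and the dependency-map shape (tool name mapped to the set of tools it depends on) are assumptions.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return tool names ordered so every tool runs after its dependencies.

    Raises graphlib.CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def find_cycle(dependencies: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the nodes of one dependency cycle, or None if the graph is acyclic."""
    try:
        execution_order(dependencies)
    except CycleError as exc:
        # graphlib stores the detected cycle as the second element of args.
        return exc.args[1]
    return None
```

For parallel execution, `TopologicalSorter` also offers `prepare()`, `get_ready()`, and `done()`, which hand out batches of tools whose dependencies are already satisfied.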
Advanced context handling with ML-based relevance scoring.
Features:
- Multi-source context aggregation
- Relevance scoring and ranking
- Context compression and optimization
- Conflict detection and resolution
Current Status: Internal service, REST API integration planned
Secure execution environment for tool operations.
Features:
- Docker containerization
- Resource limits and isolation
- Command validation and sanitization
- Audit logging and monitoring
Current Status: Internal service, REST API integration planned
Workflow orchestration for complex multi-step operations.
Features:
- Code analysis workflows
- Tool chain execution
- Parallel processing
- Error handling and recovery
Current Status: Internal service, REST API integration planned
HelixAgent provides a complete OpenAPI 3.0 specification for automated API documentation and client generation.
Local Development:
# Download the OpenAPI spec
curl -o helixagent-openapi.yaml http://localhost:7061/openapi.yaml
# Or view in browser
open http://localhost:7061/swagger-ui/
Production:
curl -o helixagent-openapi.yaml https://api.yourdomain.com/openapi.yaml
TypeScript/JavaScript:
npx openapi-typescript-codegen --input helixagent-openapi.yaml --output ./client --client axios
Python:
pip install openapi-python-client
openapi-python-client generate --path helixagent-openapi.yaml --output ./python-client
Go:
go install github.com/deepmap/oapi-codegen/cmd/oapi-codegen@latest
oapi-codegen -package helixagent helixagent-openapi.yaml > client.gen.go
TypeScript Client Example:
import { HelixAgentClient } from './client';

const client = new HelixAgentClient({
  BASE: 'http://localhost:7061',
  TOKEN: 'your-jwt-token'
});

// Make a chat completion request
const response = await client.chatCompletions({
  model: 'helixagent-ensemble',
  messages: [
    { role: 'user', content: 'Explain quantum computing' }
  ],
  temperature: 0.7,
  max_tokens: 1000
});
console.log(response.choices[0].message.content);

Python Client Example:
from helixagent_client import HelixAgentClient

client = HelixAgentClient(
    base_url="http://localhost:7061",
    token="your-jwt-token"
)
response = client.chat_completions(
    model="helixagent-ensemble",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ],
    temperature=0.7,
    max_tokens=1000
)
print(response.choices[0].message.content)

Go Client Example:
package main

import (
	"context"
	"fmt"

	// The import path does not end in a valid Go identifier,
	// so the package name is bound explicitly.
	helixagent "github.com/your-org/helixagent-client"
)

func main() {
	client := helixagent.NewClient("http://localhost:7061")
	client.SetToken("your-jwt-token")

	req := helixagent.ChatCompletionRequest{
		Model: "helixagent-ensemble",
		Messages: []helixagent.Message{
			{Role: "user", Content: "Explain quantum computing"},
		},
		Temperature: 0.7,
		MaxTokens:   1000,
	}
	resp, err := client.ChatCompletions(context.Background(), req)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}

- Use `helixagent-ensemble` for most reliable responses
- Choose specific models for cost optimization
- Consider response time requirements
- Always handle `model_overloaded` errors with retry
- Implement exponential backoff for rate limits
- Monitor ensemble confidence scores
- Enable streaming for long responses
- Use appropriate `max_tokens` limits
- Batch requests when possible
- Use JWT tokens for authentication
- Validate input parameters
- Implement rate limiting client-side
- Use generated clients from OpenAPI spec for type safety
- Implement proper error handling in clients
- Cache responses when appropriate
- Monitor API usage and costs
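The retry and backoff advice in the list above can be sketched as two small helpers. This is an illustration under stated assumptions: the function names, the 0.5 s base delay, and the 30 s cap are arbitrary choices, and `X-RateLimit-Reset` is assumed to be a Unix timestamp in seconds, which matches the sample value shown in the rate limit headers but is not stated explicitly in this document.

```python
import random
from typing import Callable, Mapping


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  jitter: Callable[[], float] = random.random) -> float:
    """Exponential backoff with full jitter.

    The ceiling doubles with each attempt (base * 2**attempt) up to `cap`;
    the returned delay is a random fraction of that ceiling.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * jitter()


def delay_from_headers(headers: Mapping[str, str], now: float) -> float:
    """Seconds to wait before the next request, based on rate limit headers.

    Returns 0 while requests remain; otherwise waits until X-RateLimit-Reset
    (assumed to be Unix seconds).
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    reset = headers.get("X-RateLimit-Reset")
    if remaining > 0 or reset is None:
        return 0.0
    return max(0.0, float(reset) - now)
```

A retry loop would sleep for `delay_from_headers(response_headers, time.time())` when the server reports an exhausted quota, and fall back to `backoff_delay(attempt)` for `model_overloaded` errors.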
For more detailed information, see the HelixAgent GitHub repository.