This tutorial will guide you through setting up and using HelixAgent, a production-ready LLM facade system that intelligently routes requests across multiple LLM providers.
- Docker and Docker Compose
- curl (for API testing)
- Git
# Clone the repository
git clone https://github.com/vasic-digital/HelixAgent.git
cd HelixAgent
# The project is already built and ready to run

Create a .env file with your LLM provider API keys:
# HelixAgent Configuration
PORT=8080
HELIXAGENT_API_KEY=your-super-secret-api-key-here
# JWT Secret (generate a secure random string)
JWT_SECRET=your-secure-jwt-secret-here
# LLM Provider API Keys (get these from each provider)
CLAUDE_API_KEY=sk-ant-api03-your-claude-key-here
DEEPSEEK_API_KEY=sk-your-deepseek-key-here
GEMINI_API_KEY=your-gemini-api-key-here
# Optional: Qwen and Z.AI if you have them
QWEN_API_KEY=your-qwen-key-here
ZAI_API_KEY=your-zai-key-here
# Database (PostgreSQL connection settings)
DB_HOST=localhost
DB_PORT=5432
DB_USER=helixagent
DB_PASSWORD=password
DB_NAME=helixagent_db
# Redis (optional for this tutorial)
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=

Where to get API keys:
- Claude (Anthropic): https://console.anthropic.com/
- DeepSeek: https://platform.deepseek.com/
- Gemini (Google): https://makersuite.google.com/app/apikey
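
The .env example above asks for a secure random string for JWT_SECRET, and HELIXAGENT_API_KEY is also a secret you choose yourself. One way to produce such values is Python's standard `secrets` module. This is a minimal sketch: the helper names `generate_secret` and `render_env_lines` are illustrative, and the 32-byte length is an assumption, not a HelixAgent requirement.

```python
import secrets


def generate_secret(num_bytes=32):
    """Return a random hex string; 32 bytes gives 64 hex characters."""
    return secrets.token_hex(num_bytes)


def render_env_lines():
    """Build the two locally generated .env entries (names from the .env example)."""
    return [
        f"HELIXAGENT_API_KEY={generate_secret()}",
        f"JWT_SECRET={generate_secret()}",
    ]


# Print the lines so they can be pasted into .env
for line in render_env_lines():
    print(line)
```

Provider keys such as CLAUDE_API_KEY cannot be generated this way; they come from the provider consoles listed above.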
# Build and start the services
docker-compose up --build -d
# Check that services are running
docker-compose ps
# View logs
docker-compose logs -f helixagent

# Check if HelixAgent is running
curl http://localhost:7061/health
# Expected response:
# {"status":"healthy"}

curl -X POST http://localhost:7061/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello, can you tell me about yourself in one sentence?",
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 100,
    "temperature": 0.7
  }'

curl -X POST http://localhost:7061/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello, can you tell me about yourself in one sentence?",
    "model": "deepseek-coder",
    "max_tokens": 100,
    "temperature": 0.7
  }'

curl -X POST http://localhost:7061/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello, can you tell me about yourself in one sentence?",
    "model": "gemini-pro",
    "max_tokens": 100,
    "temperature": 0.7
  }'

HelixAgent's key feature is ensemble voting: it sends your request to multiple providers and returns the highest-scoring response:
curl -X POST http://localhost:7061/v1/ensemble/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain quantum computing in simple terms",
    "ensemble_config": {
      "strategy": "confidence_weighted",
      "min_providers": 3,
      "confidence_threshold": 0.8,
      "preferred_providers": ["claude", "deepseek", "gemini"]
    }
  }'

Expected Response:
{
  "id": "ensemble-123",
  "object": "ensemble.completion",
  "created": 1677652288,
  "model": "claude-3-sonnet-20240229",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits) that can exist in multiple states simultaneously, allowing them to solve certain complex problems much faster than classical computers..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 150,
    "total_tokens": 170
  },
  "ensemble": {
    "voting_method": "confidence_weighted",
    "responses_count": 3,
    "scores": {
      "claude": 0.92,
      "deepseek": 0.88,
      "gemini": 0.85
    },
    "selected_provider": "claude",
    "selection_score": 0.92
  }
}

curl -X POST http://localhost:7061/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful coding assistant."
      },
      {
        "role": "user",
        "content": "Write a simple Python function to calculate fibonacci numbers"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 200
  }'

# List all available providers
curl http://localhost:7061/v1/providers
# Check provider health
curl http://localhost:7061/v1/providers/claude/health
# Get available models
curl http://localhost:7061/v1/models

# Check enhanced health with provider status
curl http://localhost:7061/v1/health
# View Prometheus metrics
curl http://localhost:7061/metrics

# Stop and remove containers
docker-compose down
# Remove volumes (optional - this will delete data)
docker-compose down -v

- Multi-Provider Intelligence: HelixAgent routes requests across Claude, DeepSeek, and Gemini
- Ensemble Voting: The system intelligently selects the best response using confidence scoring
- OpenAI Compatibility: Use familiar API patterns with enhanced capabilities
- Health Monitoring: Built-in health checks and metrics collection
- Easy Deployment: Docker-based setup for quick testing
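
To make the ensemble voting point concrete, the sketch below reads the `ensemble` object from the sample response shown earlier and picks the provider with the highest score at or above the `confidence_threshold` from the request. This tutorial does not describe how HelixAgent computes or combines scores internally, so treat `pick_provider` as a hypothetical helper that only reproduces the simplest rule consistent with that sample.

```python
import json

# The "ensemble" object copied from the sample response in this tutorial.
SAMPLE_ENSEMBLE = json.loads("""
{
  "voting_method": "confidence_weighted",
  "responses_count": 3,
  "scores": {"claude": 0.92, "deepseek": 0.88, "gemini": 0.85},
  "selected_provider": "claude",
  "selection_score": 0.92
}
""")


def pick_provider(scores, confidence_threshold=0.8):
    """Return (provider, score) for the best score at or above the threshold.

    Returns None when no provider reaches the threshold. This is an
    illustrative rule, not HelixAgent's documented algorithm.
    """
    eligible = {name: s for name, s in scores.items() if s >= confidence_threshold}
    if not eligible:
        return None
    best = max(eligible, key=eligible.get)
    return best, eligible[best]


print(pick_provider(SAMPLE_ENSEMBLE["scores"]))  # ('claude', 0.92)
```

With the sample scores, the result matches the `selected_provider` and `selection_score` fields of the response.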
- Add Authentication: Set up user accounts and JWT tokens
- Configure Memory: Enable Cognee for context-aware responses
- Set up Monitoring: Configure Grafana dashboards for visualization
- Scale Up: Deploy with load balancing for production use
- Add More Providers: Integrate additional LLM providers as needed
- API Key Errors: Double-check your API keys in .env
- Port Conflicts: Ensure the HelixAgent port is free (the curl examples use 7061; PORT in .env is 8080)
- Database Issues: Check PostgreSQL container logs
- Rate Limits: Some providers have rate limits - wait and retry
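
For the rate-limit case, "wait and retry" is usually implemented as exponential backoff. The following is a minimal client-side sketch, not part of HelixAgent: `RateLimited` is a hypothetical exception that your own request function would raise when a provider reports a rate limit, and the attempt count and delays are arbitrary defaults.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's request function on a rate limit."""


def retry_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimited, wait base_delay * 2**attempt and try again.

    Re-raises RateLimited once max_attempts calls have all failed.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use, pass only the function that performs the request.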
# Check container logs
docker-compose logs helixagent
# Restart services
docker-compose restart
# Rebuild from scratch
docker-compose down
docker-compose up --build --force-recreate

- Streaming: Add "stream": true for real-time responses
- Memory Enhancement: Add "memory_enhanced": true for context awareness
- Custom Routing: Configure provider preferences and weights
- Rate Limiting: Implement per-user rate limits
- Plugin System: Extend functionality with custom plugins
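
As an illustration of the streaming and memory options above, this sketch builds the same chat request as the earlier curl example and adds the two flags when asked. It assumes the flags are top-level fields of the JSON body, which this tutorial implies but does not spell out. `build_chat_request` is a hypothetical helper, and the code only constructs the request without sending it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7061"  # host and port used by the curl examples


def build_chat_request(messages, model="claude-3-sonnet-20240229",
                       stream=False, memory_enhanced=False):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = {
        "model": model,
        "messages": messages,
        "temperature": 0.7,
        "max_tokens": 200,
    }
    # Assumption: both options are top-level fields of the request body.
    if stream:
        body["stream"] = True
    if memory_enhanced:
        body["memory_enhanced"] = True
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request([{"role": "user", "content": "Hello"}], stream=True)
print(req.full_url)
print(json.loads(req.data))
```

To send the request, pass it to `urllib.request.urlopen(req)`. How a streamed response is framed is not covered in this tutorial, so reading it is left out here.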
Congratulations! You've successfully set up and used HelixAgent, an enterprise-grade LLM facade system. The system is now intelligently routing your requests across multiple providers and delivering the best possible responses through ensemble voting. 🚀