Trading Backtester Production System

A comprehensive, production-ready algorithmic trading backtesting system with sentiment analysis, machine learning models, and real-time monitoring capabilities.

🚀 Features

Core Functionality

  • News-based Sentiment Analysis: Process financial news to predict market sentiment
  • Machine Learning Models: LightGBM-based prediction models for 1d, 3d, and 7d horizons
  • Intraday Backtesting: Realistic backtesting with slippage and commission modeling
  • Multi-horizon Predictions: Support for multiple prediction timeframes
  • Interactive Charts: Real-time OHLC charts with AI prediction overlays
  • Advanced Visualization: Volume histograms, confidence bands, and prediction aggregation
  • Data Caching: Optimized performance with intelligent data caching

Production-Ready Infrastructure

  • Configuration Management: Centralized config with environment variable support
  • Comprehensive Logging: Structured logging with different levels and output destinations
  • Error Handling & Recovery: Circuit breakers, retry mechanisms, and graceful degradation
  • Database Migrations: Version-controlled schema evolution
  • Data Validation: Real-time data quality monitoring and anomaly detection
  • Feature Engineering: Automated feature extraction and selection pipeline
  • Model Versioning: Model lifecycle management and A/B testing framework
  • REST API: FastAPI-based endpoints for external integration

Monitoring & Observability

  • Performance Metrics: System and application performance monitoring
  • Health Checks: Comprehensive health check endpoints
  • Alerting System: Real-time alerting for critical issues
  • Quality Monitoring: Data quality dashboards and trend analysis

Security & Deployment

  • Authentication: API key and JWT-based authentication

  • CI/CD Pipeline: Automated testing, building, and deployment

    Note: This repository does not include a .github/workflows directory by default. If you want automated CI, add your GitHub Actions workflows under .github/workflows/.

🤖 Model System

Overview

The system uses a modular model architecture that enables seamless integration of machine learning models for trading predictions. The design supports both pre-trained joblib bundles and custom Python-based models, providing flexibility for different use cases.

Model Bundle Canonical Format

All models are stored in a standardized bundle format:

{
  "meta": {
    "name": "str",
    "type": "str",
    "version": "str",
    "description": "str",
    "config_schema": {}
  },
  "model": "estimator",
  "extras": {}
}

Adding a New Model

For Joblib Models

Save your trained model in the canonical format using joblib.dump().
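A minimal sketch of assembling and checking a bundle in the canonical format shown above (the estimator here is a placeholder object; in practice you pass your trained model and persist the dict with joblib.dump):

```python
REQUIRED_KEYS = {"meta", "model", "extras"}
REQUIRED_META = {"name", "type", "version", "description", "config_schema"}

def make_bundle(model, name, model_type, version, description="", config_schema=None, extras=None):
    """Assemble a model bundle in the canonical format."""
    return {
        "meta": {
            "name": name,
            "type": model_type,
            "version": version,
            "description": description,
            "config_schema": config_schema or {},
        },
        "model": model,
        "extras": extras or {},
    }

def validate_bundle(bundle):
    """Check that a bundle has the required top-level and meta keys."""
    return (
        REQUIRED_KEYS <= bundle.keys()
        and REQUIRED_META <= bundle["meta"].keys()
    )

bundle = make_bundle(object(), "my_lgbm_1d", "lightgbm", "1.0.0")
assert validate_bundle(bundle)
# Persist with: joblib.dump(bundle, "models/my_lgbm_1d.joblib")
```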

For Python Models

Create a new class in backend/models/ that inherits from BaseModel and implements the required methods (e.g., predict, train).
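A toy sketch of what such a subclass could look like. The BaseModel stand-in below and its train/predict signatures are assumptions; the real base class lives under backend/models/:

```python
from abc import ABC, abstractmethod

class BaseModel(ABC):
    """Stand-in for the project's BaseModel (assumed interface)."""

    @abstractmethod
    def train(self, X, y): ...

    @abstractmethod
    def predict(self, X): ...

class NaiveMeanModel(BaseModel):
    """Toy model: predicts the mean target seen during training."""

    def __init__(self):
        self.mean_ = 0.0

    def train(self, X, y):
        self.mean_ = sum(y) / len(y) if y else 0.0

    def predict(self, X):
        return [self.mean_ for _ in X]

model = NaiveMeanModel()
model.train([[1], [2]], [1.0, 3.0])
print(model.predict([[4]]))  # [2.0] -- the training-set mean
```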

Developer Workflow

  • Model Discovery: Use the model registry to list available models
  • API Integration: Access models through dedicated API endpoints
  • Version Management: Track model versions and performance metrics

Running the Application and Tests

  1. Activate the virtual environment (PowerShell): & .venv\Scripts\Activate.ps1
  2. Start the backend from repo root: python main.py
  3. Start the frontend: cd frontend && npm run dev
  4. Run tests: pytest

πŸ“ Project Structure

trading-backtesting/
├── main.py                    # Top-level import shim that re-exports backend app
├── backend/
│   ├── main.py                # FastAPI app entry point
│   ├── schemas/               # Pydantic models (e.g., `schemas/udf.py`)
│   ├── routes/                # API route modules (health, predictions, backtests, scripts, websocket, ...)
│   │   └── monitoring.py      # Performance metrics and monitoring
│   ├── config.py              # Configuration management
│   ├── logging_config.py      # Comprehensive logging setup
│   ├── error_handling.py      # Error handling and recovery
│   ├── data_processing.py     # ETL / data processing utilities
│   ├── data_validation.py     # Data quality monitoring
│   ├── requirements.txt       # Backend Python dependencies
│   └── scripts/               # Original trading and data ingestion scripts
├── db/                        # Database files & schema
│   └── schema.sql             # Database schema
├── frontend/                  # React/TypeScript frontend
│   ├── package.json
│   ├── src/
│   └── README.md
├── models/                    # Trained model artifacts (.joblib files)
├── tests/                     # Comprehensive test suite (API, integration, unit tests)
├── htmlcov/                   # Generated coverage report
├── .venv/                     # Local development virtual environment (not committed by policy)
└── README.md                  # This README

🖥 Running the Frontend

# In one terminal: start backend (ensure .venv is activated)
python -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

# In a second terminal: start the frontend
cd frontend
npm run dev

🐳 Docker Development

This repository includes Docker support for local development with both backend and frontend services.

Start with Docker Compose

cp .env.example .env
docker compose up --build

  • Backend: http://localhost:8000
  • Frontend: http://localhost:5173

The backend service mounts ./backend, ./data, and .env for live development. The frontend service mounts ./frontend and supports hot reload.

Stop Docker services

docker compose down

Optional: VS Code Dev Container

If you use VS Code, open the repo in the dev container. The .devcontainer/devcontainer.json configuration forwards ports 8000 and 5173.

🛠 Installation & Setup

Prerequisites

  • Python 3.10+
  • SQLite (or PostgreSQL for production)

Quick Start

  1. Clone and Setup:

    git clone <repository-url>
    cd trading-backtesting
    python -m venv .venv
    # PowerShell
    & .venv\Scripts\Activate.ps1
    # Or activate on bash/macOS:
    # source .venv/bin/activate
    # Install backend Python requirements
    pip install -r backend/requirements.txt
    # Install frontend dependencies (optional, if you will run the frontend)
    cd frontend
    npm install
    cd ..
  2. Environment Configuration:

    cp .env.example .env
    # Edit .env with your configuration
  3. Initialize Database:

    # Run schema migration to create the database schema
    python backend/scripts/apply_schema.py
    
    # Optionally run the ingestion & pipeline scripts to populate sample data
    python backend/scripts/run_pipeline.py
  4. Start API Server:

    # From repo root (after activating .venv):
    python -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
    
    # Or run the module directly from backend:
    cd backend
    python main.py

🔧 Configuration

The system uses a hierarchical configuration system:

  1. Default Configuration (config.py)
  2. Environment Variables (.env file)
  3. Runtime Configuration (API calls)

Key configuration sections:

  • Database: Connection settings, pool sizes, timeouts
  • API: Server settings, CORS, authentication
  • Trading: Capital, commissions, slippage, exposure limits
  • Models: Model paths, training parameters
  • Logging: Log levels, output formats, destinations
  • Monitoring: Alert thresholds, performance metrics
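The three-level precedence can be sketched as a small resolver; the setting names and defaults below are hypothetical, and the real keys live in backend/config.py:

```python
import os

# Hypothetical defaults -- stand-ins for the values in config.py.
DEFAULTS = {"API_PORT": "8000", "LOG_LEVEL": "INFO"}

def get_setting(name, runtime_overrides=None):
    """Resolve a setting with the precedence described above:
    runtime configuration > environment variable (.env) > default."""
    if runtime_overrides and name in runtime_overrides:
        return runtime_overrides[name]
    if name in os.environ:
        return os.environ[name]
    return DEFAULTS[name]

os.environ["LOG_LEVEL"] = "DEBUG"
print(get_setting("LOG_LEVEL"))                         # DEBUG (env beats default)
print(get_setting("LOG_LEVEL", {"LOG_LEVEL": "WARN"}))  # WARN (runtime beats env)
```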

📊 API Documentation

Core Endpoints

Health & Monitoring

  • GET /health - System health check
  • GET /metrics - Performance metrics
  • GET /docs - Interactive API documentation

Predictions

  • POST /predict - Make trading predictions
  • GET /predictions/recent - Get recent predictions
  • GET /models - List available models

Backtesting

  • POST /backtest - Run backtest
  • GET /backtest/{id} - Get backtest results

Data Access

  • GET /data/prices/{ticker} - Get price data
  • GET /portfolio/current - Get current portfolio

Script Execution

  • POST /scripts/execute - Execute data processing or ML script
  • GET /scripts/status/{execution_id} - Get script execution status
  • GET /scripts/executions - List all script executions
  • POST /scripts/pipeline/run - Run the full data processing pipeline
  • GET /scripts/pipeline/status/{execution_id} - Get pipeline execution status
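The execute-then-poll flow can be sketched with a small helper; the "status" field and terminal values below are assumptions about the response shape:

```python
import time

def wait_for_execution(fetch_status, timeout=60.0, interval=1.0):
    """Poll a script execution until it reaches a terminal state.

    fetch_status() should return the JSON body of
    GET /scripts/status/{execution_id} as a dict.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = fetch_status()
        if body.get("status") in ("completed", "failed"):
            return body
        time.sleep(interval)
    raise TimeoutError("execution did not finish within the timeout")

# Usage with a stubbed status source (in practice, an HTTP GET):
states = iter([{"status": "running"}, {"status": "completed"}])
print(wait_for_execution(lambda: next(states), interval=0.0))
```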

Authentication

API endpoints support authentication via:

  • API Key (header: Authorization: Bearer <key>)
  • JWT Tokens (for advanced use cases)
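A minimal sketch of attaching the API key header; the environment variable name is an assumption, and keys are provisioned out of band:

```python
import os

# Assumption: the API key is exported as an env var before running.
api_key = os.environ.get("TRADING_API_KEY", "demo-key")
headers = {"Authorization": f"Bearer {api_key}"}

# With the `requests` package installed and the server running:
# resp = requests.get("http://localhost:8000/models", headers=headers)
print(headers["Authorization"].startswith("Bearer "))  # True
```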

🧪 Testing

Test Status

This repository includes a comprehensive test suite that covers API endpoints, the backtesting engine, data processing and integrations. Run the tests locally to verify the current status and coverage.

Test Coverage

  • API endpoints (health, predictions, backtests, data, portfolio, scripts, monitoring, websockets)
  • Backtesting engine functionality
  • Script execution and pipeline management
  • Integration workflows

Running Tests

# Run all tests (make sure .venv is activated)
pytest tests/

# Run with coverage
pytest --cov=backend --cov-report=html

# Run a specific test file
pytest tests/test_backtesting.py

# Run with verbose output
pytest -v

Test Categories

  • Unit Tests: Individual component testing (API endpoints, data validation, utilities)
  • Integration Tests: End-to-end workflow testing (pipeline execution, backtest flows)
  • WebSocket Tests: Real-time communication testing
  • Script Execution Tests: Background task and pipeline validation

📈 Usage Examples

Making Predictions

import requests

response = requests.post('http://localhost:8000/predict', json={
    'ticker': 'AAPL',
    'horizon': '1d',
    'context': {'market_conditions': 'normal'}
})

prediction = response.json()
print(f"Predicted return: {prediction['predicted_return']:.4f}")

Running Backtests

response = requests.post('http://localhost:8000/backtest', json={
    'strategy_name': 'sentiment_momentum',
    'start_date': '2025-01-01',
    'end_date': '2025-12-31',
    'initial_capital': 100000,
    'parameters': {'sentiment_threshold': 0.02}
})

backtest_id = response.json()['id']

Data Quality Monitoring

from backend.data_validation import create_data_quality_monitor

monitor = create_data_quality_monitor()
reports = monitor.run_quality_checks(['price_daily', 'sentiment_predictions'])

for table, report in reports.items():
    print(f"{table}: {report.quality_level.value} ({report.quality_score:.2%})")

πŸ” Monitoring & Alerting

Health Checks

The system provides multiple health check endpoints:

  • Database connectivity
  • Model availability
  • Data freshness
  • System resources
  • API response times

Performance Monitoring

Track key metrics:

  • Prediction latency
  • Model accuracy over time
  • Data quality scores
  • System resource usage
  • Error rates

Alerting

Configure alerts for:

  • Model accuracy degradation
  • Data quality issues
  • System resource constraints
  • API performance degradation
  • Prediction confidence thresholds
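Alert evaluation against configured thresholds can be sketched as follows; the metric names and limits are hypothetical, and the real values come from the monitoring configuration:

```python
# Hypothetical thresholds -- stand-ins for the monitoring config.
THRESHOLDS = {
    "model_accuracy": ("min", 0.55),  # alert if accuracy drops below 55%
    "data_quality": ("min", 0.90),
    "error_rate": ("max", 0.05),      # alert if more than 5% of requests fail
}

def evaluate_alerts(metrics):
    """Return the names of metrics that breach their configured threshold."""
    fired = []
    for name, (kind, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported this cycle
        if (kind == "min" and value < limit) or (kind == "max" and value > limit):
            fired.append(name)
    return fired

print(evaluate_alerts({"model_accuracy": 0.48, "error_rate": 0.01}))  # ['model_accuracy']
```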

Code Quality Standards

  • Type Hints: All functions should have type annotations
  • Documentation: Comprehensive docstrings for all public APIs
  • Testing: Minimum 80% test coverage
  • Logging: Appropriate logging for debugging and monitoring
  • Error Handling: Comprehensive error handling and recovery

πŸ™ Acknowledgments

  • Built with FastAPI, pandas, scikit-learn, and LightGBM
  • Uses TA-Lib for technical analysis indicators
  • Inspired by modern MLOps best practices
  • Designed for production financial trading systems

⚠️ Disclaimer: This software is for educational and research purposes. Trading involves risk of financial loss. Past performance does not guarantee future results. Always conduct your own research and consider consulting with financial professionals before making investment decisions.
