Financial AI MLOps is an enterprise-grade machine learning operations project that detects anomalies in financial market transactions. Built on Databricks, it provides an end-to-end framework that covers everything from real-time streaming data ingestion to automated model retraining and serving.
The primary objective is to reliably capture financial data streams (e.g., via Finnhub WebSockets), process them using Delta Live Tables (DLT), generate predictive features, and serve robust anomaly detection models capable of identifying irregular market behaviors.
This repository encompasses a complete MLOps lifecycle, broken down into the following core functionalities:
- Real-time Streaming: Consumes continuous websocket streams from the Finnhub API for real-time trade data.
- Historical Batching: Periodically pulls historical market data using Alpha Vantage for robust training sets and baselining.
- Delta Live Tables (DLT): A declarative pipeline architecture transforming raw data into high-value assets.
  - Bronze: Raw ingestion layer.
  - Silver: Cleaned, formatted, and validated transactional data.
  - Gold: Aggregated features ready for ML training and inference.
- Feature Store Integration: Centralized tracking and lookup of engineered features to maintain consistency between offline training and online serving.
- Multi-Model Support: Implementations for XGBoost, LightGBM, Random Forest, and Isolation Forest.
- Model Tournament: An automated training system that concurrently trains multiple algorithms on the latest Gold data, tuning hyperparameters and evaluating relative performance across custom metrics (e.g., PR AUC, F1 Score).
- Data & Concept Drift Detection: Continuously computes the Population Stability Index (PSI) and Jensen-Shannon Divergence between incoming live data and reference windows.
- Automated Retraining: Programmatic triggers that initiate a retraining pipeline (Model Tournament) if performance degrades or substantial drift is detected.
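The drift checks and the retraining trigger described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the project's actual implementation: the function names, the binned-count inputs, and the thresholds (a PSI of 0.2 is a common rule of thumb, the Jensen-Shannon threshold is arbitrary) are all assumptions.

```python
import math
from typing import List, Sequence


def _to_probabilities(counts: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Turn per-bin counts into probabilities, clipping empty bins to eps."""
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    clipped = [max(c / total, eps) for c in counts]
    norm = sum(clipped)
    return [p / norm for p in clipped]


def _check_same_bins(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("reference and live windows must share the same bins")


def psi(reference_counts: Sequence[float], live_counts: Sequence[float]) -> float:
    """Population Stability Index between two binned distributions."""
    _check_same_bins(reference_counts, live_counts)
    ref = _to_probabilities(reference_counts)
    live = _to_probabilities(live_counts)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref, live))


def js_divergence(reference_counts: Sequence[float], live_counts: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits (base-2 logs, so the value lies in [0, 1])."""
    _check_same_bins(reference_counts, live_counts)
    ref = _to_probabilities(reference_counts)
    live = _to_probabilities(live_counts)
    mix = [(r + l) / 2 for r, l in zip(ref, live)]

    def kl(p: List[float], q: List[float]) -> float:
        return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

    return 0.5 * kl(ref, mix) + 0.5 * kl(live, mix)


def should_retrain(psi_value: float, jsd_value: float,
                   psi_threshold: float = 0.2, jsd_threshold: float = 0.1) -> bool:
    """Hypothetical trigger: retrain when either drift score exceeds its threshold."""
    return psi_value > psi_threshold or jsd_value > jsd_threshold
```

Both functions take histogram counts over the same bins for the reference window and the live window; how the bins are chosen (for example, quantiles of the reference window) is left open here.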
To keep model deployments safe and free of downtime, the project combines three release strategies: Champion vs. Challenger validation, A/B testing, and automated rollback.
Before any newly trained model is deployed to production, it must pass a "Champion vs. Challenger" validation phase. The system automatically benchmarks the Challenger against the currently deployed Champion on the primary metric (PR AUC) and on secondary metrics (F1, Precision, Recall). A new model is promoted only if it outperforms the Champion by the configured thresholds.
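One way such a promotion gate could work is sketched below. The rule (the primary metric must improve by a minimum margin and no secondary metric may regress beyond a tolerance), the `PromotionThresholds` type, the metric keys, and the default values are illustrative assumptions; the project's configured thresholds may differ.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class PromotionThresholds:
    """Hypothetical thresholds; real values would come from project configuration."""
    min_primary_gain: float = 0.01    # required improvement on the primary metric
    max_secondary_drop: float = 0.02  # tolerated regression on each secondary metric


def should_promote(
    champion: Mapping[str, float],
    challenger: Mapping[str, float],
    thresholds: PromotionThresholds = PromotionThresholds(),
    primary: str = "pr_auc",
    secondary: Sequence[str] = ("f1", "precision", "recall"),
) -> bool:
    """Return True only if the Challenger clears both the primary and secondary checks."""
    primary_gain = challenger[primary] - champion[primary]
    if primary_gain < thresholds.min_primary_gain:
        return False
    return all(
        champion[metric] - challenger[metric] <= thresholds.max_secondary_drop
        for metric in secondary
    )
```

For instance, a Challenger that gains 0.03 PR AUC while losing 0.01 precision would be promoted under these defaults, whereas one that gains only 0.005 PR AUC would not.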
Production deployments support A/B testing directly within Databricks Model Serving. Traffic can be split by weight between the stable model and a newly promoted model, allowing the team to measure real-world performance differences while exposing only a fraction of end users to the new model.
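A weighted split could be expressed as a small helper that validates the weights before they are sent to the serving endpoint. This is only a sketch: the helper name and the served model names are hypothetical, and the `routes`, `served_model_name`, and `traffic_percentage` keys are an assumption about the endpoint's traffic configuration payload that should be checked against the Databricks Model Serving documentation.

```python
from typing import Dict, List


def build_traffic_config(weights: Dict[str, int]) -> Dict[str, List[Dict[str, object]]]:
    """Build a traffic-split payload from {served model name: percentage}.

    The key names in the returned dict are assumed, not confirmed by this project.
    """
    if not weights:
        raise ValueError("at least one served model is required")
    if any(pct < 0 for pct in weights.values()):
        raise ValueError("traffic percentages must be non-negative")
    if sum(weights.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    return {
        "routes": [
            {"served_model_name": name, "traffic_percentage": pct}
            for name, pct in weights.items()
        ]
    }


# Example: send 10% of requests to the newly promoted model (names are placeholders).
config = build_traffic_config({"anomaly-champion": 90, "anomaly-challenger": 10})
```

Validating the weights locally catches a misconfigured split before any request is routed.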
Model serving includes a dedicated rollback_manager. If continuous monitoring detects severe performance degradation or latency spikes in the newest deployment, the system can automatically orchestrate a rollback to the previous known-good model state, minimizing business risk.
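The rollback decision might reduce to a check like the one below. This does not reflect the actual internals of rollback_manager: the `HealthSnapshot` type, the choice of PR AUC and p95 latency as signals, and both thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthSnapshot:
    """Hypothetical monitoring summary for one deployed model version."""
    pr_auc: float
    p95_latency_ms: float


def needs_rollback(
    baseline: HealthSnapshot,
    current: HealthSnapshot,
    max_pr_auc_drop: float = 0.05,
    max_latency_ratio: float = 1.5,
) -> bool:
    """Flag a rollback when quality drops or latency grows beyond the assumed limits."""
    degraded = baseline.pr_auc - current.pr_auc > max_pr_auc_drop
    too_slow = current.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio
    return degraded or too_slow
```

Here `baseline` would describe the previous known-good model and `current` the newest deployment; a True result would then hand control to whatever mechanism restores the earlier version.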
Infrastructure as Code (IaC) and pipeline automation are handled with Databricks Asset Bundles (DABs), defined in databricks.yml. Changes merged to the main branch trigger GitHub Actions that automatically validate code, run tests, and deploy the updated resources (Jobs, DLT Pipelines, Workflows) to the Databricks environments (Dev, Acc, Prd).
For a comprehensive guide on setting up the infrastructure, running pipelines, and managing operations, please refer to our internal Operational Documents:
- 🧭 Project Structure Map
- 🚀 Comprehensive MLOps Setup Guide
- ✅ Pipeline Testing & Validation
- 📖 Operations Runbook
Note: Canonical configurations are found in project_config.yml and pyproject.toml.