Commit b568862 (parent 945006a)
feat: add comprehensive README for Financial AI MLOps project

1 file changed: README.md (65 additions, 0 deletions)
# Financial AI MLOps

![Python](https://img.shields.io/badge/Python-3.11+-blue.svg)
![Databricks](https://img.shields.io/badge/Databricks-Asset%20Bundles-orange.svg)
![MLflow](https://img.shields.io/badge/MLflow-3.1.1-blue.svg)
## Introduction

**Financial AI MLOps** is an enterprise-grade machine learning operations project focused on detecting anomalies in financial market transactions. Built on Databricks, it provides an end-to-end framework spanning real-time streaming data ingestion through automated model retraining and serving.

The primary objective is to reliably capture financial data streams (e.g., via Finnhub WebSockets), process them with Delta Live Tables (DLT), generate predictive features, and serve robust anomaly detection models capable of identifying irregular market behavior.
## Key Functionalities

This repository encompasses a complete MLOps lifecycle, broken down into the following core functionalities:
### 1. Data Ingestion (Streaming & Historical)

* **Real-time Streaming:** Consumes continuous WebSocket streams from the Finnhub API for real-time trade data.
* **Historical Batching:** Periodically pulls historical market data from Alpha Vantage to build robust training sets and baselines.
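As a concrete illustration of the streaming side, the sketch below flattens one Finnhub WebSocket trade frame into tick records. It assumes the payload shape Finnhub documents for trade messages (`type`, and a `data` array with `s`/`p`/`v`/`t` fields); field renames are our own choice, not this project's schema:

```python
import json

def parse_trade_message(raw: str) -> list[dict]:
    """Flatten one Finnhub WebSocket trade message into tick records.

    Assumes the documented payload shape: {"type": "trade",
    "data": [{"s": symbol, "p": price, "v": volume, "t": epoch_ms}, ...]}.
    Non-trade frames (e.g. pings) yield an empty list.
    """
    msg = json.loads(raw)
    if msg.get("type") != "trade":
        return []
    return [
        {
            "symbol": t["s"],
            "price": float(t["p"]),
            "volume": float(t["v"]),
            "event_ts_ms": int(t["t"]),
        }
        for t in msg.get("data", [])
    ]

# Example frame as it would arrive over the socket:
frame = '{"type":"trade","data":[{"s":"BINANCE:BTCUSDT","p":42000.5,"v":0.01,"t":1700000000000}]}'
ticks = parse_trade_message(frame)
```

In the real pipeline the raw frames would land in the Bronze layer as-is; parsing this early is only for illustration.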
### 2. Data Processing & Feature Engineering (DLT)

* **Delta Live Tables (DLT):** A declarative pipeline architecture transforming raw data into high-value assets.
  * *Bronze:* Raw ingestion layer.
  * *Silver:* Cleaned, formatted, and validated transactional data.
  * *Gold:* Aggregated features ready for ML training and inference.
* **Feature Store Integration:** Centralized tracking and lookup of engineered features to maintain consistency between offline training and online serving.
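The actual medallion transforms run as Spark/DLT code on Databricks; outside that runtime, the Silver-layer validation and Gold-layer aggregation can be sketched as plain Python (function names, rules, and features here are hypothetical, not the repository's actual logic):

```python
from statistics import mean, pstdev

def to_silver(bronze_rows):
    """Silver: drop malformed ticks -- the kind of rule a DLT
    expectation such as @dlt.expect_or_drop would enforce declaratively."""
    return [
        r for r in bronze_rows
        if r.get("symbol") and r.get("price", 0) > 0 and r.get("volume", 0) >= 0
    ]

def to_gold(silver_rows):
    """Gold: aggregate per-symbol features ready for training/inference."""
    by_symbol = {}
    for r in silver_rows:
        by_symbol.setdefault(r["symbol"], []).append(r["price"])
    return {
        sym: {
            "mean_price": mean(prices),
            "price_volatility": pstdev(prices) if len(prices) > 1 else 0.0,
            "tick_count": len(prices),
        }
        for sym, prices in by_symbol.items()
    }

bronze = [
    {"symbol": "AAPL", "price": 190.0, "volume": 10},
    {"symbol": "AAPL", "price": 192.0, "volume": 5},
    {"symbol": "", "price": -1.0, "volume": 0},  # malformed -> dropped at Silver
]
gold = to_gold(to_silver(bronze))
```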
### 3. Model Training & Tournament

* **Multi-Model Support:** Implementations for XGBoost, LightGBM, Random Forest, and Isolation Forest.
* **Model Tournament:** An automated training system that concurrently trains multiple algorithms on the latest Gold data, tuning hyperparameters and evaluating relative performance across custom metrics (e.g., PR AUC, F1 score).
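The tournament's evaluate-and-pick-a-winner step can be sketched as below. The stub "models" stand in for the fitted XGBoost/LightGBM/Random Forest/Isolation Forest estimators, and F1 stands in for the project's full metric set; everything here is illustrative:

```python
def f1_score(y_true, y_pred):
    """Plain-Python F1 for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def run_tournament(candidates, X_val, y_val):
    """Score every candidate on held-out data and return the winner."""
    scores = {name: f1_score(y_val, predict(X_val))
              for name, predict in candidates.items()}
    winner = max(scores, key=scores.get)
    return winner, scores

# Stub predictors in place of real fitted estimators:
candidates = {
    "always_normal": lambda X: [0] * len(X),
    "threshold": lambda X: [1 if x > 0.5 else 0 for x in X],
}
X_val, y_val = [0.1, 0.9, 0.8, 0.2], [0, 1, 1, 0]
winner, scores = run_tournament(candidates, X_val, y_val)
# winner == "threshold" (perfect F1 on this toy split)
```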
### 4. Advanced Monitoring & Observability

* **Data & Concept Drift Detection:** Continuously computes the Population Stability Index (PSI) and Jensen-Shannon divergence on incoming live data against reference windows.
* **Automated Retraining:** Programmatic triggers that launch the retraining pipeline (Model Tournament) when performance degrades or substantial drift is detected.
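PSI is simple enough to sketch end to end: bin the reference window, compute each bin's share in the reference and the live sample, and sum `(actual - expected) * ln(actual / expected)`. The binning choice and the 0.25 "significant drift" threshold below are common conventions, not necessarily this project's configuration:

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` (live window) against
    `expected` (reference window), binned on the reference range."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the log term stays defined.
        return [max(c / len(values), eps) for c in counts]

    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(proportions(expected), proportions(actual))
    )

reference = [i / 100 for i in range(100)]    # training-time distribution
live_same = list(reference)                  # no drift
live_shifted = [v + 0.5 for v in reference]  # clearly drifted window
```

A PSI of 0 means the distributions match; values above roughly 0.25 are widely treated as significant drift, which is where a retraining trigger would fire.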
---

## Deployment Strategies

To ensure zero-downtime, safe, and highly performant model deployments, this project employs advanced deployment and release strategies:
### Champion / Challenger Gating

Before any newly trained model is deployed to production, it must survive a "Champion vs. Challenger" validation phase. The system automatically benchmarks the Challenger against the currently deployed Champion on the primary metric (PR AUC) and secondary metrics (F1, precision, recall). A new model is promoted only if it outperforms the Champion by the configured thresholds.
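The gating decision reduces to a comparison of metric dictionaries. A minimal sketch, assuming a required primary-metric gain plus a tolerance on secondary metrics (the function name, threshold names, and values are hypothetical, not the project's configuration):

```python
def should_promote(champion, challenger, primary="pr_auc",
                   min_gain=0.01, secondary_tolerance=0.02):
    """Promote the Challenger only if it beats the Champion on the
    primary metric by at least min_gain, without regressing any
    secondary metric by more than secondary_tolerance."""
    if challenger[primary] - champion[primary] < min_gain:
        return False
    return all(
        challenger[m] >= champion[m] - secondary_tolerance
        for m in champion
        if m != primary
    )

champion = {"pr_auc": 0.82, "f1": 0.74, "precision": 0.80, "recall": 0.69}
challenger = {"pr_auc": 0.86, "f1": 0.75, "precision": 0.79, "recall": 0.72}
decision = should_promote(champion, challenger)
```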
44+
45+
### A/B Testing & Traffic Splitting

Production deployments support A/B testing directly within Databricks Model Serving. Traffic can be weighted and split between the stable model and a newly promoted model, allowing the team to measure real-world performance differences without exposing all end users to the new model.
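In Databricks Model Serving the split is configured as route weights on the endpoint; the effect can be sketched client-side as a deterministic hash-based router, so the same caller always lands on the same variant. Route names and the 90/10 weighting are illustrative:

```python
import hashlib

def route_request(request_id: str,
                  split=(("champion", 90), ("challenger", 10))):
    """Map a request id to a serving route: hash into buckets 0-99
    and walk the cumulative route weights."""
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for route, weight in split:
        cumulative += weight
        if bucket < cumulative:
            return route
    return split[-1][0]

routes = [route_request(f"req-{i}") for i in range(1000)]
champion_share = routes.count("champion") / len(routes)
```

Hashing (rather than random draws) keeps per-user experience stable across requests, which makes the A/B comparison cleaner.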
### Automated Rollbacks

Model serving includes a dedicated `rollback_manager`. If continuous monitoring detects severe performance degradation or latency spikes in the newest deployment, the system can automatically orchestrate a rollback to the previous known-good model state, minimizing business risk.
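The README does not show the `rollback_manager` interface, so the sketch below is hypothetical: it demonstrates one common pattern, counting consecutive unhealthy monitoring windows and firing a rollback only after a patience threshold, so a single noisy window doesn't trigger churn:

```python
class RollbackManager:
    """Hypothetical sketch of rollback gating; names and thresholds
    are illustrative, not this repository's actual interface."""

    def __init__(self, max_latency_ms=500.0, min_pr_auc=0.70, patience=3):
        self.max_latency_ms = max_latency_ms
        self.min_pr_auc = min_pr_auc
        self.patience = patience  # consecutive bad windows before acting
        self.bad_windows = 0

    def observe(self, latency_ms: float, pr_auc: float) -> bool:
        """Record one monitoring window; return True when a rollback
        to the previous known-good model should be triggered."""
        unhealthy = latency_ms > self.max_latency_ms or pr_auc < self.min_pr_auc
        self.bad_windows = self.bad_windows + 1 if unhealthy else 0
        return self.bad_windows >= self.patience

mgr = RollbackManager()
```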
### CI/CD with Databricks Asset Bundles

Infrastructure as Code (IaC) and pipeline automation are handled using **Databricks Asset Bundles (DABs)** (`databricks.yml`). Changes merged to the main branch trigger GitHub Actions workflows that automatically validate code, run tests, and deploy the updated resources (Jobs, DLT pipelines, Workflows) to the Databricks environments (Dev, Acc, Prd).

---
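For orientation, a `databricks.yml` bundle typically follows the shape below. This skeleton uses the standard DAB schema, but the bundle, target, and job names are hypothetical, not this repository's actual configuration:

```yaml
# databricks.yml -- illustrative skeleton only
bundle:
  name: financial-ai-mlops

targets:
  dev:
    mode: development
    default: true
  prd:
    mode: production

resources:
  jobs:
    model_tournament:
      name: model-tournament
      tasks:
        - task_key: train_and_evaluate
          notebook_task:
            notebook_path: ./notebooks/model_tournament.py
```

`databricks bundle validate` and `databricks bundle deploy -t <target>` are the CLI entry points a CI workflow would invoke against this file.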
## Getting Started

For a comprehensive guide to setting up the infrastructure, running pipelines, and managing operations, please refer to our internal [Operational Documents](./Operational_Documents/):

* 🧭 [Project Structure Map](./Operational_Documents/PROJECT_STRUCTURE.md)
* 🚀 [Comprehensive MLOps Setup Guide](./Operational_Documents/COMPREHENSIVE_MLOPS_SETUP_GUIDE.md)
* 🧪 [Pipeline Testing & Validation](./Operational_Documents/GETTING_STARTED_PIPELINE_TESTING.md)
* 📖 [Operations Runbook](./Operational_Documents/RUNBOOK.md)

*Note: Canonical configurations are found in `project_config.yml` and `pyproject.toml`.*
