AutoML is an analytics engine that automates the end-to-end machine learning lifecycle. Starting from a raw CSV file, it runs a pipeline of automated preprocessing, meta-learning-driven model selection, hyperparameter optimization, and explainable AI (XAI).
The platform bridges the gap between raw data and actionable intelligence, delivering a polished web interface for real-time monitoring, visual insights, and comprehensive PDF reporting.
- Intelligent Data Preprocessing: Automated handling of missing values, encoding of categorical variables, and date-time feature engineering. Outputs a standardized `processed.csv` ready for modeling.
- Deep Exploratory Data Analysis (EDA): Generates summary statistics, missing-value profiles, correlation heatmaps, and distribution plots. All assets are version-controlled under `runs/<run_id>/`.
- Meta-Learning Engine: Extracts high-level dataset characteristics (meta-features) to recommend optimal model architectures based on historical experiment performance.
- AutoML Model Training: Automatically detects problem types (Regression/Classification), trains a diverse model zoo, and persists the champion model as `best_model.pkl`.
- Hyperparameter Optimization: Leverages Optuna for Bayesian optimization, fine-tuning models like XGBoost and Random Forest beyond default configurations.
- Comprehensive Evaluation:
  - Regression: $R^2$, $RMSE$, $MAE$, $MSE$.
  - Classification: Accuracy, Confusion Matrices, and Precision-Recall curves.
- Explainable AI (XAI): Integrated SHAP and Feature Importance analysis to provide transparency into model decision-making.
- Experiment Tracking: A RAG-inspired memory system that stores meta-features and performance metrics to improve future model recommendations.
- Modern Web UI: Responsive dashboard built with Flask and vanilla HTML/JS/Tailwind.
- Seamless Workflow: One-click pipeline execution with live status updates.
- Centralized Results: Interactive leaderboard, downloadable PDF reports, and localized storage indexed by unique `run_id`.
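The regression metrics named under Comprehensive Evaluation follow standard definitions, and writing them out can help when reading the leaderboard. The sketch below uses plain Python and is not taken from the project's code; the function name `regression_metrics` and the sample values are illustrative assumptions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined for a constant target; this sketch reports NaN then.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Made-up values purely for illustration.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

RMSE is in the same units as the target, which usually makes it easier to interpret than MSE, while $R^2$ is unitless and compares the model against always predicting the mean.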
| Component | Technologies |
|---|---|
| Backend | Python, Flask (REST API), scikit-learn, XGBoost |
| Optimization/XAI | Optuna, SHAP |
| Data & Viz | Pandas, NumPy, Seaborn, Matplotlib (Agg backend) |
| Frontend | HTML5, Tailwind CSS, JavaScript (Fetch API) |
| Architecture | Thread-based Async Training, Modular ML Pipelines, CORS-enabled Communication |
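The "Thread-based Async Training" entry suggests that a request starts training in a background thread while the UI polls for status. One way this pattern could look is sketched below with the standard library only. The registry `_jobs`, the functions `start_training` and `get_status`, and the stage names are all hypothetical and do not describe the project's actual backend.

```python
import threading
import time
import uuid

# In-memory job registry; a hypothetical stand-in for however runs are tracked.
_jobs = {}
_jobs_lock = threading.Lock()


def _set_status(run_id, **fields):
    with _jobs_lock:
        _jobs[run_id].update(fields)


def _train(run_id, stages):
    """Placeholder worker: each stage stands in for one pipeline step."""
    try:
        for name in stages:
            _set_status(run_id, status="running", stage=name)
            time.sleep(0.01)  # simulate work
        _set_status(run_id, status="done", stage=None)
    except Exception as exc:
        _set_status(run_id, status="failed", error=str(exc))


def start_training(stages):
    """Register a run, start its worker thread, and return immediately."""
    run_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[run_id] = {"status": "queued", "stage": None}
    worker = threading.Thread(target=_train, args=(run_id, stages), daemon=True)
    worker.start()
    return run_id, worker


def get_status(run_id):
    """Return a copy of the run's status, safe to serialize as JSON."""
    with _jobs_lock:
        return dict(_jobs[run_id])


run_id, worker = start_training(["preprocess", "eda", "train", "report"])
worker.join()  # a web handler would return right away instead of joining
final_status = get_status(run_id)
print(run_id, final_status)
```

In a Flask app, `start_training` would typically be called from a POST handler and `get_status` from a polling endpoint. The lock matters because the worker thread and request threads touch the same dictionary.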
1. Clone the repository and install dependencies:

       git clone https://github.com/BhaveshBhakta/Intelligent-ML-Analytics-Engine.git
       cd Intelligent-ML-Analytics-Engine
       pip install -r requirements.txt

2. Launch the backend:

       python -m backend.app

3. Access the UI: navigate to http://localhost:5000 in your browser.

All outputs are automatically persisted in `runs/<RUN_ID>/`.
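Once a run has finished, its folder can be inspected from a script. The sketch below checks a run directory for the two artifact names mentioned in this README, `processed.csv` and `best_model.pkl`. Where those files sit inside the run folder is an assumption, so the helper searches recursively. The function name `list_run_artifacts` and the demo run id are hypothetical.

```python
from pathlib import Path
import tempfile

# File names come from this README; their location inside a run folder
# is an assumption, hence the recursive search below.
EXPECTED_ARTIFACTS = ("processed.csv", "best_model.pkl")


def list_run_artifacts(runs_root, run_id):
    """Map each expected artifact name to its path, or None if missing."""
    run_dir = Path(runs_root) / run_id
    if not run_dir.is_dir():
        raise FileNotFoundError(f"no such run directory: {run_dir}")
    found = {}
    for name in EXPECTED_ARTIFACTS:
        matches = sorted(run_dir.rglob(name))  # look in nested folders too
        found[name] = matches[0] if matches else None
    return found


# Demonstration against a throwaway directory rather than a real run.
with tempfile.TemporaryDirectory() as tmp:
    demo_run = Path(tmp) / "runs" / "demo_run"
    demo_run.mkdir(parents=True)
    (demo_run / "processed.csv").write_text("a,b\n1,2\n")
    artifacts = list_run_artifacts(Path(tmp) / "runs", "demo_run")
    found_names = {name: path is not None for name, path in artifacts.items()}

print(found_names)
```

Against a real checkout, the call would be `list_run_artifacts("runs", "<RUN_ID>")` with the id shown in the UI substituted in.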
- Advanced Meta-Learning: Implementing transformer-based recommendation models.
- Deep Learning: Integration of PyTorch/TensorFlow for neural architecture search (NAS).
- Enterprise Readiness: Multi-user authentication and Dockerized cloud deployment (AWS/GCP).