A Python implementation for combining multiple machine learning models using Bayesian Networks. This project provides tools for model stacking, feature importance analysis, and automated hyperparameter tuning using Bayesian optimization.
- Bayesian Network model stacking
- Automated hyperparameter optimization
- Feature importance analysis
- Multiple classifier support (Random Forest, SVM, XGBoost, etc.)
- Comprehensive performance metrics and visualizations
- Support for both sequential and parallel processing
- Python 3.8 or higher
- pip package manager
- Virtual environment management tools
- Clone the repository:
git clone https://github.com/Narden91/bayesian-combining.git
cd bayesian-combining
- Create and activate a Python virtual environment:
Linux/macOS:
python -m venv env
source env/bin/activate
Windows:
python -m venv env
env\Scripts\activate
- Install required dependencies:
pip install -r requirements.txt
The project uses YAML configuration files located in the config/ directory:
- config.yaml: Main configuration file for model parameters
- Custom configurations can be added for different experiments
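As a minimal sketch of how an entry point could accept the `--config` option used with main.py, the snippet below parses a configuration path with the standard library's argparse. The function name `parse_args` and the default path are assumptions made for illustration; the project's actual argument handling may differ, and reading the YAML file itself is left out.

```python
import argparse
from pathlib import Path

# Assumed default: the main configuration file in the config/ directory.
DEFAULT_CONFIG = Path("config") / "config.yaml"


def parse_args(argv=None):
    """Parse command-line arguments (hypothetical helper, not the project's code)."""
    parser = argparse.ArgumentParser(description="Run an experiment.")
    parser.add_argument(
        "--config",
        type=Path,
        default=DEFAULT_CONFIG,
        help="Path to a YAML configuration file.",
    )
    return parser.parse_args(argv)


# Without the flag the default is used; with it, the custom path is picked up.
default_args = parse_args([])
custom_args = parse_args(["--config", "path/to/custom_config.yaml"])
print(default_args.config)
print(custom_args.config)
```

Keeping the default in one constant means a custom experiment only needs a different file path, not a code change.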
Run sequentially:
python main.py
Run with parallel processing:
python main_multiprocessing.py
Run with a custom configuration:
python main.py --config path/to/custom_config.yaml
The project is organized as follows. Each Python module in the src directory has a specific responsibility:
bayesian-combining/
├── config/ # Configuration files
│ └── config.yaml # Main configuration file
├── data/ # Data directory
├── output/ # Output directory for results
├── src/ # Source code
│ ├── aggregate_results_combining.py
│ ├── bayesian_net_importance_score.py
│ ├── bayesian_net_importance.py
│ ├── bayesian_network.py
│ ├── classification.py
│ ├── explainability.py
│ ├── hyperparameters.py
│ ├── importance_tracker.py
│ ├── main_backup.py
│ ├── main_multiprocessing.py
│ ├── main_process.py
│ ├── main.py
│ ├── preprocessing.py
│ ├── results_analysis.py
│ ├── task_analysis.py
│ └── utils.py
├── .gitignore # Git ignore file
├── bayesian_folder_conv.py
├── LICENSE # License file
├── organize_results.ipynb # Jupyter notebook for results organization
├── README.md # Project documentation
└── requirements.txt # Project dependencies
- Bayesian Network Implementation:
  - bayesian_network.py: Core implementation of Bayesian Network model stacking
  - bayesian_net_importance.py: Feature importance analysis using Bayesian Networks
  - bayesian_net_importance_score.py: Scoring mechanisms for Bayesian Network features
- Model Management:
  - classification.py: Implementation of various classification models
  - hyperparameters.py: Hyperparameter optimization using Optuna
  - preprocessing.py: Data preprocessing and feature engineering
- Analysis and Tracking:
  - importance_tracker.py: Tracks feature importance across experiments
  - results_analysis.py: Analysis of experimental results
  - explainability.py: Model explainability tools
  - task_analysis.py: Task-specific analysis utilities
- Core Processing:
  - main.py: Main entry point for sequential processing
  - main_multiprocessing.py: Entry point for parallel processing
  - main_process.py: Core processing logic
  - utils.py: Utility functions used across the project
- Additional Tools:
  - organize_results.ipynb: Jupyter notebook for organizing and visualizing results
  - bayesian_folder_conv.py: Utilities for folder structure conversion
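To illustrate the general idea behind stacking with a Bayesian Network, here is a minimal sketch that combines the discrete predictions of several base classifiers using the simplest possible network structure: a naive Bayes model in which the true class is the only parent of each base prediction. This is not the implementation in bayesian_network.py. The class `NaiveBayesCombiner`, its methods, and the toy data are all hypothetical, and the project may learn a richer network structure.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCombiner:
    """Combine base-classifier labels with a naive Bayes network (illustrative only).

    The class node C is the parent of every base prediction p_i, so
    P(C | p_1..p_k) is proportional to P(C) * prod_i P(p_i | C).
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, base_preds, y):
        # base_preds: one row per sample, one predicted label per base model
        self.classes_ = sorted(set(y))
        # Every label value seen in the ground truth or in any prediction
        self.labels_ = sorted(set(y) | {p for row in base_preds for p in row})
        self.class_counts_ = Counter(y)
        n_models = len(base_preds[0])
        # counts_[i][c][p]: how often model i predicted p when the truth was c
        self.counts_ = [defaultdict(Counter) for _ in range(n_models)]
        for row, c in zip(base_preds, y):
            for i, p in enumerate(row):
                self.counts_[i][c][p] += 1
        return self

    def predict_proba(self, row):
        n = sum(self.class_counts_.values())
        log_post = {}
        for c in self.classes_:
            lp = math.log(self.class_counts_[c] / n)  # log prior
            for i, p in enumerate(row):
                num = self.counts_[i][c][p] + self.alpha
                den = self.class_counts_[c] + self.alpha * len(self.labels_)
                lp += math.log(num / den)  # smoothed log likelihood
            log_post[c] = lp
        # Normalise in log space for numerical stability
        m = max(log_post.values())
        z = sum(math.exp(v - m) for v in log_post.values())
        return {c: math.exp(v - m) / z for c, v in log_post.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)


# Toy held-out predictions from three hypothetical base models.
base_preds = [
    (1, 1, 0), (1, 1, 1), (0, 1, 1), (1, 0, 1),
    (0, 0, 1), (0, 0, 0), (1, 0, 0), (0, 1, 0),
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

combiner = NaiveBayesCombiner().fit(base_preds, y)
print(combiner.predict((1, 1, 1)))  # all three models vote for class 1
print(combiner.predict_proba((1, 0, 0)))
```

In a real stacking setup the base predictions used for fitting would come from held-out folds, so the combiner does not learn from predictions the base models made on their own training data.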
When installing new packages, update requirements.txt:
pip freeze > requirements.txt
Run the tests:
python -m pytest tests/
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- TODO
Emanuele Nardone - emanuele.nardone@unicas.it
Project Link: https://github.com/Narden91/bayesian-combining