This project provides an Automated Machine Learning (AutoML) application built with Streamlit. Users can upload a dataset in CSV or XLSX format, select a model, and train it with automatic preprocessing; the app then reports evaluation metrics. The trained model and its performance metrics can be downloaded for further use.
- Supports multiple models:
  - Random Forest (Classification & Regression)
  - Logistic Regression (Classification)
  - Support Vector Machine (SVM) (Classification & Regression)
  - Linear Regression (Regression)
- Automatic Preprocessing:
  - Handles missing values
  - Encodes categorical variables
  - Scales numerical features
- Performance Metrics:
  - Classification: Accuracy, Precision, Recall, F1-score, Confusion Matrix
  - Regression: Mean Squared Error (MSE), R² Score
- Visualizations:
  - Confusion matrix for classification
- Downloadable Outputs:
  - Trained model (`.pkl` file)
  - Model performance report (`metrics.txt`)
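
The preprocessing and metrics listed above could be wired together roughly as follows. This is a minimal sketch, not the code in `app.py`: the helper names `build_pipeline` and `evaluate`, the median and most-frequent imputation strategies, one-hot encoding, standard scaling, weighted averaging of precision/recall/F1, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(X: pd.DataFrame) -> Pipeline:
    """Hypothetical helper: impute, encode, and scale, then fit a random forest."""
    numeric_cols = X.select_dtypes(include="number").columns.tolist()
    categorical_cols = [c for c in X.columns if c not in numeric_cols]

    # Numeric columns: fill gaps with the median, then standardize.
    numeric_steps = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical columns: fill gaps with the most frequent value, then one-hot encode.
    categorical_steps = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
    ])


def evaluate(y_true, y_pred) -> dict:
    """Hypothetical helper: compute the classification metrics listed above."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


# Synthetic data (an assumption) with missing values in both column types.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "length": rng.normal(5.0, 1.0, n),
    "color": rng.choice(["red", "green", "blue"], n),
})
df["label"] = (df["length"] > 5.0).astype(int)
df.loc[::10, "length"] = np.nan
df.loc[5::10, "color"] = np.nan

X, y = df[["length", "color"]], df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

pipeline = build_pipeline(X_train).fit(X_train, y_train)
metrics = evaluate(y_test, pipeline.predict(X_test))
print(metrics)
```

Keeping preprocessing and the model in one `Pipeline` means the saved model can be applied to raw data later without repeating the preprocessing by hand.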
Make sure you have Python installed (>=3.7). You can install the required dependencies using:

    pip install -r requirements.txt

Run the application with:

    streamlit run app.py

- Upload a dataset (CSV/XLSX format).
- Select a model type (Random Forest, Logistic Regression, SVM, or Linear Regression).
- Click "Train Model" to start training.
- View the metrics and confusion matrix (for classification models).
- Download the trained model and performance report.
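
Outside the Streamlit UI, the upload and download steps above amount to reading a file by extension and writing two artifacts. The sketch below illustrates that flow under stated assumptions: `load_dataset`, `save_outputs`, and the file name `model.pkl` are hypothetical (only `metrics.txt` and the `.pkl` extension come from this README), and reading `.xlsx` files with pandas requires an Excel engine such as `openpyxl`.

```python
import tempfile
from pathlib import Path
from typing import Tuple

import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score


def load_dataset(path) -> pd.DataFrame:
    """Hypothetical helper: read a CSV or XLSX file into a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix == ".xlsx":
        return pd.read_excel(path)  # needs an Excel engine, e.g. openpyxl
    raise ValueError(f"Unsupported file type: {suffix}")


def save_outputs(model, metrics: dict, out_dir) -> Tuple[Path, Path]:
    """Hypothetical helper: write the model and a plain-text metrics report."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    model_path = out / "model.pkl"      # file name is an assumption
    report_path = out / "metrics.txt"
    joblib.dump(model, str(model_path))
    lines = [f"{name}: {value:.4f}" for name, value in metrics.items()]
    report_path.write_text("\n".join(lines) + "\n")
    return model_path, report_path


# Tiny made-up regression dataset, written to a temporary directory.
work_dir = Path(tempfile.mkdtemp())
csv_path = work_dir / "toy.csv"
pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0], "y": [2.1, 3.9, 6.2, 7.8]}
).to_csv(csv_path, index=False)

df = load_dataset(csv_path)
X, y = df[["x"]], df["y"]
model = LinearRegression().fit(X, y)
pred = model.predict(X)
metrics = {"mse": mean_squared_error(y, pred), "r2": r2_score(y, pred)}

model_path, report_path = save_outputs(model, metrics, work_dir)

# The downloaded model can be restored later with joblib.load.
restored = joblib.load(str(model_path))
print(report_path.read_text())
```

Only load `.pkl` files from sources you trust, since unpickling can execute arbitrary code.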
For testing, you can use datasets such as:
- Iris Dataset (for classification)
- Boston Housing Dataset (for regression)
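
One way to produce test files for upload is to export scikit-learn's bundled datasets to CSV. This is a sketch under assumptions: the output file names are hypothetical, and because the Boston Housing loader was removed from scikit-learn in version 1.2, the bundled diabetes dataset is substituted here as the regression example. A Boston Housing CSV obtained elsewhere can be uploaded in the same way.

```python
from pathlib import Path

from sklearn.datasets import load_diabetes, load_iris

out_dir = Path(".")  # write next to the current working directory

# Classification: 150 rows, 4 feature columns plus a "target" column.
iris = load_iris(as_frame=True).frame
iris.to_csv(out_dir / "iris.csv", index=False)

# Regression (stand-in for Boston Housing): 442 rows, 10 features plus "target".
diabetes = load_diabetes(as_frame=True).frame
diabetes.to_csv(out_dir / "diabetes.csv", index=False)

print(iris.shape, diabetes.shape)
```

In both files the label is stored in the `target` column.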
- Streamlit (UI framework)
- Scikit-Learn (Machine learning models)
- Pandas & NumPy (Data handling)
- Seaborn & Matplotlib (Visualization)
- Joblib (Model serialization)
This project is open-source and licensed under the MIT License.
Feel free to contribute by submitting issues or pull requests!
Developed by Tharun Bala.