A comprehensive implementation of various regression techniques using Python, including Simple Linear Regression, Multiple Linear Regression, Polynomial Regression, and advanced techniques like Ridge and Lasso Regression, as well as Logistic Regression for classification tasks. The project also includes a web application for deploying the models.
This project demonstrates different regression and classification analysis techniques:
- Linear Regression Techniques:
  - Simple Linear Regression (Height-Weight Analysis)
  - Multiple Linear Regression (Economic Index Analysis)
  - Polynomial Regression (Non-linear Data Modeling)
  - Ridge and Lasso Regression (Regularization Techniques with Forest Fires Analysis)
  - Model Training and Optimization (Forest Fire Prediction)
- Logistic Regression Classification:
  - Binary Classification
  - Multiclass Classification
  - Handling Imbalanced Datasets
  - ROC-AUC Evaluation
- Web Application for Model Deployment
- Clone the repository:
    git clone https://github.com/Abhinavexists/Regression.git
    cd Regression
- Create and activate a virtual environment:
    python -m venv .venv
    source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install required packages:
    pip install -r requirements.txt

Requirements:
- numpy (>= 1.21.0)
- pandas (>= 1.3.0)
- matplotlib (>= 3.4.0)
- seaborn (>= 0.11.0)
- scikit-learn (>= 1.0.0)
- jupyter (>= 1.0.0)
- ipykernel (>= 6.0.0)
- statsmodels (>= 0.13.0)
- flask (for web application)
- pickle (part of the Python standard library, no installation needed; used for model serialization)
Simple Linear Regression (Height-Weight Analysis):
- Analyzes the relationship between height and weight
- Applies feature standardization and evaluates the fitted model
- Includes R-squared and adjusted R-squared calculations
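
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it uses synthetic data instead of Height_Weight.csv, and the choice of weight as the feature and height as the target is an assumption made only for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: NOT the project's Height_Weight.csv).
rng = np.random.default_rng(42)
weight = rng.uniform(45, 100, size=(200, 1))                     # feature
height = 120 + 0.6 * weight[:, 0] + rng.normal(0, 3, size=200)   # target

X_train, X_test, y_train, y_test = train_test_split(
    weight, height, test_size=0.25, random_state=42
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LinearRegression().fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

# R-squared and adjusted R-squared on the held-out data.
r2 = r2_score(y_test, y_pred)
n_samples, n_features = X_test.shape
adjusted_r2 = 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

print(f"R-squared: {r2:.3f}, adjusted R-squared: {adjusted_r2:.3f}")
```

Adjusted R-squared penalizes the number of features, so for a single feature it sits only slightly below plain R-squared.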
Multiple Linear Regression (Economic Index Analysis):
- Analyzes economic indices including interest rates and unemployment rates
- Implements feature scaling and cross-validation
- Includes comprehensive model evaluation metrics
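
One way to combine feature scaling with cross-validation is to wrap both steps in a scikit-learn pipeline, so the scaler is refit inside every fold. The sketch below assumes synthetic data and hypothetical variable names (`interest_rate`, `unemployment_rate`, `index_price`); it does not read Economic_Index.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the economic data (assumption, hypothetical names).
rng = np.random.default_rng(0)
interest_rate = rng.uniform(1.5, 3.0, size=120)
unemployment_rate = rng.uniform(5.0, 6.5, size=120)
index_price = (
    800 + 300 * interest_rate - 150 * unemployment_rate
    + rng.normal(0, 30, size=120)
)

X = np.column_stack([interest_rate, unemployment_rate])

# Scaling lives inside the pipeline, so each CV fold scales on its own
# training portion and no information leaks from the validation portion.
pipeline = make_pipeline(StandardScaler(), LinearRegression())

scores = cross_val_score(
    pipeline, X, index_price, cv=5, scoring="neg_mean_squared_error"
)
cv_mse = -scores.mean()  # scikit-learn reports negated MSE for this scorer

print(f"5-fold CV mean squared error: {cv_mse:.1f}")
```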
Polynomial Regression (Non-linear Data Modeling):
- Demonstrates handling of non-linear relationships
- Implements polynomial feature transformation
- Includes model comparison with different polynomial degrees
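
A comparison across polynomial degrees might look like the sketch below. The quadratic synthetic data and the degrees tried (1, 2, 3) are assumptions for illustration, not values taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic quadratic data (assumption, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(0, 0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# Fit one model per degree and record its R-squared on held-out data.
test_r2 = {}
for degree in (1, 2, 3):
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    model.fit(X_train, y_train)
    test_r2[degree] = r2_score(y_test, model.predict(X_test))

for degree, score in test_r2.items():
    print(f"degree {degree}: test R-squared = {score:.3f}")
```

Scoring on a held-out split rather than the training data matters here, because training error alone always favors the highest degree.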
Ridge and Lasso Regression (Forest Fires Analysis):
- Implements L1 (Lasso) and L2 (Ridge) regularization techniques
- Handles multicollinearity in features
- Demonstrates parameter tuning using cross-validation
- Compares performance between Ridge and Lasso models
- Uses Algerian Forest Fires dataset for real-world application
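
As a rough sketch of the Ridge versus Lasso comparison, the example below tunes the regularization strength with scikit-learn's built-in `RidgeCV` and `LassoCV`. The data is synthetic, with two nearly collinear features to mimic multicollinearity; it is an assumption for illustration and not the Algerian Forest Fires dataset.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): x2 is almost a copy of x1, x4 is pure noise.
rng = np.random.default_rng(7)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)
x3 = rng.normal(size=n)
x4 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3, x4])
y = 3 * x1 + 2 * x3 + rng.normal(scale=0.5, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=7
)

# Both estimators pick alpha by cross-validation on the training split.
models = {
    "ridge": make_pipeline(StandardScaler(), RidgeCV(alphas=np.logspace(-3, 3, 13))),
    "lasso": make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    estimator = model[-1]  # final step of the pipeline
    results[name] = {
        "alpha": float(estimator.alpha_),
        "test_r2": r2_score(y_test, model.predict(X_test)),
        "coef": estimator.coef_,
    }
    print(name, results[name])
```

Comparing the `coef` arrays shows how each penalty distributes weight across the two collinear features and the noise feature.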
Model Training and Optimization (Forest Fire Prediction):
- Advanced model training techniques
- Hyperparameter optimization
- Model validation and testing
- Performance comparison across different regression techniques
- Forest fire prediction using weather and environmental factors
- Data preprocessing and feature engineering for real-world data
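
Hyperparameter optimization followed by validation on held-out data can be sketched with `GridSearchCV`. The example assumes synthetic data from `make_regression`, a Ridge model, and an arbitrary alpha grid; the project's actual model and search space may differ.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression problem (assumption, not the forest fires data).
X, y = make_regression(
    n_samples=300, n_features=8, n_informative=5, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipeline = Pipeline([("scaler", StandardScaler()), ("model", Ridge())])

# Hypothetical search grid; "model__alpha" targets the Ridge step by name.
param_grid = {"model__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

search = GridSearchCV(
    pipeline, param_grid, cv=5, scoring="neg_mean_absolute_error"
)
search.fit(X_train, y_train)  # cross-validation uses the training split only

best_alpha = search.best_params_["model__alpha"]
test_mae = mean_absolute_error(y_test, search.predict(X_test))

print(f"best alpha: {best_alpha}, test MAE: {test_mae:.2f}")
```

The test split is touched only once, after the search has finished, so the reported error is not biased by the tuning process.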
Logistic Regression Classification:
- Binary classification implementation
- Multiclass classification techniques
- Strategies for handling imbalanced datasets
- ROC curve analysis and AUC score evaluation
- Threshold optimization
- Probability calibration
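
A compact sketch of binary classification on an imbalanced dataset, with ROC-AUC evaluation and threshold selection, is shown below. The data is synthetic, `class_weight="balanced"` is just one of several possible imbalance strategies, and picking the threshold by Youden's J statistic (maximizing TPR minus FPR) is an assumption made for this example, not necessarily the notebook's method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (assumption): roughly 90% negatives, 10% positives.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# class_weight="balanced" reweights samples inversely to class frequency.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]  # probability of the positive class
auc = roc_auc_score(y_test, proba)

# Threshold selection via Youden's J statistic: maximize TPR - FPR.
fpr, tpr, thresholds = roc_curve(y_test, proba)
best_threshold = thresholds[np.argmax(tpr - fpr)]

y_pred = (proba >= best_threshold).astype(int)
cm = confusion_matrix(y_test, y_pred)

print(f"ROC-AUC: {auc:.3f}, chosen threshold: {best_threshold:.3f}")
print(cm)
```

In practice the threshold would be chosen on a validation split rather than on the test data; the test split is reused here only to keep the sketch short.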
Web Application:
- Flask-based web interface for model predictions
- User-friendly input form for forest fire prediction
- Deployment-ready configuration for AWS Elastic Beanstalk
- Saved models for quick inference
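
Saving a fitted model with `pickle` and loading it back for inference can be sketched as follows. The Ridge model, the synthetic data, and the file name `ridge.pkl` are hypothetical; the project's saved model files may be named and structured differently.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import Ridge

# Train a small stand-in model on synthetic data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
model = Ridge(alpha=1.0).fit(X, y)

# Round-trip through a file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "ridge.pkl"  # hypothetical file name
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

# The reloaded model should reproduce the original predictions.
predictions_match = np.allclose(
    model.predict(X[:5]), loaded_model.predict(X[:5])
)
print("predictions match:", predictions_match)
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code. If preprocessing such as a scaler was fitted during training, it needs to be saved and loaded alongside the model.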
- Make sure all dependencies are installed:
    pip install flask
- Run the application locally:
    python application.py
- Access the application in your browser at http://localhost:8080
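
For orientation, a Flask prediction service can be sketched as below. This is not the project's application.py: the `/predict` route, the JSON payload shape, the three-feature stand-in model, and the `run_local` helper are all hypothetical, and the real application serves an HTML input form rather than a JSON endpoint. Only port 8080 is taken from the instructions above.

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.linear_model import Ridge

app = Flask(__name__)

# Stand-in model trained on synthetic data (assumption); a real app would
# load a previously saved model instead.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5])
model = Ridge(alpha=1.0).fit(X_demo, y_demo)


@app.route("/predict", methods=["POST"])  # hypothetical route
def predict():
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    if not isinstance(features, list) or len(features) != 3:
        return jsonify(error="expected 'features': a list of 3 numbers"), 400
    try:
        row = np.array([features], dtype=float)
    except (TypeError, ValueError):
        return jsonify(error="features must be numeric"), 400
    return jsonify(prediction=float(model.predict(row)[0]))


def run_local():
    """Start the development server on the port used in this README."""
    app.run(host="0.0.0.0", port=8080)
```

Calling `run_local()` starts the development server; a request such as a POST to `/predict` with the body `{"features": [1.0, 0.0, 0.0]}` would then return a JSON prediction.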
The project includes configuration for deployment to AWS Elastic Beanstalk:
- Install the EB CLI:
    pip install awsebcli
- Initialize your EB application:
    eb init -p python-3.8 regression-app
- Create an environment and deploy:
    eb create regression-env
- Open the deployed application:
    eb open

All implementations include various evaluation metrics:
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- Root Mean Squared Error (RMSE)
- R-squared Score
- Adjusted R-squared Score
- Cross-validation Scores
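
The regression metrics listed above can be computed together in one small helper. The function name `regression_report` and the toy values are hypothetical, chosen so the results are easy to verify by hand.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, y_pred, n_features):
    """Return MSE, MAE, RMSE, R-squared and adjusted R-squared as a dict.

    Hypothetical helper for illustration; n_features is the number of
    predictors used by the model that produced y_pred.
    """
    n = len(y_true)
    mse = mean_squared_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    return {
        "mse": mse,
        "mae": mean_absolute_error(y_true, y_pred),
        "rmse": float(np.sqrt(mse)),
        "r2": r2,
        "adjusted_r2": 1 - (1 - r2) * (n - 1) / (n - n_features - 1),
    }


# Toy values: errors are 0.5, -0.5, 0.0 and -1.0.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 10.0])

report = regression_report(y_true, y_pred, n_features=1)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

For these values the squared errors sum to 1.5 and the total sum of squares is 20, giving MSE 0.375, MAE 0.5, R-squared 0.925 and adjusted R-squared 0.8875.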
Each notebook includes detailed visualizations:
- Scatter plots
- Correlation matrices
- Regression lines and decision boundaries
- Residual plots
- Pair plots
- Regularization path plots
- Learning curves
- ROC curves and confusion matrices
The project uses several datasets:
- Height_Weight.csv: Basic dataset for simple linear regression
- Economic_Index.csv: Economic indicators for multiple linear regression
- Algerian_forest_fires_dataset_UPDATE.csv: Original forest fires dataset
- Algerian_forest_fires_cleaned_dataset.csv: Preprocessed forest fires dataset for advanced regression techniques
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.