🎯 Machine Learning to Enhance Customer Retention
This project predicts customer churn in the telecom industry using machine learning models such as Decision Tree, XGBoost, and SVM. By identifying at-risk customers early, telecom companies can take proactive steps to retain them and reduce revenue loss. The end-to-end pipeline covers preprocessing, model building, evaluation, and deployment through a FastAPI service.
- Problem Statement
- Objective
- Challenges
- Project Lifecycle
- File Description
- Usage
- Tools and Technologies
- Success Criteria
- Expected Outcome
- References
- Connect With Me
Customer churn significantly impacts telecom business growth. Retaining existing customers is more cost-effective than acquiring new ones. This project aims to develop a churn prediction model using customer usage, demographics, and service data to enable personalized retention strategies.
- Predict whether a customer is likely to churn using ML algorithms
- Provide actionable insights into churn drivers
- Deploy a user-friendly prediction tool using FastAPI
- Imbalanced dataset: Fewer churners vs. non-churners
- Missing/inconsistent data in customer records
- Selecting the most impactful features
- Ensuring robust performance on unseen data
- Providing explainability for business decision-making
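One common way to address the class imbalance listed above is to weight each class inversely to its frequency. The sketch below is illustrative only: the helper name `balanced_class_weights` and the toy labels are assumptions, not part of this project's code. It uses the same `n_samples / (n_classes * class_count)` heuristic that scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels made up for illustration: 8 non-churners and 2 churners.
labels = ["No"] * 8 + ["Yes"] * 2
weights = balanced_class_weights(labels)
print(weights)  # {'No': 0.625, 'Yes': 2.5}
```

A dictionary like this could be passed as `class_weight` to scikit-learn's `DecisionTreeClassifier` or `SVC`. For XGBoost, the comparable knob is `scale_pos_weight`, typically set to the ratio of negative to positive samples.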
- Data Collection
  - Kaggle's Telco Customer Churn dataset
- Data Preprocessing
  - Handling nulls, encoding categoricals, scaling features
- Exploratory Data Analysis (EDA)
  - Visualizing churn rates by tenure, services, and demographics
- Model Building
  - Train Decision Tree, XGBoost, and SVM classifiers
- Model Evaluation
  - Accuracy, ROC-AUC, precision, recall, F1-score
- Model Deployment
  - FastAPI for churn prediction
- Monitoring & Maintenance
  - Model performance tracking and periodic updates
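To make the evaluation metrics named in the lifecycle concrete, here is a minimal, dependency-free sketch of how accuracy, precision, recall, F1-score, and ROC-AUC are computed. The function names and the toy labels and scores are assumptions made up for illustration; they are not taken from this repository.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores, positive=1):
    """ROC-AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data made up for illustration: 1 = churned, 0 = stayed.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.8, 0.05]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On an imbalanced churn dataset, accuracy alone can look good even when few churners are caught, which is why recall and ROC-AUC are worth reading alongside it. In practice, `sklearn.metrics` provides `accuracy_score`, `precision_score`, `recall_score`, `f1_score`, and `roc_auc_score` for the same quantities.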
- `Preprocessing.py`: Prepares and cleans raw input data
- `Train_model.py`: Trains models and saves the best one
- `predict.py`: Runs model inference on new/test data
- `API.py`: Interactive API
- `eda.ipynb`: Exploratory data analysis notebook
- `modeling.ipynb`: Model training and comparison notebook
- `evaluation.ipynb`: Evaluation metrics and visualizations
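- Prepare your `train-data.csv` and `test-data.csv`.
- Run preprocessing: `python Preprocessing.py`
- Train model: `python Train_model.py`
- Predict churn on test data: `python predict.py`
- Run API: `uvicorn main:app --reload` (this expects an `app` object in `main.py`; if the FastAPI app is defined in `API.py`, use `uvicorn API:app --reload` instead)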
- Achieve high classification accuracy and ROC-AUC score on test data.
- Demonstrate the model's ability to generalize well to unseen data.
- Provide actionable insights into churn drivers via EDA and feature importance.
- Deliver an easy-to-use FastAPI service for business users to predict churn on new data.
- Maintain reproducibility and code modularity for future enhancements.
Dataset: Telco Customer Churn, available for download from Kaggle.