Full-stack ML application for estimating annual insurance charges with SHAP-based explainability and LLM-powered interpretation.
- Insurance Charges Prediction – AutoGluon tabular regression estimating annual insurance costs from demographic and health factors
- SHAP Explainability – Per-prediction feature contributions showing how each input affects the estimated charge
- LLM Interpretation – Human-readable explanation of predictions with headline, key factors, and caveats via OpenAI
- Extrapolation Warnings – Alerts when input values fall outside the model's training distribution
- Model Evaluation – Built-in reports with R², MAPE, and SMAPE metrics and business interpretation
- EDA Reports – Automated exploratory data analysis with visualizations in Markdown format
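
The three metrics named in the Model Evaluation item have standard definitions, sketched below in plain Python. This is an illustrative sketch, not the project's evaluation code: the function names are assumptions, and SMAPE has several variants in the literature (this one divides by the mean of the absolute actual and predicted values).

```python
from math import fsum


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean_y) ** 2 for t in y_true)
    # Undefined when all targets are identical (ss_tot == 0).
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no target is zero, which holds for positive insurance charges.
    """
    return 100.0 * fsum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred):
    """Symmetric MAPE, in percent (variant with (|y| + |y_hat|) / 2 in the denominator)."""
    return 100.0 * fsum(
        2.0 * abs(t - p) / (abs(t) + abs(p)) for t, p in zip(y_true, y_pred)
    ) / len(y_true)
```

MAPE and SMAPE read as "average percentage miss", which is usually easier to explain to a business audience than R².
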

The system uses an AutoGluon TabularPredictor trained on the US Health Insurance Dataset (1,300 records with age, sex, BMI, children, smoker status, and region). When a user submits parameters, the backend runs the prediction, checks for extrapolation beyond training bounds, computes SHAP feature contributions using a TreeExplainer, and generates a structured interpretation via GPT-4o-mini. The interpretation includes a headline, bullet-point explanations of key cost drivers, and caveats about model limitations.

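
The extrapolation check described above can be pictured as a comparison of each numeric input against the minimum and maximum seen during training. The sketch below is one plausible way to do this, not the project's implementation; `fit_bounds`, `extrapolation_warnings`, and the sample rows are hypothetical names and data introduced only for illustration.

```python
def fit_bounds(rows, numeric_features):
    """Record the (min, max) of each numeric feature across the training rows."""
    return {
        name: (min(row[name] for row in rows), max(row[name] for row in rows))
        for name in numeric_features
    }


def extrapolation_warnings(features, bounds):
    """Return one message per input value that lies outside its training range."""
    warnings = []
    for name, (low, high) in bounds.items():
        value = features.get(name)
        if value is None:
            continue  # feature not supplied; nothing to check
        if value < low or value > high:
            warnings.append(
                f"{name}={value} is outside the training range [{low}, {high}]"
            )
    return warnings


# Hypothetical training rows, used only to demonstrate the check.
training_rows = [
    {"age": 20, "bmi": 22.0, "children": 0},
    {"age": 60, "bmi": 35.0, "children": 3},
]
bounds = fit_bounds(training_rows, ["age", "bmi", "children"])
messages = extrapolation_warnings({"age": 70, "bmi": 30.0, "children": 1}, bounds)
```

A per-feature range check like this is cheap and easy to explain, but it does not catch unusual combinations of values that are each in range on their own.
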
| Category | Technologies |
|---|---|
| Backend | Python 3.13, FastAPI, Uvicorn |
| Frontend | TypeScript, Next.js, React, Tailwind CSS |
| AI/ML | AutoGluon, SHAP, OpenAI |
| Data | pandas, NumPy, scikit-learn, Pydantic |
| Package Management | uv (backend), pnpm (frontend) |
| Deployment | Docker, GitHub Actions, Google Artifact Registry |

```bash
# Backend
cd backend
cp .env.example .env  # Add your OpenAI API key
uv sync
uv run uvicorn insurance_pricing.main:app --reload

# Frontend (run in a second terminal, since the backend server stays in the foreground)
cd frontend
pnpm install
pnpm dev
```

Open http://localhost:3000 (frontend) and http://localhost:8000/docs (API docs).
