Sales Forecasting with Explainable AI (XAI)

Executive Summary

This repository implements an integrated sales forecasting system for the Walmart Recruiting II: Sales in Stormy Weather Kaggle challenge. The task is to predict how severe weather affects sales of weather-sensitive products across stores in different locations. The training data comes from the official Kaggle competition page [https://www.kaggle.com/competitions/walmart-recruiting-sales-in-stormy-weather].

The implementation combines a centralized data warehouse with a structured Machine Learning Operations (MLOps) pipeline. PostgreSQL is the relational database. Data Build Tool (dbt) runs the SQL that transforms raw inputs into analytical dimensional models. The machine learning pipeline uses Optuna for hyperparameter optimization. A FastAPI application serves predictions, and a Streamlit dashboard visualizes the Explainable AI (XAI) interpretations.

Architectural Hierarchy

Each top-level directory has a single operational scope.

sales_forecasting_xai/
├── backend/                # Application Programming Interface network endpoints
├── data_pipeline/          # Database runtime and analytical query formulation
│   ├── dbt/                # Data Build Tool dimensional transformations
│   └── infra/              # Virtual container orchestration definitions
├── web_ui/                 # Graphical interface application elements
├── ml/                     # Machine learning algorithms and tuning matrices
└── shared/                 # Centralized parameter targets and temporary local storage

Each directory owns one responsibility.

Directory	Responsibility
data_pipeline	PostgreSQL infrastructure provisioning and Data Build Tool transformations
ml	Model training, hyperparameter tuning, and modeling configuration
shared	Centralized parameters shared across modules, plus temporary local storage
backend	FastAPI endpoints that serve predictions from the trained model
web_ui	Streamlit dashboard that visualizes the model explanations

Deployment Strategy

Required Dependencies

The host machine needs the following.

A container runtime environment
Python 3.10 or later
The uv package manager
Free network ports: 5432 for database access, 8000 for the backend, and 8501 for the frontend
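
Before starting the containers, it can help to confirm that the three ports listed above are not already taken. The following is a minimal sketch using only the Python standard library. The helper name port_is_free and the choice of 127.0.0.1 as the host are assumptions for illustration and are not part of this repository.

```python
import socket

# Ports named in the dependency list above.
REQUIRED_PORTS = {5432: "PostgreSQL", 8000: "backend", 8501: "frontend"}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts TCP connections on host:port.

    Hypothetical helper: it only detects listeners reachable on `host`.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for port, service in REQUIRED_PORTS.items():
        status = "free" if port_is_free(port) else "IN USE"
        print(f"{port} ({service}): {status}")
```

A port reported as in use usually means an earlier container or a local PostgreSQL instance is still running.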

Execution Sequence

Step 1. Start PostgreSQL.

cd data_pipeline/infra/postgres
docker compose up -d

Step 2. Prepare data and train the model. Run these commands from the project root.

cd ml
uv run python scripts/prepare_data.py
uv run python scripts/tune.py
uv run python scripts/train.py --best-params outputs/best_params.json
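
The tuning and training steps hand off through outputs/best_params.json. The repository's actual scripts are not reproduced here. As a minimal sketch, assuming the file holds a flat JSON object that maps hyperparameter names to values, a training script could read the --best-params flag as shown below. The function names parse_args and load_best_params are hypothetical.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line; only --best-params comes from the README."""
    parser = argparse.ArgumentParser(description="Train the forecasting model.")
    parser.add_argument(
        "--best-params",
        type=Path,
        required=True,
        help="JSON file written by the tuning step",
    )
    return parser.parse_args(argv)


def load_best_params(path: Path) -> dict:
    """Load tuned hyperparameters (assumed to be a flat JSON object)."""
    with path.open(encoding="utf-8") as handle:
        params = json.load(handle)
    if not isinstance(params, dict):
        raise ValueError(f"{path} must contain a JSON object")
    return params
```

Failing early on a malformed file keeps a bad tuning run from silently training with default parameters.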

Step 3. Start backend and frontend from the project root.

docker compose up -d

Backend API: http://localhost:8000 (interactive docs at /docs)
Dashboard: http://localhost:8501
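
Containers started with -d may take a few seconds before they accept requests. The sketch below polls a URL until it responds, using only the standard library. The function name wait_until_up and its timeout defaults are assumptions for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers without a server error, or time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status < 500:
                    return True
        except urllib.error.HTTPError as exc:
            # A 4xx answer still proves the server is running.
            if exc.code < 500:
                return True
        except (urllib.error.URLError, OSError):
            pass  # Not accepting connections yet.
        time.sleep(interval_s)
    return False


# Example, using the URLs listed above:
#   wait_until_up("http://localhost:8000/docs")
#   wait_until_up("http://localhost:8501")
```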

Environment Configuration

The application reads its database credentials from the environment file in the project root (the repository includes a .env.example).

Environment Variable	Description
POSTGRES_USER	Username for PostgreSQL authentication
POSTGRES_PASSWORD	Password for PostgreSQL access
POSTGRES_DB	Database name
POSTGRES_HOST	Database host network address
POSTGRES_PORT	Database communication port
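
To illustrate how these variables fit together, the sketch below assembles a PostgreSQL connection URL from them. It is not taken from the repository. The function name postgres_url and the fallback values localhost and 5432 are assumptions. The user, password, and database name are percent-encoded so that special characters do not break the URL.

```python
import os
from collections.abc import Mapping
from urllib.parse import quote


def postgres_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a postgresql:// URL from the variables in the table above."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    database = quote(env["POSTGRES_DB"], safe="")
    # Fallbacks are assumptions for local development only.
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Passing a plain dictionary instead of os.environ makes the function easy to exercise without touching the real environment.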

Strategic Decisions

The architecture reflects the following engineering decisions.

PostgreSQL and Data Build Tool. PostgreSQL provides a standard relational engine for data persistence. Data Build Tool keeps the SQL transformations idempotent and testable.
Optuna. Optuna searches the hyperparameter space with adaptive sampling instead of an exhaustive grid search.
FastAPI and Streamlit. FastAPI supports asynchronous request handling, so it can serve simultaneous client connections. Streamlit renders the Explainable Artificial Intelligence outputs as charts.

Navigation Guide

The repository is organized into the following modules.

Application Frontend. Covers the interactive components and the Explainable Artificial Intelligence views.
Application Backend. Covers the prediction endpoints.
Machine Learning Logic. Describes the pipeline that builds the LightGBM models.
Database Infrastructure. Describes the containerized PostgreSQL environment.
Analytical Models. Describes the Data Build Tool SQL models.
Shared Resources. Lists the centralized parameters and local staging files.
