Demand Forecast

A production-ready demand forecasting system built on Transformer neural networks. It predicts SKU-level demand with confidence intervals and supports time series clustering with a separate model per cluster, categorical feature embeddings, and several architectures: an advanced model that borrows from TFT and PatchTST, and lightweight models for CPU deployment.

Features

- Multiple Model Architectures:
  - Standard: Transformer encoder-decoder with optional improvements (RoPE, Pre-LN, FiLM, stochastic depth)
  - Advanced (V2): Research-grade model with TFT-style variable selection, PatchTST embeddings, and quantile forecasting
  - Lightweight: CPU-optimized models (TCN, MLP-Mixer) for edge deployment with ONNX export
- Hyperparameter Tuning: Optuna integration for automated hyperparameter optimization
- Multi-Cluster Training: Automatic time series clustering with separate models per cluster
- Categorical Embeddings: Dynamic embedding layers for product attributes (color, size, category, etc.)
- Uncertainty Quantification: Quantile forecasting and confidence intervals
- CLI Interface: Easy-to-use command-line tools for training, evaluation, tuning, and prediction
- Configurable Pipeline: YAML-based configuration for all hyperparameters
- Production Ready: Proper logging, error handling, type hints, and comprehensive test suite

Quick Start

Installation

# Clone the repository
git clone https://github.com/alessiosavi/demand-forecast.git
cd demand-forecast

# Install in development mode
pip install -e ".[dev]"

# For ONNX export support (lightweight models)
pip install -e ".[deploy]"

Generate Sample Data

demand-forecast generate-data sample_data.csv --products 50 --stores 10 --days 730

Train a Model

# Copy and customize the example configuration
cp config.example.yaml config.yaml

# Train the standard model
demand-forecast train --config config.yaml

# Train with advanced model
demand-forecast train --config config.yaml --model-type advanced

# Train with lightweight model for CPU deployment
demand-forecast train --config config.yaml --model-type lightweight

Hyperparameter Tuning

# Run Optuna hyperparameter tuning
demand-forecast tune --config config.yaml --n-trials 50 --timeout 3600

Generate Predictions

demand-forecast predict models/model.pt input_data.csv --config config.yaml --output predictions.csv

Model Architectures

Standard Model (`AdvancedDemandForecastModel`)

Classic Transformer encoder-decoder with optional modern improvements:

| Feature | Description |
|---|---|
| RoPE | Rotary Position Embeddings for better sequence modeling |
| Pre-LayerNorm | Improved training stability |
| FiLM Conditioning | Feature-wise Linear Modulation for static features |
| Stochastic Depth | Regularization through random layer dropping |
| Improved Head | GELU activation in output projection |
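
To make the RoPE entry more concrete, here is a minimal pure-Python sketch of rotary position embeddings. It is not the project's `RotaryPositionEmbedding` code: the helper names `rope` and `dot` and the frequency base of 10000 are assumptions chosen for illustration. The point it shows is that, after rotation, the attention score between a query and a key depends only on their relative offset.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles that grow with `position`."""
    assert len(vec) % 2 == 0, "RoPE needs an even embedding dimension"
    dim = len(vec)
    rotated = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        rotated.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return rotated


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


query = [0.5, -1.0, 2.0, 0.25]
key = [1.5, 0.5, -0.75, 1.0]

# Both pairs of positions are 2 steps apart, so the scores should match.
score_near_start = dot(rope(query, 3), rope(key, 1))
score_shifted = dot(rope(query, 10), rope(key, 8))
```

Because only the offset matters, the same pattern (for example, "one seasonal cycle ago") produces the same score wherever it occurs in the lookback window.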

Advanced Model V2 (`AdvancedDemandForecastModelV2`)

Research-oriented architecture combining techniques from TFT, PatchTST, and Autoformer:

- Variable Selection Networks (VSN): TFT-style feature importance learning
- Gated Residual Networks (GRN): Enhanced information flow
- Patch Embedding: PatchTST-style time series tokenization
- Series Decomposition: Autoformer-style trend/seasonality separation
- Quantile Output: Probabilistic forecasting with uncertainty
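
Two of the ideas listed above, patch tokenization and trend/seasonality decomposition, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the project's `PatchEmbedding` or `SeriesDecomposition` code: the function names, the edge-replication padding, and the sample series are all hypothetical.

```python
def make_patches(series, patch_len, stride):
    """Cut a series into fixed-length patches; each patch becomes one token."""
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def decompose(series, kernel_size):
    """Split a series into a moving-average trend and the remainder."""
    left = (kernel_size - 1) // 2
    right = kernel_size - 1 - left
    # Repeat the edge values so the trend has the same length as the input.
    padded = [series[0]] * left + list(series) + [series[-1]] * right
    trend = [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(series))
    ]
    seasonal = [value - level for value, level in zip(series, trend)]
    return trend, seasonal


weekly_demand = [10, 12, 14, 12, 10, 12, 14, 12]

# Overlapping patches of 4 weeks, moving 2 weeks at a time.
patches = make_patches(weekly_demand, patch_len=4, stride=2)

# The trend is a smoothed level; the seasonal part is what is left over.
trend, seasonal = decompose(weekly_demand, kernel_size=3)
```

Patching shortens the sequence the attention layers see, while decomposition lets the model treat the slow-moving level and the repeating pattern separately.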

Lightweight Models

CPU-optimized architectures for edge deployment:

| Model | Description | Parameters |
|---|---|---|
| `LightweightDemandModel` | TCN + FiLM conditioning | < 1M |
| `LightweightMixerModel` | MLP-Mixer architecture | < 500K |

Features:

- ONNX export for optimized inference
- TorchScript compilation
- INT8 quantization support
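
The TCN backbone mentioned for `LightweightDemandModel` relies on dilated causal convolutions. The sketch below shows the idea in plain Python; it is not the project's `TemporalConvNet`, and the names `causal_conv1d` and `receptive_field`, as well as the dilation schedule, are assumptions for illustration.

```python
def causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the series are treated as zero, so the
    output at time t never depends on values after t.
    """
    output = []
    for t in range(len(x)):
        acc = 0.0
        for k, weight in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += weight * x[idx]
        output.append(acc)
    return output


def receptive_field(kernel_size, dilations):
    """Number of past steps visible after stacking one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# With dilation 2, the impulse reappears two steps later and never earlier.
response = causal_conv1d(impulse, weights=[1.0, 1.0], dilation=2)

# Doubling the dilation per layer grows the visible history quickly.
history_steps = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
```

Doubling dilations is a common way to cover a long lookback with few layers, which keeps the parameter count low for CPU inference.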

Architecture Overview

demand_forecast/
├── cli.py                 # Typer CLI commands
├── config/settings.py     # Pydantic configuration models
├── core/
│   ├── pipeline.py        # Main orchestration
│   ├── trainer.py         # Training loop with callbacks
│   ├── evaluator.py       # Validation and metrics
│   └── tuning.py          # Optuna hyperparameter tuning
├── data/
│   ├── loader.py          # Data loading with validation
│   ├── preprocessor.py    # Scaling, filtering, resampling
│   ├── feature_engineering.py  # Categorical encoding
│   └── dataset.py         # PyTorch Dataset
├── models/
│   ├── transformer.py     # Standard model with improvements
│   ├── transformer_v2.py  # Advanced research-grade model
│   ├── lightweight.py     # CPU-optimized models
│   ├── components.py      # Reusable building blocks
│   ├── losses.py          # Loss functions (Huber, quantile, SMAPE)
│   └── wrapper.py         # Multi-cluster ModelWrapper + factory
├── inference/
│   ├── predictor.py       # Prediction wrapper
│   └── confidence.py      # Confidence interval calculation
└── utils/                 # Utilities (clustering, metrics, etc.)

Configuration

The system is configured via YAML files. See config.example.yaml for all options:

data:
  input_path: "sales_data.csv"
  resample_period: "1W"        # Weekly aggregation
  max_zeros_ratio: 0.7         # Filter sparse SKUs

timeseries:
  window: 52                   # 52-week lookback
  n_out: 16                    # 16-week forecast horizon

model:
  model_type: "standard"       # standard, advanced, or lightweight
  d_model: 256                 # Transformer dimension
  nhead: 8                     # Attention heads
  num_encoder_layers: 4
  num_decoder_layers: 4
  dropout: 0.3
  # Optional improvements
  use_rope: false              # Rotary Position Embeddings
  use_pre_layernorm: false     # Pre-LN for stability
  use_film_conditioning: false # FiLM for static features
  stochastic_depth_rate: 0.0   # Stochastic depth regularization

training:
  num_epochs: 10
  batch_size: 128
  learning_rate: 0.00001
  early_stop_patience: 3

tuning:
  enabled: false
  n_trials: 50
  timeout: 3600                # 1 hour timeout
  metric: "mse"                # Optimize MSE
  sampler: "tpe"               # TPE sampler
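
As a rough illustration of how `max_zeros_ratio`, `window`, and `n_out` interact, the sketch below filters a sparse series and slices the rest into training samples. It is a simplified assumption about the preprocessing, not the project's `preprocessor.py` or `dataset.py`; the function names and the synthetic series are hypothetical.

```python
def zeros_ratio(series):
    """Fraction of periods with zero sales."""
    return sum(1 for value in series if value == 0) / len(series)


def sliding_windows(series, window, n_out):
    """Pair each `window`-long history with the `n_out` values that follow."""
    samples = []
    for start in range(len(series) - window - n_out + 1):
        history = series[start:start + window]
        target = series[start + window:start + window + n_out]
        samples.append((history, target))
    return samples


# Two years of synthetic weekly sales for one SKU.
weekly_sales = [float(week % 13) for week in range(104)]

MAX_ZEROS_RATIO = 0.7  # mirrors data.max_zeros_ratio
WINDOW = 52            # mirrors timeseries.window
N_OUT = 16             # mirrors timeseries.n_out

if zeros_ratio(weekly_sales) <= MAX_ZEROS_RATIO:
    samples = sliding_windows(weekly_sales, WINDOW, N_OUT)
else:
    samples = []  # the SKU would be dropped as too sparse
```

A series needs at least `window + n_out` periods (68 weeks with the values above) to yield a single sample, which is worth keeping in mind when choosing `resample_period`.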

CLI Commands

| Command | Description |
|---|---|
| `train` | Train the demand forecasting model |
| `evaluate` | Evaluate a trained model on test data |
| `predict` | Generate predictions with confidence intervals |
| `tune` | Run Optuna hyperparameter tuning |
| `generate-data` | Generate synthetic sales data for testing |
| `preprocess` | Preprocess raw data and save as Parquet |
| `version` | Show version information |

Common Options

All commands support --verbose (-v) for debug logging.

Training with visualization:

demand-forecast train --config config.yaml --plot --plot-dir plots/training

Hyperparameter tuning:

demand-forecast tune --config config.yaml --n-trials 100 --metric mae

Evaluation with metrics export:

demand-forecast evaluate models/model.pt data.csv --config config.yaml \
    --output metrics.json --plot

Prediction with confidence intervals and plots:

demand-forecast predict models/model.pt data.csv --config config.yaml \
    --output predictions.csv --confidence 0.95 --plot
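
For intuition about what a `--confidence 0.95` band means, here is one common way to build an interval from validation residuals. This README does not document how `inference/confidence.py` computes its intervals, so treat this as a generic sketch that assumes roughly zero-mean, normally distributed forecast errors; the function name and sample numbers are hypothetical.

```python
from statistics import NormalDist, pstdev


def normal_interval(forecast, residuals, confidence=0.95):
    """Return (lower, upper) bounds around each forecast value."""
    # Two-sided interval: split the remaining probability across both tails.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    spread = z * pstdev(residuals)
    return [(value - spread, value + spread) for value in forecast]


forecast = [120.0, 135.0, 128.0]
validation_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]

bounds_95 = normal_interval(forecast, validation_residuals, confidence=0.95)
bounds_80 = normal_interval(forecast, validation_residuals, confidence=0.80)
```

A higher confidence level widens the band. Quantile forecasting, which the advanced model supports, avoids the normality assumption by predicting the bounds directly.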

Run demand-forecast --help for detailed usage information.

Hyperparameter Tuning

The system includes Optuna integration for automated hyperparameter optimization:

from demand_forecast.core.tuning import HyperparameterTuner, TuningConfig, SearchSpace

# Define search space
search_space = SearchSpace(
    d_model=[64, 128, 256],
    nhead=[4, 8],
    num_layers=(1, 6),
    dropout=(0.1, 0.5),
    learning_rate=(1e-5, 1e-3),
)

# Configure tuning
config = TuningConfig(
    n_trials=50,
    timeout=3600,
    metric="mse",
    direction="minimize",
)

# Run tuning (train_data and val_data are assumed to be prepared beforehand)
tuner = HyperparameterTuner(config, search_space)
best_params = tuner.tune(train_data, val_data)

Or use the convenience function:

from demand_forecast.core.tuning import quick_tune

best_params = quick_tune(
    train_dataloader=train_dl,
    val_dataloader=val_dl,
    n_trials=20,
)
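
The search space in the first example mixes lists and tuples. The sketch below assumes that lists are categorical choices and tuples are `(low, high)` bounds, which is a guess at the `SearchSpace` convention rather than documented behavior. It draws candidates with plain random sampling instead of Optuna's TPE sampler, purely to show what one trial's parameters might look like; `sample_params` and `LOG_SCALE` are hypothetical names.

```python
import math
import random

# Parameters assumed to be sampled on a log scale (an illustrative choice).
LOG_SCALE = {"learning_rate"}


def sample_params(space, rng):
    """Draw one candidate configuration from the search space."""
    params = {}
    for name, spec in space.items():
        if isinstance(spec, list):
            # List: pick one of the listed values.
            params[name] = rng.choice(spec)
        elif all(isinstance(bound, int) for bound in spec):
            # Integer tuple: inclusive integer range.
            params[name] = rng.randint(spec[0], spec[1])
        elif name in LOG_SCALE:
            low, high = spec
            params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            params[name] = rng.uniform(spec[0], spec[1])
    return params


space = {
    "d_model": [64, 128, 256],
    "nhead": [4, 8],
    "num_layers": (1, 6),
    "dropout": (0.1, 0.5),
    "learning_rate": (1e-5, 1e-3),
}

rng = random.Random(0)
candidates = [sample_params(space, rng) for _ in range(5)]
```

Multi-head attention requires `d_model` to be divisible by `nhead`; every combination in this space satisfies that, which is worth checking if you widen either list.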

Model Details

Transformer Encoder-Decoder

The model uses a Transformer architecture optimized for time series:

- Static Embeddings: SKU and categorical features are embedded and projected
- Encoder: Processes historical sales with positional encoding
- Decoder: Generates forecasts with causal masking and cross-attention
- Multi-Cluster: Separate model per cluster for heterogeneous demand patterns
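
The causal masking used by the decoder can be sketched as follows. This is a plain-Python illustration, not the project's code: `causal_mask` and `masked_softmax` are hypothetical helpers, and the scores are made up.

```python
import math


def causal_mask(size):
    """mask[i][j] is True when forecast step i may attend to step j."""
    return [[j <= i for j in range(size)] for i in range(size)]


def masked_softmax(scores, allowed):
    """Softmax over `scores`, giving zero weight to disallowed positions."""
    masked = [s if keep else -math.inf for s, keep in zip(scores, allowed)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


mask = causal_mask(3)

# Step 1 may see steps 0 and 1 only, so the large score at step 2 is ignored.
weights_step_1 = masked_softmax([1.0, 1.0, 5.0], mask[1])
```

Without the mask, the decoder could read future target values during training and would then fail at inference time, when those values are not available.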

New Components

| Component | Description |
|---|---|
| `RotaryPositionEmbedding` | RoPE for better long-range dependencies |
| `GatedResidualNetwork` | TFT-style gated residual connections |
| `VariableSelectionNetwork` | Learnable feature importance |
| `PatchEmbedding` | PatchTST-style time series tokenization |
| `SeriesDecomposition` | Autoformer trend/seasonality separation |
| `FiLMConditioning` | Feature-wise Linear Modulation |
| `TemporalConvNet` | Dilated causal convolutions |
| `InterpretableMultiHeadAttention` | TFT-style interpretable attention |

Loss Functions

- HuberLoss: Robust to outliers (default)
- QuantileLoss: For probabilistic forecasting
- CombinedForecastLoss: Huber + quantile + decomposition
- SMAPELoss: Symmetric MAPE for percentage errors
- MASELoss: Mean Absolute Scaled Error
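
For reference, the textbook definitions behind four of these losses fit in a few lines of plain Python. These are not the project's `losses.py` implementations, which operate on tensors and may differ in reduction or scaling; the function names and default arguments are assumptions for illustration.

```python
def huber(y_true, y_pred, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        err = abs(actual - predicted)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


def pinball(y_true, y_pred, quantile):
    """Quantile loss: under- and over-forecasts are penalized asymmetrically."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        diff = actual - predicted
        total += max(quantile * diff, (quantile - 1) * diff)
    return total / len(y_true)


def smape(y_true, y_pred, eps=1e-8):
    """Symmetric MAPE in [0, 2]; eps guards against division by zero."""
    terms = [
        2 * abs(a - p) / (abs(a) + abs(p) + eps)
        for a, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


def mase(y_true, y_pred, history, season=1):
    """MAE scaled by the in-sample error of a seasonal naive forecast."""
    naive_errors = [
        abs(history[i] - history[i - season])
        for i in range(season, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)
    return mae / scale


# At the 0.9 quantile, forecasting too low costs more than forecasting too high.
under_forecast = pinball([10.0], [8.0], quantile=0.9)
over_forecast = pinball([10.0], [12.0], quantile=0.9)
```

A MASE below 1 means the model beats the naive forecast on average, which makes it a convenient cross-SKU comparison when demand scales differ widely.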

Clustering

Time series are clustered using K-means on TSFresh meta-features. The optimal K is selected via the elbow method with Davies-Bouldin Index validation.
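
The Davies-Bouldin Index used for validation compares within-cluster scatter to between-cluster separation, and lower values indicate better-separated clusters. The sketch below computes it in plain Python for a given labeling; it is not the project's clustering utility, the helper names are hypothetical, and the 2-D points merely stand in for TSFresh meta-features.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def davies_bouldin(points, labels):
    """Average, over clusters, of the worst scatter-to-separation ratio.

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    centers = {label: centroid(members) for label, members in clusters.items()}
    scatter = {
        label: sum(math.dist(p, centers[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


features = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Grouping nearby points gives a low index; mixing them gives a high one.
good_score = davies_bouldin(features, [0, 0, 1, 1])
bad_score = davies_bouldin(features, [0, 1, 0, 1])
```

When scanning candidate values of K, the index offers a second opinion on the elbow: a K whose clusters overlap heavily will score noticeably higher than one that separates the series cleanly.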

Development

Setup Development Environment

# Install with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

# Run linting
ruff check .
mypy demand_forecast/

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=demand_forecast

# Run specific test categories
pytest tests/test_models/           # Model tests
pytest tests/test_models/test_tuning.py  # Tuning tests

Project Structure

demand-forecast/
├── demand_forecast/       # Main package
├── tests/                 # Test suite
│   ├── test_models/       # Model and tuning tests
│   ├── test_core/         # Core module tests
│   └── test_utils/        # Utility tests
├── docs/                  # Documentation
├── config.example.yaml    # Example configuration
├── pyproject.toml         # Project metadata
└── Makefile               # Development commands

Documentation

- Architecture Guide - System design and component details
- API Reference - Module and class documentation
- Configuration Guide - All configuration options
- Quick Start Guide - Step-by-step tutorial
- Development Guide - Contributing and development setup

Requirements

- Python 3.10+
- PyTorch 2.0+
- Optuna 3.0+ (for hyperparameter tuning)
- CUDA (optional, for GPU acceleration)

See pyproject.toml for complete dependency list.

License

MIT License - see LICENSE for details.

Citation

If you use this software in your research, please cite:

@software{demand_forecast,
  title = {Demand Forecast: Transformer-based Time Series Forecasting},
  year = {2025},
  url = {https://github.com/alessiosavi/demand-forecast}
}

Acknowledgments

- Built with PyTorch
- Hyperparameter tuning with Optuna
- Configuration via Pydantic
- CLI powered by Typer
- Feature extraction with TSFresh
