A comprehensive machine learning system for predicting river discharge at long term (monthly to seasonal) timescales using ensemble methods and advanced feature engineering.
Author: Sandro Hunziker
Status: Active – ongoing development as part of the SAPPHIRE Forecast Tools
The Long Term Discharge Forecasting System implements a modular pipeline that integrates multiple data sources, employs various machine learning algorithms, and uses ensemble techniques to produce robust discharge predictions. The system is designed for operational forecasting in Central Asian basins with complex hydrology influenced by snow and glacier dynamics. The code was developed in the context of the SAPPHIRE project, funded by the Swiss Agency for Development and Cooperation, and forms the backbone of the machine-learning-based long-term forecasting component of the SAPPHIRE Forecast Tools. Note that this repository is still a work in progress.
The diagram above illustrates the complete workflow from data ingestion to operational forecasts, including feature engineering, model training, and ensemble generation.
- Multiple Model Families: Linear regression baselines and gradient boosted tree-based models (XGBoost, LightGBM, CatBoost)
- Advanced Feature Engineering: Time-series features, elevation band aggregation, and glacier-related features
- Ensemble Methods: Naive averaging, temporal meta-models, and uncertainty quantification
- Comprehensive Evaluation: Interactive dashboards and extensive metrics
- Modular Architecture: Easy to extend with new models and data sources
- Model Descriptions - Detailed model specifications and ensemble strategies
- Feature Engineering - Feature extraction and preprocessing pipelines
- Adding New Models - Guide for implementing custom forecast models
- Development & Production Plan - Workflow for package integration
- Data Processing - Data loading and feature engineering utilities
- Model Implementations - Individual model class details
- Evaluation Pipeline - Metrics and evaluation workflows
- Visualization Dashboard - Interactive dashboard guide
- Testing Guide - Test suite organization and best practices
For using it as a package (as in the SAPPHIRE Forecast Tools):

pip install git+https://github.com/hydrosolutions/long-term-forecasting.git@vX.X.X

Prerequisites:
- Python 3.11
- uv package manager
# Clone the repository
git clone [repository-url]
cd lt_forecasting
# Install uv if not already installed
pip install uv
# Install project dependencies
uv sync

# Run calibration and hindcast for a model
uv run python scripts/calibrate_hindcast.py --config_path example_config/DUMMY_MODEL
# Run hyperparameter tuning
uv run python scripts/tune_hyperparams.py --config_path example_config/DUMMY_MODEL
# Run complete evaluation pipeline
./run_evaluation_pipeline.sh
# Launch interactive dashboard
uv run python -m dev_tools.visualization.dashboard

The system follows a modular approach where different forecasting classes implement similar interfaces but handle different model types:
- LINEAR_REGRESSION: Statistical baseline with period-specific models
- SciRegressor: ML models (XGBoost, LightGBM, CatBoost)
This modular design enables:
- Efficient ensemble creation (process data once, train multiple models)
- Flexible feature combinations
- Easy addition of new model types
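The shared interface can be sketched roughly as follows. Class and method names here are illustrative, not the actual `base_class.py` API, and the real classes operate on dataframes rather than plain lists:

```python
# Illustrative sketch of a common forecast-model interface
# (hypothetical names, not the actual base_class.py API).
from abc import ABC, abstractmethod


class ForecastModel(ABC):
    """Common interface that all forecast model classes implement."""

    @abstractmethod
    def fit(self, features, targets):
        """Train on feature rows and discharge targets."""

    @abstractmethod
    def predict(self, features):
        """Return one discharge prediction per feature row."""


class MeanBaseline(ForecastModel):
    """Trivial example model: always predicts the training mean."""

    def fit(self, features, targets):
        self.mean_ = sum(targets) / len(targets)
        return self

    def predict(self, features):
        return [self.mean_ for _ in features]


model = MeanBaseline().fit([[1], [2], [3]], [10.0, 20.0, 30.0])
preds = model.predict([[4], [5]])  # -> [20.0, 20.0]
```

Because every model exposes the same `fit`/`predict` surface, ensemble code can treat a linear baseline and a gradient-boosted model interchangeably.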
Example ensemble strategy:
- Models: XGBoost, LightGBM, CatBoost
- Feature sets:
  - F1 → (Q, P, T)
  - F2 → (Q, T, P, snow data)
  - F3 → (GlacierMapper, Q, T, P)
- Result: 9 different models that can be ensembled
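The 3 × 3 grid of ensemble members can be enumerated as follows (model and feature-set names mirror the list above; the pairing logic is a sketch, not the actual ensemble-building code):

```python
# Sketch: enumerating 3 models x 3 feature sets = 9 ensemble members.
from itertools import product

models = ["XGBoost", "LightGBM", "CatBoost"]
feature_sets = {
    "F1": ["Q", "P", "T"],
    "F2": ["Q", "T", "P", "Snow"],
    "F3": ["GlacierMapper", "Q", "T", "P"],
}

# Each (model, feature set) pair becomes one base predictor.
ensemble_members = [
    (model, fs_name) for model, fs_name in product(models, feature_sets)
]
```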
lt_forecasting/
├── lt_forecasting/                       # Core production package
│   ├── __init__.py                       # Package initialization
│   ├── scr/                              # Data processing and feature engineering
│   │   ├── data_loading.py               # Data ingestion and merging
│   │   ├── data_utils.py                 # Preprocessing and transformations
│   │   ├── FeatureExtractor.py           # Time-series feature engineering
│   │   ├── FeatureProcessingArtifacts.py # Preprocessing state management
│   │   ├── sci_utils.py                  # ML utilities
│   │   └── [documentation.md](lt_forecasting/scr/documentation.md) # Component documentation
│   │
│   ├── forecast_models/                  # Model implementations
│   │   ├── base_class.py                 # Abstract base class for all models
│   │   ├── LINEAR_REGRESSION.py          # Period-specific linear regression
│   │   ├── SciRegressor.py               # Tree-based models (XGB, LGBM, CatBoost)
│   │   └── [documentation.md](lt_forecasting/forecast_models/documentation.md) # Model details
│   │
│   └── log_config.py                     # Logging configuration
│
├── dev_tools/                            # Development-only tools
│   ├── eval_scr/                         # Evaluation utilities
│   │   ├── metric_functions.py           # Performance metrics (NSE, KGE, R², etc.)
│   │   ├── eval_helper.py                # Evaluation helper functions
│   │   └── [description.md](dev_tools/eval_scr/description.md) # Metrics documentation
│   │
│   ├── evaluation/                       # Evaluation pipeline
│   │   ├── evaluate_pipeline.py          # Main evaluation orchestrator
│   │   ├── ensemble_builder.py           # Ensemble creation and management
│   │   ├── prediction_loader.py          # Load and process predictions
│   │   └── [README.md](dev_tools/evaluation/README.md) # Pipeline documentation
│   │
│   └── visualization/                    # Interactive dashboard and plotting
│       ├── dashboard.py                  # Streamlit-based dashboard
│       ├── dashboard_components.py       # UI components
│       ├── plotting_utils.py             # Visualization functions
│       └── [README.md](dev_tools/visualization/README.md) # Dashboard guide
│
├── example_config/                       # Configuration templates
│   ├── DUMMY_MODEL/                      # Example configuration set
│   └── [description.md](example_config/description.md) # Config guide
│
├── scripts/                              # Development scripts
│   ├── calibrate_hindcast.py             # Model training and prediction
│   └── tune_hyperparams.py               # Hyperparameter optimization
│
├── tests/                                # Comprehensive test suite
│   ├── unit/                             # Unit tests for individual components
│   ├── functionality/                    # Functionality and integration tests
│   ├── integration/                      # Full integration tests
│   └── [README.md](tests/README.md)      # Testing guide
│
├── docs/                                 # Project documentation
│   ├── [Overview.md](docs/Overview.md)   # System architecture
│   └── [model_description.md](docs/model_description.md) # Model details
│
├── scratchpads/                          # Development notes and planning
│   ├── issues/                           # Issue-specific work
│   ├── planning/                         # Feature planning
│   └── [README.md](scratchpads/README.md) # Development workflow
│
├── setup.py                              # Package installation script
├── pyproject.toml                        # Project configuration
│
└── Shell Scripts:
    ├── calibration_script.sh             # Basic calibration
    ├── tune_and_calibrate_script.sh      # Tuning + calibration
    ├── run_evaluation_pipeline.sh        # Full evaluation
    └── run_model_workflow.sh             # Complete workflow
Each model experiment requires a configuration directory with:
- data_paths.json - Input data file paths
- experiment_config.json - Experiment setup and basin selection
- feature_config.json - Feature engineering parameters
- general_config.json - Model and processing settings
- model_config.json - Algorithm-specific hyperparameters
See example_config/ for templates.
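A minimal sketch of reading such a configuration directory (the file names come from the list above; `load_config_dir` itself is hypothetical and not part of the package):

```python
# Hypothetical loader for a model's configuration directory.
import json
import tempfile
from pathlib import Path

CONFIG_FILES = [
    "data_paths.json",
    "experiment_config.json",
    "feature_config.json",
    "general_config.json",
    "model_config.json",
]


def load_config_dir(config_path):
    """Read every expected JSON config file that exists in the directory."""
    config_path = Path(config_path)
    configs = {}
    for name in CONFIG_FILES:
        file = config_path / name
        if file.exists():
            configs[name] = json.loads(file.read_text())
    return configs


# Demo with a temporary directory holding one illustrative config file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "feature_config.json").write_text(json.dumps({"lags": [1, 2, 3]}))
    configs = load_config_dir(tmp)
```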
The system uses a robust validation approach:
- Leave-One-Out Cross-Validation (LOO-CV): Applied to all years except the last 3
- Hold-out Test Set: Final 3 years reserved for unbiased evaluation
- Meta-learning Assumption: LOO-CV predictions represent model behavior on unseen data
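The split can be sketched as follows (a pure-Python illustration of the scheme described above, not the package's actual cross-validation code):

```python
# Leave-one-out CV over all years except the final n_test, which are held out.
def split_years(years, n_test=3):
    """Return (loo_folds, test_years); each fold holds out exactly one year."""
    train_years, test_years = years[:-n_test], years[-n_test:]
    folds = [
        ([y for y in train_years if y != held_out], [held_out])
        for held_out in train_years
    ]
    return folds, test_years


folds, test_years = split_years(list(range(2000, 2010)))
# 7 LOO folds over 2000-2006; 2007-2009 form the hold-out test set.
```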
- Period-specific models (36 periods = 3 per month)
- Features from discharge, precipitation, temperature
- Snow information from SnowMapper FSM (elevation zones)
- Serves as baseline and input for advanced models
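The 36-period scheme (3 per month) can be illustrated with a small helper that maps a date to its period index. The decad boundaries used here (days 1–10 / 11–20 / 21–end) are an assumption for illustration; the actual period definition lives in the package:

```python
# Illustrative mapping from a date to one of 36 periods (3 per month).
# Decad boundaries (1-10, 11-20, 21-end) are an assumption.
import datetime


def period_index(date):
    """Return a period id in 0..35: the month picks the block, the day the decad."""
    decad = min((date.day - 1) // 10, 2)  # days 1-10 -> 0, 11-20 -> 1, 21+ -> 2
    return (date.month - 1) * 3 + decad


idx = period_index(datetime.date(2024, 3, 25))  # -> 8 (third decad of March)
```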
- Algorithms: XGBoost, LightGBM, CatBoost, and any other model that supports sklearn-style fit and predict functions
- Extended Features:
- All linear regression features
- GlacierMapper data (SLA, FSC)
- Linear regression predictions as meta-features
- Advantages: Captures non-linear relationships and interactions
- Hyperparameter Tuning: Optuna-based tuning is implemented for XGBoost, LightGBM, and CatBoost only
- Naive Ensemble: Simple average of all base predictors
- Uncertainty Quantification: Asymmetric Laplace distribution for prediction intervals.
- Temporal Meta-Model: Detects and corrects model drift using historical forecasts [not yet operationally tested]
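The naive ensemble amounts to a per-timestep average of the base predictors, as in this toy sketch (column names mirror predictions.csv; the asymmetric-Laplace interval fitting is omitted here):

```python
# Naive ensemble: average the base predictors at each timestep.
# Toy numbers; keys mirror the predictions.csv columns.
member_predictions = {
    "Q_model1": [12.0, 15.0],
    "Q_model2": [14.0, 17.0],
    "Q_model3": [10.0, 13.0],
}

n_steps = len(next(iter(member_predictions.values())))
q_ensemble = [
    sum(preds[t] for preds in member_predictions.values()) / len(member_predictions)
    for t in range(n_steps)
]  # -> [12.0, 15.0]
```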
predictions.csv:
date | Q_model1 | Q_model2 | Q_model3 | Q_model_name | valid_from | valid_to
model_name refers to the feature set used, and Q_model_name is the mean of Q_model1 ... Q_modeln.
predictions_uncertainty.csv:
date | Q_05 | Q_10 | Q_50 | Q_90 | Q_95 | Q_mean | valid_from | valid_to
The system integrates multiple data sources:
- Discharge Data: Historical river discharge observations
- Forcing Data: Temperature and precipitation
- Snow Data: Snow water equivalent (SWE), height of snow (HS), runoff (ROF)
- Static Basin Characteristics: Elevation, area, glacier coverage
- Remote Sensing Time Series / New Features: new features can easily be added and passed to the classes in long dataframe format
The system supports extensive feature engineering; all features are fully configurable in feature_config.json:
- Rolling window statistics (mean, slope, peak-to-peak)
- Multiple window sizes and lag periods
- Period-based features (36 periods = 3 per month)
- Elevation band aggregation (configurable zones)
- Basin-specific characteristics
- Glacier-related features from GlacierMapper
- Global normalization
- Per-basin normalization
- Long-term mean scaling (period-based)
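The rolling-window statistics can be sketched as follows (a pure-Python illustration; the actual FeatureExtractor operates on dataframes and takes its window sizes and lags from feature_config.json):

```python
# Illustrative rolling-window feature extraction: mean, slope, peak-to-peak.
def rolling_features(series, window):
    """Compute window statistics for each step once a full window is available."""
    feats = []
    for i in range(window - 1, len(series)):
        win = series[i - window + 1 : i + 1]
        feats.append({
            "mean": sum(win) / window,
            "slope": (win[-1] - win[0]) / (window - 1),  # rise over the window
            "ptp": max(win) - min(win),                  # peak-to-peak range
        })
    return feats


feats = rolling_features([1.0, 2.0, 4.0, 4.0, 3.0], window=3)
```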
# Full model workflow (tuning + calibration + evaluation)
./run_model_workflow.sh
# Hyperparameter tuning followed by calibration
./tune_and_calibrate_script.sh
# Evaluation pipeline for multiple models
./run_evaluation_pipeline.sh
# Basic calibration only
./calibration_script.sh

# Launch interactive dashboard
uv run python -m dev_tools.visualization.dashboard
# Dashboard will be available at http://localhost:8501

1. Copy the example configuration:

   cp -r example_config/DUMMY_MODEL example_config/MY_MODEL

2. Edit the configuration files according to your needs:
   - Update data paths in data_paths.json
   - Select basins in experiment_config.json
   - Configure features in feature_config.json
   - Set model parameters in model_config.json
# Run all tests
uv run pytest -v
# Run specific test file
uv run pytest tests/test_linear_regression.py -v
# Run with coverage
uv run pytest --cov=. --cov-report=html

# Format code with ruff
uv run ruff format
# Check code style
uv run ruff check
1. Create a scratchpad for complex features:

   touch scratchpads/planning/my-feature.md

2. Use the scratchpad template (see scratchpads/README.md)
3. Follow test-driven development practices
4. Document architectural decisions
Sandro Hunziker - hunziker@hydrosolutions.ch
