EDpyFlow is a containerized, end-to-end Python workflow built on TEASER and OpenModelica for running building energy simulations and training surrogate models for predicting residential building heat demand. All dependencies are bundled in the container — OpenModelica, AixLib, TEASER, and the required Python environments — so no prior knowledge of Modelica tooling is required.
In its default configuration, the pipeline covers four TABULA DE building typologies (SFH, TH, MFH, AB), six German cities, and three refurbishment levels, yielding 21,600 building configurations per fidelity level. The resulting trained models constitute the EDSurrogate model family (EDSurrogate-2el and EDSurrogate-4el).
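
The size of the default design space follows from simple counting. The sketch below enumerates the typology, city, and refurbishment combinations and divides the stated total by them. The even split across combinations is an assumption made for illustration, not something the pipeline is confirmed to do.

```python
from itertools import product

# Values taken from the text above; cities and refurbishment levels are
# represented only by their counts because their names are not needed here.
typologies = ["SFH", "TH", "MFH", "AB"]
n_cities = 6
n_refurbishment_levels = 3
total_configurations = 21_600  # per fidelity level, as stated above

combinations = list(product(typologies, range(n_cities), range(n_refurbishment_levels)))
n_combinations = len(combinations)

# Assumption: samples are spread evenly over the combinations.
per_combination = total_configurations // n_combinations
per_typology = total_configurations // len(typologies)

print(n_combinations, per_combination, per_typology)
```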
The pipeline proceeds in five stages. Each stage is self-contained and uses file-based data exchange. The pipeline can be entered or interrupted at any point without reprocessing upstream results.
| Step | Script | Description |
|---|---|---|
| 1 | `src/sampling/generate_samples.py` | LHS sampling of building configurations |
| 2 | `src/modeling/generate_thermal_models.py` | Generates thermal models in Modelica using TEASER |
| 3 | `src/simulation/run_simulations.py` | Runs annual energy simulations in OpenModelica |
| 4 | `src/data_prep/generate_dataset.py` | Assembles simulation results into a dataset |
| 5 | `src/training/train_surrogate.py` | Trains an XGBoost surrogate model |
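
Step 1 relies on Latin hypercube sampling (LHS). For readers unfamiliar with the technique, here is a minimal standard-library sketch of plain LHS: each dimension is split into as many equal-width strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently per dimension. The function names and parameter ranges are hypothetical, and the sketch does not implement the `criterion` option mentioned in the configuration; it is not the code in `generate_samples.py`.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in the unit hypercube [0, 1)^n_dims."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniformly drawn point inside each of the n_samples strata.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across dimensions.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def scale(u, low, high):
    """Map a unit-interval value onto the range [low, high)."""
    return low + u * (high - low)


# Hypothetical example: 5 samples over two illustrative parameters.
unit_samples = latin_hypercube(n_samples=5, n_dims=2, seed=42)
scaled_samples = [
    (scale(u0, 50.0, 250.0), scale(u1, 1900.0, 2020.0))
    for u0, u1 in unit_samples
]
```

With the same seed, the function returns the same samples, which is what makes a seeded sampling step reproducible.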

Requirements:

- Apptainer to build and run the container
- Weather files in `.mos` format for the six locations (see `data/locations/README.md`)

Build the container before first use (from the repo root):

    cd container && apptainer build ../EDpyFlow.sif EDpyFlow.def

All parameters are set in `config.yaml`:

- `run_name`: name of the run; outputs are written to `runs/{run_name}/`
- `locations`: city names and their weather files
- `refurbishment_status`: refurbishment levels to simulate
- `sampling`: LHS parameters (samples per typology, seed, criterion)
- `num_elements`: number of RC elements in the thermal model
- `simulation`: simulation duration, timestep, and optional raw output retention
- `surrogate`: XGBoost hyperparameters, train/val/test split ratios, and model name

Note: Change `run_name` for each new run to avoid overwriting previous results.

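
Before starting a long run, it can help to sanity-check the configuration. The sketch below assumes the configuration has already been loaded into a Python dictionary. The top-level keys are those listed above; every nested key and value (for example `split`, `train`, and the weather file name) is hypothetical and chosen only for illustration.

```python
import math

# Top-level keys as documented above.
REQUIRED_KEYS = {
    "run_name", "locations", "refurbishment_status", "sampling",
    "num_elements", "simulation", "surrogate",
}


def check_config(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    missing = sorted(REQUIRED_KEYS - cfg.keys())
    if missing:
        problems.append("missing keys: " + ", ".join(missing))
    # Hypothetical layout: split ratios stored under surrogate -> split.
    split = cfg.get("surrogate", {}).get("split", {})
    if split and not math.isclose(sum(split.values()), 1.0):
        problems.append("train/val/test split ratios do not sum to 1")
    return problems


# Illustrative configuration; nested names and values are assumptions.
example_config = {
    "run_name": "demo_run",
    "locations": {"berlin": "berlin.mos"},
    "refurbishment_status": ["level_a"],
    "sampling": {"seed": 42},
    "num_elements": 2,
    "simulation": {},
    "surrogate": {"split": {"train": 0.7, "val": 0.15, "test": 0.15}},
}

print(check_config(example_config))
```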
Run the full pipeline:

    python EDpyFlow.py

Run a single step:

    python EDpyFlow.py --step sampling
    python EDpyFlow.py --step teaser
    python EDpyFlow.py --step simulate
    python EDpyFlow.py --step dataset
    python EDpyFlow.py --step surrogate

A custom container path can be specified with `--container`:

    python EDpyFlow.py --container /path/to/EDpyFlow.sif

All outputs are written to `runs/{run_name}/`:

    runs/{run_name}/
    ├── config.yaml                 ← copy of config at time of run
    ├── samples.csv                 ← building configurations (Step 1)
    ├── simulation_input/           ← Modelica packages (Step 2)
    │   ├── residentials_berlin/
    │   └── ...
    ├── simulation_output/          ← simulation results (Step 3)
    │   ├── sim_results_berlin.json
    │   └── ...
    ├── simulation.log              ← simulation log (Step 3)
    ├── synthetic_dataset/
    │   └── dataset.csv             ← training dataset (Step 4)
    └── models/
        └── {model_name}.json       ← trained surrogate model (Step 5)

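
Because each stage exchanges data through files, the output layout above can be used to decide where to resume an interrupted run. The helper below is a hypothetical sketch, not part of EDpyFlow: it assumes the `--step` names map onto Steps 1 to 5 in the order listed earlier, and it only checks that some output exists for each stage, so a partially finished stage (for example, simulations for only some cities) would still count as done.

```python
from pathlib import Path


def _non_empty(directory, pattern):
    """True if the directory exists and contains a match for the pattern."""
    return directory.is_dir() and any(directory.glob(pattern))


# (step name for --step, check that the step's output is present)
STEP_CHECKS = [
    ("sampling", lambda run: (run / "samples.csv").is_file()),
    ("teaser", lambda run: _non_empty(run / "simulation_input", "*")),
    ("simulate", lambda run: _non_empty(run / "simulation_output", "*.json")),
    ("dataset", lambda run: (run / "synthetic_dataset" / "dataset.csv").is_file()),
    ("surrogate", lambda run: _non_empty(run / "models", "*.json")),
]


def next_step(run_dir):
    """Return the first step whose output is missing, or None if all exist."""
    run = Path(run_dir)
    for step, is_done in STEP_CHECKS:
        if not is_done(run):
            return step
    return None
```

For example, `next_step("runs/demo_run")` would return `"simulate"` if `samples.csv` and `simulation_input/` are populated but `simulation_output/` holds no JSON files yet; the returned name could then be passed to `--step`.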
For questions, please contact bagherinejad@mbd.rwth-aachen.de or open an issue at https://github.com/mbd-rwth/EDpyFlow/issues.
This work was performed as part of the ENERsyte project and received funding from Innovationsförderagentur.NRW through the Grüne Gründungen.NRW initiative of the Ministry for the Environment, Nature Conservation and Transport of the State of North Rhine-Westphalia within the framework of the EFRE/JTF-Programme NRW 2021-2027, Co-funded by the European Union (EFRE-20800324).