Commit a0b6e06

Merge pull request #2 from VForWaTer/feature/multiple-input-modes
multiple input modes
2 parents 76c5b5f + dd569f3 commit a0b6e06

11 files changed

Lines changed: 476394 additions & 25944 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -39,4 +39,4 @@ RUN cd /src/report && npm install
3939
COPY ./CITATION.cf[f] /src/CITATION.cff
4040

4141
WORKDIR /src
42-
CMD ["python", "run.py"]
42+
CMD ["sh", "-c", "python detect_input.py && python run.py"]

README.md

Lines changed: 57 additions & 98 deletions
@@ -1,128 +1,87 @@
1-
# tool_simulation_evaluation
1+
# Simulation Evaluation
22

3-
[![Docker Image CI](https://https://github.com/KIT-HYD/simulation_evaluation/blob/combined-data/.github/workflows/docker-image.yml/badge.svg)](https://github.com/KIT-HYD/simulation_evaluation/blob/combined-data/.github/workflows/docker-image.yml)
3+
A containerized tool for evaluating hydrological simulations against observations across multiple catchments. It computes standard performance metrics and produces an interactive HTML report with time series plots and statistical summaries.
44

5+
## How it works
56

6-
This is a containerized Python tool following the Tool Specification for reusable research software using Docker.
7+
For each catchment, the tool:
8+
1. Auto-detects the input mode from the structure of `/in`
9+
2. Loads observation and simulation time series
10+
3. Computes performance metrics (NSE, KGE, R², MSE, RMSE)
11+
4. Writes a metrics summary to `/out`
12+
5. Generates a self-contained interactive HTML report
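
Step 3 can be sketched in plain Python as follows. This is a minimal illustration, not the tool's actual code: the function name `evaluate` is hypothetical, KGE is assumed to follow the 2009 formulation (correlation, variability ratio, bias ratio), and R² is assumed to be the squared Pearson correlation. The sketch does not handle missing values or constant series.

```python
import math
from statistics import fmean


def evaluate(obs, sim):
    """Return NSE, KGE, R2, MSE and RMSE for two equal-length sequences.

    Hypothetical helper; the tool's own implementation may differ.
    """
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim must have the same length (at least 2)")

    n = len(obs)
    mean_obs, mean_sim = fmean(obs), fmean(sim)

    # Sums of squares around the means, and the cross term.
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    ss_sim = sum((s - mean_sim) ** 2 for s in sim)
    cross = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))

    # Sum of squared errors between observation and simulation.
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))

    mse = sq_err / n
    nse = 1.0 - sq_err / ss_obs

    r = cross / math.sqrt(ss_obs * ss_sim)  # Pearson correlation
    alpha = math.sqrt(ss_sim / ss_obs)      # ratio of standard deviations
    beta = mean_sim / mean_obs              # ratio of means
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

    return {"NSE": nse, "KGE": kge, "R2": r ** 2, "MSE": mse, "RMSE": math.sqrt(mse)}


metrics = evaluate([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(metrics)
```

A simulation that is offset from the observations by a constant keeps a perfect correlation but is penalized by NSE (through the squared error) and by KGE (through the bias ratio).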
713

8-
Data:
9-
CAMELS-DE: hydrometeorological time series and attributes for 1582 catchments in Germany
10-
A. Dolich et al.
11-
https://doi.org/10.5281/zenodo.13837553
14+
---
1215

13-
Model code and software:
14-
Hy2DL: Hybrid Hydrological modeling using Deep Learning methods
15-
Eduardo Acuña Espinoza et al.
16-
https://github.com/KIT-HYD/Hy2DL/tree/v1.1
16+
## Input data
1717

18-
## Description
18+
The tool supports three input structures, **auto-detected** at runtime — no `input.json` or mode parameter needed. Just mount your data in `/in` with the right structure.
1919

20-
The simulation evaluation tool is designed to assess the performance of hydrological simulations against observed hydrological data. It automates the process
21-
of loading data from multiple catchments, computing key evaluation metrics, and generating visualizations for a comprehensive analysis. The tool outputs an
22-
interactive HTML report containing performance summaries, time series plots, and statistical comparisons. The tool also outputs a .csv file conatining all the
23-
metrics for the cathcments.
20+
### Mode 0 — per-location files, both columns in one file
2421

25-
## Key features
26-
27-
## 1. Data loading and preprocessing
28-
29-
- Supports loading both simulation and observation data from CSV or Parquet files.
30-
- Allows flexible structure
31-
- Separate files for observed and simulated data
32-
- A single file containing both observations and simulations
33-
- Uses wildcards to match multiple files within directories.
34-
35-
## 2. Performance metrics
36-
- For each catchment, the tool calculates the most frequently used hydrological performance metrics, such as:
37-
- Nash-Sutcliffe Efficiency (NSE)
38-
- Kling-Gupta Efficiency (KGE)
39-
- Coefficient of determination (R2)
40-
- Mean Squared Error (MSE)
41-
- Root Mean Squared Error (RMSE)
42-
43-
## 3. Output generation
44-
45-
- Saves results in .csv and .json formats:
46-
- metrics_summary.csv - A summary of computed metrics for all cathcments
47-
- metrics_summary.json - JSON representation for programmatic access
48-
- Generates an HTML report containing:
49-
- Time series plots for catcments
50-
- Performance metric tables
51-
52-
53-
## How generic?
54-
55-
Tools using this template can be run by the [toolbox-runner](https://github.com/hydrocode-de/tool-runner).
56-
That is only convenience, the tools implemented using this template are independent of any framework.
57-
58-
The main idea is to implement a common file structure inside container to load inputs and outputs of the
59-
tool. The template shares this structures with the [R template](https://github.com/vforwater/tool_template_r),
60-
[NodeJS template](https://github.com/vforwater/tool_template_node) and [Octave template](https://github.com/vforwater/tool_template_octave),
61-
but can be mimiced in any container.
22+
```
23+
/in/
24+
discharge_5694.csv ← columns: date, obs, sim
25+
discharge_8731.csv
26+
```
6227

63-
Each container needs at least the following structure:
28+
### Mode 1 — per-location files, separate obs and sim
6429

6530
```
66-
/
67-
|- in/
68-
| |- parameters.json
69-
|- out/
70-
| |- ...
71-
|- src/
72-
| |- tool.yml
73-
| |- run.py
31+
/in/
32+
obs/
33+
discharge_5694.csv ← columns: date, obs
34+
discharge_8731.csv
35+
sim/
36+
discharge_5694.csv ← columns: date, sim
37+
discharge_8731.csv
7438
```
7539

76-
* `parameters.json` are parameters. Whichever framework runs the container, this is how parameters are passed.
77-
* `tool.yml` is the tool specification. It contains metadata about the scope of the tool, the number of endpoints (functions) and their parameters
78-
* `run.py` is the tool itself, or a Python script that handles the execution. It has to capture all outputs and either `print` them to console or create files in `/out`
40+
Catchments are matched by the last `_`-delimited segment of the filename (`discharge_5694.csv` → `5694`).
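
The matching rule can be illustrated with a short sketch. The helper names `catchment_id` and `pair_by_catchment` are hypothetical and only show one way to take the last `_`-delimited segment of the file stem; the tool's own code may differ.

```python
from pathlib import Path


def catchment_id(filename):
    """Return the last `_`-delimited segment of the file stem."""
    return Path(filename).stem.rsplit("_", 1)[-1]


def pair_by_catchment(obs_files, sim_files):
    """Map catchment ID -> (obs file, sim file) for IDs present in both lists."""
    sims = {catchment_id(f): f for f in sim_files}
    return {
        catchment_id(f): (f, sims[catchment_id(f)])
        for f in obs_files
        if catchment_id(f) in sims
    }


print(catchment_id("discharge_5694.csv"))  # 5694
pairs = pair_by_catchment(
    ["obs/discharge_5694.csv", "obs/discharge_8731.csv"],
    ["sim/discharge_5694.csv"],
)
print(sorted(pairs))  # ['5694']
```

In this sketch, observation files without a matching simulation file are silently skipped; whether the tool skips or reports them is not stated in this README.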
7941

80-
## How to build the image?
42+
### Mode 2 — two combined files, all catchments
8143

82-
You can build the image from within the root of this repo by
8344
```
84-
docker build -t tbr_python_tempate .
45+
/in/
46+
all_observations.csv ← columns: date, catchment_id, obs
47+
all_simulations.csv ← columns: date, catchment_id, sim
8548
```
8649

87-
Use any tag you like. If you want to run and manage the container with [toolbox-runner](https://github.com/hydrocode-de/tool-runner)
88-
they should be prefixed by `tbr_` to be recognized.
50+
The tool detects Mode 2 when exactly two data files are present in `/in`. It identifies obs vs sim from the filename (expects `obs` to appear in the observation filename). The location column is assumed to be named `catchment_id`.
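
A rough sketch of how such detection could work is shown below. It is not the contents of `detect_input.py`: the function name `detect_mode`, the order of the checks, and the set of accepted file extensions are assumptions. Because a Mode 0 directory may also contain exactly two files, this sketch additionally requires that exactly one of the two filenames contains `obs` before choosing Mode 2.

```python
from pathlib import Path

# Assumed set of data file extensions; not confirmed by the README.
DATA_SUFFIXES = {".csv", ".parquet"}


def detect_mode(in_dir):
    """Guess the input mode (0, 1 or 2) from the layout of `in_dir`."""
    root = Path(in_dir)

    # Mode 1: separate obs/ and sim/ subdirectories.
    if (root / "obs").is_dir() and (root / "sim").is_dir():
        return 1

    files = sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in DATA_SUFFIXES
    )
    if not files:
        raise FileNotFoundError(f"no data files found in {root}")

    # Mode 2: exactly two files, one of which is named as the observation file.
    if len(files) == 2 and sum("obs" in p.name.lower() for p in files) == 1:
        return 2

    # Mode 0: one combined file per location.
    return 0
```

Calling `detect_mode("/in")` inside the container would return the mode number under these assumptions.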
8951

90-
Alternatively, the contained `.github/workflows/docker-image.yml` will build the image for you
91-
on new releases on Github. You need to change the target repository in the aforementioned yaml.
52+
---
9253

93-
## How to run?
54+
## Column name defaults
9455

95-
This template installs the json2args python package to parse the parameters in the `/in/parameters.json`. This assumes that
96-
the files are not renamed and not moved and there is actually only one tool in the container. For any other case, the environment variables
97-
`PARAM_FILE` can be used to specify a new location for the `parameters.json` and `TOOL_RUN` can be used to specify the tool to be executed.
98-
The `run.py` has to take care of that.
56+
The auto-detection script writes `input.json` with these default column names:
9957

100-
To invoke the docker container directly run something similar to:
101-
```
102-
docker run --rm -it -v /path/to/local/in:/in -v /path/to/local/out:/out -e TOOL_RUN=foobar tbr_python_template
103-
```
58+
| Parameter | Default |
59+
|---|---|
60+
| `index_column` | `date` |
61+
| `observation_column` | `obs` |
62+
| `simulation_column` | `sim` |
63+
| `location_column` | `catchment_id` *(Mode 2 only)* |
10464

105-
Then, the output will be in your local out and based on your local input folder. Stdout and Stderr are also connected to the host.
65+
If your files use different column names, edit `/in/input.json` after the container starts, or provide your own `input.json` before running — the auto-detection will skip writing a new one if it detects one already exists.
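
The skip-if-present behaviour might look roughly like the sketch below. The function name `write_defaults` and the flat JSON layout are assumptions made for illustration; the real `input.json` written by the tool may be structured differently.

```python
import json
from pathlib import Path

# Default column names from the table above.
DEFAULTS = {
    "index_column": "date",
    "observation_column": "obs",
    "simulation_column": "sim",
}


def write_defaults(in_dir, mode):
    """Write input.json with default column names unless one already exists.

    Returns True if a file was written, False if an existing one was kept.
    """
    target = Path(in_dir) / "input.json"
    if target.exists():
        return False  # a user-provided input.json takes precedence

    params = dict(DEFAULTS)
    if mode == 2:
        params["location_column"] = "catchment_id"  # Mode 2 only

    target.write_text(json.dumps(params, indent=2))
    return True
```

With this approach, a user-supplied `input.json` with custom column names is never overwritten.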
10666

107-
With the [toolbox runner](https://github.com/hydrocode-de/tool-runner), this is simplyfied:
67+
---
10868

109-
```python
110-
from toolbox_runner import list_tools
111-
tools = list_tools() # dict with tool names as keys
69+
## Outputs
11270

113-
foobar = tools.get('foobar') # it has to be present there...
114-
foobar.run(result_path='./', foo_int=1337, foo_string="Please change me")
115-
```
116-
The example above will create a temporary file structure to be mounted into the container and then create a `.tar.gz` on termination of all
117-
inputs, outputs, specifications and some metadata, including the image sha256 used to create the output in the current working directory.
71+
| File | Description |
72+
|---|---|
73+
| `/out/metrics_summary.csv` | Per-catchment metrics table |
74+
| `/out/metrics_summary.json` | Same data in JSON format |
75+
| `/out/simulation_report.html` | Self-contained interactive report |
76+
77+
---
78+
79+
## References
11880

119-
## What about real tools, no foobar?
81+
**Data:** CAMELS-DE — hydrometeorological time series for 1582 German catchments.
82+
A. Dolich et al. https://doi.org/10.5281/zenodo.13837553
12083

121-
Yeah.
84+
**Model:** Hy2DL — Hybrid Hydrological modeling using Deep Learning.
85+
Eduardo Acuña Espinoza et al. https://github.com/KIT-HYD/Hy2DL/tree/v1.1
12286

123-
1. change the `tool.yml` to describe your actual tool
124-
2. add any `pip install` or `apt-get install` needed to the dockerfile
125-
3. add additional source code to `/src`
126-
4. change the `run.py` to consume parameters and data from `/in` and useful output in `out`
127-
5. build, run, rock!
12887
